All of lore.kernel.org
 help / color / mirror / Atom feed
* linux-next: Tree for Aug 22
@ 2017-08-22  8:38 Stephen Rothwell
  2017-08-22 10:47 ` possible circular locking dependency detected [was: linux-next: Tree for Aug 22] Sergey Senozhatsky
  2017-08-22 18:11 ` linux-next: Tree for Aug 22 Stephen Rothwell
  0 siblings, 2 replies; 49+ messages in thread
From: Stephen Rothwell @ 2017-08-22  8:38 UTC (permalink / raw)
  To: Linux-Next Mailing List; +Cc: Linux Kernel Mailing List

Hi all,

Changes since 20170817:

The btrfs-kdave tree gained a conflict against the btrfs tree.

The v4l-dvb tree gained a conflict against the arm-soc tree.

The net-next tree still had its build failure for which I reverted
a commit.  It also gained a conflict against the rockchip tree.

The l2-mtd tree gained a conflict against the kbuild-current tree.

The security tree lost its build failure.

The tip tree gained conflicts against the arm64 and iommu trees.

The rcu tree gained a conflict against the tip tree.

The staging tree gained a conflict against the spi tree.

The slave-dma tree gained a build failure so I used the version from
next-20170817.

The akpm-current tree gained conflicts against the parisc-hd and tip
trees.

The akpm tree lost 3 patches that turned up elsewhere.

Non-merge commits (relative to Linus' tree): 7901
 7918 files changed, 429533 insertions(+), 150834 deletions(-)

----------------------------------------------------------------------------

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And
finally, a simple boot test of the powerpc pseries_le_defconfig kernel
in qemu.

Below is a summary of the state of the merge.

I am currently merging 268 trees (counting Linus' and 41 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (6470812e2226 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fixes/master (b4b8cbf679c4 Cavium CNN55XX: fix broken default Kconfig entry)
Merging kbuild-current/fixes (2bfbe7881ee0 kbuild: Do not use hyphen in exported variable name)
Merging arc-current/for-curr (a8ec3ee861b6 arc: Mask individual IRQ lines during core INTC init)
Merging arm-current/fixes (1abd35023763 ARM: align .data section)
Merging m68k-current/for-linus (204a2be30a7a m68k: Remove ptrace_signal_deliver)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (5a69aec945d2 powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC)
Merging sparc/master (2dc77533f1e4 sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus())
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files)
Merging net/master (348a4002729c ipv6: repair fib6 tree in failure case)
Merging ipsec/master (4ff0308f06da esp: Fix error handling on layer 2 xmit.)
Merging netfilter/master (9beceb54fa2c netfilter: x_tables: Fix use-after-free in ipt_do_table.)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set)
Merging wireless-drivers/master (e9bf53ab1ee3 brcmfmac: feature check for multi-scheduled scan fails on bcm4343x devices)
Merging mac80211/master (d7f13f745036 cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES)
Merging sound-current/for-linus (dbd7396b4f24 ALSA: firewire-motu: destroy stream data surely at failure of card initialization)
Merging pci-current/for-linus (8466489ef5ba xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue)
Merging driver-core.current/driver-core-linus (ef954844c7ac Linux 4.13-rc5)
Merging tty.current/tty-linus (ef954844c7ac Linux 4.13-rc5)
Merging usb.current/usb-linus (ef954844c7ac Linux 4.13-rc5)
Merging usb-gadget-fixes/fixes (b7d44c36a6f6 usb: renesas_usbhs: gadget: fix unused-but-set-variable warning)
Merging usb-serial-fixes/usb-linus (fd1b8668af59 USB: serial: option: add D-Link DWM-222 device ID)
Merging usb-chipidea-fixes/ci-for-usb-stable (cbb22ebcfb99 usb: chipidea: core: check before accessing ci_role in ci_role_show)
Merging phy/fixes (5771a8c08880 Linux v4.13-rc1)
Merging staging.current/staging-linus (2c68888f1d76 Merge tag 'fixes-for-4.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus)
Merging char-misc.current/char-misc-linus (ef954844c7ac Linux 4.13-rc5)
Merging input-current/for-linus (d912366a59c5 Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365)
Merging crypto-current/master (8861249c740f crypto: x86/sha1 - Fix reads beyond the number of blocks passed)
Merging ide/master (b671e1703394 PNP: ide: constify pnp_device_id)
Merging vfio-fixes/for-linus (796b755066dd vfio/pci: Fix handling of RC integrated endpoint PCIe capability size)
Merging kselftest-fixes/fixes (622b2fbe625b selftests: timers: freq-step: fix compile error)
Merging backlight-fixes/for-backlight-fixes (68feaca0b13e backlight: pwm: Handle EPROBE_DEFER while requesting the PWM)
Merging ftrace-fixes/for-next-urgent (6224beb12e19 tracing: Have branch tracer use recursive field of task struct)
Merging nand-fixes/nand/fixes (ee02f73e04c0 mtd: nand: atmel: Fix EDO mode check)
Merging spi-nor-fixes/spi-nor/fixes (5771a8c08880 Linux v4.13-rc1)
Merging mfd-fixes/for-mfd-fixes (998b070a9d23 Revert "mfd: da9061: Fix to remove BBAT_CONT register from chip model")
Merging v4l-dvb-fixes/fixes (42b884edd6fa media: Revert "[media] v4l: async: make v4l2 coexist with devicetree nodes in a dt overlay")
Merging reset-fixes/reset/fixes (4497a224f759 reset: hi6220: Set module license so that it can be loaded)
Merging drm-intel-fixes/for-linux-next-fixes (31c1a7b8f966 drm/i915/cnl: Fix LSPCON support.)
Merging drm-misc-fixes/for-linux-next-fixes (a0ffc51e20e9 drm/atomic: If the atomic check fails, return its value first)
Merging kbuild/for-next (2bfbe7881ee0 kbuild: Do not use hyphen in exported variable name)
Merging uuid/for-next (832e4c83abc5 uuid: remove uuid_be)
Merging dma-mapping/for-next (ee7b1f31200d of: fix DMA mask generation)
Merging asm-generic/master (a351e9b9fc24 Linux 4.11)
Merging arc/for-next (dd6f7edfa2c1 ARC: [plat-eznps] handle extra aux regs #2: kernel/entry exit)
Merging arm/for-next (862f4c2c1e06 Merge branches 'fixes' and 'misc' into for-next)
Merging arm-perf/for-next/perf (e884f80cf2a7 arm64: perf: add support for Cortex-A35)
Merging arm-soc/for-next (2ec924fd5df2 arm-soc: document merges)
Merging actions/for-next (253d192bc4aa Merge branch 'v4.14/dt' into next)
Merging alpine/alpine/for-next (a1144b2b1ec4 ARM: dts: alpine: add valid clock-frequency values)
Merging amlogic/for-next (3a5fd3ca8f08 Merge branch 'v4.14/drivers' into tmp/aml-rebuild)
Merging aspeed/for-next (4944e5dbb215 Merge branches 'dt-for-v4.12' and 'defconfig-for-v4.12' into for-next)
Merging at91/at91-next (01c1fe77824e Merge branch 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux into at91-next)
Merging bcm2835/for-next (e83e7335d059 Merge branch anholt/bcm2835-defconfig-next into for-next)
Merging berlin/berlin/for-next (5153351425c9 Merge branch 'berlin/dt' into berlin/for-next)
Merging cortex-m/for-next (f719a0d6a854 ARM: efm32: switch to vendor,device compatible strings)
Merging imx-mxs/for-next (b90b6c45ff6b Merge branch 'zte/dt64' into for-next)
Merging keystone/next (f922ce7bbcf5 Merge branch 'for_4.14/keystone_dts' into next)
Merging mvebu/for-next (2ca6aa7f6960 Merge branch 'mvebu/dt' into mvebu/for-next)
Merging omap/for-next (a70cb93b6c2f Merge branch 'omap-for-v4.14/dt-v3' into for-next)
Merging omap-pending/for-next (c20c8f750d9f ARM: OMAP2+: hwmod: fix _idle() hwmod state sanity check sequence)
Merging reset/reset/next (ac0c735ac3dd reset: uniphier: add analog amplifiers reset control)
Merging qcom/for-next (e471b255f412 Merge tag 'qcom-arm64-for-4.14' into last-for-4.14)
CONFLICT (content): Merge conflict in arch/arm64/boot/dts/qcom/msm8916.dtsi
Merging realtek/for-next (b18ab60120b9 Merge branch 'v4.14/dt64' into next)
Merging renesas/next (e6bf13206283 Merge branches 'arm64-dt-for-v4.14', 'dt-bindings-for-v4.14' and 'dt-for-v4.14' into next)
Merging rockchip/for-next (b2f212659879 Merge branch 'v4.14-armsoc/dts64' into for-next)
CONFLICT (content): Merge conflict in arch/arm/boot/dts/rk3288.dtsi
Merging rpi/for-rpi-next (bc0195aad0da Linux 4.2-rc2)
Merging samsung/for-next (2ea659a9ef48 Linux 4.12-rc1)
Merging samsung-krzk/for-next (86247b536cd0 Merge branch 'fixes' into for-next)
Merging sunxi/sunxi/for-next (edec428a0cd6 Merge branch 'sunxi/clk-for-4.14' into sunxi/for-next)
Merging tegra/for-next (b4aed883ebbe Merge branch for-4.14/arm/defconfig into for-next)
Merging arm64/for-next/core (a88ce63b642c arm64: kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores)
Merging clk/clk-next (8e7be401f2f5 Merge branch 'clk-fixes' into clk-next)
CONFLICT (content): Merge conflict in MAINTAINERS
Merging clk-samsung/for-next (599cebea93e6 clk: samsung: exynos542x: Enable clock rate propagation up to the EPLL)
Merging c6x/for-linux-next (eab89e215ae4 c6x: defconfig: Cleanup from old Kconfig options)
Merging cris/for-next (8f50f2a1b46a cris: No need to append -O2 and $(LINUXINCLUDE))
Merging h8300/h8300-next (58c57526711f h8300: Add missing include file to asm/io.h)
Merging hexagon/linux-next (02cc2ccfe771 Revert "Hexagon: fix signal.c compile error")
Merging ia64/next (fbb0e4da96f4 ia64: salinfo: use a waitqueue instead a sema down/up combo)
Merging m68k/for-next (558d5ad276c9 m68k/mac: Avoid soft-lockup warning after mach_power_off)
Merging m68knommu/for-next (6c25b5cc9968 m68k: allow NULL clock for clk_get_rate)
Merging metag/for-next (e3cd7f013bac metag/mm: Drop pointless increment)
Merging microblaze/next (14ef905bb2ee microblaze: Fix MSR flags when returning from exception)
Merging mips/mips-for-linux-next (aa9a357f236f Merge branches '4.13-fixes' and '4.13-fp-fixes' into mips-for-linux-next)
Merging nios2/for-next (e118c3fec9c0 nios2: remove custom early console implementation)
Merging openrisc/for-next (9d15eb228b10 openrisc: defconfig: Cleanup from old Kconfig options)
Merging parisc-hd/for-next (6850bdba9567 parisc: Enable UBSAN support)
Merging powerpc/next (516fa8d0e19d macintosh/rack-meter: Make of_device_ids const)
Merging fsl/next (61baf1555512 powerpc/64e: Don't place the stack beyond TASK_SIZE)
Merging mpc5xxx/next (39e69f55f857 powerpc: Introduce the use of the managed version of kzalloc)
Merging s390/features (a3c1a2194a7c s390/scm: use common completion path)
Merging sparc-next/master (cc324d1a3e35 Merge branch 'sparc64-16gb-hugepages')
Merging sh/for-next (6e2fbfdd585f sh: fix futex FUTEX_OP_SET op on userspace addresses)
Merging tile/master (637f23abca87 tile: array underflow in setup_maxnodemem())
Merging uml/linux-next (61e8d462457f um: Correctly check for PTRACE_GETRESET/SETREGSET)
Merging unicore32/unicore32 (bc27113620ca unicore32-oldabi: add oldabi syscall interface)
CONFLICT (content): Merge conflict in arch/unicore32/include/asm/Kbuild
Merging xtensa/xtensa-for-next (271335b9726e Merge branch 'xtensa-fixes' into xtensa-for-next)
Merging fscrypt/master (c250b7dd8e73 fscrypt: make ->dummy_context() return bool)
Merging befs/for-next (5771a8c08880 Linux v4.13-rc1)
Merging btrfs/next (ff108ad6692a crypto: Add zstd support)
Merging btrfs-kdave/for-next (0ade70aaa61f Merge branch 'for-next-next-v4.14-20170821' into for-next-20170821)
CONFLICT (content): Merge conflict in fs/btrfs/compression.h
Merging ceph/master (326547a91b3f ceph: fix readpage from fscache)
Merging cifs/for-next (14ccee78fc82 Linux 4.13-rc6)
Merging configfs/for-next (19e72d3abb63 configfs: Introduce config_item_get_unless_zero())
Merging ecryptfs/next (be280b25c328 ecryptfs: remove private bin2hex implementation)
Merging ext3/for_next (434aafb572d2 quota_v2: Delete an error message for a failed memory allocation in v2_read_file_info())
Merging ext4/dev (32aaf194201e ext4: add missing xattr hash update)
Merging f2fs/dev (a36c106dffb6 f2fs: use printk_ratelimited for f2fs_msg)
Merging freevxfs/for-next (bf1bb4b460c8 freevxfs: update Kconfig information)
Merging fscache/fscache (d52bd54db8be Merge branch 'akpm' (patches from Andrew))
Merging fuse/for-next (9183976ef1c8 fuse: set mapping error in writepage_locked when it fails)
Merging jfs/jfs-next (f070e5ac9bc7 jfs: preserve i_mode if __jfs_set_acl() fails)
Merging nfs/linux-next (7af7a5963c40 Merge branch 'bugfixes')
Merging nfsd/nfsd-next (b20dae70bfa5 svcrdma: fix an incorrect check on -E2BIG and -EINVAL)
Merging orangefs/for-next (2f713b5c7d2a orangefs: count directory pieces correctly)
Merging overlayfs/overlayfs-next (4edb83bb1041 ovl: constant d_ino for non-merge dirs)
Merging v9fs/for-next (a333e4bf2556 fs/9p: use fscache mutex rather than spinlock)
Merging ubifs/linux-next (a6664433d383 ubifs: Set double hash cookie also for RENAME_EXCHANGE)
Merging xfs/for-next (77aff8c76425 xfs: don't leak quotacheck dquots when cow recovery)
Merging file-locks/locks-next (3953704fde7e locks: restore a warn for leaked locks on close)
Merging wberr/wberr-next (6d4b51241394 ecryptfs: convert to file_write_and_wait in ->fsync)
Merging vfs/for-next (66bf97967726 annotate RWF_... flags)
Merging vfs-jk/vfs (030b533c4fd4 fs: Avoid premature clearing of capabilities)
Merging vfs-miklos/next (0eb8af4916a5 vfs: use helper for calling f_op->fsync())
Merging printk/for-next (077a1cc06f72 printk: Clean up do_syslog() error handling)
Merging pci/next (9b790d29aa6d Merge branch 'pci/virtualization' into next)
CONFLICT (content): Merge conflict in drivers/pci/probe.c
Merging pstore/for-next/pstore (c71b02e4d207 Revert "pstore: Honor dmesg_restrict sysctl on dmesg dumps")
Merging hid/for-next (abbc4db7a3a0 Merge branch 'for-4.14/upstream' into for-next)
Merging i2c/i2c/for-next (264170d12d9c Merge branch 'i2c/for-4.14' into i2c/for-next)
Merging jdelvare-hwmon/master (08d27eb20666 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs)
Merging dmi/master (f996c4155d0d dmi: Mark all struct dmi_system_id instances const)
Merging hwmon-staging/hwmon-next (d5553c26198f hwmon: (aspeed-pwm) add THERMAL dependency)
Merging jc_docs/docs-next (8ac5ac1b0ef8 doc: linux-wpan: Change the old function names to the lastest function names)
Merging v4l-dvb/master (0779b8855c74 media: ddbridge: fix semicolon.cocci warnings)
CONFLICT (content): Merge conflict in arch/arm/configs/imx_v6_v7_defconfig
Merging v4l-dvb-next/master (474dfccf3685 media: svg: avoid too long lines)
Merging fbdev/fbdev-for-next (aa55457d26ec omapfb: constify omap_video_timings structures)
Merging pm/linux-next (1764ef1421da Merge branch 'device-properties' into linux-next)
CONFLICT (content): Merge conflict in drivers/acpi/sbs.c
Merging idle/next (306899f94804 x86 tsc: Add the Intel Denverton Processor to native_calibrate_tsc())
Merging thermal/next (1ef4bd4b47f3 Merge branch 'rockchip-rk3328' into next)
Merging thermal-soc/next (aa647852c32a Merge branch 'work-linus' into work-next)
Merging ieee1394/for-next (72f3c27aa646 firewire: net: max MTU off by one)
Merging dlm/next (1c2428537241 dlm: use sock_create_lite inside tcp_accept_from_sock)
Merging swiotlb/linux-next (69369f52d28a swiotlb-xen: implement xen_swiotlb_get_sgtable callback)
Merging net-next/master (e2a7c34fb285 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
CONFLICT (content): Merge conflict in arch/arm64/boot/dts/rockchip/rk3328.dtsi
CONFLICT (content): Merge conflict in arch/arm64/boot/dts/rockchip/rk3328-evb.dts
Applying: Revert "ARM: dts: rk3228-evb: Enable the integrated PHY for gmac"
Merging ipsec-next/master (077fbac405bf net: xfrm: support setting an output mark.)
Merging netfilter-next/master (6b5dc98e8fac netfilter: rt: add support to fetch path mss)
Merging nfc-next/master (bd751808f9ff NFC: trf7970a: Correct register settings for 27MHz clock)
Merging ipvs-next/master (a910d20aa007 netfilter: ipvs: Fix space before '[' error.)
Merging wireless-drivers-next/master (38ef62353acb rsi: security enhancements for AP mode)
Merging bluetooth/master (f00c2987bfdd ieee802154: ca8210: Fix a potential NULL pointer dereference)
Merging mac80211-next/master (3dc02251f43f Merge branch 'skb-accessor-cleanups')
Merging rdma/for-next (7a077bb0ac77 Merge branch 'k.o/net-next-base' into k.o/for-next-merged)
Merging gfs2/for-next (cc1dfa8b7571 gfs2: fix slab corruption during mounting and umounting gfs file system)
Merging mtd/master (9a51544774a5 mtd: blkdevs: Fix mtd block write failure)
Merging l2-mtd/master (75864b301c91 mtd: make device_type const)
CONFLICT (content): Merge conflict in include/asm-generic/vmlinux.lds.h
Merging nand/nand/next (4c88fd36b3ee Merge remote-tracking branch 'l2-mtd/nand/rename-header-file' into nand/next)
Merging spi-nor/spi-nor/next (18ba7101a4e9 mtd: spi-nor: fix "No newline at end of file")
Merging crypto/master (ef4064bb3f6a crypto: ccp - use dma_mapping_error to check map error)
Merging drm/drm-next (735f463af70e Merge tag 'drm-intel-next-2017-08-18' of git://anongit.freedesktop.org/git/drm-intel into drm-next)
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_display.c
Merging drm-panel/drm/panel/for-next (e4bac408b084 drm/panel: simple: Add support for Winstar WF35LTIACD)
Merging drm-intel/for-linux-next (1853a9daa19e drm/i915/dp: make is_edp non-static and rename to intel_dp_is_edp)
Merging drm-tegra/drm/tegra/for-next (ec73c4cfe7de drm/tegra: Prevent BOs from being freed during job submission)
Merging drm-misc/for-linux-next (0e8841ec7ee5 Merge airlied/drm-next into drm-misc-next)
Merging drm-exynos/exynos-drm/for-next (7d1e04231461 Merge tag 'usercopy-v4.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux)
Merging drm-msm/msm-next (d04fb3394460 drm/msm: Error pointer dereference in error handling)
CONFLICT (content): Merge conflict in drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
Merging hdlcd/for-upstream/hdlcd (fee4964f0a6c drm/arm: hdlcd: remove unused variables)
Merging mali-dp/for-upstream/mali-dp (e46477b8f23c drm: mali-dp: Restore commit_tail behaviour to the runtime_pm friendly version.)
Merging sunxi-drm/sunxi-drm/for-next (7dafb83edd32 Merge branches 'sunxi/drm-fixes-for-4.13' and 'sunxi/drm-for-4.14' into sunxi-drm/for-next)
Merging imx-drm/imx-drm/next (67299c4aa92c drm/imx: parallel-display: use correct connector enum)
Merging etnaviv/etnaviv/next (65375b873cb5 drm/etnaviv: switch GEM allocations to __GFP_RETRY_MAYFAIL)
Merging kconfig/for-next (6f7da290413b Linux 4.12)
Merging regmap/for-next (1874021ec3a8 Merge remote-tracking branches 'regmap/topic/const' and 'regmap/topic/namespace' into regmap-next)
Merging sound/for-next (3a93d082bacf ALSA: firewire-motu: add support for MOTU Audio Express)
Merging sound-asoc/for-next (68f99cc54f3a Merge remote-tracking branch 'asoc/topic/zte' into asoc-next)
Merging modules/modules-next (0bf8bf50eddc module: Remove const attribute from alias for MODULE_DEVICE_TABLE)
Merging input/next (663c8b55d0ab Input: wacom_w8001 - constify serio_device_id)
Merging block/for-next (c3f96cb082dc Merge branch 'for-4.14/block' into for-next)
Merging lightnvm/for-next (1c6286f26301 lightnvm: fix some error code in pblk-init.c)
Merging device-mapper/for-next (2298b6212f1c dm ioctl: constify ioctl lookup table)
Merging pcmcia/master (e8e68fd86d22 pcmcia: do not break rsrc_nonstatic when handling anonymous cards)
Merging mmc/next (36b5c68bd080 mmc: sunxi: Fix clock rate passed to sunxi_mmc_clk_set_phase)
Merging kgdb/kgdb-next (7a6653fca500 kdb: Fix handling of kallsyms_symbol_next() return value)
Merging md/for-next (52c757338a50 md/raid0: attach correct cgroup info in bio)
Merging mfd/for-mfd-next (f7e889c747eb mfd: Add ROHM BD9571MWV-M PMIC DT bindings)
Merging backlight/for-backlight-next (2606706e4d7b backlight: gpio_backlight: Delete pdata inversion)
Merging battery/for-next (810e006a82c9 power: supply: lp8788: Make several arrays static const * const)
Merging omap_dss2/for-next (c456a2f30de5 video: smscufx: remove unused variable)
Merging regulator/for-next (d7c8bc1b1d41 Merge remote-tracking branches 'regulator/topic/mtk' and 'regulator/topic/pwm' into regulator-next)
Merging security/next (08f49ffce052 tpm: ibmvtpm: simplify crq initialization and document crq format)
Merging integrity/next (abf75f56a408 ima: define "fs_unsafe" builtin policy)
Merging keys/keys-next (8ab2a6905b28 KEYS: Add documentation for asymmetric keyring restrictions)
Merging selinux/next (5d72801538eb lsm_audit: update my email address)
Merging tpmdd/next (08f49ffce052 tpm: ibmvtpm: simplify crq initialization and document crq format)
Merging watchdog/master (c013b65ad8a1 watchdog: introduce watchdog_worker_should_ping helper)
Merging iommu/next (b2f34dad02c7 Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas', 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next)
Merging dwmw2-iommu/master (910170442944 iommu/vt-d: Fix PASID table allocation)
Merging vfio/next (f203f7f1dbb2 vfio/type1: Give hardware MSI regions precedence)
Merging trivial/for-next (6fbc8798d946 tty: fix comment for __tty_alloc_driver())
Merging audit/next (4070ad3e8ffa audit: update the function comments)
Merging devicetree/for-next (50f9ddaf64e1 of: search scripts/dtc/include-prefixes path for both CPP and DTC)
Merging mailbox/mailbox-for-next (25bfee16d5a3 mailbox: Introduce Qualcomm APCS IPC driver)
Merging spi/for-next (b3f87d0c1aac Merge remote-tracking branches 'spi/topic/sunxi', 'spi/topic/tegra', 'spi/topic/tools' and 'spi/topic/xlp' into spi-next)
CONFLICT (content): Merge conflict in drivers/spi/spi.c
Merging tip/auto-latest (87507809e0d7 Merge branch 'x86/syscall')
CONFLICT (content): Merge conflict in kernel/sched/fair.c
CONFLICT (content): Merge conflict in drivers/video/fbdev/core/fbmem.c
CONFLICT (content): Merge conflict in drivers/iommu/amd_iommu_types.h
CONFLICT (content): Merge conflict in drivers/iommu/amd_iommu_proto.h
CONFLICT (content): Merge conflict in drivers/iommu/amd_iommu_init.c
CONFLICT (content): Merge conflict in drivers/iommu/amd_iommu.c
CONFLICT (content): Merge conflict in drivers/firmware/efi/libstub/arm64-stub.c
Merging clockevents/clockevents/next (2287d8664fe7 timers: Make the cpu base lock raw)
Merging edac/linux_next (345fb0a9a634 Merge tag 'edac_for_4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp)
Merging edac-amd/for-next (398443471f16 EDAC, mce_amd: Get rid of local var in amd_filter_mce())
Merging irqchip/irqchip/for-next (c1ae3cfa0e89 Linux 4.11-rc1)
Merging ftrace/for-next (a7e52ad7ed82 ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU)
Merging rcu/rcu/next (f42cc1fc1c7e rcutorture: Add interrupt-disable capability to stall-warning tests)
CONFLICT (content): Merge conflict in arch/x86/mm/tlb.c
Merging kvm/linux-next (e08d26f07125 KVM: x86: simplify ept_misconfig)
Merging kvm-arm/next (7e5a672289c9 KVM: arm/arm64: Handle hva aging while destroying the vm)
Merging kvm-mips/next (dc44abd6aad2 KVM: MIPS/Emulate: Properly implement TLBR for T&E)
Merging kvm-ppc/kvm-ppc-next (8b24e69fc47e KVM: PPC: Book3S HV: Close race with testing for signals on guest entry)
Merging kvms390/next (5091d7c8ca2d KVM: s390: Multiple Epoch Facility support)
Merging xen-tip/linux-next (bd830917233b paravirt,xen: remove xen_patch())
CONFLICT (content): Merge conflict in arch/x86/entry/entry_64.S
Merging percpu/for-next (5e81ee3e6a79 percpu: update header to contain bitmap allocator explanation.)
Merging workqueues/for-next (1fce1c2b4f51 Merge branch 'for-4.14' into for-next)
Merging drivers-x86/for-next (c801603e6d0c platform/x86: intel-vbtn: match power button on press rather than release)
Merging chrome-platform/for-next (69237e85ece7 platform/chrome: cros_ec: register shutdown function for debugfs)
Merging hsi/for-next (67ddd75771b6 HSI: core: Use kcalloc() in two functions)
Merging leds/for-next (5384fe0e7751 leds: lm3533: constify attribute_group structure)
Merging ipmi/for-next (1d51142219c3 ipmi: fix unsigned long underflow)
Merging driver-core/driver-core-next (f75f6ff2eaa6 Merge 4.13-rc5 into driver-core-next)
Merging usb/usb-next (a85c0f8db332 xhci: rework bus_resume and check ports are suspended before resuming them.)
Merging usb-gadget/next (0852659ef071 usb: gadget: f_ncm/u_ether: Move 'SKB reserve' quirk setup to u_ether)
Merging usb-serial/usb-next (5771a8c08880 Linux v4.13-rc1)
Merging usb-chipidea-next/ci-for-usb-next (f86c9d735b49 usb: chipidea: Add support for Tegra20/30/114/124)
Merging phy-next/next (d9c51f4c53ae phy: brcm-sata: fix a timeout test in init)
Merging tty/tty-next (cf0a1579dda4 Merge 4.13-rc5 into tty-next)
Merging char-misc/char-misc-next (6f3d791f3006 Drivers: hv: vmbus: Fix rescind handling issues)
Merging extcon/extcon-next (ab8a8fbe939d extcon: Use tab instead of space for indentation)
Merging staging/staging-next (a575de2da5e2 staging: rtlwifi: Reviewers fixes)
CONFLICT (content): Merge conflict in tools/Makefile
Merging mux/for-next (be9bac020cac i2c: mux: pinctrl: potential NULL dereference on error)
Merging slave-dma/next (0a0ab6497093 Merge branch 'topic/dmatest' into next)
$ git reset --hard HEAD^
Merging next-20170817 version of slave-dma
Merging cgroup/for-next (7358d47d4776 Merge branch 'for-4.14' into for-next)
Merging scsi/for-next (5fff7d3d4437 Merge branch 'fixes' into for-next)
Merging scsi-mkp/for-next (3dd58655f9c3 scsi: aha1542: constify pnp_device_id)
Merging target-updates/for-next (04229774f692 tcmu: Oops in unmap_thread_fn())
Merging target-merge/for-next-merge (2994a7518317 cxgb4: update Kconfig and Makefile)
Merging target-bva/for-next (2ea659a9ef48 Linux 4.12-rc1)
Merging libata/for-next (b82c949dbea2 Merge branch 'for-4.13-fixes' into for-next)
Merging binfmt_misc/for-next (4af75df6a410 binfmt_misc: add F option description to documentation)
Merging vhost/linux-next (c9379b24d6fc virtio_blk: fix incorrect message when disk is resized)
Merging rpmsg/for-next (99ec27115e39 Merge branches 'hwspinlock-next', 'rpmsg-next' and 'rproc-next' into for-next)
CONFLICT (content): Merge conflict in drivers/soc/qcom/Kconfig
Merging gpio/for-next (592338359064 Merge branch 'devel' into for-next)
Merging pinctrl/for-next (06351d133dea pinctrl: add a Gemini SoC pin controller)
Merging pinctrl-samsung/for-next (bbed85f45b2b pinctrl: samsung: Remove unneeded local variable initialization)
Merging pwm/for-next (7755daf5e7e8 Merge branch 'for-4.14/drivers' into for-next)
Merging dma-buf/for-next (194cad44c4e1 dma-buf/sync_file: improve Kconfig description for Sync Files)
CONFLICT (content): Merge conflict in drivers/dma-buf/Kconfig
Merging userns/for-next (64a76d0d64be signal: Fix sending signals with siginfo)
CONFLICT (content): Merge conflict in arch/mips/kernel/traps.c
Merging ktest/for-next (f7c6401ff84a ktest: Make sure wait_for_input does honor the timeout)
Merging random/dev (72e5c740f633 random: reorder READ_ONCE() in get_random_uXX)
Merging aio/master (4c1d69950da4 fs: aio: fix the increment of aio-nr and counting against aio-max-nr)
Merging kselftest/next (4db26f9e642b rtc: rtctest: Improve support detection)
Merging y2038/y2038 (69973b830859 Linux 4.9)
Merging luto-misc/next (2dcd0af568b0 Linux 4.6)
Merging borntraeger/linux-next (e76d21c40bd6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging livepatching/for-next (26d8d1e9bb48 Merge branch 'for-4.12/upstream-fixes' into for-next)
Merging coresight/next (6d34a5f3806d coresight: STM: Clean up __iomem type usage)
Merging rtc/rtc-next (2f7f1b780dd6 rtc: max8925: remove redundant check on ret)
Merging hwspinlock/for-next (bd5717a4632c hwspinlock: qcom: Correct msb in regmap_field)
Merging nvdimm/libnvdimm-for-next (bbb3be170ac2 device-dax: fix sysfs duplicate warnings)
Merging dax-misc/dax-misc (4d9a2c874667 dax: Remove i_mmap_lock protection)
Merging idr/idr-4.11 (f0f3f2d0a3e0 radix tree test suite: Specify -m32 in LDFLAGS too)
Merging kspp/for-next/kspp (cb624a11ecd5 Merge branch 'for-next/secureexec' into for-next/kspp)
Merging akpm-current/current (8cffa2081e42 ipc: optimize semget/shmget/msgget for lots of keys)
CONFLICT (content): Merge conflict in init/main.c
CONFLICT (content): Merge conflict in arch/parisc/include/uapi/asm/mman.h
$ git checkout -b akpm remotes/origin/akpm/master
Applying: fscache: fix fscache_objlist_show format processing
Applying: IB/mlx4: fix sprintf format warning
Applying: iopoll: avoid -Wint-in-bool-context warning
Applying: select: use get/put_timespec64
Applying: io_getevents: use timespec64 to represent timeouts
Applying: sparc64: NG4 memset 32 bits overflow
Applying: mm: treewide: remove GFP_TEMPORARY allocation flag
Applying: treewide-remove-gfp_temporary-allocation-flag-fix
Applying: treewide-remove-gfp_temporary-allocation-flag-checkpatch-fixes
Applying: drm/i915: fix up for mm: treewide: remove GFP_TEMPORARY allocation flag
Applying: lib/crc-ccitt: add CCITT-FALSE CRC16 variant
Merging akpm/master (a5e970ec039e lib/crc-ccitt: add CCITT-FALSE CRC16 variant)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-22  8:38 linux-next: Tree for Aug 22 Stephen Rothwell
@ 2017-08-22 10:47 ` Sergey Senozhatsky
  2017-08-22 21:43     ` Bart Van Assche
  2017-08-22 18:11 ` linux-next: Tree for Aug 22 Stephen Rothwell
  1 sibling, 1 reply; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-22 10:47 UTC (permalink / raw)
  To: linux-block, linux-scsi
  Cc: Jens Axboe, Martin K. Petersen, Stephen Rothwell,
	Linux-Next Mailing List, Linux Kernel Mailing List

Hello,

======================================================
WARNING: possible circular locking dependency detected
4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
------------------------------------------------------
fsck.ext4/148 is trying to acquire lock:
 (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190

 but now in release context of a crosslock acquired at the following:
 ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 ((complete)&wait#2){+.+.}:
       lock_acquire+0x176/0x19e
       __wait_for_common+0x50/0x1e3
       blk_execute_rq+0xbb/0xda
       scsi_execute+0xc3/0x17d [scsi_mod]
       sd_revalidate_disk+0x112/0x1549 [sd_mod]
       rescan_partitions+0x48/0x2c4
       __blkdev_get+0x14b/0x37c
       blkdev_get+0x191/0x2c0
       device_add_disk+0x2b4/0x3e5
       sd_probe_async+0xf8/0x17e [sd_mod]
       async_run_entry_fn+0x34/0xe0
       process_one_work+0x2af/0x4d1
       worker_thread+0x19a/0x24f
       kthread+0x133/0x13b
       ret_from_fork+0x27/0x40

 -> #0 (&bdev->bd_mutex){+.+.}:
       __blkdev_put+0x33/0x190
       blkdev_close+0x24/0x27
       __fput+0xee/0x18a
       task_work_run+0x79/0xa0
       prepare_exit_to_usermode+0x9b/0xb5

 other info that might help us debug this:
 Possible unsafe locking scenario by crosslock:
       CPU0                    CPU1
       ----                    ----
  lock(&bdev->bd_mutex);
  lock((complete)&wait#2);
                               lock(&bdev->bd_mutex);
                               unlock((complete)&wait#2);

  *** DEADLOCK ***
4 locks held by fsck.ext4/148:
 #0:  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
 #1:  (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
 #2:  (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
 #3:  (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50

 stack backtrace:
CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
Call Trace:
 dump_stack+0x67/0x8e
 print_circular_bug+0x2a1/0x2af
 ? zap_class+0xc5/0xc5
 check_prev_add+0x76/0x20d
 ? __lock_acquire+0xc27/0xcc8
 lock_commit_crosslock+0x327/0x35e
 complete+0x24/0x50
 scsi_end_request+0x8d/0x176 [scsi_mod]
 scsi_io_completion+0x1be/0x423 [scsi_mod]
 __blk_mq_complete_request+0x112/0x131
 ata_scsi_simulate+0x212/0x218 [libata]
 __ata_scsi_queuecmd+0x1be/0x1de [libata]
 ata_scsi_queuecmd+0x41/0x74 [libata]
 scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
 scsi_queue_rq+0x1e0/0x26f [scsi_mod]
 blk_mq_dispatch_rq_list+0x193/0x2a7
 ? _raw_spin_unlock+0x2e/0x40
 blk_mq_sched_dispatch_requests+0x132/0x176
 __blk_mq_run_hw_queue+0x59/0xc5
 __blk_mq_delay_run_hw_queue+0x5f/0xc1
 blk_mq_flush_plug_list+0xfc/0x10b
 blk_flush_plug_list+0xc6/0x1eb
 blk_finish_plug+0x25/0x32
 generic_writepages+0x56/0x63
 do_writepages+0x36/0x70
 __filemap_fdatawrite_range+0x59/0x5f
 filemap_write_and_wait+0x19/0x4f
 __blkdev_put+0x5f/0x190
 blkdev_close+0x24/0x27
 __fput+0xee/0x18a
 task_work_run+0x79/0xa0
 prepare_exit_to_usermode+0x9b/0xb5
 entry_SYSCALL_64_fastpath+0xab/0xad
RIP: 0033:0x7ff5755a2f74
RSP: 002b:00007ffe46fce038 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000555ddeddded0 RCX: 00007ff5755a2f74
RDX: 0000000000001000 RSI: 0000555ddede2580 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000555ddede2580 R09: 0000555ddedde080
R10: 0000000108000000 R11: 0000000000000246 R12: 0000555ddedddfa0
R13: 00007ff576523680 R14: 0000000000001000 R15: 0000555ddeddc2b0

	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22  8:38 linux-next: Tree for Aug 22 Stephen Rothwell
  2017-08-22 10:47 ` possible circular locking dependency detected [was: linux-next: Tree for Aug 22] Sergey Senozhatsky
@ 2017-08-22 18:11 ` Stephen Rothwell
  2017-08-22 18:14   ` Stephen Rothwell
  1 sibling, 1 reply; 49+ messages in thread
From: Stephen Rothwell @ 2017-08-22 18:11 UTC (permalink / raw)
  To: Linux-Next Mailing List
  Cc: Linux Kernel Mailing List, ppc-dev, Paul E. McKenney, Nicholas Piggin

Hi all,

This tree fails to boot on my qemu test.  2 boot logs attached.

Paul, Nick, is this the same/similar to the other RCU/lockup bug you
are chasing.  This is the first time I have seen this failure.

This qemu boot is in full emulation mode if I add --enable-kvm to the
qemu command, it does not fail to boot.  (the test just boots and then shuts down)
-- 
Cheers,
Stephen Rothwell

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22 18:11 ` linux-next: Tree for Aug 22 Stephen Rothwell
@ 2017-08-22 18:14   ` Stephen Rothwell
  2017-08-22 18:59       ` Paul E. McKenney
  0 siblings, 1 reply; 49+ messages in thread
From: Stephen Rothwell @ 2017-08-22 18:14 UTC (permalink / raw)
  To: Linux-Next Mailing List
  Cc: Linux Kernel Mailing List, ppc-dev, Paul E. McKenney, Nicholas Piggin

[-- Attachment #1: Type: text/plain, Size: 518 bytes --]

Hi all,

On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> This tree fails to boot on my qemu test.  2 boot logs attached.
> 
> Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> are chasing.  This is the first time I have seen this failure.
> 
> This qemu boot is in full emulation mode if I add --enable-kvm to the
> qemu command, it does not fail to boot.  (the test just boots and then shuts down)

Boot logs attached this time.
-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: bad-log-1 --]
[-- Type: text/plain, Size: 15670 bytes --]

spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /home/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.cpio.gz


SLOF^[[0m^[[?25l **********************************************************************
^[[1mQEMU Starting
^[[0m Build Date = Jan  3 2017 22:22:01
 FW Version = buildd@ release 20161019
 Press "s" to enter Open Firmware.

^[[0m^[[?25hC0000
C0100
C0120
C0140
C0200
C0201
C0220
C0240
C0260
C02E0
C0300
C0320
C0340
C0360
C0370
C0380
C0371
C0372
C0373
C0374
C03F0
C0400
C0480
C04C0
C04D0
C0500
Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /vdevice/l-lan@71000002
Populating /vdevice/v-scsi@71000003
       SCSI: Looking for devices
          8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
C0580
C05A0
Populating /pci@800000020000000
C0600
C0640
C0690
C06A0
C06A8
C06B0
C06B8
C06C0
C06E0
C0700
C0800
C0880
No NVRAM common partition, re-initializing...
C0890
C08A0
C08A8
C08B0
Scanning USB 
C08C0
C08D0
Using default console: /vdevice/vty@71000000
C08E0
C08E8
Detected RAM kernel at 400000 (1071618 bytes) C08FF
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
Detected machine type: 0000000000000101
command line: 
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000001490000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000080000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000080000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
Device tree struct  0x00000000016b0000 -> 0x00000000016c0000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000400000 ...
Page sizes from device-tree:
base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
Using 1TB segments
Initializing hash mmu with SLB
Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
Found initrd at 0xc000000001490000:0xc00000000165d70b
Using pSeries machine description
bootconsole [udbg0] enabled
Partition configured for 1 cpus.
CPU maps initialized for 1 thread per core
 -> smp_release_cpus()
spinning_secondaries = 0
 <- smp_release_cpus()
-----------------------------------------------------
ppc64_pft_size    = 0x18
phys_mem_size     = 0x80000000
dcache_bsize      = 0x80
icache_bsize      = 0x80
cpu_features      = 0x077c7a6c18500249
  possible        = 0x5fffffff18500649
  always          = 0x0000000018100040
cpu_user_features = 0xdc0065c2 0xae000000
mmu_features      = 0x7c006001
firmware_features = 0x00000001405a445f
htab_hash_mask    = 0x1ffff
-----------------------------------------------------
numa:   NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
PCI host bridge /pci@800000020000000  ranges:
  IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
 MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000 
 MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000 
PPC64 nvram contains 65536 bytes
Zone ranges:
  DMA      [mem 0x0000000000000000-0x000000007fffffff]
  DMA32    empty
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000000000-0x000000007fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
Built 1 zonelists, mobility grouping on.  Total pages: 32736
Policy zone: DMA
Kernel command line: 
PID hash table entries: 4096 (order: -1, 32768 bytes)
Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 2752K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Hierarchical RCU implementation.
	RCU event tracing is enabled.
	RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
clocksource: timebase mult[1f40000] shift[24] registered
Console: colour dummy device 80x25
console [hvc0] enabled
console [hvc0] enabled
bootconsole [udbg0] disabled
bootconsole [udbg0] disabled
pid_max: default: 32768 minimum: 301
Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
EEH: pSeries platform initialized
POWER8 performance monitor hardware support registered
Hierarchical SRCU implementation.
smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 1 CPU
numa: Node 0 CPUs: 0
devtmpfs: initialized
random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
futex hash table entries: 256 (order: -1, 32768 bytes)
NET: Registered protocol family 16
EEH: No capable adapters found
cpuidle: using governor menu
kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
random: fast init done
kworker/u2:0 (17) used greatest stack depth: 12352 bytes left
pstore: using zlib compression
pstore: Registered nvram as persistent store backend

Linux ppc64le
#2 SMP Tue Aug 2PCI: Probing PCI hardware
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x0000-0xffff])
pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
pci_bus 0000:00: root bus resource [bus 00-ff]
IOMMU table initialized, virtual merging enabled
HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
PTP clock support registered
clocksource: Switched to clocksource timebase
NET: Registered protocol family 2
TCP established hash table entries: 16384 (order: 1, 131072 bytes)
TCP bind hash table entries: 16384 (order: 2, 262144 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
UDP hash table entries: 2048 (order: 0, 65536 bytes)
UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Trying to unpack rootfs image as initramfs...
Freeing initrd memory: 1792K
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(1503390342.630:1): state=initialized audit_enabled=0 res=1
workingset: timestamp_bits=38 max_order=15 bucket_order=0
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
io scheduler mq-deadline registered
io scheduler kyber registered
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
brd: module loaded
loop: module loaded
ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
ibmvscsi 71000003: SRP_VERSION: 16.a
ibmvscsi 71000003: Maximum ID: 64 Maximum LUN: 32 Maximum Channel: 3
scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
ibmvscsi 71000003: partner initialization complete
ibmvscsi 71000003: host srp version: 16.a, host partition qemu (0), OS 2, max io 2097152
ibmvscsi 71000003: sent SRP login
ibmvscsi 71000003: SRP_LOGIN succeeded
scsi 0:0:2:0: CD-ROM            QEMU     QEMU CD-ROM      2.5+ PQ: 0 ANSI: 5
sr 0:0:2:0: [sr0] scsi3-mmc drive: 16x/50x cd/rw xa/form2 cdda tray
cdrom: Uniform CD-ROM driver Revision: 3.20
sr 0:0:2:0: Attached scsi generic sg0 type 5
libphy: Fixed MDIO Bus: probed
e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
e1000: Copyright (c) 1999-2006 Intel Corporation.
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-pci: OHCI PCI platform driver
rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
IR NEC protocol handler initialized
IR RC5(x/sz) protocol handler initialized
IR RC6 protocol handler initialized
IR JVC protocol handler initialized
IR Sony protocol handler initialized
IR SANYO protocol handler initialized
IR Sharp protocol handler initialized
IR MCE Keyboard/mouse protocol handler initialized
IR XMP protocol handler initialized
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.36.0-ioctl (2017-06-09) initialised: dm-devel@redhat.com
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
ipip: IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
Key type dns_resolver registered
registered taskstats version 1
console [netcon0] enabled
netconsole: network logging started
rtc-generic rtc-generic: setting system clock to 2017-08-22 08:25:43 UTC (1503390343)
Freeing unused kernel memory: 896K
This architecture does not have kernel memory protection.
INFO: rcu_sched self-detected stall on CPU
	0-...: (2100 ticks this GP) idle=026/140000000000001/0 softirq=1069/1069 fqs=0 
	 (t=2100 jiffies g=-66 c=-67 q=17)
rcu_sched kthread starved for 2100 jiffies! g18446744073709551550 c18446744073709551549 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
rcu_sched       R  running task    14192     8      2 0x00000800
Call Trace:
[c00000007e65f8d0] [c00000007e65f900] 0xc00000007e65f900 (unreliable)
[c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
[c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
[c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
[c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
[c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
[c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
[c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
NMI backtrace for cpu 0
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
Call Trace:
[c00000007e65b0e0] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
[c00000007e65b120] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
[c00000007e65b1b0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0/0x220
[c00000007e65b240] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20/0x40
[c00000007e65b260] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
[c00000007e65b2b0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
[c00000007e65b3e0] [c00000000017f34c] update_process_times+0x3c/0x90
[c00000007e65b410] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
[c00000007e65b440] [c000000000195cf8] tick_sched_timer+0x58/0xd0
[c00000007e65b480] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
[c00000007e65b500] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
[c00000007e65b5d0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
[c00000007e65b620] [c0000000000238c0] timer_interrupt+0xa0/0xe0
[c00000007e65b650] [c0000000000091c0] decrementer_common+0x150/0x160
--- interrupt: 901 at .L1\x0242+0x0/0x4
    LR = arch_local_irq_restore+0x74/0x90
[c00000007e65b940] [fed0895fbd054278] 0xfed0895fbd054278 (unreliable)
[c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
[c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
[c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
[c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
[c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
[c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
[c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
[c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
[c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
[c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
[c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
Modules linked in:
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
task: c00000007e62e100 task.stack: c00000007e658000
NIP: c00000000000ad4c LR: c000000000015ae4 CTR: c00000000002d4f0
REGS: c00000007e65b6c0 TRAP: 0901   Not tainted  (4.13.0-rc6)
MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
  CR: 24000244  XER: 20000000
CFAR: c000000000334910 SOFTE: 1 
GPR00: c00000000002c688 c00000007e65b940 c000000000ea5a00 0000000000000900 
GPR04: 0000000000000001 000000007f0b0000 000000055f52d5ae 0000000000000000 
GPR08: c00000000fd40000 0000000000400000 0000000000400000 0000000000000000 
GPR12: 0000000028000222 c00000000fd40000 
NIP [c00000000000ad4c] .L1\x0242+0x0/0x4
LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
Call Trace:
[c00000007e65b940] [98a11310c2925282] 0x98a11310c2925282 (unreliable)
[c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
[c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
[c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
[c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
[c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
[c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
[c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
[c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
[c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
[c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
[c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
Instruction dump:
7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0 
60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349 
timeout waiting for login

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

[-- Attachment #3: bad-log-2 --]
[-- Type: text/plain, Size: 13810 bytes --]

spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /home/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.cpio.gz


SLOF^[[0m^[[?25l **********************************************************************
^[[1mQEMU Starting
^[[0m Build Date = Jan  3 2017 22:22:01
 FW Version = buildd@ release 20161019
 Press "s" to enter Open Firmware.

^[[0m^[[?25hC0000
C0100
C0120
C0140
C0200
C0201
C0220
C0240
C0260
C02E0
C0300
C0320
C0340
C0360
C0370
C0380
C0371
C0372
C0373
C0374
C03F0
C0400
C0480
C04C0
C04D0
C0500
Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /vdevice/l-lan@71000002
Populating /vdevice/v-scsi@71000003
       SCSI: Looking for devices
          8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
C0580
C05A0
Populating /pci@800000020000000
C0600
C0640
C0690
C06A0
C06A8
C06B0
C06B8
C06C0
C06E0
C0700
C0800
C0880
No NVRAM common partition, re-initializing...
C0890
C08A0
C08A8
C08B0
Scanning USB 
C08C0
C08D0
Using default console: /vdevice/vty@71000000
C08E0
C08E8
Detected RAM kernel at 400000 (1071618 bytes) C08FF
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
Detected machine type: 0000000000000101
command line: 
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000001490000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000080000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000080000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
Device tree struct  0x00000000016b0000 -> 0x00000000016c0000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000400000 ...
Page sizes from device-tree:
base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
Using 1TB segments
Initializing hash mmu with SLB
Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
Found initrd at 0xc000000001490000:0xc00000000165d70b
Using pSeries machine description
bootconsole [udbg0] enabled
Partition configured for 1 cpus.
CPU maps initialized for 1 thread per core
 -> smp_release_cpus()
spinning_secondaries = 0
 <- smp_release_cpus()
-----------------------------------------------------
ppc64_pft_size    = 0x18
phys_mem_size     = 0x80000000
dcache_bsize      = 0x80
icache_bsize      = 0x80
cpu_features      = 0x077c7a6c18500249
  possible        = 0x5fffffff18500649
  always          = 0x0000000018100040
cpu_user_features = 0xdc0065c2 0xae000000
mmu_features      = 0x7c006001
firmware_features = 0x00000001405a445f
htab_hash_mask    = 0x1ffff
-----------------------------------------------------
numa:   NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
PCI host bridge /pci@800000020000000  ranges:
  IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
 MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000 
 MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000 
PPC64 nvram contains 65536 bytes
Zone ranges:
  DMA      [mem 0x0000000000000000-0x000000007fffffff]
  DMA32    empty
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000000000-0x000000007fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
Built 1 zonelists, mobility grouping on.  Total pages: 32736
Policy zone: DMA
Kernel command line: 
PID hash table entries: 4096 (order: -1, 32768 bytes)
Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 2752K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Hierarchical RCU implementation.
	RCU event tracing is enabled.
	RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
clocksource: timebase mult[1f40000] shift[24] registered
Console: colour dummy device 80x25
console [hvc0] enabled
console [hvc0] enabled
bootconsole [udbg0] disabled
bootconsole [udbg0] disabled
pid_max: default: 32768 minimum: 301
Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
EEH: pSeries platform initialized
POWER8 performance monitor hardware support registered
Hierarchical SRCU implementation.
smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 1 CPU
numa: Node 0 CPUs: 0
devtmpfs: initialized
random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
futex hash table entries: 256 (order: -1, 32768 bytes)
NET: Registered protocol family 16
EEH: No capable adapters found
cpuidle: using governor menu
kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
random: fast init done
kworker/u2:0 (17) used greatest stack depth: 12192 bytes left
pstore: using zlib compression
pstore: Registered nvram as persistent store backend

Linux ppc64le
#2 SMP Tue Aug 2PCI: Probing PCI hardware
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x0000-0xffff])
pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
pci_bus 0000:00: root bus resource [bus 00-ff]
IOMMU table initialized, virtual merging enabled
HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
PTP clock support registered
clocksource: Switched to clocksource timebase
Watchdog CPU:0 Hard LOCKUP
Modules linked in:
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
task: c00000007e62e100 task.stack: c00000007e658000
NIP: c00000000017bb68 LR: c00000000017bb68 CTR: c000000000106330
REGS: c00000003ffefd80 TRAP: 0900   Not tainted  (4.13.0-rc6)
MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
  CR: 24000802  XER: 00000000
CFAR: c0000000009dbc34 SOFTE: 0 
GPR00: c00000000017bb68 c00000007e65b9b0 c000000000ea5a00 0000000000000000 
GPR04: c00000007e65ba30 c00000007e160000 8000000000000000 c00000007e65bbe8 
GPR08: 0000000000000004 0000000000000000 0000000080000000 c00000007fe0fea8 
GPR12: c000000000106330 c00000000fd40000 c0000000001128b8 c00000007e150180 
GPR16: 0000000000000100 0000000004208040 c00000007e658000 0000000000000000 
GPR20: c000000000d74f00 c000000000ed3b00 00000000ffff8af5 000000000000000a 
GPR24: c000000000d74f00 c000000000d5fe80 c000000000eddbf8 c00000007e65ba30 
GPR28: c000000000d5ee00 c00000007e160048 c00000007fe0fe80 000000007fc80000 
NIP [c00000000017bb68] lock_timer_base+0x98/0xf0
LR [c00000000017bb68] lock_timer_base+0x98/0xf0
Call Trace:
[c00000007e65b9b0] [c00000000017bb68] lock_timer_base+0x98/0xf0 (unreliable)
[c00000007e65ba10] [c00000000017ed9c] mod_timer+0x2fc/0x350
[c00000007e65ba80] [c000000000106468] idle_worker_timeout+0x138/0x190
[c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
[c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
[c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
[c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
[c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
[c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
[c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
[c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
Instruction dump:
7be91ae8 4082ffec 7d5a482a 7be96fe3 7fdc5214 893e0025 2f890000 419e000c 
41820008 7fcaca14 7fc3f378 48860065 <60000000> f87b0000 7c641b78 7fc3f378 
INFO: rcu_sched self-detected stall on CPU
	0-...: (2100 ticks this GP) idle=002/140000000000001/0 softirq=183/183 fqs=0 
	 (t=2100 jiffies g=-278 c=-279 q=136)
rcu_sched kthread starved for 2100 jiffies! g18446744073709551338 c18446744073709551337 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
rcu_sched       R  running task    13216     8      2 0x00000800
Call Trace:
[c00000007e65f8d0] [c00000007e65fbd0] 0xc00000007e65fbd0 (unreliable)
[c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
[c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
[c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
[c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
[c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
[c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
[c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
NMI backtrace for cpu 0
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
Call Trace:
[c00000007e65b200] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
[c00000007e65b240] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
[c00000007e65b2d0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0/0x220
[c00000007e65b360] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20/0x40
[c00000007e65b380] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
[c00000007e65b3d0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
[c00000007e65b500] [c00000000017f34c] update_process_times+0x3c/0x90
[c00000007e65b530] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
[c00000007e65b560] [c000000000195cf8] tick_sched_timer+0x58/0xd0
[c00000007e65b5a0] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
[c00000007e65b620] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
[c00000007e65b6f0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
[c00000007e65b740] [c0000000000238c0] timer_interrupt+0xa0/0xe0
[c00000007e65b770] [c0000000000091c0] decrementer_common+0x150/0x160
--- interrupt: 901 at .L1\x0242+0x0/0x4
    LR = arch_local_irq_restore+0x74/0x90
[c00000007e65ba60] [c000000000ed3b00] jiffies+0x0/0x80 (unreliable)
[c00000007e65ba80] [c000000000106434] idle_worker_timeout+0x104/0x190
[c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
[c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
[c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
[c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
[c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
[c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
[c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
[c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
Modules linked in:
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
task: c00000007e62e100 task.stack: c00000007e658000
NIP: c00000000000ad4c LR: c000000000015ae4 CTR: 0000000000000001
REGS: c00000007e65b8b0 TRAP: 0901   Not tainted  (4.13.0-rc6)
MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
  CR: 24000804  XER: 00000000
CFAR: c00000000017dbdc SOFTE: 1 
GPR00: c00000000017e0ec c00000007e65bb30 c000000000ea5a00 0000000000000900 
GPR04: 0000000000000001 c00000007e160000 8000000000000000 0000000000000000 
GPR08: c00000000fd40000 0000000000000012 0000000000000000 c00000007fe0fea8 
GPR12: c000000000106330 c00000000fd40000 
NIP [c00000000000ad4c] .L1\x0242+0x0/0x4
LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
Call Trace:
[c00000007e65bb30] [c000000000d74f00] irq_stat+0x0/0x80 (unreliable)
[c00000007e65bb50] [c00000000017e0ec] expire_timers+0x12c/0x1e0
[c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
[c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
[c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
[c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
[c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
[c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
Instruction dump:
7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0 
60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349 
timeout waiting for login

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22 18:14   ` Stephen Rothwell
@ 2017-08-22 18:59       ` Paul E. McKenney
  0 siblings, 0 replies; 49+ messages in thread
From: Paul E. McKenney @ 2017-08-22 18:59 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, ppc-dev,
	Nicholas Piggin

On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >
> > This tree fails to boot on my qemu test.  2 boot logs attached.
> > 
> > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > are chasing.  This is the first time I have seen this failure.
> > 
> > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > qemu command, it does not fail to boot.  (the test just boots and then shuts down)
> 
> Boot logs attached this time.

That does not look good!

Given that the hard lockup happened during timer lock acquisition, I
have to ask if you built with lockdep...

							Thanx, Paul

> -- 
> Cheers,
> Stephen Rothwell

> spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /home/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.cpio.gz
> 
> 
> SLOF^[[0m^[[?25l **********************************************************************
> ^[[1mQEMU Starting
> ^[[0m Build Date = Jan  3 2017 22:22:01
>  FW Version = buildd@ release 20161019
>  Press "s" to enter Open Firmware.
> 
> ^[[0m^[[?25hC0000C0100C0120C0140C0200C0201C0220C0240C0260C02E0C0300C0320C0340C0360C0370C0380C0371C0372C0373C0374C03F0C0400C0480C04C0C04D0C0500Populating /vdevice methods
> Populating /vdevice/vty@71000000
> Populating /vdevice/nvram@71000001
> Populating /vdevice/l-lan@71000002
> Populating /vdevice/v-scsi@71000003
>        SCSI: Looking for devices
>           8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
> C0580C05A0Populating /pci@800000020000000
> C0600C0640C0690C06A0C06A8C06B0C06B8C06C0C06E0C0700C0800C0880No NVRAM common partition, re-initializing...
> C0890C08A0C08A8C08B0Scanning USB 
> C08C0C08D0Using default console: /vdevice/vty@71000000
> C08E0C08E8Detected RAM kernel at 400000 (1071618 bytes) C08FF     
>   Welcome to Open Firmware
> 
>   Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
>   This program and the accompanying materials are made available
>   under the terms of the BSD License available at
>   http://www.opensource.org/licenses/bsd-license.php
> 
> Booting from memory...
> OF stdout device is: /vdevice/vty@71000000
> Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Detected machine type: 0000000000000101
> command line: 
> Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
> Calling ibm,client-architecture-support... done
> memory layout at init:
>   memory_limit : 0000000000000000 (16 MB aligned)
>   alloc_bottom : 0000000001490000
>   alloc_top    : 0000000030000000
>   alloc_top_hi : 0000000080000000
>   rmo_top      : 0000000030000000
>   ram_top      : 0000000080000000
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
> Device tree struct  0x00000000016b0000 -> 0x00000000016c0000
> Quiescing Open Firmware ...
> Booting Linux via __start() @ 0x0000000000400000 ...
> Page sizes from device-tree:
> base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
> base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
> base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
> base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
> base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
> base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
> base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
> Using 1TB segments
> Initializing hash mmu with SLB
> Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Found initrd at 0xc000000001490000:0xc00000000165d70b
> Using pSeries machine description
> bootconsole [udbg0] enabled
> Partition configured for 1 cpus.
> CPU maps initialized for 1 thread per core
>  -> smp_release_cpus()
> spinning_secondaries = 0
>  <- smp_release_cpus()
> -----------------------------------------------------
> ppc64_pft_size    = 0x18
> phys_mem_size     = 0x80000000
> dcache_bsize      = 0x80
> icache_bsize      = 0x80
> cpu_features      = 0x077c7a6c18500249
>   possible        = 0x5fffffff18500649
>   always          = 0x0000000018100040
> cpu_user_features = 0xdc0065c2 0xae000000
> mmu_features      = 0x7c006001
> firmware_features = 0x00000001405a445f
> htab_hash_mask    = 0x1ffff
> -----------------------------------------------------
> numa:   NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
> PCI host bridge /pci@800000020000000  ranges:
>   IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
>  MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000 
>  MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000 
> PPC64 nvram contains 65536 bytes
> Zone ranges:
>   DMA      [mem 0x0000000000000000-0x000000007fffffff]
>   DMA32    empty
>   Normal   empty
> Movable zone start for each node
> Early memory node ranges
>   node   0: [mem 0x0000000000000000-0x000000007fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
> percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
> Built 1 zonelists, mobility grouping on.  Total pages: 32736
> Policy zone: DMA
> Kernel command line: 
> PID hash table entries: 4096 (order: -1, 32768 bytes)
> Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 2752K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> Hierarchical RCU implementation.
> 	RCU event tracing is enabled.
> 	RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
> RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
> clocksource: timebase mult[1f40000] shift[24] registered
> Console: colour dummy device 80x25
> console [hvc0] enabled
> console [hvc0] enabled
> bootconsole [udbg0] disabled
> bootconsole [udbg0] disabled
> pid_max: default: 32768 minimum: 301
> Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
> Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
> Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
> EEH: pSeries platform initialized
> POWER8 performance monitor hardware support registered
> Hierarchical SRCU implementation.
> smp: Bringing up secondary CPUs ...
> smp: Brought up 1 node, 1 CPU
> numa: Node 0 CPUs: 0
> devtmpfs: initialized
> random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with crng_init=0
> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
> futex hash table entries: 256 (order: -1, 32768 bytes)
> NET: Registered protocol family 16
> EEH: No capable adapters found
> cpuidle: using governor menu
> kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
> kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
> random: fast init done
> kworker/u2:0 (17) used greatest stack depth: 12352 bytes left
> pstore: using zlib compression
> pstore: Registered nvram as persistent store backend
> Linux ppc64le
> #2 SMP Tue Aug 2PCI: Probing PCI hardware
> PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x0000-0xffff])
> pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
> pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
> pci_bus 0000:00: root bus resource [bus 00-ff]
> IOMMU table initialized, virtual merging enabled
> HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> vgaarb: loaded
> SCSI subsystem initialized
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> pps_core: LinuxPPS API ver. 1 registered
> pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
> PTP clock support registered
> clocksource: Switched to clocksource timebase
> NET: Registered protocol family 2
> TCP established hash table entries: 16384 (order: 1, 131072 bytes)
> TCP bind hash table entries: 16384 (order: 2, 262144 bytes)
> TCP: Hash tables configured (established 16384 bind 16384)
> UDP hash table entries: 2048 (order: 0, 65536 bytes)
> UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
> NET: Registered protocol family 1
> RPC: Registered named UNIX socket transport module.
> RPC: Registered udp transport module.
> RPC: Registered tcp transport module.
> RPC: Registered tcp NFSv4.1 backchannel transport module.
> Trying to unpack rootfs image as initramfs...
> Freeing initrd memory: 1792K
> audit: initializing netlink subsys (disabled)
> audit: type=2000 audit(1503390342.630:1): state=initialized audit_enabled=0 res=1
> workingset: timestamp_bits=38 max_order=15 bucket_order=0
> NFS: Registering the id_resolver key type
> Key type id_resolver registered
> Key type id_legacy registered
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
> io scheduler noop registered
> io scheduler deadline registered
> io scheduler cfq registered (default)
> io scheduler mq-deadline registered
> io scheduler kyber registered
> Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> brd: module loaded
> loop: module loaded
> ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
> ibmvscsi 71000003: SRP_VERSION: 16.a
> ibmvscsi 71000003: Maximum ID: 64 Maximum LUN: 32 Maximum Channel: 3
> scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
> ibmvscsi 71000003: partner initialization complete
> ibmvscsi 71000003: host srp version: 16.a, host partition qemu (0), OS 2, max io 2097152
> ibmvscsi 71000003: sent SRP login
> ibmvscsi 71000003: SRP_LOGIN succeeded
> scsi 0:0:2:0: CD-ROM            QEMU     QEMU CD-ROM      2.5+ PQ: 0 ANSI: 5
> sr 0:0:2:0: [sr0] scsi3-mmc drive: 16x/50x cd/rw xa/form2 cdda tray
> cdrom: Uniform CD-ROM driver Revision: 3.20
> sr 0:0:2:0: Attached scsi generic sg0 type 5
> libphy: Fixed MDIO Bus: probed
> e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
> e100: Copyright(c) 1999-2006 Intel Corporation
> e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
> e1000: Copyright (c) 1999-2006 Intel Corporation.
> e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> ehci-pci: EHCI PCI platform driver
> ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> ohci-pci: OHCI PCI platform driver
> rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
> IR NEC protocol handler initialized
> IR RC5(x/sz) protocol handler initialized
> IR RC6 protocol handler initialized
> IR JVC protocol handler initialized
> IR Sony protocol handler initialized
> IR SANYO protocol handler initialized
> IR Sharp protocol handler initialized
> IR MCE Keyboard/mouse protocol handler initialized
> IR XMP protocol handler initialized
> device-mapper: uevent: version 1.0.3
> device-mapper: ioctl: 4.36.0-ioctl (2017-06-09) initialised: dm-devel@redhat.com
> usbcore: registered new interface driver usbhid
> usbhid: USB HID core driver
> ipip: IPv4 and MPLS over IPv4 tunneling driver
> NET: Registered protocol family 17
> Key type dns_resolver registered
> registered taskstats version 1
> console [netcon0] enabled
> netconsole: network logging started
> rtc-generic rtc-generic: setting system clock to 2017-08-22 08:25:43 UTC (1503390343)
> Freeing unused kernel memory: 896K
> This architecture does not have kernel memory protection.
> INFO: rcu_sched self-detected stall on CPU
> 	0-...: (2100 ticks this GP) idle=026/140000000000001/0 softirq=1069/1069 fqs=0 
> 	 (t=2100 jiffies g=-66 c=-67 q=17)
> rcu_sched kthread starved for 2100 jiffies! g18446744073709551550 c18446744073709551549 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
> rcu_sched       R  running task    14192     8      2 0x00000800
> Call Trace:
> [c00000007e65f8d0] [c00000007e65f900] 0xc00000007e65f900 (unreliable)
> [c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
> [c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
> [c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
> [c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
> [c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
> [c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> NMI backtrace for cpu 0
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> Call Trace:
> [c00000007e65b0e0] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
> [c00000007e65b120] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
> [c00000007e65b1b0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0/0x220
> [c00000007e65b240] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20/0x40
> [c00000007e65b260] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
> [c00000007e65b2b0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
> [c00000007e65b3e0] [c00000000017f34c] update_process_times+0x3c/0x90
> [c00000007e65b410] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
> [c00000007e65b440] [c000000000195cf8] tick_sched_timer+0x58/0xd0
> [c00000007e65b480] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
> [c00000007e65b500] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
> [c00000007e65b5d0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
> [c00000007e65b620] [c0000000000238c0] timer_interrupt+0xa0/0xe0
> [c00000007e65b650] [c0000000000091c0] decrementer_common+0x150/0x160
> --- interrupt: 901 at .L1\x0242+0x0/0x4
>     LR = arch_local_irq_restore+0x74/0x90
> [c00000007e65b940] [fed0895fbd054278] 0xfed0895fbd054278 (unreliable)
> [c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
> [c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
> [c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000000ad4c LR: c000000000015ae4 CTR: c00000000002d4f0
> REGS: c00000007e65b6c0 TRAP: 0901   Not tainted  (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
>   CR: 24000244  XER: 20000000
> CFAR: c000000000334910 SOFTE: 1 
> GPR00: c00000000002c688 c00000007e65b940 c000000000ea5a00 0000000000000900 
> GPR04: 0000000000000001 000000007f0b0000 000000055f52d5ae 0000000000000000 
> GPR08: c00000000fd40000 0000000000400000 0000000000400000 0000000000000000 
> GPR12: 0000000028000222 c00000000fd40000 
> NIP [c00000000000ad4c] .L1\x0242+0x0/0x4
> LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
> Call Trace:
> [c00000007e65b940] [98a11310c2925282] 0x98a11310c2925282 (unreliable)
> [c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
> [c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
> [c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0 
> 60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349 
> timeout waiting for login
> 
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>  BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD 
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

> spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /home/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.cpio.gz
> 
> 
> SLOF^[[0m^[[?25l **********************************************************************
> ^[[1mQEMU Starting
> ^[[0m Build Date = Jan  3 2017 22:22:01
>  FW Version = buildd@ release 20161019
>  Press "s" to enter Open Firmware.
> 
> ^[[0m^[[?25hC0000C0100C0120C0140C0200C0201C0220C0240C0260C02E0C0300C0320C0340C0360C0370C0380C0371C0372C0373C0374C03F0C0400C0480C04C0C04D0C0500Populating /vdevice methods
> Populating /vdevice/vty@71000000
> Populating /vdevice/nvram@71000001
> Populating /vdevice/l-lan@71000002
> Populating /vdevice/v-scsi@71000003
>        SCSI: Looking for devices
>           8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
> C0580C05A0Populating /pci@800000020000000
> C0600C0640C0690C06A0C06A8C06B0C06B8C06C0C06E0C0700C0800C0880No NVRAM common partition, re-initializing...
> C0890C08A0C08A8C08B0Scanning USB 
> C08C0C08D0Using default console: /vdevice/vty@71000000
> C08E0C08E8Detected RAM kernel at 400000 (1071618 bytes) C08FF     
>   Welcome to Open Firmware
> 
>   Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
>   This program and the accompanying materials are made available
>   under the terms of the BSD License available at
>   http://www.opensource.org/licenses/bsd-license.php
> 
> Booting from memory...
> OF stdout device is: /vdevice/vty@71000000
> Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Detected machine type: 0000000000000101
> command line: 
> Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
> Calling ibm,client-architecture-support... done
> memory layout at init:
>   memory_limit : 0000000000000000 (16 MB aligned)
>   alloc_bottom : 0000000001490000
>   alloc_top    : 0000000030000000
>   alloc_top_hi : 0000000080000000
>   rmo_top      : 0000000030000000
>   ram_top      : 0000000080000000
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
> Device tree struct  0x00000000016b0000 -> 0x00000000016c0000
> Quiescing Open Firmware ...
> Booting Linux via __start() @ 0x0000000000400000 ...
> Page sizes from device-tree:
> base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
> base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
> base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
> base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
> base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
> base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
> base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
> Using 1TB segments
> Initializing hash mmu with SLB
> Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Found initrd at 0xc000000001490000:0xc00000000165d70b
> Using pSeries machine description
> bootconsole [udbg0] enabled
> Partition configured for 1 cpus.
> CPU maps initialized for 1 thread per core
>  -> smp_release_cpus()
> spinning_secondaries = 0
>  <- smp_release_cpus()
> -----------------------------------------------------
> ppc64_pft_size    = 0x18
> phys_mem_size     = 0x80000000
> dcache_bsize      = 0x80
> icache_bsize      = 0x80
> cpu_features      = 0x077c7a6c18500249
>   possible        = 0x5fffffff18500649
>   always          = 0x0000000018100040
> cpu_user_features = 0xdc0065c2 0xae000000
> mmu_features      = 0x7c006001
> firmware_features = 0x00000001405a445f
> htab_hash_mask    = 0x1ffff
> -----------------------------------------------------
> numa:   NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
> PCI host bridge /pci@800000020000000  ranges:
>   IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
>  MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000 
>  MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000 
> PPC64 nvram contains 65536 bytes
> Zone ranges:
>   DMA      [mem 0x0000000000000000-0x000000007fffffff]
>   DMA32    empty
>   Normal   empty
> Movable zone start for each node
> Early memory node ranges
>   node   0: [mem 0x0000000000000000-0x000000007fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
> percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
> Built 1 zonelists, mobility grouping on.  Total pages: 32736
> Policy zone: DMA
> Kernel command line: 
> PID hash table entries: 4096 (order: -1, 32768 bytes)
> Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 2752K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> Hierarchical RCU implementation.
> 	RCU event tracing is enabled.
> 	RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
> RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
> clocksource: timebase mult[1f40000] shift[24] registered
> Console: colour dummy device 80x25
> console [hvc0] enabled
> console [hvc0] enabled
> bootconsole [udbg0] disabled
> bootconsole [udbg0] disabled
> pid_max: default: 32768 minimum: 301
> Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
> Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
> Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
> EEH: pSeries platform initialized
> POWER8 performance monitor hardware support registered
> Hierarchical SRCU implementation.
> smp: Bringing up secondary CPUs ...
> smp: Brought up 1 node, 1 CPU
> numa: Node 0 CPUs: 0
> devtmpfs: initialized
> random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with crng_init=0
> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
> futex hash table entries: 256 (order: -1, 32768 bytes)
> NET: Registered protocol family 16
> EEH: No capable adapters found
> cpuidle: using governor menu
> kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
> kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
> random: fast init done
> kworker/u2:0 (17) used greatest stack depth: 12192 bytes left
> pstore: using zlib compression
> pstore: Registered nvram as persistent store backend
> Linux ppc64le
> #2 SMP Tue Aug 2PCI: Probing PCI hardware
> PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x0000-0xffff])
> pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
> pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
> pci_bus 0000:00: root bus resource [bus 00-ff]
> IOMMU table initialized, virtual merging enabled
> HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> vgaarb: loaded
> SCSI subsystem initialized
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> pps_core: LinuxPPS API ver. 1 registered
> pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
> PTP clock support registered
> clocksource: Switched to clocksource timebase
> Watchdog CPU:0 Hard LOCKUP
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000017bb68 LR: c00000000017bb68 CTR: c000000000106330
> REGS: c00000003ffefd80 TRAP: 0900   Not tainted  (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
>   CR: 24000802  XER: 00000000
> CFAR: c0000000009dbc34 SOFTE: 0 
> GPR00: c00000000017bb68 c00000007e65b9b0 c000000000ea5a00 0000000000000000 
> GPR04: c00000007e65ba30 c00000007e160000 8000000000000000 c00000007e65bbe8 
> GPR08: 0000000000000004 0000000000000000 0000000080000000 c00000007fe0fea8 
> GPR12: c000000000106330 c00000000fd40000 c0000000001128b8 c00000007e150180 
> GPR16: 0000000000000100 0000000004208040 c00000007e658000 0000000000000000 
> GPR20: c000000000d74f00 c000000000ed3b00 00000000ffff8af5 000000000000000a 
> GPR24: c000000000d74f00 c000000000d5fe80 c000000000eddbf8 c00000007e65ba30 
> GPR28: c000000000d5ee00 c00000007e160048 c00000007fe0fe80 000000007fc80000 
> NIP [c00000000017bb68] lock_timer_base+0x98/0xf0
> LR [c00000000017bb68] lock_timer_base+0x98/0xf0
> Call Trace:
> [c00000007e65b9b0] [c00000000017bb68] lock_timer_base+0x98/0xf0 (unreliable)
> [c00000007e65ba10] [c00000000017ed9c] mod_timer+0x2fc/0x350
> [c00000007e65ba80] [c000000000106468] idle_worker_timeout+0x138/0x190
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7be91ae8 4082ffec 7d5a482a 7be96fe3 7fdc5214 893e0025 2f890000 419e000c 
> 41820008 7fcaca14 7fc3f378 48860065 <60000000> f87b0000 7c641b78 7fc3f378 
> INFO: rcu_sched self-detected stall on CPU
> 	0-...: (2100 ticks this GP) idle=002/140000000000001/0 softirq=183/183 fqs=0 
> 	 (t=2100 jiffies g=-278 c=-279 q=136)
> rcu_sched kthread starved for 2100 jiffies! g18446744073709551338 c18446744073709551337 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
> rcu_sched       R  running task    13216     8      2 0x00000800
> Call Trace:
> [c00000007e65f8d0] [c00000007e65fbd0] 0xc00000007e65fbd0 (unreliable)
> [c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
> [c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
> [c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
> [c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
> [c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
> [c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> NMI backtrace for cpu 0
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> Call Trace:
> [c00000007e65b200] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
> [c00000007e65b240] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
> [c00000007e65b2d0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0/0x220
> [c00000007e65b360] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20/0x40
> [c00000007e65b380] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
> [c00000007e65b3d0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
> [c00000007e65b500] [c00000000017f34c] update_process_times+0x3c/0x90
> [c00000007e65b530] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
> [c00000007e65b560] [c000000000195cf8] tick_sched_timer+0x58/0xd0
> [c00000007e65b5a0] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
> [c00000007e65b620] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
> [c00000007e65b6f0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
> [c00000007e65b740] [c0000000000238c0] timer_interrupt+0xa0/0xe0
> [c00000007e65b770] [c0000000000091c0] decrementer_common+0x150/0x160
> --- interrupt: 901 at .L1\x0242+0x0/0x4
>     LR = arch_local_irq_restore+0x74/0x90
> [c00000007e65ba60] [c000000000ed3b00] jiffies+0x0/0x80 (unreliable)
> [c00000007e65ba80] [c000000000106434] idle_worker_timeout+0x104/0x190
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000000ad4c LR: c000000000015ae4 CTR: 0000000000000001
> REGS: c00000007e65b8b0 TRAP: 0901   Not tainted  (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
>   CR: 24000804  XER: 00000000
> CFAR: c00000000017dbdc SOFTE: 1 
> GPR00: c00000000017e0ec c00000007e65bb30 c000000000ea5a00 0000000000000900 
> GPR04: 0000000000000001 c00000007e160000 8000000000000000 0000000000000000 
> GPR08: c00000000fd40000 0000000000000012 0000000000000000 c00000007fe0fea8 
> GPR12: c000000000106330 c00000000fd40000 
> NIP [c00000000000ad4c] .L1\x0242+0x0/0x4
> LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
> Call Trace:
> [c00000007e65bb30] [c000000000d74f00] irq_stat+0x0/0x80 (unreliable)
> [c00000007e65bb50] [c00000000017e0ec] expire_timers+0x12c/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0 
> 60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349 
> timeout waiting for login
> 
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>  BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD 
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
@ 2017-08-22 18:59       ` Paul E. McKenney
  0 siblings, 0 replies; 49+ messages in thread
From: Paul E. McKenney @ 2017-08-22 18:59 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, ppc-dev,
	Nicholas Piggin

On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> Hi all,
>=20
> On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <sfr@canb.auug.org.au=
> wrote:
> >
> > This tree fails to boot on my qemu test.  2 boot logs attached.
> >=20
> > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > are chasing.  This is the first time I have seen this failure.
> >=20
> > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > qemu command, it does not fail to boot.  (the test just boots and then =
shuts down)
>=20
> Boot logs attached this time.

That does not look good!

Given that the hard lockup happened during timer lock acquisition, I
have to ask if you built with lockdep...

							Thanx, Paul

> --=20
> Cheers,
> Stephen Rothwell

> spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /ho=
me/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.c=
pio.gz
>=20
>=20
> SLOF=1B[0m=1B[?25l ******************************************************=
****************
> =1B[1mQEMU Starting
> =1B[0m Build Date =3D Jan  3 2017 22:22:01
>  FW Version =3D buildd@ release 20161019
>  Press "s" to enter Open Firmware.
>=20
> =1B[0m=1B[?25hC0000C0100C0120C0140C0200C0201C0220C0240C0260C02E0C0300C032=
0C0340C0360C0370C0380C0371C0372C0373C0374C03F0C0400C0480C04C0C04D0C0500Popu=
lating /vdevice methods
> Populating /vdevice/vty@71000000
> Populating /vdevice/nvram@71000001
> Populating /vdevice/l-lan@71000002
> Populating /vdevice/v-scsi@71000003
>        SCSI: Looking for devices
>           8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
> C0580C05A0Populating /pci@800000020000000
> C0600C0640C0690C06A0C06A8C06B0C06B8C06C0C06E0C0700C0800C0880No NVRAM comm=
on partition, re-initializing...
> C0890C08A0C08A8C08B0Scanning USB=20
> C08C0C08D0Using default console: /vdevice/vty@71000000
> C08E0C08E8Detected RAM kernel at 400000 (1071618 bytes) C08FF    =20
>   Welcome to Open Firmware
>=20
>   Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
>   This program and the accompanying materials are made available
>   under the terms of the BSD License available at
>   http://www.opensource.org/licenses/bsd-license.php
>=20
> Booting from memory...
> OF stdout device is: /vdevice/vty@71000000
> Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version =
5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Detected machine type: 0000000000000101
> command line:=20
> Max number of cores passed to firmware: 2048 (NR_CPUS =3D 2048)
> Calling ibm,client-architecture-support... done
> memory layout at init:
>   memory_limit : 0000000000000000 (16 MB aligned)
>   alloc_bottom : 0000000001490000
>   alloc_top    : 0000000030000000
>   alloc_top_hi : 0000000080000000
>   rmo_top      : 0000000030000000
>   ram_top      : 0000000080000000
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
> Device tree struct  0x00000000016b0000 -> 0x00000000016c0000
> Quiescing Open Firmware ...
> Booting Linux via __start() @ 0x0000000000400000 ...
> Page sizes from device-tree:
> base_shift=3D12: shift=3D12, sllp=3D0x0000, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D0
> base_shift=3D12: shift=3D16, sllp=3D0x0000, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D7
> base_shift=3D12: shift=3D24, sllp=3D0x0000, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D56
> base_shift=3D16: shift=3D16, sllp=3D0x0110, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D1
> base_shift=3D16: shift=3D24, sllp=3D0x0110, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D8
> base_shift=3D24: shift=3D24, sllp=3D0x0100, avpnm=3D0x00000001, tlbiel=3D=
0, penc=3D0
> base_shift=3D34: shift=3D34, sllp=3D0x0120, avpnm=3D0x000007ff, tlbiel=3D=
0, penc=3D3
> Using 1TB segments
> Initializing hash mmu with SLB
> Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GC=
C)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Found initrd at 0xc000000001490000:0xc00000000165d70b
> Using pSeries machine description
> bootconsole [udbg0] enabled
> Partition configured for 1 cpus.
> CPU maps initialized for 1 thread per core
>  -> smp_release_cpus()
> spinning_secondaries =3D 0
>  <- smp_release_cpus()
> -----------------------------------------------------
> ppc64_pft_size    =3D 0x18
> phys_mem_size     =3D 0x80000000
> dcache_bsize      =3D 0x80
> icache_bsize      =3D 0x80
> cpu_features      =3D 0x077c7a6c18500249
>   possible        =3D 0x5fffffff18500649
>   always          =3D 0x0000000018100040
> cpu_user_features =3D 0xdc0065c2 0xae000000
> mmu_features      =3D 0x7c006001
> firmware_features =3D 0x00000001405a445f
> htab_hash_mask    =3D 0x1ffff
> -----------------------------------------------------
> numa:   NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
> PCI host bridge /pci@800000020000000  ranges:
>   IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
>  MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000=20
>  MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000=20
> PPC64 nvram contains 65536 bytes
> Zone ranges:
>   DMA      [mem 0x0000000000000000-0x000000007fffffff]
>   DMA32    empty
>   Normal   empty
> Movable zone start for each node
> Early memory node ranges
>   node   0: [mem 0x0000000000000000-0x000000007fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
> percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
> Built 1 zonelists, mobility grouping on.  Total pages: 32736
> Policy zone: DMA
> Kernel command line:=20
> PID hash table entries: 4096 (order: -1, 32768 bytes)
> Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 27=
52K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
> SLUB: HWalign=3D128, Order=3D0-3, MinObjects=3D0, CPUs=3D1, Nodes=3D1
> Hierarchical RCU implementation.
> 	RCU event tracing is enabled.
> 	RCU restricting CPUs from NR_CPUS=3D2048 to nr_cpu_ids=3D1.
> RCU: Adjusting geometry for rcu_fanout_leaf=3D16, nr_cpu_ids=3D1
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007,=
 max_idle_ns: 440795202126 ns
> clocksource: timebase mult[1f40000] shift[24] registered
> Console: colour dummy device 80x25
> console [hvc0] enabled
> console [hvc0] enabled
> bootconsole [udbg0] disabled
> bootconsole [udbg0] disabled
> pid_max: default: 32768 minimum: 301
> Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
> Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
> Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
> EEH: pSeries platform initialized
> POWER8 performance monitor hardware support registered
> Hierarchical SRCU implementation.
> smp: Bringing up secondary CPUs ...
> smp: Brought up 1 node, 1 CPU
> numa: Node 0 CPUs: 0
> devtmpfs: initialized
> random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with cr=
ng_init=3D0
> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_n=
s: 19112604462750000 ns
> futex hash table entries: 256 (order: -1, 32768 bytes)
> NET: Registered protocol family 16
> EEH: No capable adapters found
> cpuidle: using governor menu
> kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
> kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
> random: fast init done
> kworker/u2:0 (17) used greatest stack depth: 12352 bytes left
> pstore: using zlib compression
> pstore: Registered nvram as persistent store backend
> Linux ppc64le
> #2 SMP Tue Aug 2PCI: Probing PCI hardware
> PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x=
0000-0xffff])
> pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (b=
us address [0x80000000-0xffffffff])
> pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
> pci_bus 0000:00: root bus resource [bus 00-ff]
> IOMMU table initialized, virtual merging enabled
> HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> vgaarb: loaded
> SCSI subsystem initialized
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> pps_core: LinuxPPS API ver. 1 registered
> pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <gio=
metti@linux.it>
> PTP clock support registered
> clocksource: Switched to clocksource timebase
> NET: Registered protocol family 2
> TCP established hash table entries: 16384 (order: 1, 131072 bytes)
> TCP bind hash table entries: 16384 (order: 2, 262144 bytes)
> TCP: Hash tables configured (established 16384 bind 16384)
> UDP hash table entries: 2048 (order: 0, 65536 bytes)
> UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
> NET: Registered protocol family 1
> RPC: Registered named UNIX socket transport module.
> RPC: Registered udp transport module.
> RPC: Registered tcp transport module.
> RPC: Registered tcp NFSv4.1 backchannel transport module.
> Trying to unpack rootfs image as initramfs...
> Freeing initrd memory: 1792K
> audit: initializing netlink subsys (disabled)
> audit: type=3D2000 audit(1503390342.630:1): state=3Dinitialized audit_ena=
bled=3D0 res=3D1
> workingset: timestamp_bits=3D38 max_order=3D15 bucket_order=3D0
> NFS: Registering the id_resolver key type
> Key type id_resolver registered
> Key type id_legacy registered
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
> io scheduler noop registered
> io scheduler deadline registered
> io scheduler cfq registered (default)
> io scheduler mq-deadline registered
> io scheduler kyber registered
> Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> brd: module loaded
> loop: module loaded
> ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
> ibmvscsi 71000003: SRP_VERSION: 16.a
> ibmvscsi 71000003: Maximum ID: 64 Maximum LUN: 32 Maximum Channel: 3
> scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
> ibmvscsi 71000003: partner initialization complete
> ibmvscsi 71000003: host srp version: 16.a, host partition qemu (0), OS 2,=
 max io 2097152
> ibmvscsi 71000003: sent SRP login
> ibmvscsi 71000003: SRP_LOGIN succeeded
> scsi 0:0:2:0: CD-ROM            QEMU     QEMU CD-ROM      2.5+ PQ: 0 ANSI=
: 5
> sr 0:0:2:0: [sr0] scsi3-mmc drive: 16x/50x cd/rw xa/form2 cdda tray
> cdrom: Uniform CD-ROM driver Revision: 3.20
> sr 0:0:2:0: Attached scsi generic sg0 type 5
> libphy: Fixed MDIO Bus: probed
> e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
> e100: Copyright(c) 1999-2006 Intel Corporation
> e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
> e1000: Copyright (c) 1999-2006 Intel Corporation.
> e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> ehci-pci: EHCI PCI platform driver
> ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> ohci-pci: OHCI PCI platform driver
> rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
> IR NEC protocol handler initialized
> IR RC5(x/sz) protocol handler initialized
> IR RC6 protocol handler initialized
> IR JVC protocol handler initialized
> IR Sony protocol handler initialized
> IR SANYO protocol handler initialized
> IR Sharp protocol handler initialized
> IR MCE Keyboard/mouse protocol handler initialized
> IR XMP protocol handler initialized
> device-mapper: uevent: version 1.0.3
> device-mapper: ioctl: 4.36.0-ioctl (2017-06-09) initialised: dm-devel@red=
hat.com
> usbcore: registered new interface driver usbhid
> usbhid: USB HID core driver
> ipip: IPv4 and MPLS over IPv4 tunneling driver
> NET: Registered protocol family 17
> Key type dns_resolver registered
> registered taskstats version 1
> console [netcon0] enabled
> netconsole: network logging started
> rtc-generic rtc-generic: setting system clock to 2017-08-22 08:25:43 UTC =
(1503390343)
> Freeing unused kernel memory: 896K
> This architecture does not have kernel memory protection.
> INFO: rcu_sched self-detected stall on CPU
> 	0-...: (2100 ticks this GP) idle=3D026/140000000000001/0 softirq=3D1069/=
1069 fqs=3D0=20
> 	 (t=3D2100 jiffies g=3D-66 c=3D-67 q=3D17)
> rcu_sched kthread starved for 2100 jiffies! g18446744073709551550 c184467=
44073709551549 f0x0 RCU_GP_WAIT_FQS(3) ->state=3D0x0 ->cpu=3D0
> rcu_sched       R  running task    14192     8      2 0x00000800
> Call Trace:
> [c00000007e65f8d0] [c00000007e65f900] 0xc00000007e65f900 (unreliable)
> [c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
> [c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
> [c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
> [c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
> [c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
> [c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> NMI backtrace for cpu 0
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> Call Trace:
> [c00000007e65b0e0] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
> [c00000007e65b120] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
> [c00000007e65b1b0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0=
/0x220
> [c00000007e65b240] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20=
/0x40
> [c00000007e65b260] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
> [c00000007e65b2b0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
> [c00000007e65b3e0] [c00000000017f34c] update_process_times+0x3c/0x90
> [c00000007e65b410] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
> [c00000007e65b440] [c000000000195cf8] tick_sched_timer+0x58/0xd0
> [c00000007e65b480] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
> [c00000007e65b500] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
> [c00000007e65b5d0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
> [c00000007e65b620] [c0000000000238c0] timer_interrupt+0xa0/0xe0
> [c00000007e65b650] [c0000000000091c0] decrementer_common+0x150/0x160
> --- interrupt: 901 at .L1=0242+0x0/0x4
>     LR =3D arch_local_irq_restore+0x74/0x90
> [c00000007e65b940] [fed0895fbd054278] 0xfed0895fbd054278 (unreliable)
> [c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
> [c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
> [c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000000ad4c LR: c000000000015ae4 CTR: c00000000002d4f0
> REGS: c00000007e65b6c0 TRAP: 0901   Not tainted  (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
>   CR: 24000244  XER: 20000000
> CFAR: c000000000334910 SOFTE: 1=20
> GPR00: c00000000002c688 c00000007e65b940 c000000000ea5a00 000000000000090=
0=20
> GPR04: 0000000000000001 000000007f0b0000 000000055f52d5ae 000000000000000=
0=20
> GPR08: c00000000fd40000 0000000000400000 0000000000400000 000000000000000=
0=20
> GPR12: 0000000028000222 c00000000fd40000=20
> NIP [c00000000000ad4c] .L1=0242+0x0/0x4
> LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
> Call Trace:
> [c00000007e65b940] [98a11310c2925282] 0x98a11310c2925282 (unreliable)
> [c00000007e65b960] [c00000000002c688] wd_smp_clear_cpu_pending+0x168/0x380
> [c00000007e65b9f0] [c00000000002d188] watchdog_timer_interrupt+0x68/0x370
> [c00000007e65ba90] [c00000000002d528] wd_timer_fn+0x38/0x60
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e268] run_timer_softirq+0xc8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0=
=20
> 60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349=
=20
> timeout waiting for login
>=20
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>  BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD=20
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

> spawn qemu-system-ppc64 -M pseries -m 2G -vga none -nographic -kernel /ho=
me/sfr/next/powerpc_pseries_le_defconfig/vmlinux -initrd ./ppc64le-rootfs.c=
pio.gz
>=20
>=20
> SLOF=1B[0m=1B[?25l ******************************************************=
****************
> =1B[1mQEMU Starting
> =1B[0m Build Date =3D Jan  3 2017 22:22:01
>  FW Version =3D buildd@ release 20161019
>  Press "s" to enter Open Firmware.
>=20
> =1B[0m=1B[?25hC0000C0100C0120C0140C0200C0201C0220C0240C0260C02E0C0300C032=
0C0340C0360C0370C0380C0371C0372C0373C0374C03F0C0400C0480C04C0C04D0C0500Popu=
lating /vdevice methods
> Populating /vdevice/vty@71000000
> Populating /vdevice/nvram@71000001
> Populating /vdevice/l-lan@71000002
> Populating /vdevice/v-scsi@71000003
>        SCSI: Looking for devices
>           8200000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.5+"
> C0580C05A0Populating /pci@800000020000000
> C0600C0640C0690C06A0C06A8C06B0C06B8C06C0C06E0C0700C0800C0880No NVRAM comm=
on partition, re-initializing...
> C0890C08A0C08A8C08B0Scanning USB=20
> C08C0C08D0Using default console: /vdevice/vty@71000000
> C08E0C08E8Detected RAM kernel at 400000 (1071618 bytes) C08FF    =20
>   Welcome to Open Firmware
>=20
>   Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
>   This program and the accompanying materials are made available
>   under the terms of the BSD License available at
>   http://www.opensource.org/licenses/bsd-license.php
>=20
> Booting from memory...
> OF stdout device is: /vdevice/vty@71000000
> Preparing to boot Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version =
5.2.1 20151008 (GCC)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Detected machine type: 0000000000000101
> command line:=20
> Max number of cores passed to firmware: 2048 (NR_CPUS =3D 2048)
> Calling ibm,client-architecture-support... done
> memory layout at init:
>   memory_limit : 0000000000000000 (16 MB aligned)
>   alloc_bottom : 0000000001490000
>   alloc_top    : 0000000030000000
>   alloc_top_hi : 0000000080000000
>   rmo_top      : 0000000030000000
>   ram_top      : 0000000080000000
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x00000000016a0000 -> 0x00000000016a09df
> Device tree struct  0x00000000016b0000 -> 0x00000000016c0000
> Quiescing Open Firmware ...
> Booting Linux via __start() @ 0x0000000000400000 ...
> Page sizes from device-tree:
> base_shift=3D12: shift=3D12, sllp=3D0x0000, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D0
> base_shift=3D12: shift=3D16, sllp=3D0x0000, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D7
> base_shift=3D12: shift=3D24, sllp=3D0x0000, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D56
> base_shift=3D16: shift=3D16, sllp=3D0x0110, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D1
> base_shift=3D16: shift=3D24, sllp=3D0x0110, avpnm=3D0x00000000, tlbiel=3D=
1, penc=3D8
> base_shift=3D24: shift=3D24, sllp=3D0x0100, avpnm=3D0x00000001, tlbiel=3D=
0, penc=3D0
> base_shift=3D34: shift=3D34, sllp=3D0x0120, avpnm=3D0x000007ff, tlbiel=3D=
0, penc=3D3
> Using 1TB segments
> Initializing hash mmu with SLB
> Linux version 4.13.0-rc6 (sfr@colugo-sfr) (gcc version 5.2.1 20151008 (GC=
C)) #2 SMP Tue Aug 22 18:19:21 AEST 2017
> Found initrd at 0xc000000001490000:0xc00000000165d70b
> Using pSeries machine description
> bootconsole [udbg0] enabled
> Partition configured for 1 cpus.
> CPU maps initialized for 1 thread per core
>  -> smp_release_cpus()
> spinning_secondaries =3D 0
>  <- smp_release_cpus()
> -----------------------------------------------------
> ppc64_pft_size    =3D 0x18
> phys_mem_size     =3D 0x80000000
> dcache_bsize      =3D 0x80
> icache_bsize      =3D 0x80
> cpu_features      =3D 0x077c7a6c18500249
>   possible        =3D 0x5fffffff18500649
>   always          =3D 0x0000000018100040
> cpu_user_features =3D 0xdc0065c2 0xae000000
> mmu_features      =3D 0x7c006001
> firmware_features =3D 0x00000001405a445f
> htab_hash_mask    =3D 0x1ffff
> -----------------------------------------------------
> numa:   NODE_DATA [mem 0x7ffe2300-0x7ffebfff]
> PCI host bridge /pci@800000020000000  ranges:
>   IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
>  MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000=20
>  MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000=20
> PPC64 nvram contains 65536 bytes
> Zone ranges:
>   DMA      [mem 0x0000000000000000-0x000000007fffffff]
>   DMA32    empty
>   Normal   empty
> Movable zone start for each node
> Early memory node ranges
>   node   0: [mem 0x0000000000000000-0x000000007fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
> percpu: Embedded 3 pages/cpu @c00000007fe00000 s158616 r0 d37992 u1048576
> Built 1 zonelists, mobility grouping on.  Total pages: 32736
> Policy zone: DMA
> Kernel command line:=20
> PID hash table entries: 4096 (order: -1, 32768 bytes)
> Memory: 2060800K/2097152K available (10112K kernel code, 1600K rwdata, 27=
52K rodata, 896K init, 1413K bss, 36352K reserved, 0K cma-reserved)
> SLUB: HWalign=3D128, Order=3D0-3, MinObjects=3D0, CPUs=3D1, Nodes=3D1
> Hierarchical RCU implementation.
> 	RCU event tracing is enabled.
> 	RCU restricting CPUs from NR_CPUS=3D2048 to nr_cpu_ids=3D1.
> RCU: Adjusting geometry for rcu_fanout_leaf=3D16, nr_cpu_ids=3D1
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007,=
 max_idle_ns: 440795202126 ns
> clocksource: timebase mult[1f40000] shift[24] registered
> Console: colour dummy device 80x25
> console [hvc0] enabled
> console [hvc0] enabled
> bootconsole [udbg0] disabled
> bootconsole [udbg0] disabled
> pid_max: default: 32768 minimum: 301
> Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
> Mount-cache hash table entries: 8192 (order: 0, 65536 bytes)
> Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes)
> EEH: pSeries platform initialized
> POWER8 performance monitor hardware support registered
> Hierarchical SRCU implementation.
> smp: Bringing up secondary CPUs ...
> smp: Brought up 1 node, 1 CPU
> numa: Node 0 CPUs: 0
> devtmpfs: initialized
> random: get_random_u32 called from bucket_table_alloc+0x144/0x380 with cr=
ng_init=3D0
> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_n=
s: 19112604462750000 ns
> futex hash table entries: 256 (order: -1, 32768 bytes)
> NET: Registered protocol family 16
> EEH: No capable adapters found
> cpuidle: using governor menu
> kworker/u2:1 (27) used greatest stack depth: 13872 bytes left
> kworker/u2:1 (28) used greatest stack depth: 13584 bytes left
> random: fast init done
> kworker/u2:0 (17) used greatest stack depth: 12192 bytes left
> pstore: using zlib compression
> pstore: Registered nvram as persistent store backend
> Linux ppc64le
> #2 SMP Tue Aug 2PCI: Probing PCI hardware
> PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x=
0000-0xffff])
> pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (b=
us address [0x80000000-0xffffffff])
> pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
> pci_bus 0000:00: root bus resource [bus 00-ff]
> IOMMU table initialized, virtual merging enabled
> HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> vgaarb: loaded
> SCSI subsystem initialized
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> pps_core: LinuxPPS API ver. 1 registered
> pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <gio=
metti@linux.it>
> PTP clock support registered
> clocksource: Switched to clocksource timebase
> Watchdog CPU:0 Hard LOCKUP
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000017bb68 LR: c00000000017bb68 CTR: c000000000106330
> REGS: c00000003ffefd80 TRAP: 0900   Not tainted  (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
>   CR: 24000802  XER: 00000000
> CFAR: c0000000009dbc34 SOFTE: 0=20
> GPR00: c00000000017bb68 c00000007e65b9b0 c000000000ea5a00 000000000000000=
0=20
> GPR04: c00000007e65ba30 c00000007e160000 8000000000000000 c00000007e65bbe=
8=20
> GPR08: 0000000000000004 0000000000000000 0000000080000000 c00000007fe0fea=
8=20
> GPR12: c000000000106330 c00000000fd40000 c0000000001128b8 c00000007e15018=
0=20
> GPR16: 0000000000000100 0000000004208040 c00000007e658000 000000000000000=
0=20
> GPR20: c000000000d74f00 c000000000ed3b00 00000000ffff8af5 000000000000000=
a=20
> GPR24: c000000000d74f00 c000000000d5fe80 c000000000eddbf8 c00000007e65ba3=
0=20
> GPR28: c000000000d5ee00 c00000007e160048 c00000007fe0fe80 000000007fc8000=
0=20
> NIP [c00000000017bb68] lock_timer_base+0x98/0xf0
> LR [c00000000017bb68] lock_timer_base+0x98/0xf0
> Call Trace:
> [c00000007e65b9b0] [c00000000017bb68] lock_timer_base+0x98/0xf0 (unreliab=
le)
> [c00000007e65ba10] [c00000000017ed9c] mod_timer+0x2fc/0x350
> [c00000007e65ba80] [c000000000106468] idle_worker_timeout+0x138/0x190
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7be91ae8 4082ffec 7d5a482a 7be96fe3 7fdc5214 893e0025 2f890000 419e000c=
=20
> 41820008 7fcaca14 7fc3f378 48860065 <60000000> f87b0000 7c641b78 7fc3f378=
=20
> INFO: rcu_sched self-detected stall on CPU
> 	0-...: (2100 ticks this GP) idle=3D002/140000000000001/0 softirq=3D183/1=
83 fqs=3D0=20
> 	 (t=3D2100 jiffies g=3D-278 c=3D-279 q=3D136)
> rcu_sched kthread starved for 2100 jiffies! g18446744073709551338 c184467=
44073709551337 f0x0 RCU_GP_WAIT_FQS(3) ->state=3D0x0 ->cpu=3D0
> rcu_sched       R  running task    13216     8      2 0x00000800
> Call Trace:
> [c00000007e65f8d0] [c00000007e65fbd0] 0xc00000007e65fbd0 (unreliable)
> [c00000007e65faa0] [c00000000001b678] __switch_to+0x298/0x460
> [c00000007e65fb00] [c0000000009d5524] __schedule+0x3e4/0xab0
> [c00000007e65fbe0] [c0000000009d5c30] schedule+0x40/0xb0
> [c00000007e65fc10] [c0000000009da4bc] schedule_timeout+0x1fc/0x440
> [c00000007e65fcf0] [c0000000001754ac] rcu_gp_kthread+0x60c/0x1090
> [c00000007e65fdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65fe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> NMI backtrace for cpu 0
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> Call Trace:
> [c00000007e65b200] [c0000000009bbfa0] dump_stack+0xb0/0xf0 (unreliable)
> [c00000007e65b240] [c0000000009c4cb8] nmi_cpu_backtrace+0x208/0x210
> [c00000007e65b2d0] [c0000000009c4ea0] nmi_trigger_cpumask_backtrace+0x1e0=
/0x220
> [c00000007e65b360] [c00000000002d100] arch_trigger_cpumask_backtrace+0x20=
/0x40
> [c00000007e65b380] [c000000000177d00] rcu_dump_cpu_stacks+0xf4/0x164
> [c00000007e65b3d0] [c000000000177174] rcu_check_callbacks+0x994/0xaf0
> [c00000007e65b500] [c00000000017f34c] update_process_times+0x3c/0x90
> [c00000007e65b530] [c000000000195c0c] tick_sched_handle.isra.5+0x2c/0xc0
> [c00000007e65b560] [c000000000195cf8] tick_sched_timer+0x58/0xd0
> [c00000007e65b5a0] [c00000000017fdc8] __hrtimer_run_queues+0xf8/0x360
> [c00000007e65b620] [c000000000180d24] hrtimer_interrupt+0xf4/0x340
> [c00000007e65b6f0] [c0000000000231dc] __timer_interrupt+0x8c/0x270
> [c00000007e65b740] [c0000000000238c0] timer_interrupt+0xa0/0xe0
> [c00000007e65b770] [c0000000000091c0] decrementer_common+0x150/0x160
> --- interrupt: 901 at .L1=0242+0x0/0x4
>     LR =3D arch_local_irq_restore+0x74/0x90
> [c00000007e65ba60] [c000000000ed3b00] jiffies+0x0/0x80 (unreliable)
> [c00000007e65ba80] [c000000000106434] idle_worker_timeout+0x104/0x190
> [c00000007e65bac0] [c00000000017de28] call_timer_fn+0x58/0x1c0
> [c00000007e65bb50] [c00000000017e100] expire_timers+0x140/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [ksoftirqd/0:7]
> Modules linked in:
> CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.13.0-rc6 #2
> task: c00000007e62e100 task.stack: c00000007e658000
> NIP: c00000000000ad4c LR: c000000000015ae4 CTR: 0000000000000001
> REGS: c00000007e65b8b0 TRAP: 0901   Not tainted  (4.13.0-rc6)
> MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>
>   CR: 24000804  XER: 00000000
> CFAR: c00000000017dbdc SOFTE: 1=20
> GPR00: c00000000017e0ec c00000007e65bb30 c000000000ea5a00 000000000000090=
0=20
> GPR04: 0000000000000001 c00000007e160000 8000000000000000 000000000000000=
0=20
> GPR08: c00000000fd40000 0000000000000012 0000000000000000 c00000007fe0fea=
8=20
> GPR12: c000000000106330 c00000000fd40000=20
> NIP [c00000000000ad4c] .L1=0242+0x0/0x4
> LR [c000000000015ae4] arch_local_irq_restore+0x74/0x90
> Call Trace:
> [c00000007e65bb30] [c000000000d74f00] irq_stat+0x0/0x80 (unreliable)
> [c00000007e65bb50] [c00000000017e0ec] expire_timers+0x12c/0x1e0
> [c00000007e65bbc0] [c00000000017e348] run_timer_softirq+0x1a8/0x230
> [c00000007e65bc50] [c0000000009dc7f0] __do_softirq+0x170/0x3e4
> [c00000007e65bd40] [c0000000000eef9c] run_ksoftirqd+0x3c/0xb0
> [c00000007e65bd60] [c000000000118500] smpboot_thread_fn+0x290/0x2a0
> [c00000007e65bdc0] [c000000000112a10] kthread+0x160/0x1a0
> [c00000007e65be30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
> Instruction dump:
> 7d200026 618c8000 2c030900 4182e348 2c030500 4182dda0 2c030a00 4182ffc0=
=20
> 60000000 60000000 60000000 60000000 <4e800020> 7c781b78 48000331 48000349=
=20
> timeout waiting for login
>=20
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>  BAD BAD BAD BAD BAD BAD BAD BAD BAD BAD=20
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22 18:59       ` Paul E. McKenney
  (?)
@ 2017-08-22 19:12       ` Stephen Rothwell
  2017-08-22 19:32         ` Paul E. McKenney
  -1 siblings, 1 reply; 49+ messages in thread
From: Stephen Rothwell @ 2017-08-22 19:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, ppc-dev,
	Nicholas Piggin

Hi Paul,

On Tue, 22 Aug 2017 11:59:23 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> > Hi all,
> > 
> > On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:  
> > >
> > > This tree fails to boot on my qemu test.  2 boot logs attached.
> > > 
> > > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > > are chasing.  This is the first time I have seen this failure.
> > > 
> > > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > > qemu command, it does not fail to boot.  (the test just boots and then shuts down)  
> > 
> > Boot logs attached this time.  
> 
> That does not look good!
> 
> Given that the hard lockup happened during timer lock acquisition, I
> have to ask if you built with lockdep...

$ grep LOCKDEP .config
CONFIG_LOCKDEP_SUPPORT=y

so, no.  This is just a powerpc pseries_le_defconfig build.

-- 
Cheers,
Stephen Rothwell

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22 19:12       ` Stephen Rothwell
@ 2017-08-22 19:32         ` Paul E. McKenney
  2017-08-22 19:36           ` Paul E. McKenney
  0 siblings, 1 reply; 49+ messages in thread
From: Paul E. McKenney @ 2017-08-22 19:32 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, ppc-dev,
	Nicholas Piggin

On Wed, Aug 23, 2017 at 05:12:16AM +1000, Stephen Rothwell wrote:
> Hi Paul,
> 
> On Tue, 22 Aug 2017 11:59:23 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >
> > On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> > > Hi all,
> > > 
> > > On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:  
> > > >
> > > > This tree fails to boot on my qemu test.  2 boot logs attached.
> > > > 
> > > > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > > > are chasing.  This is the first time I have seen this failure.
> > > > 
> > > > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > > > qemu command, it does not fail to boot.  (the test just boots and then shuts down)  
> > > 
> > > Boot logs attached this time.  
> > 
> > That does not look good!
> > 
> > Given that the hard lockup happened during timer lock acquisition, I
> > have to ask if you built with lockdep...
> 
> $ grep LOCKDEP .config
> CONFIG_LOCKDEP_SUPPORT=y
> 
> so, no.  This is just a powerpc pseries_le_defconfig build.

This is without Nick's recent patch, I am guessing?

Hmmm...  My testing of that patch omitted lockdep as well.  Rerunning
on the full set of rcutorture scenarios...

							Thanx, Paul

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22 19:32         ` Paul E. McKenney
@ 2017-08-22 19:36           ` Paul E. McKenney
  2017-08-22 21:57             ` Stephen Rothwell
  0 siblings, 1 reply; 49+ messages in thread
From: Paul E. McKenney @ 2017-08-22 19:36 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, ppc-dev,
	Nicholas Piggin

On Tue, Aug 22, 2017 at 12:32:31PM -0700, Paul E. McKenney wrote:
> On Wed, Aug 23, 2017 at 05:12:16AM +1000, Stephen Rothwell wrote:
> > Hi Paul,
> > 
> > On Tue, 22 Aug 2017 11:59:23 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > On Wed, Aug 23, 2017 at 04:14:24AM +1000, Stephen Rothwell wrote:
> > > > Hi all,
> > > > 
> > > > On Wed, 23 Aug 2017 04:11:17 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:  
> > > > >
> > > > > This tree fails to boot on my qemu test.  2 boot logs attached.
> > > > > 
> > > > > Paul, Nick, is this the same/similar to the other RCU/lockup bug you
> > > > > are chasing.  This is the first time I have seen this failure.
> > > > > 
> > > > > This qemu boot is in full emulation mode if I add --enable-kvm to the
> > > > > qemu command, it does not fail to boot.  (the test just boots and then shuts down)  
> > > > 
> > > > Boot logs attached this time.  
> > > 
> > > That does not look good!
> > > 
> > > Given that the hard lockup happened during timer lock acquisition, I
> > > have to ask if you built with lockdep...
> > 
> > $ grep LOCKDEP .config
> > CONFIG_LOCKDEP_SUPPORT=y
> > 
> > so, no.  This is just a powerpc pseries_le_defconfig build.
> 
> This is without Nick's recent patch, I am guessing?
> 
> Hmmm...  My testing of that patch omitted lockdep as well.  Rerunning
> on the full set of rcutorture scenarios...

To complete the thought, if you aren't already using it, I suggest
applying Nick's patch:

http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com

							Thanx, Paul

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-22 10:47 ` possible circular locking dependency detected [was: linux-next: Tree for Aug 22] Sergey Senozhatsky
  2017-08-22 21:43     ` Bart Van Assche
@ 2017-08-22 21:43     ` Bart Van Assche
  0 siblings, 0 replies; 49+ messages in thread
From: Bart Van Assche @ 2017-08-22 21:43 UTC (permalink / raw)
  To: byungchul.park
  Cc: linux-kernel, linux-block, peterz, sergey.senozhatsky.work,
	martin.petersen, axboe, linux-scsi, sfr, linux-next

T24gVHVlLCAyMDE3LTA4LTIyIGF0IDE5OjQ3ICswOTAwLCBTZXJnZXkgU2Vub3poYXRza3kgd3Jv
dGU6DQo+ID09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PQ0KPiBXQVJOSU5HOiBwb3NzaWJsZSBjaXJjdWxhciBsb2NraW5nIGRlcGVuZGVuY3kgZGV0
ZWN0ZWQNCj4gNC4xMy4wLXJjNi1uZXh0LTIwMTcwODIyLWRiZy0wMDAyMC1nMzk3NThlZDhhYWUw
LWRpcnR5ICMxNzQ2IE5vdCB0YWludGVkDQo+IC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQ0KPiBmc2NrLmV4dDQvMTQ4IGlzIHRyeWluZyB0byBh
Y3F1aXJlIGxvY2s6DQo+ICAoJmJkZXYtPmJkX211dGV4KXsrLisufSwgYXQ6IFs8ZmZmZmZmZmY4
MTE2ZTczZT5dIF9fYmxrZGV2X3B1dCsweDMzLzB4MTkwDQo+IA0KPiAgYnV0IG5vdyBpbiByZWxl
YXNlIGNvbnRleHQgb2YgYSBjcm9zc2xvY2sgYWNxdWlyZWQgYXQgdGhlIGZvbGxvd2luZzoNCj4g
ICgoY29tcGxldGUpJndhaXQjMil7Ky4rLn0sIGF0OiBbPGZmZmZmZmZmODEyMTU5ZTA+XSBibGtf
ZXhlY3V0ZV9ycSsweGJiLzB4ZGENCj4gDQo+ICB3aGljaCBsb2NrIGFscmVhZHkgZGVwZW5kcyBv
biB0aGUgbmV3IGxvY2suDQo+IA0KPiAgdGhlIGV4aXN0aW5nIGRlcGVuZGVuY3kgY2hhaW4gKGlu
IHJldmVyc2Ugb3JkZXIpIGlzOg0KPiANCj4gIC0+ICMxICgoY29tcGxldGUpJndhaXQjMil7Ky4r
Ln06DQo+ICAgICAgICBsb2NrX2FjcXVpcmUrMHgxNzYvMHgxOWUNCj4gICAgICAgIF9fd2FpdF9m
b3JfY29tbW9uKzB4NTAvMHgxZTMNCj4gICAgICAgIGJsa19leGVjdXRlX3JxKzB4YmIvMHhkYQ0K
PiAgICAgICAgc2NzaV9leGVjdXRlKzB4YzMvMHgxN2QgW3Njc2lfbW9kXQ0KPiAgICAgICAgc2Rf
cmV2YWxpZGF0ZV9kaXNrKzB4MTEyLzB4MTU0OSBbc2RfbW9kXQ0KPiAgICAgICAgcmVzY2FuX3Bh
cnRpdGlvbnMrMHg0OC8weDJjNA0KPiAgICAgICAgX19ibGtkZXZfZ2V0KzB4MTRiLzB4MzdjDQo+
ICAgICAgICBibGtkZXZfZ2V0KzB4MTkxLzB4MmMwDQo+ICAgICAgICBkZXZpY2VfYWRkX2Rpc2sr
MHgyYjQvMHgzZTUNCj4gICAgICAgIHNkX3Byb2JlX2FzeW5jKzB4ZjgvMHgxN2UgW3NkX21vZF0N
Cj4gICAgICAgIGFzeW5jX3J1bl9lbnRyeV9mbisweDM0LzB4ZTANCj4gICAgICAgIHByb2Nlc3Nf
b25lX3dvcmsrMHgyYWYvMHg0ZDENCj4gICAgICAgIHdvcmtlcl90aHJlYWQrMHgxOWEvMHgyNGYN
Cj4gICAgICAgIGt0aHJlYWQrMHgxMzMvMHgxM2INCj4gICAgICAgIHJldF9mcm9tX2ZvcmsrMHgy
Ny8weDQwDQo+IA0KPiAgLT4gIzAgKCZiZGV2LT5iZF9tdXRleCl7Ky4rLn06DQo+ICAgICAgICBf
X2Jsa2Rldl9wdXQrMHgzMy8weDE5MA0KPiAgICAgICAgYmxrZGV2X2Nsb3NlKzB4MjQvMHgyNw0K
PiAgICAgICAgX19mcHV0KzB4ZWUvMHgxOGENCj4gICAgICAgIHRhc2tfd29ya19ydW4rMHg3OS8w
eGEwDQo+ICAgICAgICBwcmVwYXJlX2V4aXRfdG9fdXNlcm1vZGUrMHg5Yi8weGI1DQo+IA0KPiAg
b3RoZXIgaW5mbyB0aGF0IG1pZ2h0IGhlbHAgdXMgZGVidWcgdGhpczoNCj4gIFBvc3NpYmxlIHVu
c2FmZSBsb2NraW5nIHNjZW5hcmlvIGJ5IGNyb3NzbG9jazoNCj4gICAgICAgIENQVTAgICAgICAg
ICAgICAgICAgICAgIENQVTENCj4gICAgICAgIC0tLS0gICAgICAgICAgICAgICAgICAgIC0tLS0N
Cj4gICBsb2NrKCZiZGV2LT5iZF9tdXRleCk7DQo+ICAgbG9jaygoY29tcGxldGUpJndhaXQjMik7
DQo+ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBsb2NrKCZiZGV2LT5iZF9tdXRleCk7
DQo+ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICB1bmxvY2soKGNvbXBsZXRlKSZ3YWl0
IzIpOw0KPiANCj4gICAqKiogREVBRExPQ0sgKioqDQo+IDQgbG9ja3MgaGVsZCBieSBmc2NrLmV4
dDQvMTQ4Og0KPiAgIzA6ICAoJmJkZXYtPmJkX211dGV4KXsrLisufSwgYXQ6IFs8ZmZmZmZmZmY4
MTE2ZTczZT5dIF9fYmxrZGV2X3B1dCsweDMzLzB4MTkwDQo+ICAjMTogIChyY3VfcmVhZF9sb2Nr
KXsuLi4ufSwgYXQ6IFs8ZmZmZmZmZmY4MTIxN2YxNj5dIHJjdV9sb2NrX2FjcXVpcmUrMHgwLzB4
MjANCj4gICMyOiAgKCYoJmhvc3QtPmxvY2spLT5ybG9jayl7LS4tLn0sIGF0OiBbPGZmZmZmZmZm
YTAwZTc1NTA+XSBhdGFfc2NzaV9xdWV1ZWNtZCsweDIzLzB4NzQgW2xpYmF0YV0NCj4gICMzOiAg
KCZ4LT53YWl0IzE0KXstLi4ufSwgYXQ6IFs8ZmZmZmZmZmY4MTA2YjU5Mz5dIGNvbXBsZXRlKzB4
MTgvMHg1MA0KPiANCj4gIHN0YWNrIGJhY2t0cmFjZToNCj4gQ1BVOiAxIFBJRDogMTQ4IENvbW06
IGZzY2suZXh0NCBOb3QgdGFpbnRlZCA0LjEzLjAtcmM2LW5leHQtMjAxNzA4MjItZGJnLTAwMDIw
LWczOTc1OGVkOGFhZTAtZGlydHkgIzE3NDYNCj4gQ2FsbCBUcmFjZToNCj4gIGR1bXBfc3RhY2sr
MHg2Ny8weDhlDQo+ICBwcmludF9jaXJjdWxhcl9idWcrMHgyYTEvMHgyYWYNCj4gID8gemFwX2Ns
YXNzKzB4YzUvMHhjNQ0KPiAgY2hlY2tfcHJldl9hZGQrMHg3Ni8weDIwZA0KPiAgPyBfX2xvY2tf
YWNxdWlyZSsweGMyNy8weGNjOA0KPiAgbG9ja19jb21taXRfY3Jvc3Nsb2NrKzB4MzI3LzB4MzVl
DQo+ICBjb21wbGV0ZSsweDI0LzB4NTANCj4gIHNjc2lfZW5kX3JlcXVlc3QrMHg4ZC8weDE3NiBb
c2NzaV9tb2RdDQo+ICBzY3NpX2lvX2NvbXBsZXRpb24rMHgxYmUvMHg0MjMgW3Njc2lfbW9kXQ0K
PiAgX19ibGtfbXFfY29tcGxldGVfcmVxdWVzdCsweDExMi8weDEzMQ0KPiAgYXRhX3Njc2lfc2lt
dWxhdGUrMHgyMTIvMHgyMTggW2xpYmF0YV0NCj4gIF9fYXRhX3Njc2lfcXVldWVjbWQrMHgxYmUv
MHgxZGUgW2xpYmF0YV0NCj4gIGF0YV9zY3NpX3F1ZXVlY21kKzB4NDEvMHg3NCBbbGliYXRhXQ0K
PiAgc2NzaV9kaXNwYXRjaF9jbWQrMHgxOTQvMHgyYWYgW3Njc2lfbW9kXQ0KPiAgc2NzaV9xdWV1
ZV9ycSsweDFlMC8weDI2ZiBbc2NzaV9tb2RdDQo+ICBibGtfbXFfZGlzcGF0Y2hfcnFfbGlzdCsw
eDE5My8weDJhNw0KPiAgPyBfcmF3X3NwaW5fdW5sb2NrKzB4MmUvMHg0MA0KPiAgYmxrX21xX3Nj
aGVkX2Rpc3BhdGNoX3JlcXVlc3RzKzB4MTMyLzB4MTc2DQo+ICBfX2Jsa19tcV9ydW5faHdfcXVl
dWUrMHg1OS8weGM1DQo+ICBfX2Jsa19tcV9kZWxheV9ydW5faHdfcXVldWUrMHg1Zi8weGMxDQo+
ICBibGtfbXFfZmx1c2hfcGx1Z19saXN0KzB4ZmMvMHgxMGINCj4gIGJsa19mbHVzaF9wbHVnX2xp
c3QrMHhjNi8weDFlYg0KPiAgYmxrX2ZpbmlzaF9wbHVnKzB4MjUvMHgzMg0KPiAgZ2VuZXJpY193
cml0ZXBhZ2VzKzB4NTYvMHg2Mw0KPiAgZG9fd3JpdGVwYWdlcysweDM2LzB4NzANCj4gIF9fZmls
ZW1hcF9mZGF0YXdyaXRlX3JhbmdlKzB4NTkvMHg1Zg0KPiAgZmlsZW1hcF93cml0ZV9hbmRfd2Fp
dCsweDE5LzB4NGYNCj4gIF9fYmxrZGV2X3B1dCsweDVmLzB4MTkwDQo+ICBibGtkZXZfY2xvc2Ur
MHgyNC8weDI3DQo+ICBfX2ZwdXQrMHhlZS8weDE4YQ0KPiAgdGFza193b3JrX3J1bisweDc5LzB4
YTANCj4gIHByZXBhcmVfZXhpdF90b191c2VybW9kZSsweDliLzB4YjUNCj4gIGVudHJ5X1NZU0NB
TExfNjRfZmFzdHBhdGgrMHhhYi8weGFkDQoNCkJ5dW5nY2h1bCwgZGlkIHlvdSBhZGQgdGhlIGNy
b3NzbG9jayBjaGVja3MgdG8gbG9ja2RlcD8gQ2FuIHlvdSBoYXZlIGEgbG9vayBhdA0KdGhlIGFi
b3ZlIHJlcG9ydD8gVGhhdCByZXBvcnQgbmFtZWx5IGRvZXNuJ3QgbWFrZSBzZW5zZSB0byBtZS4N
Cg0KQmFydC4=

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-22 21:43     ` Bart Van Assche
  0 siblings, 0 replies; 49+ messages in thread
From: Bart Van Assche @ 2017-08-22 21:43 UTC (permalink / raw)
  To: byungchul.park
  Cc: linux-kernel, linux-block, peterz, sergey.senozhatsky.work,
	martin.petersen, axboe, linux-scsi, sfr, linux-next

On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> ------------------------------------------------------
> fsck.ext4/148 is trying to acquire lock:
>  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> 
>  but now in release context of a crosslock acquired at the following:
>  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> 
>  which lock already depends on the new lock.
> 
>  the existing dependency chain (in reverse order) is:
> 
>  -> #1 ((complete)&wait#2){+.+.}:
>        lock_acquire+0x176/0x19e
>        __wait_for_common+0x50/0x1e3
>        blk_execute_rq+0xbb/0xda
>        scsi_execute+0xc3/0x17d [scsi_mod]
>        sd_revalidate_disk+0x112/0x1549 [sd_mod]
>        rescan_partitions+0x48/0x2c4
>        __blkdev_get+0x14b/0x37c
>        blkdev_get+0x191/0x2c0
>        device_add_disk+0x2b4/0x3e5
>        sd_probe_async+0xf8/0x17e [sd_mod]
>        async_run_entry_fn+0x34/0xe0
>        process_one_work+0x2af/0x4d1
>        worker_thread+0x19a/0x24f
>        kthread+0x133/0x13b
>        ret_from_fork+0x27/0x40
> 
>  -> #0 (&bdev->bd_mutex){+.+.}:
>        __blkdev_put+0x33/0x190
>        blkdev_close+0x24/0x27
>        __fput+0xee/0x18a
>        task_work_run+0x79/0xa0
>        prepare_exit_to_usermode+0x9b/0xb5
> 
>  other info that might help us debug this:
>  Possible unsafe locking scenario by crosslock:
>        CPU0                    CPU1
>        ----                    ----
>   lock(&bdev->bd_mutex);
>   lock((complete)&wait#2);
>                                lock(&bdev->bd_mutex);
>                                unlock((complete)&wait#2);
> 
>   *** DEADLOCK ***
> 4 locks held by fsck.ext4/148:
>  #0:  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
>  #1:  (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
>  #2:  (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
>  #3:  (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50
> 
>  stack backtrace:
> CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
> Call Trace:
>  dump_stack+0x67/0x8e
>  print_circular_bug+0x2a1/0x2af
>  ? zap_class+0xc5/0xc5
>  check_prev_add+0x76/0x20d
>  ? __lock_acquire+0xc27/0xcc8
>  lock_commit_crosslock+0x327/0x35e
>  complete+0x24/0x50
>  scsi_end_request+0x8d/0x176 [scsi_mod]
>  scsi_io_completion+0x1be/0x423 [scsi_mod]
>  __blk_mq_complete_request+0x112/0x131
>  ata_scsi_simulate+0x212/0x218 [libata]
>  __ata_scsi_queuecmd+0x1be/0x1de [libata]
>  ata_scsi_queuecmd+0x41/0x74 [libata]
>  scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
>  scsi_queue_rq+0x1e0/0x26f [scsi_mod]
>  blk_mq_dispatch_rq_list+0x193/0x2a7
>  ? _raw_spin_unlock+0x2e/0x40
>  blk_mq_sched_dispatch_requests+0x132/0x176
>  __blk_mq_run_hw_queue+0x59/0xc5
>  __blk_mq_delay_run_hw_queue+0x5f/0xc1
>  blk_mq_flush_plug_list+0xfc/0x10b
>  blk_flush_plug_list+0xc6/0x1eb
>  blk_finish_plug+0x25/0x32
>  generic_writepages+0x56/0x63
>  do_writepages+0x36/0x70
>  __filemap_fdatawrite_range+0x59/0x5f
>  filemap_write_and_wait+0x19/0x4f
>  __blkdev_put+0x5f/0x190
>  blkdev_close+0x24/0x27
>  __fput+0xee/0x18a
>  task_work_run+0x79/0xa0
>  prepare_exit_to_usermode+0x9b/0xb5
>  entry_SYSCALL_64_fastpath+0xab/0xad

Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
the above report? That report namely doesn't make sense to me.

Bart.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-22 21:43     ` Bart Van Assche
  0 siblings, 0 replies; 49+ messages in thread
From: Bart Van Assche @ 2017-08-22 21:43 UTC (permalink / raw)
  To: byungchul.park
  Cc: linux-kernel, linux-block, peterz, sergey.senozhatsky.work,
	martin.petersen, axboe, linux-scsi, sfr, linux-next

On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> ------------------------------------------------------
> fsck.ext4/148 is trying to acquire lock:
>  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> 
>  but now in release context of a crosslock acquired at the following:
>  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> 
>  which lock already depends on the new lock.
> 
>  the existing dependency chain (in reverse order) is:
> 
>  -> #1 ((complete)&wait#2){+.+.}:
>        lock_acquire+0x176/0x19e
>        __wait_for_common+0x50/0x1e3
>        blk_execute_rq+0xbb/0xda
>        scsi_execute+0xc3/0x17d [scsi_mod]
>        sd_revalidate_disk+0x112/0x1549 [sd_mod]
>        rescan_partitions+0x48/0x2c4
>        __blkdev_get+0x14b/0x37c
>        blkdev_get+0x191/0x2c0
>        device_add_disk+0x2b4/0x3e5
>        sd_probe_async+0xf8/0x17e [sd_mod]
>        async_run_entry_fn+0x34/0xe0
>        process_one_work+0x2af/0x4d1
>        worker_thread+0x19a/0x24f
>        kthread+0x133/0x13b
>        ret_from_fork+0x27/0x40
> 
>  -> #0 (&bdev->bd_mutex){+.+.}:
>        __blkdev_put+0x33/0x190
>        blkdev_close+0x24/0x27
>        __fput+0xee/0x18a
>        task_work_run+0x79/0xa0
>        prepare_exit_to_usermode+0x9b/0xb5
> 
>  other info that might help us debug this:
>  Possible unsafe locking scenario by crosslock:
>        CPU0                    CPU1
>        ----                    ----
>   lock(&bdev->bd_mutex);
>   lock((complete)&wait#2);
>                                lock(&bdev->bd_mutex);
>                                unlock((complete)&wait#2);
> 
>   *** DEADLOCK ***
> 4 locks held by fsck.ext4/148:
>  #0:  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
>  #1:  (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
>  #2:  (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
>  #3:  (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50
> 
>  stack backtrace:
> CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
> Call Trace:
>  dump_stack+0x67/0x8e
>  print_circular_bug+0x2a1/0x2af
>  ? zap_class+0xc5/0xc5
>  check_prev_add+0x76/0x20d
>  ? __lock_acquire+0xc27/0xcc8
>  lock_commit_crosslock+0x327/0x35e
>  complete+0x24/0x50
>  scsi_end_request+0x8d/0x176 [scsi_mod]
>  scsi_io_completion+0x1be/0x423 [scsi_mod]
>  __blk_mq_complete_request+0x112/0x131
>  ata_scsi_simulate+0x212/0x218 [libata]
>  __ata_scsi_queuecmd+0x1be/0x1de [libata]
>  ata_scsi_queuecmd+0x41/0x74 [libata]
>  scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
>  scsi_queue_rq+0x1e0/0x26f [scsi_mod]
>  blk_mq_dispatch_rq_list+0x193/0x2a7
>  ? _raw_spin_unlock+0x2e/0x40
>  blk_mq_sched_dispatch_requests+0x132/0x176
>  __blk_mq_run_hw_queue+0x59/0xc5
>  __blk_mq_delay_run_hw_queue+0x5f/0xc1
>  blk_mq_flush_plug_list+0xfc/0x10b
>  blk_flush_plug_list+0xc6/0x1eb
>  blk_finish_plug+0x25/0x32
>  generic_writepages+0x56/0x63
>  do_writepages+0x36/0x70
>  __filemap_fdatawrite_range+0x59/0x5f
>  filemap_write_and_wait+0x19/0x4f
>  __blkdev_put+0x5f/0x190
>  blkdev_close+0x24/0x27
>  __fput+0xee/0x18a
>  task_work_run+0x79/0xa0
>  prepare_exit_to_usermode+0x9b/0xb5
>  entry_SYSCALL_64_fastpath+0xab/0xad

Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
the above report? That report namely doesn't make sense to me.

Bart.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22 19:36           ` Paul E. McKenney
@ 2017-08-22 21:57             ` Stephen Rothwell
  2017-08-22 22:27               ` Stephen Rothwell
  0 siblings, 1 reply; 49+ messages in thread
From: Stephen Rothwell @ 2017-08-22 21:57 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, ppc-dev,
	Nicholas Piggin

Hi Paul,

On Tue, 22 Aug 2017 12:36:20 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> To complete the thought, if you aren't already using it, I suggest
> applying Nick's patch:
> 
> http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com

OK, I applied that - with a little shoehorning due to commit

  71acb768f5b3 ("timers: Fix excessive granularity of new timers after a nohz idle")

from your cru tree.

my qemu test now boots and shuts down fine.

-- 
Cheers,
Stephen Rothwell

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: linux-next: Tree for Aug 22
  2017-08-22 21:57             ` Stephen Rothwell
@ 2017-08-22 22:27               ` Stephen Rothwell
  0 siblings, 0 replies; 49+ messages in thread
From: Stephen Rothwell @ 2017-08-22 22:27 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, ppc-dev,
	Nicholas Piggin

Hi Paul,

On Wed, 23 Aug 2017 07:57:05 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> On Tue, 22 Aug 2017 12:36:20 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >
> > To complete the thought, if you aren't already using it, I suggest
> > applying Nick's patch:
> > 
> > http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com  
> 
> OK, I applied that - with a little shoehorning due to commit
> 
>   71acb768f5b3 ("timers: Fix excessive granularity of new timers after a nohz idle")
> 
> from your cru tree.
            ^^^
(I meant rcu, of course)

I will apply the resulting patch to linux-next (as part of the rcu tree
merge) today - unless you get around to updating your tree before then.

-- 
Cheers,
Stephen Rothwell

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-22 21:43     ` Bart Van Assche
  (?)
@ 2017-08-23  0:03       ` Byungchul Park
  -1 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  0:03 UTC (permalink / raw)
  To: Bart Van Assche, peterz
  Cc: linux-kernel, linux-block, peterz, sergey.senozhatsky.work,
	martin.petersen, axboe, linux-scsi, sfr, linux-next, kernel-team

On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > ------------------------------------------------------
> > fsck.ext4/148 is trying to acquire lock:
> >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > 
> >  but now in release context of a crosslock acquired at the following:
> >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > 
> >  which lock already depends on the new lock.
> > 
> >  the existing dependency chain (in reverse order) is:
> > 
> >  -> #1 ((complete)&wait#2){+.+.}:
> >        lock_acquire+0x176/0x19e
> >        __wait_for_common+0x50/0x1e3
> >        blk_execute_rq+0xbb/0xda
> >        scsi_execute+0xc3/0x17d [scsi_mod]
> >        sd_revalidate_disk+0x112/0x1549 [sd_mod]
> >        rescan_partitions+0x48/0x2c4
> >        __blkdev_get+0x14b/0x37c
> >        blkdev_get+0x191/0x2c0
> >        device_add_disk+0x2b4/0x3e5
> >        sd_probe_async+0xf8/0x17e [sd_mod]
> >        async_run_entry_fn+0x34/0xe0
> >        process_one_work+0x2af/0x4d1
> >        worker_thread+0x19a/0x24f
> >        kthread+0x133/0x13b
> >        ret_from_fork+0x27/0x40
> > 
> >  -> #0 (&bdev->bd_mutex){+.+.}:
> >        __blkdev_put+0x33/0x190
> >        blkdev_close+0x24/0x27
> >        __fput+0xee/0x18a
> >        task_work_run+0x79/0xa0
> >        prepare_exit_to_usermode+0x9b/0xb5
> > 
> >  other info that might help us debug this:
> >  Possible unsafe locking scenario by crosslock:
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&bdev->bd_mutex);
> >   lock((complete)&wait#2);
> >                                lock(&bdev->bd_mutex);
> >                                unlock((complete)&wait#2);
> > 
> >   *** DEADLOCK ***
> > 4 locks held by fsck.ext4/148:
> >  #0:  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> >  #1:  (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
> >  #2:  (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
> >  #3:  (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50
> > 
> >  stack backtrace:
> > CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
> > Call Trace:
> >  dump_stack+0x67/0x8e
> >  print_circular_bug+0x2a1/0x2af
> >  ? zap_class+0xc5/0xc5
> >  check_prev_add+0x76/0x20d
> >  ? __lock_acquire+0xc27/0xcc8
> >  lock_commit_crosslock+0x327/0x35e
> >  complete+0x24/0x50
> >  scsi_end_request+0x8d/0x176 [scsi_mod]
> >  scsi_io_completion+0x1be/0x423 [scsi_mod]
> >  __blk_mq_complete_request+0x112/0x131
> >  ata_scsi_simulate+0x212/0x218 [libata]
> >  __ata_scsi_queuecmd+0x1be/0x1de [libata]
> >  ata_scsi_queuecmd+0x41/0x74 [libata]
> >  scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
> >  scsi_queue_rq+0x1e0/0x26f [scsi_mod]
> >  blk_mq_dispatch_rq_list+0x193/0x2a7
> >  ? _raw_spin_unlock+0x2e/0x40
> >  blk_mq_sched_dispatch_requests+0x132/0x176
> >  __blk_mq_run_hw_queue+0x59/0xc5
> >  __blk_mq_delay_run_hw_queue+0x5f/0xc1
> >  blk_mq_flush_plug_list+0xfc/0x10b
> >  blk_flush_plug_list+0xc6/0x1eb
> >  blk_finish_plug+0x25/0x32
> >  generic_writepages+0x56/0x63
> >  do_writepages+0x36/0x70
> >  __filemap_fdatawrite_range+0x59/0x5f
> >  filemap_write_and_wait+0x19/0x4f
> >  __blkdev_put+0x5f/0x190
> >  blkdev_close+0x24/0x27
> >  __fput+0xee/0x18a
> >  task_work_run+0x79/0xa0
> >  prepare_exit_to_usermode+0x9b/0xb5
> >  entry_SYSCALL_64_fastpath+0xab/0xad
> 
> Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
> the above report? That report namely doesn't make sense to me.

The report is talking about the following lockup:

A work in a worker                     A task work on exit to user
------------------                     ---------------------------
mutex_lock(&bdev->bd_mutex)
                                       mutext_lock(&bdev->bd_mutex)
blk_execute_rq()
   wait_for_completion_io_timeout(&A)
                                       complete(&A)

Is this impossible?

To Peterz,

Anyway I wanted to avoid lockdep reports in the case using a timeout
interface. Do you think it's still worth reporting the kind of lockup?
I'm ok if you do.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  0:03       ` Byungchul Park
  0 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  0:03 UTC (permalink / raw)
  To: Bart Van Assche, peterz
  Cc: linux-kernel, linux-block, peterz, sergey.senozhatsky.work,
	martin.petersen, axboe, linux-scsi, sfr, linux-next, kernel-team

On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > ------------------------------------------------------
> > fsck.ext4/148 is trying to acquire lock:
> >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > 
> >  but now in release context of a crosslock acquired at the following:
> >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > 
> >  which lock already depends on the new lock.
> > 
> >  the existing dependency chain (in reverse order) is:
> > 
> >  -> #1 ((complete)&wait#2){+.+.}:
> >        lock_acquire+0x176/0x19e
> >        __wait_for_common+0x50/0x1e3
> >        blk_execute_rq+0xbb/0xda
> >        scsi_execute+0xc3/0x17d [scsi_mod]
> >        sd_revalidate_disk+0x112/0x1549 [sd_mod]
> >        rescan_partitions+0x48/0x2c4
> >        __blkdev_get+0x14b/0x37c
> >        blkdev_get+0x191/0x2c0
> >        device_add_disk+0x2b4/0x3e5
> >        sd_probe_async+0xf8/0x17e [sd_mod]
> >        async_run_entry_fn+0x34/0xe0
> >        process_one_work+0x2af/0x4d1
> >        worker_thread+0x19a/0x24f
> >        kthread+0x133/0x13b
> >        ret_from_fork+0x27/0x40
> > 
> >  -> #0 (&bdev->bd_mutex){+.+.}:
> >        __blkdev_put+0x33/0x190
> >        blkdev_close+0x24/0x27
> >        __fput+0xee/0x18a
> >        task_work_run+0x79/0xa0
> >        prepare_exit_to_usermode+0x9b/0xb5
> > 
> >  other info that might help us debug this:
> >  Possible unsafe locking scenario by crosslock:
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&bdev->bd_mutex);
> >   lock((complete)&wait#2);
> >                                lock(&bdev->bd_mutex);
> >                                unlock((complete)&wait#2);
> > 
> >   *** DEADLOCK ***
> > 4 locks held by fsck.ext4/148:
> >  #0:  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> >  #1:  (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
> >  #2:  (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
> >  #3:  (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50
> > 
> >  stack backtrace:
> > CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
> > Call Trace:
> >  dump_stack+0x67/0x8e
> >  print_circular_bug+0x2a1/0x2af
> >  ? zap_class+0xc5/0xc5
> >  check_prev_add+0x76/0x20d
> >  ? __lock_acquire+0xc27/0xcc8
> >  lock_commit_crosslock+0x327/0x35e
> >  complete+0x24/0x50
> >  scsi_end_request+0x8d/0x176 [scsi_mod]
> >  scsi_io_completion+0x1be/0x423 [scsi_mod]
> >  __blk_mq_complete_request+0x112/0x131
> >  ata_scsi_simulate+0x212/0x218 [libata]
> >  __ata_scsi_queuecmd+0x1be/0x1de [libata]
> >  ata_scsi_queuecmd+0x41/0x74 [libata]
> >  scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
> >  scsi_queue_rq+0x1e0/0x26f [scsi_mod]
> >  blk_mq_dispatch_rq_list+0x193/0x2a7
> >  ? _raw_spin_unlock+0x2e/0x40
> >  blk_mq_sched_dispatch_requests+0x132/0x176
> >  __blk_mq_run_hw_queue+0x59/0xc5
> >  __blk_mq_delay_run_hw_queue+0x5f/0xc1
> >  blk_mq_flush_plug_list+0xfc/0x10b
> >  blk_flush_plug_list+0xc6/0x1eb
> >  blk_finish_plug+0x25/0x32
> >  generic_writepages+0x56/0x63
> >  do_writepages+0x36/0x70
> >  __filemap_fdatawrite_range+0x59/0x5f
> >  filemap_write_and_wait+0x19/0x4f
> >  __blkdev_put+0x5f/0x190
> >  blkdev_close+0x24/0x27
> >  __fput+0xee/0x18a
> >  task_work_run+0x79/0xa0
> >  prepare_exit_to_usermode+0x9b/0xb5
> >  entry_SYSCALL_64_fastpath+0xab/0xad
> 
> Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
> the above report? That report namely doesn't make sense to me.

The report is talking about the following lockup:

A work in a worker                     A task work on exit to user
------------------                     ---------------------------
mutex_lock(&bdev->bd_mutex)
                                       mutext_lock(&bdev->bd_mutex)
blk_execute_rq()
   wait_for_completion_io_timeout(&A)
                                       complete(&A)

Is this impossible?

To Peterz,

Anyway I wanted to avoid lockdep reports in the case using a timeout
interface. Do you think it's still worth reporting the kind of lockup?
I'm ok if you do.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  0:03       ` Byungchul Park
  0 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  0:03 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-kernel, linux-block, peterz, sergey.senozhatsky.work,
	martin.petersen, axboe, linux-scsi, sfr, linux-next, kernel-team

On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > ------------------------------------------------------
> > fsck.ext4/148 is trying to acquire lock:
> >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > 
> >  but now in release context of a crosslock acquired at the following:
> >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > 
> >  which lock already depends on the new lock.
> > 
> >  the existing dependency chain (in reverse order) is:
> > 
> >  -> #1 ((complete)&wait#2){+.+.}:
> >        lock_acquire+0x176/0x19e
> >        __wait_for_common+0x50/0x1e3
> >        blk_execute_rq+0xbb/0xda
> >        scsi_execute+0xc3/0x17d [scsi_mod]
> >        sd_revalidate_disk+0x112/0x1549 [sd_mod]
> >        rescan_partitions+0x48/0x2c4
> >        __blkdev_get+0x14b/0x37c
> >        blkdev_get+0x191/0x2c0
> >        device_add_disk+0x2b4/0x3e5
> >        sd_probe_async+0xf8/0x17e [sd_mod]
> >        async_run_entry_fn+0x34/0xe0
> >        process_one_work+0x2af/0x4d1
> >        worker_thread+0x19a/0x24f
> >        kthread+0x133/0x13b
> >        ret_from_fork+0x27/0x40
> > 
> >  -> #0 (&bdev->bd_mutex){+.+.}:
> >        __blkdev_put+0x33/0x190
> >        blkdev_close+0x24/0x27
> >        __fput+0xee/0x18a
> >        task_work_run+0x79/0xa0
> >        prepare_exit_to_usermode+0x9b/0xb5
> > 
> >  other info that might help us debug this:
> >  Possible unsafe locking scenario by crosslock:
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&bdev->bd_mutex);
> >   lock((complete)&wait#2);
> >                                lock(&bdev->bd_mutex);
> >                                unlock((complete)&wait#2);
> > 
> >   *** DEADLOCK ***
> > 4 locks held by fsck.ext4/148:
> >  #0:  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> >  #1:  (rcu_read_lock){....}, at: [<ffffffff81217f16>] rcu_lock_acquire+0x0/0x20
> >  #2:  (&(&host->lock)->rlock){-.-.}, at: [<ffffffffa00e7550>] ata_scsi_queuecmd+0x23/0x74 [libata]
> >  #3:  (&x->wait#14){-...}, at: [<ffffffff8106b593>] complete+0x18/0x50
> > 
> >  stack backtrace:
> > CPU: 1 PID: 148 Comm: fsck.ext4 Not tainted 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746
> > Call Trace:
> >  dump_stack+0x67/0x8e
> >  print_circular_bug+0x2a1/0x2af
> >  ? zap_class+0xc5/0xc5
> >  check_prev_add+0x76/0x20d
> >  ? __lock_acquire+0xc27/0xcc8
> >  lock_commit_crosslock+0x327/0x35e
> >  complete+0x24/0x50
> >  scsi_end_request+0x8d/0x176 [scsi_mod]
> >  scsi_io_completion+0x1be/0x423 [scsi_mod]
> >  __blk_mq_complete_request+0x112/0x131
> >  ata_scsi_simulate+0x212/0x218 [libata]
> >  __ata_scsi_queuecmd+0x1be/0x1de [libata]
> >  ata_scsi_queuecmd+0x41/0x74 [libata]
> >  scsi_dispatch_cmd+0x194/0x2af [scsi_mod]
> >  scsi_queue_rq+0x1e0/0x26f [scsi_mod]
> >  blk_mq_dispatch_rq_list+0x193/0x2a7
> >  ? _raw_spin_unlock+0x2e/0x40
> >  blk_mq_sched_dispatch_requests+0x132/0x176
> >  __blk_mq_run_hw_queue+0x59/0xc5
> >  __blk_mq_delay_run_hw_queue+0x5f/0xc1
> >  blk_mq_flush_plug_list+0xfc/0x10b
> >  blk_flush_plug_list+0xc6/0x1eb
> >  blk_finish_plug+0x25/0x32
> >  generic_writepages+0x56/0x63
> >  do_writepages+0x36/0x70
> >  __filemap_fdatawrite_range+0x59/0x5f
> >  filemap_write_and_wait+0x19/0x4f
> >  __blkdev_put+0x5f/0x190
> >  blkdev_close+0x24/0x27
> >  __fput+0xee/0x18a
> >  task_work_run+0x79/0xa0
> >  prepare_exit_to_usermode+0x9b/0xb5
> >  entry_SYSCALL_64_fastpath+0xab/0xad
> 
> Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
> the above report? That report namely doesn't make sense to me.

The report is talking about the following lockup:

A work in a worker                     A task work on exit to user
------------------                     ---------------------------
mutex_lock(&bdev->bd_mutex)
                                       mutext_lock(&bdev->bd_mutex)
blk_execute_rq()
   wait_for_completion_io_timeout(&A)
                                       complete(&A)

Is this impossible?

To Peterz,

Anyway I wanted to avoid lockdep reports in the case using a timeout
interface. Do you think it's still worth reporting the kind of lockup?
I'm ok if you do.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  0:03       ` Byungchul Park
@ 2017-08-23  2:36         ` Sergey Senozhatsky
  -1 siblings, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-23  2:36 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On (08/23/17 09:03), Byungchul Park wrote:
[..]

aha, ok

> The report is talking about the following lockup:
> 
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutext_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
> 
> Is this impossible?

I was really confused how this "unlock" may lead to a deadlock

> > >  other info that might help us debug this:
> > >  Possible unsafe locking scenario by crosslock:
> > >        CPU0                    CPU1
> > >        ----                    ----
> > >   lock(&bdev->bd_mutex);
> > >   lock((complete)&wait#2);
> > >                                lock(&bdev->bd_mutex);
> > >                                unlock((complete)&wait#2);


any chance the report can be improved? mention timeout, etc?
// well, if this functionality will stay.


p.s.
Bart Van Assche, thanks for Cc-ing Park Byungchul, I was really
sure I didn't enabled the cross-release, but apparently I was wrong:
 CONFIG_LOCKDEP_CROSSRELEASE=y
 CONFIG_LOCKDEP_COMPLETIONS=y

	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  2:36         ` Sergey Senozhatsky
  0 siblings, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-23  2:36 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On (08/23/17 09:03), Byungchul Park wrote:
[..]

aha, ok

> The report is talking about the following lockup:
> 
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutext_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
> 
> Is this impossible?

I was really confused how this "unlock" may lead to a deadlock

> > >  other info that might help us debug this:
> > >  Possible unsafe locking scenario by crosslock:
> > >        CPU0                    CPU1
> > >        ----                    ----
> > >   lock(&bdev->bd_mutex);
> > >   lock((complete)&wait#2);
> > >                                lock(&bdev->bd_mutex);
> > >                                unlock((complete)&wait#2);


any chance the report can be improved? mention timeout, etc?
// well, if this functionality will stay.


p.s.
Bart Van Assche, thanks for Cc-ing Park Byungchul, I was really
sure I didn't enabled the cross-release, but apparently I was wrong:
 CONFIG_LOCKDEP_CROSSRELEASE=y
 CONFIG_LOCKDEP_COMPLETIONS=y

	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  2:36         ` Sergey Senozhatsky
  (?)
@ 2017-08-23  2:59         ` Byungchul Park
  -1 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  2:59 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	martin.petersen, axboe, linux-scsi, sfr, linux-next, kernel-team

On Wed, Aug 23, 2017 at 11:36:49AM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 09:03), Byungchul Park wrote:
> [..]
> 
> aha, ok
> 
> > The report is talking about the following lockup:
> > 
> > A work in a worker                     A task work on exit to user
> > ------------------                     ---------------------------
> > mutex_lock(&bdev->bd_mutex)
> >                                        mutext_lock(&bdev->bd_mutex)
> > blk_execute_rq()
> >    wait_for_completion_io_timeout(&A)
> >                                        complete(&A)
> > 
> > Is this impossible?
> 
> I was really confused how this "unlock" may lead to a deadlock

Hi Sergey,

Right. It should be enhanced.

> 
> > > >  other info that might help us debug this:
> > > >  Possible unsafe locking scenario by crosslock:
> > > >        CPU0                    CPU1
> > > >        ----                    ----
> > > >   lock(&bdev->bd_mutex);
> > > >   lock((complete)&wait#2);
> > > >                                lock(&bdev->bd_mutex);
> > > >                                unlock((complete)&wait#2);
> 
> 
> any chance the report can be improved? mention timeout, etc?
> // well, if this functionality will stay.
> 
> 
> p.s.
> Bart Van Assche, thanks for Cc-ing Park Byungchul, I was really
> sure I didn't enabled the cross-release, but apparently I was wrong:
>  CONFIG_LOCKDEP_CROSSRELEASE=y
>  CONFIG_LOCKDEP_COMPLETIONS=y
> 
> 	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  0:03       ` Byungchul Park
  (?)
@ 2017-08-23  3:49         ` Boqun Feng
  -1 siblings, 0 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  3:49 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

Hi Byungchul,

On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D
> > > WARNING: possible circular locking dependency detected
> > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tain=
ted
> > > ------------------------------------------------------
> > > fsck.ext4/148 is trying to acquire lock:
> > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/=
0x190
> > >=20
> > >  but now in release context of a crosslock acquired at the following:
> > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0=
xbb/0xda
> > >=20
> > >  which lock already depends on the new lock.
> > >=20

I felt this message really misleading, because the deadlock is detected
at the commit time of "((complete)&wait#2)" rather than the acquisition
time of "(&bdev->bd_mutex)", so I made the following improvement.

Thoughts?

Regards,
Boqun

----------------------->8
=46rom: Boqun Feng <boqun.feng@gmail.com>
Date: Wed, 23 Aug 2017 10:18:30 +0800
Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
 splats

When a crossrelease related deadlock is detected in a commit, the
current implemention makes splats like:

> fsck.ext4/148 is trying to acquire lock:
>  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
>
>  but now in release context of a crosslock acquired at the following:
>  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/=
0xda
>
>  which lock already depends on the new lock.
> ...

However, it could be misleading because the current task has got the
lock already, and in fact the deadlock is detected when it is doing the
commit of the crossrelease lock. So make the splats more accurate to
describe the deadlock case.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 66011c9f5df3..642fb5362507 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, =
unsigned int depth,
 	pr_warn("WARNING: possible circular locking dependency detected\n");
 	print_kernel_ident();
 	pr_warn("------------------------------------------------------\n");
-	pr_warn("%s/%d is trying to acquire lock:\n",
-		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
=20
-	if (cross_lock(check_tgt->instance))
-		pr_warn("\nbut now in release context of a crosslock acquired at the fol=
lowing:\n");
-	else
+	if (cross_lock(check_tgt->instance)) {
+		pr_warn("%s/%d is committing a crossrelease lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_tgt);
+		pr_warn("\n, with the following lock held:\n");
+		print_lock(check_src);
+		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
+	} else {
+		pr_warn("%s/%d is trying to acquire lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_src);
 		pr_warn("\nbut task is already holding lock:\n");
+		print_lock(check_tgt);
+		pr_warn("\nwhich lock already depends on the new lock.\n\n");
+	}
=20
-	print_lock(check_tgt);
-	pr_warn("\nwhich lock already depends on the new lock.\n\n");
 	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
=20
 	print_circular_bug_entry(entry, depth);
--=20
2.14.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  3:49         ` Boqun Feng
  0 siblings, 0 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  3:49 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

Hi Byungchul,

On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > ------------------------------------------------------
> > > fsck.ext4/148 is trying to acquire lock:
> > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > 
> > >  but now in release context of a crosslock acquired at the following:
> > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > 
> > >  which lock already depends on the new lock.
> > > 

I felt this message really misleading, because the deadlock is detected
at the commit time of "((complete)&wait#2)" rather than the acquisition
time of "(&bdev->bd_mutex)", so I made the following improvement.

Thoughts?

Regards,
Boqun

----------------------->8
From: Boqun Feng <boqun.feng@gmail.com>
Date: Wed, 23 Aug 2017 10:18:30 +0800
Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
 splats

When a crossrelease related deadlock is detected in a commit, the
current implemention makes splats like:

> fsck.ext4/148 is trying to acquire lock:
>  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
>
>  but now in release context of a crosslock acquired at the following:
>  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
>
>  which lock already depends on the new lock.
> ...

However, it could be misleading because the current task has got the
lock already, and in fact the deadlock is detected when it is doing the
commit of the crossrelease lock. So make the splats more accurate to
describe the deadlock case.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 66011c9f5df3..642fb5362507 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
 	pr_warn("WARNING: possible circular locking dependency detected\n");
 	print_kernel_ident();
 	pr_warn("------------------------------------------------------\n");
-	pr_warn("%s/%d is trying to acquire lock:\n",
-		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
 
-	if (cross_lock(check_tgt->instance))
-		pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
-	else
+	if (cross_lock(check_tgt->instance)) {
+		pr_warn("%s/%d is committing a crossrelease lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_tgt);
+		pr_warn("\n, with the following lock held:\n");
+		print_lock(check_src);
+		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
+	} else {
+		pr_warn("%s/%d is trying to acquire lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_src);
 		pr_warn("\nbut task is already holding lock:\n");
+		print_lock(check_tgt);
+		pr_warn("\nwhich lock already depends on the new lock.\n\n");
+	}
 
-	print_lock(check_tgt);
-	pr_warn("\nwhich lock already depends on the new lock.\n\n");
 	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
 
 	print_circular_bug_entry(entry, depth);
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  3:49         ` Boqun Feng
  0 siblings, 0 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  3:49 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

Hi Byungchul,

On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > ------------------------------------------------------
> > > fsck.ext4/148 is trying to acquire lock:
> > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > 
> > >  but now in release context of a crosslock acquired at the following:
> > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > 
> > >  which lock already depends on the new lock.
> > > 

I felt this message really misleading, because the deadlock is detected
at the commit time of "((complete)&wait#2)" rather than the acquisition
time of "(&bdev->bd_mutex)", so I made the following improvement.

Thoughts?

Regards,
Boqun

----------------------->8
From: Boqun Feng <boqun.feng@gmail.com>
Date: Wed, 23 Aug 2017 10:18:30 +0800
Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
 splats

When a crossrelease related deadlock is detected in a commit, the
current implemention makes splats like:

> fsck.ext4/148 is trying to acquire lock:
>  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
>
>  but now in release context of a crosslock acquired at the following:
>  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
>
>  which lock already depends on the new lock.
> ...

However, it could be misleading because the current task has got the
lock already, and in fact the deadlock is detected when it is doing the
commit of the crossrelease lock. So make the splats more accurate to
describe the deadlock case.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 66011c9f5df3..642fb5362507 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
 	pr_warn("WARNING: possible circular locking dependency detected\n");
 	print_kernel_ident();
 	pr_warn("------------------------------------------------------\n");
-	pr_warn("%s/%d is trying to acquire lock:\n",
-		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
 
-	if (cross_lock(check_tgt->instance))
-		pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
-	else
+	if (cross_lock(check_tgt->instance)) {
+		pr_warn("%s/%d is committing a crossrelease lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_tgt);
+		pr_warn("\n, with the following lock held:\n");
+		print_lock(check_src);
+		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
+	} else {
+		pr_warn("%s/%d is trying to acquire lock:\n",
+			curr->comm, task_pid_nr(curr));
+		print_lock(check_src);
 		pr_warn("\nbut task is already holding lock:\n");
+		print_lock(check_tgt);
+		pr_warn("\nwhich lock already depends on the new lock.\n\n");
+	}
 
-	print_lock(check_tgt);
-	pr_warn("\nwhich lock already depends on the new lock.\n\n");
 	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
 
 	print_circular_bug_entry(entry, depth);
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  3:49         ` Boqun Feng
@ 2017-08-23  4:38           ` Boqun Feng
  -1 siblings, 0 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  4:38 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> Hi Byungchul,
> 
> On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > ======================================================
> > > > WARNING: possible circular locking dependency detected
> > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > ------------------------------------------------------
> > > > fsck.ext4/148 is trying to acquire lock:
> > > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > > 
> > > >  but now in release context of a crosslock acquired at the following:
> > > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > > 
> > > >  which lock already depends on the new lock.
> > > > 
> 
> I felt this message really misleading, because the deadlock is detected
> at the commit time of "((complete)&wait#2)" rather than the acquisition
> time of "(&bdev->bd_mutex)", so I made the following improvement.
> 
> Thoughts?
> 
> Regards,
> Boqun
> 

While I'm on this one, I think we should also add a case in @check_src
is a cross lock, i.e. we detect cross deadlock at the acquisition time
of the cross lock. How about the following?

Regards,
Boqun

--------------------------------------->8
From: Boqun Feng <boqun.feng@gmail.com>
Date: Wed, 23 Aug 2017 12:12:16 +0800
Subject: [PATCH] lockdep: Print proper scenario if cross deadlock detected at
 acquisition time

For a potential deadlock about CROSSRELEASE as follow:

	P1		P2
	===========	=============
	lock(A)
	lock(X)
			lock(A)
			commit(X)

	A: normal lock, X: cross lock

, we could detect it at two places:

1. commit time:

	We have run P1 first, and have dependency A --> X in graph, and
	then we run P2, and find the deadlock.

2. acquisition time:

	We have run P2 first, and have dependency A --> X, in
	graph(because another P3 may run previously and is acquiring for
	lock X), and then we run P1 and find the deadlock.

In current print_circular_lock_scenario(), for 1) we could print the
right scenario and note that's a deadlock related to CROSSRELEASE,
however for 2) we print the scenario as a normal lockdep deadlock.

It's better to print a proper scenario related to CROSSRELEASE to help
users find their bugs more easily, so improve this.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 642fb5362507..a3709e15f609 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
 		__print_lock_name(target);
 		printk(KERN_CONT ");\n");
 		printk("\n *** DEADLOCK ***\n\n");
+	} else if (cross_lock(src->instance)) {
+		printk(" Possible unsafe locking scenario by crosslock:\n\n");
+		printk("       CPU0                    CPU1\n");
+		printk("       ----                    ----\n");
+		printk("  lock(");
+		__print_lock_name(target);
+		printk(KERN_CONT ");\n");
+		printk("  lock(");
+		__print_lock_name(source);
+		printk(KERN_CONT ");\n");
+		printk("                               lock(");
+		__print_lock_name(parent == source ? target : parent);
+		printk(KERN_CONT ");\n");
+		printk("                               unlock(");
+		__print_lock_name(source);
+		printk(KERN_CONT ");\n");
+		printk("\n *** DEADLOCK ***\n\n");
 	} else {
 		printk(" Possible unsafe locking scenario:\n\n");
 		printk("       CPU0                    CPU1\n");
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  4:38           ` Boqun Feng
  0 siblings, 0 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  4:38 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> Hi Byungchul,
> 
> On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > ======================================================
> > > > WARNING: possible circular locking dependency detected
> > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > ------------------------------------------------------
> > > > fsck.ext4/148 is trying to acquire lock:
> > > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > > 
> > > >  but now in release context of a crosslock acquired at the following:
> > > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > > 
> > > >  which lock already depends on the new lock.
> > > > 
> 
> I felt this message really misleading, because the deadlock is detected
> at the commit time of "((complete)&wait#2)" rather than the acquisition
> time of "(&bdev->bd_mutex)", so I made the following improvement.
> 
> Thoughts?
> 
> Regards,
> Boqun
> 

While I'm on this one, I think we should also add a case in @check_src
is a cross lock, i.e. we detect cross deadlock at the acquisition time
of the cross lock. How about the following?

Regards,
Boqun

--------------------------------------->8
From: Boqun Feng <boqun.feng@gmail.com>
Date: Wed, 23 Aug 2017 12:12:16 +0800
Subject: [PATCH] lockdep: Print proper scenario if cross deadlock detected at
 acquisition time

For a potential deadlock about CROSSRELEASE as follow:

	P1		P2
	===========	=============
	lock(A)
	lock(X)
			lock(A)
			commit(X)

	A: normal lock, X: cross lock

, we could detect it at two places:

1. commit time:

	We have run P1 first, and have dependency A --> X in graph, and
	then we run P2, and find the deadlock.

2. acquisition time:

	We have run P2 first, and have dependency A --> X, in
	graph(because another P3 may run previously and is acquiring for
	lock X), and then we run P1 and find the deadlock.

In current print_circular_lock_scenario(), for 1) we could print the
right scenario and note that's a deadlock related to CROSSRELEASE,
however for 2) we print the scenario as a normal lockdep deadlock.

It's better to print a proper scenario related to CROSSRELEASE to help
users find their bugs more easily, so improve this.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 642fb5362507..a3709e15f609 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
 		__print_lock_name(target);
 		printk(KERN_CONT ");\n");
 		printk("\n *** DEADLOCK ***\n\n");
+	} else if (cross_lock(src->instance)) {
+		printk(" Possible unsafe locking scenario by crosslock:\n\n");
+		printk("       CPU0                    CPU1\n");
+		printk("       ----                    ----\n");
+		printk("  lock(");
+		__print_lock_name(target);
+		printk(KERN_CONT ");\n");
+		printk("  lock(");
+		__print_lock_name(source);
+		printk(KERN_CONT ");\n");
+		printk("                               lock(");
+		__print_lock_name(parent == source ? target : parent);
+		printk(KERN_CONT ");\n");
+		printk("                               unlock(");
+		__print_lock_name(source);
+		printk(KERN_CONT ");\n");
+		printk("\n *** DEADLOCK ***\n\n");
 	} else {
 		printk(" Possible unsafe locking scenario:\n\n");
 		printk("       CPU0                    CPU1\n");
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  3:49         ` Boqun Feng
@ 2017-08-23  4:46           ` Byungchul Park
  -1 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  4:46 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> Hi Byungchul,
> 
> On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > ======================================================
> > > > WARNING: possible circular locking dependency detected
> > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > ------------------------------------------------------
> > > > fsck.ext4/148 is trying to acquire lock:
> > > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > > 
> > > >  but now in release context of a crosslock acquired at the following:
> > > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > > 
> > > >  which lock already depends on the new lock.
> > > > 
> 
> I felt this message really misleading, because the deadlock is detected
> at the commit time of "((complete)&wait#2)" rather than the acquisition
> time of "(&bdev->bd_mutex)", so I made the following improvement.
> 
> Thoughts?
> 
> Regards,
> Boqun
> 
> ----------------------->8
> From: Boqun Feng <boqun.feng@gmail.com>
> Date: Wed, 23 Aug 2017 10:18:30 +0800
> Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
>  splats
> 
> When a crossrelease related deadlock is detected in a commit, the
> current implemention makes splats like:
> 
> > fsck.ext4/148 is trying to acquire lock:
> >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> >
> >  but now in release context of a crosslock acquired at the following:
> >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> >
> >  which lock already depends on the new lock.
> > ...
> 
> However, it could be misleading because the current task has got the
> lock already, and in fact the deadlock is detected when it is doing the
> commit of the crossrelease lock. So make the splats more accurate to
> describe the deadlock case.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  kernel/locking/lockdep.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 66011c9f5df3..642fb5362507 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
>  	pr_warn("WARNING: possible circular locking dependency detected\n");
>  	print_kernel_ident();
>  	pr_warn("------------------------------------------------------\n");
> -	pr_warn("%s/%d is trying to acquire lock:\n",
> -		curr->comm, task_pid_nr(curr));
> -	print_lock(check_src);
>  
> -	if (cross_lock(check_tgt->instance))
> -		pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
> -	else
> +	if (cross_lock(check_tgt->instance)) {
> +		pr_warn("%s/%d is committing a crossrelease lock:\n",
> +			curr->comm, task_pid_nr(curr));

I think it would be better to print something in term of acquisition,
since the following print_lock() will print infromation of acquisition.

> +		print_lock(check_tgt);
> +		pr_warn("\n, with the following lock held:\n");

The lock does not have to be held at the commit.

> +		print_lock(check_src);
> +		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
> +	} else {
> +		pr_warn("%s/%d is trying to acquire lock:\n",
> +			curr->comm, task_pid_nr(curr));
> +		print_lock(check_src);
>  		pr_warn("\nbut task is already holding lock:\n");
> +		print_lock(check_tgt);
> +		pr_warn("\nwhich lock already depends on the new lock.\n\n");
> +	}
>  
> -	print_lock(check_tgt);
> -	pr_warn("\nwhich lock already depends on the new lock.\n\n");
>  	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
>  
>  	print_circular_bug_entry(entry, depth);
> -- 
> 2.14.1

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  4:46           ` Byungchul Park
  0 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  4:46 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> Hi Byungchul,
> 
> On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > ======================================================
> > > > WARNING: possible circular locking dependency detected
> > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > ------------------------------------------------------
> > > > fsck.ext4/148 is trying to acquire lock:
> > > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > > 
> > > >  but now in release context of a crosslock acquired at the following:
> > > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > > 
> > > >  which lock already depends on the new lock.
> > > > 
> 
> I felt this message really misleading, because the deadlock is detected
> at the commit time of "((complete)&wait#2)" rather than the acquisition
> time of "(&bdev->bd_mutex)", so I made the following improvement.
> 
> Thoughts?
> 
> Regards,
> Boqun
> 
> ----------------------->8
> From: Boqun Feng <boqun.feng@gmail.com>
> Date: Wed, 23 Aug 2017 10:18:30 +0800
> Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
>  splats
> 
> When a crossrelease related deadlock is detected in a commit, the
> current implemention makes splats like:
> 
> > fsck.ext4/148 is trying to acquire lock:
> >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> >
> >  but now in release context of a crosslock acquired at the following:
> >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> >
> >  which lock already depends on the new lock.
> > ...
> 
> However, it could be misleading because the current task has got the
> lock already, and in fact the deadlock is detected when it is doing the
> commit of the crossrelease lock. So make the splats more accurate to
> describe the deadlock case.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  kernel/locking/lockdep.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 66011c9f5df3..642fb5362507 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
>  	pr_warn("WARNING: possible circular locking dependency detected\n");
>  	print_kernel_ident();
>  	pr_warn("------------------------------------------------------\n");
> -	pr_warn("%s/%d is trying to acquire lock:\n",
> -		curr->comm, task_pid_nr(curr));
> -	print_lock(check_src);
>  
> -	if (cross_lock(check_tgt->instance))
> -		pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
> -	else
> +	if (cross_lock(check_tgt->instance)) {
> +		pr_warn("%s/%d is committing a crossrelease lock:\n",
> +			curr->comm, task_pid_nr(curr));

I think it would be better to print something in term of acquisition,
since the following print_lock() will print infromation of acquisition.

> +		print_lock(check_tgt);
> +		pr_warn("\n, with the following lock held:\n");

The lock does not have to be held at the commit.

> +		print_lock(check_src);
> +		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
> +	} else {
> +		pr_warn("%s/%d is trying to acquire lock:\n",
> +			curr->comm, task_pid_nr(curr));
> +		print_lock(check_src);
>  		pr_warn("\nbut task is already holding lock:\n");
> +		print_lock(check_tgt);
> +		pr_warn("\nwhich lock already depends on the new lock.\n\n");
> +	}
>  
> -	print_lock(check_tgt);
> -	pr_warn("\nwhich lock already depends on the new lock.\n\n");
>  	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
>  
>  	print_circular_bug_entry(entry, depth);
> -- 
> 2.14.1

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  4:38           ` Boqun Feng
@ 2017-08-23  4:46             ` Sergey Senozhatsky
  -1 siblings, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-23  4:46 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Byungchul Park, Bart Van Assche, peterz, linux-kernel,
	linux-block, sergey.senozhatsky.work, martin.petersen, axboe,
	linux-scsi, sfr, linux-next, kernel-team

On (08/23/17 12:38), Boqun Feng wrote:
[..]
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 642fb5362507..a3709e15f609 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
>  		__print_lock_name(target);
>  		printk(KERN_CONT ");\n");

KERN_CONT and "\n" should not be together. "\n" flushes the cont
buffer immediately.

	-ss

>  		printk("\n *** DEADLOCK ***\n\n");
> +	} else if (cross_lock(src->instance)) {
> +		printk(" Possible unsafe locking scenario by crosslock:\n\n");
> +		printk("       CPU0                    CPU1\n");
> +		printk("       ----                    ----\n");
> +		printk("  lock(");
> +		__print_lock_name(target);
> +		printk(KERN_CONT ");\n");
> +		printk("  lock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("                               lock(");
> +		__print_lock_name(parent == source ? target : parent);
> +		printk(KERN_CONT ");\n");
> +		printk("                               unlock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("\n *** DEADLOCK ***\n\n");
>  	} else {
>  		printk(" Possible unsafe locking scenario:\n\n");
>  		printk("       CPU0                    CPU1\n");
> -- 
> 2.14.1
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  4:46             ` Sergey Senozhatsky
  0 siblings, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-23  4:46 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Byungchul Park, Bart Van Assche, peterz, linux-kernel,
	linux-block, sergey.senozhatsky.work, martin.petersen, axboe,
	linux-scsi, sfr, linux-next, kernel-team

On (08/23/17 12:38), Boqun Feng wrote:
[..]
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 642fb5362507..a3709e15f609 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
>  		__print_lock_name(target);
>  		printk(KERN_CONT ");\n");

KERN_CONT and "\n" should not be together. "\n" flushes the cont
buffer immediately.

	-ss

>  		printk("\n *** DEADLOCK ***\n\n");
> +	} else if (cross_lock(src->instance)) {
> +		printk(" Possible unsafe locking scenario by crosslock:\n\n");
> +		printk("       CPU0                    CPU1\n");
> +		printk("       ----                    ----\n");
> +		printk("  lock(");
> +		__print_lock_name(target);
> +		printk(KERN_CONT ");\n");
> +		printk("  lock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("                               lock(");
> +		__print_lock_name(parent == source ? target : parent);
> +		printk(KERN_CONT ");\n");
> +		printk("                               unlock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("\n *** DEADLOCK ***\n\n");
>  	} else {
>  		printk(" Possible unsafe locking scenario:\n\n");
>  		printk("       CPU0                    CPU1\n");
> -- 
> 2.14.1
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  4:46           ` Byungchul Park
@ 2017-08-23  5:01             ` Boqun Feng
  -1 siblings, 0 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  5:01 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

[-- Attachment #1: Type: text/plain, Size: 4914 bytes --]

On Wed, Aug 23, 2017 at 01:46:17PM +0900, Byungchul Park wrote:
> On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> > Hi Byungchul,
> > 
> > On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > > ======================================================
> > > > > WARNING: possible circular locking dependency detected
> > > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > > ------------------------------------------------------
> > > > > fsck.ext4/148 is trying to acquire lock:
> > > > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > > > 
> > > > >  but now in release context of a crosslock acquired at the following:
> > > > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > > > 
> > > > >  which lock already depends on the new lock.
> > > > > 
> > 
> > I felt this message really misleading, because the deadlock is detected
> > at the commit time of "((complete)&wait#2)" rather than the acquisition
> > time of "(&bdev->bd_mutex)", so I made the following improvement.
> > 
> > Thoughts?
> > 
> > Regards,
> > Boqun
> > 
> > ----------------------->8
> > From: Boqun Feng <boqun.feng@gmail.com>
> > Date: Wed, 23 Aug 2017 10:18:30 +0800
> > Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
> >  splats
> > 
> > When a crossrelease related deadlock is detected in a commit, the
> > current implemention makes splats like:
> > 
> > > fsck.ext4/148 is trying to acquire lock:
> > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > >
> > >  but now in release context of a crosslock acquired at the following:
> > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > >
> > >  which lock already depends on the new lock.
> > > ...
> > 
> > However, it could be misleading because the current task has got the
> > lock already, and in fact the deadlock is detected when it is doing the
> > commit of the crossrelease lock. So make the splats more accurate to
> > describe the deadlock case.
> > 
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > ---
> >  kernel/locking/lockdep.c | 22 ++++++++++++++--------
> >  1 file changed, 14 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index 66011c9f5df3..642fb5362507 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
> >  	pr_warn("WARNING: possible circular locking dependency detected\n");
> >  	print_kernel_ident();
> >  	pr_warn("------------------------------------------------------\n");
> > -	pr_warn("%s/%d is trying to acquire lock:\n",
> > -		curr->comm, task_pid_nr(curr));
> > -	print_lock(check_src);
> >  
> > -	if (cross_lock(check_tgt->instance))
> > -		pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
> > -	else
> > +	if (cross_lock(check_tgt->instance)) {
> > +		pr_warn("%s/%d is committing a crossrelease lock:\n",
> > +			curr->comm, task_pid_nr(curr));
> 
> I think it would be better to print something in term of acquisition,
> since the following print_lock() will print infromation of acquisition.
> 

Well, that print_lock() will print the cross lock acquisition
information at other contexts, but the current thread is doing the
commit. So I think the information would be a little misleading. I will
add "aacquired at" to indicate the lock information is for acquisition.

> > +		print_lock(check_tgt);
> > +		pr_warn("\n, with the following lock held:\n");
> 
> The lock does not have to be held at the commit.
> 

Ah.. right.

How about this:

		pr_warn("%s/%d is committing a crossrelease lock acquired at:\n",
			curr->comm, task_pid_nr(curr));
		print_lock(check_tgt);
		pr_warn("\n, after having the following lock held at least once:\n");

Regards,
Boqun

> > +		print_lock(check_src);
> > +		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
> > +	} else {
> > +		pr_warn("%s/%d is trying to acquire lock:\n",
> > +			curr->comm, task_pid_nr(curr));
> > +		print_lock(check_src);
> >  		pr_warn("\nbut task is already holding lock:\n");
> > +		print_lock(check_tgt);
> > +		pr_warn("\nwhich lock already depends on the new lock.\n\n");
> > +	}
> >  
> > -	print_lock(check_tgt);
> > -	pr_warn("\nwhich lock already depends on the new lock.\n\n");
> >  	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
> >  
> >  	print_circular_bug_entry(entry, depth);
> > -- 
> > 2.14.1

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  5:01             ` Boqun Feng
  0 siblings, 0 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  5:01 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

[-- Attachment #1: Type: text/plain, Size: 4914 bytes --]

On Wed, Aug 23, 2017 at 01:46:17PM +0900, Byungchul Park wrote:
> On Wed, Aug 23, 2017 at 11:49:51AM +0800, Boqun Feng wrote:
> > Hi Byungchul,
> > 
> > On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> > > On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:
> > > > On Tue, 2017-08-22 at 19:47 +0900, Sergey Senozhatsky wrote:
> > > > > ======================================================
> > > > > WARNING: possible circular locking dependency detected
> > > > > 4.13.0-rc6-next-20170822-dbg-00020-g39758ed8aae0-dirty #1746 Not tainted
> > > > > ------------------------------------------------------
> > > > > fsck.ext4/148 is trying to acquire lock:
> > > > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > > > > 
> > > > >  but now in release context of a crosslock acquired at the following:
> > > > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > > > > 
> > > > >  which lock already depends on the new lock.
> > > > > 
> > 
> > I felt this message really misleading, because the deadlock is detected
> > at the commit time of "((complete)&wait#2)" rather than the acquisition
> > time of "(&bdev->bd_mutex)", so I made the following improvement.
> > 
> > Thoughts?
> > 
> > Regards,
> > Boqun
> > 
> > ----------------------->8
> > From: Boqun Feng <boqun.feng@gmail.com>
> > Date: Wed, 23 Aug 2017 10:18:30 +0800
> > Subject: [PATCH] lockdep: Improve the readibility of crossrelease related
> >  splats
> > 
> > When a crossrelease related deadlock is detected in a commit, the
> > current implemention makes splats like:
> > 
> > > fsck.ext4/148 is trying to acquire lock:
> > >  (&bdev->bd_mutex){+.+.}, at: [<ffffffff8116e73e>] __blkdev_put+0x33/0x190
> > >
> > >  but now in release context of a crosslock acquired at the following:
> > >  ((complete)&wait#2){+.+.}, at: [<ffffffff812159e0>] blk_execute_rq+0xbb/0xda
> > >
> > >  which lock already depends on the new lock.
> > > ...
> > 
> > However, it could be misleading because the current task has got the
> > lock already, and in fact the deadlock is detected when it is doing the
> > commit of the crossrelease lock. So make the splats more accurate to
> > describe the deadlock case.
> > 
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > ---
> >  kernel/locking/lockdep.c | 22 ++++++++++++++--------
> >  1 file changed, 14 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index 66011c9f5df3..642fb5362507 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -1195,17 +1195,23 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
> >  	pr_warn("WARNING: possible circular locking dependency detected\n");
> >  	print_kernel_ident();
> >  	pr_warn("------------------------------------------------------\n");
> > -	pr_warn("%s/%d is trying to acquire lock:\n",
> > -		curr->comm, task_pid_nr(curr));
> > -	print_lock(check_src);
> >  
> > -	if (cross_lock(check_tgt->instance))
> > -		pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
> > -	else
> > +	if (cross_lock(check_tgt->instance)) {
> > +		pr_warn("%s/%d is committing a crossrelease lock:\n",
> > +			curr->comm, task_pid_nr(curr));
> 
> I think it would be better to print something in term of acquisition,
> since the following print_lock() will print infromation of acquisition.
> 

Well, that print_lock() will print the cross lock acquisition
information at other contexts, but the current thread is doing the
commit. So I think the information would be a little misleading. I will
add "aacquired at" to indicate the lock information is for acquisition.

> > +		print_lock(check_tgt);
> > +		pr_warn("\n, with the following lock held:\n");
> 
> The lock does not have to be held at the commit.
> 

Ah.. right.

How about this:

		pr_warn("%s/%d is committing a crossrelease lock acquired at:\n",
			curr->comm, task_pid_nr(curr));
		print_lock(check_tgt);
		pr_warn("\n, after having the following lock held at least once:\n");

Regards,
Boqun

> > +		print_lock(check_src);
> > +		pr_warn("\non which lock the crossrelease lock already depends.\n\n");
> > +	} else {
> > +		pr_warn("%s/%d is trying to acquire lock:\n",
> > +			curr->comm, task_pid_nr(curr));
> > +		print_lock(check_src);
> >  		pr_warn("\nbut task is already holding lock:\n");
> > +		print_lock(check_tgt);
> > +		pr_warn("\nwhich lock already depends on the new lock.\n\n");
> > +	}
> >  
> > -	print_lock(check_tgt);
> > -	pr_warn("\nwhich lock already depends on the new lock.\n\n");
> >  	pr_warn("\nthe existing dependency chain (in reverse order) is:\n");
> >  
> >  	print_circular_bug_entry(entry, depth);
> > -- 
> > 2.14.1

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  4:46             ` Sergey Senozhatsky
  (?)
@ 2017-08-23  5:35             ` Boqun Feng
  2017-08-23  5:44               ` Sergey Senozhatsky
  2017-08-23  5:55               ` Sergey Senozhatsky
  -1 siblings, 2 replies; 49+ messages in thread
From: Boqun Feng @ 2017-08-23  5:35 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Byungchul Park, Bart Van Assche, peterz, linux-kernel,
	linux-block, martin.petersen, axboe, linux-scsi, sfr, linux-next,
	kernel-team

[-- Attachment #1: Type: text/plain, Size: 1855 bytes --]

On Wed, Aug 23, 2017 at 01:46:48PM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 12:38), Boqun Feng wrote:
> [..]
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index 642fb5362507..a3709e15f609 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
> >  		__print_lock_name(target);
> >  		printk(KERN_CONT ");\n");
> 
> KERN_CONT and "\n" should not be together. "\n" flushes the cont
> buffer immediately.
> 

Hmm.. Not quite familiar with printk() stuffs, but I could see several
usages of printk(KERN_CONT "...\n") in kernel.

Did a bit research myself, and I now think the inappropriate use is to
use a KERN_CONT printk *after* another printk ending with a "\n". Am I
missing some recent changes or rules of KERN_CONT?

Regards,
Boqun

> 	-ss
> 
> >  		printk("\n *** DEADLOCK ***\n\n");
> > +	} else if (cross_lock(src->instance)) {
> > +		printk(" Possible unsafe locking scenario by crosslock:\n\n");
> > +		printk("       CPU0                    CPU1\n");
> > +		printk("       ----                    ----\n");
> > +		printk("  lock(");
> > +		__print_lock_name(target);
> > +		printk(KERN_CONT ");\n");
> > +		printk("  lock(");
> > +		__print_lock_name(source);
> > +		printk(KERN_CONT ");\n");
> > +		printk("                               lock(");
> > +		__print_lock_name(parent == source ? target : parent);
> > +		printk(KERN_CONT ");\n");
> > +		printk("                               unlock(");
> > +		__print_lock_name(source);
> > +		printk(KERN_CONT ");\n");
> > +		printk("\n *** DEADLOCK ***\n\n");
> >  	} else {
> >  		printk(" Possible unsafe locking scenario:\n\n");
> >  		printk("       CPU0                    CPU1\n");
> > -- 
> > 2.14.1
> > 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  4:38           ` Boqun Feng
@ 2017-08-23  5:44             ` Byungchul Park
  -1 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  5:44 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 12:38:13PM +0800, Boqun Feng wrote:
> From: Boqun Feng <boqun.feng@gmail.com>
> Date: Wed, 23 Aug 2017 12:12:16 +0800
> Subject: [PATCH] lockdep: Print proper scenario if cross deadlock detected at
>  acquisition time
> 
> For a potential deadlock about CROSSRELEASE as follow:
> 
> 	P1		P2
> 	===========	=============
> 	lock(A)
> 	lock(X)
> 			lock(A)
> 			commit(X)
> 
> 	A: normal lock, X: cross lock
> 
> , we could detect it at two places:
> 
> 1. commit time:
> 
> 	We have run P1 first, and have dependency A --> X in graph, and
> 	then we run P2, and find the deadlock.
> 
> 2. acquisition time:
> 
> 	We have run P2 first, and have dependency A --> X, in

X -> A

> 	graph(because another P3 may run previously and is acquiring for

".. another P3 may have run previously and was holding .."
            ^
    Additionally, not only P3 but also P2 like:

	lock(A)
	lock(X)
			lock(X) // I mean it's at _P2_
			lock(A)
			commit(X)

> 	lock X), and then we run P1 and find the deadlock.
>
> In current print_circular_lock_scenario(), for 1) we could print the
> right scenario and note that's a deadlock related to CROSSRELEASE,
> however for 2) we print the scenario as a normal lockdep deadlock.
> 
> It's better to print a proper scenario related to CROSSRELEASE to help
> users find their bugs more easily, so improve this.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  kernel/locking/lockdep.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 642fb5362507..a3709e15f609 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
>  		__print_lock_name(target);
>  		printk(KERN_CONT ");\n");
>  		printk("\n *** DEADLOCK ***\n\n");
> +	} else if (cross_lock(src->instance)) {
> +		printk(" Possible unsafe locking scenario by crosslock:\n\n");
> +		printk("       CPU0                    CPU1\n");
> +		printk("       ----                    ----\n");
> +		printk("  lock(");
> +		__print_lock_name(target);
> +		printk(KERN_CONT ");\n");
> +		printk("  lock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("                               lock(");
> +		__print_lock_name(parent == source ? target : parent);
> +		printk(KERN_CONT ");\n");
> +		printk("                               unlock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("\n *** DEADLOCK ***\n\n");
>  	} else {
>  		printk(" Possible unsafe locking scenario:\n\n");
>  		printk("       CPU0                    CPU1\n");

I need time to be sure if it's correct.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  5:44             ` Byungchul Park
  0 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-23  5:44 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 12:38:13PM +0800, Boqun Feng wrote:
> From: Boqun Feng <boqun.feng@gmail.com>
> Date: Wed, 23 Aug 2017 12:12:16 +0800
> Subject: [PATCH] lockdep: Print proper scenario if cross deadlock detected at
>  acquisition time
> 
> For a potential deadlock about CROSSRELEASE as follow:
> 
> 	P1		P2
> 	===========	=============
> 	lock(A)
> 	lock(X)
> 			lock(A)
> 			commit(X)
> 
> 	A: normal lock, X: cross lock
> 
> , we could detect it at two places:
> 
> 1. commit time:
> 
> 	We have run P1 first, and have dependency A --> X in graph, and
> 	then we run P2, and find the deadlock.
> 
> 2. acquisition time:
> 
> 	We have run P2 first, and have dependency A --> X, in

X -> A

> 	graph(because another P3 may run previously and is acquiring for

".. another P3 may have run previously and was holding .."
            ^
    Additionally, not only P3 but also P2 like:

	lock(A)
	lock(X)
			lock(X) // I mean it's at _P2_
			lock(A)
			commit(X)

> 	lock X), and then we run P1 and find the deadlock.
>
> In current print_circular_lock_scenario(), for 1) we could print the
> right scenario and note that's a deadlock related to CROSSRELEASE,
> however for 2) we print the scenario as a normal lockdep deadlock.
> 
> It's better to print a proper scenario related to CROSSRELEASE to help
> users find their bugs more easily, so improve this.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  kernel/locking/lockdep.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 642fb5362507..a3709e15f609 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1156,6 +1156,23 @@ print_circular_lock_scenario(struct held_lock *src,
>  		__print_lock_name(target);
>  		printk(KERN_CONT ");\n");
>  		printk("\n *** DEADLOCK ***\n\n");
> +	} else if (cross_lock(src->instance)) {
> +		printk(" Possible unsafe locking scenario by crosslock:\n\n");
> +		printk("       CPU0                    CPU1\n");
> +		printk("       ----                    ----\n");
> +		printk("  lock(");
> +		__print_lock_name(target);
> +		printk(KERN_CONT ");\n");
> +		printk("  lock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("                               lock(");
> +		__print_lock_name(parent == source ? target : parent);
> +		printk(KERN_CONT ");\n");
> +		printk("                               unlock(");
> +		__print_lock_name(source);
> +		printk(KERN_CONT ");\n");
> +		printk("\n *** DEADLOCK ***\n\n");
>  	} else {
>  		printk(" Possible unsafe locking scenario:\n\n");
>  		printk("       CPU0                    CPU1\n");

I need time to be sure if it's correct.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  5:35             ` Boqun Feng
@ 2017-08-23  5:44               ` Sergey Senozhatsky
  2017-08-23  5:55               ` Sergey Senozhatsky
  1 sibling, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-23  5:44 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Sergey Senozhatsky, Byungchul Park, Bart Van Assche, peterz,
	linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	sfr, linux-next, kernel-team

On (08/23/17 13:35), Boqun Feng wrote:
[..]
> > >  		printk(KERN_CONT ");\n");
> > 
> > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > buffer immediately.
> > 
> 
> Hmm.. Not quite familiar with printk() stuffs, but I could see several
> usages of printk(KERN_CONT "...\n") in kernel.
> 
> Did a bit research myself, and I now think the inappropriate use is to
> use a KERN_CONT printk *after* another printk ending with a "\n". Am I
> missing some recent changes or rules of KERN_CONT?

has been this way for quite some time (if not always).

LOG_NEWLINE results in cont_flush(), which log_store() the content
of KERN_CONT buffer.

if we see that supplied message has no \n then we store it in a
dedicated buffer (cont buffer)

	if (!(lflags & LOG_NEWLINE))
		return cont_add();
	
	return log_store();

we flush that buffer (move its content to the kernel log buffer) when
we receive a message with a \n or when printk() from another task/context
interrupts the current cont line and, thus, forces us to flush.

	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  5:35             ` Boqun Feng
  2017-08-23  5:44               ` Sergey Senozhatsky
@ 2017-08-23  5:55               ` Sergey Senozhatsky
  2017-08-24  4:39                 ` Boqun Feng
  1 sibling, 1 reply; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-23  5:55 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Sergey Senozhatsky, Byungchul Park, Bart Van Assche, peterz,
	linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	sfr, linux-next, kernel-team

On (08/23/17 13:35), Boqun Feng wrote:
> > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > buffer immediately.
> > 
> 
> Hmm.. Not quite familiar with printk() stuffs, but I could see several
> usages of printk(KERN_CONT "...\n") in kernel.
> 
> Did a bit research myself, and I now think the inappropriate use is to
> use a KERN_CONT printk *after* another printk ending with a "\n".

ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
upon the return. sorry, your code is correct.

	-ss

> > >  		printk("\n *** DEADLOCK ***\n\n");
> > > +	} else if (cross_lock(src->instance)) {
> > > +		printk(" Possible unsafe locking scenario by crosslock:\n\n");
> > > +		printk("       CPU0                    CPU1\n");
> > > +		printk("       ----                    ----\n");
> > > +		printk("  lock(");
> > > +		__print_lock_name(target);
> > > +		printk(KERN_CONT ");\n");
> > > +		printk("  lock(");
> > > +		__print_lock_name(source);
> > > +		printk(KERN_CONT ");\n");
> > > +		printk("                               lock(");
> > > +		__print_lock_name(parent == source ? target : parent);
> > > +		printk(KERN_CONT ");\n");
> > > +		printk("                               unlock(");
> > > +		__print_lock_name(source);
> > > +		printk(KERN_CONT ");\n");
> > > +		printk("\n *** DEADLOCK ***\n\n");
> > >  	} else {
> > >  		printk(" Possible unsafe locking scenario:\n\n");
> > >  		printk("       CPU0                    CPU1\n");
> > > -- 
> > > 2.14.1
> > > 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  0:03       ` Byungchul Park
@ 2017-08-23  7:53         ` Peter Zijlstra
  -1 siblings, 0 replies; 49+ messages in thread
From: Peter Zijlstra @ 2017-08-23  7:53 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:

> The report is talking about the following lockup:
> 
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutext_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
> 
> Is this impossible?
> 
> To Peterz,
> 
> Anyway I wanted to avoid lockdep reports in the case using a timeout
> interface. Do you think it's still worth reporting the kind of lockup?

Yes, people might not have expected to hit the timeout on this. They
might think timeout means a dead device or something like that.

I'd like to heard from the block folks if this was constructed thus on
purpose though.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-23  7:53         ` Peter Zijlstra
  0 siblings, 0 replies; 49+ messages in thread
From: Peter Zijlstra @ 2017-08-23  7:53 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +0000, Bart Van Assche wrote:

> The report is talking about the following lockup:
> 
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutext_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
> 
> Is this impossible?
> 
> To Peterz,
> 
> Anyway I wanted to avoid lockdep reports in the case using a timeout
> interface. Do you think it's still worth reporting the kind of lockup?

Yes, people might not have expected to hit the timeout on this. They
might think timeout means a dead device or something like that.

I'd like to heard from the block folks if this was constructed thus on
purpose though.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  5:55               ` Sergey Senozhatsky
@ 2017-08-24  4:39                 ` Boqun Feng
  2017-08-24  4:49                   ` Sergey Senozhatsky
  0 siblings, 1 reply; 49+ messages in thread
From: Boqun Feng @ 2017-08-24  4:39 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Byungchul Park, Bart Van Assche, peterz, linux-kernel,
	linux-block, martin.petersen, axboe, linux-scsi, sfr, linux-next,
	kernel-team

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 13:35), Boqun Feng wrote:
> > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > buffer immediately.
> > > 
> > 
> > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > usages of printk(KERN_CONT "...\n") in kernel.
> > 
> > Did a bit research myself, and I now think the inappropriate use is to
> > use a KERN_CONT printk *after* another printk ending with a "\n".
> 
> ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> upon the return. sorry, your code is correct.
> 

So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a
correct usage, right? Thanks. Again, not familiar with printk stuff,
glad you can help me go through this ;-)

Regards,
Boqun

> 	-ss
> 
> > > >  		printk("\n *** DEADLOCK ***\n\n");
> > > > +	} else if (cross_lock(src->instance)) {
> > > > +		printk(" Possible unsafe locking scenario by crosslock:\n\n");
> > > > +		printk("       CPU0                    CPU1\n");
> > > > +		printk("       ----                    ----\n");
> > > > +		printk("  lock(");
> > > > +		__print_lock_name(target);
> > > > +		printk(KERN_CONT ");\n");
> > > > +		printk("  lock(");
> > > > +		__print_lock_name(source);
> > > > +		printk(KERN_CONT ");\n");
> > > > +		printk("                               lock(");
> > > > +		__print_lock_name(parent == source ? target : parent);
> > > > +		printk(KERN_CONT ");\n");
> > > > +		printk("                               unlock(");
> > > > +		__print_lock_name(source);
> > > > +		printk(KERN_CONT ");\n");
> > > > +		printk("\n *** DEADLOCK ***\n\n");
> > > >  	} else {
> > > >  		printk(" Possible unsafe locking scenario:\n\n");
> > > >  		printk("       CPU0                    CPU1\n");
> > > > -- 
> > > > 2.14.1
> > > > 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-24  4:39                 ` Boqun Feng
@ 2017-08-24  4:49                   ` Sergey Senozhatsky
  0 siblings, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-24  4:49 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Sergey Senozhatsky, Byungchul Park, Bart Van Assche, peterz,
	linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	sfr, linux-next, kernel-team

Hi,

On (08/24/17 12:39), Boqun Feng wrote:
> On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> > On (08/23/17 13:35), Boqun Feng wrote:
> > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > > buffer immediately.
> > > > 
> > > 
> > > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > > usages of printk(KERN_CONT "...\n") in kernel.
> > > 
> > > Did a bit research myself, and I now think the inappropriate use is to
> > > use a KERN_CONT printk *after* another printk ending with a "\n".
> > 
> > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> > upon the return. sorry, your code is correct.
> > 
> 
> So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a
> correct usage, right?

well, yes. with one precondition - there should be no printk-s from other
CPUs/tasks in between

	printk(KERN_CON "...");  +  printk(KERN_CONT "...\n")
                               ^^^^^
			here we can have a preliminary flush and broken
			cont line. but it's been this way forever.

	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-23  0:03       ` Byungchul Park
@ 2017-08-30  5:20         ` Sergey Senozhatsky
  -1 siblings, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-30  5:20 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On (08/23/17 09:03), Byungchul Park wrote:
[..]
> > Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
> > the above report? That report namely doesn't make sense to me.
> 
> The report is talking about the following lockup:
> 
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutext_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
> 
[..]
> To Peterz,
> 
> Anyway I wanted to avoid lockdep reports in the case using a timeout
> interface. Do you think it's still worth reporting the kind of lockup?
> I'm ok if you do.

Byungchul, a quick question.
have you measured the performance impact? somehow my linux-next is
notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
is irritatingly slow)


`time dmesg' shows some difference, but probably that's not a good
test.

	!LOCKDEP	LOCKDEP		LOCKDEP -CROSSRELEASE -COMPLETIONS
	real 0m0.661s	0m2.290s	0m1.920s
	user 0m0.010s	0m0.105s	0m0.000s
	sys  0m0.636s	0m2.224s	0m1.888s

anyone else "sees"/"can confirm" the slow down?


it gets back to "usual normal" when I disable CROSSRELEASE and COMPLETIONS.

---

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index b19c491cbc4e..cdc30ef81c5e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1091,8 +1091,6 @@ config PROVE_LOCKING
        select DEBUG_MUTEXES
        select DEBUG_RT_MUTEXES if RT_MUTEXES
        select DEBUG_LOCK_ALLOC
-       select LOCKDEP_CROSSRELEASE
-       select LOCKDEP_COMPLETIONS
        select TRACE_IRQFLAGS
        default n
        help

---

	-ss

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-30  5:20         ` Sergey Senozhatsky
  0 siblings, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-30  5:20 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	sergey.senozhatsky.work, martin.petersen, axboe, linux-scsi, sfr,
	linux-next, kernel-team

On (08/23/17 09:03), Byungchul Park wrote:
[..]
> > Byungchul, did you add the crosslock checks to lockdep? Can you have a look at
> > the above report? That report namely doesn't make sense to me.
> 
> The report is talking about the following lockup:
> 
> A work in a worker                     A task work on exit to user
> ------------------                     ---------------------------
> mutex_lock(&bdev->bd_mutex)
>                                        mutext_lock(&bdev->bd_mutex)
> blk_execute_rq()
>    wait_for_completion_io_timeout(&A)
>                                        complete(&A)
> 
[..]
> To Peterz,
> 
> Anyway I wanted to avoid lockdep reports in the case using a timeout
> interface. Do you think it's still worth reporting the kind of lockup?
> I'm ok if you do.

Byungchul, a quick question.
have you measured the performance impact? somehow my linux-next is
notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
is irritatingly slow)


`time dmesg' shows some difference, but probably that's not a good
test.

	!LOCKDEP	LOCKDEP		LOCKDEP -CROSSRELEASE -COMPLETIONS
	real 0m0.661s	0m2.290s	0m1.920s
	user 0m0.010s	0m0.105s	0m0.000s
	sys  0m0.636s	0m2.224s	0m1.888s

anyone else "sees"/"can confirm" the slow down?


it gets back to "usual normal" when I disable CROSSRELEASE and COMPLETIONS.

---

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index b19c491cbc4e..cdc30ef81c5e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1091,8 +1091,6 @@ config PROVE_LOCKING
        select DEBUG_MUTEXES
        select DEBUG_RT_MUTEXES if RT_MUTEXES
        select DEBUG_LOCK_ALLOC
-       select LOCKDEP_CROSSRELEASE
-       select LOCKDEP_COMPLETIONS
        select TRACE_IRQFLAGS
        default n
        help

---

	-ss

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-30  5:20         ` Sergey Senozhatsky
  (?)
@ 2017-08-30  5:43         ` Byungchul Park
  2017-08-30  6:15           ` Sergey Senozhatsky
  -1 siblings, 1 reply; 49+ messages in thread
From: Byungchul Park @ 2017-08-30  5:43 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Bart Van Assche, peterz, linux-kernel, linux-block,
	martin.petersen, axboe, linux-scsi, sfr, linux-next, kernel-team

On Wed, Aug 30, 2017 at 02:20:37PM +0900, Sergey Senozhatsky wrote:
> Byungchul, a quick question.

Hello Sergey,

> have you measured the performance impact? somehow my linux-next is

Yeah, it might have performance impact inevitably.

> notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
> is irritatingly slow)

To Ingo,

I cannot decide if we have to roll back CONFIG_LOCKDEP_CROSSRELEASE
dependency on CONFIG_PROVE_LOCKING in Kconfig. With them enabled,
lockdep detection becomes strong but has performance impact. But,
it's anyway a debug option so IMHO we don't have to take case of the
performance impact. Please let me know your decision.

> `time dmesg' shows some difference, but probably that's not a good
> test.
> 
> 	!LOCKDEP	LOCKDEP		LOCKDEP -CROSSRELEASE -COMPLETIONS
> 	real 0m0.661s	0m2.290s	0m1.920s
> 	user 0m0.010s	0m0.105s	0m0.000s
> 	sys  0m0.636s	0m2.224s	0m1.888s
> 
> anyone else "sees"/"can confirm" the slow down?
> 
> 
> it gets back to "usual normal" when I disable CROSSRELEASE and COMPLETIONS.
> 
> ---
> 
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index b19c491cbc4e..cdc30ef81c5e 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1091,8 +1091,6 @@ config PROVE_LOCKING
>         select DEBUG_MUTEXES
>         select DEBUG_RT_MUTEXES if RT_MUTEXES
>         select DEBUG_LOCK_ALLOC
> -       select LOCKDEP_CROSSRELEASE
> -       select LOCKDEP_COMPLETIONS
>         select TRACE_IRQFLAGS
>         default n
>         help
> 
> ---
> 
> 	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-30  5:43         ` Byungchul Park
@ 2017-08-30  6:15           ` Sergey Senozhatsky
  2017-08-30  8:42             ` Peter Zijlstra
  0 siblings, 1 reply; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-30  6:15 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Bart Van Assche, peterz, linux-kernel,
	linux-block, martin.petersen, axboe, linux-scsi, sfr, linux-next,
	kernel-team

Hi,

On (08/30/17 14:43), Byungchul Park wrote:
[..]
> > notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
> > is irritatingly slow)
> 
> To Ingo,
> 
> I cannot decide if we have to roll back CONFIG_LOCKDEP_CROSSRELEASE
> dependency on CONFIG_PROVE_LOCKING in Kconfig. With them enabled,
> lockdep detection becomes strong but has performance impact. But,
> it's anyway a debug option so IMHO we don't have to take case of the
> performance impact. Please let me know your decision.

well, I expected it :)

I've been running lockdep enabled kernels for years, and was OK with
the performance. but now it's just too much and I'm looking at disabling
lockdep.

a more relevant test -- compilation of a relatively small project

  LOCKDEP -CROSSRELEASE -COMPLETIONS     LOCKDEP +CROSSRELEASE +COMPLETIONS

   real    1m23.722s                      real    2m9.969s
   user    4m11.300s                      user    4m15.458s
   sys     0m49.386s                      sys     2m3.594s


you don't want to know how much time now it takes to recompile the
kernel ;)

	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-30  6:15           ` Sergey Senozhatsky
@ 2017-08-30  8:42             ` Peter Zijlstra
  2017-08-30  8:47               ` Peter Zijlstra
  0 siblings, 1 reply; 49+ messages in thread
From: Peter Zijlstra @ 2017-08-30  8:42 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Byungchul Park, Bart Van Assche, linux-kernel, linux-block,
	martin.petersen, axboe, linux-scsi, sfr, linux-next, kernel-team

On Wed, Aug 30, 2017 at 03:15:11PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> On (08/30/17 14:43), Byungchul Park wrote:
> [..]
> > > notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
> > > is irritatingly slow)
> > 
> > To Ingo,
> > 
> > I cannot decide if we have to roll back CONFIG_LOCKDEP_CROSSRELEASE
> > dependency on CONFIG_PROVE_LOCKING in Kconfig. With them enabled,
> > lockdep detection becomes strong but has performance impact. But,
> > it's anyway a debug option so IMHO we don't have to take case of the
> > performance impact. Please let me know your decision.
> 
> well, I expected it :)
> 
> I've been running lockdep enabled kernels for years, and was OK with
> the performance. but now it's just too much and I'm looking at disabling
> lockdep.
> 
> a more relevant test -- compilation of a relatively small project
> 
>   LOCKDEP -CROSSRELEASE -COMPLETIONS     LOCKDEP +CROSSRELEASE +COMPLETIONS
> 
>    real    1m23.722s                      real    2m9.969s
>    user    4m11.300s                      user    4m15.458s
>    sys     0m49.386s                      sys     2m3.594s
> 
> 
> you don't want to know how much time now it takes to recompile the
> kernel ;)

Right,.. so when I look at perf annotate for __lock_acquire and
lock_release (the two most expensive lockdep functions in a kernel
profile) I don't actually see much cross-release stuff.

So the overhead looks to be spread out over all sorts, which makes it
harder to find and fix.

stack unwinding is done lots and is fairly expensive, I've not yet
checked if crossrelease does too much of that.

The below saved about 50% of my __lock_acquire() time, not sure it made
a significant difference over all though.

---
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 44c8d0d17170..f8db1ead1c48 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3386,7 +3386,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 		if (!class)
 			return 0;
 	}
-	atomic_inc((atomic_t *)&class->ops);
+	/* atomic_inc((atomic_t *)&class->ops); */
 	if (very_verbose(class)) {
 		printk("\nacquire class [%p] %s", class->key, class->name);
 		if (class->name_version > 1)

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-30  8:42             ` Peter Zijlstra
@ 2017-08-30  8:47               ` Peter Zijlstra
  2017-08-30  8:53                   ` Byungchul Park
  2017-08-30 12:30                 ` Sergey Senozhatsky
  0 siblings, 2 replies; 49+ messages in thread
From: Peter Zijlstra @ 2017-08-30  8:47 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Byungchul Park, Bart Van Assche, linux-kernel, linux-block,
	martin.petersen, axboe, linux-scsi, sfr, linux-next, kernel-team

On Wed, Aug 30, 2017 at 10:42:07AM +0200, Peter Zijlstra wrote:
> 
> So the overhead looks to be spread out over all sorts, which makes it
> harder to find and fix.
> 
> stack unwinding is done lots and is fairly expensive, I've not yet
> checked if crossrelease does too much of that.

Aah, we do an unconditional stack unwind for every __lock_acquire() now.
It keeps a trace in the xhlocks[].

Does the below cure most of that overhead?

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 44c8d0d17170..7b872036b72e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4872,7 +4872,7 @@ static void add_xhlock(struct held_lock *hlock)
 	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
 	xhlock->trace.entries = xhlock->trace_entries;
 	xhlock->trace.skip = 3;
-	save_stack_trace(&xhlock->trace);
+	/* save_stack_trace(&xhlock->trace); */
 }
 
 static inline int same_context_xhlock(struct hist_lock *xhlock)

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* RE: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-30  8:47               ` Peter Zijlstra
@ 2017-08-30  8:53                   ` Byungchul Park
  2017-08-30 12:30                 ` Sergey Senozhatsky
  1 sibling, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-30  8:53 UTC (permalink / raw)
  To: 'Peter Zijlstra', 'Sergey Senozhatsky'
  Cc: 'Bart Van Assche',
	linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	sfr, linux-next, kernel-team

> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@infradead.org]
> Sent: Wednesday, August 30, 2017 5:48 PM
> To: Sergey Senozhatsky
> Cc: Byungchul Park; Bart Van Assche; linux-kernel@vger.kernel.org; linux-
> block@vger.kernel.org; martin.petersen@oracle.com; axboe@kernel.dk; linux-
> scsi@vger.kernel.org; sfr@canb.auug.org.au; linux-next@vger.kernel.org;
> kernel-team@lge.com
> Subject: Re: possible circular locking dependency detected [was: linux-
> next: Tree for Aug 22]
> 
> On Wed, Aug 30, 2017 at 10:42:07AM +0200, Peter Zijlstra wrote:
> >
> > So the overhead looks to be spread out over all sorts, which makes it
> > harder to find and fix.
> >
> > stack unwinding is done lots and is fairly expensive, I've not yet
> > checked if crossrelease does too much of that.
> 
> Aah, we do an unconditional stack unwind for every __lock_acquire() now.
> It keeps a trace in the xhlocks[].

Yeah.. I also think this is most significant..

> 
> Does the below cure most of that overhead?
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 44c8d0d17170..7b872036b72e 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -4872,7 +4872,7 @@ static void add_xhlock(struct held_lock *hlock)
>  	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
>  	xhlock->trace.entries = xhlock->trace_entries;
>  	xhlock->trace.skip = 3;
> -	save_stack_trace(&xhlock->trace);
> +	/* save_stack_trace(&xhlock->trace); */
>  }
> 
>  static inline int same_context_xhlock(struct hist_lock *xhlock)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
@ 2017-08-30  8:53                   ` Byungchul Park
  0 siblings, 0 replies; 49+ messages in thread
From: Byungchul Park @ 2017-08-30  8:53 UTC (permalink / raw)
  To: 'Peter Zijlstra', 'Sergey Senozhatsky'
  Cc: 'Bart Van Assche',
	linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	sfr, linux-next, kernel-team

> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@infradead.org]
> Sent: Wednesday, August 30, 2017 5:48 PM
> To: Sergey Senozhatsky
> Cc: Byungchul Park; Bart Van Assche; linux-kernel@vger.kernel.org; linux-
> block@vger.kernel.org; martin.petersen@oracle.com; axboe@kernel.dk; linux-
> scsi@vger.kernel.org; sfr@canb.auug.org.au; linux-next@vger.kernel.org;
> kernel-team@lge.com
> Subject: Re: possible circular locking dependency detected [was: linux-
> next: Tree for Aug 22]
> 
> On Wed, Aug 30, 2017 at 10:42:07AM +0200, Peter Zijlstra wrote:
> >
> > So the overhead looks to be spread out over all sorts, which makes it
> > harder to find and fix.
> >
> > stack unwinding is done lots and is fairly expensive, I've not yet
> > checked if crossrelease does too much of that.
> 
> Aah, we do an unconditional stack unwind for every __lock_acquire() now.
> It keeps a trace in the xhlocks[].

Yeah.. I also think this is most significant..

> 
> Does the below cure most of that overhead?
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 44c8d0d17170..7b872036b72e 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -4872,7 +4872,7 @@ static void add_xhlock(struct held_lock *hlock)
>  	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
>  	xhlock->trace.entries = xhlock->trace_entries;
>  	xhlock->trace.skip = 3;
> -	save_stack_trace(&xhlock->trace);
> +	/* save_stack_trace(&xhlock->trace); */
>  }
> 
>  static inline int same_context_xhlock(struct hist_lock *xhlock)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
  2017-08-30  8:47               ` Peter Zijlstra
  2017-08-30  8:53                   ` Byungchul Park
@ 2017-08-30 12:30                 ` Sergey Senozhatsky
  1 sibling, 0 replies; 49+ messages in thread
From: Sergey Senozhatsky @ 2017-08-30 12:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Byungchul Park, Bart Van Assche,
	linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	sfr, linux-next, kernel-team

Hello Peter,

On (08/30/17 10:47), Peter Zijlstra wrote:
[..]
> On Wed, Aug 30, 2017 at 10:42:07AM +0200, Peter Zijlstra wrote:
> > 
> > So the overhead looks to be spread out over all sorts, which makes it
> > harder to find and fix.
> > 
> > stack unwinding is done lots and is fairly expensive, I've not yet
> > checked if crossrelease does too much of that.
> 
> Aah, we do an unconditional stack unwind for every __lock_acquire() now.
> It keeps a trace in the xhlocks[].
> 
> Does the below cure most of that overhead?

thanks.

mostly yes. the kernel is not so dramatically slower now. it's still
seems to be a bit slower, which is expected I suppose, but over all
it's much better

	real	1m35.209s
	user	4m12.467s
	sys	0m49.457s

// approx 10 seconds slower.

	-ss

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2017-08-30 12:30 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-22  8:38 linux-next: Tree for Aug 22 Stephen Rothwell
2017-08-22 10:47 ` possible circular locking dependency detected [was: linux-next: Tree for Aug 22] Sergey Senozhatsky
2017-08-22 21:43   ` Bart Van Assche
2017-08-22 21:43     ` Bart Van Assche
2017-08-22 21:43     ` Bart Van Assche
2017-08-23  0:03     ` Byungchul Park
2017-08-23  0:03       ` Byungchul Park
2017-08-23  0:03       ` Byungchul Park
2017-08-23  2:36       ` Sergey Senozhatsky
2017-08-23  2:36         ` Sergey Senozhatsky
2017-08-23  2:59         ` Byungchul Park
2017-08-23  3:49       ` Boqun Feng
2017-08-23  3:49         ` Boqun Feng
2017-08-23  3:49         ` Boqun Feng
2017-08-23  4:38         ` Boqun Feng
2017-08-23  4:38           ` Boqun Feng
2017-08-23  4:46           ` Sergey Senozhatsky
2017-08-23  4:46             ` Sergey Senozhatsky
2017-08-23  5:35             ` Boqun Feng
2017-08-23  5:44               ` Sergey Senozhatsky
2017-08-23  5:55               ` Sergey Senozhatsky
2017-08-24  4:39                 ` Boqun Feng
2017-08-24  4:49                   ` Sergey Senozhatsky
2017-08-23  5:44           ` Byungchul Park
2017-08-23  5:44             ` Byungchul Park
2017-08-23  4:46         ` Byungchul Park
2017-08-23  4:46           ` Byungchul Park
2017-08-23  5:01           ` Boqun Feng
2017-08-23  5:01             ` Boqun Feng
2017-08-23  7:53       ` Peter Zijlstra
2017-08-23  7:53         ` Peter Zijlstra
2017-08-30  5:20       ` Sergey Senozhatsky
2017-08-30  5:20         ` Sergey Senozhatsky
2017-08-30  5:43         ` Byungchul Park
2017-08-30  6:15           ` Sergey Senozhatsky
2017-08-30  8:42             ` Peter Zijlstra
2017-08-30  8:47               ` Peter Zijlstra
2017-08-30  8:53                 ` Byungchul Park
2017-08-30  8:53                   ` Byungchul Park
2017-08-30 12:30                 ` Sergey Senozhatsky
2017-08-22 18:11 ` linux-next: Tree for Aug 22 Stephen Rothwell
2017-08-22 18:14   ` Stephen Rothwell
2017-08-22 18:59     ` Paul E. McKenney
2017-08-22 18:59       ` Paul E. McKenney
2017-08-22 19:12       ` Stephen Rothwell
2017-08-22 19:32         ` Paul E. McKenney
2017-08-22 19:36           ` Paul E. McKenney
2017-08-22 21:57             ` Stephen Rothwell
2017-08-22 22:27               ` Stephen Rothwell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.