* Linux 5.14-rc1
@ 2021-07-11 22:49 Linus Torvalds
  2021-07-12  1:56 ` Guenter Roeck
  2021-07-12  7:08 ` Jon Masters
  0 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2021-07-11 22:49 UTC (permalink / raw)
  To: Linux Kernel Mailing List

You all know the drill by now. It's been the usual two weeks of merge
window, and now it's closed, and 5.14-rc1 is out there.

As usual, it's much too big to post the shortlog, with about 13k
commits (and another ~800 merge commits) by about 1650 developers, and
a diffstat summary of

 11859 files changed, 817707 insertions(+), 285485 deletions(-)

Appended is my mergelog which gives you an overview of what I've
pulled during the merge window, and who I pulled from. And as usual, I
want to stress how this is obviously just a very high-level summary,
and a tiny part of the actual developer community - if you want the full
details of all those changes, you'll have to go to the -git tree.

On the whole, I don't think there are any huge surprises in here, and
size-wise this seems to be a pretty regular release too. Let's hope
that that translates to a nice and calm release cycle, but you never
know. Last release was big, but it was all fairly calm despite that,
so size isn't always the determining factor here..

If somebody wants to look at the actual diff for the release, I'd
encourage you to ignore - once again - another set of big AMD GPU
hardware description header files. We seem to have those fairly
regularly, and they are always these huge generated headers that end
up dwarfing everything else. Almost exactly half of the whole 5.14-rc1
patch is comprised of those GPU headers, and it skews the statistics a
lot.

Now, even if you ignore that AMD header drop, drivers account for over
two thirds of the changes when you look at the diff, and that's
perfectly normal. What's slightly less usual is how there's a lot of
line _removals_ in there, with the old IDE layer finally having met
its long-overdue demise, and all our IDE support is now based on
libata.

Of course, the fact that we removed all that legacy IDE code doesn't
mean that we had a reduction in lines over-all: a few tens of
thousands of lines of legacy code is nowhere near enough to balance
out the usual kernel growth. But it's still a nice thing to see the
cleanup.

So drivers dominate: even when ignoring the AMD header addition
there's a fair amount of gpu updates, but there's networking drivers,
rdma, sound, scsi, staging, media...

Outside of drivers, there's all the usual suspects: architecture
updates (arm, arm64, x86, powerpc, s390, with a smattering of other
architecture updates too) and various core kernel updates: networking,
filesystems, VM, scheduling etc. And the usual documentation and
tooling (perf and self-tests) updates.

Please do test, and we can get the whole calming-down period rolling
and hopefully get a timely final 5.14 release.

                  Linus

---

Al Viro (3):
    vfs d_path() updates
    iov_iter updates
    vfs name lookup updates

Alex Williamson (1):
    VFIO updates

Alexandre Belloni (2):
    i3c updates
    RTC updates

Andreas Gruenbacher (1):
    gfs2 updates

Andrew Morton (3):
    misc updates
    more updates
    yet more updates

Arnaldo Carvalho de Melo (2):
    perf tool updates
    more perf tool updates

Arnd Bergmann (1):
    asm/unaligned.h unification

Bartosz Golaszewski (1):
    gpio updates

Bjorn Andersson (2):
    remoteproc updates
    hwspinlock updates

Bjorn Helgaas (2):
    pci updates
    pci fix

Borislav Petkov (3):
    x86 RAS updates
    x86 cpu updates
    x86 SEV updates

Bruce Fields (1):
    nfsd updates

Casey Schaufler (1):
    smack updates

Christian Brauner (2):
    mount_setattr updates
    openat2 fixes

Christoph Hellwig (2):
    dma-mapping updates
    configfs updates

Corey Minyard (1):
    IPMI driver updates

Dan Williams (1):
    CXL (Compute Express Link) updates

Daniel Lezcano (1):
    thermal updates

Daniel Thompson (1):
    kgdb updates

Darrick Wong (1):
    xfs updates

Dave Airlie (2):
    drm updates
    drm fixes

David Kleikamp (1):
    jfs updates

David Sterba (1):
    btrfs updates

David Teigland (1):
    dlm updates

Dennis Zhou (2):
    percpu updates
    percpu fix

Dmitry Torokhov (1):
    input updates

Eric Biederman (1):
    user namespace rlimit handling update

Eric Biggers (1):
    fscrypt updates

Gao Xiang (1):
    erofs updates

Geert Uytterhoeven (1):
    m68k updates

Greg KH (5):
    char / misc driver updates
    driver core changes
    staging / IIO driver updates
    tty / serial updates
    USB / Thunderbolt updates

Greg Ungerer (1):
    m68knommu update

Guenter Roeck (1):
    hwmon updates

Guo Ren (1):
    arch/csky updates

Gustavo Silva (3):
    fallthrough fixes
    array-bounds fixes
    more fallthrough fixes

Hans de Goede (1):
    x86 platform driver updates

Herbert Xu (2):
    crypto updates
    crypto fixes

Ilya Dryomov (1):
    ceph updates

Ingo Molnar (19):
    EFI updates
    objtool fix and updates
    locking updates
    perf events updates
    scheduler updates
    timers/nohz updates
    x86 exception handling updates
    x86 asm updates
    x86 boot update
    x86 resource control documentation fixes
    x86 cleanups
    x86 uapi fixlet
    x86 mm update
    x86 splitlock updates
    scheduler fixes
    locking fixes
    perf fixes
    scheduler fixes
    irq fixes

Jaegeuk Kim (1):
    f2fs updates

Jakub Kicinski (1):
    networking updates

James Bottomley (2):
    SCSI updates
    more SCSI updates

Jan Kara (1):
    misc fs updates

Jarkko Sakkinen (1):
    tpm driver updates

Jason Gunthorpe (1):
    rdma updates

Jassi Brar (1):
    mailbox updates

Jens Axboe (6):
    libata updates
    core block updates
    block driver updates
    io_uring updates
    more block updates
    io_uring fixes

Jessica Yu (1):
    module updates

Jiri Kosina (1):
    HID updates

Joerg Roedel (1):
    iommu updates

Jonathan Corbet (1):
    documentation updates

Juergen Gross (1):
    xen updates

Julia Lawall (1):
    coccinelle updates

Kees Cook (3):
    seccomp updates
    pstore updates
    clang feature updates

Lee Jones (2):
    mfd updates
    backlight updates

Linus Walleij (1):
    pin control updates

Mark Brown (3):
    regmap updates
    regulator updates
    spi updates

Masahiro Yamada (1):
    Kbuild updates

Mauro Carvalho Chehab (1):
    media updates

Micah Morton (1):
    SafeSetID update

Michael Ellerman (2):
    powerpc updates
    powerpc fixes

Michael Tsirkin (1):
    virtio,vhost,vdpa updates

Michal Simek (1):
    microblaze updates

Mike Marshall (1):
    orangefs updates

Mike Rapoport (2):
    memblock updates
    memblock fix

Mike Snitzer (1):
    device mapper updates

Miklos Szeredi (1):
    fuse updates

Mimi Zohar (1):
    integrity subsystem updates

Namjae Jeon (1):
    exfat updates

Olof Johansson (3):
    ARM SoC updates
    ARM devicetree updates
    ARM driver updates

Palmer Dabbelt (1):
    RISC-V updates

Paolo Bonzini (1):
    kvm updates

Paul E McKenney (1):
    lkmm fixlet

Paul McKenney (2):
    KCSAN updates
    RCU updates

Paul Moore (2):
    SELinux updates
    audit updates

Pavel Machek (1):
    LED updates

Petr Mladek (1):
    printk updates

Rafael Wysocki (6):
    power management updates
    ACPI updates
    PNP updates
    device properties framework updates
    more power management updates
    more ACPI updates

Richard Weinberger (3):
    MTD updates
    UBIFS updates
    UML updates

Rob Herring (1):
    devicetree updates

Russell King (1):
    ARM development updates

Sebastian Reichel (1):
    power supply and reset updates

Shuah Khan (2):
    KUnit update
    Kselftest update

Stafford Horne (1):
    OpenRISC updates

Stephen Boyd (2):
    clk updates
    more clk updates

Steve French (2):
    cifs updates
    cifs fixes

Steven Rostedt (2):
    tracing updates
    tracing fix and cleanup

Takashi Iwai (2):
    sound updates
    sound fixes

Ted Ts'o (2):
    ext4 updates
    ext4 updates

Tejun Heo (1):
    cgroup updates

Tetsuo Handa (1):
    tomoyo fix

Thierry Reding (1):
    pwm updates

Thomas Bogendoerfer (2):
    MIPS updates
    MIPS fixes

Thomas Gleixner (7):
    CPU hotplug cleanup
    CPU hotplug fix
    irq updates
    timer updates
    x86 interrupt related updates
    x86 entry code related updates
    x86 fpu updates

Tony Luck (1):
    EDAC updates

Trond Myklebust (1):
    NFS client updates

Ulf Hansson (2):
    MMC and MEMSTICK updates
    MMC fixes

Vasily Gorbik (2):
    s390 updates
    more s390 updates

Vinod Koul (1):
    dmaengine updates

Wei Liu (1):
    hyperv updates

Will Deacon (1):
    arm64 updates

Wim Van Sebroeck (1):
    watchdog updates

Wolfram Sang (1):
    i2c updates


* Re: Linux 5.14-rc1
  2021-07-11 22:49 Linux 5.14-rc1 Linus Torvalds
@ 2021-07-12  1:56 ` Guenter Roeck
  2021-07-12  4:14   ` Guenter Roeck
  2021-07-12  7:08 ` Jon Masters
  1 sibling, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2021-07-12  1:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

On Sun, Jul 11, 2021 at 03:49:31PM -0700, Linus Torvalds wrote:
> You all know the drill by now. It's been the usual two weeks of merge
> window, and now it's closed, and 5.14-rc1 is out there.
> 
[ ... ]
> Please do test, and we can get the whole calming-down period rolling
> and hopefully get a timely final 5.14 release.
> 

Build results:
	total: 154 pass: 152 fail: 2
Failed builds:
	arcv2:allnoconfig
	riscv:allmodconfig
Qemu test results:
	total: 462 pass: 443 fail: 19
Failed tests:
	arm:z2:pxa_defconfig:nodebug:nocd:nofs:nonvme:noscsi:notests:novirt:nofdt:flash8,384k,2:rootfs
	<all riscv32>

z2:pxa_defconfig fails to boot due to commit 4b361cfa8624 ("mtd: core:
add OTP nvmem provider support"). A patch to fix the problem has been
posted at
https://patchwork.ozlabs.org/project/linux-mtd/patch/20210707135359.32398-1-michael@walle.cc/

The riscv:allmodconfig build failure is not new. It is seen if both
STACKPROTECTOR_PER_TASK and GCC_PLUGIN_RANDSTRUCT are enabled.
See
https://patchwork.kernel.org/project/linux-riscv/patch/20210706162621.940924-1-linux@roeck-us.net/
for details and a proposed fix.

riscv32 images fail to boot due to commit ca6eaaa210de ("riscv:
__asm_copy_to-from_user: Optimize unaligned memory access and pipeline
stall"). I reported this a couple of days ago, but have not seen a reply.

In addition to that, there are some new warning tracebacks.

WARNING: CPU: 0 PID: 55 at crypto/testmgr.c:5652 alg_test.part.0+0x148/0x460
self-tests for drbg_nopr_hmac_sha512 (stdrng) failed (rc=-22)

This is due to commits

9b7b94683a9b crypto: DRBG - switch to HMAC SHA512 DRBG as default DRBG
8833272d876e crypto: drbg - self test for HMAC(SHA-512)

which set the default crypto algorithm to SHA-512 without actually
mandating CONFIG_CRYPTO_SHA512. A patch to fix this has been posted at
https://patchwork.kernel.org/project/linux-crypto/patch/304ee0376383d9ceecddbfd216c035215bbff861.camel@chronox.de/

WARNING: CPU: 0 PID: 24 at block/genhd.c:484 __device_add_disk+0x248/0x286

This is seen with riscv64 images when booting from usb or scsi drives.
I don't recall seeing this warning before, but I may have missed it
in the flurry of other warnings. It may have been introduced with commit
7c3f828b522b0 ("block: refactor device number setup in __device_add_disk")
but I did not try to bisect it yet.

Guenter


* Re: Linux 5.14-rc1
  2021-07-12  1:56 ` Guenter Roeck
@ 2021-07-12  4:14   ` Guenter Roeck
  2021-07-12  5:20     ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2021-07-12  4:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe

On Sun, Jul 11, 2021 at 06:56:21PM -0700, Guenter Roeck wrote:
> On Sun, Jul 11, 2021 at 03:49:31PM -0700, Linus Torvalds wrote:
> > You all know the drill by now. It's been the usual two weeks of merge
> > window, and now it's closed, and 5.14-rc1 is out there.
> > 
> [ ... ] 
> > Please do test, and we can get the whole calming-down period rolling
> > and hopefully get a timely final 5.14 release.
> > 
> 
[ ... ]
> 
> WARNING: CPU: 0 PID: 24 at block/genhd.c:484 __device_add_disk+0x248/0x286
> 
> This is seen with riscv64 images when booting from usb or scsi drives.
> I don't recall seeing this warning before, but I may have missed it
> in the flurry of other warnings. It may have been introduced with commit
> 7c3f828b522b0 ("block: refactor device number setup in __device_add_disk")
> but I did not try to bisect it yet.
> 
My guess was correct. Bisect points to the above commit. Bisect log as well
as complete backtrace and example qemu command attached.

Copying Christoph and Jens.

Guenter

---
# bad: [3dbdb38e286903ec220aaf1fb29a8d94297da246] Merge branch 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
# good: [007b350a58754a93ca9fe50c498cc27780171153] Merge tag 'dlm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
git bisect start '3dbdb38e2869' '007b350a5875'
# good: [b6df00789e2831fff7a2c65aa7164b2a4dcbe599] Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect good b6df00789e2831fff7a2c65aa7164b2a4dcbe599
# good: [990ec3014deedfed49e610cdc31dc6930ca63d8d] drm/amdgpu: add psp runtime db structures
git bisect good 990ec3014deedfed49e610cdc31dc6930ca63d8d
# bad: [c288d9cd710433e5991d58a0764c4d08a933b871] Merge tag 'for-5.14/io_uring-2021-06-30' of git://git.kernel.dk/linux-block
git bisect bad c288d9cd710433e5991d58a0764c4d08a933b871
# bad: [df668a5fe461bb9d7e899c538acc7197746038f4] Merge tag 'for-5.14/block-2021-06-29' of git://git.kernel.dk/linux-block
git bisect bad df668a5fe461bb9d7e899c538acc7197746038f4
# good: [4b5e35ce075817bc36d7c581b22853be984e5b41] Merge tag 'edac_updates_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
git bisect good 4b5e35ce075817bc36d7c581b22853be984e5b41
# bad: [e42cfb1da0bf33c313318da201730324c423351d] block: Remove unnecessary elevator operation checks
git bisect bad e42cfb1da0bf33c313318da201730324c423351d
# bad: [c97d93c31e5734a16bfe663085ec91b8c9fb20f9] block: factor out a part_devt helper
git bisect bad c97d93c31e5734a16bfe663085ec91b8c9fb20f9
# bad: [7681750bd35fe92dd915f4df177d45265e78a933] zram: convert to blk_alloc_disk/blk_cleanup_disk
git bisect bad 7681750bd35fe92dd915f4df177d45265e78a933
# good: [56b68085e536eff2676108f2f8356889a7dbbf55] blk-mq: Some tag allocation code refactoring
git bisect good 56b68085e536eff2676108f2f8356889a7dbbf55
# bad: [958229a7c55f219b1cff99f939dabbc1b6ba7161] block: add a flag to make put_disk on partially initalized disks safer
git bisect bad 958229a7c55f219b1cff99f939dabbc1b6ba7161
# bad: [7c3f828b522b07adb341b08fde1660685c5ba3eb] block: refactor device number setup in __device_add_disk
git bisect bad 7c3f828b522b07adb341b08fde1660685c5ba3eb
# good: [d97e594c51660bea510a387731637b894651e4b5] blk-mq: Use request queue-wide tags for tagset-wide sbitmap
git bisect good d97e594c51660bea510a387731637b894651e4b5
# first bad commit: [7c3f828b522b07adb341b08fde1660685c5ba3eb] block: refactor device number setup in __device_add_disk

---
[   11.940230] Waiting for root device /dev/sda...
[   12.066026] usb 1-1: new full-speed USB device number 2 using ohci-pci
[   12.306673] usb-storage 1-1:1.0: USB Mass Storage device detected
[   12.310957] scsi host0: usb-storage 1-1:1.0
[   13.354722] scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
[   13.370433] sd 0:0:0:0: Power-on or device reset occurred
[   13.390621] sd 0:0:0:0: [sda] 32768 512-byte logical blocks: (16.8 MB/16.0 MiB)
[   13.396348] sd 0:0:0:0: [sda] Write Protect is off
[   13.402622] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   13.403994] ------------[ cut here ]------------
[   13.404165] WARNING: CPU: 0 PID: 7 at block/genhd.c:484 __device_add_disk+0x248/0x286
[   13.404393] Modules linked in:
[   13.404601] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.14.0-rc1 #1
[   13.404830] Hardware name: riscv-virtio,qemu (DT)
[   13.405081] Workqueue: events_unbound async_run_entry_fn
[   13.405309] epc : __device_add_disk+0x248/0x286
[   13.405496]  ra : __device_add_disk+0x1b2/0x286
[   13.405657] epc : ffffffff8042a4cc ra : ffffffff8042a436 sp : ffffffd00024bb80
[   13.405863]  gp : ffffffff819d15a8 tp : ffffffe0027a8040 t0 : ffffffe01f6f48f8
[   13.406087]  t1 : 000000006faf79ac t2 : 00000000000001a5 s0 : ffffffd00024bbc0
[   13.406293]  s1 : ffffffe004450e00 a0 : 0000000000006000 a1 : ffffffe0027a88b0
[   13.406499]  a2 : ffffffff819e2890 a3 : 0000000000000000 a4 : 0000000000000008
[   13.406703]  a5 : 0000000000000000 a6 : 0000000000001fff a7 : 0000000000000000
[   13.406908]  s2 : ffffffe004450e00 s3 : 0000000000000001 s4 : 0000000000000000
[   13.407135]  s5 : ffffffe00438c268 s6 : 0000000000000000 s7 : 0000000000000000
[   13.407344]  s8 : ffffffff819d41b8 s9 : ffffffff819d4298 s10: ffffffe00261a858
[   13.407550]  s11: ffffffe00261a8d0 t3 : 0000000045db8cae t4 : 000000000000000c
[   13.407752]  t5 : fffffffff04a2835 t6 : 0000000000001fff
[   13.407912] status: 0000000000000120 badaddr: 0000000000000000 cause: 0000000000000003
[   13.408179] [<ffffffff8042a4cc>] __device_add_disk+0x248/0x286
[   13.408394] [<ffffffff8042a518>] device_add_disk+0xe/0x16
[   13.408555] [<ffffffff806e3886>] sd_probe+0x2b8/0x366
[   13.408711] [<ffffffff8067bce4>] really_probe.part.0+0x188/0x222
[   13.408886] [<ffffffff8067be16>] __driver_probe_device+0x98/0xbe
[   13.409079] [<ffffffff8067be68>] driver_probe_device+0x2c/0xb0
[   13.409247] [<ffffffff8067c330>] __device_attach_driver+0x62/0x9a
[   13.409419] [<ffffffff80679c7e>] bus_for_each_drv+0x5c/0xa2
[   13.409580] [<ffffffff8067b458>] __device_attach_async_helper+0x88/0x92
[   13.409766] [<ffffffff80032e12>] async_run_entry_fn+0x22/0xc4
[   13.409930] [<ffffffff80027e28>] process_one_work+0x1f4/0x53a
[   13.410114] [<ffffffff800281ec>] worker_thread+0x7e/0x324
[   13.410272] [<ffffffff8002fa1e>] kthread+0x100/0x116
[   13.410419] [<ffffffff80003648>] ret_from_exception+0x0/0x10
[   13.410614] irq event stamp: 59724
[   13.410733] hardirqs last  enabled at (59723): [<ffffffff80a1471c>] _raw_spin_unlock_irqrestore+0x54/0x62
[   13.411019] hardirqs last disabled at (59724): [<ffffffff80003592>] _save_context+0x7c/0xe0
[   13.411249] softirqs last  enabled at (34082): [<ffffffff80a1510a>] __do_softirq+0x39a/0x520
[   13.411496] softirqs last disabled at (34073): [<ffffffff80014354>] irq_exit+0xd2/0xde
[   13.411733] ---[ end trace 644c7abe39308f0f ]---
[   13.480431] sd 0:0:0:0: [sda] Attached SCSI disk
[   13.511335] EXT4-fs (sda): mounting ext2 file system using the ext4 subsystem
[   13.536810] EXT4-fs (sda): mounted filesystem without journal. Opts: (null). Quota mode: disabled.
[   13.537632] VFS: Mounted root (ext2 filesystem) readonly on device 8:0.

---

Sample qemu command:

qemu-system-riscv64 -M virt -m 512M \
     -no-reboot -bios default -kernel arch/riscv/boot/Image \
     -snapshot -device virtio-net-device,netdev=net0 -netdev user,id=net0 \
     -usb -device pci-ohci,id=ohci -device usb-storage,bus=ohci.0,drive=d0 \
     -drive file=/var/cache/buildbot/riscv64/rootfs.ext2,if=none,id=d0,format=raw \
     -append "root=/dev/sda rootwait console=ttyS0,115200 earlycon=uart8250,mmio,0x10000000,115200" \
     -nographic -monitor none

The problem is seen with various USB boot variants (ohci, ehci, xhci, uas-ehci,
uas-xhci) and all SCSI controllers supported by qemu.



* Re: Linux 5.14-rc1
  2021-07-12  4:14   ` Guenter Roeck
@ 2021-07-12  5:20     ` Christoph Hellwig
  2021-07-12 13:53       ` Guenter Roeck
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2021-07-12  5:20 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linus Torvalds, Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe

On Sun, Jul 11, 2021 at 09:14:23PM -0700, Guenter Roeck wrote:
> My guess was correct. Bisect points to the above commit. Bisect log as well
> as complete backtrace and example qemu command attached.
> 
> Copying Christoph and Jens.

This should fix it:

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 6d2d63629a90..b8d55af763f9 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -98,11 +98,7 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
 
-#if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
 #define SD_MINORS	16
-#else
-#define SD_MINORS	0
-#endif
 
 static void sd_config_discard(struct scsi_disk *, unsigned int);
 static void sd_config_write_same(struct scsi_disk *);
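
One plausible reading of why this patch silences the warning, offered as a
sketch under stated assumptions rather than a description of the real block
layer code: the check at block/genhd.c:484 is assumed to be a consistency test
that a disk registered with a fixed major also claims a nonzero number of
minors. With CONFIG_DEBUG_BLOCK_EXT_DEVT set, the removed branch made sd claim
0 minors while still using its fixed major (8, as in "device 8:0" in the boot
log). The names mock_disk, mock_add_disk_warns and mock_sd_disk are
hypothetical userland stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct gendisk. */
struct mock_disk {
	int major;	/* 0 means "no fixed major, use an extended devt" */
	int minors;	/* number of minors the driver claims */
};

/*
 * Returns true when the ASSUMED consistency check would warn:
 * a fixed major must come with minors, and a disk without a
 * fixed major must not claim any.
 */
static bool mock_add_disk_warns(const struct mock_disk *disk)
{
	if (disk->major)
		return disk->minors == 0;
	return disk->minors != 0;
}

/* Mirrors the two SD_MINORS values from the patch above. */
static struct mock_disk mock_sd_disk(bool debug_ext_devt, bool patched)
{
	struct mock_disk disk = { .major = 8, .minors = 16 };

	if (debug_ext_devt && !patched)
		disk.minors = 0;	/* the removed "#define SD_MINORS 0" branch */
	return disk;
}
```

In this model only the unpatched disk built with the debug option trips the
check; once SD_MINORS is unconditionally 16, every configuration passes.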


* Re: Linux 5.14-rc1
  2021-07-11 22:49 Linux 5.14-rc1 Linus Torvalds
  2021-07-12  1:56 ` Guenter Roeck
@ 2021-07-12  7:08 ` Jon Masters
  2021-07-12 19:14   ` Linus Torvalds
  1 sibling, 1 reply; 12+ messages in thread
From: Jon Masters @ 2021-07-12  7:08 UTC (permalink / raw)
  To: Linus Torvalds, Linux Kernel Mailing List

On 7/11/21 6:49 PM, Linus Torvalds wrote:
> You all know the drill by now. It's been the usual two weeks of merge
> window, and now it's closed, and 5.14-rc1 is out there.

I happened to be installing a Fedora 34 (x86) VM for something and did a 
test kernel compile that hung on boot. Setting up a serial console I get 
the below backtrace from ttm but I have not had chance to look at it.

Fedora 34 (Server Edition)
Kernel 5.14.0-rc1 on an x86_64 (ttyS0)

Web console: https://fedora:9090/ or https://192.168.1.91:9090/

fedora login: [   11.263539] BUG: kernel NULL pointer dereference, address: 0000000000000010
[   11.266355] #PF: supervisor read access in kernel mode
[   11.268409] #PF: error_code(0x0000) - not-present page
[   11.270456] PGD 0 P4D 0
[   11.271506] Oops: 0000 [#1] SMP PTI
[   11.272903] CPU: 1 PID: 41 Comm: kworker/1:1 Not tainted 5.14.0-rc1 #1
[   11.275488] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   11.278274] Workqueue: events ttm_device_delayed_workqueue [ttm]
[   11.279865] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[   11.281404] Code: 89 e7 45 31 e4 e8 67 bf f6 dc eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 68 01 00 00 <83> 78 10 03 74 02 5d c3 8b 85 64 02 00 00 85 c0 74 f4 48 8b 7d 08
[   11.286271] RSP: 0018:ffffb7a24017fdd0 EFLAGS: 00010202
[   11.287616] RAX: 0000000000000000 RBX: ffff9da7c08e8670 RCX: ffff9da7c0b30000
[   11.288978] RDX: ffff9da7c27f7990 RSI: ffff9da7c27f7990 RDI: ffff9da7c27f7800
[   11.290332] RBP: ffff9da7c27f7800 R08: ffff9da7c27f7990 R09: 0000000000000000
[   11.291690] R10: ffff9da7c991ec00 R11: 0000000000000000 R12: ffff9da7c27f7990
[   11.293021] R13: ffff9da7c27f7800 R14: ffff9da7c27f7960 R15: ffff9da7c27f7990
[   11.294349] FS:  0000000000000000(0000) GS:ffff9da937c80000(0000) knlGS:0000000000000000
[   11.295853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   11.296935] CR2: 0000000000000010 CR3: 000000010c178004 CR4: 0000000000370ee0
[   11.298111] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   11.299120] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   11.300130] Call Trace:
[   11.300489]  ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
[   11.301256]  ttm_bo_release+0x1a1/0x300 [ttm]
[   11.301879]  ttm_bo_delayed_delete+0x1be/0x220 [ttm]
[   11.302587]  ttm_device_delayed_workqueue+0x18/0x40 [ttm]
[   11.303358]  process_one_work+0x1ec/0x390
[   11.303941]  worker_thread+0x53/0x3e0
[   11.304464]  ? process_one_work+0x390/0x390
[   11.305066]  kthread+0x127/0x150
[   11.305535]  ? set_kthread_struct+0x40/0x40
[   11.306188]  ret_from_fork+0x22/0x30
[   11.306749] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables rfkill nfnetlink ip6table_filter ip6_tables iptable_filter sunrpc vfat fat snd_hda_codec_generic intel_rapl_msr snd_hda_intel intel_rapl_common snd_intel_dspcfg snd_hda_codec isst_if_common snd_hwdep snd_hda_core iTCO_wdt intel_pmc_bxt iTCO_vendor_support kvm_intel snd_seq snd_seq_device snd_pcm kvm joydev irqbypass i2c_i801 rapl i2c_smbus snd_timer snd virtio_balloon lpc_ich soundcore fuse zram ip_tables xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm ghash_clmulni_intel serio_raw virtio_blk qemu_fw_cfg virtio_net virtio_console net_failover failover pkcs8_key_parser
[   11.318215] CR2: 0000000000000010
[   11.318670] ---[ end trace 20fb2a3e9bc19a76 ]---
[   11.319300] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[   11.320090] Code: 89 e7 45 31 e4 e8 67 bf f6 dc eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 68 01 00 00 <83> 78 10 03 74 02 5d c3 8b 85 64 02 00 00 85 c0 74 f4 48 8b 7d 08
[   11.322574] RSP: 0018:ffffb7a24017fdd0 EFLAGS: 00010202
[   11.323271] RAX: 0000000000000000 RBX: ffff9da7c08e8670 RCX: ffff9da7c0b30000
[   11.324226] RDX: ffff9da7c27f7990 RSI: ffff9da7c27f7990 RDI: ffff9da7c27f7800
[   11.325186] RBP: ffff9da7c27f7800 R08: ffff9da7c27f7990 R09: 0000000000000000
[   11.326145] R10: ffff9da7c991ec00 R11: 0000000000000000 R12: ffff9da7c27f7990
[   11.327092] R13: ffff9da7c27f7800 R14: ffff9da7c27f7960 R15: ffff9da7c27f7990
[   11.328032] FS:  0000000000000000(0000) GS:ffff9da937c80000(0000) knlGS:0000000000000000
[   11.329086] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   11.329848] CR2: 0000000000000010 CR3: 000000010c178004 CR4: 0000000000370ee0
[   11.330810] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   11.331746] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


-- 
Computer Architect


* Re: Linux 5.14-rc1
  2021-07-12  5:20     ` Christoph Hellwig
@ 2021-07-12 13:53       ` Guenter Roeck
  2021-07-12 19:03         ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2021-07-12 13:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Linus Torvalds, Linux Kernel Mailing List, Jens Axboe

On 7/11/21 10:20 PM, Christoph Hellwig wrote:
> On Sun, Jul 11, 2021 at 09:14:23PM -0700, Guenter Roeck wrote:
>> My guess was correct. Bisect points to the above commit. Bisect log as well
>> as complete backtrace and example qemu command attached.
>>
>> Copying Christoph and Jens.
> 
> This should fix it:
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 6d2d63629a90..b8d55af763f9 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -98,11 +98,7 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
>   MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
>   MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
>   
> -#if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
>   #define SD_MINORS	16
> -#else
> -#define SD_MINORS	0
> -#endif
>   
>   static void sd_config_discard(struct scsi_disk *, unsigned int);
>   static void sd_config_write_same(struct scsi_disk *);
> 

Yes, that fixes the problem for me.

Tested-by: Guenter Roeck <linux@roeck-us.net>

Thanks,
Guenter


* Re: Linux 5.14-rc1
  2021-07-12 13:53       ` Guenter Roeck
@ 2021-07-12 19:03         ` Linus Torvalds
  2021-07-12 19:24           ` Christoph Hellwig
  2021-07-12 19:28           ` Guenter Roeck
  0 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2021-07-12 19:03 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Christoph Hellwig, Linux Kernel Mailing List, Jens Axboe

On Mon, Jul 12, 2021 at 6:53 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 7/11/21 10:20 PM, Christoph Hellwig wrote:
> >
> > This should fix it:
> >
> > -#if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
> >   #define SD_MINORS   16
> > -#else
> > -#define SD_MINORS    0
> > -#endif
> >
> >   static void sd_config_discard(struct scsi_disk *, unsigned int);
> >   static void sd_config_write_same(struct scsi_disk *);
> >
>
> Yes, that fixes the problem for me.
>
> Tested-by: Guenter Roeck <linux@roeck-us.net>

Thanks for reporting and testing.

Christoph, can I get that as a proper patch with a commit message?

                 Linus


* Re: Linux 5.14-rc1
  2021-07-12  7:08 ` Jon Masters
@ 2021-07-12 19:14   ` Linus Torvalds
  2021-07-12 19:22     ` Christian König
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-07-12 19:14 UTC (permalink / raw)
  To: Jon Masters, Christian König, Matthew Auld
  Cc: Linux Kernel Mailing List, dri-devel

On Mon, Jul 12, 2021 at 12:08 AM Jon Masters <jcm@jonmasters.org> wrote:
>
> I happened to be installing a Fedora 34 (x86) VM for something and did a
> test kernel compile that hung on boot. Setting up a serial console I get
> the below backtrace from ttm but I have not had chance to look at it.

It's a NULL pointer in qxl_bo_delete_mem_notify(), with the code
disassembling to

  16: 55                    push   %rbp
  17: 48 89 fd              mov    %rdi,%rbp
  1a: e8 a2 02 00 00        callq  0x2c1
  1f: 84 c0                test   %al,%al
  21: 74 0d                je     0x30
  23: 48 8b 85 68 01 00 00 mov    0x168(%rbp),%rax
  2a:* 83 78 10 03          cmpl   $0x3,0x10(%rax) <-- trapping instruction
  2e: 74 02                je     0x32
  30: 5d                    pop    %rbp
  31: c3                    retq

and that "cmpl $3" looks exactly like that

        if (bo->resource->mem_type == TTM_PL_PRIV

and the bug is almost certainly from commit d3116756a710 ("drm/ttm:
rename bo->mem and make it a pointer"), which did

-       if (bo->mem.mem_type == TTM_PL_PRIV ...
+       if (bo->resource->mem_type == TTM_PL_PRIV ...

and claimed "No functional change".

But clearly the "bo->resource" pointer is NULL.

Added guilty parties and dri-devel mailing list.

Christian? Full report at

   https://lore.kernel.org/lkml/a9473821-1d53-0037-7590-aeaf8e85e72a@jonmasters.org/

but there's not a whole lot else there that is interesting except for
the call trace:

  ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
  ttm_bo_release+0x1a1/0x300 [ttm]
  ttm_bo_delayed_delete+0x1be/0x220 [ttm]
  ttm_device_delayed_workqueue+0x18/0x40 [ttm]
  process_one_work+0x1ec/0x390
  worker_thread+0x53/0x3e0

so it's presumably the cleanup phase and perhaps "bo->resource" has
been deallocated and cleared?

                  Linus
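
The reasoning above can be sketched in plain userland C. This is only an
illustration: mock_resource, mock_bo and the two helper functions are
hypothetical stand-ins, and the field layout is an assumption chosen so that
mem_type lands at offset 0x10, which is what the "0x10(%rax)" operand and the
faulting address in CR2 (0000000000000010, with RAX == 0) suggest. It is not
the real ttm or qxl code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_TTM_PL_PRIV 3	/* matches the "$0x3" immediate in the cmpl */

/* Assumed layout: two 8-byte fields put mem_type at offset 0x10. */
struct mock_resource {
	uint64_t start;
	uint64_t num_pages;
	uint32_t mem_type;
};

struct mock_bo {
	/* A pointer now; before the rename this was an embedded struct. */
	struct mock_resource *resource;
};

/* Unguarded form: faults at address 0x10 when bo->resource is NULL. */
static int is_priv_unguarded(const struct mock_bo *bo)
{
	return bo->resource->mem_type == MOCK_TTM_PL_PRIV;
}

/* Guarded form: treats a missing resource as "not private". */
static int is_priv_guarded(const struct mock_bo *bo)
{
	return bo->resource && bo->resource->mem_type == MOCK_TTM_PL_PRIV;
}
```

With an embedded struct the member access could never fault on its own, so
turning it into a pointer is a functional change for any path that can run
with the pointer unset. Calling is_priv_unguarded() on a mock_bo whose
resource is NULL reproduces the same kind of fault, while is_priv_guarded()
simply returns 0.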


* Re: Linux 5.14-rc1
  2021-07-12 19:14   ` Linus Torvalds
@ 2021-07-12 19:22     ` Christian König
  0 siblings, 0 replies; 12+ messages in thread
From: Christian König @ 2021-07-12 19:22 UTC (permalink / raw)
  To: Linus Torvalds, Jon Masters, Matthew Auld
  Cc: Linux Kernel Mailing List, dri-devel

Hi guys,

Am 12.07.21 um 21:14 schrieb Linus Torvalds:
> On Mon, Jul 12, 2021 at 12:08 AM Jon Masters <jcm@jonmasters.org> wrote:
>> I happened to be installing a Fedora 34 (x86) VM for something and did a
>> test kernel compile that hung on boot. Setting up a serial console I get
>> the below backtrace from ttm but I have not had chance to look at it.
> It's a NULL pointer in qxl_bo_delete_mem_notify(), with the code
> disassembling to
>
>    16: 55                    push   %rbp
>    17: 48 89 fd              mov    %rdi,%rbp
>    1a: e8 a2 02 00 00        callq  0x2c1
>    1f: 84 c0                test   %al,%al
>    21: 74 0d                je     0x30
>    23: 48 8b 85 68 01 00 00 mov    0x168(%rbp),%rax
>    2a:* 83 78 10 03          cmpl   $0x3,0x10(%rax) <-- trapping instruction
>    2e: 74 02                je     0x32
>    30: 5d                    pop    %rbp
>    31: c3                    retq
>
> and that "cmpl $3" looks exactly like that
>
>          if (bo->resource->mem_type == TTM_PL_PRIV
>
> and the bug is almost certainly from commit d3116756a710 ("drm/ttm:
> rename bo->mem and make it a pointer"), which did
>
> -       if (bo->mem.mem_type == TTM_PL_PRIV ...
> +       if (bo->resource->mem_type == TTM_PL_PRIV ...
>
> and claimed "No functional change".
>
> But clearly the "bo->resource" pointer is NULL.
>
> Added guilty parties and dri-devel mailing list.
>
> Christian? Full report at
>
>     https://lore.kernel.org/lkml/a9473821-1d53-0037-7590-aeaf8e85e72a@jonmasters.org/
>
> but there's not a whole lot else there that is interesting except for
> the call trace:
>
>    ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
>    ttm_bo_release+0x1a1/0x300 [ttm]
>    ttm_bo_delayed_delete+0x1be/0x220 [ttm]
>    ttm_device_delayed_workqueue+0x18/0x40 [ttm]
>    process_one_work+0x1ec/0x390
>    worker_thread+0x53/0x3e0
>
> so it's presumably the cleanup phase and perhaps "bo->resource" has
> been deallocated and cleared?

That's a known issue. Fixed by:

commit 3efe180d5105d367ae1dfadb97892ab93a89a783
Author: Christian König <christian.koenig@amd.com>
Date:   Tue Jul 6 08:51:25 2021 +0200

     drm/qxl: add NULL check for bo->resource

     When allocations fails that can be NULL now.

Previously the structure was embedded into the buffer object and when 
allocation failed (or never happened in a temporary buffer) the 
structure was just zeroed.
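
A minimal sketch of that difference, assuming simplified hypothetical types (the `demo_*` names and `DEMO_PL_PRIV` are made up for illustration; this is neither the TTM code nor the actual fix):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PL_PRIV 3 /* assumed placement value, used only in this sketch */

struct demo_resource {
	uint32_t mem_type;
};

/*
 * Old shape: the resource is embedded in the buffer object, so it always
 * exists. If allocation failed or never happened, it is just all zeroes.
 */
struct demo_bo_embedded {
	struct demo_resource mem;
};

/* New shape: the resource is a pointer, and the pointer may be NULL. */
struct demo_bo_pointer {
	struct demo_resource *resource;
};

/* Safe without any check: the read stays inside the buffer object itself. */
static bool demo_is_priv_embedded(const struct demo_bo_embedded *bo)
{
	return bo->mem.mem_type == DEMO_PL_PRIV;
}

/*
 * Needs a NULL check first: without it, a buffer object that has no
 * resource would be dereferenced through a NULL pointer.
 */
static bool demo_is_priv_pointer(const struct demo_bo_pointer *bo)
{
	return bo->resource && bo->resource->mem_type == DEMO_PL_PRIV;
}
```

With a zeroed embedded struct the old-style comparison simply evaluates to false, while the pointer version only behaves the same way once the NULL check is in place.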

Going to double check tomorrow why that hasn't shown up in your tree yet.

Christian.


>
>                    Linus



* Re: Linux 5.14-rc1
  2021-07-12 19:03         ` Linus Torvalds
@ 2021-07-12 19:24           ` Christoph Hellwig
  2021-07-12 19:27             ` Linus Torvalds
  2021-07-12 19:28           ` Guenter Roeck
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2021-07-12 19:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Christoph Hellwig, Linux Kernel Mailing List, Jens Axboe

On Mon, Jul 12, 2021 at 12:03:36PM -0700, Linus Torvalds wrote:
> Christoph, can I get that as a proper patch with a commit message?

https://lore.kernel.org/linux-scsi/20210712155001.125632-1-hch@lst.de/T/#u


* Re: Linux 5.14-rc1
  2021-07-12 19:24           ` Christoph Hellwig
@ 2021-07-12 19:27             ` Linus Torvalds
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2021-07-12 19:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Guenter Roeck, Linux Kernel Mailing List, Jens Axboe

On Mon, Jul 12, 2021 at 12:24 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Jul 12, 2021 at 12:03:36PM -0700, Linus Torvalds wrote:
> > Christoph, can I get that as a proper patch with a commit message?
>
> https://lore.kernel.org/linux-scsi/20210712155001.125632-1-hch@lst.de/T/#u

Thanks, applied and pushed out (along with two VM issues that also got
reported since rc1..)

                 Linus


* Re: Linux 5.14-rc1
  2021-07-12 19:03         ` Linus Torvalds
  2021-07-12 19:24           ` Christoph Hellwig
@ 2021-07-12 19:28           ` Guenter Roeck
  1 sibling, 0 replies; 12+ messages in thread
From: Guenter Roeck @ 2021-07-12 19:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christoph Hellwig, Linux Kernel Mailing List, Jens Axboe

On 7/12/21 12:03 PM, Linus Torvalds wrote:
> On Mon, Jul 12, 2021 at 6:53 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 7/11/21 10:20 PM, Christoph Hellwig wrote:
>>>
>>> This should fix it:
>>>
>>> -#if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
>>>    #define SD_MINORS   16
>>> -#else
>>> -#define SD_MINORS    0
>>> -#endif
>>>
>>>    static void sd_config_discard(struct scsi_disk *, unsigned int);
>>>    static void sd_config_write_same(struct scsi_disk *);
>>>
>>
>> Yes, that fixes the problem for me.
>>
>> Tested-by: Guenter Roeck <linux@roeck-us.net>
> 
> Thanks for reporting and testing.
> 
> Christoph, can I get that as a proper patch with a commit message?
> 

Christoph already sent it:

https://patchwork.kernel.org/project/linux-block/patch/20210712155001.125632-1-hch@lst.de/

Guenter


end of thread, other threads:[~2021-07-12 19:28 UTC | newest]

Thread overview: 12+ messages
2021-07-11 22:49 Linux 5.14-rc1 Linus Torvalds
2021-07-12  1:56 ` Guenter Roeck
2021-07-12  4:14   ` Guenter Roeck
2021-07-12  5:20     ` Christoph Hellwig
2021-07-12 13:53       ` Guenter Roeck
2021-07-12 19:03         ` Linus Torvalds
2021-07-12 19:24           ` Christoph Hellwig
2021-07-12 19:27             ` Linus Torvalds
2021-07-12 19:28           ` Guenter Roeck
2021-07-12  7:08 ` Jon Masters
2021-07-12 19:14   ` Linus Torvalds
2021-07-12 19:22     ` Christian König
