Hi,
Sorry about the duplicated message; my previous email contained some
HTML that got rejected by the linux-block list.
We've noticed a kernel oops during the stress-ng test on aarch64; more
log details are available at [1]. Christoph, do you think this could be
related to the recent blk_cleanup_disk changes [2]?
[15259.574356] loop32292: detected capacity change from 0 to 4096
[15259.574436] loop6370: detected capacity change from 0 to 4096
[15259.638249] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[15259.647046] Mem abort info:
[15259.649830] ESR = 0x96000006
[15259.652875] EC = 0x25: DABT (current EL), IL = 32 bits
[15259.653800] loop46040: detected capacity change from 4096 to 8192
[15259.658191] SET = 0, FnV = 0
[15259.667311] EA = 0, S1PTW = 0
[15259.670442] Data abort info:
[15259.673311] ISV = 0, ISS = 0x00000006
[15259.677145] CM = 0, WnR = 0
[15259.680102] user pgtable: 4k pages, 48-bit VAs, pgdp=000000093ce30000
[15259.686547] [0000000000000008] pgd=080000092b670003, p4d=080000092b670003, pud=0800000911225003, pmd=0000000000000000
[15259.697181] Internal error: Oops: 96000006 [#1] SMP
[15259.702069] Modules linked in: binfmt_misc fcrypt sm4_generic
crc32_generic md4 michael_mic nhpoly1305_neon nhpoly1305
poly1305_generic libpoly1305 poly1305_neon rmd160 sha3_generic
sm3_generic streebog_generic wp512 blowfish_generic blowfish_common
cast5_generic des_generic libdes chacha_generic chacha_neon libchacha
camellia_generic cast6_generic cast_common serpent_generic
twofish_generic twofish_common dm_thin_pool dm_persistent_data
dm_bio_prison nvme nvme_core loop dm_log_writes dm_flakey rfkill
mlx5_ib ib_uverbs ib_core sunrpc mlx5_core joydev acpi_ipmi psample
ipmi_ssif i2c_smbus mlxfw ipmi_devintf ipmi_msghandler thunderx2_pmu
vfat fat cppc_cpufreq fuse zram ip_tables xfs crct10dif_ce ast
ghash_ce i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm gpio_xlp
i2c_xlp9xx uas usb_storage aes_neon_bs [last unloaded: nvmet]
[15259.781079] CPU: 2 PID: 2800640 Comm: stress-ng Not tainted 5.13.0-rc3 #1
[15259.787865] Hardware name: HPE Apollo 70 /C01_APACHE_MB, BIOS L50_5.13_1.11 06/18/2019
[15259.797601] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[15259.803605] pc : blk_mq_run_hw_queues+0xec/0x10c
[15259.808226] lr : blk_freeze_queue_start+0x80/0x90
[15259.812925] sp : ffff80003b55bd00
[15259.816233] x29: ffff80003b55bd00 x28: ffff000a559320c0 x27: 0000000000000000
[15259.823375] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[15259.830513] x23: 0000000000000007 x22: 0000000000000000 x21: 0000000000000000
[15259.837645] x20: ffff00081aa6d3c0 x19: ffff00081aa6d3c0 x18: 00000000fffffffa
[15259.844776] x17: 0000000000000000 x16: 0000000000000000 x15: 0000040000000000
[15259.851905] x14: ffff000000000000 x13: 0000000000001000 x12: ffff000e7825b0a0
[15259.859034] x11: 0000000000000000 x10: ffff000e7825b098 x9 : ffff8000106d2950
[15259.866164] x8 : ffff000f7cfeab20 x7 : fffffffc00000000 x6 : ffff800011554000
[15259.873292] x5 : 0000000000000000 x4 : ffff000a559320c0 x3 : ffff00081aa6da28
[15259.880421] x2 : 0000000000000002 x1 : 0000000000000000 x0 : ffff0008b69f0a80
[15259.887551] Call trace:
[15259.889987] blk_mq_run_hw_queues+0xec/0x10c
[15259.894253] blk_freeze_queue_start+0x80/0x90
[15259.898603] blk_cleanup_queue+0x40/0x114
[15259.902606] blk_cleanup_disk+0x28/0x50
[15259.906434] loop_control_ioctl+0x17c/0x190 [loop]
[15259.911224] __arm64_sys_ioctl+0xb4/0x100
[15259.915229] invoke_syscall+0x50/0x120
[15259.918972] el0_svc_common.constprop.0+0x4c/0xd4
[15259.923666] do_el0_svc+0x30/0x9c
[15259.926971] el0_svc+0x2c/0x54
[15259.930022] el0_sync_handler+0x1a4/0x1b0
[15259.934023] el0_sync+0x19c/0x1c0
[15259.937335] Code: 91000000 b8626802 f9400021 f9402680 (b8627821)
[15259.943418] ---[ end trace 975879698e5c9146 ]---
[15260.113777] loop62523: detected capacity change from 4096 to 8192
[15260.113783] loop58780: detected capacity change from 4096 to 8192
[15260.113794] loop7620: detected capacity change from 4096 to 8192
[1] https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/datawarehouse-public/2021/06/11/319533768/build_aarch64_redhat%3A1340796730/tests/10127520_aarch64_2_console.log
[2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-next
Thanks,
Bruno
On Mon, Jun 14, 2021 at 2:35 PM CKI Project <cki-project@redhat.com> wrote:
>
>
> Hello,
>
> We ran automated tests on a recent commit from this kernel tree:
>
> Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
> Commit: 30ec225aae2e - Merge branch 'for-5.14/block' into for-next
>
> The results of these automated tests are provided below.
>
> Overall result: FAILED (see details below)
> Merge: OK
> Compile: OK
> Tests: PANICKED
>
> All kernel binaries, config files, and logs are available for download here:
>
> https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/06/11/319533768
>
> One or more kernel tests failed:
>
> ppc64le:
> Boot test
>
> aarch64:
> ❌ storage: software RAID testing
> stress: stress-ng
>
> We hope that these logs can help you find the problem quickly. For the full
> detail on our testing procedures, please scroll to the bottom of this message.
>
> Please reply to this email if you have any questions about the tests that we
> ran or if you have any suggestions on how to make future tests more effective.
>
> ,-. ,-.
> ( C ) ( K ) Continuous
> `-',-.`-' Kernel
> ( I ) Integration
> `-'
> ______________________________________________________________________________
>
> Compile testing
> ---------------
>
> We compiled the kernel for 4 architectures:
>
> aarch64:
> make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
> ppc64le:
> make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
> s390x:
> make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
> x86_64:
> make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>
>
> Hardware testing
> ----------------
> We booted each kernel and ran the following tests:
>
> aarch64:
> Host 1:
> ✅ Boot test
> ✅ ACPI table test
> ✅ LTP
> ✅ CIFS Connectathon
> ✅ POSIX pjd-fstest suites
> ✅ Loopdev Sanity
> ✅ Memory: fork_mem
> ✅ Memory function: memfd_create
> ✅ AMTU (Abstract Machine Test Utility)
> ✅ Ethernet drivers sanity
> ✅ storage: SCSI VPD
> ✅ xarray-idr-radixtree-test
>
> Host 2:
> ✅ Boot test
> ✅ xfstests - ext4
> ✅ xfstests - xfs
> ❌ storage: software RAID testing
> ✅ Storage: swraid mdadm raid_module test
> ✅ xfstests - btrfs
> ✅ Storage blktests
> ✅ Storage block - filesystem fio test
> ✅ Storage block - queue scheduler test
> ✅ Storage nvme - tcp
> ✅ Storage: lvm device-mapper test
> stress: stress-ng
>
> ppc64le:
> Host 1:
> ✅ Boot test
> ✅ LTP
> ✅ CIFS Connectathon
> ✅ POSIX pjd-fstest suites
> ✅ Loopdev Sanity
> ✅ Memory: fork_mem
> ✅ Memory function: memfd_create
> ✅ AMTU (Abstract Machine Test Utility)
> ✅ Ethernet drivers sanity
> ✅ xarray-idr-radixtree-test
>
> Host 2:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ xfstests - ext4
> ⚡⚡⚡ xfstests - xfs
> ⚡⚡⚡ storage: software RAID testing
> ⚡⚡⚡ Storage: swraid mdadm raid_module test
> ⚡⚡⚡ xfstests - btrfs
> ⚡⚡⚡ Storage blktests
> ⚡⚡⚡ Storage block - filesystem fio test
> ⚡⚡⚡ Storage block - queue scheduler test
> ⚡⚡⚡ Storage nvme - tcp
> ⚡⚡⚡ Storage: lvm device-mapper test
>
> Host 3:
> Boot test
> ⚡⚡⚡ xfstests - ext4
> ⚡⚡⚡ xfstests - xfs
> ⚡⚡⚡ storage: software RAID testing
> ⚡⚡⚡ Storage: swraid mdadm raid_module test
> ⚡⚡⚡ xfstests - btrfs
> ⚡⚡⚡ Storage blktests
> ⚡⚡⚡ Storage block - filesystem fio test
> ⚡⚡⚡ Storage block - queue scheduler test
> ⚡⚡⚡ Storage nvme - tcp
> ⚡⚡⚡ Storage: lvm device-mapper test
>
> s390x:
> Host 1:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ✅ Boot test
> ⚡⚡⚡ LTP
> ⚡⚡⚡ CIFS Connectathon
> ⚡⚡⚡ POSIX pjd-fstest suites
> ⚡⚡⚡ Loopdev Sanity
> ⚡⚡⚡ Memory: fork_mem
> ⚡⚡⚡ Memory function: memfd_create
> ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
> ⚡⚡⚡ Ethernet drivers sanity
> ⚡⚡⚡ xarray-idr-radixtree-test
>
> Host 2:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ xfstests - ext4
> ⚡⚡⚡ xfstests - xfs
> ⚡⚡⚡ Storage: swraid mdadm raid_module test
> ⚡⚡⚡ xfstests - btrfs
> ⚡⚡⚡ Storage blktests
> ⚡⚡⚡ Storage nvme - tcp
> ⚡⚡⚡ stress: stress-ng
>
> Host 3:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ xfstests - ext4
> ⚡⚡⚡ xfstests - xfs
> ⚡⚡⚡ Storage: swraid mdadm raid_module test
> ⚡⚡⚡ xfstests - btrfs
> ⚡⚡⚡ Storage blktests
> ⚡⚡⚡ Storage nvme - tcp
> ⚡⚡⚡ stress: stress-ng
>
> Host 4:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ xfstests - ext4
> ⚡⚡⚡ xfstests - xfs
> ⚡⚡⚡ Storage: swraid mdadm raid_module test
> ⚡⚡⚡ xfstests - btrfs
> ⚡⚡⚡ Storage blktests
> ⚡⚡⚡ Storage nvme - tcp
> ⚡⚡⚡ stress: stress-ng
>
> x86_64:
> Host 1:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ Storage SAN device stress - qedf driver
>
> Host 2:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ xfstests - ext4
> ⚡⚡⚡ xfstests - xfs
> ⚡⚡⚡ xfstests - nfsv4.2
> ⚡⚡⚡ storage: software RAID testing
> ⚡⚡⚡ Storage: swraid mdadm raid_module test
> ⚡⚡⚡ xfstests - btrfs
> ⚡⚡⚡ xfstests - cifsv3.11
> ⚡⚡⚡ Storage blktests
> ⚡⚡⚡ Storage block - filesystem fio test
> ⚡⚡⚡ Storage block - queue scheduler test
> ⚡⚡⚡ Storage nvme - tcp
> ⚡⚡⚡ Storage: lvm device-mapper test
> ⚡⚡⚡ stress: stress-ng
>
> Host 3:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ Storage SAN device stress - qla2xxx driver
>
> Host 4:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ✅ Boot test
> ⚡⚡⚡ Storage SAN device stress - mpt3sas_gen1
>
> Host 5:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ✅ Boot test
> ✅ ACPI table test
> ⚡⚡⚡ LTP
> ⚡⚡⚡ CIFS Connectathon
> ⚡⚡⚡ POSIX pjd-fstest suites
> ⚡⚡⚡ Loopdev Sanity
> ⚡⚡⚡ Memory: fork_mem
> ⚡⚡⚡ Memory function: memfd_create
> ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
> ⚡⚡⚡ Ethernet drivers sanity
> ⚡⚡⚡ storage: SCSI VPD
> ⚡⚡⚡ xarray-idr-radixtree-test
>
> Host 6:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ Storage SAN device stress - lpfc driver
>
> Host 7:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ Storage SAN device stress - qedf driver
>
> Host 8:
>
> ⚡ Internal infrastructure issues prevented one or more tests (marked
> with ⚡⚡⚡) from running on this architecture.
> This is not the fault of the kernel that was tested.
>
> ⚡⚡⚡ Boot test
> ⚡⚡⚡ Storage SAN device stress - lpfc driver
>
> Test sources: https://gitlab.com/cki-project/kernel-tests
> Pull requests are welcome for new tests or improvements to existing tests!
>
> Aborted tests
> -------------
> Tests that didn't complete running successfully are marked with ⚡⚡⚡.
> If this was caused by an infrastructure issue, we try to mark that
> explicitly in the report.
>
> Waived tests
> ------------
> If the test run included waived tests, they are marked with . Such tests are
> executed but their results are not taken into account. Tests are waived when
> their results are not reliable enough, e.g. when they're just introduced or are
> being fixed.
>
> Testing timeout
> ---------------
> We aim to provide a report within reasonable timeframe. Tests that haven't
> finished running yet are marked with ⏱.
>
>