linux-block.vger.kernel.org archive mirror
* 💥 PANICKED: Test report for kernel 5.8.0-rc2-c698ae9.cki (block)
@ 2020-07-01 13:06 CKI Project
  2020-07-01 16:37 ` Rachel Sibley
  0 siblings, 1 reply; 5+ messages in thread
From: CKI Project @ 2020-07-01 13:06 UTC
  To: linux-block, axboe; +Cc: Xiong Zhou


Hello,

We ran automated tests on a recent commit from this kernel tree:

       Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
            Commit: c698ae90fb5e - Merge branch 'for-5.9/block' into for-next

The results of these automated tests are provided below.

    Overall result: FAILED (see details below)
             Merge: OK
           Compile: OK
             Tests: PANICKED

All kernel binaries, config files, and logs are available for download here:

  https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/06/30/609250

One or more kernel tests failed:

    s390x:
     ❌ Boot test
     ❌ Boot test
     ❌ Boot test

    ppc64le:
     ❌ Boot test
     ❌ Boot test
     💥 xfstests - ext4

    aarch64:
     💥 Boot test
     💥 xfstests - ext4

    x86_64:
     💥 Boot test
     💥 xfstests - ext4
     💥 Boot test
     💥 Boot test
     💥 Boot test

We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.

Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.

        ,-.   ,-.
       ( C ) ( K )  Continuous
        `-',-.`-'   Kernel
          ( I )     Integration
           `-'
______________________________________________________________________________

Compile testing
---------------

We compiled the kernel for 4 architectures:

    aarch64:
      make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg

    ppc64le:
      make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg

    s390x:
      make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg

    x86_64:
      make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg



Hardware testing
----------------
We booted each kernel and ran the following tests:

  aarch64:
    Host 1:
       ❌ Boot test
       ⚡⚡⚡ LTP
       ⚡⚡⚡ Loopdev Sanity
       ⚡⚡⚡ Memory function: memfd_create
       ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
       ⚡⚡⚡ Ethernet drivers sanity
       ⚡⚡⚡ storage: SCSI VPD
       🚧 ⚡⚡⚡ CIFS Connectathon
       🚧 ⚡⚡⚡ POSIX pjd-fstest suites

    Host 2:
       ✅ Boot test
       💥 xfstests - ext4
       ⚡⚡⚡ xfstests - xfs
       ⚡⚡⚡ storage: software RAID testing
       ⚡⚡⚡ stress: stress-ng
       🚧 ⚡⚡⚡ Storage blktests

  ppc64le:
    Host 1:
       ❌ Boot test
       🚧 ⚡⚡⚡ kdump - sysrq-c

    Host 2:
       ❌ Boot test
       ⚡⚡⚡ LTP
       ⚡⚡⚡ Loopdev Sanity
       ⚡⚡⚡ Memory function: memfd_create
       ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
       ⚡⚡⚡ Ethernet drivers sanity
       🚧 ⚡⚡⚡ CIFS Connectathon
       🚧 ⚡⚡⚡ POSIX pjd-fstest suites

    Host 3:
       ✅ Boot test
       💥 xfstests - ext4
       ⚡⚡⚡ xfstests - xfs
       ⚡⚡⚡ storage: software RAID testing
       🚧 ⚡⚡⚡ Storage blktests

  s390x:
    Host 1:
       ❌ Boot test
       ⚡⚡⚡ LTP
       ⚡⚡⚡ Loopdev Sanity
       ⚡⚡⚡ Memory function: memfd_create
       ⚡⚡⚡ Ethernet drivers sanity
       🚧 ⚡⚡⚡ CIFS Connectathon
       🚧 ⚡⚡⚡ POSIX pjd-fstest suites

    Host 2:
       ❌ Boot test
       ⚡⚡⚡ stress: stress-ng
       🚧 ⚡⚡⚡ Storage blktests

    Host 3:
       ❌ Boot test
       🚧 ⚡⚡⚡ kdump - sysrq-c

  x86_64:
    Host 1:

       ⚡ Internal infrastructure issues prevented one or more tests (marked
       with ⚡⚡⚡) from running on this architecture.
       This is not the fault of the kernel that was tested.

       ⚡⚡⚡ Boot test
       🚧 ⚡⚡⚡ kdump - sysrq-c - mpt3sas_gen1

    Host 2:
       💥 Boot test
       ⚡⚡⚡ Storage SAN device stress - lpfc driver

    Host 3:

       ⚡ Internal infrastructure issues prevented one or more tests (marked
       with ⚡⚡⚡) from running on this architecture.
       This is not the fault of the kernel that was tested.

       ⚡⚡⚡ Boot test
       🚧 ⚡⚡⚡ kdump - sysrq-c

    Host 4:
       ✅ Boot test
       💥 xfstests - ext4
       ⚡⚡⚡ xfstests - xfs
       ⚡⚡⚡ storage: software RAID testing
       ⚡⚡⚡ stress: stress-ng
       🚧 ⚡⚡⚡ Storage blktests

    Host 5:

       ⚡ Internal infrastructure issues prevented one or more tests (marked
       with ⚡⚡⚡) from running on this architecture.
       This is not the fault of the kernel that was tested.

       ⚡⚡⚡ Boot test
       ⚡⚡⚡ kdump - sysrq-c - megaraid_sas

    Host 6:
       💥 Boot test
       ⚡⚡⚡ LTP
       ⚡⚡⚡ Loopdev Sanity
       ⚡⚡⚡ Memory function: memfd_create
       ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
       ⚡⚡⚡ Ethernet drivers sanity
       ⚡⚡⚡ storage: SCSI VPD
       🚧 ⚡⚡⚡ CIFS Connectathon
       🚧 ⚡⚡⚡ POSIX pjd-fstest suites

    Host 7:
       💥 Boot test
       ⚡⚡⚡ Storage SAN device stress - qla2xxx driver

    Host 8:
       💥 Boot test
       ⚡⚡⚡ Storage SAN device stress - qedf driver

    Host 9:
       ⏱  Boot test
       ⏱  Storage SAN device stress - mpt3sas_gen1

  Test sources: https://github.com/CKI-project/tests-beaker
    💚 Pull requests are welcome for new tests or improvements to existing tests!

Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.

Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.

Testing timeout
---------------
We aim to provide a report within a reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
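
For readers who want to process a report like this one mechanically, the marker legend can be turned into a small parser. This is only a sketch, not part of the CKI tooling: the ⚡⚡⚡, 🚧 and ⏱ meanings come from the sections above, while the meanings given to ✅, ❌ and 💥, and the names `STATUS`, `WAIVED` and `parse_result`, are assumptions made for illustration.

```python
# Marker meanings. "aborted" and "unfinished" follow the legend in the report;
# "passed", "failed" and "panicked" are inferred from context (assumption).
STATUS = {
    "✅": "passed",
    "❌": "failed",
    "💥": "panicked",
    "⚡⚡⚡": "aborted",
    "⏱": "unfinished",
}
WAIVED = "🚧"  # waived tests carry this prefix before their status marker


def parse_result(line):
    """Split one result line into (status, waived, test name), or None."""
    text = line.strip()
    waived = text.startswith(WAIVED)
    if waived:
        text = text[len(WAIVED):].strip()
    for marker, status in STATUS.items():
        if text.startswith(marker):
            # Drop an optional emoji variation selector and padding.
            name = text[len(marker):].lstrip("\ufe0f \t")
            return status, waived, name
    return None  # heading, blank line, or explanatory note


print(parse_result("       🚧 ⚡⚡⚡ CIFS Connectathon"))
```

A line that starts with a single ⚡ (the infrastructure note) is deliberately not treated as a result, since only the triple marker denotes an aborted test.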



* Re: 💥 PANICKED: Test report for kernel 5.8.0-rc2-c698ae9.cki (block)
  2020-07-01 13:06 💥 PANICKED: Test report for kernel 5.8.0-rc2-c698ae9.cki (block) CKI Project
@ 2020-07-01 16:37 ` Rachel Sibley
  2020-07-01 16:42   ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Rachel Sibley @ 2020-07-01 16:37 UTC
  To: CKI Project, linux-block, axboe; +Cc: Xiong Zhou

Hi, we're seeing multiple panics across all arches. I've included a snippet of the call trace for both the
xfstests and boot test failures.

You should be able to inspect in more detail by viewing the console.log under each build/tests directory:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/06/30/609250

Thanks,
Rachel

On 7/1/20 9:06 AM, CKI Project wrote:
> 
> Hello,
> 
> We ran automated tests on a recent commit from this kernel tree:
> 
>         Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>              Commit: c698ae90fb5e - Merge branch 'for-5.9/block' into for-next
> 
> The results of these automated tests are provided below.
> 
>      Overall result: FAILED (see details below)
>               Merge: OK
>             Compile: OK
>               Tests: PANICKED
> 
> All kernel binaries, config files, and logs are available for download here:
> 
>    https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/06/30/609250
> 
> One or more kernel tests failed:
> 
>      s390x:
>       ❌ Boot test
>       ❌ Boot test
>       ❌ Boot test
> 
>      ppc64le:
>       ❌ Boot test
>       ❌ Boot test
>       💥 xfstests - ext4

https://cki-artifacts.s3.us-east-2.amazonaws.com/datawarehouse/2020/06/30/609250/build_ppc64le_redhat%3A926155/tests/8501352/ppc64le_3_console.log

[  890.198174] run fstests generic/040 at 2020-06-30 12:03:02
[  891.055910] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[  891.055942] Faulting instruction address: 0x00000000
[  891.055956] Oops: Kernel access of bad area, sig: 11 [#1]
[  891.055969] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[  891.055982] Modules linked in: dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rfkill joydev i40e at24 sunrpc ses enclosure scsi_transport_sas regmap_i2c ofpart powernv_flash mtd crct10dif_vpmsum ipmi_powernv ipmi_devintf opal_prd ipmi_msghandler rtc_opal i2c_opal ip_tables xfs libcrc32c ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea vmx_crypto sysfillrect sysimgblt crc32c_vpmsum fb_sys_fops drm_ttm_helper ttm drm i2c_core aacraid drm_panel_orientation_quirks
[  891.056077] CPU: 25 PID: 84211 Comm: systemd-udevd Kdump: loaded Not tainted 5.8.0-rc2-c698ae9.cki #1
[  891.056095] NIP:  0000000000000000 LR: c00000000070eef0 CTR: 0000000000000000
[  891.056110] REGS: c0000007c25474e0 TRAP: 0400   Not tainted  (5.8.0-rc2-c698ae9.cki)
[  891.056125] MSR:  9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24488248  XER: 20040000
[  891.056145] CFAR: c00000000070eeec IRQMASK: 0
[  891.056145] GPR00: c00000000070f050 c0000007c2547770 c000000001cb7f00 c0000002b6059af8
[  891.056145] GPR04: 0000000000000000 c0000007dcf6f000 c0000002b6059af8 0000000000000000
[  891.056145] GPR08: 00000007fc940000 c000000001c97d78 0000000000000000 0000000000000000
[  891.056145] GPR12: 0000000000000000 c0000007fffe2e00 c0000007c2344400 0000000000000000
[  891.056145] GPR16: 0000000000000000 00007fffc9b7cb50 c0000007c2344400 0000000000000000
[  891.056145] GPR20: c0000002b307bdd8 0000000000000000 c0000007c2547ca8 c0000007dcf6f000
[  891.056145] GPR24: 000000000000000c 000000000000000a c0000007c2547790 0000000000000001
[  891.056145] GPR28: 0000000000000000 0000000000000000 00000000ffffffff c0000002b6059af8
[  891.056260] NIP [0000000000000000] 0x0
[  891.056272] LR [c00000000070eef0] submit_bio_noacct+0x2f0/0x5c0
[  891.056285] Call Trace:
[  891.056294] [c0000007c2547770] [c00000000070f050] submit_bio_noacct+0x450/0x5c0 (unreliable)
[  891.056312] [c0000007c2547800] [c00000000070f228] submit_bio+0x68/0x2d0
[  891.056328] [c0000007c25478c0] [c000000000505fe8] mpage_readahead+0x1c8/0x290
[  891.056345] [c0000007c25479a0] [c0000000004fd6f8] blkdev_readahead+0x28/0x40
[  891.056362] [c0000007c25479c0] [c000000000383980] read_pages+0xb0/0x4a0
[  891.056376] [c0000007c2547a40] [c000000000384474] page_cache_readahead_unbounded+0x244/0x300
[  891.056395] [c0000007c2547b00] [c00000000037445c] generic_file_buffered_read+0x9bc/0x1120
[  891.056411] [c0000007c2547c50] [c0000000004fddc0] blkdev_read_iter+0x50/0x80
[  891.056428] [c0000007c2547c70] [c000000000493c64] new_sync_read+0x124/0x1a0
[  891.056443] [c0000007c2547d10] [c000000000496e30] vfs_read+0x100/0x200
[  891.056471] [c0000007c2547d70] [c000000000497368] ksys_read+0x78/0x130
[  891.056487] [c0000007c2547dc0] [c000000000030564] system_call_exception+0xe4/0x170
[  891.056504] [c0000007c2547e20] [c00000000000ca70] system_call_common+0xf0/0x278
[  891.056518] Instruction dump:
[  891.056529] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[  891.056545] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[  891.056564] ---[ end trace 14197a45ec121b51 ]---
> 
>      aarch64:
>       💥 Boot test
>       💥 xfstests - ext4
> 
>      x86_64:
>       💥 Boot test
>       💥 xfstests - ext4
>       💥 Boot test
>       💥 Boot test
>       💥 Boot test

https://cki-artifacts.s3.us-east-2.amazonaws.com/datawarehouse/2020/06/30/609250/build_x86_64_redhat%3A926153/tests/8501344/x86_64_6_console.log

[   12.160102] Call Trace:
[   12.162838]  submit_bio_noacct+0x1f4/0x3d0
[   12.167414]  mpage_readahead+0x159/0x1b0
[   12.171795]  ? __blkdev_direct_IO_simple+0x2b0/0x2b0
[   12.177337]  read_pages+0x5d/0x300
[   12.181132]  page_cache_readahead_unbounded+0x1aa/0x230
[   12.186965]  force_page_cache_readahead+0xda/0x140
[   12.192313]  generic_file_buffered_read+0x647/0xc00
[   12.197761]  new_sync_read+0x102/0x180
[   12.201946]  vfs_read+0x9d/0x150
[   12.205549]  ksys_read+0x4f/0xc0
[   12.209153]  do_syscall_64+0x4d/0x90
[   12.213145]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   12.218781] RIP: 0033:0x7f263ae70542
[   12.222767] Code: Bad RIP value.
[   12.226367] RSP: 002b:00007fff55c02f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   12.234817] RAX: ffffffffffffffda RBX: 0000557dff1219e8 RCX: 00007f263ae70542
[   12.242780] RDX: 0000000000000040 RSI: 0000557dff1219f8 RDI: 0000000000000006
[   12.250742] RBP: 0000557dff1170c0 R08: 0000557dff1219d0 R09: 00007f263af41a40
[   12.258705] R10: 0000000000000010 R11: 0000000000000246 R12: 00000003bfff0000
[   12.266668] R13: 0000000000000040 R14: 0000557dff1219d0 R15: 0000557dff117110
[   12.274631] Modules linked in: mgag200 drm_vram_helper drm_kms_helper lpfc drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel nvmet_fc drm nvmet ghash_clmulni_intel nvme_fc nvme_fabrics i2c_algo_bit nvme_core scsi_transport_fc wmi scsi_dh_alua scsi_dh_rdac scsi_dh_emc
[   12.302196] CR2: 0000000000000000
[   12.305931] ---[ end trace 4b2a7525c30bbc3f ]---
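
Both snippets above pass through the same block read path. A rough way to confirm that from the raw text is to pull the function names out of each trace and intersect them. This is a minimal sketch; `frames` and `common_frames` are hypothetical helpers, and it assumes frames are printed as `symbol+0xoffset/0xsize`, as they are in the two traces quoted here.

```python
import re

# Matches "symbol+0xoff/0xsize" as printed in kernel call traces.
FRAME = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
# x86 traces prefix frames the unwinder is unsure about with "? ".
UNSURE = re.compile(r"\s\?\s")


def frames(trace_text):
    """Return function names in order of appearance, skipping '?' frames."""
    names = []
    for line in trace_text.splitlines():
        if UNSURE.search(line):
            continue
        match = FRAME.search(line)
        if match:
            names.append(match.group(1))
    return names


def common_frames(first, second):
    """Functions present in both traces, de-duplicated, in first-trace order."""
    seen = set(frames(second))
    return list(dict.fromkeys(f for f in frames(first) if f in seen))
```

Feeding it the ppc64le and x86_64 snippets should list `submit_bio_noacct` first, followed by the shared readahead and read syscall frames.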

> 
> We hope that these logs can help you find the problem quickly. For the full
> detail on our testing procedures, please scroll to the bottom of this message.
> 
> Please reply to this email if you have any questions about the tests that we
> ran or if you have any suggestions on how to make future tests more effective.
> 
>          ,-.   ,-.
>         ( C ) ( K )  Continuous
>          `-',-.`-'   Kernel
>            ( I )     Integration
>             `-'
> ______________________________________________________________________________
> 
> Compile testing
> ---------------
> 
> We compiled the kernel for 4 architectures:
> 
>      aarch64:
>        make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
> 
>      ppc64le:
>        make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
> 
>      s390x:
>        make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
> 
>      x86_64:
>        make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
> 
> 
> 
> Hardware testing
> ----------------
> We booted each kernel and ran the following tests:
> 
>    aarch64:
>      Host 1:
>         ❌ Boot test
>         ⚡⚡⚡ LTP
>         ⚡⚡⚡ Loopdev Sanity
>         ⚡⚡⚡ Memory function: memfd_create
>         ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
>         ⚡⚡⚡ Ethernet drivers sanity
>         ⚡⚡⚡ storage: SCSI VPD
>         🚧 ⚡⚡⚡ CIFS Connectathon
>         🚧 ⚡⚡⚡ POSIX pjd-fstest suites
> 
>      Host 2:
>         ✅ Boot test
>         💥 xfstests - ext4
>         ⚡⚡⚡ xfstests - xfs
>         ⚡⚡⚡ storage: software RAID testing
>         ⚡⚡⚡ stress: stress-ng
>         🚧 ⚡⚡⚡ Storage blktests
> 
>    ppc64le:
>      Host 1:
>         ❌ Boot test
>         🚧 ⚡⚡⚡ kdump - sysrq-c
> 
>      Host 2:
>         ❌ Boot test
>         ⚡⚡⚡ LTP
>         ⚡⚡⚡ Loopdev Sanity
>         ⚡⚡⚡ Memory function: memfd_create
>         ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
>         ⚡⚡⚡ Ethernet drivers sanity
>         🚧 ⚡⚡⚡ CIFS Connectathon
>         🚧 ⚡⚡⚡ POSIX pjd-fstest suites
> 
>      Host 3:
>         ✅ Boot test
>         💥 xfstests - ext4
>         ⚡⚡⚡ xfstests - xfs
>         ⚡⚡⚡ storage: software RAID testing
>         🚧 ⚡⚡⚡ Storage blktests
> 
>    s390x:
>      Host 1:
>         ❌ Boot test
>         ⚡⚡⚡ LTP
>         ⚡⚡⚡ Loopdev Sanity
>         ⚡⚡⚡ Memory function: memfd_create
>         ⚡⚡⚡ Ethernet drivers sanity
>         🚧 ⚡⚡⚡ CIFS Connectathon
>         🚧 ⚡⚡⚡ POSIX pjd-fstest suites
> 
>      Host 2:
>         ❌ Boot test
>         ⚡⚡⚡ stress: stress-ng
>         🚧 ⚡⚡⚡ Storage blktests
> 
>      Host 3:
>         ❌ Boot test
>         🚧 ⚡⚡⚡ kdump - sysrq-c
> 
>    x86_64:
>      Host 1:
> 
>         ⚡ Internal infrastructure issues prevented one or more tests (marked
>         with ⚡⚡⚡) from running on this architecture.
>         This is not the fault of the kernel that was tested.
> 
>         ⚡⚡⚡ Boot test
>         🚧 ⚡⚡⚡ kdump - sysrq-c - mpt3sas_gen1
> 
>      Host 2:
>         💥 Boot test
>         ⚡⚡⚡ Storage SAN device stress - lpfc driver
> 
>      Host 3:
> 
>         ⚡ Internal infrastructure issues prevented one or more tests (marked
>         with ⚡⚡⚡) from running on this architecture.
>         This is not the fault of the kernel that was tested.
> 
>         ⚡⚡⚡ Boot test
>         🚧 ⚡⚡⚡ kdump - sysrq-c
> 
>      Host 4:
>         ✅ Boot test
>         💥 xfstests - ext4
>         ⚡⚡⚡ xfstests - xfs
>         ⚡⚡⚡ storage: software RAID testing
>         ⚡⚡⚡ stress: stress-ng
>         🚧 ⚡⚡⚡ Storage blktests
> 
>      Host 5:
> 
>         ⚡ Internal infrastructure issues prevented one or more tests (marked
>         with ⚡⚡⚡) from running on this architecture.
>         This is not the fault of the kernel that was tested.
> 
>         ⚡⚡⚡ Boot test
>         ⚡⚡⚡ kdump - sysrq-c - megaraid_sas
> 
>      Host 6:
>         💥 Boot test
>         ⚡⚡⚡ LTP
>         ⚡⚡⚡ Loopdev Sanity
>         ⚡⚡⚡ Memory function: memfd_create
>         ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
>         ⚡⚡⚡ Ethernet drivers sanity
>         ⚡⚡⚡ storage: SCSI VPD
>         🚧 ⚡⚡⚡ CIFS Connectathon
>         🚧 ⚡⚡⚡ POSIX pjd-fstest suites
> 
>      Host 7:
>         💥 Boot test
>         ⚡⚡⚡ Storage SAN device stress - qla2xxx driver
> 
>      Host 8:
>         💥 Boot test
>         ⚡⚡⚡ Storage SAN device stress - qedf driver
> 
>      Host 9:
>         ⏱  Boot test
>         ⏱  Storage SAN device stress - mpt3sas_gen1
> 
>    Test sources: https://github.com/CKI-project/tests-beaker
>      💚 Pull requests are welcome for new tests or improvements to existing tests!
> 
> Aborted tests
> -------------
> Tests that didn't complete running successfully are marked with ⚡⚡⚡.
> If this was caused by an infrastructure issue, we try to mark that
> explicitly in the report.
> 
> Waived tests
> ------------
> If the test run included waived tests, they are marked with 🚧. Such tests are
> executed but their results are not taken into account. Tests are waived when
> their results are not reliable enough, e.g. when they're just introduced or are
> being fixed.
> 
> Testing timeout
> ---------------
> We aim to provide a report within reasonable timeframe. Tests that haven't
> finished running yet are marked with ⏱.
> 
> 



* Re: 💥 PANICKED: Test report for kernel 5.8.0-rc2-c698ae9.cki (block)
  2020-07-01 16:37 ` Rachel Sibley
@ 2020-07-01 16:42   ` Jens Axboe
  2020-07-01 17:16     ` Rachel Sibley
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-07-01 16:42 UTC
  To: Rachel Sibley, CKI Project, linux-block; +Cc: Xiong Zhou

On 7/1/20 10:37 AM, Rachel Sibley wrote:
> Hi, we're seeing multiple panics across all arches, I included a snippet of the call trace for both
> xfstests and boot test.
> 
> You should be able to inspect in more detail by viewing the console.log under each build/tests directory:
> https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/06/30/609250

This was due to a bad patch series, which has since been reverted and redone. The current
tree should be fine.

Now it doesn't matter for this one since I guessed what this was and found it
before the bot did, but I do wish the reports were easier to look at. I should
not have to dig through directories (which were empty when the report went out,
btw) to find logs, then download logs and leaf through hundreds of KB of text
to find out why the bot thought the tree was broken. It should be readily
apparent and in the email. If there's an OOPS, include the oops.

I'd much rather get a separate report for each arch, each having the oops
that got triggered, than get one massive email where it's really not obvious
where to look.

This:

> https://cki-artifacts.s3.us-east-2.amazonaws.com/datawarehouse/2020/06/30/609250/build_ppc64le_redhat%3A926155/tests/8501352/ppc64le_3_console.log
> 
> [  890.198174] run fstests generic/040 at 2020-06-30 12:03:02
> [  891.055910] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [  891.055942] Faulting instruction address: 0x00000000
> [  891.055956] Oops: Kernel access of bad area, sig: 11 [#1]
> [  891.055969] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
> [  891.055982] Modules linked in: dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rfkill joydev i40e at24 sunrpc ses 
> enclosure scsi_transport_sas regmap_i2c ofpart powernv_flash mtd crct10dif_vpmsum ipmi_powernv ipmi_devintf opal_prd ipmi_msghandler rtc_opal i2c_opal 
> ip_tables xfs libcrc32c ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea vmx_crypto sysfillrect sysimgblt crc32c_vpmsum fb_sys_fops 
> drm_ttm_helper ttm drm i2c_core aacraid drm_panel_orientation_quirks
> [  891.056077] CPU: 25 PID: 84211 Comm: systemd-udevd Kdump: loaded Not tainted 5.8.0-rc2-c698ae9.cki #1
> [  891.056095] NIP:  0000000000000000 LR: c00000000070eef0 CTR: 0000000000000000
> [  891.056110] REGS: c0000007c25474e0 TRAP: 0400   Not tainted  (5.8.0-rc2-c698ae9.cki)
> [  891.056125] MSR:  9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24488248  XER: 20040000
> [  891.056145] CFAR: c00000000070eeec IRQMASK: 0
> [  891.056145] GPR00: c00000000070f050 c0000007c2547770 c000000001cb7f00 c0000002b6059af8
> [  891.056145] GPR04: 0000000000000000 c0000007dcf6f000 c0000002b6059af8 0000000000000000
> [  891.056145] GPR08: 00000007fc940000 c000000001c97d78 0000000000000000 0000000000000000
> [  891.056145] GPR12: 0000000000000000 c0000007fffe2e00 c0000007c2344400 0000000000000000
> [  891.056145] GPR16: 0000000000000000 00007fffc9b7cb50 c0000007c2344400 0000000000000000
> [  891.056145] GPR20: c0000002b307bdd8 0000000000000000 c0000007c2547ca8 c0000007dcf6f000
> [  891.056145] GPR24: 000000000000000c 000000000000000a c0000007c2547790 0000000000000001
> [  891.056145] GPR28: 0000000000000000 0000000000000000 00000000ffffffff c0000002b6059af8
> [  891.056260] NIP [0000000000000000] 0x0
> [  891.056272] LR [c00000000070eef0] submit_bio_noacct+0x2f0/0x5c0
> [  891.056285] Call Trace:
> [  891.056294] [c0000007c2547770] [c00000000070f050] submit_bio_noacct+0x450/0x5c0 (unreliable)
> [  891.056312] [c0000007c2547800] [c00000000070f228] submit_bio+0x68/0x2d0
> [  891.056328] [c0000007c25478c0] [c000000000505fe8] mpage_readahead+0x1c8/0x290
> [  891.056345] [c0000007c25479a0] [c0000000004fd6f8] blkdev_readahead+0x28/0x40
> [  891.056362] [c0000007c25479c0] [c000000000383980] read_pages+0xb0/0x4a0
> [  891.056376] [c0000007c2547a40] [c000000000384474] page_cache_readahead_unbounded+0x244/0x300
> [  891.056395] [c0000007c2547b00] [c00000000037445c] generic_file_buffered_read+0x9bc/0x1120
> [  891.056411] [c0000007c2547c50] [c0000000004fddc0] blkdev_read_iter+0x50/0x80
> [  891.056428] [c0000007c2547c70] [c000000000493c64] new_sync_read+0x124/0x1a0
> [  891.056443] [c0000007c2547d10] [c000000000496e30] vfs_read+0x100/0x200
> [  891.056471] [c0000007c2547d70] [c000000000497368] ksys_read+0x78/0x130
> [  891.056487] [c0000007c2547dc0] [c000000000030564] system_call_exception+0xe4/0x170
> [  891.056504] [c0000007c2547e20] [c00000000000ca70] system_call_common+0xf0/0x278
> [  891.056518] Instruction dump:
> [  891.056529] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  891.056545] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  891.056564] ---[ end trace 14197a45ec121b51 ]---

Is what should be in the email, that's the important part.
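
As a rough illustration of pulling exactly that part out of a console log, here is a minimal sketch. It is not how the CKI bot works; `extract_oopses` is a hypothetical helper, and the list of opening patterns is an assumption that covers the trace quoted above but is far from exhaustive.

```python
import re

# Lines that commonly open a kernel oops (assumed, not exhaustive).
START = re.compile(r"\b(BUG:|Oops:|kernel BUG at)")
# The kernel closes an oops with an "end trace" marker.
END = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_oopses(log_text):
    """Return each oops block, from its first matching line through 'end trace'."""
    blocks, current = [], None
    for line in log_text.splitlines():
        if current is None and START.search(line):
            current = []  # start collecting a new block
        if current is not None:
            current.append(line)
            if END.search(line):
                blocks.append("\n".join(current))
                current = None
    if current:  # log ended mid-oops; keep what was captured
        blocks.append("\n".join(current))
    return blocks
```

Each returned block could then be pasted into the per-arch report body, with the full console log kept as a secondary link.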

-- 
Jens Axboe



* Re: 💥 PANICKED: Test report for kernel 5.8.0-rc2-c698ae9.cki (block)
  2020-07-01 16:42   ` Jens Axboe
@ 2020-07-01 17:16     ` Rachel Sibley
  2020-07-01 17:21       ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Rachel Sibley @ 2020-07-01 17:16 UTC
  To: Jens Axboe, CKI Project, linux-block; +Cc: Xiong Zhou



On 7/1/20 12:42 PM, Jens Axboe wrote:
> On 7/1/20 10:37 AM, Rachel Sibley wrote:
>> Hi, we're seeing multiple panics across all arches, I included a snippet of the call trace for both
>> xfstests and boot test.
>>
>> You should be able to inspect in more detail by viewing the console.log under each build/tests directory:
>> https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/06/30/609250
> 
> This was due to a bad patch series, which since got reverted and redone. Current
> tree should be fine.
> 
> Now it doesn't matter for this one since I guessed what this was and found it
> before the bot did, but I do wish the reports were easier to look at. I should
> not have to dig through directories (which were empty when the report went out,

Sorry about that. We noticed this right after we sent the report and worked quickly to resolve it on our end;
the logs are now accessible in the external artifacts location.

> btw) to find logs, then download logs and leaf through hundreds of kb of text
> to find out why the bot thought the tree was broken. It should be readily
> apparent and in the email. If there's an OOPS, include the oops.

Agreed, this is also something we'd like to do, and we have an outstanding ticket to work on it.
I'll follow up and see if we can move this along more quickly so the oops is easier to find in the reports.

> 
> I'd much rather get a separate report for each arch, each having the oops
> that got triggered, than get one massive email where it's really not obvious
> where to look.

We are working on open sourcing our dashboard (the data warehouse) and are in the process of reworking it. This is
one of our main priorities right now. Once the data warehouse is public, it will be linked in the upstream
reports, which will make it easier to find related logs/failures going forward.

Thanks for all the feedback!
Rachel

> 
> This:
> 
>> https://cki-artifacts.s3.us-east-2.amazonaws.com/datawarehouse/2020/06/30/609250/build_ppc64le_redhat%3A926155/tests/8501352/ppc64le_3_console.log
>>
>> [  890.198174] run fstests generic/040 at 2020-06-30 12:03:02
>> [  891.055910] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>> [  891.055942] Faulting instruction address: 0x00000000
>> [  891.055956] Oops: Kernel access of bad area, sig: 11 [#1]
>> [  891.055969] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
>> [  891.055982] Modules linked in: dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rfkill joydev i40e at24 sunrpc ses
>> enclosure scsi_transport_sas regmap_i2c ofpart powernv_flash mtd crct10dif_vpmsum ipmi_powernv ipmi_devintf opal_prd ipmi_msghandler rtc_opal i2c_opal
>> ip_tables xfs libcrc32c ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea vmx_crypto sysfillrect sysimgblt crc32c_vpmsum fb_sys_fops
>> drm_ttm_helper ttm drm i2c_core aacraid drm_panel_orientation_quirks
>> [  891.056077] CPU: 25 PID: 84211 Comm: systemd-udevd Kdump: loaded Not tainted 5.8.0-rc2-c698ae9.cki #1
>> [  891.056095] NIP:  0000000000000000 LR: c00000000070eef0 CTR: 0000000000000000
>> [  891.056110] REGS: c0000007c25474e0 TRAP: 0400   Not tainted  (5.8.0-rc2-c698ae9.cki)
>> [  891.056125] MSR:  9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24488248  XER: 20040000
>> [  891.056145] CFAR: c00000000070eeec IRQMASK: 0
>> [  891.056145] GPR00: c00000000070f050 c0000007c2547770 c000000001cb7f00 c0000002b6059af8
>> [  891.056145] GPR04: 0000000000000000 c0000007dcf6f000 c0000002b6059af8 0000000000000000
>> [  891.056145] GPR08: 00000007fc940000 c000000001c97d78 0000000000000000 0000000000000000
>> [  891.056145] GPR12: 0000000000000000 c0000007fffe2e00 c0000007c2344400 0000000000000000
>> [  891.056145] GPR16: 0000000000000000 00007fffc9b7cb50 c0000007c2344400 0000000000000000
>> [  891.056145] GPR20: c0000002b307bdd8 0000000000000000 c0000007c2547ca8 c0000007dcf6f000
>> [  891.056145] GPR24: 000000000000000c 000000000000000a c0000007c2547790 0000000000000001
>> [  891.056145] GPR28: 0000000000000000 0000000000000000 00000000ffffffff c0000002b6059af8
>> [  891.056260] NIP [0000000000000000] 0x0
>> [  891.056272] LR [c00000000070eef0] submit_bio_noacct+0x2f0/0x5c0
>> [  891.056285] Call Trace:
>> [  891.056294] [c0000007c2547770] [c00000000070f050] submit_bio_noacct+0x450/0x5c0 (unreliable)
>> [  891.056312] [c0000007c2547800] [c00000000070f228] submit_bio+0x68/0x2d0
>> [  891.056328] [c0000007c25478c0] [c000000000505fe8] mpage_readahead+0x1c8/0x290
>> [  891.056345] [c0000007c25479a0] [c0000000004fd6f8] blkdev_readahead+0x28/0x40
>> [  891.056362] [c0000007c25479c0] [c000000000383980] read_pages+0xb0/0x4a0
>> [  891.056376] [c0000007c2547a40] [c000000000384474] page_cache_readahead_unbounded+0x244/0x300
>> [  891.056395] [c0000007c2547b00] [c00000000037445c] generic_file_buffered_read+0x9bc/0x1120
>> [  891.056411] [c0000007c2547c50] [c0000000004fddc0] blkdev_read_iter+0x50/0x80
>> [  891.056428] [c0000007c2547c70] [c000000000493c64] new_sync_read+0x124/0x1a0
>> [  891.056443] [c0000007c2547d10] [c000000000496e30] vfs_read+0x100/0x200
>> [  891.056471] [c0000007c2547d70] [c000000000497368] ksys_read+0x78/0x130
>> [  891.056487] [c0000007c2547dc0] [c000000000030564] system_call_exception+0xe4/0x170
>> [  891.056504] [c0000007c2547e20] [c00000000000ca70] system_call_common+0xf0/0x278
>> [  891.056518] Instruction dump:
>> [  891.056529] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [  891.056545] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [  891.056564] ---[ end trace 14197a45ec121b51 ]---
> 
> Is what should be in the email, that's the important part.
> 



* Re: 💥 PANICKED: Test report for kernel 5.8.0-rc2-c698ae9.cki (block)
  2020-07-01 17:16     ` Rachel Sibley
@ 2020-07-01 17:21       ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2020-07-01 17:21 UTC
  To: Rachel Sibley, CKI Project, linux-block; +Cc: Xiong Zhou

On 7/1/20 11:16 AM, Rachel Sibley wrote:
> 
> 
> On 7/1/20 12:42 PM, Jens Axboe wrote:
>> On 7/1/20 10:37 AM, Rachel Sibley wrote:
>>> Hi, we're seeing multiple panics across all arches, I included a snippet of the call trace for both
>>> xfstests and boot test.
>>>
>>> You should be able to inspect in more detail by viewing the console.log under each build/tests directory:
>>> https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/06/30/609250
>>
>> This was due to a bad patch series, which since got reverted and redone. Current
>> tree should be fine.
>>
>> Now it doesn't matter for this one since I guessed what this was and found it
>> before the bot did, but I do wish the reports were easier to look at. I should
>> not have to dig through directories (which were empty when the report went out,
> 
> Sorry about that we noticed this right after we sent the report and
> worked quickly to resolve it on our end, the logs are now accessible
> in the external artifacts location.

I was probably just too quick, but if we can fix the below, then it'd
work much nicer and the logs would just be a secondary resource.

>> btw) to find logs, then download logs and leaf through hundreds of kb of text
>> to find out why the bot thought the tree was broken. It should be readily
>> apparent and in the email. If there's an OOPS, include the oops.
> 
> Agreed, this is also something we'd like to do and we have an
> outstanding ticket to work on it.  I'll follow up and see if we can
> move this along quicker to make it easier to find it in the reports.

Thanks, that would be a massive improvement! The OOPS is really the key
thing here, and then I think it's fine to have to dig in
logs/directories to find other related information. Sometimes you
know what it is just by seeing the OOPS.

-- 
Jens Axboe

