* Re: 💥 PANICKED: Waiting for review: Test report for kernel 5.15.0-rc6 (block, 1983520d)
       [not found] <cki.1BB6AA01C6.FWO6ZHIQNG@redhat.com>
@ 2021-10-19  2:13 ` Yi Zhang
  2021-10-19  2:51   ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Yi Zhang @ 2021-10-19  2:13 UTC (permalink / raw)
  To: linux-block
  Cc: skt-results-master, CKI Project, Changhui Zhong, Bruno Goncalves,
	Ming Lei

Hello

With this commit, the server hits a NULL pointer dereference during boot [1]; please help check it.

>        Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>             Commit: 1983520d28d8 - Merge branch 'for-5.16/io_uring' into for-next
[1]
[    8.614036] Kernel attempted to read user page (f) - exploit attempt? (uid: 0)
[    8.614071] BUG: Kernel NULL pointer dereference on read at 0x0000000f
[    8.614099] Faulting instruction address: 0xc00000000093b5b4
[    8.614118] Oops: Kernel access of bad area, sig: 11 [#1]
[    8.614143] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[    8.614192] Modules linked in: zram ip_tables ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm vmx_crypto crc32c_vpmsum i2c_core drm_panel_orientation_quirks
[    8.614285] CPU: 52 PID: 0 Comm: swapper/52 Not tainted 5.15.0-rc6+ #1
[    8.614323] NIP:  c00000000093b5b4 LR: c00000000093b5a4 CTR: c000000000972c50
[    8.614371] REGS: c000000018d3b430 TRAP: 0300   Not tainted  (5.15.0-rc6+)
[    8.614409] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44004284  XER: 00000000
[    8.614464] CFAR: c000000000972cd0 DAR: 000000000000000f DSISR: 40000000 IRQMASK: 1
[    8.614464] GPR00: c00000000093b5a4 c000000018d3b6d0 c0000000028aa100 c000000044582a00
[    8.614464] GPR04: 00000000ffff8e2c 0000000000000001 0000000000000004 c000000001212930
[    8.614464] GPR08: 00000000000001d7 0000000000000007 c000000044361f80 0000000000000000
[    8.614464] GPR12: c000000000972c50 c000001ffffa7200 0000000000000000 0000000000000100
[    8.614464] GPR16: 0000000000000004 0000000004200042 0000000000000000 0000000000000001
[    8.614464] GPR20: c0000000028d3a00 c000000002167888 00000000ffff8e2d c000000002167888
[    8.614464] GPR24: c0000000021f0580 0000000000000001 c000000044582ae0 0000000000000801
[    8.614464] GPR28: c000000044361f80 c000000044361f80 c00000003b81f800 c000000044582a00
[    8.614852] NIP [c00000000093b5b4] blk_mq_free_request+0x74/0x210
[    8.614898] LR [c00000000093b5a4] blk_mq_free_request+0x64/0x210
[    8.614942] Call Trace:
[    8.614961] [c000000018d3b6d0] [c00000000093b8ec] __blk_mq_end_request+0x19c/0x1d0 (unreliable)
[    8.615020] [c000000018d3b710] [c00000000092fdfc] blk_flush_complete_seq+0x1ac/0x3d0
[    8.615077] [c000000018d3b770] [c0000000009302dc] flush_end_io+0x2bc/0x390
[    8.615122] [c000000018d3b7d0] [c00000000093b7cc] __blk_mq_end_request+0x7c/0x1d0
[    8.615174] [c000000018d3b810] [c000000000c20684] scsi_end_request+0x124/0x270
[    8.615228] [c000000018d3b860] [c000000000c21458] scsi_io_completion+0x1f8/0x740
[    8.615250] [c000000018d3b900] [c000000000c13e14] scsi_finish_command+0x134/0x190
[    8.615292] [c000000018d3b990] [c000000000c21068] scsi_complete+0xa8/0x200
[    8.615345] [c000000018d3ba10] [c000000000939870] blk_complete_reqs+0x80/0xa0
[    8.615409] [c000000018d3ba40] [c00000000115dfe0] __do_softirq+0x170/0x3fc
[    8.615475] [c000000018d3bb30] [c00000000015df38] __irq_exit_rcu+0x168/0x1a0
[    8.615535] [c000000018d3bb60] [c00000000015e140] irq_exit+0x20/0x40
[    8.615583] [c000000018d3bb80] [c000000000054670] doorbell_exception+0x120/0x300
[    8.615616] [c000000018d3bbc0] [c000000000016cc4] replay_soft_interrupts+0x1e4/0x2c0
[    8.615639] [c000000018d3bda0] [c000000000016ed8] arch_local_irq_restore+0x138/0x1a0
[    8.615694] [c000000018d3bdd0] [c000000000de1714] cpuidle_enter_state+0x104/0x540
[    8.615741] [c000000018d3be30] [c000000000de1bec] cpuidle_enter+0x4c/0x70
[    8.615795] [c000000018d3be70] [c0000000001aef18] do_idle+0x2f8/0x3f0
[    8.615839] [c000000018d3bf00] [c0000000001af238] cpu_startup_entry+0x38/0x40
[    8.615901] [c000000018d3bf30] [c00000000005be6c] start_secondary+0x29c/0x2b0
[    8.615969] [c000000018d3bf90] [c00000000000d254] start_secondary_prolog+0x10/0x14
[    8.616025] Instruction dump:
[    8.616074] 41820048 e93d0008 e9290000 e9890068 2c2c0000 41820010 7d8903a6 4e800421
[    8.616136] e8410018 e93f00d8 2c290000 41820140 <e8690008> 4bff6b91 60000000 39400000
[    8.616184] ---[ end trace cc3215be892e1be7 ]---
[    8.644628] systemd-journald[1408]: Received client request to flush runtime journal.
[  OK  ] Finished Create Static Device Nodes in /dev.
         Starting Rule-based Manage…for Device Events and Files...
[  OK  ] Finished Coldplug All udev Devices.
         Starting Wait for udev To …plete Device Initialization...
[    8.690954] fuse: init (API version 7.34)
[  OK  ] Finished Load Kernel Module fuse.
[    8.694411]
         Mounting FUSE Control File System...
[  OK  ] Mounted FUSE Control File System.
[  OK  ] Started Rule-based Manager for Device Events and Files.
         Starting Load Kernel Module configfs...
[  OK  ] Finished Load Kernel Module configfs.
[    8.950307] IPMI message handler: version 39.2
[  OK  ] Found device /dev/zram0.
         Starting Create swap on /dev/zram0...
[    9.008355] zram0: detected capacity change from 0 to 16777216
[    9.694430] Kernel panic - not syncing: Aiee, killing interrupt handler!
[   11.080031] ---[ end Kernel panic - not syncing: Aiee, killing interrupt handler! ]---
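
The fault address in the oops can be cross-checked against the instruction dump. The word in angle brackets, e8690008, is the instruction at NIP. The following is a minimal sketch, assuming the standard PowerPC DS-form load encoding (primary opcode 58, extended opcode 0 for `ld`); the struct and function names are hypothetical helpers, not kernel code. It decodes the word as `ld r3, 8(r9)`, and with GPR09 = 0x7 from the register dump the effective address comes out as 0xf, which matches DAR.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PowerPC DS-form instruction such as `ld RT, D(RA)`.
 * Hypothetical helper type, used only for this illustration. */
struct ds_form {
    unsigned opcode;   /* primary opcode (top 6 bits of the word) */
    unsigned rt;       /* target register number */
    unsigned ra;       /* base register number */
    int32_t  disp;     /* sign-extended byte displacement (low 2 bits zero) */
    unsigned xo;       /* extended opcode: 0 = ld, 1 = ldu, 2 = lwa */
};

struct ds_form decode_ds_form(uint32_t insn)
{
    struct ds_form d;

    d.opcode = insn >> 26;
    d.rt = (insn >> 21) & 0x1f;
    d.ra = (insn >> 16) & 0x1f;
    d.disp = (int32_t)(insn & 0xfffc);
    if (d.disp & 0x8000)           /* sign-extend the 16-bit field */
        d.disp -= 0x10000;
    d.xo = insn & 0x3;
    return d;
}

/* Effective address of a plain `ld`, given the value held in RA. */
uint64_t ld_effective_address(uint32_t insn, uint64_t ra_value)
{
    struct ds_form d = decode_ds_form(insn);

    assert(d.opcode == 58 && d.xo == 0);   /* only plain `ld` is handled */
    return ra_value + (uint64_t)(int64_t)d.disp;
}
```

Calling `ld_effective_address(0xe8690008, 0x7)` reproduces the faulting address. Decoded the same way, the earlier word e93f00d8 is `ld r9, 0xd8(r31)`, so the bogus base value 7 was itself loaded from the request that r31 points to, which is consistent with a stale field in that structure rather than a corrupted register.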



On Tue, Oct 19, 2021 at 8:22 AM CKI Project <cki-project@redhat.com> wrote:
>
>
> ┌───────────────────────────────────────────────────────────────┐
> │ REVIEW REQUIRED FOR FAILED TEST                               │
> ├───────────────────────────────────────────────────────────────┤
> │ This failed kernel test has been held for review by kernel    │
> │ test maintainers and the CKI team. Please investigate using   │
> │ the pipeline link below this box.                             │
> │                                                               │
> │ If the test failure is related to a non-kernel bug, no action │
> │ is needed. If a kernel bug is found, please reply all with    │
> │ your assessment and we will release the report.               │
> │ For more details: https://docs.engineering.redhat.com/x/eG5qB │
> └───────────────────────────────────────────────────────────────┘
>
> Pipeline: https://gitlab.com/redhat/red-hat-ci-tools/kernel/cki-internal-pipelines/cki-trusted-contributors/-/pipelines/390489191
>
> Check out if the issue is autotriaged in the dashboard:
>     https://datawarehouse.cki-project.org/search?q=390489191
>
> Hello,
>
> We ran automated tests on a recent commit from this kernel tree:
>
>        Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>             Commit: 1983520d28d8 - Merge branch 'for-5.16/io_uring' into for-next
>
> The results of these automated tests are provided below.
>
>     Overall result: FAILED (see details below)
>              Merge: OK
>            Compile: OK
>              Tests: PANICKED
>     Targeted tests: NO
>
> All kernel binaries, config files, and logs are available for download here:
>
>   https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/10/18/390489191
>
> One or more kernel tests failed:
>
>     s390x:
>      💥 Boot test
>      💥 Boot test
>      💥 Boot test
>
>     ppc64le:
>      💥 Storage blktests - srp
>      💥 Storage block - filesystem fio test
>
>     aarch64:
>      💥 Boot test
>      💥 Boot test
>      💥 Boot test
>      💥 Boot test
>
>     x86_64:
>      💥 Boot test
>      💥 Boot test
>
> We hope that these logs can help you find the problem quickly. For the full
> detail on our testing procedures, please scroll to the bottom of this message.
>
> Please reply to this email if you have any questions about the tests that we
> ran or if you have any suggestions on how to make future tests more effective.
>
>         ,-.   ,-.
>        ( C ) ( K )  Continuous
>         `-',-.`-'   Kernel
>           ( I )     Integration
>            `-'
> ______________________________________________________________________________
>
> Compile testing
> ---------------
>
> We compiled the kernel for 4 architectures:
>
>     aarch64:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     ppc64le:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     s390x:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     x86_64:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>
>




--
Best Regards,
  Yi Zhang



* Re: 💥 PANICKED: Waiting for review: Test report for kernel 5.15.0-rc6 (block, 1983520d)
  2021-10-19  2:13 ` 💥 PANICKED: Waiting for review: Test report for kernel 5.15.0-rc6 (block, 1983520d) Yi Zhang
@ 2021-10-19  2:51   ` Jens Axboe
  2021-10-19  3:40     ` Yi Zhang
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2021-10-19  2:51 UTC (permalink / raw)
  To: Yi Zhang, linux-block
  Cc: skt-results-master, CKI Project, Changhui Zhong, Bruno Goncalves,
	Ming Lei

On 10/18/21 8:13 PM, Yi Zhang wrote:
> Hello
> 
> With this commit, the servers boots with NULL pointer[1], pls help check it.
> 
>>        Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>>             Commit: 1983520d28d8 - Merge branch 'for-5.16/io_uring' into for-next
> [1]
> [...]

Can you try this?


diff --git a/block/blk-flush.c b/block/blk-flush.c
index 4201728bf3a5..e9c0b300a177 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -129,6 +129,9 @@ static void blk_flush_restore_request(struct request *rq)
 	/* make @rq a normal request */
 	rq->rq_flags &= ~RQF_FLUSH_SEQ;
 	rq->end_io = rq->flush.saved_end_io;
+	/* clear pointers overlapping with flush data */
+	rq->elv.icq = NULL;
+	rq->elv.priv[0] = rq->elv.priv[1] = NULL;
 }
 
 static void blk_flush_queue_rq(struct request *rq, bool add_front)

-- 
Jens Axboe
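
The comment in the patch, "clear pointers overlapping with flush data", refers to the `elv` and `flush` members of the request sharing storage. The sketch below is a simplified model of that overlap, not the kernel's `struct request`: the union layout is an assumption based on the field names used in the patch, and `request_model`, `restore_unpatched`, `restore_patched` and `stale_icq_after_restore` are hypothetical names. It shows how a leftover flush sequence value can be read back as a non-NULL `elv.icq` unless the restore step clears it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct io_cq;                                   /* opaque in this model */
struct list_head { struct list_head *next, *prev; };
struct request_model;
typedef void (rq_end_io_fn)(struct request_model *, int);

/* Simplified model (an assumption): only the fields the patch touches. */
struct request_model {
    rq_end_io_fn *end_io;
    union {
        struct {
            struct io_cq *icq;
            void *priv[2];
        } elv;                                  /* I/O scheduler data */
        struct {
            unsigned int seq;
            struct list_head list;
            rq_end_io_fn *saved_end_io;
        } flush;                                /* flush sequence data */
    };
};

/* Restore step without the fix: flush data is left in place. */
void restore_unpatched(struct request_model *rq)
{
    rq->end_io = rq->flush.saved_end_io;
}

/* Restore step with the fix: overlapping pointers are cleared. */
void restore_patched(struct request_model *rq)
{
    rq->end_io = rq->flush.saved_end_io;
    rq->elv.icq = NULL;
    rq->elv.priv[0] = rq->elv.priv[1] = NULL;
}

/* Returns 1 if elv.icq looks non-NULL after the restore step. */
int stale_icq_after_restore(unsigned int seq, int patched)
{
    struct request_model rq;

    /* In this model, flush.seq and elv.icq start at the same offset. */
    assert(offsetof(struct request_model, elv.icq) ==
           offsetof(struct request_model, flush.seq));

    memset(&rq, 0, sizeof(rq));
    rq.flush.seq = seq;                         /* left over from the flush */
    if (patched)
        restore_patched(&rq);
    else
        restore_unpatched(&rq);
    return rq.elv.icq != NULL;
}
```

With a sequence value of 7, `stale_icq_after_restore(7, 0)` reports a stale pointer and `stale_icq_after_restore(7, 1)` does not. A value of 7 is what GPR09 held in the reported oops, which would fit a request freed with flush state still in the shared storage, though the thread itself does not spell out that analysis.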



* Re: 💥 PANICKED: Waiting for review: Test report for kernel 5.15.0-rc6 (block, 1983520d)
  2021-10-19  2:51   ` Jens Axboe
@ 2021-10-19  3:40     ` Yi Zhang
  2021-10-19  9:43       ` Bruno Goncalves
  0 siblings, 1 reply; 4+ messages in thread
From: Yi Zhang @ 2021-10-19  3:40 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, skt-results-master, CKI Project, Changhui Zhong,
	Bruno Goncalves, Ming Lei

On Tue, Oct 19, 2021 at 10:52 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 10/18/21 8:13 PM, Yi Zhang wrote:
> > [...]
>
> Can you try this?
>

Yes, in my testing the patch fixes the boot panic on ppc64le. How about
waiting for CKI to re-test it on the other arches?



-- 
Best Regards,
  Yi Zhang



* Re: 💥 PANICKED: Waiting for review: Test report for kernel 5.15.0-rc6 (block, 1983520d)
  2021-10-19  3:40     ` Yi Zhang
@ 2021-10-19  9:43       ` Bruno Goncalves
  0 siblings, 0 replies; 4+ messages in thread
From: Bruno Goncalves @ 2021-10-19  9:43 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Jens Axboe, linux-block, skt-results-master, CKI Project,
	Changhui Zhong, Ming Lei

On Tue, Oct 19, 2021 at 5:41 AM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> On Tue, Oct 19, 2021 at 10:52 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > On 10/18/21 8:13 PM, Yi Zhang wrote:
> > > [...]
> >
> > Can you try this?
> >
>
> Yeah, the boot panic issue was fixed on ppc64le from my testing, how
> about wait CKI re-test it on other arches?

Thanks. CKI is running on
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-next&id=3da791d43d00f7f8c824ac7b619c9f73e44fea13
and the boot issue is fixed.

Bruno


