* Kernel panic - not syncing: corrupted stack end detected inside scheduler
@ 2018-11-19 11:23 ` Andreas Schwab
  0 siblings, 0 replies; 12+ messages in thread
From: Andreas Schwab @ 2018-11-19 11:23 UTC (permalink / raw)
  To: linux-riscv

Could this be a stack overflow?

[ 2427.690000] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[ 2427.690000]
[ 2427.690000] CPU: 1 PID: 3540 Comm: kworker/u8:2 Not tainted 4.19.0-00014-g978b77fe75 #6
[ 2427.690000] Workqueue: writeback wb_workfn (flush-179:0)
[ 2427.690000] Call Trace:
[ 2427.690000] [<ffffffe000c867d4>] walk_stackframe+0x0/0xa4
[ 2427.690000] [<ffffffe000c869d4>] show_stack+0x2a/0x34
[ 2427.690000] [<ffffffe0011a8800>] dump_stack+0x62/0x7c
[ 2427.690000] [<ffffffe000c8b542>] panic+0xd2/0x1f0
[ 2427.690000] [<ffffffe0011bb25c>] schedule+0x0/0x58
[ 2427.690000] [<ffffffe0011bb470>] preempt_schedule_common+0xe/0x1e
[ 2427.690000] [<ffffffe0011bb4b4>] _cond_resched+0x34/0x40
[ 2427.690000] [<ffffffe001025694>] __spi_pump_messages+0x29e/0x40e
[ 2427.690000] [<ffffffe001025986>] __spi_sync+0x168/0x16a
[ 2427.690000] [<ffffffe001025b86>] spi_sync_locked+0xc/0x14
[ 2427.690000] [<ffffffe001077e8e>] mmc_spi_data_do.isra.2+0x568/0xa7c
[ 2427.690000] [<ffffffe0010783fa>] mmc_spi_request+0x58/0xc6
[ 2427.690000] [<ffffffe001068bbe>] __mmc_start_request+0x4e/0xe2
[ 2427.690000] [<ffffffe001069902>] mmc_start_request+0x78/0xa4
[ 2427.690000] [<ffffffd008307394>] mmc_blk_mq_issue_rq+0x21e/0x64e [mmc_block]
[ 2427.690000] [<ffffffd008307b46>] mmc_mq_queue_rq+0x11a/0x1f0 [mmc_block]
[ 2427.690000] [<ffffffe000ebbf60>] __blk_mq_try_issue_directly+0xca/0x146
[ 2427.690000] [<ffffffe000ebca2c>] blk_mq_request_issue_directly+0x42/0x92
[ 2427.690000] [<ffffffe000ebcaac>] blk_mq_try_issue_list_directly+0x30/0x6e
[ 2427.690000] [<ffffffe000ebfdc2>] blk_mq_sched_insert_requests+0x56/0x80
[ 2427.690000] [<ffffffe000ebc9da>] blk_mq_flush_plug_list+0xd6/0xe6
[ 2427.690000] [<ffffffe000eb3498>] blk_flush_plug_list+0x9e/0x17c
[ 2427.690000] [<ffffffe000ebc2f8>] blk_mq_make_request+0x282/0x2d8
[ 2427.690000] [<ffffffe000eb1d02>] generic_make_request+0xee/0x27a
[ 2427.690000] [<ffffffe000eb1f6e>] submit_bio+0xe0/0x136
[ 2427.690000] [<ffffffe000db10da>] submit_bh_wbc+0x130/0x176
[ 2427.690000] [<ffffffe000db12c6>] __block_write_full_page+0x1a6/0x3a8
[ 2427.690000] [<ffffffe000db167c>] block_write_full_page+0xce/0xe0
[ 2427.690000] [<ffffffe000db40f0>] blkdev_writepage+0x16/0x1e
[ 2427.690000] [<ffffffe000d3c7ca>] __writepage+0x14/0x4c
[ 2427.690000] [<ffffffe000d3d142>] write_cache_pages+0x15c/0x306
[ 2427.690000] [<ffffffe000d3e8a4>] generic_writepages+0x36/0x52
[ 2427.690000] [<ffffffe000db40b4>] blkdev_writepages+0xc/0x14
[ 2427.690000] [<ffffffe000d3f0ec>] do_writepages+0x36/0xa6
[ 2427.690000] [<ffffffe000da96ca>] __writeback_single_inode+0x2e/0x174
[ 2427.690000] [<ffffffe000da9c08>] writeback_sb_inodes+0x1ac/0x33e
[ 2427.690000] [<ffffffe000da9dea>] __writeback_inodes_wb+0x50/0x96
[ 2427.690000] [<ffffffe000daa052>] wb_writeback+0x182/0x186
[ 2427.690000] [<ffffffe000daa67c>] wb_workfn+0x242/0x270
[ 2427.690000] [<ffffffe000c9bb08>] process_one_work+0x16e/0x2ee
[ 2427.690000] [<ffffffe000c9bcde>] worker_thread+0x56/0x42a
[ 2427.690000] [<ffffffe000ca0bdc>] kthread+0xda/0xe8
[ 2427.690000] [<ffffffe000c85730>] ret_from_exception+0x0/0xc

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

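For reference, the panic string comes from the scheduler's stack-end check (CONFIG_SCHED_STACK_END_CHECK): fork seeds STACK_END_MAGIC at the far end of each task's kernel stack, and schedule_debug() panics with exactly this message when that word has been overwritten, i.e. something earlier ran past the end of the stack.  Note that the check only fires on the next trip through the scheduler, so the trace points at whoever happened to reschedule, not necessarily at the code that overflowed.  Below is a minimal, self-contained sketch of the mechanism -- a userspace illustration only, not kernel source; the 16 KiB stack size is an assumption about riscv64 kernels of that era.

/* Userspace illustration of the stack-end canary behind
 * "corrupted stack end detected inside scheduler"; simplified, not kernel code. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STACK_END_MAGIC 0x57AC6E9DUL   /* same value mainline uses */
#define THREAD_SIZE     (16 * 1024)    /* assumed riscv64 kernel stack size */

struct fake_task {
	unsigned long *stack;          /* lowest address of the downward-growing stack */
};

/* kernel analogue: set_task_stack_end_magic() seeds a canary at the stack end */
static void set_stack_end_magic(struct fake_task *tsk)
{
	tsk->stack[0] = STACK_END_MAGIC;
}

/* kernel analogue: task_stack_end_corrupted() -- a clobbered canary means
 * something ran past the end of the stack at some point */
static int stack_end_corrupted(const struct fake_task *tsk)
{
	return tsk->stack[0] != STACK_END_MAGIC;
}

int main(void)
{
	struct fake_task tsk;

	tsk.stack = calloc(1, THREAD_SIZE);
	if (!tsk.stack)
		return 1;
	set_stack_end_magic(&tsk);

	/* simulate a deep call chain writing past the end of the stack */
	memset(tsk.stack, 0xAA, 64);

	if (stack_end_corrupted(&tsk))
		printf("corrupted stack end detected inside scheduler\n");

	free(tsk.stack);
	return 0;
}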


* Re: Kernel panic - not syncing: corrupted stack end detected inside scheduler
@ 2018-11-19 23:46   ` Palmer Dabbelt
  0 siblings, 0 replies; 12+ messages in thread
From: Palmer Dabbelt @ 2018-11-19 23:46 UTC (permalink / raw)
  To: schwab; +Cc: linux-riscv

On Mon, 19 Nov 2018 03:23:14 PST (-0800), schwab@suse.de wrote:
> Could this be a stack overflow?

Yes.

> [ 2427.690000] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [ 2427.690000]
> [ 2427.690000] CPU: 1 PID: 3540 Comm: kworker/u8:2 Not tainted 4.19.0-00014-g978b77fe75 #6
> [ 2427.690000] Workqueue: writeback wb_workfn (flush-179:0)
> [ 2427.690000] Call Trace:
> [ 2427.690000] [<ffffffe000c867d4>] walk_stackframe+0x0/0xa4
> [ 2427.690000] [<ffffffe000c869d4>] show_stack+0x2a/0x34
> [ 2427.690000] [<ffffffe0011a8800>] dump_stack+0x62/0x7c
> [ 2427.690000] [<ffffffe000c8b542>] panic+0xd2/0x1f0
> [ 2427.690000] [<ffffffe0011bb25c>] schedule+0x0/0x58
> [ 2427.690000] [<ffffffe0011bb470>] preempt_schedule_common+0xe/0x1e
> [ 2427.690000] [<ffffffe0011bb4b4>] _cond_resched+0x34/0x40
> [ 2427.690000] [<ffffffe001025694>] __spi_pump_messages+0x29e/0x40e
> [ 2427.690000] [<ffffffe001025986>] __spi_sync+0x168/0x16a
> [ 2427.690000] [<ffffffe001025b86>] spi_sync_locked+0xc/0x14
> [ 2427.690000] [<ffffffe001077e8e>] mmc_spi_data_do.isra.2+0x568/0xa7c
> [ 2427.690000] [<ffffffe0010783fa>] mmc_spi_request+0x58/0xc6
> [ 2427.690000] [<ffffffe001068bbe>] __mmc_start_request+0x4e/0xe2
> [ 2427.690000] [<ffffffe001069902>] mmc_start_request+0x78/0xa4
> [ 2427.690000] [<ffffffd008307394>] mmc_blk_mq_issue_rq+0x21e/0x64e [mmc_block]
> [ 2427.690000] [<ffffffd008307b46>] mmc_mq_queue_rq+0x11a/0x1f0 [mmc_block]
> [ 2427.690000] [<ffffffe000ebbf60>] __blk_mq_try_issue_directly+0xca/0x146
> [ 2427.690000] [<ffffffe000ebca2c>] blk_mq_request_issue_directly+0x42/0x92
> [ 2427.690000] [<ffffffe000ebcaac>] blk_mq_try_issue_list_directly+0x30/0x6e
> [ 2427.690000] [<ffffffe000ebfdc2>] blk_mq_sched_insert_requests+0x56/0x80
> [ 2427.690000] [<ffffffe000ebc9da>] blk_mq_flush_plug_list+0xd6/0xe6
> [ 2427.690000] [<ffffffe000eb3498>] blk_flush_plug_list+0x9e/0x17c
> [ 2427.690000] [<ffffffe000ebc2f8>] blk_mq_make_request+0x282/0x2d8
> [ 2427.690000] [<ffffffe000eb1d02>] generic_make_request+0xee/0x27a
> [ 2427.690000] [<ffffffe000eb1f6e>] submit_bio+0xe0/0x136
> [ 2427.690000] [<ffffffe000db10da>] submit_bh_wbc+0x130/0x176
> [ 2427.690000] [<ffffffe000db12c6>] __block_write_full_page+0x1a6/0x3a8
> [ 2427.690000] [<ffffffe000db167c>] block_write_full_page+0xce/0xe0
> [ 2427.690000] [<ffffffe000db40f0>] blkdev_writepage+0x16/0x1e
> [ 2427.690000] [<ffffffe000d3c7ca>] __writepage+0x14/0x4c
> [ 2427.690000] [<ffffffe000d3d142>] write_cache_pages+0x15c/0x306
> [ 2427.690000] [<ffffffe000d3e8a4>] generic_writepages+0x36/0x52
> [ 2427.690000] [<ffffffe000db40b4>] blkdev_writepages+0xc/0x14
> [ 2427.690000] [<ffffffe000d3f0ec>] do_writepages+0x36/0xa6
> [ 2427.690000] [<ffffffe000da96ca>] __writeback_single_inode+0x2e/0x174
> [ 2427.690000] [<ffffffe000da9c08>] writeback_sb_inodes+0x1ac/0x33e
> [ 2427.690000] [<ffffffe000da9dea>] __writeback_inodes_wb+0x50/0x96
> [ 2427.690000] [<ffffffe000daa052>] wb_writeback+0x182/0x186
> [ 2427.690000] [<ffffffe000daa67c>] wb_workfn+0x242/0x270
> [ 2427.690000] [<ffffffe000c9bb08>] process_one_work+0x16e/0x2ee
> [ 2427.690000] [<ffffffe000c9bcde>] worker_thread+0x56/0x42a
> [ 2427.690000] [<ffffffe000ca0bdc>] kthread+0xda/0xe8
> [ 2427.690000] [<ffffffe000c85730>] ret_from_exception+0x0/0xc

It smells like the issue is somewhere in the SPI driver, which is known to be 
buggy.  I don't see anything specific to indicate this is a stack overflow in 
this stack trace (the stack stuff above panic is just part of the printing).

Sorry I can't be more specific.  Does this require hardware to manifest?

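If someone wanted to test the overflow theory directly, one option is a throwaway debug helper that logs the remaining kernel stack at interesting points in the mmc_spi/SPI path.  This is a hypothetical hack, not an existing kernel helper -- report_stack_headroom is a made-up name -- though end_of_stack() and current are the real interfaces:

/* Throwaway debug hack (hypothetical, not from any tree): print how much of
 * the current task's kernel stack is still free at the call site.  Dropping a
 * few calls into the suspect path, e.g. mmc_spi_data_do() or
 * __spi_pump_messages(), and watching the number shrink toward zero would
 * confirm or rule out an overflow there. */
#include <linux/printk.h>
#include <linux/sched.h>
#include <linux/sched/task_stack.h>	/* end_of_stack() */

static inline void report_stack_headroom(const char *where)
{
	unsigned long sp = (unsigned long)&sp;	/* address of a local approximates the stack pointer */
	unsigned long end = (unsigned long)end_of_stack(current);

	/* the kernel stack grows down, so headroom is sp minus its low end */
	pr_info("%s: %lu bytes of kernel stack left\n", where, sp - end);
}

CONFIG_DEBUG_STACK_USAGE and the ftrace stack tracer are the in-tree alternatives, where the architecture supports them.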


* Re: Kernel panic - not syncing: corrupted stack end detected inside scheduler
@ 2018-11-20  8:52     ` Andreas Schwab
  0 siblings, 0 replies; 12+ messages in thread
From: Andreas Schwab @ 2018-11-20  8:52 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: linux-riscv

On Nov 19 2018, Palmer Dabbelt <palmer@sifive.com> wrote:

> Sorry I can't be more specific.  Does this require hardware to manifest?

This was on the HiFive.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



* Re: Kernel panic - not syncing: corrupted stack end detected inside scheduler
@ 2018-11-20 17:29       ` Palmer Dabbelt
  0 siblings, 0 replies; 12+ messages in thread
From: Palmer Dabbelt @ 2018-11-20 17:29 UTC (permalink / raw)
  To: schwab; +Cc: linux-riscv

On Tue, 20 Nov 2018 00:52:42 PST (-0800), schwab@suse.de wrote:
> On Nov 19 2018, Palmer Dabbelt <palmer@sifive.com> wrote:
>
>> Sorry I can't be more specific.  Does this require hardware to manifest?
>
> This was on the hifive.

OK, well, we know there are at least some issues when using an SD card via the
SPI interface as a disk, but nobody has had time to track them down yet.  Right
now the only reproducer is to just write a lot of data, which is a pain to
debug.  The known issue manifests as a hang in a kernel thread with an
MMC-like name.  What you're seeing may be the same issue or a different one.

Do you have a better way to reproduce this than to just hammer the filesystem?

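For what it's worth, the "write a lot of data" reproducer can be approximated with a small streaming-write program like the one below.  The mount point is an assumption -- point it at wherever the mmc-spi-backed card is mounted -- and any sustained large write should exercise the same writeback path as the trace above:

/* Stream ~4 GiB to a file on the SD-card-backed filesystem, syncing
 * periodically so dirty pages keep flowing through writeback and the
 * mmc_spi/SPI path.  The default path below is an assumption. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/mnt/sd/hammer.dat";
	const size_t chunk = 1 << 20;              /* 1 MiB per write() */
	const size_t total = (size_t)4 << 30;      /* ~4 GiB in total */
	char *buf = malloc(chunk);
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (!buf || fd < 0) {
		perror("setup");
		return 1;
	}
	memset(buf, 0x5a, chunk);

	for (size_t done = 0; done < total; done += chunk) {
		if (write(fd, buf, chunk) != (ssize_t)chunk) {
			perror("write");
			break;
		}
		if ((done & ((64u << 20) - 1)) == 0)
			fsync(fd);                 /* force the data out to the card every 64 MiB */
	}
	fsync(fd);
	close(fd);
	free(buf);
	return 0;
}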


* Re: Kernel panic - not syncing: corrupted stack end detected inside scheduler
@ 2018-11-21  8:55         ` Andreas Schwab
  0 siblings, 0 replies; 12+ messages in thread
From: Andreas Schwab @ 2018-11-21  8:55 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: linux-riscv

On Nov 20 2018, Palmer Dabbelt <palmer@sifive.com> wrote:

> Do you have a better way to reproduce this than to just hammer the filesystem?

I don't hammer the filesystem.  Most of the data is on nfs and /tmp is
on tmpfs.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



* Kernel panic - not syncing: corrupted stack end detected inside scheduler
@ 2018-11-13 21:28 Qian Cai
  0 siblings, 0 replies; 12+ messages in thread
From: Qian Cai @ 2018-11-13 21:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: selinux, Paul Moore, Stephen Smalley, Eric Paris

Running a runc test suite [1] on an aarch64 server with the latest mainline (rc2) triggered this:

[1] https://fedorapeople.org/cgit/caiqian/public_git/runctst.git/tree/runctst.py

=============

# python runctst.py
...
- start: runc ocf poststart
load container root: container "root" does not exist
container_linux.go:364: running poststart hook 0 caused "error running hook: exit status 1, stdout: , stderr: "
- error: unexpected zero-size output.
- start: runc ocf mount dev
- pass: runc ocf mount dev
- start: runc ocf network
- pass: runc ocf network
- start: runc ocf args systemd
- error: unexpected dbus output is
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1 68.6  0.0   8832  7424 ?        Rs   19:29   0:03 /usr/sbin/init
root         18 24.0  0.0  14848  8704 ?        Ss   19:29   0:00 /usr/lib/systemd/systemd-journald
root         22  0.0  0.0   9728  6592 ?        Rs   19:29   0:00 ps aux

==============

[ 2231.649459] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[ 2231.657307] CPU: 185 PID: 11718 Comm: dbus-daemon Kdump: loaded Tainted: G        W         4.20.0-rc2+ #4
[ 2231.666961] Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.0.6 07/10/2018
[ 2231.676788] Call trace:
[ 2231.679273]  dump_backtrace+0x0/0x2c8
[ 2231.682950]  show_stack+0x24/0x30
[ 2231.686273]  dump_stack+0x118/0x19c
[ 2231.689765]  panic+0x1b8/0x31c
[ 2231.692822]  schedule+0x0/0x240
[ 2231.695963]  preempt_schedule_common+0x3c/0x78
[ 2231.700406]  _cond_resched+0xfc/0x108
[ 2231.704077]  kmem_cache_alloc+0x2e0/0x3f8
[ 2231.708102]  selinux_inode_alloc_security+0xc4/0x1b0
[ 2231.713080]  security_inode_alloc+0x44/0x70
[ 2231.717267]  inode_init_always+0x270/0x4b8
[ 2231.721364]  alloc_inode+0x50/0xd0
[ 2231.724768]  new_inode_pseudo+0x84/0x120
[ 2231.728691]  sock_alloc+0x30/0x108
[ 2231.732093]  __sock_create+0x154/0x560
[ 2231.735843]  __sys_socket+0xc8/0x178
[ 2231.739429]  __arm64_sys_socket+0x4c/0x60
[ 2231.743460]  el0_svc_handler+0xd4/0x198
[ 2231.747295]  el0_svc+0x8/0xc
[ 2231.750777] SMP: stopping secondary CPUs
[ 2231.757333] Starting crashdump kernel...
[ 2231.761260] Bye!


* Kernel panic - not syncing: corrupted stack end detected inside scheduler
@ 2018-11-13  4:45 Qian Cai
  0 siblings, 0 replies; 12+ messages in thread
From: Qian Cai @ 2018-11-13  4:45 UTC (permalink / raw)
  To: linux kernel; +Cc: linux-mm

Running the LTP oom01 test [1] triggered a kernel panic on an aarch64 server with the latest mainline (rc2).

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c

[ 3433.338741] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[ 3433.347644] CPU: 49 PID: 2189 Comm: in:imjournal Kdump: loaded Tainted: G        W         4.20.0-rc2+ #15
[ 3433.357298] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
[ 3433.364523] Call trace:
[ 3433.366993]  dump_backtrace+0x0/0x2c8
[ 3433.370659]  show_stack+0x24/0x30
[ 3433.373980]  dump_stack+0x118/0x19c
[ 3433.377473]  panic+0x1b8/0x31c
[ 3433.380530]  schedule+0x0/0x240
[ 3433.383672]  schedule+0xdc/0x240
[ 3433.386905]  io_schedule+0x24/0x48
[ 3433.390313]  get_request+0x3b0/0xb68
[ 3433.393891]  blk_queue_bio+0x3a4/0xcd8
[ 3433.397642]  generic_make_request+0x440/0x7d8
[ 3433.402000]  submit_bio+0xbc/0x300
[ 3433.405409]  __swap_writepage+0xa54/0xd00
[ 3433.409420]  swap_writepage+0x44/0xb0
[ 3433.413086]  pageout.isra.12+0x580/0xd80
[ 3433.417011]  shrink_page_list+0x2480/0x36f0
[ 3433.421196]  shrink_inactive_list+0x388/0xb98
[ 3433.425555]  shrink_node_memcg+0x344/0x9c0
[ 3433.429653]  shrink_node+0x200/0x940
[ 3433.433231]  do_try_to_free_pages+0x234/0x7d0
[ 3433.437589]  try_to_free_pages+0x228/0x6b0
[ 3433.441689]  __alloc_pages_nodemask+0xcbc/0x2028
[ 3433.446309]  alloc_pages_vma+0x1a4/0x208
[ 3433.450235]  __read_swap_cache_async+0x4fc/0x858
[ 3433.454855]  read_swap_cache_async+0xa4/0x100
[ 3433.459214]  swap_cluster_readahead+0x598/0x650
[ 3433.463746]  shmem_swapin+0xd4/0x150
[ 3433.467324]  shmem_getpage_gfp+0xf50/0x1c48
[ 3433.471509]  shmem_fault+0x140/0x340
[ 3433.475086]  __do_fault+0xd0/0x440
[ 3433.478490]  do_fault+0x54c/0xf48
[ 3433.481807]  __handle_mm_fault+0x4c0/0x928
[ 3433.485905]  handle_mm_fault+0x30c/0x4b8
[ 3433.489832]  do_page_fault+0x294/0x658
[ 3433.493584]  do_translation_fault+0x98/0xa8
[ 3433.497769]  do_mem_abort+0x64/0xf0
[ 3433.501258]  el0_da+0x24/0x28


