* [dm-devel] dm thin-volume hung as swap: bug or as-design ?
@ 2021-01-29 10:40 Coly Li
  2021-01-29 13:57 ` Alasdair G Kergon
  0 siblings, 1 reply; 3+ messages in thread
From: Coly Li @ 2021-01-29 10:40 UTC (permalink / raw)
  To: dm-devel

Hi folks,

Recently I received a report that a whole system hung and became
unresponsive after running under I/O load for a while. What is special
about the configuration is that a dm thin-pool volume is used as the
system's swap partition.

From the crash dump, I found one suspicious task, shown below:

PID: 462    TASK: ffff93033d74a680  CPU: 7   COMMAND: "kworker/u256:1"
 #0 [ffffb24b4d9c3710] __schedule at ffffffff9e29dc3d
 #1 [ffffb24b4d9c37a0] schedule at ffffffff9e29e0bf
 #2 [ffffb24b4d9c37b0] schedule_timeout at ffffffff9e2a179d
 #3 [ffffb24b4d9c3828] wait_for_completion at ffffffff9e29eaaa
 #4 [ffffb24b4d9c3878] __flush_work at ffffffff9dabb277
 #5 [ffffb24b4d9c38f0] drain_all_pages at ffffffff9dc74e05
 #6 [ffffb24b4d9c3920] __alloc_pages_slowpath at ffffffff9dc77279
 #7 [ffffb24b4d9c3a20] __alloc_pages_nodemask at ffffffff9dc77e41
 #8 [ffffb24b4d9c3a80] new_slab at ffffffff9dc99c1a
 #9 [ffffb24b4d9c3ae8] ___slab_alloc at ffffffff9dc9c6d9
#10 [ffffb24b4d9c3b40] exit_shadow_spine at ffffffffc08ef8cf [dm_persistent_data]
#11 [ffffb24b4d9c3b50] insert at ffffffffc08edfcc [dm_persistent_data]
#12 [ffffb24b4d9c3c30] sm_ll_mutate at ffffffffc08ea20e [dm_persistent_data]
#13 [ffffb24b4d9c3cd8] dm_kcopyd_zero at ffffffffc03f7a39 [dm_mod]
#14 [ffffb24b4d9c3ce8] schedule_zero at ffffffffc093d181 [dm_thin_pool]
#15 [ffffb24b4d9c3d40] process_cell at ffffffffc093d78c [dm_thin_pool]
#16 [ffffb24b4d9c3dc8] do_worker at ffffffffc093dce6 [dm_thin_pool]
#17 [ffffb24b4d9c3e98] process_one_work at ffffffff9daba4d4
#18 [ffffb24b4d9c3ed8] worker_thread at ffffffff9daba6ed
#19 [ffffb24b4d9c3f10] kthread at ffffffff9dac0a2d
#20 [ffffb24b4d9c3f50] ret_from_fork at ffffffff9e400202

This task is writing to a thin-pool volume that is used as the system's
swap partition. This is very suspicious, because from what I see in the
dm-thin code, every memory allocation made from dm-thin uses an explicit
GFP_NOIO/GFP_NOFS or an implicit memalloc_noio_save(), in order to avoid
deadlock in the recursive memory reclaim code path.
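
For readers unfamiliar with the scoped API mentioned above, here is a
minimal userspace sketch of the idea behind memalloc_noio_save(): a
per-task flag that makes the allocator drop the I/O and filesystem bits
from every request made inside the scope. All names and flag values
(toy_*, TOY_*) are hypothetical and do not match the kernel's
definitions; this is a toy model, not dm-thin or mm code.

```c
#include <assert.h>

/* Toy flag values; the real kernel constants differ. */
#define TOY_GFP_IO     0x1u
#define TOY_GFP_FS     0x2u
#define TOY_GFP_KERNEL (TOY_GFP_IO | TOY_GFP_FS)
#define TOY_PF_NOIO    0x1u

/* Stand-in for the per-task flags word (single-threaded model). */
static unsigned int toy_task_flags;

/* Enter a NOIO scope; return the previous state so scopes can nest. */
static unsigned int toy_noio_save(void)
{
	unsigned int old = toy_task_flags & TOY_PF_NOIO;

	toy_task_flags |= TOY_PF_NOIO;
	return old;
}

/* Leave a NOIO scope, putting back whatever state was saved on entry. */
static void toy_noio_restore(unsigned int old)
{
	toy_task_flags = (toy_task_flags & ~TOY_PF_NOIO) | old;
}

/* The mask the allocator would effectively use for a given request. */
static unsigned int toy_effective_gfp(unsigned int gfp)
{
	if (toy_task_flags & TOY_PF_NOIO)
		gfp &= ~(TOY_GFP_IO | TOY_GFP_FS);
	return gfp;
}

/* Nested scopes: the restriction ends only with the outermost scope. */
static void toy_demo(void)
{
	unsigned int outer = toy_noio_save();
	unsigned int inner = toy_noio_save();

	assert(toy_effective_gfp(TOY_GFP_KERNEL) == 0);
	toy_noio_restore(inner);
	assert(toy_effective_gfp(TOY_GFP_KERNEL) == 0);
	toy_noio_restore(outer);
	assert(toy_effective_gfp(TOY_GFP_KERNEL) == TOY_GFP_KERNEL);
}
```

Note that a scope like this only limits what reclaim is allowed to do.
It does not keep an allocation out of the slow path, which is where the
task in the trace above appears to be waiting (drain_all_pages ->
__flush_work -> wait_for_completion).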

I did a lot of testing and confirmed that the issue can be reproduced on
the latest upstream Linux v5.11-rc5+ kernel. I create two thin-pool
volumes: one is used as swap, and the other is put under heavy write I/O
pressure. If anonymous pages are swapped out to the first thin-pool
volume while I/O is hitting the second one, then after around 3 minutes
the whole system hangs, with no response and no kernel messages for more
than an hour, until I reset the machine.

My questions are:
- Can a thin-pool volume be used as a swap device?
- Is the behavior described above a bug, or an already known issue that
  should be avoided?

Thanks in advance.

Coly Li


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


* Re: [dm-devel] dm thin-volume hung as swap: bug or as-design ?
  2021-01-29 10:40 [dm-devel] dm thin-volume hung as swap: bug or as-design ? Coly Li
@ 2021-01-29 13:57 ` Alasdair G Kergon
  2021-01-29 14:11   ` Coly Li
  0 siblings, 1 reply; 3+ messages in thread
From: Alasdair G Kergon @ 2021-01-29 13:57 UTC (permalink / raw)
  To: Coly Li; +Cc: dm-devel

On Fri, Jan 29, 2021 at 06:40:06PM +0800, Coly Li wrote:
> Recently I received a report that a whole system hung and became
> unresponsive after running under I/O load for a while. What is special
> about the configuration is that a dm thin-pool volume is used as the
> system's swap partition.
> My questions are,
> - Can a thin-pool volume be used as swap device?

Yes in principle, but it won't get much testing as it's not 
necessarily a particularly sensible configuration.
- You'd normally prefer fully-pre-allocated disk space for swap
  and turn off the zeroing.

Is there some use-case where it does make more sense?

> - Is the behavior described above a bug, or an already known issue that
>   should be avoided?
 
Bug.

Alasdair


* Re: [dm-devel] dm thin-volume hung as swap: bug or as-design ?
  2021-01-29 13:57 ` Alasdair G Kergon
@ 2021-01-29 14:11   ` Coly Li
  0 siblings, 0 replies; 3+ messages in thread
From: Coly Li @ 2021-01-29 14:11 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: dm-devel

On 1/29/21 9:57 PM, Alasdair G Kergon wrote:
> On Fri, Jan 29, 2021 at 06:40:06PM +0800, Coly Li wrote:
>> Recently I received a report that a whole system hung and became
>> unresponsive after running under I/O load for a while. What is special
>> about the configuration is that a dm thin-pool volume is used as the
>> system's swap partition.
>> My questions are,
>> - Can a thin-pool volume be used as swap device?
> 
> Yes in principle, but it won't get much testing as it's not 
> necessarily a particularly sensible configuration.
> - You'd normally prefer fully-pre-allocated disk space for swap
>   and turn off the zeroing.
> 
> Is there some use-case where it does make more sense?
> 

What I see is a system with dozens of partitions, each created on top of
a thin-pool volume, including the swap partition. People simply use
thin-pool volumes this way, and the bug was triggered.


>> - Is the behavior described above a bug, or an already known issue that
>>   should be avoided?
>  
> Bug.
> 

Thanks for the confirmation. I tried changing all the memory allocations
to use GFP_NOIO or to run under memalloc_noio_save(), but the deadlock
still exists. My current suspicion is that it comes from memory
allocation in the kworkers, but this is only a guess and I have no
evidence.

If there is a patch that addresses this hang (thin-pool volume as swap),
I'd be glad to help test it, because my local environment can reproduce
the problem 100% of the time within 5 minutes.

Thanks.

Coly Li

