All of lore.kernel.org
 help / color / mirror / Atom feed
* zswap z3fold + memory offline = infinite loop
@ 2020-05-13  0:36 Qian Cai
  2020-05-13  7:31 ` David Hildenbrand
  2020-05-13  8:28 ` Vitaly Wool
  0 siblings, 2 replies; 6+ messages in thread
From: Qian Cai @ 2020-05-13  0:36 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Michal Hocko, David Hildenbrand, Linux-MM, LKML, Minchan Kim,
	Seth Jennings, Dan Streetman

Put zswap z3fold pages into the memory and then offline those memory would trigger an infinite loop here in

__offline_pages() --> do_migrate_range() because there is no error handling,

			if (pfn) {
				/*
				 * TODO: fatal migration failures should bail
				 * out
				 */
				do_migrate_range(pfn, end_pfn);

There, isolate_movable_page() will always return -EBUSY  because,

	if (!mapping->a_ops->isolate_page(page, mode))
		goto out_no_isolated;

i.e., z3fold_page_isolate() will always return false because,

zhdr->mapped_count == 2

It should be easy to reproduce. Otherwise, use this one,

https://github.com/cailca/linux-mm/blob/master/random.c

and then watch the console burning with those,

[12661.793667][T566417] failed to isolate pfn 1045b2
[12661.793745][T566417] page:c00c000004116c80 refcount:2 mapcount:0 mapping:00000000999f9672 index:0x0
[12661.793865][T566417] mapping->a_ops:z3fold_aops
[12661.793919][T566417] flags: 0x3fffc000000000()
[12661.793969][T566417] raw: 003fffc000000000 c00c000003cef388 c00c000006b0da08 c000001275b87f6a
[12661.794071][T566417] raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
[12661.794158][T566417] page dumped because: isolation failed
[12661.794226][T566417] page_owner tracks the page as allocated
[12661.794292][T566417] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12800(GFP_NOWAIT|__GFP_NOWARN|__GFP_NORETRY)
[12661.794463][T566417]  prep_new_page+0x3d0/0x450
[12661.794508][T566417]  get_page_from_freelist+0x1bb8/0x27c0
[12661.794575][T566417]  __alloc_pages_slowpath.constprop.60+0x240/0x15a0
[12661.794654][T566417]  __alloc_pages_nodemask+0x520/0x650
[12661.794715][T566417]  alloc_pages_current+0xbc/0x140
[12661.794772][T566417]  z3fold_zpool_malloc+0x6cc/0xe20
[12661.794826][T566417]  zpool_malloc+0x34/0x50
[12661.794888][T566417]  zswap_frontswap_store+0x60c/0xe20
[12661.794942][T566417]  __frontswap_store+0x128/0x330
[12661.794995][T566417]  swap_writepage+0x58/0x110
[12661.795048][T566417]  pageout+0x16c/0xa40
[12661.795092][T566417]  shrink_page_list+0x1ab4/0x2490
[12661.795155][T566417]  shrink_inactive_list+0x25c/0x710
[12661.795206][T566417]  shrink_lruvec+0x444/0x1260
[12661.795274][T566417]  shrink_node+0x288/0x9a0
[12661.795330][T566417]  do_try_to_free_pages+0x158/0x640
[12661.795383][T566417] page last free stack trace:
[12661.795437][T566417]  free_pcp_prepare+0x52c/0x590
[12661.795493][T566417]  free_unref_page+0x38/0xf0
[12662.156109][T566417]  free_z3fold_page+0x58/0x120
[12662.156131][T566417]  free_pages_work+0x148/0x1c0
[12662.156195][T566417]  process_one_work+0x310/0x900
[12662.156257][T566417]  worker_thread+0x78/0x530
[12662.156306][T566417]  kthread+0x1c4/0x1d0
[12662.156354][T566417]  ret_from_kernel_thread+0x5c/0x74


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: zswap z3fold + memory offline = infinite loop
  2020-05-13  0:36 zswap z3fold + memory offline = infinite loop Qian Cai
@ 2020-05-13  7:31 ` David Hildenbrand
  2020-05-13  8:28 ` Vitaly Wool
  1 sibling, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2020-05-13  7:31 UTC (permalink / raw)
  To: Qian Cai, Vitaly Wool
  Cc: Michal Hocko, Linux-MM, LKML, Minchan Kim, Seth Jennings, Dan Streetman

On 13.05.20 02:36, Qian Cai wrote:
> Put zswap z3fold pages into the memory and then offline those memory would trigger an infinite loop here in
> 
> __offline_pages() --> do_migrate_range() because there is no error handling,

That's usually not the case, that's why we - nowadays - basically never
see endless loops anymore, except when some code is doing something
wrong with movable pages. The TODO below is somewhat misleading.

> 
> 			if (pfn) {
> 				/*
> 				 * TODO: fatal migration failures should bail
> 				 * out
> 				 */
> 				do_migrate_range(pfn, end_pfn);
> 
> There, isolate_movable_page() will always return -EBUSY  because,
> 
> 	if (!mapping->a_ops->isolate_page(page, mode))
> 		goto out_no_isolated;
> 
> i.e., z3fold_page_isolate() will always return false because,
> 
> zhdr->mapped_count == 2

Whenever we have such endless loops when offlining it's either because
1. Somebody puts unmovable data into the movable zone
2. Somebody marks unmovable data as movable
3. Somebody turns movable data into unmovable data although the
pageblock is isolated

After start_isolate_page_range() we are sure that we only have movable
data in the isolated page range. The pageblocks are isolated, so e.g.,
no new allocations can end up on that memory.

Offlining code can therefore assume that all memory is in fact movable
and can be moved eventually. It might not always work on the first
attempt, that's why we retry.

We don't expect permanent migration failure, because that would mean the
page is unmovable.

I can see that mm/z3fold.c uses  __SetPageMovable(). So the question is,
why are these pages permanently unmovable.


> 
> It should be easy to reproduce. Otherwise, use this one,
> 
> https://github.com/cailca/linux-mm/blob/master/random.c
> 
> and then watch the console burning with those,
> 
> [12661.793667][T566417] failed to isolate pfn 1045b2
> [12661.793745][T566417] page:c00c000004116c80 refcount:2 mapcount:0 mapping:00000000999f9672 index:0x0
> [12661.793865][T566417] mapping->a_ops:z3fold_aops
> [12661.793919][T566417] flags: 0x3fffc000000000()
> [12661.793969][T566417] raw: 003fffc000000000 c00c000003cef388 c00c000006b0da08 c000001275b87f6a
> [12661.794071][T566417] raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
> [12661.794158][T566417] page dumped because: isolation failed
> [12661.794226][T566417] page_owner tracks the page as allocated
> [12661.794292][T566417] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12800(GFP_NOWAIT|__GFP_NOWARN|__GFP_NORETRY)
> [12661.794463][T566417]  prep_new_page+0x3d0/0x450
> [12661.794508][T566417]  get_page_from_freelist+0x1bb8/0x27c0
> [12661.794575][T566417]  __alloc_pages_slowpath.constprop.60+0x240/0x15a0
> [12661.794654][T566417]  __alloc_pages_nodemask+0x520/0x650
> [12661.794715][T566417]  alloc_pages_current+0xbc/0x140
> [12661.794772][T566417]  z3fold_zpool_malloc+0x6cc/0xe20
> [12661.794826][T566417]  zpool_malloc+0x34/0x50
> [12661.794888][T566417]  zswap_frontswap_store+0x60c/0xe20
> [12661.794942][T566417]  __frontswap_store+0x128/0x330
> [12661.794995][T566417]  swap_writepage+0x58/0x110
> [12661.795048][T566417]  pageout+0x16c/0xa40
> [12661.795092][T566417]  shrink_page_list+0x1ab4/0x2490
> [12661.795155][T566417]  shrink_inactive_list+0x25c/0x710
> [12661.795206][T566417]  shrink_lruvec+0x444/0x1260
> [12661.795274][T566417]  shrink_node+0x288/0x9a0
> [12661.795330][T566417]  do_try_to_free_pages+0x158/0x640
> [12661.795383][T566417] page last free stack trace:
> [12661.795437][T566417]  free_pcp_prepare+0x52c/0x590
> [12661.795493][T566417]  free_unref_page+0x38/0xf0
> [12662.156109][T566417]  free_z3fold_page+0x58/0x120
> [12662.156131][T566417]  free_pages_work+0x148/0x1c0
> [12662.156195][T566417]  process_one_work+0x310/0x900
> [12662.156257][T566417]  worker_thread+0x78/0x530
> [12662.156306][T566417]  kthread+0x1c4/0x1d0
> [12662.156354][T566417]  ret_from_kernel_thread+0x5c/0x74
> 


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: zswap z3fold + memory offline = infinite loop
  2020-05-13  0:36 zswap z3fold + memory offline = infinite loop Qian Cai
  2020-05-13  7:31 ` David Hildenbrand
@ 2020-05-13  8:28 ` Vitaly Wool
  2020-05-19  3:50   ` Qian Cai
  1 sibling, 1 reply; 6+ messages in thread
From: Vitaly Wool @ 2020-05-13  8:28 UTC (permalink / raw)
  To: Qian Cai
  Cc: Michal Hocko, David Hildenbrand, Linux-MM, LKML, Minchan Kim,
	Seth Jennings, Dan Streetman

[-- Attachment #1: Type: text/plain, Size: 1153 bytes --]

On Wed, May 13, 2020, 2:36 AM Qian Cai <cai@lca.pw> wrote:

> Put zswap z3fold pages into the memory and then offline those memory would
> trigger an infinite loop here in
>
> __offline_pages() --> do_migrate_range() because there is no error
> handling,
>
>                         if (pfn) {
>                                 /*
>                                  * TODO: fatal migration failures should
> bail
>                                  * out
>                                  */
>                                 do_migrate_range(pfn, end_pfn);
>
> There, isolate_movable_page() will always return -EBUSY  because,
>
>         if (!mapping->a_ops->isolate_page(page, mode))
>                 goto out_no_isolated;
>
> i.e., z3fold_page_isolate() will always return false because,
>
> zhdr->mapped_count == 2
>

So who mapped these pages? The whole zswap operation presumes that objects
are mapped for a short while to run some I/O and so, most of the time
zhdr->mapped_count would be 0.

Removing that check in ->isolate() is not a big deal, but ->migratepage()
shall not allow actual migration anyway if there are mapped objects.

~Vitaly

[-- Attachment #2: Type: text/html, Size: 1744 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: zswap z3fold + memory offline = infinite loop
  2020-05-13  8:28 ` Vitaly Wool
@ 2020-05-19  3:50   ` Qian Cai
  2020-05-20  8:23     ` Vitaly Wool
  0 siblings, 1 reply; 6+ messages in thread
From: Qian Cai @ 2020-05-19  3:50 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Michal Hocko, David Hildenbrand, Linux-MM, LKML, Minchan Kim,
	Seth Jennings, Dan Streetman

On Wed, May 13, 2020 at 4:28 AM Vitaly Wool <vitaly.wool@konsulko.com> wrote:
>
>
>
> On Wed, May 13, 2020, 2:36 AM Qian Cai <cai@lca.pw> wrote:
>>
>> Put zswap z3fold pages into the memory and then offline those memory would trigger an infinite loop here in
>>
>> __offline_pages() --> do_migrate_range() because there is no error handling,
>>
>>                         if (pfn) {
>>                                 /*
>>                                  * TODO: fatal migration failures should bail
>>                                  * out
>>                                  */
>>                                 do_migrate_range(pfn, end_pfn);
>>
>> There, isolate_movable_page() will always return -EBUSY  because,
>>
>>         if (!mapping->a_ops->isolate_page(page, mode))
>>                 goto out_no_isolated;
>>
>> i.e., z3fold_page_isolate() will always return false because,
>>
>> zhdr->mapped_count == 2
>
>
> So who mapped these pages? The whole zswap operation presumes that objects are mapped for a short while to run some I/O and so, most of the time zhdr->mapped_count would be 0.

I have no clue why those pages have been mapped for so long, but it is
trivial to reproduce using the above reproducer. Also, zbud has no
such issue. Alternatively, if you could send me some debug patches to
narrow it down, I'll be happy to run for you.

>
> Removing that check in ->isolate() is not a big deal, but ->migratepage() shall not allow actual migration anyway if there are mapped objects.

Is that worse than an endless loop here?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: zswap z3fold + memory offline = infinite loop
  2020-05-19  3:50   ` Qian Cai
@ 2020-05-20  8:23     ` Vitaly Wool
  2020-05-20 19:05       ` Qian Cai
  0 siblings, 1 reply; 6+ messages in thread
From: Vitaly Wool @ 2020-05-20  8:23 UTC (permalink / raw)
  To: Qian Cai
  Cc: Michal Hocko, David Hildenbrand, Linux-MM, LKML, Minchan Kim,
	Seth Jennings, Dan Streetman

On Tue, May 19, 2020 at 5:50 AM Qian Cai <cai@lca.pw> wrote:
<snip>
>
> >
> > Removing that check in ->isolate() is not a big deal, but ->migratepage() shall not allow actual migration anyway if there are mapped objects.
>
> Is that worse than an endless loop here?

Well, let's figure if there really has to be an endless loop. Would
you mind retesting with
https://marc.info/?l=linux-mm&m=158996286816704&w=2?

Best regards,
   Vitaly

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: zswap z3fold + memory offline = infinite loop
  2020-05-20  8:23     ` Vitaly Wool
@ 2020-05-20 19:05       ` Qian Cai
  0 siblings, 0 replies; 6+ messages in thread
From: Qian Cai @ 2020-05-20 19:05 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Michal Hocko, David Hildenbrand, Linux-MM, LKML, Minchan Kim,
	Seth Jennings, Dan Streetman

On Wed, May 20, 2020 at 10:23:38AM +0200, Vitaly Wool wrote:
> On Tue, May 19, 2020 at 5:50 AM Qian Cai <cai@lca.pw> wrote:
> <snip>
> >
> > >
> > > Removing that check in ->isolate() is not a big deal, but ->migratepage() shall not allow actual migration anyway if there are mapped objects.
> >
> > Is that worse than an endless loop here?
> 
> Well, let's figure if there really has to be an endless loop. Would
> you mind retesting with
> https://marc.info/?l=linux-mm&m=158996286816704&w=2?

The patch does not help. Running the reproducer the second time would still
trigger the looping.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-05-20 19:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-13  0:36 zswap z3fold + memory offline = infinite loop Qian Cai
2020-05-13  7:31 ` David Hildenbrand
2020-05-13  8:28 ` Vitaly Wool
2020-05-19  3:50   ` Qian Cai
2020-05-20  8:23     ` Vitaly Wool
2020-05-20 19:05       ` Qian Cai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.