From: Michael Ellerman <mpe@ellerman.id.au>
To: Michal Hocko <mhocko@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Reza Arbab <arbab@linux.vnet.ibm.com>,
Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
qiuxishi@huawei.com, Igor Mammedov <imammedo@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 1/2] mm, memory_hotplug: do not fail offlining too early
Date: Tue, 10 Oct 2017 23:05:08 +1100
Message-ID: <87bmlfw6mj.fsf@concordia.ellerman.id.au>
In-Reply-To: <20170918070834.13083-2-mhocko@kernel.org>
Michal Hocko <mhocko@kernel.org> writes:
> From: Michal Hocko <mhocko@suse.com>
>
> Memory offlining can fail just too eagerly under a heavy memory pressure.
>
> [ 5410.336792] page:ffffea22a646bd00 count:255 mapcount:252 mapping:ffff88ff926c9f38 index:0x3
> [ 5410.336809] flags: 0x9855fe40010048(uptodate|active|mappedtodisk)
> [ 5410.336811] page dumped because: isolation failed
> [ 5410.336813] page->mem_cgroup:ffff8801cd662000
> [ 5420.655030] memory offlining [mem 0x18b580000000-0x18b5ffffffff] failed
>
> Isolation has failed here because the page is not on LRU. Most probably
> because it was on the pcp LRU cache or it has been removed from the LRU
> already but it hasn't been freed yet. In both cases the page doesn't look
> non-migrable so retrying more makes sense.
This breaks offline for me.
Prior to this commit:

  /sys/devices/system/memory/memory0# time echo 0 > online
  -bash: echo: write error: Device or resource busy

  real	0m0.001s
  user	0m0.000s
  sys	0m0.001s

After:

  /sys/devices/system/memory/memory0# time echo 0 > online
  -bash: echo: write error: Device or resource busy

  real	2m0.009s
  user	0m0.000s
  sys	1m25.035s

There's no way that block can be removed, since it contains the kernel text,
so it should fail instantly, which it used to.
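
For anyone who wants to repeat the timing comparison, here is a minimal sketch. It assumes bash and, for a real memory block, root. `try_offline` is a made-up helper name for illustration and is not part of the kernel tree or of any tool mentioned in this thread. It writes 0 to the `online` file under the directory it is given and reports whether the write succeeded and roughly how long it took, in whole seconds.

```shell
#!/bin/bash
# Hypothetical helper (not from the kernel tree): attempt to offline one
# memory block and report the outcome plus the elapsed wall-clock time.
# The block directory is passed in, so nothing is hard-coded.
try_offline() {
    local block_dir=$1
    local start end result

    start=$(date +%s)
    # Writing 0 to "online" asks the kernel to offline the block.
    # The braces keep any shell or echo error message off the terminal.
    if { echo 0 > "$block_dir/online"; } 2>/dev/null; then
        result=offlined
    else
        result=failed
    fi
    end=$(date +%s)

    # Whole-second resolution only; enough to tell ~0s from ~2 minutes.
    echo "$result after $((end - start))s"
}
```

Run as root, `try_offline /sys/devices/system/memory/memory0` would be the equivalent of the `time echo 0 > online` commands above. The one-second resolution is coarse, but it is enough to tell an immediate failure from one that takes two minutes.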
With commit 3aa2823fdf66 ("mm, memory_hotplug: remove timeout from
__offline_memory") also applied, it appears to just get stuck forever,
and I get lots of:
[ 1232.112953] INFO: task kworker/3:0:4609 blocked for more than 120 seconds.
[ 1232.113067] Not tainted 4.14.0-rc4-gcc6-next-20171009-g49827b9 #1
[ 1232.113183] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1232.113319] kworker/3:0 D11984 4609 2 0x00000800
[ 1232.113416] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
[ 1232.113531] Call Trace:
[ 1232.113579] [c0000000fb2db7a0] [c0000000fb2db900] 0xc0000000fb2db900 (unreliable)
[ 1232.113717] [c0000000fb2db970] [c00000000001c964] __switch_to+0x304/0x6e0
[ 1232.113840] [c0000000fb2dba10] [c000000000a408c0] __schedule+0x2e0/0xa80
[ 1232.113978] [c0000000fb2dbae0] [c000000000a410a8] schedule+0x48/0xc0
[ 1232.114113] [c0000000fb2dbb10] [c000000000a44d88] rwsem_down_read_failed+0x128/0x1b0
[ 1232.114269] [c0000000fb2dbb70] [c0000000001696a8] __percpu_down_read+0x108/0x110
[ 1232.114426] [c0000000fb2dbba0] [c00000000032e498] get_online_mems+0x68/0x80
[ 1232.115487] [c0000000fb2dbbc0] [c0000000002c82ec] memcg_create_kmem_cache+0x4c/0x190
[ 1232.115651] [c0000000fb2dbc60] [c0000000003483b8] memcg_kmem_cache_create_func+0x38/0xf0
[ 1232.115809] [c0000000fb2dbc90] [c000000000121594] process_one_work+0x2b4/0x590
[ 1232.115964] [c0000000fb2dbd20] [c000000000121908] worker_thread+0x98/0x5d0
[ 1232.116095] [c0000000fb2dbdc0] [c00000000012a134] kthread+0x164/0x1b0
[ 1232.116229] [c0000000fb2dbe30] [c00000000000bae0] ret_from_kernel_thread+0x5c/0x7c
cheers