From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	qiuxishi@huawei.com, Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm, memory_hotplug: do not fail offlining too early
Date: Wed, 13 Sep 2017 13:41:20 +0200
Message-ID: <9fad7246-c634-18bb-78f9-b95376c009da@suse.cz>
In-Reply-To: <20170911081714.4zc33r7wlj2nnbho@dhcp22.suse.cz>

On 09/11/2017 10:17 AM, Michal Hocko wrote:
> On Fri 08-09-17 19:26:06, Vlastimil Babka wrote:
>> On 09/04/2017 10:21 AM, Michal Hocko wrote:
>>> From: Michal Hocko <mhocko@suse.com>
>>>
>>> Fix this by removing the max retry count and only rely on the timeout
>>> resp. interruption by a signal from the userspace. Also retry rather
>>> than fail when check_pages_isolated sees some !free pages because those
>>> could be a result of the race as well.
>>>
>>> Signed-off-by: Michal Hocko <mhocko@suse.com>
>>
>> Even within a movable node where has_unmovable_pages() is a non-issue, you could
>> have pinned movable pages where the pinning is not temporary.
>
> Who would pin those pages? Such a page would be unreclaimable as well
> and thus a memory leak and I would argue it would be a bug.

I don't know who exactly, but generally it's a problem for CMA and a
reason why there was some effort from PeterZ to introduce an API for
long-term pinning.

>> So after this
>> patch, this will really keep retrying forever. I'm not saying it's wrong, just
>> pointing it out, since the changelog seems to assume there would be only
>> temporary failures possible and thus unbound retries are always correct.
>> The obvious problem if we wanted to avoid this, is how to recognize
>> non-temporary failures...
>
> Yes, we should be able to distinguish the two and hopefully we can teach
> the migration code to distinguish between EBUSY (likely permanent) and
> EAGAIN (temporary) failure. This sounds like something we should aim for
> longterm I guess. Anyway as I've said in other email. If somebody really
> wants to have a guarantee of a bounded retry then it is trivial to set up
> an alarm and send a signal itself to bail out.

Sure, I would just be careful about not breaking existing userspace
(udev?) when offline is triggered via ACPI from some management interface
(or whatever the exact mechanism is).

> Do you think that the changelog should be more clear about this?

It certainly wouldn't hurt :)
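
Note on the "set up an alarm and send a signal itself" suggestion quoted above: a minimal userspace sketch of what that might look like follows. It is not from the patch or the thread. The helper name `offline_with_deadline` and the exception `OfflineTimeout` are hypothetical, and it assumes the offline request is a write of "offline" to a memory block's sysfs state file, and that the kernel side returns from the interrupted write once a signal is pending, as the patch description says.

```python
import os
import signal


class OfflineTimeout(Exception):
    """Hypothetical marker: the offline request exceeded its deadline."""


def _on_alarm(signum, frame):
    # Raising from the handler stops Python from transparently restarting
    # the interrupted system call (see PEP 475).
    raise OfflineTimeout()


def offline_with_deadline(state_path, seconds):
    """Write "offline" to state_path, giving up after `seconds`.

    Returns True on success and False if the deadline expired.
    Other failures (for example EBUSY) propagate as OSError.
    Must be called from the main thread, since it installs a signal handler.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        fd = os.open(state_path, os.O_WRONLY)
        try:
            os.write(fd, b"offline")
        finally:
            os.close(fd)
        return True
    except OfflineTimeout:
        return False
    finally:
        # Disarm the timer and restore whatever handler was installed before.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

A caller would pass something like `/sys/devices/system/memory/memoryN/state` (N being a placeholder block number) and a timeout in seconds. Whether existing tools such as udev rules would tolerate an unbounded wait without a wrapper of this kind is exactly the open question raised in the reply.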