From: Michal Hocko <mhocko@kernel.org> To: Shakeel Butt <shakeelb@google.com> Cc: Yisheng Xie <ysxie@foxmail.com>, Andrew Morton <akpm@linux-foundation.org>, Johannes Weiner <hannes@cmpxchg.org>, Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>, riel@redhat.com, Linux MM <linux-mm@kvack.org>, LKML <linux-kernel@vger.kernel.org>, xieyisheng1@huawei.com, guohanjun@huawei.com, Xishi Qiu <qiuxishi@huawei.com> Subject: Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones Date: Mon, 13 Mar 2017 16:48:22 +0100 [thread overview] Message-ID: <20170313154822.GV31518@dhcp22.suse.cz> (raw) In-Reply-To: <CALvZod4NDM5i9ukWpNpnOLHKdOiPxSVmJmifT1cZ7vaazcJ89A@mail.gmail.com> On Mon 13-03-17 08:17:56, Shakeel Butt wrote: > On Mon, Mar 13, 2017 at 1:33 AM, Michal Hocko <mhocko@kernel.org> wrote: > > Please do not post new version after a single feedback and try to wait > > for more review to accumulate. This is in the 3rd version and it is not > > clear why it is still an RFC. > > > > On Sun 12-03-17 19:06:10, Yisheng Xie wrote: > >> From: Yisheng Xie <xieyisheng1@huawei.com> > >> > >> When we enter do_try_to_free_pages, the may_thrash is always clear, and > >> it will retry shrink zones to tap cgroup's reserves memory by setting > >> may_thrash when the former shrink_zones reclaim nothing. > >> > >> However, when memcg is disabled or on legacy hierarchy, it should not do > >> this useless retry at all, for we do not have any cgroup's reserves > >> memory to tap, and we have already done hard work but made no progress. > >> > >> To avoid this time costly and useless retrying, add a stub function > >> mem_cgroup_thrashed() and return true when memcg is disabled or on > >> legacy hierarchy. > > > > Have you actually seen this as a bad behavior? On which workload? Or > > have spotted this by the code review? > > > > Please note that more than _what_ it is more interesting _why_ the patch > > has been prepared. > > > > I agree the current additional round of reclaim is just lame because we > > are trying hard to control the retry logic from the page allocator which > > is a sufficient justification to fix this IMO. But I really hate the > > name. At this point we do not have any idea that the memcg is trashing > > as the name of the function suggests. > > > > All of them simply might not have any reclaimable pages. So I would > > suggest either a better name e.g. memcg_allow_lowmem_reclaim() or, > > preferably, fix this properly. E.g. something like the following. > > --- > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > index bae698484e8e..989ba9761921 100644 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -99,6 +99,9 @@ struct scan_control { > > /* Can cgroups be reclaimed below their normal consumption range? */ > > unsigned int may_thrash:1; > > > > + /* Did we have any memcg protected by the low limit */ > > + unsigned int memcg_low_protection:1; > > + > > unsigned int hibernation_mode:1; > > > > /* One of the zones is ready for compaction */ > > @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc) > > if (mem_cgroup_low(root, memcg)) { > > if (!sc->may_thrash) > > continue; > > + sc->memcg_low_protection = true; > > I think you wanted to put this statement before the continue otherwise > it will just disable the sc->may_thrash (second reclaim pass) > altogether. yes, of course, just a quick and dirty hack to show my point. Sorry about the confusion. > > mem_cgroup_events(memcg, MEMCG_LOW, 1); > > } > > > > @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist, > > return 1; > > > > /* Untapped cgroup reserves? Don't OOM, retry. */ > > - if (!sc->may_thrash) { > > + if ( sc->memcg_low_protection && !sc->may_thrash) { > > sc->priority = initial_priority; > > sc->may_thrash = 1; > > goto retry; > > -- > > Michal Hocko > > SUSE Labs > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs
WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org> To: Shakeel Butt <shakeelb@google.com> Cc: Yisheng Xie <ysxie@foxmail.com>, Andrew Morton <akpm@linux-foundation.org>, Johannes Weiner <hannes@cmpxchg.org>, Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>, riel@redhat.com, Linux MM <linux-mm@kvack.org>, LKML <linux-kernel@vger.kernel.org>, xieyisheng1@huawei.com, guohanjun@huawei.com, Xishi Qiu <qiuxishi@huawei.com> Subject: Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones Date: Mon, 13 Mar 2017 16:48:22 +0100 [thread overview] Message-ID: <20170313154822.GV31518@dhcp22.suse.cz> (raw) In-Reply-To: <CALvZod4NDM5i9ukWpNpnOLHKdOiPxSVmJmifT1cZ7vaazcJ89A@mail.gmail.com> On Mon 13-03-17 08:17:56, Shakeel Butt wrote: > On Mon, Mar 13, 2017 at 1:33 AM, Michal Hocko <mhocko@kernel.org> wrote: > > Please do not post new version after a single feedback and try to wait > > for more review to accumulate. This is in the 3rd version and it is not > > clear why it is still an RFC. > > > > On Sun 12-03-17 19:06:10, Yisheng Xie wrote: > >> From: Yisheng Xie <xieyisheng1@huawei.com> > >> > >> When we enter do_try_to_free_pages, the may_thrash is always clear, and > >> it will retry shrink zones to tap cgroup's reserves memory by setting > >> may_thrash when the former shrink_zones reclaim nothing. > >> > >> However, when memcg is disabled or on legacy hierarchy, it should not do > >> this useless retry at all, for we do not have any cgroup's reserves > >> memory to tap, and we have already done hard work but made no progress. > >> > >> To avoid this time costly and useless retrying, add a stub function > >> mem_cgroup_thrashed() and return true when memcg is disabled or on > >> legacy hierarchy. > > > > Have you actually seen this as a bad behavior? On which workload? Or > > have spotted this by the code review? > > > > Please note that more than _what_ it is more interesting _why_ the patch > > has been prepared. > > > > I agree the current additional round of reclaim is just lame because we > > are trying hard to control the retry logic from the page allocator which > > is a sufficient justification to fix this IMO. But I really hate the > > name. At this point we do not have any idea that the memcg is trashing > > as the name of the function suggests. > > > > All of them simply might not have any reclaimable pages. So I would > > suggest either a better name e.g. memcg_allow_lowmem_reclaim() or, > > preferably, fix this properly. E.g. something like the following. > > --- > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > index bae698484e8e..989ba9761921 100644 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -99,6 +99,9 @@ struct scan_control { > > /* Can cgroups be reclaimed below their normal consumption range? */ > > unsigned int may_thrash:1; > > > > + /* Did we have any memcg protected by the low limit */ > > + unsigned int memcg_low_protection:1; > > + > > unsigned int hibernation_mode:1; > > > > /* One of the zones is ready for compaction */ > > @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc) > > if (mem_cgroup_low(root, memcg)) { > > if (!sc->may_thrash) > > continue; > > + sc->memcg_low_protection = true; > > I think you wanted to put this statement before the continue otherwise > it will just disable the sc->may_thrash (second reclaim pass) > altogether. yes, of course, just a quick and dirty hack to show my point. Sorry about the confusion. > > mem_cgroup_events(memcg, MEMCG_LOW, 1); > > } > > > > @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist, > > return 1; > > > > /* Untapped cgroup reserves? Don't OOM, retry. */ > > - if (!sc->may_thrash) { > > + if ( sc->memcg_low_protection && !sc->may_thrash) { > > sc->priority = initial_priority; > > sc->may_thrash = 1; > > goto retry; > > -- > > Michal Hocko > > SUSE Labs > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-03-13 15:49 UTC|newest] Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-03-12 11:06 [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones Yisheng Xie 2017-03-12 11:06 ` Yisheng Xie 2017-03-12 20:20 ` Shakeel Butt 2017-03-12 20:20 ` Shakeel Butt 2017-03-13 8:33 ` Michal Hocko 2017-03-13 8:33 ` Michal Hocko 2017-03-13 12:00 ` Yisheng Xie 2017-03-13 12:00 ` Yisheng Xie 2017-03-13 15:17 ` Shakeel Butt 2017-03-13 15:17 ` Shakeel Butt 2017-03-13 15:48 ` Michal Hocko [this message] 2017-03-13 15:48 ` Michal Hocko
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20170313154822.GV31518@dhcp22.suse.cz \ --to=mhocko@kernel.org \ --cc=akpm@linux-foundation.org \ --cc=guohanjun@huawei.com \ --cc=hannes@cmpxchg.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mgorman@suse.de \ --cc=qiuxishi@huawei.com \ --cc=riel@redhat.com \ --cc=shakeelb@google.com \ --cc=vbabka@suse.cz \ --cc=xieyisheng1@huawei.com \ --cc=ysxie@foxmail.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.