From: Michal Hocko <mhocko@kernel.org>
To: Balbir Singh <bsingharora@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Boris Zhmurov <bb@kernelpanic.ru>,
	"Christopher S. Aker" <caker@theshore.net>,
	Donald Buczek <buczek@molgen.mpg.de>,
	Paul Menzel <pmenzel@molgen.mpg.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH] mm, vmscan: add cond_resched into shrink_node_memcg
Date: Mon, 5 Dec 2016 13:49:55 +0100
Message-ID: <20161205124955.GG30758@dhcp22.suse.cz>
In-Reply-To: <CAKTCnz=K8QG69tKB8yStiZypBzcvnE=wW+25xuo9f_HZNzPtDg@mail.gmail.com>

[CC Paul - sorry, I've tried to save you from more emails...]

On Mon 05-12-16 23:44:27, Balbir Singh wrote:
> >
> > Hi,
> > there were multiple reports of similar RCU stalls. Only Boris has
> > confirmed that this patch helps in his workload. Others might see a
> > slightly different issue and that should be investigated if it is the
> > case. As pointed out by Paul [1] cond_resched might not be sufficient
> > to silence RCU stalls because that would require real scheduling.
> > This is a separate problem, though, and Paul is working with Peter [2]
> > to resolve it.
> >
> > Anyway, I believe that this patch should be a good start because it
> > really seems that nr_taken=0 during the LRU isolation can be triggered
> > in real life. All reporters agree that they started seeing this issue
> > when moving to the 4.8 kernel, which might be just a coincidence or a
> > different behavior of some subsystem. Well, MM has moved from zone to
> > node reclaim but I couldn't find any direct relation to that
> > change.
> >
> > [1] http://lkml.kernel.org/r/20161130142955.GS3924@linux.vnet.ibm.com
> > [2] http://lkml.kernel.org/r/20161201124024.GB3924@linux.vnet.ibm.com
> >
> >  mm/vmscan.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c05f00042430..c4abf08861d2 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2362,6 +2362,8 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
> >  			}
> >  		}
> >
> > +		cond_resched();
> > +
>
> I see a cond_resched_rcu_qs() as a part of linux next inside the while
> (nr[..]) loop.

This is a leftover from Paul's initial attempt to fix this issue. I
expect him to drop his patch from his tree. He has considered it
experimental anyway.

> Do we need this as well?

Paul is working with Peter to make cond_resched general and cover RCU
stalls even when cond_resched doesn't schedule because there is no
runnable task.
-- 
Michal Hocko
SUSE Labs
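
[Illustrative sketch, not part of the original message.] The quoted
patch puts a rescheduling point in the reclaim loop so that it is
reached even when a pass isolates nothing (nr_taken=0). A rough
user-space analogue of that pattern is sketched below. It assumes that
sched_yield() may stand in for the kernel's cond_resched(), which it
only approximates (cond_resched() reschedules only when needed). The
names scan_with_yield, scan_stats, scan_fn and empty_scan are
hypothetical and do not appear in mm/vmscan.c.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for one reclaim pass; returns items "taken". */
typedef size_t (*scan_fn)(void *ctx);

struct scan_stats {
	size_t passes;	/* loop iterations executed */
	size_t taken;	/* total items reported by the scan callback */
	size_t yields;	/* times the loop offered the CPU to others */
};

/*
 * Run up to max_passes scan passes. The yield is unconditional, so it
 * is also reached on passes that take nothing; that is the case the
 * patch is concerned with.
 */
static struct scan_stats scan_with_yield(scan_fn scan, void *ctx,
					 size_t max_passes)
{
	struct scan_stats st = { 0, 0, 0 };

	assert(scan != NULL);

	for (size_t i = 0; i < max_passes; i++) {
		st.taken += scan(ctx);
		st.passes++;

		/* User-space approximation of cond_resched(). */
		sched_yield();
		st.yields++;
	}
	return st;
}

/* A pass that never isolates anything, mimicking nr_taken == 0. */
static size_t empty_scan(void *ctx)
{
	(void)ctx;
	return 0;
}
```

Calling scan_with_yield(empty_scan, NULL, 8) runs eight passes that
take nothing and still yields once per pass. As the message notes, a
yield that does not lead to a real context switch may still not be
enough to silence RCU stall warnings; that is the separate problem Paul
and Peter are working on.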