From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> To: "Arkadiusz Miśkiewicz" <a.miskiewicz@gmail.com>, "Andrew Morton" <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org>, cgroups@vger.kernel.org, Aleksa Sarai <asarai@suse.de>, Jay Kamat <jgkamat@fb.com>, Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@suse.com>, Johannes Weiner <hannes@cmpxchg.org>, linux-kernel@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>, linux-mm <linux-mm@kvack.org> Subject: [PATCH v2] oom, oom_reaper: do not enqueue same task twice Date: Sat, 26 Jan 2019 22:10:52 +0900 [thread overview] Message-ID: <1d161137-55a5-126f-b47e-b2625bd798ca@i-love.sakura.ne.jp> (raw) In-Reply-To: <2b0c7d6c-c58a-da7d-6f0a-4900694ec2d3@gmail.com> On 2019/01/26 20:29, Arkadiusz Miśkiewicz wrote: > On 26/01/2019 12:09, Tetsuo Handa wrote: >> Arkadiusz, will you try this patch? > > > Works. Several tries and always getting 0 pids.current after ~1s. > Thank you for testing. I updated this patch to use tsk->signal->oom_mm (a snapshot of tsk->mm saved by mark_oom_victim(tsk)) rather than raw tsk->mm so that we don't need to worry about possibility of changing tsk->mm across multiple wake_oom_reaper(tsk) calls. From 9c9e935fc038342c48461aabca666f1b544e32b1 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Sat, 26 Jan 2019 21:57:25 +0900 Subject: [PATCH v2] oom, oom_reaper: do not enqueue same task twice Arkadiusz reported that enabling memcg's group oom killing causes strange memcg statistics where there is no task in a memcg despite the number of tasks in that memcg is not 0. It turned out that there is a bug in wake_oom_reaper() which allows enqueuing same task twice which makes impossible to decrease the number of tasks in that memcg due to a refcount leak. This bug existed since the OOM reaper became invokable from task_will_free_mem(current) path in out_of_memory() in Linux 4.7, but memcg's group oom killing made it easier to trigger this bug by calling wake_oom_reaper() on the same task from one out_of_memory() request. Fix this bug using an approach used by commit 855b018325737f76 ("oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task"). As a side effect of this patch, this patch also avoids enqueuing multiple threads sharing memory via task_will_free_mem(current) path. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Tested-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Fixes: af8e15cc85a25315 ("oom, oom_reaper: do not enqueue task if it is on the oom_reaper_list head") --- mm/oom_kill.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index f0e8cd9..057bfee 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -505,14 +505,6 @@ bool __oom_reap_task_mm(struct mm_struct *mm) struct vm_area_struct *vma; bool ret = true; - /* - * Tell all users of get_user/copy_from_user etc... that the content - * is no longer stable. No barriers really needed because unmapping - * should imply barriers already and the reader would hit a page fault - * if it stumbled over a reaped memory. - */ - set_bit(MMF_UNSTABLE, &mm->flags); - for (vma = mm->mmap ; vma; vma = vma->vm_next) { if (!can_madv_dontneed_vma(vma)) continue; @@ -647,8 +639,13 @@ static int oom_reaper(void *unused) static void wake_oom_reaper(struct task_struct *tsk) { - /* tsk is already queued? */ - if (tsk == oom_reaper_list || tsk->oom_reaper_list) + /* + * Tell all users of get_user/copy_from_user etc... that the content + * is no longer stable. No barriers really needed because unmapping + * should imply barriers already and the reader would hit a page fault + * if it stumbled over a reaped memory. + */ + if (test_and_set_bit(MMF_UNSTABLE, &tsk->signal->oom_mm->flags)) return; get_task_struct(tsk); -- 1.8.3.1
WARNING: multiple messages have this Message-ID (diff)
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> To: "Arkadiusz Miśkiewicz" <a.miskiewicz@gmail.com>, "Andrew Morton" <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org>, cgroups@vger.kernel.org, Aleksa Sarai <asarai@suse.de>, Jay Kamat <jgkamat@fb.com>, Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@suse.com>, Johannes Weiner <hannes@cmpxchg.org>, linux-kernel@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>, linux-mm <linux-mm@kvack.org> Subject: [PATCH v2] oom, oom_reaper: do not enqueue same task twice Date: Sat, 26 Jan 2019 22:10:52 +0900 [thread overview] Message-ID: <1d161137-55a5-126f-b47e-b2625bd798ca@i-love.sakura.ne.jp> (raw) In-Reply-To: <2b0c7d6c-c58a-da7d-6f0a-4900694ec2d3@gmail.com> On 2019/01/26 20:29, Arkadiusz Miśkiewicz wrote: > On 26/01/2019 12:09, Tetsuo Handa wrote: >> Arkadiusz, will you try this patch? > > > Works. Several tries and always getting 0 pids.current after ~1s. > Thank you for testing. I updated this patch to use tsk->signal->oom_mm (a snapshot of tsk->mm saved by mark_oom_victim(tsk)) rather than raw tsk->mm so that we don't need to worry about possibility of changing tsk->mm across multiple wake_oom_reaper(tsk) calls. >From 9c9e935fc038342c48461aabca666f1b544e32b1 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Sat, 26 Jan 2019 21:57:25 +0900 Subject: [PATCH v2] oom, oom_reaper: do not enqueue same task twice Arkadiusz reported that enabling memcg's group oom killing causes strange memcg statistics where there is no task in a memcg despite the number of tasks in that memcg is not 0. It turned out that there is a bug in wake_oom_reaper() which allows enqueuing same task twice which makes impossible to decrease the number of tasks in that memcg due to a refcount leak. This bug existed since the OOM reaper became invokable from task_will_free_mem(current) path in out_of_memory() in Linux 4.7, but memcg's group oom killing made it easier to trigger this bug by calling wake_oom_reaper() on the same task from one out_of_memory() request. Fix this bug using an approach used by commit 855b018325737f76 ("oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task"). As a side effect of this patch, this patch also avoids enqueuing multiple threads sharing memory via task_will_free_mem(current) path. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Tested-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Fixes: af8e15cc85a25315 ("oom, oom_reaper: do not enqueue task if it is on the oom_reaper_list head") --- mm/oom_kill.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index f0e8cd9..057bfee 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -505,14 +505,6 @@ bool __oom_reap_task_mm(struct mm_struct *mm) struct vm_area_struct *vma; bool ret = true; - /* - * Tell all users of get_user/copy_from_user etc... that the content - * is no longer stable. No barriers really needed because unmapping - * should imply barriers already and the reader would hit a page fault - * if it stumbled over a reaped memory. - */ - set_bit(MMF_UNSTABLE, &mm->flags); - for (vma = mm->mmap ; vma; vma = vma->vm_next) { if (!can_madv_dontneed_vma(vma)) continue; @@ -647,8 +639,13 @@ static int oom_reaper(void *unused) static void wake_oom_reaper(struct task_struct *tsk) { - /* tsk is already queued? */ - if (tsk == oom_reaper_list || tsk->oom_reaper_list) + /* + * Tell all users of get_user/copy_from_user etc... that the content + * is no longer stable. No barriers really needed because unmapping + * should imply barriers already and the reader would hit a page fault + * if it stumbled over a reaped memory. + */ + if (test_and_set_bit(MMF_UNSTABLE, &tsk->signal->oom_mm->flags)) return; get_task_struct(tsk); -- 1.8.3.1
next prev parent reply other threads:[~2019-01-26 13:11 UTC|newest] Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <df806a77-3327-9db5-8be2-976fde1c84e5@gmail.com> [not found] ` <20190117122535.njcbqhlmzozdkncw@mikami> [not found] ` <1d36b181-cbaf-6694-1a31-2f7f55d15675@gmail.com> [not found] ` <96ef6615-a5df-30af-b4dc-417a18ca63f1@gmail.com> 2019-01-25 7:52 ` pids.current with invalid value for hours [5.0.0 rc3 git] Arkadiusz Miśkiewicz 2019-01-25 16:37 ` Tejun Heo 2019-01-25 19:47 ` Arkadiusz Miśkiewicz 2019-01-26 1:27 ` Tetsuo Handa 2019-01-26 1:27 ` Tetsuo Handa 2019-01-26 2:41 ` Arkadiusz Miśkiewicz 2019-01-26 6:10 ` Tetsuo Handa 2019-01-26 6:10 ` Tetsuo Handa 2019-01-26 7:55 ` Tetsuo Handa 2019-01-26 7:55 ` Tetsuo Handa 2019-01-26 11:09 ` Tetsuo Handa 2019-01-26 11:09 ` Tetsuo Handa 2019-01-26 11:29 ` Arkadiusz Miśkiewicz 2019-01-26 13:10 ` Tetsuo Handa [this message] 2019-01-26 13:10 ` [PATCH v2] oom, oom_reaper: do not enqueue same task twice Tetsuo Handa 2019-01-27 8:37 ` Michal Hocko 2019-01-27 10:56 ` Tetsuo Handa 2019-01-27 11:40 ` Michal Hocko 2019-01-27 14:57 ` [PATCH v3] " Tetsuo Handa 2019-01-27 14:57 ` Tetsuo Handa 2019-01-27 16:58 ` Michal Hocko 2019-01-27 23:00 ` Roman Gushchin 2019-01-28 18:15 ` Andrew Morton 2019-01-28 18:42 ` Michal Hocko 2019-01-28 21:53 ` Johannes Weiner 2019-01-29 10:34 ` Tetsuo Handa 2019-01-26 1:41 ` pids.current with invalid value for hours [5.0.0 rc3 git] Roman Gushchin 2019-01-26 2:28 ` Arkadiusz Miśkiewicz
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1d161137-55a5-126f-b47e-b2625bd798ca@i-love.sakura.ne.jp \ --to=penguin-kernel@i-love.sakura.ne.jp \ --cc=a.miskiewicz@gmail.com \ --cc=akpm@linux-foundation.org \ --cc=asarai@suse.de \ --cc=cgroups@vger.kernel.org \ --cc=guro@fb.com \ --cc=hannes@cmpxchg.org \ --cc=jgkamat@fb.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mhocko@suse.com \ --cc=tj@kernel.org \ --cc=torvalds@linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.