From: Michal Hocko <mhocko@kernel.org>
To: <linux-mm@kvack.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Wenwei Tao <wenwei.tww@alibaba-inc.com>,
	Oleg Nesterov <oleg@redhat.com>,
	David Rientjes <rientjes@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: [PATCH 2/2] mm, oom: fix potential data corruption when oom_reaper races with writer
Date: Fri, 4 Aug 2017 10:33:50 +0200
Message-ID: <20170804083350.470-2-mhocko@kernel.org>
In-Reply-To: <20170804083350.470-1-mhocko@kernel.org>

From: Michal Hocko <mhocko@suse.com>

Wenwei Tao has noticed that our current assumption that the oom victim
is dying and never makes any visible changes after it dies is not
entirely true. __task_will_free_mem considers a task dying when
SIGNAL_GROUP_EXIT is set, but do_group_exit sends SIGKILL to all threads
_after_ the flag is set. So there is a race window in which some threads
won't have fatal_signal_pending while the oom_reaper could start
unmapping the address space. generic_perform_write could then write
a zero page to the page cache and corrupt data.
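
To make the ordering concrete, here is a small userspace model of the
window described above. It is only a sketch: struct model_group and the
model_* helpers are hypothetical names made up for illustration, and the
booleans merely stand in for SIGNAL_GROUP_EXIT and fatal_signal_pending.
No kernel code is reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NTHREADS 3

/* Hypothetical userspace model of a thread group; not kernel structures. */
struct model_group {
	bool group_exit;                /* stands in for SIGNAL_GROUP_EXIT */
	bool sigkill_pending[NTHREADS]; /* stands in for fatal_signal_pending */
};

/* Step 1 of the modelled group exit: mark the whole group as exiting. */
static void model_set_group_exit(struct model_group *g)
{
	g->group_exit = true;
}

/* Step 2: SIGKILL reaches one thread; this happens after step 1. */
static void model_send_sigkill(struct model_group *g, size_t idx)
{
	assert(idx < NTHREADS);
	g->sigkill_pending[idx] = true;
}

/* Models the "task is dying" view: true as soon as the flag is set. */
static bool model_considered_dying(const struct model_group *g)
{
	return g->group_exit;
}

/*
 * True while the group already counts as dying (so reaping may start)
 * but thread idx has no fatal signal pending and may still be writing.
 */
static bool model_in_race_window(const struct model_group *g, size_t idx)
{
	assert(idx < NTHREADS);
	return model_considered_dying(g) && !g->sigkill_pending[idx];
}
```

In this model every thread is inside the window between
model_set_group_exit() and its own model_send_sigkill() call, which is
the interval in which a refault could hand a zero page to a writer.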

The race window is rather small and close to impossible to hit, but it
would be better to have it covered.

Fix this by extending the existing MMF_UNSTABLE check in handle_mm_fault
to all tasks and enforce SIGBUS on any page fault after the oom reaper
has started its work. This means that nobody will ever observe
potentially corrupted content. Formerly we cared only about use_mm users
because those can outlive the oom victim quite easily, but having the
process itself protected sounds like a reasonable thing to do as well.
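
The effect on the condition can be sketched in userspace as follows.
This is an illustration only: struct model_fault and the model_*
functions are hypothetical, and plain booleans replace the real
PF_KTHREAD, VM_FAULT_ERROR and MMF_UNSTABLE tests from the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs to the check; booleans instead of kernel flags. */
struct model_fault {
	bool is_kthread;  /* models current->flags & PF_KTHREAD */
	bool fault_error; /* models ret & VM_FAULT_ERROR */
	bool mm_unstable; /* models test_bit(MMF_UNSTABLE, ...) */
};

/* Condition before the patch: only kernel threads are covered. */
static bool model_enforce_sigbus_old(const struct model_fault *f)
{
	return f->is_kthread && !f->fault_error && f->mm_unstable;
}

/* Condition after the patch: any task faulting on a reaped mm. */
static bool model_enforce_sigbus_new(const struct model_fault *f)
{
	return !f->fault_error && f->mm_unstable;
}

/* The new condition fires in every case where the old one did. */
static void model_check_superset(const struct model_fault *f)
{
	assert(!model_enforce_sigbus_old(f) || model_enforce_sigbus_new(f));
}
```

A regular task faulting on an unstable mm is the case that changes: the
old condition lets the fault through, the new one does not. Faults that
already failed and faults on a stable mm behave the same in both.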

There doesn't seem to be any real-life bug report, so this is merely a
fix for a theoretical bug.

Noticed-by: Wenwei Tao <wenwei.tww@alibaba-inc.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 4fe5b6254688..e7308e633b52 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3874,15 +3874,10 @@ int handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 	/*
 	 * This mm has been already reaped by the oom reaper and so the
 	 * refault cannot be trusted in general. Anonymous refaults would
-	 * lose data and give a zero page instead e.g. This is especially
-	 * problem for use_mm() because regular tasks will just die and
-	 * the corrupted data will not be visible anywhere while kthread
-	 * will outlive the oom victim and potentially propagate the date
-	 * further.
+	 * lose data and give a zero page instead e.g.
 	 */
-	if (unlikely((current->flags & PF_KTHREAD) && !(ret & VM_FAULT_ERROR)
+	if (unlikely(!(ret & VM_FAULT_ERROR)
 				&& test_bit(MMF_UNSTABLE, &vma->vm_mm->flags))) {
-
 		/*
 		 * We are going to enforce SIGBUS but the PF path might have
 		 * dropped the mmap_sem already so take it again so that
-- 
2.13.2
