From: Vlastimil Babka <vbabka@suse.cz>
To: vbabka@suse.cz
Cc: akpm@linux-foundation.org, alex.shi@linux.alibaba.com,
	hughd@google.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, liwang@redhat.com,
	mgorman@techsingularity.net, stable@vger.kernel.org
Subject: [PATCH 1/2] mm, compaction: make capture control handling safe wrt interrupts
Date: Tue, 16 Jun 2020 10:26:48 +0200
Message-ID: <20200616082649.27173-1-vbabka@suse.cz>
In-Reply-To: <b17acf5b-5e8a-3edf-5a64-603bf6177312@suse.cz>

Hugh reports:

=====
While stressing compaction, one run oopsed on NULL capc->cc in
__free_one_page()'s task_capc(zone): compact_zone_order() had been
interrupted, and a page was being freed in the return from interrupt.

Though you would not expect it from the source, both gccs I was using
(a 4.8.1 and a 7.5.0) had chosen to compile compact_zone_order() with
the ".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before
callq compact_zone - long after the "current->capture_control = &capc".
An interrupt in between those finds capc->cc NULL (zeroed by an earlier
rep stos).

This could presumably be fixed by a barrier() before setting
current->capture_control in compact_zone_order(); but would also need
more care on return from compact_zone(), in order not to risk leaking
a page captured by interrupt just before capture_control is reset.

Maybe that is the preferable fix, but I felt safer for task_capc() to
exclude the rather surprising possibility of capture at interrupt time.
=====

I have checked that gcc10 also behaves the same.

The advantage of a fix in compact_zone_order() is that we don't add another
test in the page freeing hot path, and that it might prevent future problems
if we stop exposing pointers to uninitialized structures in the current task.

So this patch implements the suggestion for compact_zone_order() with barrier()
(and WRITE_ONCE() to prevent store tearing) for setting
current->capture_control, and prevents leaking a captured page by resetting
current->capture_control with WRITE_ONCE() before reading capc.page with
READ_ONCE().
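
For illustration only, the ordering described above can be modelled in
userspace roughly as follows. This is a sketch under assumptions, not the
kernel code: the barrier(), WRITE_ONCE() and READ_ONCE() macros are simplified
stand-ins (GCC/Clang syntax), the structs are reduced to the fields that
matter, a static pointer stands in for current->capture_control, and
fake_irq_free_page() and compact_order_sketch() are hypothetical helpers that
model a page being freed from an interrupt at a chosen point.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define barrier()        __asm__ __volatile__("" ::: "memory")
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, heavily reduced versions of the kernel structs. */
struct page { int id; };
struct compact_control { int order; };
struct capture_control {
	struct compact_control *cc;
	struct page *page;
};

/* Stand-in for current->capture_control. */
static struct capture_control *capture_control;

/*
 * Models a page free in interrupt context that tries to capture the page.
 * Returns 1 if the page was captured, 0 otherwise.
 */
static int fake_irq_free_page(struct page *page)
{
	struct capture_control *capc = READ_ONCE(capture_control);

	if (!capc || capc->page)
		return 0;
	/* In the reported oops, capc->cc was still NULL at this point. */
	assert(capc->cc != NULL);
	capc->page = page;
	return 1;
}

/*
 * Models the patched compact_zone_order(): one simulated interrupt while
 * the capture control is exposed, one after it has been hidden again.
 */
static struct page *compact_order_sketch(struct page *freed_during,
					 struct page *freed_after)
{
	struct compact_control cc = { .order = 0 };
	struct capture_control capc = { .cc = &cc, .page = NULL };
	struct page *captured;

	/* Initialize both structs first, then expose the pointer. */
	barrier();
	WRITE_ONCE(capture_control, &capc);

	if (freed_during)
		fake_irq_free_page(freed_during);

	/* Hide the capture control first, then read the captured page. */
	WRITE_ONCE(capture_control, NULL);
	captured = READ_ONCE(capc.page);

	/* A free that happens after hiding must not capture anything. */
	if (freed_after)
		assert(fake_irq_free_page(freed_after) == 0);

	return captured;
}
```

In this model, a page freed while the control is exposed is returned to the
caller, and a page freed after the control is hidden is never captured, so
there is no window in which a captured page could be lost.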

Fixes: 5e1f0f098b46 ("mm, compaction: capture a page under direct compaction")
Cc: stable@vger.kernel.org # 5.1+
Reported-by: Hugh Dickins <hughd@google.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/compaction.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index fd988b7e5f2b..86375605faa9 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2316,15 +2316,26 @@ static enum compact_result compact_zone_order(struct zone *zone, int order,
 		.page = NULL,
 	};
 
-	current->capture_control = &capc;
+	/*
+	 * Make sure the structs are really initialized before we expose the
+	 * capture control, in case we are interrupted and the interrupt handler
+	 * frees a page.
+	 */
+	barrier();
+	WRITE_ONCE(current->capture_control, &capc);
 
 	ret = compact_zone(&cc, &capc);
 
 	VM_BUG_ON(!list_empty(&cc.freepages));
 	VM_BUG_ON(!list_empty(&cc.migratepages));
 
-	*capture = capc.page;
-	current->capture_control = NULL;
+	/*
+	 * Make sure we hide capture control first before we read the captured
+	 * page pointer, otherwise an interrupt could free and capture a page
+	 * and we would leak it.
+	 */
+	WRITE_ONCE(current->capture_control, NULL);
+	*capture = READ_ONCE(capc.page);
 
 	return ret;
 }
-- 
2.27.0


Thread overview: 11+ messages
2020-06-10 20:48 [PATCH] mm, page_alloc: capture page in task context only Hugh Dickins
2020-06-11 15:43 ` Mel Gorman
2020-06-12 10:30 ` Vlastimil Babka
2020-06-15 21:03   ` Hugh Dickins
2020-06-16  7:45     ` Vlastimil Babka
2020-06-16  8:26       ` Vlastimil Babka [this message]
2020-06-16  8:26         ` [PATCH 2/2] mm, page_alloc: use unlikely() in task_capc() Vlastimil Babka
2020-06-16 20:29           ` Hugh Dickins
2020-06-17  9:55             ` Vlastimil Babka
2020-06-22  8:58               ` Mel Gorman
2020-06-16 20:18         ` [PATCH 1/2] mm, compaction: make capture control handling safe wrt interrupts Hugh Dickins
