From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751390AbdH1NeS (ORCPT ); Mon, 28 Aug 2017 09:34:18 -0400
Received: from outbound-smtp08.blacknight.com ([46.22.139.13]:54074 "EHLO
	outbound-smtp08.blacknight.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751209AbdH1NeQ (ORCPT );
	Mon, 28 Aug 2017 09:34:16 -0400
Date: Mon, 28 Aug 2017 14:34:15 +0100
From: Mel Gorman
To: Andrew Morton
Cc: "Hansen, Dave", "Luck, Tony", Linux MM, LKML
Subject: [PATCH] mm, madvise: Ensure poisoned pages are removed from per-cpu lists
Message-ID: <20170828133414.7qro57jbepdcyz5x@techsingularity.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
User-Agent: NeoMutt/20170421 (1.8.2)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Wendy Wang reported off-list that a RAS HWPOISON-SOFT test case failed
and bisected it to commit 479f854a207c ("mm, page_alloc: defer
debugging checks of pages allocated from the PCP").

The problem is that a page that was poisoned with madvise() is reused.
The commit removed a check that would trigger if DEBUG_VM was enabled,
but re-enabling the check only fixes the problem as a side-effect by
printing a bad_page warning and recovering.

The root of the problem is that madvise() can leave a poisoned page on
the per-cpu list. This patch drains all per-cpu lists after pages are
poisoned so that they will not be reused. Wendy reports that the test
case in question passes with this patch applied. While this could be
done in a targeted fashion, that would be over-complicated for such a
rare operation.

Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
Reported-and-tested-by: Wang, Wendy
Cc: stable@kernel.org
Signed-off-by: Mel Gorman
---
 mm/madvise.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/madvise.c b/mm/madvise.c
index 23ed525bc2bc..4d7d1e5ddba9 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -613,6 +613,7 @@ static int madvise_inject_error(int behavior,
 		unsigned long start, unsigned long end)
 {
 	struct page *page;
+	struct zone *zone;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
@@ -646,6 +647,11 @@ static int madvise_inject_error(int behavior,
 		if (ret)
 			return ret;
 	}
+
+	/* Ensure that all poisoned pages are removed from per-cpu lists */
+	for_each_populated_zone(zone)
+		drain_all_pages(zone);
+
 	return 0;
 }
 #endif
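
For readers unfamiliar with why a drain helps, here is a userspace toy
model of the failure mode, not kernel code. Every name in it (toy_page,
toy_pcp, toy_drain_all and so on) is hypothetical and exists only for
illustration. It also assumes, as a simplification, that a poisoned page
is dropped when it leaves a per-cpu list; the real allocator's handling
is more involved. The sketch only shows that a check-free per-cpu fast
path can hand a poisoned page straight back out unless the lists are
drained first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 2

/* Hypothetical stand-in for struct page: one flag and a list link. */
struct toy_page {
	bool poisoned;
	struct toy_page *next;
};

static struct toy_page *toy_pcp[TOY_NR_CPUS];	/* per-cpu free lists */
static struct toy_page *toy_global;		/* shared free pool */
static struct toy_page toy_pages[2];		/* backing storage */

static void toy_push(struct toy_page **head, struct toy_page *page)
{
	page->next = *head;
	*head = page;
}

static struct toy_page *toy_pop(struct toy_page **head)
{
	struct toy_page *page = *head;

	if (page) {
		*head = page->next;
		page->next = NULL;
	}
	return page;
}

/* Fast-path free: park the page on the local list with no checks. */
static void toy_free(int cpu, struct toy_page *page)
{
	toy_push(&toy_pcp[cpu], page);
}

/* Fast-path alloc: prefer the local list, again with no checks. */
static struct toy_page *toy_alloc(int cpu)
{
	struct toy_page *page = toy_pop(&toy_pcp[cpu]);

	return page ? page : toy_pop(&toy_global);
}

/*
 * Empty every per-cpu list into the shared pool.
 * Assumption of this model: poisoned pages are discarded on the way.
 */
static void toy_drain_all(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		struct toy_page *page;

		while ((page = toy_pop(&toy_pcp[cpu])) != NULL) {
			if (!page->poisoned)
				toy_push(&toy_global, page);
		}
	}
}

/* Returns true if a poisoned page was handed out again. */
static bool toy_poisoned_page_reused(bool drain_after_poison)
{
	struct toy_page *page;

	/* Start from empty lists and clean pages on every run. */
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		toy_pcp[cpu] = NULL;
	toy_global = NULL;
	toy_pages[0].poisoned = false;
	toy_pages[1].poisoned = false;

	toy_free(0, &toy_pages[0]);
	toy_free(0, &toy_pages[1]);

	/* The page is poisoned while it sits on CPU 0's list. */
	toy_pages[1].poisoned = true;

	if (drain_after_poison)
		toy_drain_all();

	page = toy_alloc(0);
	return page && page->poisoned;
}
```

Calling toy_poisoned_page_reused(false) models the reported bug: the
poisoned page is the next one allocated on that CPU. Calling it with
true models the approach taken here, where a blanket drain after
poisoning trades some efficiency for simplicity on a rare path.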