From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 55EA8C433E2 for ; Tue, 8 Sep 2020 07:57:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 253D02177B for ; Tue, 8 Sep 2020 07:57:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729851AbgIHH47 (ORCPT ); Tue, 8 Sep 2020 03:56:59 -0400 Received: from mx2.suse.de ([195.135.220.15]:53164 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729588AbgIHH4h (ORCPT ); Tue, 8 Sep 2020 03:56:37 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id DD01FB6D6; Tue, 8 Sep 2020 07:56:36 +0000 (UTC) From: Oscar Salvador To: akpm@linux-foundation.org Cc: naoya.horiguchi@nec.com, mhocko@kernel.org, tony.luck@intel.com, cai@lca.pw, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Oscar Salvador Subject: [PATCH v2 3/5] mm,hwpoison: Drain pcplists before bailing out for non-buddy zero-refcount page Date: Tue, 8 Sep 2020 09:56:24 +0200 Message-Id: <20200908075626.11976-4-osalvador@suse.de> X-Mailer: git-send-email 2.13.7 In-Reply-To: <20200908075626.11976-1-osalvador@suse.de> References: <20200908075626.11976-1-osalvador@suse.de> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org A page with 0-refcount and !PageBuddy could perfectly be a pcppage. Currently, we bail out with an error if we encounter such a page, meaning that we do not handle pcppages neither from hard-offline nor from soft-offline path. Fix this by draining pcplists whenever we find this kind of page and retry the check again. It might be that pcplists have been spilled into the buddy allocator and so we can handle it. Signed-off-by: Oscar Salvador --- mm/memory-failure.c | 24 ++++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/mm/memory-failure.c b/mm/memory-failure.c index b0ef5db45ba6..e57fc5c5117a 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -950,13 +950,13 @@ static int page_action(struct page_state *ps, struct page *p, } /** - * get_hwpoison_page() - Get refcount for memory error handling: + * __get_hwpoison_page() - Get refcount for memory error handling: * @page: raw error page (hit by memory error) * * Return: return 0 if failed to grab the refcount, otherwise true (some * non-zero value.) */ -static int get_hwpoison_page(struct page *page) +static int __get_hwpoison_page(struct page *page) { struct page *head = compound_head(page); @@ -986,6 +986,26 @@ static int get_hwpoison_page(struct page *page) return 0; } +static int get_hwpoison_page(struct page *p) +{ + int ret; + bool drained = false; + +retry: + ret = __get_hwpoison_page(p); + if (!ret && !is_free_buddy_page(p) && !page_count(p) && !drained) { + /* + * The page might be in a pcplist, so try to drain those + * and see if we are lucky. + */ + drain_all_pages(page_zone(p)); + drained = true; + goto retry; + } + + return ret; +} + /* * Do all that is necessary to remove user space mappings. Unmap * the pages and send SIGBUS to the processes if the data was dirty. -- 2.26.2