Subject: Re: [PATCH] mm/memory-failure.c: bail out early if huge zero page
To: Oscar Salvador
CC: Xu Yu
References: <49273e6688d7571756603dac996692a15f245d58.1649603963.git.xuyu@linux.alibaba.com>
From: Miaohe Lin
Date: Tue, 12 Apr 2022 17:25:52 +0800

On 2022/4/12 16:31, Oscar Salvador wrote:
> On Sun, Apr 10, 2022 at 11:22:34PM +0800, Xu Yu wrote:
>> Kernel panic when injecting memory_failure for the global huge_zero_page,
>> when CONFIG_DEBUG_VM is enabled, as follows.
> ...
>> In fact, huge_zero_page is unhandlable currently in either soft offline
>> or memory failure injection. With CONFIG_DEBUG_VM disabled,
>> huge_zero_page is bailed out when checking HWPoisonHandlable() in
>> get_any_page(), or checking page mapping in split_huge_page_to_list().
>>
>> This makes huge_zero_page bail out early in madvise_inject_error(), and
>> panic above won't happen again.
>
> I would not special case this in madvise_inject_error() but rather
> handle it in memory-failure code.
> We do already have HWPoisonHandlable(), which tells us whether the page
> is of a type we can really do something about, so why not add another
> check in HWPoisonHandlable() for huge_zero_page(), and have that checked
> in memory_failure().

IIUC, this does not work.
Because HWPoisonHandlable() is only called in the !MF_COUNT_INCREASED case. But
MF_COUNT_INCREASED is always set when called from madvise_inject_error(), so
HWPoisonHandlable() is not even called in this scenario. Or am I missing
something?

BTW: IIRC, LRU isn't set on huge_zero_page, so the original HWPoisonHandlable()
can already filter out this page.

Thanks!

> Something like (untested):
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index dcb6bb9cf731..dccd0503f803 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1181,6 +1181,10 @@ static inline bool HWPoisonHandlable(struct page *page, unsigned long flags)
>  {
>  	bool movable = false;
>  
> +	/* Can't handle huge_zero_page() */
> +	if (is_huge_zero_page(compound_head(page)))
> +		return false;
> +
>  	/* Soft offline could mirgate non-LRU movable pages */
>  	if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page))
>  		movable = true;
> @@ -1796,6 +1800,10 @@ int memory_failure(unsigned long pfn, int flags)
>  			res = -EBUSY;
>  			goto unlock_mutex;
>  		}
> +	} else if (!HWPoisonHandlable(p, flags)) {
> +		action_result(pfn, MF_MSG_UNKNOWN, MF_IGNORED);
> +		res = -EBUSY;
> +		goto unlock_mutex;
>  	}
>  
>  	if (PageTransHuge(hpage)) {
>
> It can certainly be prettier, but you can get the idea.
>
>
>>
>> Reported-by: Abaci
>> Signed-off-by: Xu Yu
>> ---
>>  mm/madvise.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index 1873616a37d2..03ad50d222e0 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -1079,7 +1079,7 @@ static int madvise_inject_error(int behavior,
>>  
>>  	for (; start < end; start += size) {
>>  		unsigned long pfn;
>> -		struct page *page;
>> +		struct page *page, *head;
>>  		int ret;
>>  
>>  		ret = get_user_pages_fast(start, 1, 0, &page);
>> @@ -1087,12 +1087,21 @@ static int madvise_inject_error(int behavior,
>>  			return ret;
>>  		pfn = page_to_pfn(page);
>>  
>> +		head = compound_head(page);
>> +		if (unlikely(is_huge_zero_page(head))) {
>> +			pr_warn("Unhandlable attempt to %s pfn %#lx at process virtual address %#lx\n",
>> +				behavior == MADV_SOFT_OFFLINE ? "soft offline" :
>> +				"inject memory failure for",
>> +				pfn, start);
>> +			return -EINVAL;
>> +		}
>> +
>>  		/*
>>  		 * When soft offlining hugepages, after migrating the page
>>  		 * we dissolve it, therefore in the second loop "page" will
>>  		 * no longer be a compound page.
>>  		 */
>> -		size = page_size(compound_head(page));
>> +		size = page_size(head);
>>  
>>  		if (behavior == MADV_SOFT_OFFLINE) {
>>  			pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n",
>> -- 
>> 2.20.1.2432.ga663e714
>>
>>
>
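
For illustration only, the control-flow point made in the reply (the
handlable check is reached only when MF_COUNT_INCREASED is not set) can be
modelled in user space. This is a sketch under stated assumptions, not the
kernel code: the flag values, struct model_page, and every model_* function
are hypothetical stand-ins, and the handlable check covers only the LRU and
soft-offline-movable conditions mentioned in this thread. The huge zero page
is represented simply as a page that is neither on the LRU nor movable, per
the "IIRC" remark above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; not taken from the kernel. */
#define MF_COUNT_INCREASED (1UL << 0)
#define MF_SOFT_OFFLINE    (1UL << 1)

/* Hypothetical stand-in for struct page: only what the thread discusses. */
struct model_page {
	bool on_lru;   /* models PageLRU() */
	bool movable;  /* models __PageMovable() */
};

/* Counts how often the handlable check is actually reached. */
static int handlable_calls;

/* Simplified model of the existing check: movable pages qualify during
 * soft offline, otherwise the page must be on the LRU. */
static bool model_hwpoison_handlable(const struct model_page *p,
				     unsigned long flags)
{
	handlable_calls++;
	if ((flags & MF_SOFT_OFFLINE) && p->movable)
		return true;
	return p->on_lru;
}

/* Models the claim in the reply: the check runs only when the caller has
 * NOT already taken a reference. Returns true if the page is rejected. */
static bool model_filtered_early(const struct model_page *p,
				 unsigned long flags)
{
	if (!(flags & MF_COUNT_INCREASED))
		return !model_hwpoison_handlable(p, flags);
	return false; /* check skipped, page proceeds to later handling */
}

/* Usage example: a page modelled like the huge zero page. */
static void model_self_check(void)
{
	struct model_page hzp = { .on_lru = false, .movable = false };

	handlable_calls = 0;
	assert(model_filtered_early(&hzp, 0));                   /* rejected */
	assert(!model_filtered_early(&hzp, MF_COUNT_INCREASED)); /* skipped */
	assert(handlable_calls == 1);
}
```

In this model, the page is rejected when the flag is clear and slips past the
check when MF_COUNT_INCREASED is set, which is the case the reply is asking
about for madvise_inject_error().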