Subject: Re: [PATCH v2 4/8] mm/memory-failure.c: fix race with changing page more robustly
To: HORIGUCHI NAOYA(堀口 直也)
CC: "akpm@linux-foundation.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org"
References: <20220216091431.39406-1-linmiaohe@huawei.com> <20220216091431.39406-5-linmiaohe@huawei.com> <20220218011351.GA2941369@hori.linux.bs1.fc.nec.co.jp>
From: Miaohe Lin
Message-ID: <8a939237-50e2-090e-efe1-2eb04a68f6d1@huawei.com>
Date: Fri, 18 Feb 2022 09:53:05 +0800
In-Reply-To: <20220218011351.GA2941369@hori.linux.bs1.fc.nec.co.jp>
Sender: owner-linux-mm@kvack.org

On 2022/2/18 9:13, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Wed, Feb 16, 2022 at 05:14:27PM +0800, Miaohe Lin wrote:
>> We're only intended to deal with the non-Compound page after we split thp
>> in memory_failure. However, the page could have changed compound pages due
>> to race window. If this happens, we could try again to hopefully handle the
>> page next round. Also remove unneeded orig_head. It's always equal to the
>> hpage. So we can use hpage directly and remove this redundant one.
>>
>> Signed-off-by: Miaohe Lin
>> ---
>>  mm/memory-failure.c | 20 ++++++++++++--------
>>  1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 7e205d91b2d7..d66f642888be 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1690,7 +1690,6 @@ int memory_failure(unsigned long pfn, int flags)
>>  {
>>  	struct page *p;
>>  	struct page *hpage;
>> -	struct page *orig_head;
>>  	struct dev_pagemap *pgmap;
>>  	int res = 0;
>>  	unsigned long page_flags;
>> @@ -1736,7 +1735,7 @@ int memory_failure(unsigned long pfn, int flags)
>>  		goto unlock_mutex;
>>  	}
>>
>> -	orig_head = hpage = compound_head(p);
>> +	hpage = compound_head(p);
>>  	num_poisoned_pages_inc();
>>
>>  	/*
>> @@ -1817,13 +1816,18 @@ int memory_failure(unsigned long pfn, int flags)
>>  	lock_page(p);
>>
>>  	/*
>> -	 * The page could have changed compound pages during the locking.
>> -	 * If this happens just bail out.
>> +	 * We're only intended to deal with the non-Compound page here.
>> +	 * However, the page could have changed compound pages due to
>> +	 * race window. If this happens, we could try again to hopefully
>> +	 * handle the page next round.
>>  	 */
>> -	if (PageCompound(p) && compound_head(p) != orig_head) {
>> -		action_result(pfn, MF_MSG_DIFFERENT_COMPOUND, MF_IGNORED);
>> -		res = -EBUSY;
>> -		goto unlock_page;
>> +	if (PageCompound(p)) {
>> +		if (TestClearPageHWPoison(p))
>> +			num_poisoned_pages_dec();
>> +		unlock_page(p);
>> +		put_page(p);
>> +		flags &= ~MF_COUNT_INCREASED;
>
> Could you limit the retry chance only once by using the local variable
> "retry"? It might be very rare to hit the race more than once in a single
> error event, but just to be safe from potential infinite loop (that could be
> opened by future changes).
>

Sure. Will do it in V3. Thanks.

> Thanks,
> Naoya Horiguchi
>
>> +		goto try_again;
>>  	}
>>
>>  	/*
>> --
>> 2.23.0
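
The review request above, capping the retry at one attempt with a local "retry" variable, can be illustrated with a small user-space sketch. This is not the V3 patch and not kernel code: struct fake_page, page_is_compound_after_lock() and handle_page_retry_once() are hypothetical names invented for the illustration, and the undo steps from the patch (clearing the poison flag, unlock_page(), put_page()) are only indicated by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for page state; not a kernel structure. */
struct fake_page {
	/* How many more times the page will be observed as compound. */
	int compound_races_left;
};

/* Hypothetical helper simulating "the page became compound under us". */
static bool page_is_compound_after_lock(struct fake_page *p)
{
	if (p->compound_races_left > 0) {
		p->compound_races_left--;
		return true;
	}
	return false;
}

/*
 * Bounded retry: the local flag allows exactly one jump back to
 * try_again, so a future change cannot turn this into an endless loop.
 * Returns 0 on success, -EBUSY if the race is hit a second time.
 */
static int handle_page_retry_once(struct fake_page *p)
{
	bool retry = true;

try_again:
	if (page_is_compound_after_lock(p)) {
		/* The real code would undo its poison/lock/refcount state here. */
		if (retry) {
			retry = false;
			goto try_again;
		}
		return -EBUSY;
	}
	return 0;
}
```

With this shape, a page that loses the race once is still handled on the second pass, while a page that loses it twice is reported as busy instead of looping.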