From: HORIGUCHI NAOYA(堀口 直也)
To: Aili Yao
CC: Oscar Salvador, "tony.luck@intel.com", "david@redhat.com", "akpm@linux-foundation.org", "bp@alien8.de", "tglx@linutronix.de", "mingo@redhat.com", "hpa@zytor.com", "x86@kernel.org", "inux-edac@vger.kernel.org", "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "yangfeng1@kingsoft.com"
Subject: Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned
Date: Thu, 25 Feb 2021 11:28:18 +0000
Message-ID: <20210225112818.GA10141@hori.linux.bs1.fc.nec.co.jp>
References: <20210224151619.67c29731@alex-virtual-machine> <20210224103105.GA16368@linux> <20210225114329.4e1a41c6@alex-virtual-machine>
In-Reply-To: <20210225114329.4e1a41c6@alex-virtual-machine>

On Thu, Feb 25, 2021 at 11:43:29AM +0800, Aili Yao wrote:
> On Wed, 24 Feb 2021 11:31:55 +0100 Oscar Salvador wrote:
...
>
> > >
> > > 3.The kill_me_maybe will check the return:
> > >
> > > 1244 static void kill_me_maybe(struct callback_head *cb)
> > > 1245 {
> > >
> > > 1254         if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
> > > 1255             !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
> > > 1256                 set_mce_nospec(p->mce_addr >> PAGE_SHIFT,
> >
> > So, IIUC, in case of a LMCE nested call, the second MCE will reach here.
> > set_mce_nospec() will either mark the underlying page as not mapped/cached.
> >
> This set_mce_nospec() is not proper when the recovery job is on the fly. In my test
> this function failed.

Hi Aili,

I agree that set_mce_nospec() should not be called for an "already
hwpoisoned" page: in the reported case the error page is already contained,
so there is no need to resort to changing the cache mode.

...

> > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > index e9481632fcd1..06f006174b8c 100644
> > > --- a/mm/memory-failure.c
> > > +++ b/mm/memory-failure.c
> > > @@ -1224,7 +1224,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
> > >          if (TestSetPageHWPoison(head)) {
> > >                  pr_err("Memory failure: %#lx: already hardware poisoned\n",
> > >                         pfn);
> > > -                return 0;
> > > +                return -EBUSY;
> >
> > As David said, madvise_inject_error() will start returning -EBUSY now in case
> > we madvise(MADV_HWPOISON) on an already hwpoisoned page.
> >
> > AFAICS, memory_failure() can return 0, -Eerrors, and MF_XXX.
> > Would it make sense to unify that? That way we could declare error codes that
> > make somse sense (like MF_ALREADY_HWPOISONED).

It seems to me that memory_failure() does not return MF_XXX. But yes,
returning some positive value for the reported case could be a solution.

> >
>
> @David:
>
> I checked the code again, and find a few places will care the exact return value, like:
>
> 1: drivers/base/memory.c:483: ret = memory_failure(pfn, 0);
> This is for hard page offline, I see the code in mcelog:
> static void offline_action(struct mempage *mp, u64 addr)
> {
>         if (offline <= OFFLINE_ACCOUNT)
>                 return;
>         Lprintf("Offlining page %llx\n", addr);
>         if (memory_offline(addr) < 0) {
>                 Lprintf("Offlining page %llx failed: %s\n", addr, strerror(errno));
>                 mp->offlined = PAGE_OFFLINE_FAILED;
>         } else
>                 mp->offlined = PAGE_OFFLINE;
> }
> I think return an negative value will be more proper? As the related killing function may not be performed, and we can't say
> it's a success operation?
>
> 2:mm/hwpoison-inject.c:51: return memory_failure(pfn, 0);
>   mm/madvise.c:910: ret = memory_failure(pfn, MF_COUNT_INCREASED);
>
> These two cases are mainly for error injections, I checked the test codes, mostly it only care if the value is 0 or < 0;
> I do the related test, normally it work well, but for stress test, sometimes in some case, I do meet some fail cases along with the -EBUSY return.
> I will dig more.
>
> Other place will only care if the return value is 0. or just ignore it.
>
> Hi naoya, what's your opnion for this possible issue, I need your inputs!

We could use a negative value (an error code) to report this case; then, as
you mentioned above, some callers would need changes to handle the new case.
The same is true if a positive value is used.

My preference is -EHWPOISON, but other options are fine if well justified.

Thanks,
Naoya Horiguchi
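
The return-value convention discussed above can be sketched in userspace C.
This is only an illustration under stated assumptions, not kernel code:
mock_memory_failure(), classify_mf_return() and enum poison_outcome are
hypothetical names introduced here, and the sketch assumes the -EHWPOISON
option is the one chosen for the "already poisoned" case.

```c
#include <errno.h>

/* EHWPOISON is Linux-specific; fall back to the asm-generic value elsewhere. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/* Hypothetical outcomes a caller might distinguish; not kernel definitions. */
enum poison_outcome {
	OUTCOME_HANDLED,          /* page newly poisoned by this call */
	OUTCOME_ALREADY_POISONED, /* page was poisoned before this call */
	OUTCOME_FAILED,           /* any other error */
};

/*
 * Hypothetical stand-in for memory_failure(). It models only the return
 * value: 0 on the first report for a page, -EHWPOISON on later reports.
 */
static int mock_memory_failure(int *page_poisoned)
{
	if (*page_poisoned)
		return -EHWPOISON;
	*page_poisoned = 1;
	return 0;
}

/* Map a return value to the three cases a caller would have to handle. */
static enum poison_outcome classify_mf_return(int ret)
{
	if (ret == 0)
		return OUTCOME_HANDLED;
	if (ret == -EHWPOISON)
		return OUTCOME_ALREADY_POISONED;
	return OUTCOME_FAILED;
}
```

With a dedicated code, a caller shaped like the kill_me_maybe() excerpt could
skip set_mce_nospec() when the outcome is OUTCOME_ALREADY_POISONED without
treating that case as a generic failure, while callers that only test for
"0 or < 0" would still see a negative value.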