From: HORIGUCHI NAOYA(堀口 直也)
To: Vikram Sethi
CC: "linux-mm@kvack.org", "n-horiguchi@ah.jp.nec.com", "James.Morse@arm.com", "alex.williamson@redhat.com"
Subject: Re: Memory failure handling of VFIO-pinned THP
Date: Fri, 24 Jan 2020 09:16:26 +0000
Message-ID: <20200124091625.GA32278@hori.linux.bs1.fc.nec.co.jp>
References: <902d2541-3da6-8519-3e94-d435afb5e19c@nvidia.com>
In-Reply-To: <902d2541-3da6-8519-3e94-d435afb5e19c@nvidia.com>

Hi Vikram,

On Thu, Jan 23, 2020 at 03:39:33PM -0600, Vikram Sethi wrote:
> Hello,
>
> I was looking at memory_failure handling of pinned transparent hugepages
> (specifically pinned by VFIO for a VM with physical I/O).
>
> AFAICT, on the initial memory error detected interrupt call memory_failure
> won't be able to split the THP because it is pinned, and will return -EBUSY
> without actually unmapping any processes with mappings to the THP with
> uncorrected memory error.

Yes, that's the current behavior.

>
> Later, when the VM does a load to the bad location (consumes poison), looking
> at the firmware first path on ARM64, the SEA exception will be forwarded by
> Firmware to host kernel, where the GHES code will queue work for
> memory_failure, where again memory_failure will exit early for the pinned THP,
> and userspace won't get the SIGBUS with Action Required code to be able to
> inject the error into the VM.
>
> Discussing with James, we were wondering why the pinned THP isn't treated like
> hugetlbfs memory failure, marking the entire hugepage with hw_poison flag, and
> unmapping of mapped processes when the error is detected
> (memory_failure_hugetlb calling hwpoison_user_mappings)? If that were done,
> when the VM later tries to load the bad location, the resulting VM fault will
> get the appropriate VM_FAULT_HWPOISON code, which will trigger KVM to send the
> SIGBUS with Action Required code to userspace, which can then inject to the VM?

Generally, a THP can be shared by multiple processes, where some map it with
pte mappings and the others map it with a pmd mapping. So if we treat all
pages in the pinned THP as hwpoisoned, processes mapping with pte mappings
could be signaled by accessing non-error subpages, which seems to me
suboptimal. But I agree that containing a whole THP could improve error
reporting when there's no pte mapping for the pinned THP.

> I do understand that the page is pinned so that DMAs can happen from the VM's
> I/O devices without I/O faults, but since the hw_poison flag would be set for
> the page on the initial "error detected" interrupt by memory_failure, the
> kernel wouldn't reallocate the page anyway.
> And any interim DMA writes that hit
> the bad page wouldn't be corrupting anyone else, and DMA reads would be getting
> poison back/completer abort.
>
> Am I missing something, or is this currently broken for VFIO and VM THP pages
> with memory failure (at least as far as signaling user space goes)?

You're right, it's simply not implemented.

Thanks,
Naoya Horiguchi
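
The trade-off discussed in this message (poisoning the whole pinned THP versus only the subpage that has the error) can be illustrated with a small toy model. This is a sketch for illustration only, not kernel code and not code from this thread: the names `Mapper`, `would_signal`, `needless_signals`, the policy strings, and the 512-subpage THP size (2 MiB THP with 4 KiB base pages) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption for illustration: a 2 MiB THP built from 4 KiB base pages.
SUBPAGES_PER_THP = 512


@dataclass(frozen=True)
class Mapper:
    """A hypothetical process that maps the THP."""
    name: str
    mapping: str  # "pmd" (one entry covers the whole THP) or "pte" (per-subpage)


def would_signal(accessed: int, bad: int, policy: str) -> bool:
    """Return True if this toy model signals an access to subpage `accessed`.

    `bad` is the index of the subpage with the uncorrected error.
    """
    for index in (accessed, bad):
        if not 0 <= index < SUBPAGES_PER_THP:
            raise ValueError("subpage index out of range")
    if policy == "whole_thp":
        # hugetlb-style containment: every subpage is treated as poisoned.
        return True
    if policy == "subpage":
        # Only the subpage that actually has the error is treated as poisoned.
        return accessed == bad
    raise ValueError(f"unknown policy: {policy}")


def needless_signals(mappers, bad: int, policy: str) -> list:
    """Names of pte mappers that get signaled when touching a healthy subpage."""
    healthy = (bad + 1) % SUBPAGES_PER_THP  # any subpage other than `bad`
    return [
        m.name
        for m in mappers
        if m.mapping == "pte" and would_signal(healthy, bad, policy)
    ]


if __name__ == "__main__":
    mappers = [Mapper("vm", "pmd"), Mapper("helper", "pte")]
    for policy in ("whole_thp", "subpage"):
        print(policy, needless_signals(mappers, bad=7, policy=policy))
```

In this model, whole-THP poisoning signals the pte mapper even for a healthy subpage, which is the suboptimal case mentioned above; with no pte mappers in the list, the whole-THP policy has no such cost.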