Date: Wed, 19 Feb 2020 08:22:31 -0800
From: Sean Christopherson
To: "Longpeng (Mike)"
Cc: mike.kravetz@oracle.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, arei.gonglei@huawei.com, weidong.huang@huawei.com, weifuqiang@huawei.com, kvm@vger.kernel.org
Subject: Re: [PATCH] mm/hugetlb: avoid get wrong ptep caused by race
Message-ID: <20200219162231.GE15888@linux.intel.com>
References: <1582027825-112728-1-git-send-email-longpeng2@huawei.com> <20200218203717.GE28156@linux.intel.com> <20200219015836.GM28156@linux.intel.com> <6ccbde03-953c-c006-a07e-8146b84389d9@huawei.com>
In-Reply-To: <6ccbde03-953c-c006-a07e-8146b84389d9@huawei.com>

On Wed, Feb 19, 2020 at 08:21:26PM +0800, Longpeng (Mike) wrote:
> On 2020/2/19 9:58, Sean Christopherson wrote:
> > FWIW, I'd be in favor of going the READ/WRITE_ONCE() route for x86, e.g.
> > convert everything as a follow-up patch (or patches).  I'm fairly confident
> > that KVM's usage of lookup_address_in_mm() is safe, but I wouldn't exactly
> > bet my life on it.  I'd much rather the failing scenario be that KVM uses
> > a sub-optimal page size as opposed to exploding on a bad pointer.
> >
> Um...our testcase starts 50 VMs with 2U4G(use 1G hugepage) and then do
> live-upgrade(private feature that just modify the qemu and libvirt) and
> live-migrate in turns for each one. However our live upgraded new QEMU won't do
> touch_all_pages.
> Suppose we start a VM without touch_all_pages in QEMU, the VM's guest memory is
> not mapped in the CR3 pagetable at the moment. When the 2 vcpus running, they
> could access some pages belong to the same 1G-hugepage, both of them will vmexit
> due to ept_violation and then call gup-->follow_hugetlb_page-->hugetlb_fault, so
> the race may encounter, right?

Yep.  The code I'm referring to is similar but different code that just
happened to go into KVM for kernel 5.6.  It has no effect on the gup() flow
that leads to this bug.  I mentioned it above as an example of code outside
of hugetlb_fault() that would also benefit from moving to READ/WRITE_ONCE().
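
To illustrate the READ/WRITE_ONCE() point, here is a minimal userspace
sketch, not kernel code and not the patch under discussion.  The macros are
simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(), and entry_t,
ENTRY_PRESENT, ENTRY_HUGE and classify_entry() are hypothetical names made
up for the example.  The idea is that a walker snapshots an entry exactly
once and makes every decision from that snapshot, so a concurrent update
cannot make it check one value and then act on another.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros (assumption:
 * scalar types only, GNU __typeof__ extension available). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical entry layout, chosen only for this illustration. */
typedef uint64_t entry_t;
#define ENTRY_PRESENT 0x01ULL
#define ENTRY_HUGE    0x80ULL

enum entry_kind { KIND_NONE, KIND_HUGE, KIND_TABLE };

/*
 * Load the shared slot exactly once and classify the local copy.  The
 * snapshot is handed back so the caller keeps working with the same value
 * that was checked, rather than dereferencing the slot a second time.
 */
enum entry_kind classify_entry(entry_t *slot, entry_t *snapshot)
{
	entry_t e = READ_ONCE(*slot);   /* single load of the shared slot */

	*snapshot = e;
	if (!(e & ENTRY_PRESENT))
		return KIND_NONE;
	if (e & ENTRY_HUGE)
		return KIND_HUGE;
	return KIND_TABLE;
}

/* Writer side: publish a new entry value with a single store. */
void install_entry(entry_t *slot, entry_t value)
{
	WRITE_ONCE(*slot, value);
}
```

A caller would invoke classify_entry() and then use only the returned
snapshot.  If the slot changes underneath it, the worst case is a stale but
self-consistent answer rather than a mix of two different values.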