From: Axel Rasmussen
Date: Thu, 18 Feb 2021 10:32:00 -0800
Subject: Re: [PATCH v2 4/4] hugetlb/userfaultfd: Unshare all pmds for hugetlbfs when register wp
To: Peter Xu
Cc: Linux MM, LKML, Mike Kravetz, Mike Rapoport, Andrea Arcangeli, Matthew Wilcox, "Kirill A . Shutemov", Andrew Morton
References: <20210217204418.54259-1-peterx@redhat.com> <20210217204619.54761-1-peterx@redhat.com> <20210217204619.54761-3-peterx@redhat.com>
In-Reply-To: <20210217204619.54761-3-peterx@redhat.com>

On Wed, Feb 17, 2021 at 12:46 PM Peter Xu wrote:
>
> Huge pmd sharing for hugetlbfs is racy with userfaultfd-wp because
> userfaultfd-wp is always based on pgtable entries, so they cannot be shared.
>
> Walk the hugetlb range and unshare all such mappings if there is, right before
> UFFDIO_REGISTER will succeed and return to userspace.
>
> This will pair with want_pmd_share() in hugetlb code so that huge pmd sharing
> is completely disabled for userfaultfd-wp registered range.
>
> Signed-off-by: Peter Xu
> ---
>  fs/userfaultfd.c        |  4 ++++
>  include/linux/hugetlb.h |  1 +
>  mm/hugetlb.c            | 51 +++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 56 insertions(+)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 894cc28142e7..e259318fcae1 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -15,6 +15,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -1448,6 +1449,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  		vma->vm_flags = new_flags;
>  		vma->vm_userfaultfd_ctx.ctx = ctx;
>
> +		if (is_vm_hugetlb_page(vma) && uffd_disable_huge_pmd_share(vma))
> +			hugetlb_unshare_all_pmds(vma);

This line yields the following error, if building with:

# CONFIG_CMA is not set

./fs/userfaultfd.c:1459: undefined reference to `hugetlb_unshare_all_pmds'

> +
>  skip:
>  		prev = vma;
>  		start = vma->vm_end;
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 3b4104021dd3..97ecfd4c20b2 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -188,6 +188,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
>  		unsigned long address, unsigned long end, pgprot_t newprot);
>
>  bool is_hugetlb_entry_migration(pte_t pte);
> +void hugetlb_unshare_all_pmds(struct vm_area_struct *vma);
>
>  #else /* !CONFIG_HUGETLB_PAGE */
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f53a0b852ed8..83c006ea3ff9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5723,4 +5723,55 @@ void __init hugetlb_cma_check(void)
>  	pr_warn("hugetlb_cma: the option isn't supported by current arch\n");
>  }
>
> +/*
> + * This function will unconditionally remove all the shared pmd pgtable entries
> + * within the specific vma for a hugetlbfs memory range.
> + */
> +void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
> +{
> +	struct hstate *h = hstate_vma(vma);
> +	unsigned long sz = huge_page_size(h);
> +	struct mm_struct *mm = vma->vm_mm;
> +	struct mmu_notifier_range range;
> +	unsigned long address, start, end;
> +	spinlock_t *ptl;
> +	pte_t *ptep;
> +
> +	if (!(vma->vm_flags & VM_MAYSHARE))
> +		return;
> +
> +	start = ALIGN(vma->vm_start, PUD_SIZE);
> +	end = ALIGN_DOWN(vma->vm_end, PUD_SIZE);
> +
> +	if (start >= end)
> +		return;
> +
> +	/*
> +	 * No need to call adjust_range_if_pmd_sharing_possible(), because
> +	 * we're going to operate on the whole vma
> +	 */
> +	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm,
> +				vma->vm_start, vma->vm_end);
> +	mmu_notifier_invalidate_range_start(&range);
> +	i_mmap_lock_write(vma->vm_file->f_mapping);
> +	for (address = start; address < end; address += PUD_SIZE) {
> +		unsigned long tmp = address;
> +
> +		ptep = huge_pte_offset(mm, address, sz);
> +		if (!ptep)
> +			continue;
> +		ptl = huge_pte_lock(h, mm, ptep);
> +		/* We don't want 'address' to be changed */
> +		huge_pmd_unshare(mm, vma, &tmp, ptep);
> +		spin_unlock(ptl);
> +	}
> +	flush_hugetlb_tlb_range(vma, vma->vm_start, vma->vm_end);
> +	i_mmap_unlock_write(vma->vm_file->f_mapping);
> +	/*
> +	 * No need to call mmu_notifier_invalidate_range(), see
> +	 * Documentation/vm/mmu_notifier.rst.
> +	 */
> +	mmu_notifier_invalidate_range_end(&range);
> +}
> +
>  #endif /* CONFIG_CMA */
> --
> 2.26.2
>
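
The undefined reference appears to come from where the new function landed. In the mm/hugetlb.c hunk above, hugetlb_unshare_all_pmds() is added just before "#endif /* CONFIG_CMA */", so its definition would be compiled out when CONFIG_CMA is not set, while the call in userfaultfd_register() stays. Below is a minimal userspace sketch of that pattern, with one possible fix applied (the definition moved outside the guard). DEMO_CONFIG_CMA, struct demo_vma, and every demo_* function are hypothetical stand-ins for illustration, not kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct; not the kernel type. */
struct demo_vma {
	int shared_pmds;	/* pretend count of shared PMD entries */
};

/* Declared unconditionally, so every caller can see the prototype. */
void demo_unshare_all_pmds(struct demo_vma *vma);

#ifdef DEMO_CONFIG_CMA
/*
 * Code that only makes sense with the option enabled lives here.
 * A definition of demo_unshare_all_pmds() placed in this region would
 * disappear when DEMO_CONFIG_CMA is unset, and the caller below would
 * then fail to link with an undefined reference.
 */
int demo_cma_only_helper(void)
{
	return 1;
}
#endif /* DEMO_CONFIG_CMA */

/*
 * Defined outside the guard, so the symbol exists whether or not
 * DEMO_CONFIG_CMA is set.
 */
void demo_unshare_all_pmds(struct demo_vma *vma)
{
	assert(vma != NULL);
	vma->shared_pmds = 0;
}

/* Caller that is built in every configuration. */
int demo_register(struct demo_vma *vma)
{
	demo_unshare_all_pmds(vma);
	return vma->shared_pmds;
}
```

With the definition outside the guard, the same demo_unshare_all_pmds() body is compiled with or without -DDEMO_CONFIG_CMA, so demo_register() resolves in both configurations.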