From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 504B4C43460 for ; Fri, 14 May 2021 18:31:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CD91661444 for ; Fri, 14 May 2021 18:31:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CD91661444 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1879C6B0036; Fri, 14 May 2021 14:31:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 15F4C6B006E; Fri, 14 May 2021 14:31:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EF4806B0070; Fri, 14 May 2021 14:31:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0198.hostedemail.com [216.40.44.198]) by kanga.kvack.org (Postfix) with ESMTP id BEAB86B0036 for ; Fri, 14 May 2021 14:31:16 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 5F72B18241964 for ; Fri, 14 May 2021 18:31:16 +0000 (UTC) X-FDA: 78140678952.13.3AD71A1 Received: from mail-io1-f50.google.com (mail-io1-f50.google.com [209.85.166.50]) by imf24.hostedemail.com (Postfix) with ESMTP id D9C27A0003A1 for ; Fri, 14 May 2021 18:31:14 +0000 (UTC) Received: by mail-io1-f50.google.com with SMTP id z24so28957291ioj.7 for ; Fri, 14 May 2021 11:31:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=SmJasCD4M03RbqThN96F5XZ2SVb5iXBWwVT406MBh74=; b=ZvB0aM1P2qW/Q5g9TY/7DfmVDDAuEW2BfIavLVuI6HskR7zYC0ne/v6eTW1r3NffDM aeQVjFjLo8V4xUfcbdQKydmkspw+EhjDtv5QGQV1Is5h58G8DIN+LENDsOkgNMcXxtnN hclhatRuD4+uUyFaaJxF+/A4NM55P2U3/Ela1LZE8FB/l6x9NWg52cuzyWj+sJE9rrK8 WZqscbD6J4WydwVFq+sbLMfhKSOxpaTuTdi/FvEk1c4ldEbAuYPjbXRpUOwjBz8jSWLA jmG8zHfiNESfih4cN/1i3FyHvouOyfwls1/ucXBqHjc8sfGe66D1KEW+/PaYScJiiXMn lIWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=SmJasCD4M03RbqThN96F5XZ2SVb5iXBWwVT406MBh74=; b=Q/ouZ34UKyHiqzgtf3fAG569/JmUwPHwwXYOmncfab1wy3MHWpohacsFahMeQIsA0j XTHTDkEL5E0YjxJpHGw5punWI01obWrO4JvCjyWCSGreVde32nlkHHCo2P246LU0K34s Ra8KBXy0PNtK4oVKa6YavsyFN8qfk5vJRfNgrc7+Gh0D8hzjEm8jH9oShrOtx85tdu+Q Mx+Zp8l72f9X8NxfjhErVCIBMVW7MWQamTUdf1d3zqJkTpuLRtUhvk4GO+xu0hflwlir SouppUXFi/myT6i81w2j/CyLdwQssUrLY7zOpCpxXbx4Wmo4oFJDQX7uAPy6QOsY/Mc8 uCwA== X-Gm-Message-State: AOAM533MvWRpEAZ+NMeLP8yTlkUklwIOPaNXxcT+aX5lr7HLt4tiLoSp 52c8SS12Ex2QvzoEKl/Zz/HTDXR8lEa6+rZ+YKU0iQ== X-Google-Smtp-Source: ABdhPJxuDhFQjQvwXfsfZotNnPUgWP7VLDTOTARGJSAuufoA2/hB6sE6jJtTAFvssvhu4xtsZy6H4x5a8Jis0d77iaY= X-Received: by 2002:a6b:c84f:: with SMTP id y76mr35976045iof.23.1621017075019; Fri, 14 May 2021 11:31:15 -0700 (PDT) MIME-Version: 1.0 References: <20210513234309.366727-1-almasrymina@google.com> <09dc0712-48e8-8ba2-f170-4c2febcfff83@oracle.com> In-Reply-To: From: Axel Rasmussen Date: Fri, 14 May 2021 11:30:39 -0700 Message-ID: Subject: Re: [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY To: Mike Kravetz Cc: Peter Xu , Mina Almasry , Linux-MM , Andrew Morton , open list Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: D9C27A0003A1 Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=google.com header.s=20161025 header.b=ZvB0aM1P; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf24.hostedemail.com: domain of axelrasmussen@google.com designates 209.85.166.50 as permitted sender) smtp.mailfrom=axelrasmussen@google.com X-Rspamd-Server: rspam03 X-Stat-Signature: d9o7f65qtp5dozdi41qyao5f1xpb4pz1 X-HE-Tag: 1621017074-8202 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: It's complicated and would take some more time for me to be certain, but after looking for half an hour or so this morning, I agree with Mike that such a race is possible. That is, we may back out into the retry path, and drop mmap_lock, and leave a situation where a page is in the cache, but we have !PageUptodate(). hugetlb_mcopy_atomic_pte clearly handles the VM_SHARED case, so I don't see a reason why there can't be another (non-userfaultfd-registered) mapping. If it were faulted at the right time, it seems like such a fault would indeed zero the page, and then the UFFDIO_COPY retry (once it acquired the lock again) would try to reuse it. On Fri, May 14, 2021 at 10:56 AM Mike Kravetz wrote: > > On 5/14/21 5:31 AM, Peter Xu wrote: > > Hi, Mike, > > > > On Thu, May 13, 2021 at 09:02:15PM -0700, Mike Kravetz wrote: > > > > [...] > > > >> I am also concerned with the semantics of this approach and what happens > >> when a fault races with the userfaultfd copy. Previously I asked Peter > >> if we could/should use a page found in the cache for the copy. His > >> answer was as follows: > >> > >> AFAICT that's the expected behavior, and it need to be like that so as to avoid > >> silent data corruption (if the page cache existed, it means the page is not > >> "missing" at all, then it does not suite for a UFFDIO_COPY as it's only used > >> for uffd page missing case). > > > > I didn't follow the rest discussion in depth yet... but just to mention that > > the above answer was for the question whether we can "update the page in the > > page cache", rather than "use a page found in the page cache". > > > > I think reuse the page should be fine, however it'll definitely break existing > > user interface (as it'll expect -EEXIST for now - we have kselftest covers > > that), meanwhile I don't see why the -EEXIST bothers a lot: it still tells the > > user that this page was filled in already. Normally it was filled in by > > another UFFDIO_COPY (as we could have multiple uffd service threads) along with > > a valid pte, then this userspace thread can simply skip this message as it > > means the event has been handled by some other servicing thread. > > > > (This also reminded me that there won't be a chance of UFFDIO_COPY race on page > > no page fault at least, since no page fault will always go into the uffd > > missing handling rather than filling in the page cache for a VM_UFFD_MISSING > > vma; while mmap read lock should guarantee VM_UFFD_MISSING be persistent) > > Perhaps I am missing something. > > Since this is a shared mapping, can we not have a 'regular' mapping to > the same range that is uffd registered? And, that regular mappings could > fault and race with the uffd copy code? > > -- > Mike Kravetz