Date: Fri, 29 Jan 2021 17:49:38 -0500
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, Jerome Glisse, "Kirill A . Shutemov",
 Hugh Dickins, Axel Rasmussen, Matthew Wilcox, Andrew Morton,
 Andrea Arcangeli, Nadav Amit
Subject: Re: [PATCH RFC 00/30] userfaultfd-wp: Support shmem and hugetlbfs
Message-ID: <20210129224938.GC260413@xz-x1>
References: <20210115170907.24498-1-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>

On Fri, Jan 15, 2021 at 12:08:37PM -0500, Peter Xu wrote:
> This is a RFC series to support userfaultfd upon shmem and hugetlbfs.
>
> PS. Note that there's a known issue [0] with tlb against uffd-wp/soft-dirty in
> general and Nadav is working on it.  It may or may not directly affect
> shmem/hugetlbfs since there're no COW on shared mappings normally.
> Private shmem could hit, but still that's another problem to solve in
> general, and this RFC is majorly to see whether there's any objection on the
> concept of the idea specific to uffd-wp on shmem/hugetlbfs.
>
> The whole series can also be found online [1].
>
> The major comment I'd like to get is on the new idea of swap special pte.  That
> comes from suggestions from both Hugh and Andrea and I appreciated a lot for
> those discussions.
>
> In short, it's a new type of pte that doesn't exist in the past, while used in
> file-backed memories to persist information across ptes being erased (but the
> page cache could still exist, for example, so in the next page fault we can
> reload the page cache with that specific information when necessary).
>
> I'm copy-pasting some commit message from the patch "mm/swap: Introduce the
> idea of special swap ptes", where uffd-wp becomes the first user of it:
>
>     We used to have special swap entries, like migration entries, hw-poison
>     entries, device private entries, etc.
>
>     Those "special swap entries" reside in the range that they need to be at least
>     swap entries first, and their types are decided by swp_type(entry).
>
>     This patch introduces another idea called "special swap ptes".
>
>     It's very easy to get confused against "special swap entries", but a special
>     swap pte should never contain a swap entry at all.  It means, it's illegal to
>     call pte_to_swp_entry() upon a special swap pte.
>
>     Make the uffd-wp special pte to be the first special swap pte.
>
>     Before this patch, is_swap_pte()==true means one of the below:
>
>        (a.1) The pte has a normal swap entry (non_swap_entry()==false).  For
>              example, when an anonymous page got swapped out.
>
>        (a.2) The pte has a special swap entry (non_swap_entry()==true).  For
>              example, a migration entry, a hw-poison entry, etc.
>
>     After this patch, is_swap_pte()==true means one of the below, where case (b) is
>     added:
>
>      (a) The pte contains a swap entry.
>
>        (a.1) The pte has a normal swap entry (non_swap_entry()==false).  For
>              example, when an anonymous page got swapped out.
>
>        (a.2) The pte has a special swap entry (non_swap_entry()==true).  For
>              example, a migration entry, a hw-poison entry, etc.
>
>      (b) The pte does not contain a swap entry at all (so it cannot be passed
>          into pte_to_swp_entry()).  For example, uffd-wp special swap pte.
>
> Hugetlbfs needs similar thing because it's also file-backed.  I directly reused
> the same special pte there, though the shmem/hugetlb change on supporting this
> new pte is different since they don't share code path a lot.

Hugh & Mike,

Would any of you have comments or concerns about the high-level design of this
series?

It would be great to hear them, especially any major objections, before I move
on to a non-RFC version.

Thanks,

-- 
Peter Xu
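
For readers skimming the thread, the is_swap_pte() cases (a.1), (a.2) and (b)
quoted above can be modelled in a few lines of user-space C.  This is only a
toy sketch under stated assumptions: every toy_* name, the enum encoding and
the struct layout are hypothetical and are not the kernel's pte format or the
helpers added by the series.  It shows just the ordering rule from the commit
message: test for the special swap pte first, and decode a swap entry only
when the pte really contains one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these kinds are hypothetical, not the kernel's encoding. */
enum toy_pte_kind {
	TOY_PTE_NONE,           /* empty pte */
	TOY_PTE_PRESENT,        /* maps a page */
	TOY_PTE_SWAP_ENTRY,     /* (a.1) normal swap entry */
	TOY_PTE_SPECIAL_ENTRY,  /* (a.2) e.g. migration or hw-poison entry */
	TOY_PTE_UFFD_WP_MARKER, /* (b) special swap pte, holds no swap entry */
};

enum toy_swap_class {
	TOY_CLASS_NOT_SWAP,
	TOY_CLASS_NORMAL_SWAP_ENTRY,  /* (a.1) */
	TOY_CLASS_SPECIAL_SWAP_ENTRY, /* (a.2) */
	TOY_CLASS_SPECIAL_SWAP_PTE,   /* (b) */
};

struct toy_pte {
	enum toy_pte_kind kind;
	unsigned long entry; /* meaningful only for cases (a.1) and (a.2) */
};

/* Mirrors the post-patch meaning: not none and not present. */
static inline bool toy_is_swap_pte(struct toy_pte pte)
{
	return pte.kind != TOY_PTE_NONE && pte.kind != TOY_PTE_PRESENT;
}

/* Case (b): a swap pte that carries no swap entry at all. */
static inline bool toy_is_swap_special_pte(struct toy_pte pte)
{
	return pte.kind == TOY_PTE_UFFD_WP_MARKER;
}

/* Decoding is legal only for case (a); the assert models "illegal for (b)". */
static inline unsigned long toy_pte_to_swp_entry(struct toy_pte pte)
{
	assert(toy_is_swap_pte(pte) && !toy_is_swap_special_pte(pte));
	return pte.entry;
}

static inline enum toy_swap_class toy_classify(struct toy_pte pte)
{
	if (!toy_is_swap_pte(pte))
		return TOY_CLASS_NOT_SWAP;
	/* Check (b) before anything that would decode a swap entry. */
	if (toy_is_swap_special_pte(pte))
		return TOY_CLASS_SPECIAL_SWAP_PTE;
	return pte.kind == TOY_PTE_SPECIAL_ENTRY ?
		TOY_CLASS_SPECIAL_SWAP_ENTRY : TOY_CLASS_NORMAL_SWAP_ENTRY;
}
```

In this model, a caller that goes straight from toy_is_swap_pte() to
toy_pte_to_swp_entry() trips the assert on a case (b) pte, which is the kind
of call site the quoted commit message describes as illegal.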