linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Matthew Wilcox <willy@infradead.org>,
	"Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Khalid Aziz <khalid.aziz@oracle.com>,
	akpm@linux-foundation.org, longpeng2@huawei.com, arnd@arndb.de,
	dave.hansen@linux.intel.com, rppt@kernel.org, surenb@google.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Peter Xu <peterx@redhat.com>
Subject: Re: [RFC PATCH 0/6] Add support for shared PTEs across processes
Date: Wed, 26 Jan 2022 11:16:42 +0100
Message-ID: <e164d7f4-406e-eed8-37d7-753f790b7560@redhat.com>
In-Reply-To: <YfDIYKygRHX4RIri@casper.infradead.org>

On 26.01.22 05:04, Matthew Wilcox wrote:
> On Tue, Jan 25, 2022 at 06:59:50PM +0000, Matthew Wilcox wrote:
>> On Tue, Jan 25, 2022 at 09:57:05PM +0300, Kirill A. Shutemov wrote:
>>> On Tue, Jan 25, 2022 at 02:09:47PM +0000, Matthew Wilcox wrote:
>>>>> I think zero-API approach (plus madvise() hints to tweak it) is worth
>>>>> considering.
>>>>
>>>> I think the zero-API approach actually misses out on a lot of
>>>> possibilities that the mshare() approach offers.  For example, mshare()
>>>> allows you to mmap() many small files in the shared region -- you
>>>> can't do that with zeroAPI.
>>>
>>> Do you consider a use-case for many small files to be common? I would
>>> think that the main consumer of the feature to be mmap of huge files.
>>> And in this case zero enabling burden on userspace side sounds like a
>>> sweet deal.
>>
>> mmap() of huge files is certainly the Oracle use-case.  With occasional
>> funny business like mprotect() of a single page in the middle of a 1GB
>> hugepage.
> 
> Bill and I were talking about this earlier and realised that this is
> the key point.  There's a requirement that when one process mprotects
> a page that it gets protected in all processes.  You can't do that
> without *some* API because that's different behaviour than any existing
> API would produce.
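
To make the quoted point concrete, here is a minimal sketch of today's
behaviour: mprotect() only changes the caller's own mapping, so another
process sharing the same page can keep writing to it. The function name
is made up for illustration, and error paths skip cleanup for brevity.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the parent could still write to a MAP_SHARED page while the
 * child held its own mapping of that page read-only (and the child saw the
 * write), 0 if not, -1 on setup failure.
 * Hypothetical helper, for illustration only.
 */
int mprotect_is_per_process(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    int to_parent[2], to_child[2], status = 0;
    char token = 0;

    if (psz <= 0)
        return -1;

    char *page = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    if (pipe(to_parent) < 0 || pipe(to_child) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(to_parent[0]);
        close(to_child[1]);
        /* Child: make only *this* process's mapping read-only. */
        if (mprotect(page, (size_t)psz, PROT_READ) != 0)
            _exit(2);
        /* Tell the parent the protection is in place. */
        if (write(to_parent[1], &token, 1) != 1)
            _exit(2);
        /* Wait for the parent's store, then check that it is visible. */
        if (read(to_child[0], &token, 1) != 1)
            _exit(2);
        _exit(page[0] == 'x' ? 0 : 1);
    }

    close(to_parent[1]);
    close(to_child[0]);

    /* Parent: wait until the child has called mprotect(). */
    int ok = read(to_parent[0], &token, 1) == 1;
    if (ok) {
        /* This store would fault if the child's mprotect() applied here. */
        page[0] = 'x';
        ok = write(to_child[1], &token, 1) == 1;
    }
    close(to_parent[0]);
    close(to_child[1]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    munmap(page, (size_t)psz);
    return ok && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling mprotect_is_per_process() from a small main() should return 1 on
Linux: the parent's store succeeds even though the child has the same
page mapped read-only.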

A while ago I talked with Peter about an extended userfaultfd (uffd)
mechanism, here in write-protect (WP) mode, that would work on fds
instead of on a process address space.

The rough idea would be to register the uffd handler (or whatever it
would end up being called) on an fd instead of on the virtual address
space of a single process, and to write-protect pages in that fd.
Whenever anybody tries to write to such a protected range (write, mmap,
...), the uffd handler would fire and user space could handle the event
(-> unprotect). The page cache would have to remember the uffd
information ("wp using uffd"). When (un)protecting pages using this
mechanism, all page tables mapping the page would have to be updated
accordingly using the rmap. At that point, we wouldn't care whether it's
a single page table (e.g., shared, similar to hugetlb) or simply
multiple page tables.
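
As a purely illustrative toy model of that flow (plain user-space C, not
kernel code and not an existing uffd interface; every name below is
invented for this sketch): the file keeps a per-page "wp using uffd"
bit, (un)protecting a page walks a stand-in for the rmap to update every
page table that maps it, and both a store through a mapping and a
write()-style access through the fd report an event on a protected page.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES  4   /* pages in the toy file */
#define TOY_MAX_MMS 4   /* address spaces that may map it */

enum toy_result { TOY_OK = 0, TOY_WP_EVENT = 1, TOY_ERR = -1 };

/* One process's view: a writable bit per page, standing in for its PTEs. */
struct toy_mm {
    bool pte_writable[TOY_NPAGES];
};

/* The file: per-page WP state plus a reverse map of everyone mapping it. */
struct toy_file {
    bool wp[TOY_NPAGES];              /* "wp using uffd", kept with the page cache */
    struct toy_mm *rmap[TOY_MAX_MMS]; /* stand-in for the rmap */
    size_t nr_mms;
};

/* Map the whole file into mm; new PTEs inherit the file's current WP state. */
enum toy_result toy_mmap(struct toy_file *f, struct toy_mm *mm)
{
    if (f->nr_mms == TOY_MAX_MMS)
        return TOY_ERR;
    for (size_t i = 0; i < TOY_NPAGES; i++)
        mm->pte_writable[i] = !f->wp[i];
    f->rmap[f->nr_mms++] = mm;
    return TOY_OK;
}

/* (Un)protect one page: record it in the file, then fix up every mapping. */
enum toy_result toy_set_wp(struct toy_file *f, size_t idx, bool protect)
{
    if (idx >= TOY_NPAGES)
        return TOY_ERR;
    f->wp[idx] = protect;
    for (size_t m = 0; m < f->nr_mms; m++)
        f->rmap[m]->pte_writable[idx] = !protect;
    return TOY_OK;
}

/* Store through a mapping: a non-writable PTE yields a WP event. */
enum toy_result toy_store(const struct toy_mm *mm, size_t idx)
{
    if (idx >= TOY_NPAGES)
        return TOY_ERR;
    return mm->pte_writable[idx] ? TOY_OK : TOY_WP_EVENT;
}

/* write()-style access through the fd: no page table, so check the file. */
enum toy_result toy_fd_write(const struct toy_file *f, size_t idx)
{
    if (idx >= TOY_NPAGES)
        return TOY_ERR;
    return f->wp[idx] ? TOY_WP_EVENT : TOY_OK;
}
```

With two toy_mm instances mapped via toy_mmap(), a single
toy_set_wp(&f, 1, true) makes toy_store() on page 1 report TOY_WP_EVENT
in both of them, and toy_fd_write() on the same page does too. The model
always walks one toy_mm per mapper; with a genuinely shared page table
the walk would simply touch fewer tables.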

It's a very rough idea; I just wanted to mention it.

-- 
Thanks,

David / dhildenb



