linux-mm.kvack.org archive mirror
From: Barry Song <21cnbao@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: khalid.aziz@oracle.com, Andrew Morton <akpm@linux-foundation.org>,
	 Arnd Bergmann <arnd@arndb.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	 David Hildenbrand <david@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	 Linux-MM <linux-mm@kvack.org>,
	longpeng2@huawei.com, Mike Rapoport <rppt@kernel.org>,
	 Suren Baghdasaryan <surenb@google.com>
Subject: Re: [RFC PATCH 0/6] Add support for shared PTEs across processes
Date: Fri, 21 Jan 2022 20:35:17 +1300	[thread overview]
Message-ID: <CAGsJ_4wv144TUSQPNOnHnmNmJrXe4Fn8d14JeAJ5ka-S+dRxRA@mail.gmail.com> (raw)
In-Reply-To: <YeoW4CMiU8qbRFST@casper.infradead.org>

On Fri, Jan 21, 2022 at 3:13 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, Jan 21, 2022 at 09:08:06AM +0800, Barry Song wrote:
> > > A file under /sys/fs/mshare can be opened and read from. A read from
> > > this file returns two long values - (1) starting address, and (2)
> > > size of the mshare'd region.
> > >
> > > --
> > > int mshare_unlink(char *name)
> > >
> > > A shared address range created by mshare() can be destroyed using
> > > mshare_unlink() which removes the shared named object. Once all
> > > processes have unmapped the shared object, the shared address range
> > > references are de-allocated and destroyed.
> >
> > > mshare_unlink() returns 0 on success or -1 on error.
> >
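For reference, a minimal sketch of what consuming that read interface
could look like. The object name "demo_region" is made up, and the exact
format (two binary longs, in that order) is an assumption based only on
the description quoted above:

/*
 * Hedged sketch: read an mshare'd region's start address and size
 * from its file under /sys/fs/mshare, per the description above.
 * The object name "demo_region" and the two-binary-longs layout
 * are assumptions, not confirmed by this thread.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	long info[2];	/* info[0] = start address, info[1] = size */
	int fd = open("/sys/fs/mshare/demo_region", O_RDONLY);

	if (fd < 0 || read(fd, info, sizeof(info)) != sizeof(info)) {
		perror("mshare region read");
		return 1;
	}
	printf("start = 0x%lx, size = %ld\n", info[0], info[1]);
	close(fd);
	return 0;
}
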
> > I am still struggling with the user scenarios for these new APIs. This patch
> > assumes multiple processes will have the same virtual address for the shared
> > area? How can this be guaranteed when different processes can map different
> > stacks, heaps, libraries, and files?
>
> The two processes choose to share a chunk of their address space.
> They can map anything they like in that shared area, and then also
> anything they like in the areas that aren't shared.  They can choose
> for that shared area to have the same address in both processes
> or different locations in each process.
>
> If two processes want to put a shared library in that shared address
> space, that should work.  They probably would need to agree to use
> the same virtual address for the shared page tables for that to work.

We depend on the ELF loader and ld.so to map libraries dynamically, so
there is hardly any chance in user code to call mshare() to map libraries
at the application level.

So are we supposed to modify some very low-level code to use this feature?

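To make the concern concrete, below is roughly what a user-level caller
would have to do. The mshare() prototype is an assumption based on the
cover letter of this series (only mshare_unlink() is quoted above), and
the name, address, and size are made up; both calls would really go
through syscall(2) with the syscall numbers this series adds:

/*
 * Hedged sketch: two cooperating processes run this same code and
 * agree out-of-band on the name, address, and size.  The mshare()
 * prototype here is assumed, and the extern declarations stand in
 * for syscall(2) wrappers that do not exist in any libc yet.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>

extern int mshare(char *name, void *addr, size_t length, int oflag,
		  mode_t mode);
extern int mshare_unlink(char *name);

#define SHARED_ADDR	((void *)0x400000000000UL)	/* agreed by all users */
#define SHARED_SIZE	(512UL * 1024 * 1024)		/* aligned, fixed size */

int main(void)
{
	/* Creates the region, or attaches to it if it already exists. */
	int fd = mshare("demo_region", SHARED_ADDR, SHARED_SIZE,
			O_RDWR | O_CREAT, 0600);
	if (fd < 0) {
		perror("mshare");
		return 1;
	}

	/* ... mmap() files or anonymous memory inside the region ... */

	/* Drop our reference; per the quoted description, the region is
	 * destroyed once all processes have unmapped it. */
	if (mshare_unlink("demo_region") < 0)
		perror("mshare_unlink");
	return 0;
}

Every process that wants the sharing has to be changed to issue these
calls with identical parameters, which is exactly the part that seems
hard to arrange for loader-mapped libraries.
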
>
> Processes should probably not put their stacks in the shared region.
> I mean, it could work, I suppose ... threads manage it in a single
> address space.  But I don't see why you'd want to do that.  For
> heaps, if you want the other process to be able to access the memory,
> I suppose you could put it in the shared region, but heaps aren't
> going to be put in the shared region by default.
>
> Think of this like hugetlbfs, only instead of sharing hugetlbfs
> memory, you can share _anything_ that's mmapable.

Yep, we can call mshare() on any kind of memory, for example when multiple
processes use SYSV shmem, POSIX shmem, or mmap the same file. But it seems
more sensible to let the kernel do this automatically rather than depending
on users calling mshare(). It is difficult for users to decide which areas
mshare() should be applied to; users might end up calling mshare() on every
shared area just to save the memory consumed by duplicated PTEs.

Unlike SYSV shmem and POSIX shmem, which are features for inter-process
communication, mshare() looks less like a feature for applications and more
like a feature for the whole system. Why would applications have to call
something that doesn't directly help them? Without mshare(), those
applications will still work without any problem, right? Is there anything
in mshare() which is a must-have for applications, or is mshare() only a
hint from applications, like madvise()?

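For illustration, the "mmap the same file" case would presumably look
something like the sketch below, layering the (assumed) mshare() call over
an ordinary POSIX shmem mapping; the names and the prototype are again
assumptions rather than anything confirmed by this series:

/*
 * Hedged sketch: give an ordinary POSIX shmem mapping shared PTEs by
 * placing it inside an mshare'd window.  mshare() prototype assumed
 * as before; "pte_window" and "/payload" are made-up names.
 */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

extern int mshare(char *name, void *addr, size_t length, int oflag,
		  mode_t mode);

#define WINDOW_ADDR	((void *)0x400000000000UL)
#define WINDOW_SIZE	(1UL << 30)	/* 1 GiB */

int main(void)
{
	/* 1. Declare the shared-PTE window (same args in every process). */
	if (mshare("pte_window", WINDOW_ADDR, WINDOW_SIZE,
		   O_RDWR | O_CREAT, 0600) < 0)
		return 1;

	/* 2. Map ordinary shared memory inside the window, so the page
	 *    tables, not just the pages, can be shared between users.  */
	int fd = shm_open("/payload", O_RDWR | O_CREAT, 0600);
	if (fd < 0 || ftruncate(fd, WINDOW_SIZE) < 0)
		return 1;
	void *p = mmap(WINDOW_ADDR, WINDOW_SIZE, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_FIXED, fd, 0);
	return p == MAP_FAILED;
}

Which again raises the question: every application has to be taught to do
step 1, even though step 2 alone already works today.
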
>
> > BTW, it seems you have different intention with the below?
> > Shared page tables during fork[1]
> > [1] https://lwn.net/Articles/861547/
>
> Yes, that's completely different.

Thanks for the clarification.

Best Regards,
Barry



Thread overview: 54+ messages
2022-01-18 21:19 [RFC PATCH 0/6] Add support for shared PTEs across processes Khalid Aziz
2022-01-18 21:19 ` [RFC PATCH 1/6] mm: Add new system calls mshare, mshare_unlink Khalid Aziz
2022-01-18 21:19 ` [RFC PATCH 2/6] mm: Add msharefs filesystem Khalid Aziz
2022-01-18 21:19 ` [RFC PATCH 3/6] mm: Add read for msharefs Khalid Aziz
2022-01-18 21:19 ` [RFC PATCH 4/6] mm: implement mshare_unlink syscall Khalid Aziz
2022-01-18 21:19 ` [RFC PATCH 5/6] mm: Add locking to msharefs syscalls Khalid Aziz
2022-01-18 21:19 ` [RFC PATCH 6/6] mm: Add basic page table sharing using mshare Khalid Aziz
2022-01-18 21:41 ` [RFC PATCH 0/6] Add support for shared PTEs across processes Dave Hansen
2022-01-18 21:46   ` Matthew Wilcox
2022-01-18 22:47     ` Khalid Aziz
2022-01-18 22:06 ` Dave Hansen
2022-01-18 22:52   ` Khalid Aziz
2022-01-19 11:38 ` Mark Hemment
2022-01-19 17:02   ` Khalid Aziz
2022-01-20 12:49     ` Mark Hemment
2022-01-20 19:15       ` Khalid Aziz
2022-01-24 15:15         ` Mark Hemment
2022-01-24 15:27           ` Matthew Wilcox
2022-01-24 22:20           ` Khalid Aziz
2022-01-21  1:08 ` Barry Song
2022-01-21  2:13   ` Matthew Wilcox
2022-01-21  7:35     ` Barry Song [this message]
2022-01-21 14:47       ` Matthew Wilcox
2022-01-21 16:41         ` Khalid Aziz
2022-01-22  1:39           ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2022-01-22  1:41             ` Matthew Wilcox
2022-01-22 10:18               ` Thomas Schoebel-Theuer
2022-01-22 16:09                 ` Matthew Wilcox
2022-01-22 11:31 ` Mike Rapoport
2022-01-22 18:29   ` Andy Lutomirski
2022-01-24 18:48   ` Khalid Aziz
2022-01-24 19:45     ` Andy Lutomirski
2022-01-24 22:30       ` Khalid Aziz
2022-01-24 23:16         ` Andy Lutomirski
2022-01-24 23:44           ` Khalid Aziz
2022-01-25 11:42 ` Kirill A. Shutemov
2022-01-25 12:09   ` William Kucharski
2022-01-25 13:18     ` David Hildenbrand
2022-01-25 14:01       ` Kirill A. Shutemov
2022-01-25 13:23   ` Matthew Wilcox
2022-01-25 13:59     ` Kirill A. Shutemov
2022-01-25 14:09       ` Matthew Wilcox
2022-01-25 18:57         ` Kirill A. Shutemov
2022-01-25 18:59           ` Matthew Wilcox
2022-01-26  4:04             ` Matthew Wilcox
2022-01-26 10:16               ` David Hildenbrand
2022-01-26 13:38                 ` Matthew Wilcox
2022-01-26 13:55                   ` David Hildenbrand
2022-01-26 14:12                     ` Matthew Wilcox
2022-01-26 14:30                       ` David Hildenbrand
2022-01-26 14:12                   ` Mike Rapoport
2022-01-26 13:42               ` Kirill A. Shutemov
2022-01-26 14:18                 ` Mike Rapoport
2022-01-26 17:33                   ` Khalid Aziz
