From: Qi Zheng <zhengqi.arch@bytedance.com>
To: Chih-En Lin <shiyn.lin@gmail.com>, David Hildenbrand <david@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Christian Brauner <brauner@kernel.org>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>,
William Kucharski <william.kucharski@oracle.com>,
John Hubbard <jhubbard@nvidia.com>,
Yunsheng Lin <linyunsheng@huawei.com>,
Arnd Bergmann <arnd@arndb.de>,
Suren Baghdasaryan <surenb@google.com>,
Colin Cross <ccross@google.com>, Feng Tang <feng.tang@intel.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Mike Rapoport <rppt@kernel.org>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Daniel Axtens <dja@axtens.net>,
Jonathan Marek <jonathan@marek.ca>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Peter Xu <peterx@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Andy Lutomirski <luto@kernel.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Fenghua Yu <fenghua.yu@intel.com>,
linux-kernel@vger.kernel.org, Kaiyang Zhao <zhao776@purdue.edu>,
Huichun Feng <foxhoundsk.tw@gmail.com>,
Jim Huang <jserv.tw@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: Re: [External] [RFC PATCH 0/6] Introduce Copy-On-Write to Page Table
Date: Sat, 21 May 2022 16:59:19 +0800
Message-ID: <f9ccf33b-c81c-6b25-6471-80c600f06732@bytedance.com>
In-Reply-To: <20220519183127.3909598-1-shiyn.lin@gmail.com>
On 2022/5/20 2:31 AM, Chih-En Lin wrote:
> When creating the user process, it usually uses the Copy-On-Write (COW)
> mechanism to save the memory usage and the cost of time for copying.
> COW defers the work of copying private memory and shares it across the
> processes as read-only. If either process wants to write in these
> memories, it will page fault and copy the shared memory, so the process
> will now get its private memory right here, which is called break COW.
>
> Presently this kind of technology is only used as the mapping memory.
> It still needs to copy the entire page table from the parent.
> It might cost a lot of time and memory to copy each page table when the
> parent already has a lot of page tables allocated. For example, here is
> the state table for mapping the 1 GB memory of forking.
>
>                 mmap before fork    mmap after fork
> MemTotal:          32746776 kB        32746776 kB
> MemFree:           31468152 kB        31463244 kB
> AnonPages:          1073836 kB         1073628 kB
> Mapped:               39520 kB           39992 kB
> PageTables:            3356 kB            5432 kB
>
> This patch introduces Copy-On-Write to the page table. This patch only
> implements the COW on the PTE level. It's based on the paper
> On-Demand Fork [1]. Summary of the implementation for the paper:
>
> - Only implements the COW to the anonymous mapping
> - Only do COW to the PTE table which the range is all covered by a
> single VMA.
> - Use the reference count to control the COW PTE table lifetime.
> Decrease the counter when breaking COW or dereference the COW PTE
> table. When the counter reduces to zero, free the PTE table.
>
Hi,

To reduce the number of empty user PTE tables, I also introduced a
reference count (pte_ref) for user PTE tables in my patches [1][2]. It
tracks the usage of each user PTE table.

The following hold a pte_ref:
- Any !pte_none() entry, such as a regular page table entry that maps
  a physical page, a swap entry, a migration entry, etc.
- Any visitor to the PTE page table entries, such as a page table walker.

With COW PTE, a new holder (the process using the COW PTE table) is added.
Interestingly, this gives pte_ref more meaning than I had originally seen.
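
The three kinds of holders can be illustrated with a minimal user-space
sketch. This is only a model under stated assumptions, not the code from
either patch series: struct pte_table, pte_table_get(), pte_table_put()
and model_set_pte() are hypothetical names, the count is a plain integer
rather than an atomic or percpu_ref, and "freeing" is recorded in a flag
so the result can be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTRS_PER_PTE 512 /* entries per PTE table; x86-64 value, assumed */

/* Hypothetical model of a user PTE table; not the kernel's layout. */
struct pte_table {
	uint64_t entry[PTRS_PER_PTE]; /* 0 models pte_none() */
	long pte_ref;                 /* one count per holder */
	bool freed;                   /* set when the last holder goes away */
};

/* Take a reference: used by walkers and by COW sharers in this model. */
static void pte_table_get(struct pte_table *t)
{
	t->pte_ref++;
}

/* Drop a reference; the table is "freed" when the count reaches zero. */
static void pte_table_put(struct pte_table *t)
{
	assert(t->pte_ref > 0);
	if (--t->pte_ref == 0)
		t->freed = true;
}

/*
 * Write one entry. A transition from none to !none takes a reference,
 * and a transition from !none to none drops it, so every populated
 * entry holds exactly one pte_ref.
 */
static void model_set_pte(struct pte_table *t, unsigned int idx, uint64_t val)
{
	assert(idx < PTRS_PER_PTE);

	bool was_none = (t->entry[idx] == 0);
	bool is_none = (val == 0);

	t->entry[idx] = val;
	if (was_none && !is_none)
		pte_table_get(t);
	else if (!was_none && is_none)
		pte_table_put(t);
}
```

In this model a walker brackets its access with pte_table_get() and
pte_table_put(), and a process sharing the table through COW PTE simply
takes one more reference with pte_table_get(), dropping it when it breaks
COW or unmaps the range. The table goes away only after the last entry,
walker and sharer have all let go.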
Thanks,
Qi
[1] [RFC PATCH 00/18] Try to free user PTE page table pages (percpu_ref version)
    https://lore.kernel.org/lkml/20220429133552.33768-1-zhengqi.arch@bytedance.com/
[2] [PATCH v3 00/15] Free user PTE page table pages (atomic count version)
    https://lore.kernel.org/lkml/20211110105428.32458-1-zhengqi.arch@bytedance.com/