From: Huang Shijie <firstname.lastname@example.org>
To: Linus Torvalds <email@example.com>
Cc: Al Viro <firstname.lastname@example.org>,
	Shijie Huang <email@example.com>,
	Andrew Morton <firstname.lastname@example.org>,
	Linux-MM <email@example.com>,
	"Song Bao Hua (Barry Song)" <firstname.lastname@example.org>,
	Linux Kernel Mailing List <email@example.com>,
	Frank Wang <firstname.lastname@example.org>
Subject: Re: Is it possible to implement the per-node page cache for programs/libraries?
Date: Thu, 2 Sep 2021 10:08:06 +0000
Message-ID: <YTCihsPZL0HtO2lp@hsj>
In-Reply-To: <CAHk-=wjAPEs3HRGswJ-AE1R048j2MBsBtMfg3GOsaFykHoeKsg@mail.gmail.com>

Hi Linus,

On Wed, Sep 01, 2021 at 10:29:01AM -0700, Linus Torvalds wrote:
> On Wed, Sep 1, 2021 at 10:24 AM Linus Torvalds
> <email@example.com> wrote:
> >
> > But what you could do, if you wanted to, would be to catch the
> > situation where you have lots of expensive NUMA accesses either using
> > our VM infrastructure or performance counters, and when the mapping is
> > a MAP_PRIVATE you just do a COW fault on them.
> >
> > Sounds entirely doable, and has absolutely nothing to do with the page
> > cache. It would literally just be an "over-eager COW fault triggered
> > by NUMA access counters".

Yes. You are right, we can use COW. :)

Actually we have _TWO_ levels at which to optimize NUMA remote access:

  1.) the page cache, which is independent of any process.
  2.) the process address space (page tables).

For 2.), we can use the over-eager COW:

  2.1) I have finished a user-space patch for glibc which uses
       "over-eager COW" to do text replication across NUMA nodes.
  2.2) There is also a kernel patch which uses "over-eager COW" to
       replicate the program text itself across NUMA nodes.
       (We may defer that to another topic..)

> Note how it would work perfectly fine for anonymous mappings too. Just
> to reinforce the point that this has nothing to do with any page cache
> issues.
> Of course, if you want to actually then *share* pages within a node
> (rather than replicate them for each process), that gets more
> exciting.

Do we really need to change the page cache?

The 2.1) above may produce one copy of the shared library pages for
each process, e.g. for glibc.so. Even within the same NUMA node 0, we
may run two instances of the same process, so it produces "two
glibc.so" copies now. If we run 5 of the same process in NUMA node 0,
it will produce "five glibc.so" copies.

But if we have a per-node page cache for glibc.so, we can do it like this:

  (1) Disable the "over-eager COW" in the process.
  (2) Map the per-node page cache's pages into the different processes
      in the _SAME_ NUMA node. So all the processes in the same NUMA
      node can share the same pages.
  (3) Processes in other NUMA nodes use the pages belonging to their
      own node.

In this way, we can save many pages and get faster access on NUMA
systems.

Thanks
Huang Shijie