From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	David Hildenbrand <david@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Dave Chinner <david@fromorbit.com>,
	Nadav Amit <namit@vmware.com>,
	Peter Xu <peterx@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Nicholas Piggin <npiggin@gmail.com>,
	Mike Rapoport <rppt@kernel.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>
Subject: [PATCH RFC 0/5] mm/autonuma: replace savedwrite infrastructure
Date: Mon, 26 Sep 2022 17:26:13 +0200
Message-ID: <20220926152618.194810-1-david@redhat.com>

As discussed in my talk at LPC, we can reuse the same mechanism for
deciding whether to map a pte writable when upgrading permissions via
mprotect() -- e.g., PROT_READ -> PROT_READ|PROT_WRITE -- to replace the
savedwrite infrastructure used for NUMA hinting faults (e.g., PROT_NONE
-> PROT_READ|PROT_WRITE).

Instead of maintaining previous write permissions for a pte/pmd, we
re-determine if the pte/pmd can be writable. The big benefit is that we
have common logic for deciding whether we can map a pte/pmd writable on
protection changes.

For private mappings, there should be no difference -- from what I
understand, that is what autonuma benchmarks care about.

I ran autonumabench on a system with 2 NUMA nodes, 96 GiB each via:
	perf stat --null --repeat 10
The numa1 benchmark is quite noisy in my environment. I suspect that
there is no actual change in performance, even though the numbers
indicate that this series might improve performance slightly.

numa1:
	mm-stable:   156.75 +- 11.67 seconds time elapsed  ( +-  7.44% )
	mm-stable++: 147.50 +- 9.35 seconds time elapsed  ( +-  6.34% )

numa2:
	mm-stable:   15.9834 +- 0.0589 seconds time elapsed  ( +-  0.37% )
	mm-stable++: 16.1467 +- 0.0946 seconds time elapsed  ( +-  0.59% )

It is worth noting that for shared writable mappings that require
writenotify, we will only avoid write faults if the pte/pmd is dirty
(inherited from the older mprotect logic). If we ever care about
optimizing that further, we'd need a different mechanism to identify
whether the FS still needs to get notified on the next write access.
In any case, such an optimization would then not be autonuma-specific;
mprotect() permission upgrades would similarly benefit from it.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>

David Hildenbrand (4):
  mm/mprotect: minor can_change_pte_writable() cleanups
  mm/huge_memory: try avoiding write faults when changing PMD protection
  mm/autonuma: use can_change_(pte|pmd)_writable() to replace savedwrite
  mm: remove unused savedwrite infrastructure

Nadav Amit (1):
  mm/mprotect: allow clean exclusive anon pages to be writable

 arch/powerpc/include/asm/book3s/64/pgtable.h | 80 +-------------------
 arch/powerpc/kvm/book3s_hv_rm_mmu.c          |  2 +-
 include/linux/mm.h                           |  2 +
 include/linux/pgtable.h                      | 24 ------
 mm/debug_vm_pgtable.c                        | 32 --------
 mm/huge_memory.c                             | 66 ++++++++++++----
 mm/ksm.c                                     |  9 +--
 mm/memory.c                                  | 19 ++++-
 mm/mprotect.c                                | 23 +++---
 9 files changed, 93 insertions(+), 164 deletions(-)

-- 
2.37.3
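The decision described in the cover letter (re-determine writability instead of remembering it) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's can_change_pte_writable(): the struct names, fields, and model_can_map_writable() are hypothetical stand-ins, and the treatment of shared mappings that do not need writenotify is an assumption made for the example. It covers just the cases the letter names: private mappings with exclusive anonymous pages (clean or dirty, per the title of patch 1/5), and shared writenotify mappings, which only skip the write fault when the entry is already dirty.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel state; not kernel types. */
struct model_vma {
	bool allows_write;      /* mapping permits writes at all */
	bool shared;            /* shared (true) or private (false) mapping */
	bool wants_writenotify; /* FS must see the first write to a clean page */
};

struct model_pte {
	bool dirty;             /* entry is already marked dirty */
	bool anon_exclusive;    /* anonymous page owned exclusively by this process */
};

/*
 * Model of "can this entry be mapped writable right now?", evaluated at
 * protection-change time instead of being restored from a saved bit.
 */
static bool model_can_map_writable(const struct model_vma *vma,
				   const struct model_pte *pte)
{
	if (!vma->allows_write)
		return false;

	/* Private mapping: only exclusive anonymous pages, clean or dirty. */
	if (!vma->shared)
		return pte->anon_exclusive;

	/* Shared + writenotify: a clean entry still needs a real write fault. */
	if (vma->wants_writenotify)
		return pte->dirty;

	/* Assumption of this model: other shared mappings need no write fault. */
	return true;
}
```

In this model, both the mprotect() upgrade path and the NUMA hinting fault path would call the same predicate, which is the "common logic" the letter refers to; nothing about the previous write permission has to be stored in the entry.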
Thread overview (12+ messages):

2022-09-26 15:26 [PATCH RFC 0/5] mm/autonuma: replace savedwrite infrastructure David Hildenbrand [this message]
2022-09-26 15:26 ` [PATCH RFC 1/5] mm/mprotect: allow clean exclusive anon pages to be writable David Hildenbrand
2022-09-26 15:26 ` [PATCH RFC 2/5] mm/mprotect: minor can_change_pte_writable() cleanups David Hildenbrand
2022-09-26 15:26 ` [PATCH RFC 3/5] mm/huge_memory: try avoiding write faults when changing PMD protection David Hildenbrand
2022-09-26 15:26 ` [PATCH RFC 4/5] mm/autonuma: use can_change_(pte|pmd)_writable() to replace savedwrite David Hildenbrand
2022-09-26 15:26 ` [PATCH RFC 5/5] mm: remove unused savedwrite infrastructure David Hildenbrand