From: Nadav Amit <nadav.amit@gmail.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Nadav Amit <namit@vmware.com>
Subject: [RESEND PATCH v3 0/5] mm/mprotect: avoid unnecessary TLB flushes
Date: Fri, 11 Mar 2022 11:07:44 -0800 [thread overview]
Message-ID: <20220311190749.338281-1-namit@vmware.com> (raw)
From: Nadav Amit <namit@vmware.com>
This patch-set is intended to remove unnecessary TLB flushes during
mprotect() syscalls. Once this patch-set makes it through, similar
and further optimizations for MADV_COLD and userfaultfd would be
possible.

Sorry for the time it took me to get to v3.
Basically, there are 3 optimizations in this patch-set:
1. Use TLB batching infrastructure to batch flushes across VMAs and
do better/fewer flushes. This would also be handy for later
userfaultfd enhancements.
2. Avoid TLB flushes on permission demotion. This optimization is
the one that provides most of the performance benefits. Note that
the previous batching infrastructure changes are needed for that to
happen.
3. Avoid TLB flushes on change_huge_pmd() that are only needed to
prevent the A/D bits from changing.
Andrew asked for some benchmark numbers. I do not have a
deterministic macrobenchmark in which it is easy to show the benefit.
I therefore ran a microbenchmark: a loop that does the following on
anonymous memory, just as a sanity check to see that time is saved by
avoiding TLB flushes. The loop goes:
mprotect(p, PAGE_SIZE, PROT_READ)
mprotect(p, PAGE_SIZE, PROT_READ|PROT_WRITE)
*p = 0; // make the page writable
The test was run in a KVM guest with 1 or 2 threads (the second
thread was busy-looping). I measured the time (in cycles) of each
operation:
                      1 thread              2 threads
                  mmots    +patch       mmots    +patch
PROT_READ          3494    2725 (-22%)   8630    7788 (-10%)
PROT_READ|WRITE    3952    2724 (-31%)   9075    2865 (-68%)
[ mmots = v5.17-rc6-mmots-2022-03-06-20-38 ]
The exact numbers are really meaningless, but the benefit is clear.
There are 2 interesting results though:

(1) mprotect(PROT_READ) becomes cheaper, although one might expect it
to be unaffected. This is presumably due to the TLB miss that is
avoided.

(2) Without the memory access (*p = 0), the speedup from the patch is
even greater. In that scenario mprotect(PROT_READ) also avoids the
TLB flush. As a result, both operations on the patched kernel take
roughly 1500 cycles (with either 1 or 2 threads), whereas on mmots
their cost is as high as presented in the table.
--
v2 -> v3:
* Fix patch ordering (previous order could lead to breakage)
* Better comments
* Clearer KNL detection [Dave]
* Assertion on PF error-code [Dave]
* Comments, code, function names improvements [PeterZ]
* Flush on access-bit clearing on PMD changes to follow the way
flushing on x86 is done today in the kernel.
v1 -> v2:
* Fix wrong detection of permission demotion [Andrea]
* Better comments [Andrea]
* Handle THP [Andrea]
* Batching across VMAs [Peter Xu]
* Avoid open-coding PTE analysis
* Fix wrong use of the mmu_gather()
Nadav Amit (5):
x86: Detection of Knights Landing A/D leak
x86/mm: check exec permissions on fault
mm/mprotect: use mmu_gather
mm/mprotect: do not flush on permission promotion
mm: avoid unnecessary flush on change_huge_pmd()
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/pgtable.h | 5 ++
arch/x86/include/asm/pgtable_types.h | 2 +
arch/x86/include/asm/tlbflush.h | 82 ++++++++++++++++++++++++
arch/x86/kernel/cpu/intel.c | 5 ++
arch/x86/mm/fault.c | 22 ++++++-
arch/x86/mm/pgtable.c | 10 +++
fs/exec.c | 6 +-
include/asm-generic/tlb.h | 14 +++++
include/linux/huge_mm.h | 5 +-
include/linux/mm.h | 5 +-
include/linux/pgtable.h | 20 ++++++
mm/huge_memory.c | 19 ++++--
mm/mprotect.c | 94 +++++++++++++++-------------
mm/pgtable-generic.c | 8 +++
mm/userfaultfd.c | 6 +-
16 files changed, 248 insertions(+), 56 deletions(-)
--
2.25.1
Thread overview: 14+ messages
2022-03-11 19:07 Nadav Amit [this message]
2022-03-11 19:07 ` [RESEND PATCH v3 1/5] x86: Detection of Knights Landing A/D leak Nadav Amit
2022-03-11 19:07 ` [RESEND PATCH v3 2/5] x86/mm: check exec permissions on fault Nadav Amit
2022-03-11 19:41 ` Dave Hansen
2022-03-11 20:38 ` Nadav Amit
2022-03-11 20:59 ` Dave Hansen
2022-03-11 21:16 ` Nadav Amit
2022-03-11 21:23 ` Dave Hansen
2022-03-11 19:07 ` [RESEND PATCH v3 3/5] mm/mprotect: use mmu_gather Nadav Amit
2022-03-11 19:07 ` [RESEND PATCH v3 4/5] mm/mprotect: do not flush on permission promotion Nadav Amit
2022-03-11 22:45 ` Nadav Amit
2022-03-11 19:07 ` [RESEND PATCH v3 5/5] mm: avoid unnecessary flush on change_huge_pmd() Nadav Amit
2022-03-11 20:41 ` Dave Hansen
2022-03-11 20:53 ` Nadav Amit