All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nadav Amit <namit@vmware.com>
To: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Peter Xu <peterx@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Andy Lutomirski <luto@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>, Yu Zhao <yuzhao@google.com>,
	Nick Piggin <npiggin@gmail.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 2/2] mm/mprotect: do not flush on permission promotion
Date: Fri, 8 Oct 2021 06:06:34 +0000	[thread overview]
Message-ID: <DA49DBBB-FFEE-4ACC-BB6C-364D07533C5E@vmware.com> (raw)
In-Reply-To: <1952fc7c-fb21-7d0e-661b-afa59b4580e5@redhat.com>



> On Oct 7, 2021, at 10:07 AM, David Hildenbrand <david@redhat.com> wrote:
> 
> On 07.10.21 18:16, Nadav Amit wrote:
>>> On Oct 7, 2021, at 5:13 AM, David Hildenbrand <david@redhat.com> wrote:
>>> 
>>> On 25.09.21 22:54, Nadav Amit wrote:
>>>> From: Nadav Amit <namit@vmware.com>
>>>> Currently, using mprotect() to unprotect a memory region or uffd to
>>>> unprotect a memory region causes a TLB flush. At least on x86, as
>>>> protection is promoted, no TLB flush is needed.
>>>> Add an arch-specific pte_may_need_flush() which tells whether a TLB
>>>> flush is needed based on the old PTE and the new one. Implement an x86
>>>> pte_may_need_flush().
>>>> For x86, PTE protection promotion or changes of software bits does
>>>> require a flush, also add logic that considers the dirty-bit. Changes to
>>>> the access-bit do not trigger a TLB flush, although architecturally they
>>>> should, as Linux considers the access-bit as a hint.
>>> 
>>> Is the added LOC worth the benefit? IOW, do we have some benchmark that really benefits from that?
>> So you ask whether the added ~10 LOC (net) worth the benefit?
> 
> I read  "3 files changed, 46 insertions(+), 1 deletion(-)" to optimize something without proof, so I naturally have to ask. So this is just a "usually we optimize and show numbers to proof" comment.

These numbers are misleading, as they count comments and other non-code.

[snip]

> 
> Any numbers would be helpful.
> 
>> If you want, I will write a microbenchmarks and give you numbers.
>> If you look for further optimizations (although you did not indicate
>> so), such as doing the TLB batching from do_mprotect_key(),
>> (i.e. batching across VMAs), we can discuss it and apply it on
>> top of these patches.
> 
> I think this patch itself is sufficient if we can show a benefit; I do wonder if existing benchmarks could already show a benefit, I feel like they should if this makes a difference. Excessive mprotect() usage (protect<>unprotect) isn't something unusual.

I do not know about a concrete benchmark (other than my workload, which I cannot share right now) that does excessive mprotect() in a way that would actually be measurable on the overall performance. I would argue that many many optimizations in the kernel are such that would not have so measurable benefit by themselves on common macrobenchmarks.

Anyhow, per your request I created a small micro-benchmark that runs mprotect(PROT_READ) and mprotect(PROT_READ|PROT_WRITE) in a loop and measured the time it took to do the latter (where a writeprotect is not needed). I ran the benchmark on a VM (guest) on top of KVM.

The cost (cycles) per mprotect(PROT_READ|PROT_WRITE) operation:

		1 thread		2 threads
		--------		---------
w/patch:	2496			2505			
w/o patch:	5342			10458

[ The results for 1 thread might seem strange as one can expect the overhead in this case to be no more than ~250 cycles, which is the time a TLB invalidation of a single PTE takes. Yet, this overhead are probably related to “page fracturing”, which happens when the VM memory is backed by 4KB pages. In such scenarios, a single PTE invalidation in the VM can cause on Intel a full TLB flush. The full flush is needed to ensure that if the invalidated address was mapped through huge page in the VM, any relevant 4KB mapping that is cached in the TLB (after fracturing due to the 4KB GPA->HPA mapping) would be removed.]

Let me know if you want me to share the micro-benchmark with you. I am not going to mention the results in the commit log, because I think the overhead of unnecessary TLB invalidation is well established.


  reply	other threads:[~2021-10-08  6:06 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-25 20:54 [PATCH 0/2] mm/mprotect: avoid unnecessary TLB flushes Nadav Amit
2021-09-25 20:54 ` [PATCH 1/2] mm/mprotect: use mmu_gather Nadav Amit
2021-10-03 12:10   ` Peter Zijlstra
2021-10-04 19:24     ` Nadav Amit
2021-10-05  6:53       ` Peter Zijlstra
2021-10-05 16:34         ` Nadav Amit
2021-10-11  3:45   ` Nadav Amit
2021-10-12 10:16   ` Peter Xu
2021-10-12 17:31     ` Nadav Amit
2021-10-12 23:20       ` Peter Xu
2021-10-13 15:59         ` Nadav Amit
2021-09-25 20:54 ` [PATCH 2/2] mm/mprotect: do not flush on permission promotion Nadav Amit
2021-10-07 12:13   ` David Hildenbrand
2021-10-07 16:16     ` Nadav Amit
2021-10-07 17:07       ` David Hildenbrand
2021-10-08  6:06         ` Nadav Amit [this message]
2021-10-08  7:35           ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DA49DBBB-FFEE-4ACC-BB6C-364D07533C5E@vmware.com \
    --to=namit@vmware.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=npiggin@gmail.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.