From: Alexandre Chartre <alexandre.chartre@oracle.com>
To: David Laight <David.Laight@ACULAB.COM>, Borislav Petkov <bp@alien8.de>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"luto@kernel.org" <luto@kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"jroedel@suse.de" <jroedel@suse.de>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	"jan.setjeeilers@oracle.com" <jan.setjeeilers@oracle.com>,
	"junaids@google.com" <junaids@google.com>,
	"oweisse@google.com" <oweisse@google.com>,
	"rppt@linux.vnet.ibm.com" <rppt@linux.vnet.ibm.com>,
	"graf@amazon.de" <graf@amazon.de>,
	"mgross@linux.intel.com" <mgross@linux.intel.com>,
	"kuzuno@gmail.com" <kuzuno@gmail.com>
Subject: Re: [RFC][PATCH v2 00/21] x86/pti: Defer CR3 switch to C code
Date: Wed, 18 Nov 2020 11:29:52 +0100
Message-ID: <0bedae59-5397-9cae-3c2a-66bc376f5616@oracle.com>
In-Reply-To: <ce8d862f498042d1bd7a6e8a071f06bf@AcuMS.aculab.com>


On 11/18/20 10:30 AM, David Laight wrote:
> From: Alexandre Chartre
>> Sent: 18 November 2020 07:42
>>
>>
>> On 11/17/20 10:26 PM, Borislav Petkov wrote:
>>> On Tue, Nov 17, 2020 at 07:12:07PM +0100, Alexandre Chartre wrote:
>>>> Some benchmarks are available, in particular from phoronix:
>>>
>>> What I was expecting was benchmarks *you* have run which show that
>>> perf penalty, not something one can find quickly on the internet and
>>> something one cannot always reproduce her-/himself.
>>>
>>> You do know that presenting convincing numbers with a patchset greatly
>>> improves its chances of getting it upstreamed, right?
>>>
>>
>> Well, it looks like I wrongfully assume that KPTI was a well known performance
>> overhead since it was introduced (because it adds extra page-table switches),
>> but you are right I should be presenting my own numbers.
> 
> IIRC the penalty comes from the page table switch.
> Doing it at a different time is unlikely to make much difference.
>

Correct, this RFC does not change the overhead. However, it is a step towards
being able to execute some selected syscalls or interrupt handlers without
switching to the kernel page-table. The next step would be to identify and add
the necessary mappings to the user page-table so that specified syscalls can be
executed without switching the page-table.


> For some workloads the penalty is massive - getting on for 50%.
> We are still using old kernels on AWS.
> 

Here are some micro-benchmarks of the getppid and getpid syscalls which highlight
the PTI overhead. They use the perf command from the kernel tree (tools/perf) and
the getpid command from libMicro (https://github.com/redhat-performance/libMicro):

system running 5.10-rc4 booted with nopti:
------------------------------------------

# perf bench syscall basic
# Running 'syscall/basic' benchmark:
# Executed 10000000 getppid() calls
      Total time: 0.792 [sec]

        0.079223 usecs/op
        12622549 ops/sec

# getpid -B 100000
              prc thr   usecs/call      samples   errors cnt/samp
getpid         1   1      0.08029          102        0   100000


We can see that the getpid and getppid syscalls have the same execution
time, around 0.08 usecs. These syscalls are very small and just return
a value, so the time is mostly spent entering/exiting the kernel.
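
As a rough illustration of what such a micro-benchmark measures, here is a
minimal sketch in C. It is not the perf or libMicro implementation; the
helper name bench_getppid_usecs_per_op is hypothetical, and it simply times
a loop of raw getppid() syscalls with a monotonic clock.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from perf or libMicro): issue `calls`
 * raw getppid() syscalls and return the average cost of one call in
 * microseconds. The result includes the small cost of the loop itself.
 */
double bench_getppid_usecs_per_op(long calls)
{
	struct timespec start, end;
	double elapsed_usecs;
	long i;

	assert(calls > 0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < calls; i++)
		syscall(SYS_getppid);	/* always enters the kernel */
	clock_gettime(CLOCK_MONOTONIC, &end);

	/* Convert the elapsed seconds and nanoseconds to microseconds. */
	elapsed_usecs = (end.tv_sec - start.tv_sec) * 1e6 +
			(end.tv_nsec - start.tv_nsec) / 1e3;

	return elapsed_usecs / calls;
}
```

Calling the helper with the same number of iterations on a system booted
with nopti and then with pti should show the kind of difference reported
here, although the absolute values depend on the machine.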


same system booted with pti:
----------------------------

# perf bench syscall basic
# Running 'syscall/basic' benchmark:
# Executed 10000000 getppid() calls
      Total time: 2.025 [sec]

        0.202527 usecs/op
         4937605 ops/sec

# getpid -B 100000
              prc thr   usecs/call      samples   errors cnt/samp
getpid         1   1      0.20241          102        0   100000


With PTI, the execution time jumps to 0.20 usecs (+0.12 usecs = +150%).

That's a very extreme case because these are very small syscalls, and
in that case the overhead to switch page-tables is significant compared
to the execution time of the syscall.

So with an overhead of +0.12 usecs per syscall, the PTI impact is significant
for workloads which use a lot of short syscalls. But if you use longer syscalls,
for example with an average execution time of 2.0 usecs per syscall, then the
same +0.12 usecs is a lower overhead of 6%.
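
The arithmetic behind these percentages can be sketched as follows, assuming
the PTI cost is a fixed amount added to each syscall regardless of its length.
The helper name pti_overhead_percent is hypothetical and only serves this
example.

```c
#include <assert.h>

/*
 * Hypothetical helper: express a fixed per-syscall overhead as a
 * percentage of the syscall's execution time without that overhead.
 */
double pti_overhead_percent(double overhead_usecs, double syscall_usecs)
{
	assert(syscall_usecs > 0.0);

	return 100.0 * overhead_usecs / syscall_usecs;
}
```

With the rounded figures, pti_overhead_percent(0.12, 0.08) gives about 150
and pti_overhead_percent(0.12, 2.0) gives about 6. With the unrounded perf
numbers (0.202527 - 0.079223 = 0.123304 usecs on top of 0.079223 usecs) the
short-syscall case comes out at about 156.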

alex.
