From: Alexandre Chartre <alexandre.chartre@oracle.com>
To: Borislav Petkov <bp@alien8.de>
Cc: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com,
	x86@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org,
	peterz@infradead.org, linux-kernel@vger.kernel.org,
	thomas.lendacky@amd.com, jroedel@suse.de, konrad.wilk@oracle.com,
	jan.setjeeilers@oracle.com, junaids@google.com,
	oweisse@google.com, rppt@linux.vnet.ibm.com, graf@amazon.de,
	mgross@linux.intel.com, kuzuno@gmail.com
Subject: Re: [RFC][PATCH v2 00/21] x86/pti: Defer CR3 switch to C code
Date: Tue, 17 Nov 2020 19:12:07 +0100
Message-ID: <890f6b7e-a268-2257-edcb-5eacc7db3d8e@oracle.com>
In-Reply-To: <20201117165539.GG5719@zn.tnic>


On 11/17/20 5:55 PM, Borislav Petkov wrote:
> On Tue, Nov 17, 2020 at 08:56:23AM +0100, Alexandre Chartre wrote:
>> The main goal of ASI is to provide KVM address space isolation to
>> mitigate guest-to-host speculative attacks like L1TF or MDS.
> 
> Because the current L1TF and MDS mitigations are lacking or why?
> 

Yes. L1TF/MDS allow some inter cpu-thread attacks which are not mitigated at
the moment. In particular, they allow a guest VM to attack another guest VM
or the host kernel running on a sibling cpu-thread. Core Scheduling will
mitigate the guest-to-guest attack but not the guest-to-host attack. Address
Space Isolation provides a mitigation for the guest-to-host attack.


>> Current proposal of ASI is plugged into the CR3 switch assembly macro
>> which makes the code brittle and complex. (see [1])
>>
>> I am also expecting this might help with some other ideas like having
>> syscall (or interrupt handler) which can run without switching the
>> page-table.
> 
> I still fail to see why we need all that. I read, "this does this and
> that" but I don't read "the current problem is this" and "this is our
> suggested solution for it".
> 
> So what is the issue which needs addressing in the current kernel which
> is going to justify adding all that code?

The main issue this is trying to address is that the CR3 switch is currently
done in assembly code, from contexts which are very restrictive: often only one
or two registers are available for use, and sometimes no stack is available.
For example, the syscall entry switches CR3 with a single register available
(%rsp) and no stack.

Because of this, it is fairly tricky to extend the logic for switching CR3.
This is a problem we have faced while implementing Address Space Isolation
(ASI), where we need extra logic to drive the page-table switch. We have
successfully implemented ASI with the current CR3 switching assembly code, but
this requires complex assembly constructs. Hence this proposal to defer CR3
switching to C code, so that it can be extended more easily.
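
To illustrate why C makes this easier to extend, here is a minimal user-space
sketch. It is not code from this series. It only models the arithmetic on the
CR3 value, assuming the usual x86-64 PTI layout where bit 12 selects the user
half of the PGD pair and bit 11 selects the user PCID. All sketch_* names and
the hook type are hypothetical, and nothing here touches the real CR3 register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed bit layout, modelled on x86-64 PTI (illustrative only):
 * bit 12 selects the user half of the 8 KiB PGD pair,
 * bit 11 selects the user PCID.
 */
#define SKETCH_USER_PGTABLE_BIT	12
#define SKETCH_USER_PCID_BIT	11
#define SKETCH_USER_MASK \
	((UINT64_C(1) << SKETCH_USER_PGTABLE_BIT) | \
	 (UINT64_C(1) << SKETCH_USER_PCID_BIT))

/*
 * Hypothetical hook: extra logic (ASI, for example) that may pick a
 * different kernel-side page-table than the default one.
 */
typedef uint64_t (*sketch_cr3_hook_t)(uint64_t kernel_cr3);

/* True if the CR3 value points at the user page-table. */
static inline bool sketch_cr3_is_user(uint64_t cr3)
{
	return (cr3 & (UINT64_C(1) << SKETCH_USER_PGTABLE_BIT)) != 0;
}

/* CR3 value for the kernel page-table: clear both user bits. */
static inline uint64_t sketch_kernel_cr3(uint64_t cr3)
{
	return cr3 & ~SKETCH_USER_MASK;
}

/* CR3 value for the user page-table: set both user bits. */
static inline uint64_t sketch_user_cr3(uint64_t cr3)
{
	return cr3 | SKETCH_USER_MASK;
}

/*
 * Compute the CR3 value to load on kernel entry. With plain PTI this is
 * just sketch_kernel_cr3(); the optional hook is where additional
 * page-table selection logic can be plugged in without scratch-register
 * or stack constraints.
 */
static inline uint64_t sketch_entry_cr3(uint64_t cur_cr3,
					sketch_cr3_hook_t hook)
{
	uint64_t target = sketch_kernel_cr3(cur_cr3);

	if (hook)
		target = hook(target);

	/* Whatever the hook chose, it must not be the user page-table. */
	assert(!sketch_cr3_is_user(target));
	return target;
}
```

With the assembly macros, the equivalent of sketch_entry_cr3() has to be
written with one scratch register and no stack. In C the hook is an ordinary
function call, which is the kind of extension point ASI needs.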

Hopefully this can also help make the assembly entry code less complex, and
benefit other projects.


>> PTI has a measured overhead of roughly 5% for most workloads, but it can
>> be much higher in some cases.
> 
> "it can be"? Where? Actual use case?

Some benchmarks are available, in particular from Phoronix:

https://www.phoronix.com/scan.php?page=article&item=linux-more-x86pti
https://www.phoronix.com/scan.php?page=news_item&px=x86-PTI-Initial-Gaming-Tests
https://www.phoronix.com/scan.php?page=article&item=linux-kpti-kvm
https://medium.com/@loganaden/linux-kpti-performance-hit-on-real-workloads-8da185482df3


>> The latest ASI RFC (RFC v4) is here [1]. This RFC has ASI plugged
>> directly into the CR3 switch assembly macro. We are working on a new
>> implementation, based on these changes, which avoids having to deal with
>> assembly code and makes the implementation more robust.
> 
> This still doesn't answer my questions. I read a lot of "could be used
> for" formulations but I still don't know why we need that. So what is
> the problem that the kernel currently has which you're trying to address
> with this?
> 

Hopefully this is clearer with the answer I provided above.

Thanks,

alex.
