From: John Andersen <john.s.andersen@intel.com>
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	x86@kernel.org, pbonzini@redhat.com
Cc: hpa@zytor.com, sean.j.christopherson@intel.com,
	vkuznets@redhat.com, wanpengli@tencent.com, jmattson@google.com,
	joro@8bytes.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, John Andersen <john.s.andersen@intel.com>
Subject: [RESEND RFC 0/2] Paravirtualized Control Register pinning
Date: Fri, 20 Dec 2019 11:26:59 -0800
Message-ID: <20191220192701.23415-1-john.s.andersen@intel.com>

Paravirtualized Control Register pinning is a strengthened version of
existing protections on the Write Protect, Supervisor Mode Execution /
Access Protection, and User-Mode Instruction Prevention bits. The
existing protections prevent native_write_cr*() functions from writing
values which disable those bits. This patchset prevents any guest
writes to control registers from disabling pinned bits, not just writes
from native_write_cr*(). This stops attackers within the guest from
using ROP to disable protection bits.

https://web.archive.org/web/20171029060939/http://www.blackbunny.io/linux-kernel-x86-64-bypass-smep-kaslr-kptr_restric/

The protection is implemented by adding MSRs to KVM which contain the
bits that are allowed to be pinned, and the bits which are pinned. The
guest or userspace can enable bit pinning by reading MSRs to check
which bits are allowed to be pinned, and then writing MSRs to set which
bits they want pinned.

Other hypervisors such as Hyper-V have implemented similar protections
for Control Registers and MSRs, which security researchers have found
effective.

https://www.abatchy.com/2018/01/kernel-exploitation-4

We add a CR pin feature bit to the KVM CPUID, read-only MSRs which
guests use to identify which bits they may request be pinned, and
CR pinned MSRs which contain the pinned bits. Guests can request that
KVM pin bits within control register 0 or 4 via the CR pinned MSRs.
Writes to the MSRs fail if they include bits that aren't allowed to be
pinned. Host userspace may clear or modify pinned bits at any time.
Once pinned bits are set, the guest may pin more allowed bits, but may
never clear pinned bits.
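
The write rules above can be sketched as a small standalone model. This
is not the code from the patches: the struct, the function names, and
the host_initiated flag are hypothetical. It also assumes that a guest
write which drops an already pinned bit is rejected outright, and that
the allowed-bits check applies to host writes as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CR pinning MSR pair; not the patch's code. */
struct cr_pin_model {
	uint64_t allowed; /* bits that may be pinned (read-only MSR) */
	uint64_t pinned;  /* bits currently pinned (CR pinned MSR) */
};

/*
 * Model a write to the CR pinned MSR.
 * Returns true if the write is accepted, false if it fails.
 */
bool cr_pin_msr_write(struct cr_pin_model *m, uint64_t val,
		      bool host_initiated)
{
	/* Assumption: nobody may pin bits outside the allowed set. */
	if (val & ~m->allowed)
		return false;

	/* The guest may pin more bits but never clear pinned ones. */
	if (!host_initiated && (m->pinned & ~val))
		return false;

	m->pinned = val;
	return true;
}

/* Walk through the rules with two allowed bits (values are arbitrary). */
void cr_pin_msr_example(void)
{
	struct cr_pin_model m = { .allowed = 0x3, .pinned = 0 };

	assert(cr_pin_msr_write(&m, 0x1, false));  /* pin one bit */
	assert(cr_pin_msr_write(&m, 0x3, false));  /* pin one more */
	assert(!cr_pin_msr_write(&m, 0x1, false)); /* guest cannot unpin */
	assert(!cr_pin_msr_write(&m, 0x4, false)); /* bit not allowed */
	assert(cr_pin_msr_write(&m, 0x0, true));   /* host may clear */
}
```

The host path skips only the "never clear" check, which is what lets
userspace zero the pinned bits.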

If a guest vCPU attempts to clear any of the pinned bits, the vCPU that
issued the write is sent a general protection fault, and the register
is left unchanged.

Pinning is not active when running in SMM. Entering SMM clears the
pinned bits in the control registers, so writes to control registers
within SMM would trigger general protection faults if pinning were
enforced.
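
The enforcement on a guest control register write, including the SMM
exception, can be sketched as follows. This is an illustrative model
only: the struct and function names are hypothetical, and a false
return value stands in for the general protection fault that would be
injected into the writing vCPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the check on a guest CR0/CR4 write. */
struct cr_model {
	uint64_t value;  /* current control register contents */
	uint64_t pinned; /* bits the guest has asked KVM to pin */
};

/*
 * Apply a guest write. Returns true if the write is applied; false
 * models a #GP sent to the writing vCPU with the register unchanged.
 */
bool cr_guest_write(struct cr_model *cr, uint64_t new_val, bool in_smm)
{
	/* Outside SMM, a write that clears any pinned bit is rejected. */
	if (!in_smm && (cr->pinned & ~new_val))
		return false;

	cr->value = new_val;
	return true;
}

void cr_guest_write_example(void)
{
	const uint64_t smep = 1ULL << 20; /* CR4.SMEP */
	struct cr_model cr4 = { .value = smep, .pinned = smep };

	assert(!cr_guest_write(&cr4, 0, false)); /* clearing SMEP faults */
	assert(cr4.value == smep);               /* register unchanged */
	assert(cr_guest_write(&cr4, 0, true));   /* not enforced in SMM */
	assert(cr4.value == 0);
}
```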

The guest may never read pinned bits. If an attacker were to read the
CR pinned MSRs, they might decide to perform another attack which would
not cause a general protection fault.

Should userspace expose the CR pinning CPUID feature bit, it must zero CR
pinned MSRs on reboot. If it does not, it runs the risk of having the
guest enable pinning and subsequently cause general protection faults on
next boot due to early boot code setting control registers to values
which do not contain the pinned bits.

When running with KVM guest support and paravirtualized CR pinning
enabled, paravirtualized and existing pinning are set up at the same
point on the boot CPU. Non-boot CPUs set up pinning upon identification.

Guests using the kexec system call currently do not support
paravirtualized control register pinning. Early boot code writes known
good values to control registers, and these values do not contain the
protected bits, because CPU feature identification is done at a later
time, when the kernel properly checks if it can enable protections.

Most distributions enable kexec. However, kexec could be made possible
to disable at boot time. In that case, if a user has disabled kexec at
boot time, the guest will request that paravirtualized control register
pinning be enabled. This would expand the userbase to users of major
distributions.

Paravirtualized CR pinning will likely be incompatible with kexec for
the foreseeable future. Early boot code could possibly be changed to
not clear protected bits. However, a kernel that requests CR bits be
pinned can't know if the kernel it's kexecing has been updated to not
clear protected bits. This would result in the kernel being kexec'd
almost immediately receiving a general protection fault.

Security-conscious kernel configurations disable kexec already, per KSPP
guidelines. Projects such as Kata Containers, AWS Lambda, ChromeOS
Termina, and others using KVM to virtualize Linux will benefit from
this protection.

The usage of SMM in SeaBIOS was explored as a way to communicate to KVM
that a reboot has occurred and it should zero the pinned bits. When
using QEMU and SeaBIOS, SMM initialization occurs on reboot. However,
prior to SMM initialization, BIOS writes zero values to CR0, causing a
general protection fault to be sent to the guest before SMM can signal
that the machine has booted.

Pinning of sensitive CR bits has already been implemented to protect
against exploits directly calling native_write_cr*(). The current
protection cannot stop ROP attacks which jump directly to a MOV CR
instruction. Guests running with paravirtualized CR pinning are now
protected against the use of ROP to disable CR bits. The same bits that
are being pinned natively may be pinned via the CR pinned MSRs. These
bits are WP in CR0, and SMEP, SMAP, and UMIP in CR4.
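
For reference, the bits named above can be written out as masks. The bit
positions are the architectural ones; the CR*_PIN_WANTED macros and the
cr_pin_request() helper are hypothetical names used only for this
sketch, which assumes a guest asks for the intersection of the bits it
wants and the bits the host reports as allowed.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural bit positions in CR0 and CR4. */
#define X86_CR0_WP   (1ULL << 16) /* Write Protect */
#define X86_CR4_UMIP (1ULL << 11) /* User-Mode Instruction Prevention */
#define X86_CR4_SMEP (1ULL << 20) /* Supervisor Mode Execution Prevention */
#define X86_CR4_SMAP (1ULL << 21) /* Supervisor Mode Access Prevention */

/* Hypothetical names for the bits a guest would like to pin. */
#define CR0_PIN_WANTED X86_CR0_WP
#define CR4_PIN_WANTED (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP)

/* Hypothetical helper: request only bits reported as allowed. */
uint64_t cr_pin_request(uint64_t wanted, uint64_t allowed)
{
	return wanted & allowed;
}

void cr_pin_bits_example(void)
{
	assert(CR0_PIN_WANTED == 0x10000ULL);
	assert(CR4_PIN_WANTED == 0x300800ULL);

	/* If only SMEP and SMAP are allowed, UMIP is left out. */
	assert(cr_pin_request(CR4_PIN_WANTED,
			      X86_CR4_SMEP | X86_CR4_SMAP) == 0x300000ULL);
}
```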

Future patches could protect bits in MSRs in a similar fashion. The NXE
bit of the EFER MSR is a prime candidate.

John Andersen (2):
  KVM: X86: Add CR pin MSRs
  X86: Use KVM CR pin MSRs

 Documentation/virt/kvm/msr.txt       | 38 +++++++++++++++++++++++
 arch/x86/Kconfig                     |  9 ++++++
 arch/x86/include/asm/kvm_host.h      |  2 ++
 arch/x86/include/asm/kvm_para.h      | 10 +++++++
 arch/x86/include/uapi/asm/kvm_para.h |  5 ++++
 arch/x86/kernel/cpu/common.c         |  5 ++++
 arch/x86/kernel/kvm.c                | 17 +++++++++++
 arch/x86/kvm/cpuid.c                 |  3 +-
 arch/x86/kvm/x86.c                   | 45 ++++++++++++++++++++++++++++
 9 files changed, 133 insertions(+), 1 deletion(-)

-- 
2.21.0

