From: Borislav Petkov <bp@suse.de>
To: speck@linutronix.de
Subject: [MODERATED] Re: [patch V10 10/10] Control knobs and Documentation 10
Date: Sun, 15 Jul 2018 09:30:58 +0200
Message-ID: <20180715073058.GA25608@nazgul.tnic>
In-Reply-To: <20180712142957.791282859@linutronix.de>

On Thu, Jul 12, 2018 at 04:19:12PM +0200, speck for Thomas Gleixner wrote:
>  Documentation/admin-guide/index.rst |    9 
>  Documentation/admin-guide/l1tf.rst  |  572 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 581 insertions(+)

Reads nicely, just a couple of minor things which jumped out at me while
reading, below:

...

> +2. Malicious guest in a virtual machine
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +   The fact that L1TF breaks all domain protections allows malicious guest
> +   OSes, which can control the PTEs directly, and malicious guest user
> +   space applications, which run on an unprotected guest kernel lacking the
> +   PTE inversion mitigation for L1TF, to attack physical host memory.
> +
> +   A special aspect of L1TF in the context of virtualization is symmetric
> +   multi threading (SMT). The Intel implementation of SMT is called
> +   HyperThreading. The fact that Hyperthreads on the affected processors
> +   share the L1 Data Cache (L1D) is important for this. As the flaw allows
> +   only to attack data which is present in L1D, a malicious guest running
> +   on one Hyperthread can attack the data which is brought into the L1D by
> +   the context which runs on the sibling Hyperthread of the same physical
> +   core. This context can be host OS, host user space or a different guest.
> +
> +   If the processor does not support Extended Page Tables, the attack is
> +   only possible, when the hypervisor does not sanitize the content of the
> +   effective (shadow) page tables.
> +
> +   While solutions exist to mitigate these attack vectors fully, these
> +   mitigations are not enabled by default in the Linux kernel because they
> +   can affect performance significantly. The kernel provides several
> +   mechanisms which can be utilized to address the problem depending on the
> +   deployment scenario. The mitigations, their protection scope and impact
> +   are described in the next sections.
> +
> +   The default mitigations and the rationale for chosing them are explained

choosing

> +   at the end of this document. See :ref:`default_mitigations`.
> +
> +.. _l1tf_sys_info:
> +

...

> +1. L1D flush on VMENTER
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +   To make sure that a guest cannot attack data which is present in the L1D
> +   the hypervisor flushes the L1D before entering the guest.
> +
> +   Flushing the L1D evicts not only the data which should not be accessed
> +   by a potentially malicious guest, it also flushes the guest
> +   data. Flushing the L1D has a performance impact as the processor has to

s/Flushing the L1D/Therefore it/

> +   bring the flushed guest data back into the L1D. Depending on the
> +   frequency of VMEXIT/VMENTER and the type of computations in the guest
> +   performance degradation in the range of 1% to 50% has been observed. For
> +   scenarios where guest VMEXIT/VMENTER are rare the performance impact is
> +   minimal. Virtio and mechanisms like posted interrupts are designed to
> +   confine the VMEXITs to a bare minimum, but specific configurations and
> +   application scenarios might still suffer from a high VMEXIT rate.
> +
> +   The general recommendation is to enable L1D flush on VMENTER.
> +
> +   Note, that L1D flush does not prevent the SMT problem because the

s/,//

> +   sibling thread will also bring back its data into the L1D which makes it
> +   attackable again.
> +
> +   L1D flush can be controlled by the administrator via the kernel command
> +   line and sysfs control files. See :ref:`mitigation_control_command_line`
> +   and :ref:`mitigation_control_kvm`.
> +
> +.. _guest_confinement:
> +
> +2. Guest VCPU confinement to dedicated physical cores
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +   To address the SMT problem, it is possible to make a guest or a group of
> +   guests affine to one or more physical cores. The proper mechanism for
> +   that is to utilize exclusive cpusets to ensure that no other guest or
> +   host tasks can run on these cores.
> +
> +   If only a single guest or related guests run on sibling SMT threads on
> +   the same physical core then they can only attack their own memory and
> +   restricted parts of the host memory.
> +
> +   Host memory is attackable, when one of the sibling SMT threads runs in
> +   host OS (hypervisor) context and the other in guest context. The amount
> +   of valuable information from the host OS context depends on the context
> +   which the host OS executes, i.e. interrupts, soft interrupts and kernel
> +   threads. The amount of valuable data from these contexts cannot be
> +   declared as non-interesting for an attacker without deep inspection of
> +   the code.
> +
> +   Note, that assigning guests to a fixed set of physical cores affects the

s/,//

> +   ability of the scheduler to do load balancing and might have negative
> +   effects on CPU utilization depending on the hosting scenario. Disabling
> +   SMT might be a viable alternative for particular scenarios.
> +
> +   For further information about confining guests to a single or to a group
> +   of cores consult the cpusets documentation:
> +
> +   https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt

Should this reference be relative to our Documentation/ tree instead, i.e.,

       ../cgroup-v1/cpusets.txt

so that the doc system resolves it to the respective absolute URL
wherever it is displayed?

> +
> +.. _interrupt_isolation:
> +
> +3. Interrupt affinity
> +^^^^^^^^^^^^^^^^^^^^^
> +
> +   Interrupts can be made affine to logical CPUs. This is not universally
> +   true because there are types of interrupts which are truly per CPU
> +   interrupts, e.g. the local timer interrupt. Aside of that multi queue
							       ^
							       ,

> +   devices affine their interrupts to single CPUs or groups of CPUs per

s/affine/assign/

> +   queue without allowing the administrator to control the affinities.
> +
> +   Moving the interrupts, which can be affinity controlled, away from CPUs
> +   which run untrusted guests, reduces the attack vector space.
> +
> +   Whether the interrupts with are affine to CPUs, which run untrusted

s/with //

> +   guests, provide interesting data for an attacker depends on the system

	      provides

> +   configuration and the scenarios which run on the system. While for some
> +   of the interrupts it can be assumed that they wont expose interesting
> +   information beyond exposing hints about the host OS memory layout, there

s/exposing //

> +   is no way to make general assumptions.
> +
> +   Interrupt affinity can be controlled by the administrator via the
> +   /proc/irq/$NR/smp_affinity[_list] files. Limited documentation is
> +   available at:
> +
> +   https://www.kernel.org/doc/Documentation/IRQ-affinity.txt

Same comment as above.
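
As an aside, for readers unsure how the two files in the hunk above relate:
smp_affinity takes a hexadecimal CPU bitmask, while smp_affinity_list takes
a CPU list such as "0-3,8". Below is a minimal sketch of the conversion,
assuming the usual format of comma-separated 32-bit words for masks wider
than 32 CPUs. The helper names parse_cpu_list() and cpus_to_mask() are made
up for illustration and are not part of the patch.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "0-3,8" into a sorted list of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpus_to_mask(cpus):
    """Return the hex bitmask for the given CPUs.

    Assumption: masks wider than 32 bits are written as comma-separated
    32-bit words, most significant word first.
    """
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    words = []
    while True:
        words.append(value & 0xFFFFFFFF)
        value >>= 32
        if value == 0:
            break
    words.reverse()
    # Only the leading word is printed without zero padding.
    return ",".join([format(words[0], "x")] + [format(w, "08x") for w in words[1:]])


if __name__ == "__main__":
    print(cpus_to_mask(parse_cpu_list("0-3,8")))
```

Writing the result to /proc/irq/$NR/smp_affinity requires root and is left
out of the sketch on purpose.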

> +
> +.. _smt_control:
> +
> +4. SMT control
> +^^^^^^^^^^^^^^
> +
> +   To prevent the SMT issues of L1TF it might be necessary to disable SMT
> +   completely. Disabling SMT can have a significant performance impact, but
> +   the impact depends on the hosting scenario and the type of workloads.
> +   The impact of disabling SMT needs also to be weighted against the impact

s/weighted/weighed/

> +   of other mitigation solutions like confining guests to dedicated cores.
> +
> +   The kernel provides a sysfs interface to retrieve the status of SMT and
> +   to control it. It also provides a kernel command line interface to
> +   control SMT.
> +
> +   The kernel command line interface consists of the following options:
> +

...
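
For the sysfs side of the SMT interface mentioned above, a small read-only
sketch might help admins check the state before changing anything. It
assumes the interface lives in /sys/devices/system/cpu/smt with the files
"control" and "active"; those names are my assumption here, since the
elided part of the hunk is not quoted. read_smt_state() is a hypothetical
helper.

```python
from pathlib import Path

# Assumed location of the SMT sysfs interface (not shown in the quoted hunk).
SMT_DIR = Path("/sys/devices/system/cpu/smt")


def read_smt_state(smt_dir=SMT_DIR):
    """Return (control, active) from the SMT sysfs directory.

    Returns None when the files cannot be read, e.g. on a kernel
    without this interface.
    """
    smt_dir = Path(smt_dir)
    try:
        control = (smt_dir / "control").read_text().strip()
        active = (smt_dir / "active").read_text().strip() == "1"
    except OSError:
        return None
    return control, active


if __name__ == "__main__":
    print(read_smt_state())
```

The directory is a parameter so the helper can be exercised against a
scratch directory instead of the live system.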

> +5. Disabling EPT
> +^^^^^^^^^^^^^^^^
> +
> +  Disabling EPT for virtual machines provides full mitigation for L1TF even
> +  with SMT enabled, because the effective page tables for guests are
> +  managed and sanitized by the hypervisor. Though disabling EPT has a

s/Though/However/

> +  significant performance impact especially when the Meltdown mitigation
> +  KPTI is enabled.
> +
> +  EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.
> +
> +There is ongoing research and development for new mitigation mechanisms to
> +address the performance impact of disabling SMT or EPT.
> +
> +.. _mitigation_control_command_line:
> +
> +Mitigation control on the kernel command line
> +---------------------------------------------
> +
> +The kernel command line allows to control the L1TF mitigations at boot

Passive:

"L1TF mitigations are controlled on the kernel command line with the
option ..."

> +time with the option "l1tf=". The valid arguments for this option are:
> +
> +  ============  ===================================================
> +  full		Provides all available mitigations for the L1TF
> +		vulnerability. Disables SMT and enables all mitigations in
> +		the hypervisors.
> +
> +		SMT control and L1D flush control via the sysfs interface
> +		is still possible after boot.  Hypervisors will issue a
> +		warning when the first VM is started in a potentially
> +		insecure configuration, i.e. SMT enabled or L1D flush
> +		disabled.
> +

...
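
To check which "l1tf=" argument a running system was booted with, something
along these lines could be used. This is only a sketch: it reads
/proc/cmdline, assumes that the last occurrence of the option wins, and does
not validate the value against the table above. l1tf_option() is a
hypothetical helper name.

```python
def l1tf_option(cmdline):
    """Return the value of the last "l1tf=" option, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("l1tf="):
            # Assumption: a later occurrence overrides an earlier one.
            value = token[len("l1tf="):]
    return value


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        print(l1tf_option(f.read()))
```

A return value of None only means the option was not given on the command
line, in which case the default mitigations apply.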

> +
> +  - Interrupt isolation:
> +
> +    Isolating the guest CPUs from interrupts can reduce the attack surface
> +    further, but still allows a malicious guest to explore a limited amount
> +    of host physical memory. This can at least be used to gain knowledge
> +    about the host address space layout. The interrupts which have a fixed
> +    affinity to the CPUs which run the untrusted guests can depending on
> +    the scenario still trigger soft interrupts and schedule kernel threads

"... which run the untrusted guests can - depending on the scenario - still
trigger... "

> +    which might expose valuable information. See
> +    :ref:`interrupt_isolation`.
> +

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)