From: Andi Kleen <andi@firstfloor.org>
To: speck@linutronix.de
Cc: Andi Kleen <ak@linux.intel.com>
Subject: [MODERATED] [PATCH v4 10/28] MDSv4 24
Date: Fri, 11 Jan 2019 17:29:23 -0800	[thread overview]
Message-ID: <afc24f570651f02b914f13243540304ed7b357b5.1547256470.git.ak@linux.intel.com> (raw)
In-Reply-To: <cover.1547256470.git.ak@linux.intel.com>

Add documentation for the MDS mitigation, including the theory and
some guidelines for subsystem/driver maintainers.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
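Not part of the commit: the following is a minimal userspace sketch of
the three modes and the lazy clearing scheme that the document in this
patch describes. It is only a model under stated assumptions. All names
ending in _model are hypothetical and are not the kernel implementation;
in particular, the real lazy_clear_cpu() and the real kernel exit path
may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these names are NOT kernel APIs. */
enum mds_mode_model {
	MDS_MODEL_OFF,   /* mds=off: never clear */
	MDS_MODEL_LAZY,  /* default: clear only when requested */
	MDS_MODEL_FULL,  /* mds=full: clear on every kernel exit */
};

struct cpu_model {
	enum mds_mode_model mode;
	bool clear_pending;   /* set when sensitive data was touched */
	unsigned int clears;  /* number of buffer clears performed */
};

/* Models lazy_clear_cpu(): only records that a clear is needed. */
static void lazy_clear_cpu_model(struct cpu_model *cpu)
{
	assert(cpu != NULL);
	cpu->clear_pending = true;
}

/* Models the kernel exit path: the only place a clear is performed. */
static void kernel_exit_model(struct cpu_model *cpu)
{
	assert(cpu != NULL);
	if (cpu->mode == MDS_MODEL_FULL ||
	    (cpu->mode == MDS_MODEL_LAZY && cpu->clear_pending))
		cpu->clears++;
	cpu->clear_pending = false;
}
```

In this model, calling lazy_clear_cpu_model() several times before one
kernel_exit_model() results in a single clear, which illustrates why
requesting a clear is cheap enough to do in fast paths.
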
 Documentation/clearcpu.txt | 173 +++++++++++++++++++++++++++++++++++++
 1 file changed, 173 insertions(+)
 create mode 100644 Documentation/clearcpu.txt

diff --git a/Documentation/clearcpu.txt b/Documentation/clearcpu.txt
new file mode 100644
index 000000000000..b204b1e7051c
--- /dev/null
+++ b/Documentation/clearcpu.txt
@@ -0,0 +1,173 @@
+
+Security model for Microarchitectural Data Sampling
+===================================================
+
+Some CPUs can leave read or written data in internal buffers,
+which might later be sampled through side channels.
+For more details see CVE-2018-12126, CVE-2018-12130, and CVE-2018-12127.
+
+This can be avoided by explicitly clearing the CPU state.
+
+We are trying to avoid leaking data between different processes,
+as well as some sensitive data, such as cryptographic data
+or user data from other processes.
+
+We support three modes:
+
+(1) mitigation off (mds=off)
+(2) clear only when needed (default)
+(3) clear on every kernel exit, or guest entry (mds=full)
+
+(1) and (3) are trivial; the rest of this document discusses (2).
+
+Basic requirements and assumptions
+----------------------------------
+
+Kernel addresses and kernel temporary data are not sensitive.
+
+User data is sensitive, but only for other processes.
+
+Kernel data is sensitive when it contains cryptographic keys.
+
+Guidance for driver/subsystem developers
+----------------------------------------
+
+When you touch user-supplied data of *other* processes in system call
+context, add lazy_clear_cpu().
+
+For the cases below we care only about data from other processes.
+Touching non-cryptographic data from the current process is always allowed.
+
+Touching only pointers to user data is always allowed.
+
+When your interrupt does not touch user data directly, consider marking
+it with IRQF_NO_USER.
+
+When your tasklet does not touch user data directly, consider marking
+it with TASKLET_NO_USER using tasklet_init_flags or
+DECLARE_TASKLET*_NOUSER.
+
+When your timer does not touch user data, mark it with TIMER_NO_USER.
+If it is an hrtimer, mark it with HRTIMER_MODE_NO_USER.
+
+When your irq poll handler does not touch user data, mark it
+with IRQ_POLL_F_NO_USER through irq_poll_init_flags.
+
+For networking code make sure to only touch user data through
+skb_push/put/copy [add more], unless it is data from the current
+process. If that is not ensured, add lazy_clear_cpu or
+lazy_clear_cpu_interrupt. When the non-skb data access is only in a
+hardware interrupt controlled by the driver, it can rely on not
+setting IRQF_NO_USER for that interrupt.
+
+Any cryptographic code touching key data should use memzero_explicit
+or kzfree.
+
+If your RCU callback touches user data add lazy_clear_cpu().
+
+These steps are currently only needed for code that runs on MDS-affected
+CPUs, which currently means only x86. But it might be worth being prepared
+in case other architectures become affected too.
+
+Implementation details/assumptions
+----------------------------------
+
+If a system call touches data, it is data of its own process, so nothing
+needs to be cleared, because the process already has access to it.
+
+When context switching we clear data, unless the context switch
+is inside a process, or from/to idle. We also clear after any
+context switches from kernel threads.
+
+Idle does not have sensitive data, except in interrupts, which
+are handled separately.
+
+Cryptographic keys inside the kernel should be protected.
+We assume they use kzfree() or memzero_explicit() to clear
+state, so these functions trigger a cpu clear.
+
+Hard interrupts, tasklets, and timers, which can run asynchronously, are
+assumed to touch arbitrary user data, unless they have been audited and
+marked with NO_USER flags.
+
+Most interrupt handlers for modern devices should not touch
+user data because they rely on DMA and only manipulate
+pointers. This needs auditing to confirm though.
+
+For softirqs we assume that if they touch user data they use
+lazy_clear_cpu()/lazy_clear_interrupt() as needed.
+Networking is handled through skb_* below.
+Timers, tasklets, and IRQ poll are handled through opt-in.
+
+Scheduler softirq is assumed to not touch user data.
+
+Block softirq done callbacks are assumed to not touch user data.
+
+For networking code, any skb functions that are likely to touch
+non-header packet data schedule a CPU clear at the next
+kernel exit. This includes skb_copy and related, skb_put/push, and
+checksum functions.  We assume that any networking code touching
+packet data uses these functions.
+
+[In principle packet data should be encrypted anyway for the wire,
+but we try to avoid leaking it anyway]
+
+Some IO related functions like string PIO and memcpy_from/to_io, or
+the software pci dma bounce function, which touch data, schedule a
+buffer clear.
+
+We assume NMI/machine check code does not touch other
+processes' data.
+
+Any buffer clearing is done lazily on the next kernel exit, so it can be
+triggered in fast paths.
+
+Sandboxes
+---------
+
+We don't do anything special for seccomp processes.
+
+If there is a sandbox inside the process, the process itself should take
+care of clearing its own sensitive data before running sandbox
+code. This would include data touched by system calls.
+
+BPF
+---
+
+We assume BPF execution does not touch other users' data, so it does
+not need to schedule a clear for itself.
+
+BPF could attack the rest of the kernel if it can successfully
+measure side-channel effects.
+
+When the BPF program was loaded unprivileged, always clear the CPU
+to prevent any exploits written in BPF using side channels to read
+data leaked from other kernel code.
+
+We only do this when running in an interrupt, or if a CPU clear is
+already scheduled (which means, for example, that there was a context
+switch or crypto operation before).
+
+In process context we assume the code only accesses data of the
+current user, and check that the running BPF program was loaded by the
+same user, so even if data leaked it would not cross privilege
+boundaries.
+
+Technically we would only need to do this if the BPF program
+contains conditional branches and loads dominated by them, but
+let's assume that nearly all do.
+
+This could be further optimized by allowing callers that do
+a lot of individual BPF runs, and are sure they don't touch
+other users' data in between, to do the clear only once
+at the beginning. We can add such optimizations later based on
+profile data.
+
+Virtualization
+--------------
+
+When entering a guest in KVM, we clear to avoid any leakage to the guest.
+Normally this is done implicitly as part of the L1TF mitigation,
+and relies on that mitigation being enabled. It also uses the "fast exit"
+optimization that only clears if an interrupt or context switch
+happened.
-- 
2.17.2
