From: Jann Horn <jannh@google.com>
Date: Tue, 21 Aug 2018 19:45:03 +0200
Subject: Re: [PATCH RFC v2 2/5] X86: Support LSM determination of side-channel vulnerability
To: Casey Schaufler
Cc: Kernel Hardening, kernel list, linux-security-module, selinux@tycho.nsa.gov,
	Dave Hansen, deneen.t.dock@intel.com, kristen@linux.intel.com,
	Arjan van de Ven, Andy Lutomirski
In-Reply-To: <99FC4B6EFCEFD44486C35F4C281DC67321440056@ORSMSX107.amr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 21, 2018 at 6:37 PM Schaufler, Casey wrote:
>
> > -----Original Message-----
> > From: Jann Horn [mailto:jannh@google.com]
> > Sent: Tuesday, August 21, 2018 3:20 AM
> > Subject: Re: [PATCH RFC v2 2/5] X86: Support LSM determination of
> > side-channel vulnerability
> >
> > On Mon, Aug 20, 2018 at 4:45 PM Schaufler, Casey wrote:
> > >
> > > > -----Original Message-----
> > > > From: Jann Horn [mailto:jannh@google.com]
> > > > Sent: Friday, August 17, 2018 4:55 PM
> > > > Subject: Re: [PATCH RFC v2 2/5] X86: Support LSM determination of
> > > > side-channel vulnerability
> > > >
> > > > On Sat, Aug 18, 2018 at 12:17 AM Casey Schaufler wrote:
> > > > >
> > > > > From: Casey Schaufler
> > > > >
> > > > > When switching between tasks it may be necessary
> > > > > to set an indirect branch prediction barrier if the
> > > > > tasks are potentially vulnerable to side-channel
> > > > > attacks. This adds a call to security_task_safe_sidechannel
> > > > > so that security modules can weigh in on the decision.
> > > > >
> > > > > Signed-off-by: Casey Schaufler
> > > > > ---
> > > > >  arch/x86/mm/tlb.c | 12 ++++++++----
> > > > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > > > > index 6eb1f34c3c85..8714d4af06aa 100644
> > > > > --- a/arch/x86/mm/tlb.c
> > > > > +++ b/arch/x86/mm/tlb.c
> > > > > @@ -7,6 +7,7 @@
> > > > >  #include <...>
> > > > >  #include <...>
> > > > >  #include <...>
> > > > > +#include <...>
> > > > >
> > > > >  #include <...>
> > > > >  #include <...>
> > > > > @@ -270,11 +271,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
> > > > >  	 * threads. It will also not flush if we switch to idle
> > > > >  	 * thread and back to the same process. It will flush if we
> > > > >  	 * switch to a different non-dumpable process.
> > > > > +	 * If a security module thinks that the transition
> > > > > +	 * is unsafe do the flush.
> > > > >  	 */
> > > > > -	if (tsk && tsk->mm &&
> > > > > -	    tsk->mm->context.ctx_id != last_ctx_id &&
> > > > > -	    get_dumpable(tsk->mm) != SUID_DUMP_USER)
> > > > > -		indirect_branch_prediction_barrier();
> > > > > +	if (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id) {
> > > > > +		if (get_dumpable(tsk->mm) != SUID_DUMP_USER ||
> > > > > +		    security_task_safe_sidechannel(tsk) != 0)
> > > > > +			indirect_branch_prediction_barrier();
> > > > > +	}
> > > >
> > > > When you posted v1 of this series, I asked:
> > > >
> > > > | Does this enforce transitivity? What happens if we first switch from
> > > > | an attacker task to a task without ->mm, and immediately afterwards
> > > > | from the task without ->mm to a victim task? In that case, whether a
> > > > | flush happens between the attacker task and the victim task depends on
> > > > | whether the LSM thinks that the mm-less task should have access to the
> > > > | victim task, right?
> > > >
> > > > Have you addressed that? I don't see it...
> > >
> > > Nope.
> > > That's going to require maintaining state about all the
> > > tasks in the chain that might still have cache involvement.
> > >
> > > A -> B -> C -> D
> >
> > Really?
>
> I am willing to be educated otherwise. My understanding
> of Modern Processor Technology will never be so deep that
> I won't listen to reason.
>
> > From what I can tell, it'd be enough to:
> >
> > - ensure that the LSM-based access checks behave approximately
> >   transitively (which I think they already do, mostly)
>
> Smack rules are explicitly and intentionally not transitive.
> A reads B, B reads C does *not* imply A reads C.

Ah. :( Well, at least for UID-based checks, capability comparisons
and namespace comparisons, the relationship should be transitive,
right?

> > - keep a copy of the metadata of the last non-kernel task on the CPU
>
> Do you have a suggestion of how one might do that?
> I'm willing to believe the information could be available,
> but I have yet to come up with a mechanism for getting it.

The obvious solution would be to take a refcounted reference on the
old task's objective creds, but you probably want to avoid the
resulting cache line bouncing...

For safe_by_uid(), I think you could get away with just stashing the
last UID in a percpu variable, instead of keeping the full creds
struct around. That should be fairly cheap?

Namespace comparisons, and whatever SELinux/Smack/AppArmor do
internally, are probably more complicated, since you'd potentially
have to deal with changes of internal IDs and such if the policy
gets reloaded in the wrong moment. For namespaces, perhaps you could
give each namespace a unique 128-bit ID and then save and compare
those, just like UIDs. For LSMs whose internal IDs might change
after a policy reload, things would probably be more messy. Perhaps
you could save, e.g. for SELinux, something like an
(sid, policy_generation_counter) pair? I don't know all that much
about the internals of classic LSMs.
> > > If B and C don't do anything cacheworthy D could conceivably attack A.
> > > The amount of state required to detect this case would be prohibitive.
> > > I think that if you're sufficiently concerned about this case you
> > > should just go ahead and set the barrier. I'm willing to learn
> > > something that says I'm wrong.
> >
> > That means that an attacker who can e.g. get a CPU to first switch
> > from an attacker task to a softirqd (e.g. for network packet
> > processing or whatever), then switch from the softirqd to a
> > root-owned victim task would be able to bypass the check, right?
> > That doesn't sound like a very complicated attack...
>
> Maybe my brain is still stuck in the 1980's, but that sounds pretty
> complicated to me! Of course, the fact that it's beyond where I would
> go doesn't mean it's implausible.

It seems to me like this could happen relatively easily if you have
one attacker task that keeps calling sched_yield() together with a
victim task on a logical core that's also running a softirqd?
Attacker voluntarily preempts, softirqd runs for packet processing,
softirqd ends processing, kernel schedules victim? I'm not sure how
high the injection success rate would be with that, though.

> > I very much dislike the idea of adding a mitigation with a known
> > bypass technique to the kernel.
>
> That's fair. I'll look more closely at getting previous_cred_this_cpu().
>
> Thanks!