Date: Thu, 27 Sep 2018 22:28:03 +0200 (CEST)
From: Thomas Gleixner
To: Stephen Smalley
cc: Jiri Kosina, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
    Andrea Arcangeli, "Woodhouse, David", Andi Kleen, Tim Chen,
    "Schaufler, Casey", linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak
In-Reply-To: <7f9e1a22-37a8-db88-ffc0-91961174ced4@tycho.nsa.gov>
References: <7f9e1a22-37a8-db88-ffc0-91961174ced4@tycho.nsa.gov>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 27 Sep 2018, Stephen Smalley wrote:
> On 09/25/2018 08:38 AM, Jiri Kosina wrote:
> > +static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
> > +{
> > +	/*
> > +	 * Check if the current (previous) task has access to the memory
> > +	 * of the @tsk (next) task. If access is denied, make sure to
> > +	 * issue a IBPB to stop user->user Spectre-v2 attacks.
> > +	 *
> > +	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
> > +	 */
> > +	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
> > +		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
>
> Would there be any safe way to perform the ptrace check earlier at a point
> where the locking constraints are less severe, and just pass down the result
> to this code? Possibly just defaulting to enabling IBPB for safety if
> something changed in the interim that would invalidate the earlier ptrace
> check? Probably not possible, but I thought I'd ask as it would avoid the
> need to skip both the ptrace_has_cap check and the LSM hook, and would reduce
> the critical section.

Unfortunately it's not possible: this happens under the scheduler run
queue lock, which has to be taken to figure out which task is next. We
can't drop it before the context switch and revisit the decision afterwards
to verify it; that would be a massive performance issue and open an even
more horrible can of worms.

Any check which needs to be done in that context should be as minimal as
possible. So having a special mode which then might invoke special hooks
makes a lot of sense.
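
To make the "special mode" idea concrete, here is a minimal userspace sketch and not the kernel code. Everything in it is hypothetical and exists only for illustration: the toy_* names, the MODE_* flags and the uid comparison are assumptions, and the two hook functions merely stand in for the audit and LSM calls that the quoted comment says must be skipped. It shows one access check with two modes, where the scheduler mode returns after the cheap comparison and never reaches the heavier hooks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode flags; these are NOT the kernel's PTRACE_MODE_* values. */
enum toy_access_mode {
    MODE_REGULAR = 0,      /* normal path: full checks, hooks allowed */
    MODE_SCHED   = 1 << 0, /* context-switch path: cheap checks only */
};

/* Toy task with only the field this sketch needs. */
struct toy_task {
    unsigned int uid;
};

/* Counters so a caller can observe which heavy hooks actually ran. */
static int audit_calls;
static int lsm_calls;

/* Stand-in for an LSM hook that may take locks of its own. */
static bool toy_lsm_hook(const struct toy_task *t)
{
    (void)t;
    lsm_calls++;
    return true;
}

/* Stand-in for an audit call that is unsafe under the run queue lock. */
static bool toy_audit_hook(const struct toy_task *t)
{
    (void)t;
    audit_calls++;
    return true;
}

/* Returns true if 'prev' may access 'next', false on denial. */
static bool toy_may_access(const struct toy_task *prev,
                           const struct toy_task *next,
                           unsigned int mode)
{
    assert(prev && next);

    /* Cheap check: a plain field comparison, no locks, no sleeping. */
    if (prev->uid != next->uid)
        return false;

    /* Scheduler mode stops here and skips the LSM and audit hooks. */
    if (mode & MODE_SCHED)
        return true;

    /* The regular mode is free to run the heavier hooks. */
    return toy_lsm_hook(next) && toy_audit_hook(next);
}
```

Calling toy_may_access() with MODE_SCHED leaves both counters untouched, while MODE_REGULAR increments each once for tasks with matching uids. The point of the split is that the caller on the hot path picks the mode, so the check itself never has to guess which locks are held.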

> > + * Returns true on success, false on denial.
> > + *
> > + * Similar to ptrace_may_access(). Only to be called from context switch
> > + * code. Does not call into audit and the regular LSM hooks due to locking
> > + * constraints.
>
> Pardon my ignorance, but can you clarify exactly what are the locking
> constraints for any code that might be called now or in the future from
> ptrace_may_access_sched(). What's permissible? rcu_read_lock()?

rcu_read_lock() is fine. Locks might be fine, but the probability that you
run into a lock inversion is extremely high. Also please keep in mind that
any such lock has to be a raw_spinlock, as otherwise preempt-RT will have
issues, and the lock sections need to be really short. switch_to() is a hot
path.

Thanks,

	tglx