References: <061ccd49-3b9f-d603-bafd-61a067c3f6fa@intel.com>
From: Marc Orr
Date: Sat, 13 Nov 2021 23:54:03 -0800
Subject: Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
To: Sean Christopherson
Cc: Peter Gonda, Borislav Petkov, Dave Hansen, Brijesh Singh, x86@kernel.org,
 linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-coco@lists.linux.dev,
 linux-mm@kvack.org, linux-crypto@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
 Joerg Roedel, Tom Lendacky, "H. Peter Anvin", Ard Biesheuvel, Paolo Bonzini,
 Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Andy Lutomirski, Dave Hansen,
 Sergio Lopez, Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
 Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka, "Kirill A .
 Shutemov", Andi Kleen, tony.luck@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Nov 13, 2021 at 10:35 AM Sean Christopherson wrote:
>
> On Fri, Nov 12, 2021, Marc Orr wrote:
> > > > > If *it* is the host kernel, then you probably shouldn't do that -
> > > > > otherwise you just killed the host kernel on which all those guests are
> > > > > running.
> > > >
> > > > I agree, it seems better to terminate the single guest with an issue.
> > > > Rather than killing the host (and therefore all guests). So I'd
> > > > suggest even in this case we do the 'convert to shared' approach or
> > > > just outright terminate the guest.
> > > >
> > > > Are there already examples in KVM of a KVM bug in servicing a VM's
> > > > request results in a BUG/panic/oops? That seems not ideal ever.
> > >
> > > Plenty of examples. kvm_spurious_fault() is the obvious one. Any NULL pointer
> > > deref will lead to a BUG, etc... And it's not just KVM, e.g. it's possible, if
> > > unlikely, for the core kernel to run into guest private memory (e.g. if the kernel
> > > botches an RMP change), and if that happens there's no guarantee that the kernel
> > > can recover.
> > >
> > > I fully agree that ideally KVM would have a better sense of self-preservation,
> > > but IMO that's an orthogonal discussion.
> >
> > I don't think we should treat the possibility of crashing the host
> > with live VMs nonchalantly. It's a big deal. Doing so has big
> > implications on the probability that any cloud vendor will be able to
> > deploy this code to production. And aren't cloud vendors one of the
> > main use cases for all of this confidential compute stuff? I'm
> > honestly surprised that so many people are OK with crashing the host.
>
> I'm not treating it nonchalantly, merely acknowledging that (a) some flavors of kernel
> bugs (or hardware issues!) are inherently fatal to the system, and (b) crashing the
> host may be preferable to continuing on in certain cases, e.g. if continuing on has a
> high probability of corrupting guest data.

I disagree. Crashing the host -- and _ALL_ of its VMs (including
non-confidential VMs) -- is not preferable to crashing a single SNP
VM. Especially when that SNP VM is guaranteed to detect the memory
corruption and react accordingly.