Subject: Re: Runtime Memory Validation in Intel-TDX and AMD-SNP
From: Andi Kleen
Date: Tue, 20 Jul 2021 10:32:51 -0700
To: Joerg Roedel
Cc: Erdem Aktas, Andy Lutomirski, David Rientjes, Borislav Petkov,
    Sean Christopherson, Andrew Morton, Vlastimil Babka,
    "Kirill A. Shutemov", Brijesh Singh, Tom Lendacky, Jon Grimm,
    Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar,
    "Kaplan, David", Varad Gautam, Dario Faggioli, x86,
    linux-mm@kvack.org, linux-coco@lists.linux.dev

On 7/20/2021 2:11 AM, Joerg Roedel wrote:
>
> I am not sure how it is implemented in TDX hardware, but for SNP the
> guest _must_ _not_ double-validate or even double-invalidate memory.

In TDX it just zeroes the data. If you can tolerate zeroing it's fine.
Of course for most data that's not tolerable, but for kexec (minus the kernel itself) it is.

> What I sent here is actually v2 of my proposal, v1 had a much more lazy
> approach like you are proposing here. But as I learned what can happen
> is this:
>
>     * Hypervisor maps GPA X to HPA A
>     * Guest validates GPA X
>       Hardware enforces that HPA A always maps to GPA X
>     * Hypervisor remaps GPA X to HPA B
>     * Guest lazily re-validates GPA X
>       Hardware enforces that HPA B always maps to GPA X
>
> The situation we have now is that host pages A and B are validated for
> the same guest page, and the hypervisor can switch between them at will,
> without the guest being able to notice it.

I don't believe that's possible on TDX.

> This can open various attack vectors from the hypervisor towards the
> guest, like tricking the guest into a code-path where it accidentally
> reveals its secrets.

Well, things would certainly be easier if you had a purge interface then.

But for the kexec crash case it would be just attacks against the crash dump, which I assume are not a real security concern. The crash kexec mostly runs in its own memory, which doesn't need this, or is small enough that it can be fully pre-accepted. And for the previous memory view probably these issues are acceptable.

That leaves the non-crash kexec case, but perhaps it is acceptable to just restart the guest in such a case instead of creating complicated and fragile new interfaces.

>> If the device filter is active it won't.
> We are not going to prohibit dma_alloc_coherent() in SNP guests just
> because we are too lazy to implement memory re-validation.

dma_alloc_coherent is of course allowed, just not freeing. Or rather, if you free you would need a pool to recycle there. If you have anything that frees coherent DMA memory frequently the performance would be terrible, so you should probably avoid that at all costs anyways.
But since pretty much all the current IO models rely on a small number of static bounce buffers, that's not a problem.

-Andi
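
For readers following the quoted remap sequence (map X to A, validate, remap X to B, lazily re-validate), here is a minimal toy model in Python. It is only a sketch of the bookkeeping described in those steps. Every name in it (ToyHost, LazyGuest, TrackingGuest, replay) is hypothetical and invented for illustration; none of it is an SNP, TDX, or kernel interface, and it makes no claim about how the hardware actually stores validation state.

```python
# Toy model only. All names are hypothetical; this is not SNP/TDX behavior,
# just the bookkeeping from the quoted step list.

class ToyHost:
    """Hypervisor-controlled GPA -> HPA mapping plus a per-HPA validation record."""

    def __init__(self):
        self.gpa_to_hpa = {}     # current mapping, chosen by the hypervisor
        self.validated_for = {}  # hpa -> gpa that the host page was validated for

    def map(self, gpa, hpa):
        # The hypervisor may change the backing page at any time.
        self.gpa_to_hpa[gpa] = hpa

    def validate(self, gpa):
        # Guest-issued validation applies to whichever HPA backs the GPA right now.
        hpa = self.gpa_to_hpa[gpa]
        self.validated_for[hpa] = gpa

    def backings(self, gpa):
        # Every host page that currently counts as validated for this GPA.
        return sorted(h for h, g in self.validated_for.items() if g == gpa)


class LazyGuest:
    """Validates on every touch and keeps no record of earlier validations."""

    def __init__(self, host):
        self.host = host

    def touch(self, gpa):
        self.host.validate(gpa)


class TrackingGuest:
    """Remembers validated GPAs and never validates the same GPA twice."""

    def __init__(self, host):
        self.host = host
        self.validated = set()

    def touch(self, gpa):
        if gpa in self.validated:
            return False  # refuse to re-validate
        self.host.validate(gpa)
        self.validated.add(gpa)
        return True


def replay(guest_cls):
    """Replay the quoted sequence and return the host pages validated for X."""
    host = ToyHost()
    guest = guest_cls(host)
    host.map("X", "A")   # hypervisor maps GPA X to HPA A
    guest.touch("X")     # guest validates GPA X
    host.map("X", "B")   # hypervisor remaps GPA X to HPA B
    guest.touch("X")     # guest touches GPA X again
    return host.backings("X")


if __name__ == "__main__":
    print("lazy guest:    ", replay(LazyGuest))
    print("tracking guest:", replay(TrackingGuest))
```

In this toy, the lazy guest ends up with both A and B validated for X, which is the situation the quoted text warns about. The tracking guest keeps only A. What a real guest should do on the second touch (fault, panic, restart) is outside what the sketch models.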