In-Reply-To: <20180703100555.GG16791@stefanha-x1.localdomain>
References: <20180331084500.33313-1-jiangshanlai@gmail.com>
 <20180702131054.GE2155@stefanha-x1.localdomain>
 <20180703100555.GG16791@stefanha-x1.localdomain>
From: Peng Tao
Date: Tue, 3 Jul 2018 23:10:12 +0800
Subject: Re: [Qemu-devel] [PATCH] migration: add capability to bypass the shared memory
To: Stefan Hajnoczi
Cc: Lai Jiangshan, Samuel Ortiz, Xu Wang, qemu-devel@nongnu.org,
 "James O . D . Hunt", "Dr. David Alan Gilbert", Markus Armbruster,
 Juan Quintela, Sebastien Boeuf, Xiao Guangrong, Xiao Guangrong,
 Paolo Bonzini, Andrea Arcangeli, Marcelo Tosatti,
 kata-dev@lists.katacontainers.io

On Tue, Jul 3, 2018 at 6:05 PM, Stefan Hajnoczi wrote:
> On Mon, Jul 02, 2018 at 09:52:08PM +0800, Peng Tao wrote:
>> On Mon, Jul 2, 2018 at 9:10 PM, Stefan Hajnoczi wrote:
>> > On Sat, Mar 31, 2018 at 04:45:00PM +0800, Lai Jiangshan wrote:
>> > Risks:
>> > 1. If one cloned VM is exploited then all other VMs are more likely to
>> >    be exploitable (e.g. kernel address space layout randomization).
>> w.r.t.
>> KASLR, any memory duplication technology would expose it. I
>> remember there are CVEs (e.g., CVE-2015-2877) specific to this kind
>> attack against KSM and it was stated that "Basically if you care about
>> this attack vector, disable deduplication.". Share-until-written
>> approaches for memory conservation among mutually untrusting tenants
>> are inherently detectable for information disclosure, and can be
>> classified as potentially misunderstood behaviors rather than
>> vulnerabilities. [1]
>>
>> I think the same applies to vm templating as well. Actually VM
>> templating is more useful (than KSM) in this regard since we can
>> create a template for each trusted tenant where as with KSM all VMs on
>> a host are treated equally.
>>
>> [1] https://access.redhat.com/security/cve/cve-2015-2877
>
> That solves the problem between untrusted users but a breach in one
> clone may reveal secrets of all other clones belonging to the same
> tenant. As a user, I would be uncomfortable knowing that if one of my
> machines is breached then secrets used by all of my machines might be
> exposed.
>
Secrets are really point 2 in your list and I'll answer it below :)

>> > 2. If you give VMs cloned from the same template to untrusted users,
>> >    they may be able to determine the secrets other users' VMs.
>> In kata and runv, vm templating is used carefully so that we do not
>> use or save any secret keys before creating the template VM. IOW, the
>> feature is not supposed to be used generally to create any template
>> VMs at any stage.
>
> At what point are templates captured to avoid these problems? Is there
> code that shows how to do this?
>
Both runv and kata pause the VM right after the agent inside the guest
is up and running, which, in the initramfs case, corresponds to the
point where the kernel has booted and the init process starts.
If you are interested in seeing the actual code, you can look at
https://github.com/hyperhq/hyperstart/ and
https://github.com/kata-containers/agent for what is done in the guest
at that point. If you see any secrets being saved there, I'll be more
than happy to fix it. :)

>> > Security is a
>> > major factor for using Kata, so it's important not to leak secrets
>> > between cloned VMs.
>> >
>> Yes, indeed! And it is all about trade-offs, VM templating or KSM. If
>> we want security above anything, we should just disable all the
>> sharing. But there is actually no ceiling (think about physical
>> isolation!). So it's more about trade-offs. With Kata, VM templating
>> and KSM give users options to achieve better performance and lower
>> memory footprint with little sacrifice. The security advantage of
>> running VM-based containers is still there.
>
> Adding options to enable/disable features leads to confusion among
> users, makes performance comparisons harder, and increases support
> overhead.
>
> Technical solutions to the security problems are possible. I'm
> interested in progress in this area because it means users don't need to
> make a choice, they can benefit from the feature without sacrificing
> security.
>
Well, that is really beyond the scope of reviewing this particular QEMU
patch. But as a Kata developer, I can answer it anyway.

For one thing, Kata already has quite a few configuration options that
let users choose different features. For another, Kata already ships
with KSM support by default, and VM templating in Kata is better than
KSM in many respects (e.g., it provides a similar level of memory
conservation without affecting host and guest performance). So from
Kata Containers' point of view, it makes sense to have VM templating
support and let users decide which one they want to use.

PS: CCing kata-dev since the discussion is starting to touch on
Kata-specific usage.

Cheers,
Tao