Subject: Re: [RFC 00/16] KVM protected memory extension
To: "Kirill A. Shutemov"
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Paolo Bonzini,
 Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
 Joerg Roedel, David Rientjes, Andrea Arcangeli, Kees Cook, Will Drewry,
 "Edgecombe, Rick P", "Kleen, Andi", x86@kernel.org, kvm@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
References: <20200522125214.31348-1-kirill.shutemov@linux.intel.com>
 <42685c32-a7a9-b971-0cf4-e8af8d9a40c6@oracle.com>
 <20200525144656.phfxjp2qip6736fj@box>
From: Liran Alon
Message-ID: <29c62691-0d50-8a02-5f43-761fa56ab551@oracle.com>
Date: Mon, 25 May 2020 18:56:57 +0300
In-Reply-To: <20200525144656.phfxjp2qip6736fj@box>

On 25/05/2020 17:46, Kirill A.
Shutemov wrote:
> On Mon, May 25, 2020 at 04:47:18PM +0300, Liran Alon wrote:
>> On 22/05/2020 15:51, Kirill A. Shutemov wrote:
>>> == Background / Problem ==
>>>
>>> There are a number of hardware features (MKTME, SEV) which protect guest
>>> memory from some unauthorized host access. The patchset proposes a purely
>>> software feature that mitigates some of the same host-side read-only
>>> attacks.
>>>
>>>
>>> == What does this set mitigate? ==
>>>
>>>   - Host kernel "accidental" access to guest data (think speculation)
>> Just to clarify: This is any host kernel memory info-leak vulnerability. Not
>> just speculative execution memory info-leaks. Also architectural ones.
>>
>> In addition, note that removing guest data from host kernel VA space also
>> makes guest<->host memory exploits more difficult.
>> E.g. Guest cannot use already available memory buffer in kernel VA space for
>> ROP or placing valuable guest-controlled code/data in general.
>>
>>>   - Host kernel induced access to guest data (write(fd, &guest_data_ptr, len))
>>>
>>>   - Host userspace access to guest data (compromised qemu)
>> I don't quite understand what is the benefit of preventing userspace VMM
>> access to guest data while the host kernel can still access it.
> Let me clarify: the guest memory mapped into host userspace is not
> accessible by both host kernel and userspace. Host still has way to access
> it via a new interface: GUP(FOLL_KVM). The GUP will give you struct page
> that kernel has to map (temporarily) if need to access the data. So only
> blessed codepaths would know how to deal with the memory.

Yes, I understood that. I meant explicit host kernel access.

>
> It can help preventing some host->guest attack on the compromised host.
> Like if an VM has successfully attacked the host it cannot attack other
> VMs as easy.

We have mechanisms to sandbox the userspace VMM process for that.
You need to be more specific about the attack scenario you are trying to
address here that is not covered by existing mechanisms, i.e. be crystal
clear on the extra value of not exposing guest data to the userspace VMM.

>
> It would also help to protect against guest->host attack by removing one
> more places where the guest's data is mapped on the host.

Because the guest has an explicit interface to request which guest pages
can be mapped in the userspace VMM, the value of this is very small.

The guest can already map guest-controlled code/data in the userspace
VMM, either via this interface or by forcing the userspace VMM to create
various objects during device emulation handling. The only extra property
this patch series provides is that only a small portion of guest pages
will be mapped to host userspace instead of all of them, resulting in
smaller regions for exploits that require guessing a virtual address. But:
(a) Userspace VMM device emulation may still allow the guest to spray the
userspace heap with objects containing guest-controlled data.
(b) How is the userspace VMM supposed to limit which guest pages should
not be mapped to the userspace VMM even though the guest has explicitly
requested them to be mapped? (E.g. because they are valid DMA
sources/targets for virtual devices or because they are the vGPU
frame-buffer.)

>> QEMU is more easily compromised than the host kernel because it's
>> guest<->host attack surface is larger (E.g. Various device emulation).
>> But this compromise comes from the guest itself. Not other guests. In
>> contrast to host kernel attack surface, which an info-leak there can
>> be exploited from one guest to leak another guest data.
> Consider the case when unprivileged guest user exploits bug in a QEMU
> device emulation to gain access to data it cannot normally have access
> within the guest. With the feature it would able to see only other shared
> regions of guest memory such as DMA and IO buffers, but not the rest.

This is a scenario where an unprivileged guest userspace process has
direct access to a virtual device and is able to exploit a bug in device
emulation handling in a way that lets it compromise security *inside* the
guest, i.e. leak guest kernel data or other guest userspace processes'
data.

That's true. Good point. This is a very important argument that is
missing from the cover letter.

Now the trade-off considered here is crystal clear: Is the extra
complexity and perf cost introduced by the mechanism of this patch series
worth it to protect against the scenario of a userspace VMM vulnerability
that may be reachable by an unprivileged guest userspace process, allowing
it to leak other *in-guest* data that is not otherwise accessible to that
process?

-Liran
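
The trade-off above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not code from the patch series, and every name in it (GuestPage, vmm_visible, leaked_secrets, the "shared" and "secret" flags) is hypothetical. It assumes that without the feature the userspace VMM maps all guest pages, and that with the feature it maps only the pages the guest explicitly shared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuestPage:
    gfn: int      # guest frame number (illustrative only)
    shared: bool  # guest explicitly shared this page with the host (e.g. DMA buffer)
    secret: bool  # holds data an unprivileged guest process must not read


def vmm_visible(pages, protected_memory):
    """Pages a compromised userspace VMM could read through its own mappings."""
    if not protected_memory:
        # Assumption: without the feature, all guest memory is mapped in the VMM.
        return list(pages)
    # Assumption: with the feature, only explicitly shared pages are mapped.
    return [p for p in pages if p.shared]


def leaked_secrets(pages, protected_memory):
    """Frame numbers of secret pages reachable via a device-emulation bug."""
    return [p.gfn for p in vmm_visible(pages, protected_memory) if p.secret]


pages = [
    GuestPage(0, shared=True, secret=False),   # plain DMA buffer
    GuestPage(1, shared=False, secret=True),   # guest kernel data
    GuestPage(2, shared=True, secret=True),    # secret that sits in a shared I/O buffer
    GuestPage(3, shared=False, secret=False),  # ordinary private page
]

print(leaked_secrets(pages, protected_memory=False))
print(leaked_secrets(pages, protected_memory=True))
```

In this model the feature removes the private secret page from the VMM's reach, but a secret placed in a shared buffer stays exposed, which matches the "only other shared regions" caveat in the quoted text.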