From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 14 Jan 2019 21:19:09 -0500
From: "Michael S. Tsirkin"
To: Dave Chinner
Cc: Dan Williams, Pankaj Gupta, Matthew Wilcox,
	Linux Kernel Mailing List, KVM list, Qemu Developers, linux-nvdimm,
	linux-fsdevel, virtualization@lists.linux-foundation.org, Linux ACPI,
	linux-ext4, linux-xfs, Jan Kara, Stefan Hajnoczi, Rik van Riel,
	Nitesh Narayan Lal, Kevin Wolf, Paolo Bonzini, Ross Zwisler,
	vishal l verma, dave jiang, David Hildenbrand, jmoyer,
	xiaoguangrong eric, Christoph Hellwig, Jason Wang,
	lcapitulino@redhat.com, Igor Mammedov, Eric Blake, Theodore Ts'o,
	adilger kernel, darrick wong, "Rafael J. Wysocki"
Subject: Re: [PATCH v3 0/5] kvm "virtio pmem" device
Message-ID: <20190114205031-mutt-send-email-mst@kernel.org>
References: <20190109144736.17452-1-pagupta@redhat.com>
	<20190110012617.GA4205@dastard>
	<1326478078.61913951.1547192704870.JavaMail.zimbra@redhat.com>
	<20190113232902.GD4205@dastard>
	<20190113233820.GX6310@bombadil.infradead.org>
	<942065073.64011540.1547450140670.JavaMail.zimbra@redhat.com>
	<20190114212501.GG4205@dastard>
	<20190114222132.GH4205@dastard>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190114222132.GH4205@dastard>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-ID: <20190115021909.4wobVUvGxx0lteqsX4YuQ2-kXqCz2I1A6rz5zY0UuBI@z>

On Tue, Jan 15, 2019 at 09:21:32AM +1100, Dave Chinner wrote:
> On Mon, Jan 14, 2019 at 01:35:57PM -0800, Dan Williams wrote:
> > On Mon, Jan 14, 2019 at 1:25 PM Dave Chinner wrote:
> > >
> > > On Mon, Jan 14, 2019 at 02:15:40AM -0500, Pankaj Gupta wrote:
> > > >
> > > > > > Until you have images (and hence host page cache) shared between
> > > > > > multiple guests. People will want to do this, because it means they
> > > > > > only need a single set of pages in host memory for executable
> > > > > > binaries rather than a set of pages per guest. Then you have
> > > > > > multiple guests being able to detect residency of the same set of
> > > > > > pages. If the guests can then, in any way, control eviction of the
> > > > > > pages from the host cache, then we have a guest-to-guest information
> > > > > > leak channel.
> > > > >
> > > > > I don't think we should ever be considering something that would allow a
> > > > > guest to evict page's from the host's pagecache [1]. The guest should
> > > > > be able to kick its own references to the host's pagecache out of its
> > > > > own pagecache, but not be able to influence whether the host or another
> > > > > guest has a read-only mapping cached.
> > > > >
> > > > > [1] Unless the guest is allowed to modify the host's file; obviously
> > > > > truncation, holepunching, etc are going to evict pages from the host's
> > > > > page cache.
> > > >
> > > > This is so correct. Guest does not not evict host page cache pages directly.
> > >
> > > They don't right now.
> > >
> > > But someone is going to end up asking for discard to work so that
> > > the guest can free unused space in the underlying spares image (i.e.
> > > make use of fstrim or mount -o discard) because they have workloads
> > > that have bursts of space usage and they need to trim the image
> > > files afterwards to keep their overall space usage under control.
> > >
> > > And then....
> >
> > ...we reject / push back on that patch citing the above concern.
>
> So at what point do we draw the line?
>
> We're allowing writable DAX mappings, but as I've pointed out that
> means we are going to be allowing a potential information leak via
> files with shared extents to be directly mapped and written to.
>
> But we won't allow useful admin operations that allow better
> management of host side storage space similar to how normal image
> files are used by guests because it's an information leak vector?
>
> That's splitting some really fine hairs there...

May I summarize that the security implications need to be documented?

In fact that would make a fine security implications section in the
device specification.

> > > > In case of virtio-pmem & DAX, guest clears guest page cache exceptional entries.
> > > > Its solely decision of host to take action on the host page cache pages.
> > > >
> > > > In case of virtio-pmem, guest does not modify host file directly i.e don't
> > > > perform hole punch & truncation operation directly on host file.
> > >
> > > ... this will no longer be true, and the nuclear landmine in this
> > > driver interface will have been armed....
> >
> > I agree with the need to be careful when / if explicit cache control
> > is added, but that's not the case today.
>
> "if"?
>
> I expect it to be "when", not if. Expect the worst, plan for it now.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com