From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from userp2130.oracle.com ([156.151.31.86]:35796 "EHLO
	userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1726248AbeLLUa4 (ORCPT );
	Wed, 12 Dec 2018 15:30:56 -0500
Date: Wed, 12 Dec 2018 15:30:49 -0500
From: Konrad Rzeszutek Wilk
To: Vivek Goyal
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, miklos@szeredi.hu, stefanha@redhat.com,
	dgilbert@redhat.com, sweil@redhat.com, swhiteho@redhat.com
Subject: Re: [PATCH 00/52] [RFC] virtio-fs: shared file system for virtual machines
Message-ID: <20181212203049.GH9077@char.us.oracle.com>
References: <20181210171318.16998-1-vgoyal@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20181210171318.16998-1-vgoyal@redhat.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Mon, Dec 10, 2018 at 12:12:26PM -0500, Vivek Goyal wrote:
> Hi,
>
> Here are RFC patches for virtio-fs. Looking for feedback on this approach.
>
> These patches should apply on top of 4.20-rc5. We have also put code for
> various components here.
>
> https://gitlab.com/virtio-fs
>
> Problem Description
> ===================
> We want to be able to take a directory tree on the host and share it with
> guest[s]. Our goal is to be able to do it in a fast, consistent and secure
> manner. Our primary use case is kata containers, but it should be usable in
> other scenarios as well.
>
> Containers may rely on local file system semantics for shared volumes,
> read-write mounts that multiple containers access simultaneously. File
> system changes must be visible to other containers with the same consistency
> expected of a local file system, including mmap MAP_SHARED.
>
> Existing Solutions
> ==================
> We looked at existing solutions and virtio-9p already provides basic shared
> file system functionality although does not offer local file system semantics,
> causing some workloads and test suites to fail. In addition, virtio-9p
> performance has been an issue for Kata Containers and we believe this cannot
> be alleviated without major changes that do not fit into the 9P protocol.
>
> Design Overview
> ===============
> With the goal of designing something with better performance and local file
> system semantics, a bunch of ideas were proposed.
>
> - Use fuse protocol (instead of 9p) for communication between guest
>   and host. Guest kernel will be fuse client and a fuse server will
>   run on host to serve the requests. Benchmark results (see below) are
>   encouraging and show this approach performs well (2x to 8x improvement
>   depending on test being run).
>
> - For data access inside guest, mmap portion of file in QEMU address
>   space and guest accesses this memory using dax. That way guest page
>   cache is bypassed and there is only one copy of data (on host). This
>   will also enable mmap(MAP_SHARED) between guests.
>
> - For metadata coherency, there is a shared memory region which contains
>   version number associated with metadata and any guest changing metadata
>   updates version number and other guests refresh metadata on next
>   access. This is still experimental and implementation is not complete.

What about Windows guests or BSD ones? Is there a plan to make that work with
them as well?

What about the Virtio spec? Plans to make changes there as well?
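
For anyone following the metadata coherency bullet quoted above, here is a
minimal Python sketch of the version-number idea as the cover letter describes
it. It is only an illustration under stated assumptions and is not taken from
the patches: the anonymous mmap stands in for the shared memory region, and
the names region, host_metadata, Guest, read_version and bump_version are all
hypothetical. A single global counter is assumed here; the cover letter does
not say how versions map to individual files, and a real implementation would
need atomic updates.

```python
import mmap
import struct

# Assumption: an anonymous mapping stands in for the shared memory region.
# It holds one little-endian 64-bit version counter, initially zero.
region = mmap.mmap(-1, 8)

# Assumption: a plain dict stands in for metadata kept on the host side.
host_metadata = {"size": 0}


def read_version():
    """Return the current version number from the shared region."""
    return struct.unpack_from("<Q", region, 0)[0]


def bump_version():
    """Increment the version number (not atomic; illustration only)."""
    struct.pack_into("<Q", region, 0, read_version() + 1)


class Guest:
    """Hypothetical guest that caches metadata with the version it saw."""

    def __init__(self):
        self.cached = None
        self.seen_version = None
        self.refreshes = 0

    def get_metadata(self):
        # Refresh on access only when the shared version has moved on.
        current = read_version()
        if self.cached is None or self.seen_version != current:
            self.cached = dict(host_metadata)
            self.seen_version = current
            self.refreshes += 1
        return self.cached

    def set_size(self, size):
        # Any guest changing metadata updates the version number.
        host_metadata["size"] = size
        bump_version()


guest_a = Guest()
guest_b = Guest()
guest_b.get_metadata()                 # guest_b caches size 0 at version 0
guest_a.set_size(4096)                 # guest_a changes metadata, version -> 1
print(guest_b.get_metadata()["size"])  # prints 4096: guest_b refreshed
```

In this sketch the reader pays one counter read per access and refetches only
after a change, which is the trade-off the bullet appears to be aiming for.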