From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v4 0/8] fuse,virtiofs: support per-file DAX
To: Miklos Szeredi
Cc: Vivek Goyal, Stefan Hajnoczi, linux-fsdevel@vger.kernel.org,
 virtualization@lists.linux-foundation.org, virtio-fs-list, Joseph Qi, Liu Bo
References: <20210817022220.17574-1-jefflexu@linux.alibaba.com>
 <6043c0b8-0ff1-2e11-0dd0-e23f9ff6b952@linux.alibaba.com>
From: JeffleXu
Date: Fri, 3 Sep 2021 13:30:58 +0800
X-Mailing-List: linux-fsdevel@vger.kernel.org

On 8/17/21 10:08 PM, Miklos Szeredi wrote:
> On Tue, 17 Aug 2021 at 15:22, JeffleXu wrote:
>>
>>
>>
>> On 8/17/21 8:39 PM, Vivek Goyal wrote:
>>> On Tue, Aug 17, 2021 at 10:06:53AM +0200, Miklos Szeredi wrote:
>>>> On Tue, 17 Aug 2021 at 04:22, Jeffle Xu wrote:
>>>>>
>>>>> This patchset adds support of per-file DAX for virtiofs, which is
>>>>> inspired by Ira Weiny's work on ext4[1] and xfs[2].
>>>>
>>>> Can you please explain the background of this change in detail?
>>>>
>>>> Why would an admin want to enable DAX for a particular virtiofs file
>>>> and not for others?
>>>
>>> Initially I thought that they needed it because they are downloading
>>> files on the fly from server. So they don't want to enable dax on the file
>>> till file is completely downloaded.
>>
>> Right, it's our initial requirement.
>>
>>
>>> But later I realized that they should
>>> be able to block in FUSE_SETUPMAPPING call and make sure associated
>>> file section has been downloaded before returning and solve the problem.
>>> So that can't be the primary reason.
>>
>> Saying we want to access 4KB of one file inside guest, if it goes
>> through FUSE request routine, then the fuse daemon only need to download
>> this 4KB from remote server. But if it goes through DAX, then the fuse
>> daemon need to download the whole DAX window (e.g., 2MB) from remote
>> server, so called amplification. Maybe we could decrease the DAX window
>> size, but it's a trade off.
>
> That could be achieved with a plain fuse filesystem on the host (which
> will get 4k READ requests for accesses to mapped area inside guest).
> Since this can be done selectively for files which are not yet
> downloaded, the extra layer wouldn't be a performance problem.
>
> Is there a reason why that wouldn't work?

I wasn't aware of this mechanism (working around the problem from user
space) before sending this patch set. After learning more about
virtualization and KVM, I find that, as Vivek Goyal replied in [1],
virtiofsd/qemu would need to somehow hook the user page fault and then
download the remaining part.

IMHO, this mechanism (implementing a plain fuse filesystem on the host,
as you proposed) seems a little bit complicated so far.

[1] https://lore.kernel.org/linux-fsdevel/YR08KnP8cO8LjKY7@redhat.com/

-- 
Thanks,
Jeffle
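
As a side note, the read amplification mentioned in the quoted text
(4KB accessed by the guest versus a whole 2MB DAX window fetched from
the remote server) can be illustrated with a little arithmetic. This is
only a rough sketch: the function name is hypothetical, and it assumes
the accessed range is window-aligned so that it never straddles a
window boundary. Nothing here reflects actual virtiofs or virtiofsd
code.

```python
def read_amplification(request_bytes: int, window_bytes: int) -> float:
    """Return bytes fetched per byte requested when data is fetched in whole windows.

    Assumption (not from the thread): the request starts on a window
    boundary, so it touches ceil(request / window) windows.
    """
    if request_bytes <= 0 or window_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: number of whole windows needed to cover the request.
    windows_touched = -(-request_bytes // window_bytes)
    return windows_touched * window_bytes / request_bytes


KIB = 1024
MIB = 1024 * KIB

# Sizes taken from the example in the thread: a 4KB access, a 2MB DAX window.
factor = read_amplification(4 * KIB, 2 * MIB)
print(f"amplification: {factor:.0f}x")
```

With these sizes the daemon would fetch 512 times more data than the
guest asked for. Shrinking the window lowers the factor, at the cost of
needing more mappings for the same range, which is the trade-off
mentioned above.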