From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shiyang Ruan <ruansy.fnst@fujitsu.com>
To: , , ,
Subject: [PATCH v3 04/10] fsdax: Introduce dax_iomap_cow_copy()
Date: Fri, 19 Mar 2021 09:52:31 +0800
Message-ID: <20210319015237.993880-5-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.30.1
In-Reply-To: <20210319015237.993880-1-ruansy.fnst@fujitsu.com>
References: <20210319015237.993880-1-ruansy.fnst@fujitsu.com>
MIME-Version: 1.0
CC: darrick.wong@oracle.com,
	willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk,
	linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com,
	david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de
List-Id: "Linux-nvdimm developer list."
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

In the case where the iomap is a write operation and the iomap is not equal
to srcmap after iomap_begin, we consider it a CoW operation.  The destination
extent indicated by the iomap is a newly allocated extent, so the data needs
to be copied from srcmap to that newly allocated extent.

In theory it is better to copy only the head and tail ranges that fall
outside of the unaligned write area instead of copying the whole aligned
range.  But in the dax page fault path the range is always aligned, so we
have to copy the whole range in that case.

Signed-off-by: Shiyang Ruan
Reviewed-by: Christoph Hellwig
---
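Illustration (not part of this patch): a minimal userspace sketch of the
head/tail arithmetic dax_iomap_cow_copy() performs when copy_edge is true,
i.e. for an unaligned write through dax_iomap_actor().  The write position
(pos = 1000), length (6000) and the 4096-byte alignment are invented example
values, and plain C expressions stand in for the kernel's ALIGN() and
round_up() helpers.

	#include <stdio.h>

	#define ALIGN_SIZE 4096UL

	int main(void)
	{
		unsigned long pos = 1000, length = 6000;	/* hypothetical write */
		unsigned long head_off = pos & (ALIGN_SIZE - 1);	/* 1000 */
		unsigned long end = pos + length;			/* 7000 */
		unsigned long pg_end =
			(end + ALIGN_SIZE - 1) & ~(ALIGN_SIZE - 1);	/* 8192 */

		/* head: bytes [0, head_off) of the first page come from srcmap */
		if (head_off)
			printf("copy head: %lu bytes\n", head_off);

		/* tail: bytes [end, pg_end) of the last page come from srcmap */
		if (end < pg_end)
			printf("copy tail: %lu bytes at dest offset %lu\n",
			       pg_end - end, head_off + length);

		return 0;
	}

The bytes in between come from the iov_iter in dax_iomap_actor(); in the
page fault path the range is already aligned, so copy_edge is false and the
whole range is copied from srcmap instead.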
 fs/dax.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 66 insertions(+), 5 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index a70e6aa285bb..181aad97136a 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1037,6 +1037,51 @@ static int dax_iomap_direct_access(struct iomap *iomap, loff_t pos, size_t size,
 	return rc;
 }
 
+/*
+ * Copy the head and tail part of the pages not included in the write but
+ * required for CoW, because pos/pos+length are not page aligned. But in dax
+ * page fault case, the range is page aligned, we need to copy the whole range
+ * of data. Use copy_edge to distinguish these cases.
+ */
+static int dax_iomap_cow_copy(loff_t pos, loff_t length, size_t align_size,
+		struct iomap *srcmap, void *daddr, bool copy_edge)
+{
+	loff_t head_off = pos & (align_size - 1);
+	size_t size = ALIGN(head_off + length, align_size);
+	loff_t end = pos + length;
+	loff_t pg_end = round_up(end, align_size);
+	void *saddr = 0;
+	int ret = 0;
+
+	ret = dax_iomap_direct_access(srcmap, pos, size, &saddr, NULL);
+	if (ret)
+		return ret;
+
+	if (!copy_edge)
+		return copy_mc_to_kernel(daddr, saddr, length);
+
+	/* Copy the head part of the range. Note: we pass offset as length. */
+	if (head_off) {
+		if (saddr)
+			ret = copy_mc_to_kernel(daddr, saddr, head_off);
+		else
+			memset(daddr, 0, head_off);
+	}
+
+	/* Copy the tail part of the range */
+	if (end < pg_end) {
+		loff_t tail_off = head_off + length;
+		loff_t tail_len = pg_end - end;
+
+		if (saddr)
+			ret = copy_mc_to_kernel(daddr + tail_off,
+					saddr + tail_off, tail_len);
+		else
+			memset(daddr + tail_off, 0, tail_len);
+	}
+
+	return ret;
+}
+
 /*
  * The user has performed a load from a hole in the file.  Allocating a new
  * page in the file would cause excessive storage usage for workloads with
@@ -1166,11 +1211,12 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	struct dax_device *dax_dev = iomap->dax_dev;
 	struct iov_iter *iter = data;
 	loff_t end = pos + length, done = 0;
+	bool write = iov_iter_rw(iter) == WRITE;
 	ssize_t ret = 0;
 	size_t xfer;
 	int id;
 
-	if (iov_iter_rw(iter) == READ) {
+	if (!write) {
 		end = min(end, i_size_read(inode));
 		if (pos >= end)
 			return 0;
@@ -1179,7 +1225,8 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		return iov_iter_zero(min(length, end - pos), iter);
 	}
 
-	if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED))
+	if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED &&
+			!(iomap->flags & IOMAP_F_SHARED)))
 		return -EIO;
 
 	/*
@@ -1218,6 +1265,13 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 			break;
 		}
 
+		if (write && srcmap->addr != iomap->addr) {
+			ret = dax_iomap_cow_copy(pos, length, PAGE_SIZE, srcmap,
+					kaddr, true);
+			if (ret)
+				break;
+		}
+
 		map_len = PFN_PHYS(map_len);
 		kaddr += offset;
 		map_len -= offset;
@@ -1229,7 +1283,7 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		 * validated via access_ok() in either vfs_read() or
 		 * vfs_write(), depending on which operation we are doing.
 		 */
-		if (iov_iter_rw(iter) == WRITE)
+		if (write)
 			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
 					map_len, iter);
 		else
@@ -1379,6 +1433,7 @@ static vm_fault_t dax_fault_actor(struct vm_fault *vmf, pfn_t *pfnp,
 	bool sync = dax_fault_is_synchronous(flags, vmf->vma, iomap);
 	int err = 0;
 	pfn_t pfn;
+	void *kaddr;
 
 	/* if we are reading UNWRITTEN and HOLE, return a hole. */
 	if (!write &&
@@ -1389,18 +1444,24 @@ static vm_fault_t dax_fault_actor(struct vm_fault *vmf, pfn_t *pfnp,
 			return dax_pmd_load_hole(xas, vmf, iomap, &entry);
 	}
 
-	if (iomap->type != IOMAP_MAPPED) {
+	if (iomap->type != IOMAP_MAPPED && !(iomap->flags & IOMAP_F_SHARED)) {
 		WARN_ON_ONCE(1);
 		return VM_FAULT_SIGBUS;
 	}
 
-	err = dax_iomap_direct_access(iomap, pos, size, NULL, &pfn);
+	err = dax_iomap_direct_access(iomap, pos, size, &kaddr, &pfn);
 	if (err)
 		return dax_fault_return(err);
 
 	entry = dax_insert_entry(xas, mapping, vmf, entry, pfn, 0,
 				 write && !sync);
 
+	if (write && srcmap->addr != iomap->addr) {
+		err = dax_iomap_cow_copy(pos, size, size, srcmap, kaddr, false);
+		if (err)
+			return dax_fault_return(err);
+	}
+
 	if (sync)
 		return dax_fault_synchronous_pfnp(pfnp, pfn);
 
-- 
2.30.1