From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751838AbdKVWYL (ORCPT ); Wed, 22 Nov 2017 17:24:11 -0500 Received: from omzsmtpe01.verizonbusiness.com ([199.249.25.210]:15352 "EHLO omzsmtpe01.verizonbusiness.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751646AbdKVWYF (ORCPT ); Wed, 22 Nov 2017 17:24:05 -0500 From: alexander.levin@verizon.com Cc: Jan Kara , Dan Williams , alexander.levin@verizon.com X-Host: pioneer.tdc.vzwcorp.com To: "linux-kernel@vger.kernel.org" , "stable@vger.kernel.org" Subject: [PATCH AUTOSEL for 4.9 01/54] dax: Avoid page invalidation races and unnecessary radix tree traversals Thread-Topic: [PATCH AUTOSEL for 4.9 01/54] dax: Avoid page invalidation races and unnecessary radix tree traversals Thread-Index: AQHTY+CCMOVDonCWgkiL6yIoJxENww== Date: Wed, 22 Nov 2017 22:23:23 +0000 Message-ID: <20171122222246.19618-2-alexander.levin@verizon.com> References: <20171122222246.19618-1-alexander.levin@verizon.com> In-Reply-To: <20171122222246.19618-1-alexander.levin@verizon.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-messagesentrepresentingtype: 1 x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [10.144.60.250] Content-Type: text/plain; charset="iso-8859-1" MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by nfs id vAMMOKhj022386 From: Jan Kara [ Upstream commit e3fce68cdbed297d927e993b3ea7b8b1cee545da ] Currently dax_iomap_rw() takes care of invalidating page tables and evicting hole pages from the radix tree when write(2) to the file happens. This invalidation is only necessary when there is some block allocation resulting from write(2). Furthermore in current place the invalidation is racy wrt page fault instantiating a hole page just after we have invalidated it. So perform the page invalidation inside dax_iomap_actor() where we can do it only when really necessary and after blocks have been allocated so nobody will be instantiating new hole pages anymore. Reviewed-by: Christoph Hellwig Reviewed-by: Ross Zwisler Signed-off-by: Jan Kara Signed-off-by: Dan Williams Signed-off-by: Sasha Levin --- fs/dax.c | 28 +++++++++++----------------- 1 file changed, 11 insertions(+), 17 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index bf6218da7928..800748f10b3d 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1265,6 +1265,17 @@ iomap_dax_actor(struct inode *inode, loff_t pos, loff_t length, void *data, if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED)) return -EIO; + /* + * Write can allocate block for an area which has a hole page mapped + * into page tables. We have to tear down these mappings so that data + * written by write(2) is visible in mmap. + */ + if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) { + invalidate_inode_pages2_range(inode->i_mapping, + pos >> PAGE_SHIFT, + (end - 1) >> PAGE_SHIFT); + } + while (pos < end) { unsigned offset = pos & (PAGE_SIZE - 1); struct blk_dax_ctl dax = { 0 }; @@ -1329,23 +1340,6 @@ iomap_dax_rw(struct kiocb *iocb, struct iov_iter *iter, if (iov_iter_rw(iter) == WRITE) flags |= IOMAP_WRITE; - /* - * Yes, even DAX files can have page cache attached to them: A zeroed - * page is inserted into the pagecache when we have to serve a write - * fault on a hole. It should never be dirtied and can simply be - * dropped from the pagecache once we get real data for the page. - * - * XXX: This is racy against mmap, and there's nothing we can do about - * it. We'll eventually need to shift this down even further so that - * we can check if we allocated blocks over a hole first. - */ - if (mapping->nrpages) { - ret = invalidate_inode_pages2_range(mapping, - pos >> PAGE_SHIFT, - (pos + iov_iter_count(iter) - 1) >> PAGE_SHIFT); - WARN_ON_ONCE(ret); - } - while (iov_iter_count(iter)) { ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops, iter, iomap_dax_actor); -- 2.11.0