From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965565AbbJ3Dzi (ORCPT ); Thu, 29 Oct 2015 23:55:38 -0400 Received: from ipmail05.adl6.internode.on.net ([150.101.137.143]:26156 "EHLO ipmail05.adl6.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753231AbbJ3Dzh (ORCPT ); Thu, 29 Oct 2015 23:55:37 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2AOBwAA6TJWXQLd03ZegztTb4Jdp1IBAQQGiy2LMBcMhXAEAgKBO00BAQEBAQEHRECENQEBAQMBAQIkExwjBQsIAxgJJQ8FFBEDBxoTG4gNBw7EWAEBAQEBBQIBIBmGF4VFgnGBVoR5BZZDhR2FHoJigimaFYR7KjQBhDSBSQEBAQ Date: Fri, 30 Oct 2015 14:55:33 +1100 From: Dave Chinner To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, "H. Peter Anvin" , "J. Bruce Fields" , "Theodore Ts'o" , Alexander Viro , Andreas Dilger , Dan Williams , Ingo Molnar , Jan Kara , Jeff Layton , Matthew Wilcox , Thomas Gleixner , linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org, x86@kernel.org, xfs@oss.sgi.com, Andrew Morton , Matthew Wilcox Subject: Re: [RFC 00/11] DAX fsynx/msync support Message-ID: <20151030035533.GU19199@dastard> References: <1446149535-16200-1-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1446149535-16200-1-git-send-email-ross.zwisler@linux.intel.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Oct 29, 2015 at 02:12:04PM -0600, Ross Zwisler wrote: > This patch series adds support for fsync/msync to DAX. > > Patches 1 through 8 add various utilities that the DAX code will eventually > need, and the DAX code itself is added by patch 9. Patches 10 and 11 are > filesystem changes that are needed after the DAX code is added, but these > patches may change slightly as the filesystem fault handling for DAX is > being modified ([1] and [2]). > > I've marked this series as RFC because I'm still testing, but I wanted to > get this out there so people would see the direction I was going and > hopefully comment on any big red flags sooner rather than later. > > I realize that we are getting pretty dang close to the v4.4 merge window, > but I think that if we can get this reviewed and working it's a much better > solution than the "big hammer" approach that blindly flushes entire PMEM > namespaces [3]. We need the "big hammer" regardless of fsync. If REQ_FLUSH and REQ_FUA don't do the right thing when it comes to ordering journal writes against other IO operations, then the filesystems are not crash safe. i.e. we need REQ_FLUSH/REQ_FUA to commit all outstanding changes back to stable storage, just like they do for existing storage.... > [1] http://oss.sgi.com/archives/xfs/2015-10/msg00523.html > [2] http://marc.info/?l=linux-ext4&m=144550211312472&w=2 > [3] https://lists.01.org/pipermail/linux-nvdimm/2015-October/002614.html > > Ross Zwisler (11): > pmem: add wb_cache_pmem() to the PMEM API > mm: add pmd_mkclean() > pmem: enable REQ_FLUSH handling > dax: support dirty DAX entries in radix tree > mm: add follow_pte_pmd() > mm: add pgoff_mkclean() > mm: add find_get_entries_tag() > fs: add get_block() to struct inode_operations I don't think this is the right thing to do - it propagates the use of bufferheads as a mapping structure into places where we do not want bufferheads. We've recently added a similar block mapping interface to the export operations structure for PNFS and that uses a "struct iomap" which is far more suited to being an inode operation this. We have plans to move this to the inode operations for various reasons. e.g: multipage write, adding interfaces that support proper mapping of holes, etc: https://www.redhat.com/archives/cluster-devel/2014-October/msg00167.html So after many years of saying no to moving getblocks to the inode operations it seems like the wrong thing to do now considering I want to convert all the DAX code to use iomaps while only 2/3 filesystems are supported... > dax: add support for fsync/sync Why put the dax_flush_mapping() in do_writepages()? Why not call it directly from the filesystem ->fsync() implementations where a getblocks callback could also be provided? Cheers, Dave. -- Dave Chinner david@fromorbit.com