From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <22E0F870-C1FB-431E-BF6C-B395A09A2B0D@dilger.ca>
References: <1447459610-14259-1-git-send-email-ross.zwisler@linux.intel.com>
 <1447459610-14259-4-git-send-email-ross.zwisler@linux.intel.com>
 <22E0F870-C1FB-431E-BF6C-B395A09A2B0D@dilger.ca>
Date: Fri, 13 Nov 2015 18:32:40 -0800
Subject: Re: [PATCH v2 03/11] pmem: enable REQ_FUA/REQ_FLUSH handling
From: Dan Williams
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
To: Andreas Dilger
Cc: Ross Zwisler, "linux-kernel@vger.kernel.org", "H. Peter Anvin",
 "J. Bruce Fields", Theodore Ts'o, Alexander Viro, Dave Chinner,
 Ingo Molnar, Jan Kara, Jeff Layton, Matthew Wilcox, Thomas Gleixner,
 linux-ext4, linux-fsdevel, Linux MM, "linux-nvdimm@lists.01.org",
 X86 ML, XFS Developers, Andrew Morton, Matthew Wilcox, Dave Hansen

On Fri, Nov 13, 2015 at 4:43 PM, Andreas Dilger wrote:
> On Nov 13, 2015, at 5:20 PM, Dan Williams wrote:
>>
>> On Fri, Nov 13, 2015 at 4:06 PM, Ross Zwisler wrote:
>>> Currently the PMEM driver doesn't accept REQ_FLUSH or REQ_FUA bios.  These
>>> are sent down via blkdev_issue_flush() in response to a fsync() or msync()
>>> and are used by filesystems to order their metadata, among other things.
>>>
>>> When we get an msync() or fsync() it is the responsibility of the DAX code
>>> to flush all dirty pages to media.  The PMEM driver then just has issue a
>>> wmb_pmem() in response to the REQ_FLUSH to ensure that before we return all
>>> the flushed data has been durably stored on the media.
>>>
>>> Signed-off-by: Ross Zwisler
>>
>> Hmm, I'm not seeing why we need this patch.  If the actual flushing of
>> the cache is done by the core why does the driver need support
>> REQ_FLUSH?  Especially since it's just a couple instructions.  REQ_FUA
>> only makes sense if individual writes can bypass the "drive" cache,
>> but no I/O submitted to the driver proper is ever cached we always
>> flush it through to media.
>
> If the upper level filesystem gets an error when submitting a flush
> request, then it assumes the underlying hardware is broken and cannot
> be as aggressive in IO submission, but instead has to wait for in-flight
> IO to complete.

Upper level filesystems won't get errors when the driver does not
support flush.  Those requests are ended cleanly in
generic_make_request_checks().  Yes, the fs still needs to wait for
outstanding I/O to complete but in the case of pmem all I/O is
synchronous.  There's never anything to await when flushing at the
pmem driver level.

> Since FUA/FLUSH is basically a no-op for pmem devices,
> it doesn't make sense _not_ to support this functionality.

Seems to be a nop either way.  Given that DAX may lead to dirty data
pending to the device in the cpu cache that a REQ_FLUSH request will
not touch, it's better to leave it all to the mm core to handle.  I.e.
it doesn't make sense to call the driver just for two instructions
(sfence + pcommit) when the mm core is taking on the cache flushing.
Either handle it all in the mm or the driver, not a mixture.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
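
The statement in the reply above, that flush requests to a driver without
flush support are "ended cleanly" rather than failed, can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions, not kernel code: the struct names, the MODEL_* flag values and
model_submit_checks() are hypothetical stand-ins, and the behaviour for
writes that carry data (flags stripped, bio still passed down) reflects my
understanding of the block layer of that era rather than anything stated in
the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the values are illustrative, not the kernel's. */
#define MODEL_REQ_FLUSH (1u << 0)
#define MODEL_REQ_FUA   (1u << 1)

/* Stand-in for a request queue: 0 means no flush support was advertised. */
struct model_queue {
    unsigned int flush_flags;
};

/* Stand-in for a bio: request flags plus payload size in sectors. */
struct model_bio {
    unsigned int rw;
    unsigned int nr_sectors;   /* 0 for a pure flush with no data */
};

/*
 * Model of the early filter: returns true if the bio should still be
 * handed to the driver, false if it was completed here.  On early
 * completion *err is set to 0, i.e. the submitter sees success.
 */
static bool model_submit_checks(const struct model_queue *q,
                                struct model_bio *bio, int *err)
{
    if ((bio->rw & (MODEL_REQ_FLUSH | MODEL_REQ_FUA)) && !q->flush_flags) {
        /* The driver never sees flags it did not ask for. */
        bio->rw &= ~(MODEL_REQ_FLUSH | MODEL_REQ_FUA);
        if (bio->nr_sectors == 0) {
            /* An empty flush has nothing left to do: finish it cleanly. */
            *err = 0;
            return false;
        }
    }
    return true;
}
```

In this model an empty flush submitted to a queue with flush_flags == 0
completes with err == 0 and never reaches the driver, which matches the
claim that the filesystem sees no error; whether the driver should
nevertheless advertise flush support is the design question the thread is
debating.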