From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from newverein.lst.de (verein.lst.de [213.95.11.211])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by ml01.01.org (Postfix) with ESMTPS id 59F5B2034C0BC
    for ; Fri, 20 Oct 2017 09:28:43 -0700 (PDT)
Date: Fri, 20 Oct 2017 18:32:21 +0200
From: Christoph Hellwig
Subject: Re: [PATCH v3 12/13] dax: handle truncate of dma-busy pages
Message-ID: <20171020163221.GB26320@lst.de>
References: <150846713528.24336.4459262264611579791.stgit@dwillia2-desk3.amr.corp.intel.com>
    <150846720244.24336.16885325309403883980.stgit@dwillia2-desk3.amr.corp.intel.com>
    <1508504726.5572.41.camel@kernel.org>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
To: Dan Williams
Cc: linux-xfs@vger.kernel.org, Jan Kara , Matthew Wilcox , Dave Chinner ,
    Dave Hansen , Jeff Layton , "linux-kernel@vger.kernel.org" ,
    Christoph Hellwig , "J. Bruce Fields" , Linux MM , Alexander Viro ,
    linux-fsdevel , Andrew Morton , "Darrick J. Wong" ,
    "linux-nvdimm@lists.01.org"
List-ID:

On Fri, Oct 20, 2017 at 08:42:00AM -0700, Dan Williams wrote:
> I agree, but it needs quite a bit more thought and restructuring of
> the truncate path. I also wonder how we reclaim those stranded
> filesystem blocks, but a first approximation is wait for the
> administrator to delete them or auto-delete them at the next mount.
> XFS seems well prepared to reflink-swap these DMA blocks around, but
> I'm not sure about EXT4.

reflink still is an optional and experimental feature in XFS. That
being said we should not need to swap block pointers around on disk.
We just need to prevent the block allocator from reusing the blocks
for new allocations, and we have code for that, both for transactions
that haven't been committed to disk yet, and for deleted blocks
undergoing discard operations.

But as mentioned in my second mail from this morning I'm not even
sure we need that. For short-term elevated page counts like normal
get_user_pages users I think we can just wait for the page count to
reach zero, while for abuses of get_user_pages for long term pinning
memory (not sure if anyone but rdma is doing that) we'll need something
like FL_LAYOUT leases to release the mapping.
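
Editorial illustration of the first idea in the message above (keeping the block allocator from reusing blocks that are still busy). This is a minimal userspace sketch, not XFS or DAX code: every type and function name here (toy_fs, toy_alloc_block, and so on) is hypothetical, and the per-block pin counter is only an assumed stand-in for an elevated page count held by a get_user_pages user.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 16

/* Toy per-block state. All names are hypothetical, for illustration only. */
struct toy_block {
    bool allocated;  /* block is currently owned by a file */
    int  pin_count;  /* outstanding DMA references (models an elevated page count) */
};

struct toy_fs {
    struct toy_block blocks[NBLOCKS];
};

/* Take or drop a DMA reference on a block. */
static void toy_pin(struct toy_fs *fs, size_t b)
{
    assert(b < NBLOCKS);
    fs->blocks[b].pin_count++;
}

static void toy_unpin(struct toy_fs *fs, size_t b)
{
    assert(b < NBLOCKS && fs->blocks[b].pin_count > 0);
    fs->blocks[b].pin_count--;
}

/* Truncate drops file ownership but does not touch the pin count. */
static void toy_truncate_block(struct toy_fs *fs, size_t b)
{
    assert(b < NBLOCKS);
    fs->blocks[b].allocated = false;
}

/* A block is "busy" if no file owns it but DMA may still be targeting it. */
static bool toy_block_busy(const struct toy_fs *fs, size_t b)
{
    assert(b < NBLOCKS);
    return !fs->blocks[b].allocated && fs->blocks[b].pin_count > 0;
}

/*
 * Allocate the first block that is neither owned nor pinned.
 * Busy blocks are skipped, so they cannot be handed to a new file
 * while a transfer is in flight. Returns the block index, or -1.
 */
static int toy_alloc_block(struct toy_fs *fs)
{
    for (size_t b = 0; b < NBLOCKS; b++) {
        struct toy_block *blk = &fs->blocks[b];

        if (!blk->allocated && blk->pin_count == 0) {
            blk->allocated = true;
            return (int)b;
        }
    }
    return -1;
}
```

In this model a truncated block with a non-zero pin count stays out of circulation without any on-disk pointer swapping, and becomes allocatable again as soon as the last pin is dropped. The sketch does not model waiting for the count to reach zero or lease-based revocation for long-term pins.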