From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 13 Sep 2016 11:31:28 +1000
From: Nicholas Piggin
Subject: Re: DAX mapping detection (was: Re: [PATCH] Fix region lost in
 /proc/self/smaps)
Message-ID: <20160913113128.4eae792e@roar.ozlabs.ibm.com>
In-Reply-To: <20160912150148.GA10039@infradead.org>
References: <20160908225636.GB15167@linux.intel.com>
 <20160912052703.GA1897@infradead.org>
 <20160912075128.GB21474@infradead.org>
 <20160912180507.533b3549@roar.ozlabs.ibm.com>
 <20160912150148.GA10039@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
To: Christoph Hellwig
Cc: Yumei Huang, Michal Hocko, Xiao Guangrong, KVM list, Dave Hansen,
 Gleb Natapov, "linux-nvdimm@lists.01.org", mtosatti@redhat.com,
 "linux-kernel@vger.kernel.org", Linux MM, Stefan Hajnoczi,
 linux-fsdevel, Paolo Bonzini, Andrew Morton

On Mon, 12 Sep 2016 08:01:48 -0700
Christoph Hellwig wrote:

> On Mon, Sep 12, 2016 at 06:05:07PM +1000, Nicholas Piggin wrote:
> > It's not fundamentally broken, it just doesn't fit well existing
> > filesystems.
>
> Or the existing file system architecture for that matter.  Which makes
> it a fundamentally broken model.

Not really. A few reasonable changes can be made to improve things.
Until just now you thought it was fundamentally impossible to make a
reasonable implementation due to Dave's "constraints".

> > Dave's post of requirements is also wrong. A filesystem does not have
> > to guarantee all that, it only has to guarantee that is the case for
> > a given block after it has a mapping and page fault returns, other
> > operations can be supported by invalidating mappings, etc.
>
> Which doesn't really matter if your use case is manipulating
> fully mapped files.

Nothing says you have to use them fully mapped always and not use
other APIs on them.

> But back to the point: if you want to use a full blown Linux or Unix
> filesystem you will always have to fsync (or variants of it like msync),
> period.

That's circular logic. First you said that should not be done because
of your imagined constraints.

In fact, it's not unreasonable to describe some additional semantics
of the storage that is unavailable with traditional filesystems.

That said, a noop system call is on the order of 100 cycles nowadays,
so rushing to implement these APIs without seeing good numbers and
actual users ready to go seems premature. *This* is the real reason
not to implement new APIs yet.

> If you want a volume manager on steroids that hands out large chunks
> of storage memory that can't ever be moved, truncated, shared, allocated
> on demand, etc - implement it in your library on top of a device file.

Those constraints don't exist either. I've written a filesystem that
avoids them. It isn't rocket science.
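
[Editorial illustration, not part of the original message.] For readers
unfamiliar with the pattern under dispute, the "always fsync or msync"
model for a mapped file can be sketched roughly as below. This is a
minimal sketch under stated assumptions: it runs against an ordinary
temporary file rather than a DAX mount, the names persist_through_mapping
and demo are hypothetical, and it shows only the existing POSIX-style
calls, not any of the new APIs being debated in this thread.

```python
import mmap
import os
import tempfile


def persist_through_mapping(path: str, payload: bytes) -> bytes:
    """Store payload via a shared mapping, then request durability.

    Hypothetical helper for illustration only; payload must fit in one page.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # A zero-length file cannot be mapped, so give it one page first.
        os.ftruncate(fd, mmap.PAGESIZE)
        with mmap.mmap(fd, mmap.PAGESIZE,
                       flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as view:
            view[:len(payload)] = payload  # plain stores into the mapping
            view.flush()                   # msync() on Unix
        os.fsync(fd)                       # also covers file metadata
        # Read back through the ordinary file API to show both paths agree.
        return os.pread(fd, len(payload), 0)
    finally:
        os.close(fd)


def demo() -> bytes:
    """Run the helper against a scratch file in a temporary directory."""
    with tempfile.TemporaryDirectory() as scratch:
        return persist_through_mapping(os.path.join(scratch, "data"),
                                       b"hello")
```

Each flush() and fsync() above is a system call, which is the per-update
cost the "order of 100 cycles" remark refers to; whether that cost
justifies a new interface is exactly what the thread leaves open.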