From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from bombadil.infradead.org ([65.50.211.133]:36436 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754179AbdKKAh0 (ORCPT );
	Fri, 10 Nov 2017 19:37:26 -0500
Date: Fri, 10 Nov 2017 16:37:21 -0800
From: Matthew Wilcox
To: "Fu, Rodney"
Cc: "hch@lst.de", "viro@zeniv.linux.org.uk", "linux-fsdevel@vger.kernel.org"
Subject: Re: Provision for filesystem specific open flags
Message-ID: <20171111003721.GA9546@bombadil.infradead.org>
References: <20171110172344.GA15288@lst.de> <20171110192902.GA10339@bombadil.infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Fri, Nov 10, 2017 at 09:04:31PM +0000, Fu, Rodney wrote:
> > No. If you want new flags bits, make a public proposal. Maybe some other
> > filesystem would also benefit from them.
>
> Ah, I see what you mean now, thanks.
>
> I would like to propose O_CONCURRENT_WRITE as a new open flag. It is
> currently used in the Panasas filesystem (panfs) and defined with value:
>
> #define O_CONCURRENT_WRITE 020000000000
>
> This flag has been provided by panfs to HPC users via the mpich package for
> well over a decade. See:
>
> https://github.com/pmodels/mpich/blob/master/src/mpi/romio/adio/ad_panfs/ad_panfs_open6.c#L344
>
> O_CONCURRENT_WRITE indicates to the filesystem that the application doing the
> open is participating in a coordinated distributed manner with other such
> applications, possibly running on different hosts. This allows the panfs
> filesystem to delegate some of the cache coherency responsibilities to the
> application, improving performance.
>
> The reason this flag is used on open as opposed to having a post-open ioctl or
> fcntl SETFL is to allow panfs to catch and reject opens by applications that
> attempt to access files that have already been opened by applications that have
> set O_CONCURRENT_WRITE.

OK, let me just check I understand. Once any application has opened
the inode with O_CONCURRENT_WRITE, all subsequent attempts to open the
same inode without O_CONCURRENT_WRITE will fail. Presumably also if
somebody already has the inode open without O_CONCURRENT_WRITE set,
the first open with O_CONCURRENT_WRITE will fail?

Are opens with O_RDONLY also blocked?

This feels a lot like leases ... maybe there's an opportunity to give
better semantics here -- rather than rejecting opens without
O_CONCURRENT_WRITE, all existing users could be forced to use the
stricter coherency model?
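
To make the question concrete, the open-time conflict rule as restated
above could be modelled in userspace roughly as follows. This is only a
sketch of that reading, not panfs code: struct model_inode, model_open()
and model_close() are hypothetical names invented for illustration, the
flag value is the one quoted from the proposal, and the model assumes
that O_RDONLY opens without the flag are treated like any other open
without it, which is exactly the point still in question.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Value quoted from the proposal; guarded in case a header defines it. */
#ifndef O_CONCURRENT_WRITE
#define O_CONCURRENT_WRITE 020000000000
#endif

/* Hypothetical per-inode bookkeeping, invented for this sketch. */
struct model_inode {
	unsigned int cw_openers;	/* opens that set O_CONCURRENT_WRITE */
	unsigned int plain_openers;	/* opens that did not set it */
};

/*
 * Admit or reject an open.  Returns 0 if admitted, -1 on conflict.
 * Assumption: the access mode (O_RDONLY and friends) is ignored, so a
 * read-only open without the flag conflicts like any other.
 */
static int model_open(struct model_inode *ino, unsigned int flags)
{
	assert(ino != NULL);

	if (flags & O_CONCURRENT_WRITE) {
		/* Presumed: fails if someone holds the inode open without it. */
		if (ino->plain_openers)
			return -1;
		ino->cw_openers++;
	} else {
		/* Stated: fails once anyone has opened with the flag. */
		if (ino->cw_openers)
			return -1;
		ino->plain_openers++;
	}
	return 0;
}

/* Drop one opener; flags must match those passed to model_open(). */
static void model_close(struct model_inode *ino, unsigned int flags)
{
	assert(ino != NULL);

	if (flags & O_CONCURRENT_WRITE) {
		assert(ino->cw_openers > 0);
		ino->cw_openers--;
	} else {
		assert(ino->plain_openers > 0);
		ino->plain_openers--;
	}
}
```

With a zero-initialised struct model_inode, a first
model_open(&ino, O_WRONLY | O_CONCURRENT_WRITE) is admitted, and a
following model_open(&ino, O_WRONLY) returns -1 until the first opener
calls model_close(). A post-open ioctl or fcntl could not express this
check, since the conflicting open would already have succeeded.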