* cto changes for v4 atomic open
From: Benjamin Coddington @ 2021-07-30 13:25 UTC
To: Trond Myklebust, Linux NFS Mailing List; +Cc: Pierguido Lambri

I have some folks unhappy about behavior changes after:

  479219218fbe NFS: Optimise away the close-to-open GETATTR when we have NFSv4 OPEN

Before this change, a client holding a RO open would invalidate the
pagecache when doing a second RW open.

Now the client doesn't invalidate the pagecache, though technically it
could because we see a changeattr update on the RW OPEN response.

I feel this is a grey area in CTO if we're already holding an open.  Do we
know how the client ought to behave in this case?  Should the client's open
upgrade to RW invalidate the pagecache?

Ben
* Re: cto changes for v4 atomic open
From: Trond Myklebust @ 2021-07-30 14:48 UTC
To: linux-nfs, bcodding; +Cc: plambri

On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> I have some folks unhappy about behavior changes after: 479219218fbe NFS:
> Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
>
> Before this change, a client holding a RO open would invalidate the
> pagecache when doing a second RW open.
>
> Now the client doesn't invalidate the pagecache, though technically it
> could because we see a changeattr update on the RW OPEN response.
>
> I feel this is a grey area in CTO if we're already holding an open.  Do we
> know how the client ought to behave in this case?  Should the client's
> open upgrade to RW invalidate the pagecache?

It's not a "grey area in close-to-open" at all. It is very cut and dried.

If you need to invalidate your page cache while the file is open, then by
definition you are in a situation where there is a write by another client
going on while you are reading. You're clearly not doing close-to-open.

The people who are doing this should be using uncached I/O.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: cto changes for v4 atomic open
From: Benjamin Coddington @ 2021-07-30 15:14 UTC
To: Trond Myklebust; +Cc: linux-nfs, plambri

On 30 Jul 2021, at 10:48, Trond Myklebust wrote:
> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
>> [...]
>> I feel this is a grey area in CTO if we're already holding an open.  Do
>> we know how the client ought to behave in this case?  Should the
>> client's open upgrade to RW invalidate the pagecache?
>
> It's not a "grey area in close-to-open" at all. It is very cut and dried.
>
> If you need to invalidate your page cache while the file is open, then by
> definition you are in a situation where there is a write by another
> client going on while you are reading. You're clearly not doing
> close-to-open.
>
> The people who are doing this should be using uncached I/O.

Thanks Trond, that corrects my ambiguity and yes - there's a much better
way.

Ben
* Re: cto changes for v4 atomic open
From: J. Bruce Fields @ 2021-08-03 20:30 UTC
To: Trond Myklebust; +Cc: linux-nfs, bcodding, plambri

On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
> > [...]
>
> It's not a "grey area in close-to-open" at all. It is very cut and dried.
>
> If you need to invalidate your page cache while the file is open, then by
> definition you are in a situation where there is a write by another
> client going on while you are reading. You're clearly not doing
> close-to-open.

Documentation is really unclear about this case.  Every definition of
close-to-open that I've seen says that it requires a cache consistency
check on every application open.  I've never seen one that says "on every
open that doesn't overlap with an already-existing open on that client".

They *usually* also preface that by saying that this is motivated by the
use case where opens don't overlap.  But it's never made clear that that's
part of the definition.

--b.
* Re: cto changes for v4 atomic open
From: Trond Myklebust @ 2021-08-03 21:07 UTC
To: bfields; +Cc: plambri, linux-nfs, bcodding

On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
> > [...]
> > If you need to invalidate your page cache while the file is open, then
> > by definition you are in a situation where there is a write by another
> > client going on while you are reading. You're clearly not doing
> > close-to-open.
>
> Documentation is really unclear about this case.  Every definition of
> close-to-open that I've seen says that it requires a cache consistency
> check on every application open.  I've never seen one that says "on
> every open that doesn't overlap with an already-existing open on that
> client".
>
> They *usually* also preface that by saying that this is motivated by the
> use case where opens don't overlap.  But it's never made clear that
> that's part of the definition.

I'm not following your logic.

The close-to-open model assumes that the file is only being modified by
one client at a time, and it assumes that file contents may be cached
while an application is holding it open.  The point checks exist in order
to detect if the file is being changed when the file is not open.

Linux does not have a per-application cache.  It has a page cache that is
shared among all applications.  It is impossible for two applications to
open the same file using buffered I/O, and yet see different contents.
So why do we need a second point check of the validity of the page cache
contents when one application has already verified that the cache was
valid when it opened it?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: cto changes for v4 atomic open
From: bfields @ 2021-08-03 21:36 UTC
To: Trond Myklebust; +Cc: plambri, linux-nfs, bcodding

On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
> > [...]
> > Documentation is really unclear about this case.  Every definition of
> > close-to-open that I've seen says that it requires a cache consistency
> > check on every application open.  I've never seen one that says "on
> > every open that doesn't overlap with an already-existing open on that
> > client".
>
> I'm not following your logic.

It's just a question of what every source I can find says close-to-open
means.  E.g., NFS Illustrated, p. 248: "Close-to-open consistency provides
a guarantee of cache consistency at the level of file opens and closes.
When a file is closed by an application, the client flushes any cached
changes to the server.  When a file is opened, the client ignores any
cache time remaining (if the file data are cached) and makes an explicit
GETATTR call to the server to check the file modification time."

> The close-to-open model assumes that the file is only being modified by
> one client at a time, and it assumes that file contents may be cached
> while an application is holding it open.  The point checks exist in
> order to detect if the file is being changed when the file is not open.
>
> Linux does not have a per-application cache.  It has a page cache that
> is shared among all applications.  It is impossible for two applications
> to open the same file using buffered I/O, and yet see different
> contents.

Right, so based on descriptions like the one above, I would have expected
both applications to see new data at that point.

Maybe that's not practical to implement.  It'd be nice at least if that
was explicit in the documentation.

--b.
* Re: cto changes for v4 atomic open
From: Trond Myklebust @ 2021-08-03 21:43 UTC
To: bfields; +Cc: plambri, linux-nfs, bcodding

On Tue, 2021-08-03 at 17:36 -0400, bfields@fieldses.org wrote:
> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
> > [...]
> > Linux does not have a per-application cache.  It has a page cache that
> > is shared among all applications.  It is impossible for two
> > applications to open the same file using buffered I/O, and yet see
> > different contents.
>
> Right, so based on descriptions like the one above, I would have expected
> both applications to see new data at that point.

Why?  That would be a clear violation of the close-to-open rule that
nobody else can write to the file while it is open.

> Maybe that's not practical to implement.  It'd be nice at least if that
> was explicit in the documentation.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: cto changes for v4 atomic open
From: NeilBrown @ 2021-08-03 23:47 UTC
To: Trond Myklebust; +Cc: bfields, plambri, linux-nfs, bcodding

On Wed, 04 Aug 2021, Trond Myklebust wrote:
> On Tue, 2021-08-03 at 17:36 -0400, bfields@fieldses.org wrote:
> > [...]
> > Right, so based on descriptions like the one above, I would have
> > expected both applications to see new data at that point.
>
> Why?  That would be a clear violation of the close-to-open rule that
> nobody else can write to the file while it is open.

Is the rule
  A - "it is not permitted for any other application/client to write to
       the file while another has it open"
or
  B - "it is not expected for any other application/client to write to
       the file while another has it open"
?

I think B, because A is clearly not enforced.  That suggests that there
is no *need* to check for changes, but equally there is no barrier to
checking for changes.  So the fact that one application has the file open
should not prevent a check when another application opens the file.
Equally, it should not prevent a flush when some other application closes
the file.

It is somewhat weird that if an application on one client misbehaves by
keeping a file open, that will prevent other applications on the same
client from seeing non-local changes, but will not prevent applications
on other clients from seeing any changes.

NeilBrown
* Re: cto changes for v4 atomic open
From: Trond Myklebust @ 2021-08-04 0:00 UTC
To: neilb; +Cc: bfields, plambri, linux-nfs, bcodding

On Wed, 2021-08-04 at 09:47 +1000, NeilBrown wrote:
> [...]
> Is the rule
>   A - "it is not permitted for any other application/client to write to
>        the file while another has it open"
> or
>   B - "it is not expected for any other application/client to write to
>        the file while another has it open"
> ?
>
> I think B, because A is clearly not enforced.  That suggests that there
> is no *need* to check for changes, but equally there is no barrier to
> checking for changes.  So the fact that one application has the file
> open should not prevent a check when another application opens the file.
> Equally, it should not prevent a flush when some other application
> closes the file.
>
> It is somewhat weird that if an application on one client misbehaves by
> keeping a file open, that will prevent other applications on the same
> client from seeing non-local changes, but will not prevent applications
> on other clients from seeing any changes.

No.  What you propose is to optimise for a fringe case, which we cannot
guarantee will work anyway.  I'd much rather optimise for the common case,
which is the only case with predictable semantics.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: cto changes for v4 atomic open 2021-08-04 0:00 ` Trond Myklebust @ 2021-08-04 0:04 ` Trond Myklebust 2021-08-04 0:57 ` NeilBrown 1 sibling, 0 replies; 28+ messages in thread From: Trond Myklebust @ 2021-08-04 0:04 UTC (permalink / raw) To: neilb; +Cc: bfields, plambri, linux-nfs, bcodding On Wed, 2021-08-04 at 00:00 +0000, Trond Myklebust wrote: > On Wed, 2021-08-04 at 09:47 +1000, NeilBrown wrote: > > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > > On Tue, 2021-08-03 at 17:36 -0400, bfields@fieldses.org wrote: > > > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust > > > > wrote: > > > > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote: > > > > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust > > > > > > wrote: > > > > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington > > > > > > > wrote: > > > > > > > > I have some folks unhappy about behavior changes after: > > > > > > > > 479219218fbe > > > > > > > > NFS: > > > > > > > > Optimise away the close-to-open GETATTR when we have > > > > > > > > NFSv4 > > > > > > > > OPEN > > > > > > > > > > > > > > > > Before this change, a client holding a RO open would > > > > > > > > invalidate > > > > > > > > the > > > > > > > > pagecache when doing a second RW open. > > > > > > > > > > > > > > > > Now the client doesn't invalidate the pagecache, though > > > > > > > > technically > > > > > > > > it could > > > > > > > > because we see a changeattr update on the RW OPEN > > > > > > > > response. > > > > > > > > > > > > > > > > I feel this is a grey area in CTO if we're already > > > > > > > > holding an > > > > > > > > open. > > > > > > > > Do we > > > > > > > > know how the client ought to behave in this case? > > > > > > > > Should > > > > > > > > the > > > > > > > > client's open > > > > > > > > upgrade to RW invalidate the pagecache? > > > > > > > > > > > > > > > > > > > > > > It's not a "grey area in close-to-open" at all. 
It is > > > > > > > very > > > > > > > cut > > > > > > > and > > > > > > > dried. > > > > > > > > > > > > > > If you need to invalidate your page cache while the file > > > > > > > is > > > > > > > open, > > > > > > > then > > > > > > > by definition you are in a situation where there is a > > > > > > > write > > > > > > > by > > > > > > > another > > > > > > > client going on while you are reading. You're clearly not > > > > > > > doing > > > > > > > close- > > > > > > > to-open. > > > > > > > > > > > > Documentation is really unclear about this case. Every > > > > > > definition of > > > > > > close-to-open that I've seen says that it requires a cache > > > > > > consistency > > > > > > check on every application open. I've never seen one that > > > > > > says > > > > > > "on > > > > > > every open that doesn't overlap with an already-existing > > > > > > open > > > > > > on > > > > > > that > > > > > > client". > > > > > > > > > > > > They *usually* also preface that by saying that this is > > > > > > motivated > > > > > > by > > > > > > the > > > > > > use case where opens don't overlap. But it's never made > > > > > > clear > > > > > > that > > > > > > that's part of the definition. > > > > > > > > > > > > > > > > I'm not following your logic. > > > > > > > > It's just a question of what every source I can find says > > > > close- > > > > to- > > > > open > > > > means. E.g., NFS Illustrated, p. 248, "Close-to-open > > > > consistency > > > > provides a guarantee of cache consistency at the level of file > > > > opens > > > > and > > > > closes. When a file is closed by an application, the client > > > > flushes > > > > any > > > > cached changs to the server. When a file is opened, the client > > > > ignores > > > > any cache time remaining (if the file data are cached) and > > > > makes > > > > an > > > > explicit GETATTR call to the server to check the file > > > > modification > > > > time." 
> > > > > > > > > The close-to-open model assumes that the file is only being > > > > > modified by > > > > > one client at a time and it assumes that file contents may be > > > > > cached > > > > > while an application is holding it open. > > > > > The point checks exist in order to detect if the file is > > > > > being > > > > > changed > > > > > when the file is not open. > > > > > > > > > > Linux does not have a per-application cache. It has a page > > > > > cache > > > > > that > > > > > is shared among all applications. It is impossible for two > > > > > applications > > > > > to open the same file using buffered I/O, and yet see > > > > > different > > > > > contents. > > > > > > > > Right, so based on the descriptions like the one above, I would > > > > have > > > > expected both applications to see new data at that point. > > > > > > Why? That would be a clear violation of the close-to-open rule > > > that > > > nobody else can write to the file while it is open. > > > > > > > Is the rule > > A - "it is not permitted for any other application/client to write > > to > > the file while another has it open" > > or > > B - "it is not expected for any other application/client to write > > to > > the file while another has it open" > > > > I think B, because A is clearly not enforced. That suggests that > > there > > is no *need* to check for changes, but equally there is no barrier > > to > > checking for changes. So that fact that one application has the > > file > > open should not prevent a check when another application opens the > > file. > > Equally it should not prevent a flush when some other application > > closes > > the file. > > > > It is somewhat weird that if an application on one client > > misbehaves > > by > > keeping a file open, that will prevent other applications on the > > same > > client from seeing non-local changes, but will not prevent > > applications > > on other clients from seeing any changes. > > > > NeilBrown > > No. 
What you propose is to optimise for a fringe case, which we > cannot > guarantee will work anyway. I'd much rather optimise for the common > case, which is the only case with predictable semantics. > The point is that we do support uncached I/O (a.k.a. O_DIRECT) precisely for the cases where users care about the difference in the above two scenarios. Why should we break cached I/O just because of FUD? -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 28+ messages in thread
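[Editor's note: for readers following the thread, the textbook close-to-open behaviour Bruce later quotes from NFS Illustrated — flush dirty data on close, ignore any remaining cache lifetime and GETATTR-revalidate on every open — can be sketched in a few lines. This is an illustrative toy model, not the Linux client's implementation; every class and attribute name below is invented for the example.]

```python
# Toy model of close-to-open (CTO) cache consistency: on open(), always
# ask the server for its change attribute and drop the cache if it moved;
# the toy client here is read-only, so close() has nothing to flush.
# All names are invented for illustration.

class Server:
    def __init__(self):
        self.data = b""
        self.change_attr = 0        # stand-in for NFSv4 change attr / mtime

    def write(self, data):
        self.data = data
        self.change_attr += 1

    def getattr(self):
        return self.change_attr


class CtoClient:
    def __init__(self, server):
        self.server = server
        self.cached = None          # (change_attr, data) or None

    def open(self):
        # CTO: GETATTR on every open, regardless of cache age.
        attr = self.server.getattr()
        if self.cached is None or self.cached[0] != attr:
            self.cached = (attr, self.server.data)   # invalidate + refill

    def read(self):
        return self.cached[1]

    def close(self):
        pass                        # read-only toy client: nothing to flush


srv = Server()
srv.write(b"v1")

reader = CtoClient(srv)
reader.open()
assert reader.read() == b"v1"
reader.close()

srv.write(b"v2")                    # another client modifies the file

reader.open()                       # revalidation on open catches the change
assert reader.read() == b"v2"
```

The disputed optimisation in this thread is precisely about skipping that GETATTR step when the client already holds the file open.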
* Re: cto changes for v4 atomic open 2021-08-04 0:00 ` Trond Myklebust 2021-08-04 0:04 ` Trond Myklebust @ 2021-08-04 0:57 ` NeilBrown 2021-08-04 1:03 ` Trond Myklebust 1 sibling, 1 reply; 28+ messages in thread From: NeilBrown @ 2021-08-04 0:57 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, plambri, linux-nfs, bcodding On Wed, 04 Aug 2021, Trond Myklebust wrote: > > No. What you propose is to optimise for a fringe case, which we cannot > guarantee will work anyway. I'd much rather optimise for the common > case, which is the only case with predictable semantics. > "predictable"?? As I understand it (I haven't examined the code) the current semantics includes: If a file is open for read, some other client changed the file, and the file is then opened, then the second open might see new data, or might see old data, depending on whether the requested data is still in cache or not. I find this to be less predictable than the easy-to-understand semantics that Bruce has quoted: - revalidate on every open, flush on every close I'm not suggesting we optimize for fringe cases, I'm suggesting we provide semantics that are simple, documented, and predictable. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 0:57 ` NeilBrown @ 2021-08-04 1:03 ` Trond Myklebust 2021-08-04 1:16 ` bfields 2021-08-04 1:30 ` NeilBrown 0 siblings, 2 replies; 28+ messages in thread From: Trond Myklebust @ 2021-08-04 1:03 UTC (permalink / raw) To: neilb; +Cc: bfields, plambri, linux-nfs, bcodding On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote: > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > > > No. What you propose is to optimise for a fringe case, which we > > cannot > > guarantee will work anyway. I'd much rather optimise for the common > > case, which is the only case with predictable semantics. > > > > "predictable"?? > > As I understand it (I haven't examined the code) the current > semantics > includes: > If a file is open for read, some other client changed the file, and > the > file is then opened, then the second open might see new data, or > might > see old data, depending on whether the requested data is still in > cache or not. > > I find this to be less predictable than the easy-to-understand > semantics > that Bruce has quoted: > - revalidate on every open, flush on every close > > I'm suggesting we optimize for fringe cases, I'm suggesting we > provide > semantics that are simple, documentated, and predictable. > "Predictable" how? This is cached I/O. By definition, it is allowed to do things like readahead, writeback caching, metadata caching. What you're proposing is to optimise for a case that breaks all of the above. What's the point? We might just as well throw in the towel and just make uncached I/O and 'noac' mounts the default. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 1:03 ` Trond Myklebust @ 2021-08-04 1:16 ` bfields 2021-08-04 1:25 ` Trond Myklebust 2021-08-04 1:30 ` NeilBrown 1 sibling, 1 reply; 28+ messages in thread From: bfields @ 2021-08-04 1:16 UTC (permalink / raw) To: Trond Myklebust; +Cc: neilb, plambri, linux-nfs, bcodding On Wed, Aug 04, 2021 at 01:03:58AM +0000, Trond Myklebust wrote: > On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote: > > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > > > > > No. What you propose is to optimise for a fringe case, which we > > > cannot > > > guarantee will work anyway. I'd much rather optimise for the common > > > case, which is the only case with predictable semantics. > > > > > > > "predictable"?? > > > > As I understand it (I haven't examined the code) the current > > semantics > > includes: > > If a file is open for read, some other client changed the file, and > > the > > file is then opened, then the second open might see new data, or > > might > > see old data, depending on whether the requested data is still in > > cache or not. > > > > I find this to be less predictable than the easy-to-understand > > semantics > > that Bruce has quoted: > > - revalidate on every open, flush on every close > > > > I'm suggesting we optimize for fringe cases, I'm suggesting we > > provide > > semantics that are simple, documentated, and predictable. > > > > "Predictable" how? > > This is cached I/O. By definition, it is allowed to do things like > readahead, writeback caching, metadata caching. What you're proposing > is to optimise for a case that breaks all of the above. What's the > point? We might just as well throw in the towel and just make uncached > I/O and 'noac' mounts the default. It's possible to revalidate on every open and also still do readahead, writeback caching, and metadata caching. --b. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 1:16 ` bfields @ 2021-08-04 1:25 ` Trond Myklebust 0 siblings, 0 replies; 28+ messages in thread From: Trond Myklebust @ 2021-08-04 1:25 UTC (permalink / raw) To: bfields; +Cc: plambri, linux-nfs, neilb, bcodding On Tue, 2021-08-03 at 21:16 -0400, bfields@fieldses.org wrote: > On Wed, Aug 04, 2021 at 01:03:58AM +0000, Trond Myklebust wrote: > > On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote: > > > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > > > > > > > No. What you propose is to optimise for a fringe case, which we > > > > cannot > > > > guarantee will work anyway. I'd much rather optimise for the > > > > common > > > > case, which is the only case with predictable semantics. > > > > > > > > > > "predictable"?? > > > > > > As I understand it (I haven't examined the code) the current > > > semantics > > > includes: > > > If a file is open for read, some other client changed the file, > > > and > > > the > > > file is then opened, then the second open might see new data, > > > or > > > might > > > see old data, depending on whether the requested data is still > > > in > > > cache or not. > > > > > > I find this to be less predictable than the easy-to-understand > > > semantics > > > that Bruce has quoted: > > > - revalidate on every open, flush on every close > > > > > > I'm suggesting we optimize for fringe cases, I'm suggesting we > > > provide > > > semantics that are simple, documentated, and predictable. > > > > > > > "Predictable" how? > > > > This is cached I/O. By definition, it is allowed to do things like > > readahead, writeback caching, metadata caching. What you're > > proposing > > is to optimise for a case that breaks all of the above. What's the > > point? We might just as well throw in the towel and just make > > uncached > > I/O and 'noac' mounts the default. > > It's possible to revalidate on every open and also still do > readahead, > writeback caching, and metadata caching. > Sure. 
It is also possible to revalidate on every read, every write and every metadata operation. That's not the point. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 1:03 ` Trond Myklebust 2021-08-04 1:16 ` bfields @ 2021-08-04 1:30 ` NeilBrown 2021-08-04 1:38 ` Trond Myklebust 1 sibling, 1 reply; 28+ messages in thread From: NeilBrown @ 2021-08-04 1:30 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, plambri, linux-nfs, bcodding On Wed, 04 Aug 2021, Trond Myklebust wrote: > On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote: > > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > > > > > No. What you propose is to optimise for a fringe case, which we > > > cannot > > > guarantee will work anyway. I'd much rather optimise for the common > > > case, which is the only case with predictable semantics. > > > > > > > "predictable"?? > > > > As I understand it (I haven't examined the code) the current > > semantics > > includes: > > If a file is open for read, some other client changed the file, and > > the > > file is then opened, then the second open might see new data, or > > might > > see old data, depending on whether the requested data is still in > > cache or not. > > > > I find this to be less predictable than the easy-to-understand > > semantics > > that Bruce has quoted: > > - revalidate on every open, flush on every close > > > > I'm suggesting we optimize for fringe cases, I'm suggesting we > > provide > > semantics that are simple, documentated, and predictable. > > > > "Predictable" how? > > This is cached I/O. By definition, it is allowed to do things like > readahead, writeback caching, metadata caching. What you're proposing > is to optimise for a case that breaks all of the above. What's the > point? We might just as well throw in the towel and just make uncached > I/O and 'noac' mounts the default. How are readahead, and other caching broken? Indeed, how are they even predictable? Caching is almost by definition a best-effort. Read requests may, or may not, be served from read-ahead data. Write maybe written back sooner or later. 
Various system-load factors can affect this. You can never predict that a cache *will* be used. "revalidate on every open, flush on every close" (in the absence of delegations of course) provides access to the only element of cache behaviour that *can* be predictable: the times when it *wont* be used. NeilBrown ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 1:30 ` NeilBrown @ 2021-08-04 1:38 ` Trond Myklebust 2021-08-09 4:20 ` NeilBrown 0 siblings, 1 reply; 28+ messages in thread From: Trond Myklebust @ 2021-08-04 1:38 UTC (permalink / raw) To: neilb; +Cc: bfields, plambri, linux-nfs, bcodding On Wed, 2021-08-04 at 11:30 +1000, NeilBrown wrote: > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote: > > > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > > > > > > > No. What you propose is to optimise for a fringe case, which we > > > > cannot > > > > guarantee will work anyway. I'd much rather optimise for the > > > > common > > > > case, which is the only case with predictable semantics. > > > > > > > > > > "predictable"?? > > > > > > As I understand it (I haven't examined the code) the current > > > semantics > > > includes: > > > If a file is open for read, some other client changed the file, > > > and > > > the > > > file is then opened, then the second open might see new data, > > > or > > > might > > > see old data, depending on whether the requested data is still > > > in > > > cache or not. > > > > > > I find this to be less predictable than the easy-to-understand > > > semantics > > > that Bruce has quoted: > > > - revalidate on every open, flush on every close > > > > > > I'm suggesting we optimize for fringe cases, I'm suggesting we > > > provide > > > semantics that are simple, documentated, and predictable. > > > > > > > "Predictable" how? > > > > This is cached I/O. By definition, it is allowed to do things like > > readahead, writeback caching, metadata caching. What you're > > proposing > > is to optimise for a case that breaks all of the above. What's the > > point? We might just as well throw in the towel and just make > > uncached > > I/O and 'noac' mounts the default. > > How are readahead, and other caching broken? Indeed, how are they > even > predictable? Caching is almost by definition a best-effort. 
Read > requests may, or may not, be served from read-ahead data. Write > maybe > written back sooner or later. Various system-load factors can affect > this. You can never predict that a cache *will* be used. > Caching is not a "best effort" attempt. The client is expected to provide a perfect reproduction of the data stored on the server in the case where there is no close-to-open violation. In the case where there are close-to-open violations then there are two cases: 1. The user cares, and is using uncached I/O together with a synchronisation protocol in order to mitigate any data+metadata discrepancies between the client and server. 2. The user doesn't care, and we're in the standard buffered I/O case. Why are you and Bruce insisting that case (2) needs to be treated as special? > "revalidate on every open, flush on every close" (in the absence of > delegations of course) provides access to the only element of cache > behaviour that *can* be predictable: the times when it *wont* be > used. > No. ...and the very fact you had to qualify the above with "in the absence of delegations" proves my point. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 1:38 ` Trond Myklebust @ 2021-08-09 4:20 ` NeilBrown 2021-08-09 14:22 ` Trond Myklebust 0 siblings, 1 reply; 28+ messages in thread From: NeilBrown @ 2021-08-09 4:20 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, plambri, linux-nfs, bcodding On Wed, 04 Aug 2021, Trond Myklebust wrote: > On Wed, 2021-08-04 at 11:30 +1000, NeilBrown wrote: > > Caching not a "best effort" attempt. The client is expected to provide > a perfect reproduction of the data stored on the server in the case > where there is no close-to-open violation. > In the case where there are close-to-open violations then there are two > cases: > > 1. The user cares, and is using uncached I/O together with a > synchronisation protocol in order to mitigate any data+metadata > discrepancies between the client and server. > 2. The user doesn't care, and we're in the standard buffered I/O > case. > > > Why are you and Bruce insisting that case (2) needs to be treated as > special? I don't see these as the relevant cases. They seem to assume that "the user" is a single entity with a coherent opinion. I don't think that is necessarily the case. I think it best to focus on the behaviours, and intentions behind, individual applications. You said previously that NFS doesn't provide caches for applications, only for whole clients. This is obviously true but I think it misses an important point. While the cache belongs to the whole client, the "open" and "close" are performed by individual applications. close-to-open addresses what happens between a CLOSE and an OPEN. 
> > While it may be reasonable to accept that any application must depend > on > correctness of any other application with write access to the file, > it > doesn't necessarily follow that any application can only be correct > when > all applications with read access are well behaved. > > If an application arranges, through some external means, to only open a file after all possible writing applications have closed it, then the NFS caching should not get in the way of the application being able to read anything that the other application(s) wrote. This, to me, is the core of close-to-open consistency. Another application writing concurrently may, of course, affect the read results in an unpredictable way. However another application READING concurrently should not affect an application which is carefully serialised with any writers. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 28+ messages in thread
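[Editor's note: Neil's serialisation argument can be made concrete. The sketch below models the semantics he is arguing *for* — revalidate on every application open, even while another application on the same client holds the file open — not the current Linux client behaviour. All class and function names are invented for the example.]

```python
# Sketch of the serialised writer/reader pattern: a reader that opens only
# AFTER the writer has closed should see the writer's data, even though an
# unrelated long-lived reader on the same client keeps the file open the
# whole time. One shared "page cache" per client, as on Linux.

class Server:
    def __init__(self):
        self.data, self.change = b"", 0

    def write_and_close(self, data):
        # CTO writer flushes on close, bumping the change attribute.
        self.data, self.change = data, self.change + 1


class ClientCache:
    """One page cache shared by all applications on this client."""
    def __init__(self, server):
        self.server, self.change, self.data = server, None, None

    def open(self):
        # Revalidate on EVERY application open, even with other opens live.
        if self.change != self.server.change:
            self.change, self.data = self.server.change, self.server.data


srv = Server()
srv.write_and_close(b"generation-1")

cache = ClientCache(srv)
cache.open()                           # long-lived reader opens ...
long_lived_view = cache.data           # ... and keeps the file open
assert long_lived_view == b"generation-1"

srv.write_and_close(b"generation-2")   # writer on another client finishes

cache.open()                           # externally serialised reader opens
assert cache.data == b"generation-2"   # it sees the writer's data
```

Under the optimisation being debated, the second `open()` would skip revalidation because the file is already open on the client, and the serialised reader would keep seeing "generation-1" until the cached attributes expire.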
* Re: cto changes for v4 atomic open 2021-08-09 4:20 ` NeilBrown @ 2021-08-09 14:22 ` Trond Myklebust 2021-08-09 14:43 ` Chuck Lever III 0 siblings, 1 reply; 28+ messages in thread From: Trond Myklebust @ 2021-08-09 14:22 UTC (permalink / raw) To: neilb; +Cc: bfields, plambri, linux-nfs, bcodding On Mon, 2021-08-09 at 14:20 +1000, NeilBrown wrote: > On Wed, 04 Aug 2021, Trond Myklebust wrote: > > On Wed, 2021-08-04 at 11:30 +1000, NeilBrown wrote: > > > > Caching not a "best effort" attempt. The client is expected to > > provide > > a perfect reproduction of the data stored on the server in the case > > where there is no close-to-open violation. > > In the case where there are close-to-open violations then there are > > two > > cases: > > > > 1. The user cares, and is using uncached I/O together with a > > synchronisation protocol in order to mitigate any > > data+metadata > > discrepancies between the client and server. > > 2. The user doesn't care, and we're in the standard buffered I/O > > case. > > > > > > Why are you and Bruce insisting that case (2) needs to be treated > > as > > special? > > I don't see these as the relevant cases. They seem to assume that > "the > user" is a single entity with a coherent opinion. I don't think that > is > necessarily the case. > > I think it best to focus on the behaviours, and intentions behind, > individual applications. You said previously that NFS doesn't > provide > caches for applications, only for whole clients. This is obviously > true > but I think it misses an important point. While the cache belongs to > the whole client, the "open" and "close" are performed by individual > applications. close-to-open addresses what happens between a CLOSE > and > an OPEN. 
> > While it may be reasonable to accept that any application must depend > on > correctness of any other application with write access to the file, > it > doesn't necessary follow that any application can only be correct > when > all applications with read access are well behaved. > > If an application arranges, through some external means, to only open > a > file after all possible writing application have closed it, then the > NFS > caching should not get in the way for the application being able to > read > anything that the other application(s) wrote. This, it me, is the > core > of close-to-open consistency. > > Another application writing concurrently may, of course, affect the > read > results in an unpredictable way. However another application READING > concurrently should not affect an application which is carefully > serialised with any writers. > That's a discussion we can have after Bruce and Chuck implement read and write delegations that are always handed out when possible. Until that's the case, there will be no changes made to the close-to-open behaviour on the Linux NFSv4 client. As for NFSv3, I don't see the above suggestion ever being implemented in the Linux client because at this point, people deliberately choosing NFSv3 are doing so almost exclusively for performance reasons. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-09 14:22 ` Trond Myklebust @ 2021-08-09 14:43 ` Chuck Lever III 0 siblings, 0 replies; 28+ messages in thread From: Chuck Lever III @ 2021-08-09 14:43 UTC (permalink / raw) To: Trond Myklebust Cc: Neil Brown, Bruce Fields, plambri, Linux NFS Mailing List, bcodding > On Aug 9, 2021, at 10:22 AM, Trond Myklebust <trondmy@hammerspace.com> wrote: > > That's a discussion we can have after Bruce and Chuck implement read > and write delegations that are always handed out when possible. I opened an enhancement request: https://bugzilla.linux-nfs.org/show_bug.cgi?id=364 Feel free to add details or correct any naive assumptions. -- Chuck Lever ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-03 21:36 ` bfields 2021-08-03 21:43 ` Trond Myklebust @ 2021-08-04 1:43 ` Matt Benjamin 2021-08-04 1:51 ` Matt Benjamin 1 sibling, 1 reply; 28+ messages in thread From: Matt Benjamin @ 2021-08-04 1:43 UTC (permalink / raw) To: bfields; +Cc: Trond Myklebust, plambri, linux-nfs, bcodding I think it is how close-to-open has been traditionally understood. I do not believe that close-to-open in any way implies a single writer, rather it sets the consistency expectation for all readers. Matt On Tue, Aug 3, 2021 at 5:36 PM bfields@fieldses.org <bfields@fieldses.org> wrote: > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote: > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote: > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote: > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote: > > > > > I have some folks unhappy about behavior changes after: > > > > > 479219218fbe > > > > > NFS: > > > > > Optimise away the close-to-open GETATTR when we have NFSv4 OPEN > > > > > > > > > > Before this change, a client holding a RO open would invalidate > > > > > the > > > > > pagecache when doing a second RW open. > > > > > > > > > > Now the client doesn't invalidate the pagecache, though > > > > > technically > > > > > it could > > > > > because we see a changeattr update on the RW OPEN response. > > > > > > > > > > I feel this is a grey area in CTO if we're already holding an > > > > > open. > > > > > Do we > > > > > know how the client ought to behave in this case? Should the > > > > > client's open > > > > > upgrade to RW invalidate the pagecache? > > > > > > > > > > > > > It's not a "grey area in close-to-open" at all. It is very cut and > > > > dried. > > > > > > > > If you need to invalidate your page cache while the file is open, > > > > then > > > > by definition you are in a situation where there is a write by > > > > another > > > > client going on while you are reading. 
You're clearly not doing > > > > close- > > > > to-open. > > > > > > Documentation is really unclear about this case. Every definition of > > > close-to-open that I've seen says that it requires a cache > > > consistency > > > check on every application open. I've never seen one that says "on > > > every open that doesn't overlap with an already-existing open on that > > > client". > > > > > > They *usually* also preface that by saying that this is motivated by > > > the > > > use case where opens don't overlap. But it's never made clear that > > > that's part of the definition. > > > > > > > I'm not following your logic. > > It's just a question of what every source I can find says close-to-open > means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency > provides a guarantee of cache consistency at the level of file opens and > closes. When a file is closed by an application, the client flushes any > cached changs to the server. When a file is opened, the client ignores > any cache time remaining (if the file data are cached) and makes an > explicit GETATTR call to the server to check the file modification > time." > > > The close-to-open model assumes that the file is only being modified by > > one client at a time and it assumes that file contents may be cached > > while an application is holding it open. > > The point checks exist in order to detect if the file is being changed > > when the file is not open. > > > > Linux does not have a per-application cache. It has a page cache that > > is shared among all applications. It is impossible for two applications > > to open the same file using buffered I/O, and yet see different > > contents. > > Right, so based on the descriptions like the one above, I would have > expected both applications to see new data at that point. > > Maybe that's not practical to implement. It'd be nice at least if that > was explicit in the documentation. > > --b. > -- Matt Benjamin Red Hat, Inc. 
315 West Huron Street, Suite 140A Ann Arbor, Michigan 48103 http://www.redhat.com/en/technologies/storage tel. 734-821-5101 fax. 734-769-8938 cel. 734-216-5309 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 1:43 ` Matt Benjamin @ 2021-08-04 1:51 ` Matt Benjamin 2021-08-04 2:10 ` Trond Myklebust 0 siblings, 1 reply; 28+ messages in thread From: Matt Benjamin @ 2021-08-04 1:51 UTC (permalink / raw) To: bfields; +Cc: Trond Myklebust, plambri, linux-nfs, bcodding (who have performed an open) On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <mbenjami@redhat.com> wrote: > > I think it is how close-to-open has been traditionally understood. I > do not believe that close-to-open in any way implies a single writer, > rather it sets the consistency expectation for all readers. > > Matt > > On Tue, Aug 3, 2021 at 5:36 PM bfields@fieldses.org > <bfields@fieldses.org> wrote: > > > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote: > > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote: > > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote: > > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote: > > > > > > I have some folks unhappy about behavior changes after: > > > > > > 479219218fbe > > > > > > NFS: > > > > > > Optimise away the close-to-open GETATTR when we have NFSv4 OPEN > > > > > > > > > > > > Before this change, a client holding a RO open would invalidate > > > > > > the > > > > > > pagecache when doing a second RW open. > > > > > > > > > > > > Now the client doesn't invalidate the pagecache, though > > > > > > technically > > > > > > it could > > > > > > because we see a changeattr update on the RW OPEN response. > > > > > > > > > > > > I feel this is a grey area in CTO if we're already holding an > > > > > > open. > > > > > > Do we > > > > > > know how the client ought to behave in this case? Should the > > > > > > client's open > > > > > > upgrade to RW invalidate the pagecache? > > > > > > > > > > > > > > > > It's not a "grey area in close-to-open" at all. It is very cut and > > > > > dried. 
> > > > > > > > > > If you need to invalidate your page cache while the file is open, > > > > > then > > > > > by definition you are in a situation where there is a write by > > > > > another > > > > > client going on while you are reading. You're clearly not doing > > > > > close- > > > > > to-open. > > > > > > > > Documentation is really unclear about this case. Every definition of > > > > close-to-open that I've seen says that it requires a cache > > > > consistency > > > > check on every application open. I've never seen one that says "on > > > > every open that doesn't overlap with an already-existing open on that > > > > client". > > > > > > > > They *usually* also preface that by saying that this is motivated by > > > > the > > > > use case where opens don't overlap. But it's never made clear that > > > > that's part of the definition. > > > > > > > > > > I'm not following your logic. > > > > It's just a question of what every source I can find says close-to-open > > means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency > > provides a guarantee of cache consistency at the level of file opens and > > closes. When a file is closed by an application, the client flushes any > > cached changs to the server. When a file is opened, the client ignores > > any cache time remaining (if the file data are cached) and makes an > > explicit GETATTR call to the server to check the file modification > > time." > > > > > The close-to-open model assumes that the file is only being modified by > > > one client at a time and it assumes that file contents may be cached > > > while an application is holding it open. > > > The point checks exist in order to detect if the file is being changed > > > when the file is not open. > > > > > > Linux does not have a per-application cache. It has a page cache that > > > is shared among all applications. It is impossible for two applications > > > to open the same file using buffered I/O, and yet see different > > > contents. 
> > > > Right, so based on the descriptions like the one above, I would have > > expected both applications to see new data at that point. > > > > Maybe that's not practical to implement. It'd be nice at least if that > > was explicit in the documentation. > > > > --b. > > > > > -- > > Matt Benjamin > Red Hat, Inc. > 315 West Huron Street, Suite 140A > Ann Arbor, Michigan 48103 > > http://www.redhat.com/en/technologies/storage > > tel. 734-821-5101 > fax. 734-769-8938 > cel. 734-216-5309 -- Matt Benjamin Red Hat, Inc. 315 West Huron Street, Suite 140A Ann Arbor, Michigan 48103 http://www.redhat.com/en/technologies/storage tel. 734-821-5101 fax. 734-769-8938 cel. 734-216-5309 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 1:51 ` Matt Benjamin @ 2021-08-04 2:10 ` Trond Myklebust 2021-08-04 14:49 ` Patrick Goetz 2021-08-04 18:33 ` Matt Benjamin 0 siblings, 2 replies; 28+ messages in thread From: Trond Myklebust @ 2021-08-04 2:10 UTC (permalink / raw) To: bfields, mbenjami; +Cc: plambri, linux-nfs, bcodding On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote: > (who have performed an open) > > On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <mbenjami@redhat.com> > wrote: > > > > I think it is how close-to-open has been traditionally understood. > > I > > do not believe that close-to-open in any way implies a single > > writer, > > rather it sets the consistency expectation for all readers. > > OK. I'll bite, despite the obvious troll-bait... close-to-open implies a single writer because it is impossible to guarantee ordering semantics in RPC. You could, in theory, do so by serialising on the client, but none of us do that because we care about performance. If you don't serialise between clients, then it is trivial (and I'm seriously tired of people who whine about this) to reproduce reads to file areas that have not been fully synced to the server, despite having data on the client that is writing. i.e. the reader sees holes that never existed on the client that wrote the data. The reason is that the writes got re-ordered en route to the server, and so reads to the areas that have not yet been filled are showing up as holes. So, no, the close-to-open semantics definitely apply to both readers and writers. > > Matt > > > > On Tue, Aug 3, 2021 at 5:36 PM bfields@fieldses.org > > <bfields@fieldses.org> wrote: > > > > > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote: > > > > On Tue, 2021-08-03 at 16:30 -0400, J. 
Bruce Fields wrote: > > > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust > > > > > wrote: > > > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington > > > > > > wrote: > > > > > > > I have some folks unhappy about behavior changes after: > > > > > > > 479219218fbe > > > > > > > NFS: > > > > > > > Optimise away the close-to-open GETATTR when we have > > > > > > > NFSv4 OPEN > > > > > > > > > > > > > > Before this change, a client holding a RO open would > > > > > > > invalidate > > > > > > > the > > > > > > > pagecache when doing a second RW open. > > > > > > > > > > > > > > Now the client doesn't invalidate the pagecache, though > > > > > > > technically > > > > > > > it could > > > > > > > because we see a changeattr update on the RW OPEN > > > > > > > response. > > > > > > > > > > > > > > I feel this is a grey area in CTO if we're already > > > > > > > holding an > > > > > > > open. > > > > > > > Do we > > > > > > > know how the client ought to behave in this case? Should > > > > > > > the > > > > > > > client's open > > > > > > > upgrade to RW invalidate the pagecache? > > > > > > > > > > > > > > > > > > > It's not a "grey area in close-to-open" at all. It is very > > > > > > cut and > > > > > > dried. > > > > > > > > > > > > If you need to invalidate your page cache while the file is > > > > > > open, > > > > > > then > > > > > > by definition you are in a situation where there is a write > > > > > > by > > > > > > another > > > > > > client going on while you are reading. You're clearly not > > > > > > doing > > > > > > close- > > > > > > to-open. > > > > > > > > > > Documentation is really unclear about this case. Every > > > > > definition of > > > > > close-to-open that I've seen says that it requires a cache > > > > > consistency > > > > > check on every application open. I've never seen one that > > > > > says "on > > > > > every open that doesn't overlap with an already-existing open > > > > > on that > > > > > client". 
> > > > > > > > > > They *usually* also preface that by saying that this is > > > > > motivated by > > > > > the > > > > > use case where opens don't overlap. But it's never made > > > > > clear that > > > > > that's part of the definition. > > > > > > > > > > > > > I'm not following your logic. > > > > > > It's just a question of what every source I can find says close- > > > to-open > > > means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency > > > provides a guarantee of cache consistency at the level of file > > > opens and > > > closes. When a file is closed by an application, the client > > > flushes any > > > cached changs to the server. When a file is opened, the client > > > ignores > > > any cache time remaining (if the file data are cached) and makes > > > an > > > explicit GETATTR call to the server to check the file > > > modification > > > time." > > > > > > > The close-to-open model assumes that the file is only being > > > > modified by > > > > one client at a time and it assumes that file contents may be > > > > cached > > > > while an application is holding it open. > > > > The point checks exist in order to detect if the file is being > > > > changed > > > > when the file is not open. > > > > > > > > Linux does not have a per-application cache. It has a page > > > > cache that > > > > is shared among all applications. It is impossible for two > > > > applications > > > > to open the same file using buffered I/O, and yet see different > > > > contents. > > > > > > Right, so based on the descriptions like the one above, I would > > > have > > > expected both applications to see new data at that point. > > > > > > Maybe that's not practical to implement. It'd be nice at least > > > if that > > > was explicit in the documentation. > > > > > > --b. > > > > > > > > > -- > > > > Matt Benjamin > > Red Hat, Inc. > > 315 West Huron Street, Suite 140A > > Ann Arbor, Michigan 48103 > > > > http://www.redhat.com/en/technologies/storage > > > > tel. 
734-821-5101 > > fax. 734-769-8938 > > cel. 734-216-5309 > > > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 28+ messages in thread
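[Editorial sketch] Trond's hole-exposure scenario can be condensed into a toy model. This is illustrative only — the `apply_writes` helper, offsets, and data are invented for the example, not NFS client code: the writer issues contiguous WRITEs in order, but only the later one has reached the server when another client reads.

```python
# Toy model of the point above: WRITE RPCs from one client may be
# reordered (or partially arrived) en route, so a concurrent reader on
# another client can observe holes that never existed on the writer.

def apply_writes(size, writes):
    """Apply (offset, data) writes to a sparse 'server file' of the
    given size. Unwritten bytes read back as zeros, i.e. holes."""
    buf = bytearray(size)  # sparse regions read as zero bytes
    for off, data in writes:
        buf[off:off + len(data)] = data
    return bytes(buf)

# The writing client issues two contiguous 4-byte writes, in order...
issued = [(0, b"AAAA"), (4, b"BBBB")]
# ...but only the second has reached the server when the reader looks.
arrived = issued[1:]

snapshot = apply_writes(8, arrived)
# The reader sees a hole at bytes 0..3 that the writing client never had.
assert snapshot == b"\x00\x00\x00\x00BBBB"
```

Once both writes land, `apply_writes(8, issued)` returns `b"AAAABBBB"` — the hole was purely an artifact of ordering, which is Trond's argument for why close-to-open cannot cover concurrent reader/writer access.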
* Re: cto changes for v4 atomic open 2021-08-04 2:10 ` Trond Myklebust @ 2021-08-04 14:49 ` Patrick Goetz 2021-08-04 15:42 ` Rick Macklem 2021-08-04 18:24 ` Anna Schumaker 2021-08-04 18:33 ` Matt Benjamin 1 sibling, 2 replies; 28+ messages in thread From: Patrick Goetz @ 2021-08-04 14:49 UTC (permalink / raw) To: Trond Myklebust, bfields, mbenjami; +Cc: plambri, linux-nfs, bcodding On 8/3/21 9:10 PM, Trond Myklebust wrote: > > > On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote: >> (who have performed an open) >> >> On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <mbenjami@redhat.com> >> wrote: >>> >>> I think it is how close-to-open has been traditionally understood. >>> I >>> do not believe that close-to-open in any way implies a single >>> writer, >>> rather it sets the consistency expectation for all readers. >>> > > OK. I'll bite, despite the obvious troll-bait... > > > close-to-open implies a single writer because it is impossible to > guarantee ordering semantics in RPC. You could, in theory, do so by > serialising on the client, but none of us do that because we care about > performance. > > If you don't serialise between clients, then it is trivial (and I'm > seriously tired of people who whine about this) to reproduce reads to > file areas that have not been fully synced to the server, despite > having data on the client that is writing. i.e. the reader sees holes > that never existed on the client that wrote the data. > The reason is that the writes got re-ordered en route to the server, > and so reads to the areas that have not yet been filled are showing up > as holes. > > So, no, the close-to-open semantics definitely apply to both readers > and writers. > So, I have a naive question. When a client is writing to cache, why wouldn't it be possible to send an alert to the server indicating that the file is being changed. The server would keep track of such files (client cached, updated) and act accordingly; i.e. 
sending a request to the client to flush the cache for that file if another client is asking to open the file? The process could be bookended by the client alerting the server when the cached version has been fully synchronized with the copy on the server so that the server wouldn't serve that file until the synchronization is complete. The only problem I can see with this is the client crashing or disconnecting before the file is fully written to the server, but then some timeout condition could be set. >>> Matt >>> >>> On Tue, Aug 3, 2021 at 5:36 PM bfields@fieldses.org >>> <bfields@fieldses.org> wrote: >>>> >>>> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote: >>>>> On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote: >>>>>> On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust >>>>>> wrote: >>>>>>> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington >>>>>>> wrote: >>>>>>>> I have some folks unhappy about behavior changes after: >>>>>>>> 479219218fbe >>>>>>>> NFS: >>>>>>>> Optimise away the close-to-open GETATTR when we have >>>>>>>> NFSv4 OPEN >>>>>>>> >>>>>>>> Before this change, a client holding a RO open would >>>>>>>> invalidate >>>>>>>> the >>>>>>>> pagecache when doing a second RW open. >>>>>>>> >>>>>>>> Now the client doesn't invalidate the pagecache, though >>>>>>>> technically >>>>>>>> it could >>>>>>>> because we see a changeattr update on the RW OPEN >>>>>>>> response. >>>>>>>> >>>>>>>> I feel this is a grey area in CTO if we're already >>>>>>>> holding an >>>>>>>> open. >>>>>>>> Do we >>>>>>>> know how the client ought to behave in this case? Should >>>>>>>> the >>>>>>>> client's open >>>>>>>> upgrade to RW invalidate the pagecache? >>>>>>>> >>>>>>> >>>>>>> It's not a "grey area in close-to-open" at all. It is very >>>>>>> cut and >>>>>>> dried. 
>>>>>>> >>>>>>> If you need to invalidate your page cache while the file is >>>>>>> open, >>>>>>> then >>>>>>> by definition you are in a situation where there is a write >>>>>>> by >>>>>>> another >>>>>>> client going on while you are reading. You're clearly not >>>>>>> doing >>>>>>> close- >>>>>>> to-open. >>>>>> >>>>>> Documentation is really unclear about this case. Every >>>>>> definition of >>>>>> close-to-open that I've seen says that it requires a cache >>>>>> consistency >>>>>> check on every application open. I've never seen one that >>>>>> says "on >>>>>> every open that doesn't overlap with an already-existing open >>>>>> on that >>>>>> client". >>>>>> >>>>>> They *usually* also preface that by saying that this is >>>>>> motivated by >>>>>> the >>>>>> use case where opens don't overlap. But it's never made >>>>>> clear that >>>>>> that's part of the definition. >>>>>> >>>>> >>>>> I'm not following your logic. >>>> >>>> It's just a question of what every source I can find says close- >>>> to-open >>>> means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency >>>> provides a guarantee of cache consistency at the level of file >>>> opens and >>>> closes. When a file is closed by an application, the client >>>> flushes any >>>> cached changs to the server. When a file is opened, the client >>>> ignores >>>> any cache time remaining (if the file data are cached) and makes >>>> an >>>> explicit GETATTR call to the server to check the file >>>> modification >>>> time." >>>> >>>>> The close-to-open model assumes that the file is only being >>>>> modified by >>>>> one client at a time and it assumes that file contents may be >>>>> cached >>>>> while an application is holding it open. >>>>> The point checks exist in order to detect if the file is being >>>>> changed >>>>> when the file is not open. >>>>> >>>>> Linux does not have a per-application cache. It has a page >>>>> cache that >>>>> is shared among all applications. 
It is impossible for two >>>>> applications >>>>> to open the same file using buffered I/O, and yet see different >>>>> contents. >>>> >>>> Right, so based on the descriptions like the one above, I would >>>> have >>>> expected both applications to see new data at that point. >>>> >>>> Maybe that's not practical to implement. It'd be nice at least >>>> if that >>>> was explicit in the documentation. >>>> >>>> --b. >>>> >>> >>> >>> -- >>> >>> Matt Benjamin >>> Red Hat, Inc. >>> 315 West Huron Street, Suite 140A >>> Ann Arbor, Michigan 48103 >>> >>> http://www.redhat.com/en/technologies/storage >>> >>> tel. 734-821-5101 >>> fax. 734-769-8938 >>> cel. 734-216-5309 >> >> >> > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 14:49 ` Patrick Goetz @ 2021-08-04 15:42 ` Rick Macklem 2021-08-04 18:24 ` Anna Schumaker 1 sibling, 0 replies; 28+ messages in thread From: Rick Macklem @ 2021-08-04 15:42 UTC (permalink / raw) To: Patrick Goetz, Trond Myklebust, bfields, mbenjami Cc: plambri, linux-nfs, bcodding Patrick Goetz wrote: [stuff snipped] >So, I have a naive question. When a client is writing to cache, why >wouldn't it be possible to send an alert to the server indicating that >the file is being changed. The server would keep track of such files >(client cached, updated) and act accordingly; i.e. sending a request to >the client to flush the cache for that file if another client is asking >to open the file? The process could be bookended by the client alerting >the server when the cached version has been fully synchronized with the >copy on the server so that the server wouldn't serve that file until the >synchronization is complete. The only problem I can see with this is the >client crashing or disconnecting before the file is fully written to the >server, but then some timeout condition could be set. Well, I wouldn't call this a naive question. There is no notification mechanism defined for any version of NFS. However, although it isn't exactly a notification per se, in NFSv4 a client can exclusively lock a byte range (all bytes if desired). The limitation is that all clients have to "play the game" and acquire byte range locks before doing I/O on the file. I've always thought close-to-open consistency was sketchy at best, and clients should use byte range locks if they care about getting up-to-date file data for cases where other clients might be writing the file. The FreeBSD client only implements close-to-open consistency approximately. It uses cached attributes (which may not be up to date) to re-validate cached data upon open syscalls and doesn't worry about mtime clock resolution for NFSv3. 
--> As such, the client will see data written by another client within a bounded time, but not necessarily immediately after the writer closes the file on another client. When I work on the FreeBSD NFS client, it always seems to come down to "correctness vs good performance via caching" or "how incorrect can I get away with" if you prefer. rick, who chooses to not have an opinion w.r.t. how the Linux NFS client should handle close-to-open consistency ps: I just told Bruce I wasn't going to post, but... >>> Matt >>> >>> On Tue, Aug 3, 2021 at 5:36 PM bfields@fieldses.org >>> <bfields@fieldses.org> wrote: >>>> >>>> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote: >>>>> On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote: >>>>>> On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust >>>>>> wrote: >>>>>>> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington >>>>>>> wrote: >>>>>>>> I have some folks unhappy about behavior changes after: >>>>>>>> 479219218fbe >>>>>>>> NFS: >>>>>>>> Optimise away the close-to-open GETATTR when we have >>>>>>>> NFSv4 OPEN >>>>>>>> >>>>>>>> Before this change, a client holding a RO open would >>>>>>>> invalidate >>>>>>>> the >>>>>>>> pagecache when doing a second RW open. >>>>>>>> >>>>>>>> Now the client doesn't invalidate the pagecache, though >>>>>>>> technically >>>>>>>> it could >>>>>>>> because we see a changeattr update on the RW OPEN >>>>>>>> response. >>>>>>>> >>>>>>>> I feel this is a grey area in CTO if we're already >>>>>>>> holding an >>>>>>>> open. >>>>>>>> Do we >>>>>>>> know how the client ought to behave in this case? Should >>>>>>>> the >>>>>>>> client's open >>>>>>>> upgrade to RW invalidate the pagecache? >>>>>>>> >>>>>>> >>>>>>> It's not a "grey area in close-to-open" at all. It is very >>>>>>> cut and >>>>>>> dried. 
>>>>>>> >>>>>>> If you need to invalidate your page cache while the file is >>>>>>> open, >>>>>>> then >>>>>>> by definition you are in a situation where there is a write >>>>>>> by >>>>>>> another >>>>>>> client going on while you are reading. You're clearly not >>>>>>> doing >>>>>>> close- >>>>>>> to-open. >>>>>> >>>>>> Documentation is really unclear about this case. Every >>>>>> definition of >>>>>> close-to-open that I've seen says that it requires a cache >>>>>> consistency >>>>>> check on every application open. I've never seen one that >>>>>> says "on >>>>>> every open that doesn't overlap with an already-existing open >>>>>> on that >>>>>> client". >>>>>> >>>>>> They *usually* also preface that by saying that this is >>>>>> motivated by >>>>>> the >>>>>> use case where opens don't overlap. But it's never made >>>>>> clear that >>>>>> that's part of the definition. >>>>>> >>>>> >>>>> I'm not following your logic. >>>> >>>> It's just a question of what every source I can find says close- >>>> to-open >>>> means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency >>>> provides a guarantee of cache consistency at the level of file >>>> opens and >>>> closes. When a file is closed by an application, the client >>>> flushes any >>>> cached changs to the server. When a file is opened, the client >>>> ignores >>>> any cache time remaining (if the file data are cached) and makes >>>> an >>>> explicit GETATTR call to the server to check the file >>>> modification >>>> time." >>>> >>>>> The close-to-open model assumes that the file is only being >>>>> modified by >>>>> one client at a time and it assumes that file contents may be >>>>> cached >>>>> while an application is holding it open. >>>>> The point checks exist in order to detect if the file is being >>>>> changed >>>>> when the file is not open. >>>>> >>>>> Linux does not have a per-application cache. It has a page >>>>> cache that >>>>> is shared among all applications. 
It is impossible for two >>>>> applications >>>>> to open the same file using buffered I/O, and yet see different >>>>> contents. >>>> >>>> Right, so based on the descriptions like the one above, I would >>>> have >>>> expected both applications to see new data at that point. >>>> >>>> Maybe that's not practical to implement. It'd be nice at least >>>> if that >>>> was explicit in the documentation. >>>> >>>> --b. >>>> >>> >>> >>> -- >>> >>> Matt Benjamin >>> Red Hat, Inc. >>> 315 West Huron Street, Suite 140A >>> Ann Arbor, Michigan 48103 >>> >>> http://www.redhat.com/en/technologies/storage >>> >>> tel. 734-821-5101 >>> fax. 734-769-8938 >>> cel. 734-216-5309 >> >> >> > ^ permalink raw reply [flat|nested] 28+ messages in thread
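[Editorial sketch] Rick's byte-range-lock approach looks roughly like this from an application. The sketch uses POSIX advisory locks on a local temp file purely to show the calls; on an actual NFSv4 mount the same `fcntl` locks become byte-range locks held on the server, and clients are expected to flush/revalidate cached data around lock acquisition and release — provided, as Rick notes, that all clients "play the game" and lock before doing I/O.

```python
# Sketch: serialize access to a shared file with POSIX advisory locks.
# On NFSv4 these map to server-side byte-range locks; locking the whole
# file before I/O gives stronger consistency than close-to-open alone.
import fcntl
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.dat")

# Writer: exclusive lock over the whole file, write, then unlock.
with open(path, "wb") as f:
    fcntl.lockf(f, fcntl.LOCK_EX)   # blocks until no conflicting lock
    f.write(b"fresh data")
    f.flush()
    fcntl.lockf(f, fcntl.LOCK_UN)

# Reader: shared lock; on NFS, acquiring it forces cache revalidation,
# so the read sees the writer's data rather than stale cached pages.
with open(path, "rb") as f:
    fcntl.lockf(f, fcntl.LOCK_SH)
    data = f.read()
    fcntl.lockf(f, fcntl.LOCK_UN)

assert data == b"fresh data"
```

The trade-off is the one Rick names: correctness costs caching, since every locked region's data must be revalidated against the server.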
* Re: cto changes for v4 atomic open 2021-08-04 14:49 ` Patrick Goetz 2021-08-04 15:42 ` Rick Macklem @ 2021-08-04 18:24 ` Anna Schumaker 2021-08-06 18:58 ` Patrick Goetz 1 sibling, 1 reply; 28+ messages in thread From: Anna Schumaker @ 2021-08-04 18:24 UTC (permalink / raw) To: Patrick Goetz Cc: Trond Myklebust, bfields, mbenjami, plambri, linux-nfs, bcodding Hi Patrick, On Wed, Aug 4, 2021 at 2:17 PM Patrick Goetz <pgoetz@math.utexas.edu> wrote: > > > > On 8/3/21 9:10 PM, Trond Myklebust wrote: > > > > > > On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote: > >> (who have performed an open) > >> > >> On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <mbenjami@redhat.com> > >> wrote: > >>> > >>> I think it is how close-to-open has been traditionally understood. > >>> I > >>> do not believe that close-to-open in any way implies a single > >>> writer, > >>> rather it sets the consistency expectation for all readers. > >>> > > > > OK. I'll bite, despite the obvious troll-bait... > > > > > > close-to-open implies a single writer because it is impossible to > > guarantee ordering semantics in RPC. You could, in theory, do so by > > serialising on the client, but none of us do that because we care about > > performance. > > > > If you don't serialise between clients, then it is trivial (and I'm > > seriously tired of people who whine about this) to reproduce reads to > > file areas that have not been fully synced to the server, despite > > having data on the client that is writing. i.e. the reader sees holes > > that never existed on the client that wrote the data. > > The reason is that the writes got re-ordered en route to the server, > > and so reads to the areas that have not yet been filled are showing up > > as holes. > > > > So, no, the close-to-open semantics definitely apply to both readers > > and writers. > > > > So, I have a naive question. 
When a client is writing to cache, why > wouldn't it be possible to send an alert to the server indicating that > the file is being changed. The server would keep track of such files > (client cached, updated) and act accordingly; i.e. sending a request to > the client to flush the cache for that file if another client is asking > to open the file? The process could be bookended by the client alerting > the server when the cached version has been fully synchronized with the > copy on the server so that the server wouldn't serve that file until the > synchronization is complete. The only problem I can see with this is the > client crashing or disconnecting before the file is fully written to the > server, but then some timeout condition could be set. We already have this! What you're describing is almost exactly how delegations work :) Anna > > > > >>> Matt > >>> > >>> On Tue, Aug 3, 2021 at 5:36 PM bfields@fieldses.org > >>> <bfields@fieldses.org> wrote: > >>>> > >>>> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote: > >>>>> On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote: > >>>>>> On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust > >>>>>> wrote: > >>>>>>> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington > >>>>>>> wrote: > >>>>>>>> I have some folks unhappy about behavior changes after: > >>>>>>>> 479219218fbe > >>>>>>>> NFS: > >>>>>>>> Optimise away the close-to-open GETATTR when we have > >>>>>>>> NFSv4 OPEN > >>>>>>>> > >>>>>>>> Before this change, a client holding a RO open would > >>>>>>>> invalidate > >>>>>>>> the > >>>>>>>> pagecache when doing a second RW open. > >>>>>>>> > >>>>>>>> Now the client doesn't invalidate the pagecache, though > >>>>>>>> technically > >>>>>>>> it could > >>>>>>>> because we see a changeattr update on the RW OPEN > >>>>>>>> response. > >>>>>>>> > >>>>>>>> I feel this is a grey area in CTO if we're already > >>>>>>>> holding an > >>>>>>>> open. 
> >>>>>>>> Do we > >>>>>>>> know how the client ought to behave in this case? Should > >>>>>>>> the > >>>>>>>> client's open > >>>>>>>> upgrade to RW invalidate the pagecache? > >>>>>>>> > >>>>>>> > >>>>>>> It's not a "grey area in close-to-open" at all. It is very > >>>>>>> cut and > >>>>>>> dried. > >>>>>>> > >>>>>>> If you need to invalidate your page cache while the file is > >>>>>>> open, > >>>>>>> then > >>>>>>> by definition you are in a situation where there is a write > >>>>>>> by > >>>>>>> another > >>>>>>> client going on while you are reading. You're clearly not > >>>>>>> doing > >>>>>>> close- > >>>>>>> to-open. > >>>>>> > >>>>>> Documentation is really unclear about this case. Every > >>>>>> definition of > >>>>>> close-to-open that I've seen says that it requires a cache > >>>>>> consistency > >>>>>> check on every application open. I've never seen one that > >>>>>> says "on > >>>>>> every open that doesn't overlap with an already-existing open > >>>>>> on that > >>>>>> client". > >>>>>> > >>>>>> They *usually* also preface that by saying that this is > >>>>>> motivated by > >>>>>> the > >>>>>> use case where opens don't overlap. But it's never made > >>>>>> clear that > >>>>>> that's part of the definition. > >>>>>> > >>>>> > >>>>> I'm not following your logic. > >>>> > >>>> It's just a question of what every source I can find says close- > >>>> to-open > >>>> means. E.g., NFS Illustrated, p. 248, "Close-to-open consistency > >>>> provides a guarantee of cache consistency at the level of file > >>>> opens and > >>>> closes. When a file is closed by an application, the client > >>>> flushes any > >>>> cached changs to the server. When a file is opened, the client > >>>> ignores > >>>> any cache time remaining (if the file data are cached) and makes > >>>> an > >>>> explicit GETATTR call to the server to check the file > >>>> modification > >>>> time." 
> >>>> > >>>>> The close-to-open model assumes that the file is only being > >>>>> modified by > >>>>> one client at a time and it assumes that file contents may be > >>>>> cached > >>>>> while an application is holding it open. > >>>>> The point checks exist in order to detect if the file is being > >>>>> changed > >>>>> when the file is not open. > >>>>> > >>>>> Linux does not have a per-application cache. It has a page > >>>>> cache that > >>>>> is shared among all applications. It is impossible for two > >>>>> applications > >>>>> to open the same file using buffered I/O, and yet see different > >>>>> contents. > >>>> > >>>> Right, so based on the descriptions like the one above, I would > >>>> have > >>>> expected both applications to see new data at that point. > >>>> > >>>> Maybe that's not practical to implement. It'd be nice at least > >>>> if that > >>>> was explicit in the documentation. > >>>> > >>>> --b. > >>>> > >>> > >>> > >>> -- > >>> > >>> Matt Benjamin > >>> Red Hat, Inc. > >>> 315 West Huron Street, Suite 140A > >>> Ann Arbor, Michigan 48103 > >>> > >>> http://www.redhat.com/en/technologies/storage > >>> > >>> tel. 734-821-5101 > >>> fax. 734-769-8938 > >>> cel. 734-216-5309 > >> > >> > >> > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-04 18:24 ` Anna Schumaker @ 2021-08-06 18:58 ` Patrick Goetz 2021-08-07 1:03 ` Rick Macklem 0 siblings, 1 reply; 28+ messages in thread From: Patrick Goetz @ 2021-08-06 18:58 UTC (permalink / raw) To: Anna Schumaker; +Cc: linux-nfs, Rick Macklem Hi - I'm having trouble reconciling this comment: On 8/4/21 1:24 PM, Anna Schumaker wrote: >> >> So, I have a naive question. When a client is writing to cache, why >> wouldn't it be possible to send an alert to the server indicating that >> the file is being changed. The server would keep track of such files >> (client cached, updated) and act accordingly; i.e. sending a request to >> the client to flush the cache for that file if another client is asking >> to open the file? The process could be bookended by the client alerting >> the server when the cached version has been fully synchronized with the >> copy on the server so that the server wouldn't serve that file until the >> synchronization is complete. The only problem I can see with this is the >> client crashing or disconnecting before the file is fully written to the >> server, but then some timeout condition could be set. > > We already have this! What you're describing is almost exactly how > delegations work :) > with this one: On 8/4/21 10:42 AM, Rick Macklem wrote: > > There is no notification mechanism defined for any version of NFS. How can you do delegations if there's no notification system? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: cto changes for v4 atomic open 2021-08-06 18:58 ` Patrick Goetz @ 2021-08-07 1:03 ` Rick Macklem 0 siblings, 0 replies; 28+ messages in thread From: Rick Macklem @ 2021-08-07 1:03 UTC (permalink / raw) To: Patrick Goetz, Anna Schumaker; +Cc: linux-nfs Patrick Goetz wrote: >Hi - > >I'm having trouble reconciling this comment: > >On 8/4/21 1:24 PM, Anna Schumaker wrote: >>> >>> So, I have a naive question. When a client is writing to cache, why >>> wouldn't it be possible to send an alert to the server indicating that >>> the file is being changed. The server would keep track of such files >>> (client cached, updated) and act accordingly; i.e. sending a request to >>> the client to flush the cache for that file if another client is asking >>> to open the file? The process could be bookended by the client alerting >>> the server when the cached version has been fully synchronized with the >>> copy on the server so that the server wouldn't serve that file until the >>> synchronization is complete. The only problem I can see with this is the >>> client crashing or disconnecting before the file is fully written to the >>> server, but then some timeout condition could be set. >> >> We already have this! What you're describing is almost exactly how >> delegations work :) >> > > >with this one: > >On 8/4/21 10:42 AM, Rick Macklem wrote: > > > > There is no notification mechanism defined for any version of NFS. > > >How can you do delegations if there's no notification system? When you asked the question, there was no mention of delegations and only a discussion of caching. Delegations deal with Opens and, yes, can be used to maintain consistent data caches when they happen to be issued to client(s). For write delegations, it works like this: - When a client does an Open for writing, the server might choose to issue a write delegation to the client. (It is not required to do so and there is nothing a client can do to ensure that the server chooses to do so. 
The only rule is "no callback path-->no delegation can be issued".) - If the client happens to get a write delegation, then it can assume no other client is reading or writing the file (unless the client fails to maintain its lease, due to network partitioning or ???). --> Therefore it can safely cache the file's data, unless the server allows I/O to be done using special stateids. More on this later. - If the server receives an Open request from another client for the file, then it does a CB_RECALL callback to tell the client that it must return the delegation. --> The client can no longer safely cache file data once the delegation is returned, since the server will then allow the other client to Open the file. --> If the client fails to return the delegation for a lease duration, then the server can throw the delegation away. --> If the client does not maintain its lease and its callback path, the client cannot safely cache data based on the delegation, since it might have been discarded by the server. In general, a delegation allows the client to do additional Opens on a file without doing an Open on the server (called level 2 OpLocks in the Windows world, I think?). Whether delegations provide consistent data caches depends upon two things, which a server might or might not do: 1 - Issue delegations. 2 - Not allow I/O using special stateids. If any client can do I/O using special stateids, then the I/O can be done without having an Open or delegation for the file on the server. In general, a client cannot easily tell whether these are the case. I suppose it could try an I/O with a special stateid, but that really only confirms whether this particular client can do I/O with special stateids, not that no client can do so. A client can see that it acquired a delegation, but can do nothing if it did not get one. --> Is a client going to forgo caching data for every file where the server chooses not to issue a delegation? Back to your question. 
You can consider the CB_RECALL callback a notification, but it is in a sense a notification to the client that the delegation must be returned, not that file data has changed on the server. In other words, a CB_RECALL is done when another client requests a conflicting Open, not when data on the server has been changed. --> This has a similar effect to a notification that the data will/has changed, but only if the server requires that all I/O present an Open/Lock/Delegation stateid. --> No special stateids allowed and no NFSv3 I/O allowed by the server. (The term notification is used in the NFSv4 RFCs for other things, but not for CB_RECALL callbacks.) rick ^ permalink raw reply [flat|nested] 28+ messages in thread
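[Editorial sketch] Rick's write-delegation walkthrough can be condensed into a toy state machine. The names (`Server.open_file`, `cb_recall`, the delegation dict) are illustrative inventions, not protocol data structures, and real servers also weigh lease expiry and special-stateid I/O, which this sketch omits:

```python
# Toy model of the delegation flow described above:
# - a write delegation may be issued on OPEN-for-write, but only if the
#   client has a callback path ("no callback path --> no delegation");
# - a conflicting OPEN from another client triggers CB_RECALL, after
#   which the holder may no longer cache based on the delegation.

class Client:
    def __init__(self, name, has_callback_path=True):
        self.name = name
        self.has_callback_path = has_callback_path
        self.caching = False  # safe to cache only while holding the delegation

    def cb_recall(self, server, path):
        # CB_RECALL: stop caching and return the delegation to the server.
        self.caching = False
        server.delegations.pop(path, None)

class Server:
    def __init__(self):
        self.delegations = {}  # path -> client currently holding a delegation

    def open_file(self, client, path, write=False):
        holder = self.delegations.get(path)
        if holder is not None and holder is not client:
            holder.cb_recall(self, path)   # conflicting OPEN recalls it
        # Server MAY issue a delegation; it never must. Modeled here as
        # "issue whenever possible" purely to exercise the recall path.
        if write and client.has_callback_path and path not in self.delegations:
            self.delegations[path] = client
            client.caching = True

srv = Server()
a, b = Client("A"), Client("B")
srv.open_file(a, "/f", write=True)
assert a.caching                  # A holds a write delegation, may cache
srv.open_file(b, "/f")            # B's OPEN conflicts with A's delegation
assert not a.caching              # CB_RECALL: A must stop caching
```

As the thread notes, this only yields cache consistency if the server both issues delegations and refuses I/O that bypasses them; a client that is simply not offered a delegation (like `b` above) learns nothing and cannot force one.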
* Re: cto changes for v4 atomic open 2021-08-04 2:10 ` Trond Myklebust 2021-08-04 14:49 ` Patrick Goetz @ 2021-08-04 18:33 ` Matt Benjamin 1 sibling, 0 replies; 28+ messages in thread From: Matt Benjamin @ 2021-08-04 18:33 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, plambri, linux-nfs, bcodding That was not intended as a troll, and I don't see why you would assume that. Of course, what you're saying is correct, multiple writers are not effectively synchronized by close-to-open, and I wasn't implying they should be. Another 3rd (...) writer operating on the file is still relevant to the consumers, regardless of whether they can achieve a uniform view of the data. Matt On Tue, Aug 3, 2021 at 10:11 PM Trond Myklebust <trondmy@hammerspace.com> wrote: > > > > On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote: > > (who have performed an open) > > > > On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <mbenjami@redhat.com> > > wrote: > > > > > > I think it is how close-to-open has been traditionally understood. > > > I > > > do not believe that close-to-open in any way implies a single > > > writer, > > > rather it sets the consistency expectation for all readers. > > > > > OK. I'll bite, despite the obvious troll-bait... > > > close-to-open implies a single writer because it is impossible to > guarantee ordering semantics in RPC. You could, in theory, do so by > serialising on the client, but none of us do that because we care about > performance. > > If you don't serialise between clients, then it is trivial (and I'm > seriously tired of people who whine about this) to reproduce reads to > file areas that have not been fully synced to the server, despite > having data on the client that is writing. i.e. the reader sees holes > that never existed on the client that wrote the data. > The reason is that the writes got re-ordered en route to the server, > and so reads to the areas that have not yet been filled are showing up > as holes. 
> > So, no, the close-to-open semantics definitely apply to both readers > and writers. > > > > Matt > > > > > > On Tue, Aug 3, 2021 at 5:36 PM bfields@fieldses.org > > > <bfields@fieldses.org> wrote: > > > > > > > > On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote: > > > > > On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote: > > > > > > On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust > > > > > > wrote: > > > > > > > On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington > > > > > > > wrote: > > > > > > > > I have some folks unhappy about behavior changes after: > > > > > > > > 479219218fbe > > > > > > > > NFS: > > > > > > > > Optimise away the close-to-open GETATTR when we have > > > > > > > > NFSv4 OPEN > > > > > > > > > > > > > > > > Before this change, a client holding a RO open would > > > > > > > > invalidate > > > > > > > > the > > > > > > > > pagecache when doing a second RW open. > > > > > > > > > > > > > > > > Now the client doesn't invalidate the pagecache, though > > > > > > > > technically > > > > > > > > it could > > > > > > > > because we see a changeattr update on the RW OPEN > > > > > > > > response. > > > > > > > > > > > > > > > > I feel this is a grey area in CTO if we're already > > > > > > > > holding an > > > > > > > > open. > > > > > > > > Do we > > > > > > > > know how the client ought to behave in this case? Should > > > > > > > > the > > > > > > > > client's open > > > > > > > > upgrade to RW invalidate the pagecache? > > > > > > > > > > > > > > > > > > > > > > It's not a "grey area in close-to-open" at all. It is very > > > > > > > cut and > > > > > > > dried. > > > > > > > > > > > > > > If you need to invalidate your page cache while the file is > > > > > > > open, > > > > > > > then > > > > > > > by definition you are in a situation where there is a write > > > > > > > by > > > > > > > another > > > > > > > client going on while you are reading. 
> > > > > > > You're clearly not doing close-to-open.
> > > > > >
> > > > > > Documentation is really unclear about this case.  Every
> > > > > > definition of close-to-open that I've seen says that it
> > > > > > requires a cache consistency check on every application
> > > > > > open.  I've never seen one that says "on every open that
> > > > > > doesn't overlap with an already-existing open on that
> > > > > > client".
> > > > > >
> > > > > > They *usually* also preface that by saying that this is
> > > > > > motivated by the use case where opens don't overlap.  But
> > > > > > it's never made clear that that's part of the definition.
> > > > > >
> > > > >
> > > > > I'm not following your logic.
> > > >
> > > > It's just a question of what every source I can find says
> > > > close-to-open means.  E.g., NFS Illustrated, p. 248,
> > > > "Close-to-open consistency provides a guarantee of cache
> > > > consistency at the level of file opens and closes.  When a file
> > > > is closed by an application, the client flushes any cached
> > > > changes to the server.  When a file is opened, the client
> > > > ignores any cache time remaining (if the file data are cached)
> > > > and makes an explicit GETATTR call to the server to check the
> > > > file modification time."
> > > >
> > > > > The close-to-open model assumes that the file is only being
> > > > > modified by one client at a time, and it assumes that file
> > > > > contents may be cached while an application is holding it
> > > > > open.  The point checks exist in order to detect if the file
> > > > > is being changed when the file is not open.
> > > > >
> > > > > Linux does not have a per-application cache. It has a page
> > > > > cache that is shared among all applications.
> > > > > It is impossible for two applications to open the same file
> > > > > using buffered I/O, and yet see different contents.
> > > >
> > > > Right, so based on the descriptions like the one above, I would
> > > > have expected both applications to see new data at that point.
> > > >
> > > > Maybe that's not practical to implement.  It'd be nice at least
> > > > if that was explicit in the documentation.
> > > >
> > > > --b.
> > >
> > >
> > > --
> > > Matt Benjamin
> > > Red Hat, Inc.
> > > 315 West Huron Street, Suite 140A
> > > Ann Arbor, Michigan 48103
> > >
> > > http://www.redhat.com/en/technologies/storage
> > >
> > > tel.  734-821-5101
> > > fax.  734-769-8938
> > > cel.  734-216-5309
> >
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
>

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309

^ permalink raw reply	[flat|nested] 28+ messages in thread
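[Editor's note] The model debated above — flush dirty data to the server when the file is closed, revalidate the page cache against the server's change attribute on open, and (after 479219218fbe) skip that revalidation when the client already holds the file open — can be sketched as a toy simulation. Everything below is illustrative: the `Server`/`Client` classes and the change-attribute bookkeeping are invented for this sketch and are not the Linux NFS client's actual implementation.

```python
# Toy model of NFS close-to-open (CTO) consistency as described in the
# thread.  Hypothetical names throughout; single-file, whole-file I/O
# only, no write re-ordering modelled.

class Server:
    def __init__(self):
        self.data = b""
        self.change_attr = 0        # analogue of the NFSv4 change attribute

    def write(self, data):
        self.data = data
        self.change_attr += 1       # every modification bumps the attribute

    def getattr(self):
        return self.change_attr

    def read(self):
        return self.data


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = None           # stand-in for the shared page cache
        self.cached_change = None   # change attribute the cache corresponds to
        self.dirty = None           # buffered, not-yet-flushed write
        self.open_count = 0

    def open(self):
        # CTO: on open, ignore any remaining cache timeout and check the
        # change attribute explicitly (the GETATTR under discussion).
        # After 479219218fbe the check is skipped if already open.
        if self.open_count == 0:
            if self.server.getattr() != self.cached_change:
                self.cache = None   # invalidate the stale cache
        self.open_count += 1

    def write(self, data):
        self.dirty = data           # buffered write, cached locally
        self.cache = data

    def read(self):
        if self.cache is None:
            self.cache = self.server.read()
            self.cached_change = self.server.getattr()
        return self.cache

    def close(self):
        # CTO: flush any cached changes to the server on close.
        if self.dirty is not None:
            self.server.write(self.dirty)
            self.cached_change = self.server.getattr()
            self.dirty = None
        self.open_count -= 1


srv = Server()
writer, reader = Client(srv), Client(srv)

writer.open(); writer.write(b"v1"); writer.close()   # flush on close
reader.open()                                        # revalidate on open
assert reader.read() == b"v1"

writer.open(); writer.write(b"v2"); writer.close()
# reader still holds its open: no CTO check, cached data is returned
assert reader.read() == b"v1"

reader.close(); reader.open()                        # reopen revalidates
assert reader.read() == b"v2"
```

In this sketch the non-overlapping open/close sequence behaves exactly as the NFS Illustrated definition quoted above, while the overlapping case (reader holding the file open across the writer's update) shows the behavior Ben raised: the already-open client keeps serving cached data until its next open-time revalidation.
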
end of thread, other threads:[~2021-08-09 14:44 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-30 13:25 cto changes for v4 atomic open Benjamin Coddington
2021-07-30 14:48 ` Trond Myklebust
2021-07-30 15:14   ` Benjamin Coddington
2021-08-03 20:30   ` J. Bruce Fields
2021-08-03 21:07     ` Trond Myklebust
2021-08-03 21:36       ` bfields
2021-08-03 21:43         ` Trond Myklebust
2021-08-03 23:47           ` NeilBrown
2021-08-04  0:00             ` Trond Myklebust
2021-08-04  0:04               ` Trond Myklebust
2021-08-04  0:57               ` NeilBrown
2021-08-04  1:03                 ` Trond Myklebust
2021-08-04  1:16                   ` bfields
2021-08-04  1:25                     ` Trond Myklebust
2021-08-04  1:30                   ` NeilBrown
2021-08-04  1:38                     ` Trond Myklebust
2021-08-09  4:20                       ` NeilBrown
2021-08-09 14:22                         ` Trond Myklebust
2021-08-09 14:43                           ` Chuck Lever III
2021-08-04  1:43               ` Matt Benjamin
2021-08-04  1:51                 ` Matt Benjamin
2021-08-04  2:10                 ` Trond Myklebust
2021-08-04 14:49                   ` Patrick Goetz
2021-08-04 15:42                     ` Rick Macklem
2021-08-04 18:24                       ` Anna Schumaker
2021-08-06 18:58                         ` Patrick Goetz
2021-08-07  1:03                           ` Rick Macklem
2021-08-04 18:33                   ` Matt Benjamin