linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/3] NFSD: add support for NFSv4 write delegation
@ 2023-05-22 23:52 Dai Ngo
  2023-05-22 23:52 ` [PATCH v5 1/3] NFSD: enable support for " Dai Ngo
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Dai Ngo @ 2023-05-22 23:52 UTC (permalink / raw)
  To: chuck.lever, jlayton; +Cc: linux-nfs, linux-fsdevel

NFSD: add support for NFSv4 write delegation

The NFSv4 server currently supports read delegation using VFS lease
which is implemented using file_lock. 

This patch series add write delegation support for NFSv4 server by:

    . remove the check for F_WRLCK in generic_add_lease to allow
      file_lock to be used for write delegation.  

    . grant write delegation for OPEN with NFS4_SHARE_ACCESS_WRITE
      if there is no conflict with other OPENs.

Write delegation conflict with another OPEN, REMOVE, RENAME and SETATTR
are handled the same as read delegation using notify_change, try_break_deleg.

Changes since v1:

[PATCH 3/4] NFSD: add supports for CB_GETATTR callback
- remove WARN_ON_ONCE from encode_bitmap4
- replace decode_bitmap4 with xdr_stream_decode_uint32_array
- replace xdr_inline_decode and xdr_decode_hyper in decode_cb_getattr
   with xdr_stream_decode_u64. Also remove the un-needed likely().
- modify signature of encode_cb_getattr4args to take pointer to
   nfs4_cb_fattr
- replace decode_attr_length with xdr_stream_decode_u32
- rename decode_cb_getattr to decode_cb_fattr4
- fold the initialization of cb_cinfo and cb_fsize into decode_cb_fattr4
- rename ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
  in fs/nfsd/nfs4xdr.c
- correct NFS4_dec_cb_getattr_sz and update size description

[PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
- change nfs4_handle_wrdeleg_conflict returns __be32 to fix test robot
- change ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
  in fs/nfsd/nfs4xdr.c

Changes since v2:

[PATCH 2/4] NFSD: enable support for write delegation
- rename 'deleg' to 'dl_type' in nfs4_set_delegation
- remove 'wdeleg' in nfs4_open_delegation

- drop [PATCH 3/4] NFSD: add supports for CB_GETATTR callback
  and [PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
  for futher clarification of the benefits of these patches

Changes since v3:

- recall write delegation when there is GETATTR from 2nd client
- add trace point to track when write delegation is granted 

Changes since v4:
- squash 4/4 into 2/4
- apply 1/4 last instead of first
- combine nfs4_wrdeleg_filelock and nfs4_handle_wrdeleg_conflict to
  nfsd4_deleg_getattr_conflict and move it to fs/nfsd/nfs4state.c
- check for lock belongs to delegation before proceed and do it
  under the fl_lock
- check and skip FL_LAYOUT file_locks

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 1/3] NFSD: enable support for write delegation
  2023-05-22 23:52 [PATCH v5 0/3] NFSD: add support for NFSv4 write delegation Dai Ngo
@ 2023-05-22 23:52 ` Dai Ngo
  2023-05-24 15:08   ` Jeff Layton
  2023-05-22 23:52 ` [PATCH v5 2/3] NFSD: handle GETATTR conflict with " Dai Ngo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Dai Ngo @ 2023-05-22 23:52 UTC (permalink / raw)
  To: chuck.lever, jlayton; +Cc: linux-nfs, linux-fsdevel

This patch grants write delegation for OPEN with NFS4_SHARE_ACCESS_WRITE
if there is no conflict with other OPENs.

Write delegation conflict with another OPEN, REMOVE, RENAME and SETATTR
are handled the same as read delegation using notify_change,
try_break_deleg.

Add trace point to track whether read or write delegation is granted.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
 fs/nfsd/nfs4state.c | 28 +++++++++++++++++++---------
 fs/nfsd/trace.h     |  1 +
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6e61fa3acaf1..b90b74a5e66e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1144,7 +1144,7 @@ static void block_delegations(struct knfsd_fh *fh)
 
 static struct nfs4_delegation *
 alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
-		 struct nfs4_clnt_odstate *odstate)
+		struct nfs4_clnt_odstate *odstate, u32 dl_type)
 {
 	struct nfs4_delegation *dp;
 	long n;
@@ -1170,7 +1170,7 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
 	INIT_LIST_HEAD(&dp->dl_recall_lru);
 	dp->dl_clnt_odstate = odstate;
 	get_clnt_odstate(odstate);
-	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
+	dp->dl_type = dl_type;
 	dp->dl_retries = 1;
 	dp->dl_recalled = false;
 	nfsd4_init_cb(&dp->dl_recall, dp->dl_stid.sc_client,
@@ -5451,6 +5451,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
 	struct nfs4_delegation *dp;
 	struct nfsd_file *nf;
 	struct file_lock *fl;
+	u32 dl_type;
 
 	/*
 	 * The fi_had_conflict and nfs_get_existing_delegation checks
@@ -5460,7 +5461,13 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
 	if (fp->fi_had_conflict)
 		return ERR_PTR(-EAGAIN);
 
-	nf = find_readable_file(fp);
+	if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
+		nf = find_writeable_file(fp);
+		dl_type = NFS4_OPEN_DELEGATE_WRITE;
+	} else {
+		nf = find_readable_file(fp);
+		dl_type = NFS4_OPEN_DELEGATE_READ;
+	}
 	if (!nf) {
 		/*
 		 * We probably could attempt another open and get a read
@@ -5491,11 +5498,11 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
 		return ERR_PTR(status);
 
 	status = -ENOMEM;
-	dp = alloc_init_deleg(clp, fp, odstate);
+	dp = alloc_init_deleg(clp, fp, odstate, dl_type);
 	if (!dp)
 		goto out_delegees;
 
-	fl = nfs4_alloc_init_lease(dp, NFS4_OPEN_DELEGATE_READ);
+	fl = nfs4_alloc_init_lease(dp, dl_type);
 	if (!fl)
 		goto out_clnt_odstate;
 
@@ -5590,8 +5597,6 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
 		case NFS4_OPEN_CLAIM_PREVIOUS:
 			if (!cb_up)
 				open->op_recall = 1;
-			if (open->op_delegate_type != NFS4_OPEN_DELEGATE_READ)
-				goto out_no_deleg;
 			break;
 		case NFS4_OPEN_CLAIM_NULL:
 			parent = currentfh;
@@ -5616,8 +5621,13 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
 
 	memcpy(&open->op_delegate_stateid, &dp->dl_stid.sc_stateid, sizeof(dp->dl_stid.sc_stateid));
 
-	trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
-	open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
+	if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
+		open->op_delegate_type = NFS4_OPEN_DELEGATE_WRITE;
+		trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
+	} else {
+		open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
+		trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
+	}
 	nfs4_put_stid(&dp->dl_stid);
 	return;
 out_no_deleg:
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 4183819ea082..a14cf8684255 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -607,6 +607,7 @@ DEFINE_STATEID_EVENT(layout_recall_release);
 
 DEFINE_STATEID_EVENT(open);
 DEFINE_STATEID_EVENT(deleg_read);
+DEFINE_STATEID_EVENT(deleg_write);
 DEFINE_STATEID_EVENT(deleg_return);
 DEFINE_STATEID_EVENT(deleg_recall);
 
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 2/3] NFSD: handle GETATTR conflict with write delegation
  2023-05-22 23:52 [PATCH v5 0/3] NFSD: add support for NFSv4 write delegation Dai Ngo
  2023-05-22 23:52 ` [PATCH v5 1/3] NFSD: enable support for " Dai Ngo
@ 2023-05-22 23:52 ` Dai Ngo
  2023-05-23  2:43   ` Chuck Lever
  2023-05-24 15:07   ` Jeff Layton
  2023-05-22 23:52 ` [PATCH v5 3/3] locks: allow support for " Dai Ngo
  2023-06-12 16:14 ` [PATCH v5 0/3] NFSD: add support for NFSv4 " Chuck Lever III
  3 siblings, 2 replies; 20+ messages in thread
From: Dai Ngo @ 2023-05-22 23:52 UTC (permalink / raw)
  To: chuck.lever, jlayton; +Cc: linux-nfs, linux-fsdevel

If the GETATTR request on a file that has write delegation in effect
and the request attributes include the change info and size attribute
then the write delegation is recalled and NFS4ERR_DELAY is returned
for the GETATTR.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
 fs/nfsd/nfs4state.c | 37 +++++++++++++++++++++++++++++++++++++
 fs/nfsd/nfs4xdr.c   |  5 +++++
 fs/nfsd/state.h     |  3 +++
 3 files changed, 45 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b90b74a5e66e..ea9cd781db5f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -8353,3 +8353,40 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
 {
 	get_stateid(cstate, &u->write.wr_stateid);
 }
+
+__be32
+nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp, struct inode *inode)
+{
+	struct file_lock_context *ctx;
+	struct file_lock *fl;
+	struct nfs4_delegation *dp;
+
+	ctx = locks_inode_context(inode);
+	if (!ctx)
+		return 0;
+	spin_lock(&ctx->flc_lock);
+	list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
+		if (fl->fl_flags == FL_LAYOUT ||
+				fl->fl_lmops != &nfsd_lease_mng_ops)
+			continue;
+		if (fl->fl_type == F_WRLCK) {
+			dp = fl->fl_owner;
+			/*
+			 * increment the sc_count to prevent the delegation to
+			 * be freed while sending the CB_RECALL. The refcount is
+			 * decremented by nfs4_put_stid in nfsd4_cb_recall_release
+			 * after the request was sent.
+			 */
+			if (dp->dl_recall.cb_clp == *(rqstp->rq_lease_breaker) ||
+					!refcount_inc_not_zero(&dp->dl_stid.sc_count)) {
+				spin_unlock(&ctx->flc_lock);
+				return 0;
+			}
+			spin_unlock(&ctx->flc_lock);
+			return nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
+		}
+		break;
+	}
+	spin_unlock(&ctx->flc_lock);
+	return 0;
+}
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 76db2fe29624..ed09b575afac 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2966,6 +2966,11 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
 		if (status)
 			goto out;
 	}
+	if (bmval0 & (FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE)) {
+		status = nfs4_handle_wrdeleg_conflict(rqstp, d_inode(dentry));
+		if (status)
+			goto out;
+	}
 
 	err = vfs_getattr(&path, &stat,
 			  STATX_BASIC_STATS | STATX_BTIME | STATX_CHANGE_COOKIE,
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index d49d3060ed4f..64727a39f0db 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -732,4 +732,7 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
 	cmpxchg(&clp->cl_state, NFSD4_COURTESY, NFSD4_EXPIRABLE);
 	return clp->cl_state == NFSD4_EXPIRABLE;
 }
+
+extern __be32 nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp,
+				struct inode *inode);
 #endif   /* NFSD4_STATE_H */
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-22 23:52 [PATCH v5 0/3] NFSD: add support for NFSv4 write delegation Dai Ngo
  2023-05-22 23:52 ` [PATCH v5 1/3] NFSD: enable support for " Dai Ngo
  2023-05-22 23:52 ` [PATCH v5 2/3] NFSD: handle GETATTR conflict with " Dai Ngo
@ 2023-05-22 23:52 ` Dai Ngo
  2023-05-24 15:08   ` Jeff Layton
  2023-06-12 16:14 ` [PATCH v5 0/3] NFSD: add support for NFSv4 " Chuck Lever III
  3 siblings, 1 reply; 20+ messages in thread
From: Dai Ngo @ 2023-05-22 23:52 UTC (permalink / raw)
  To: chuck.lever, jlayton; +Cc: linux-nfs, linux-fsdevel

Remove the check for F_WRLCK in generic_add_lease to allow file_lock
to be used for write delegation.

First consumer is NFSD.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
 fs/locks.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index df8b26a42524..08fb0b4fd4f8 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
 	if (is_deleg && !inode_trylock(inode))
 		return -EAGAIN;
 
-	if (is_deleg && arg == F_WRLCK) {
-		/* Write delegations are not currently supported: */
-		inode_unlock(inode);
-		WARN_ON_ONCE(1);
-		return -EINVAL;
-	}
-
 	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 	time_out_leases(inode, &dispose);
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/3] NFSD: handle GETATTR conflict with write delegation
  2023-05-22 23:52 ` [PATCH v5 2/3] NFSD: handle GETATTR conflict with " Dai Ngo
@ 2023-05-23  2:43   ` Chuck Lever
  2023-05-23  6:02     ` dai.ngo
  2023-05-24 15:07   ` Jeff Layton
  1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2023-05-23  2:43 UTC (permalink / raw)
  To: Dai Ngo; +Cc: chuck.lever, jlayton, linux-nfs, linux-fsdevel

On Mon, May 22, 2023 at 04:52:39PM -0700, Dai Ngo wrote:
> If the GETATTR request on a file that has write delegation in effect
> and the request attributes include the change info and size attribute
> then the write delegation is recalled and NFS4ERR_DELAY is returned
> for the GETATTR.

Isn't this yet another case where the server should send the
CB_RECALL and wait for it briefly, before resorting to
NFS4ERR_DELAY?


> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/nfsd/nfs4state.c | 37 +++++++++++++++++++++++++++++++++++++
>  fs/nfsd/nfs4xdr.c   |  5 +++++
>  fs/nfsd/state.h     |  3 +++
>  3 files changed, 45 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index b90b74a5e66e..ea9cd781db5f 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8353,3 +8353,40 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
>  {
>  	get_stateid(cstate, &u->write.wr_stateid);
>  }
> +
> +__be32
> +nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp, struct inode *inode)

As a globally-visible function, this needs a documenting comment, and
please use "nfsd4_" rather than "nfs4_".


> +{
> +	struct file_lock_context *ctx;
> +	struct file_lock *fl;
> +	struct nfs4_delegation *dp;
> +
> +	ctx = locks_inode_context(inode);
> +	if (!ctx)
> +		return 0;
> +	spin_lock(&ctx->flc_lock);
> +	list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
> +		if (fl->fl_flags == FL_LAYOUT ||
> +				fl->fl_lmops != &nfsd_lease_mng_ops)
> +			continue;
> +		if (fl->fl_type == F_WRLCK) {
> +			dp = fl->fl_owner;
> +			/*
> +			 * increment the sc_count to prevent the delegation to
> +			 * be freed while sending the CB_RECALL. The refcount is
> +			 * decremented by nfs4_put_stid in nfsd4_cb_recall_release
> +			 * after the request was sent.
> +			 */
> +			if (dp->dl_recall.cb_clp == *(rqstp->rq_lease_breaker) ||
> +					!refcount_inc_not_zero(&dp->dl_stid.sc_count)) {
> +				spin_unlock(&ctx->flc_lock);
> +				return 0;
> +			}
> +			spin_unlock(&ctx->flc_lock);
> +			return nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
> +		}
> +		break;
> +	}
> +	spin_unlock(&ctx->flc_lock);
> +	return 0;
> +}
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 76db2fe29624..ed09b575afac 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -2966,6 +2966,11 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
>  		if (status)
>  			goto out;
>  	}
> +	if (bmval0 & (FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE)) {
> +		status = nfs4_handle_wrdeleg_conflict(rqstp, d_inode(dentry));
> +		if (status)
> +			goto out;
> +	}
>  
>  	err = vfs_getattr(&path, &stat,
>  			  STATX_BASIC_STATS | STATX_BTIME | STATX_CHANGE_COOKIE,
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index d49d3060ed4f..64727a39f0db 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -732,4 +732,7 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
>  	cmpxchg(&clp->cl_state, NFSD4_COURTESY, NFSD4_EXPIRABLE);
>  	return clp->cl_state == NFSD4_EXPIRABLE;
>  }
> +
> +extern __be32 nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp,
> +				struct inode *inode);
>  #endif   /* NFSD4_STATE_H */
> -- 
> 2.9.5
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/3] NFSD: handle GETATTR conflict with write delegation
  2023-05-23  2:43   ` Chuck Lever
@ 2023-05-23  6:02     ` dai.ngo
  2023-05-24 17:51       ` Chuck Lever III
  0 siblings, 1 reply; 20+ messages in thread
From: dai.ngo @ 2023-05-23  6:02 UTC (permalink / raw)
  To: Chuck Lever; +Cc: chuck.lever, jlayton, linux-nfs, linux-fsdevel


On 5/22/23 7:43 PM, Chuck Lever wrote:
> On Mon, May 22, 2023 at 04:52:39PM -0700, Dai Ngo wrote:
>> If the GETATTR request on a file that has write delegation in effect
>> and the request attributes include the change info and size attribute
>> then the write delegation is recalled and NFS4ERR_DELAY is returned
>> for the GETATTR.
> Isn't this yet another case where the server should send the
> CB_RECALL and wait for it briefly, before resorting to
> NFS4ERR_DELAY?

Think about this more, I don't think we even need to recall the
delegation at all. The Linux client does not flush the dirty file
data before returning the delegation so the GETATTR still get stale
attributes at the server. And the spec is not explicitly requires
the delegation to be recalled.

If we want to provide the client with more accurate file attributes
then we need to use the CB_GETATTR to get the up-to-date change info
and file size from the client. I think we agreed to defer this for later.

>
>
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>>   fs/nfsd/nfs4state.c | 37 +++++++++++++++++++++++++++++++++++++
>>   fs/nfsd/nfs4xdr.c   |  5 +++++
>>   fs/nfsd/state.h     |  3 +++
>>   3 files changed, 45 insertions(+)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index b90b74a5e66e..ea9cd781db5f 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -8353,3 +8353,40 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
>>   {
>>   	get_stateid(cstate, &u->write.wr_stateid);
>>   }
>> +
>> +__be32
>> +nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp, struct inode *inode)

need to change this function name to nfsd4_deleg_getattr_conflict.

> As a globally-visible function, this needs a documenting comment, and
> please use "nfsd4_" rather than "nfs4_".

will fix, if we decide to do the recall.

-Dai

>
>
>> +{
>> +	struct file_lock_context *ctx;
>> +	struct file_lock *fl;
>> +	struct nfs4_delegation *dp;
>> +
>> +	ctx = locks_inode_context(inode);
>> +	if (!ctx)
>> +		return 0;
>> +	spin_lock(&ctx->flc_lock);
>> +	list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
>> +		if (fl->fl_flags == FL_LAYOUT ||
>> +				fl->fl_lmops != &nfsd_lease_mng_ops)
>> +			continue;
>> +		if (fl->fl_type == F_WRLCK) {
>> +			dp = fl->fl_owner;
>> +			/*
>> +			 * increment the sc_count to prevent the delegation to
>> +			 * be freed while sending the CB_RECALL. The refcount is
>> +			 * decremented by nfs4_put_stid in nfsd4_cb_recall_release
>> +			 * after the request was sent.
>> +			 */
>> +			if (dp->dl_recall.cb_clp == *(rqstp->rq_lease_breaker) ||
>> +					!refcount_inc_not_zero(&dp->dl_stid.sc_count)) {
>> +				spin_unlock(&ctx->flc_lock);
>> +				return 0;
>> +			}
>> +			spin_unlock(&ctx->flc_lock);
>> +			return nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
>> +		}
>> +		break;
>> +	}
>> +	spin_unlock(&ctx->flc_lock);
>> +	return 0;
>> +}
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index 76db2fe29624..ed09b575afac 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -2966,6 +2966,11 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
>>   		if (status)
>>   			goto out;
>>   	}
>> +	if (bmval0 & (FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE)) {
>> +		status = nfs4_handle_wrdeleg_conflict(rqstp, d_inode(dentry));
>> +		if (status)
>> +			goto out;
>> +	}
>>   
>>   	err = vfs_getattr(&path, &stat,
>>   			  STATX_BASIC_STATS | STATX_BTIME | STATX_CHANGE_COOKIE,
>> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
>> index d49d3060ed4f..64727a39f0db 100644
>> --- a/fs/nfsd/state.h
>> +++ b/fs/nfsd/state.h
>> @@ -732,4 +732,7 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
>>   	cmpxchg(&clp->cl_state, NFSD4_COURTESY, NFSD4_EXPIRABLE);
>>   	return clp->cl_state == NFSD4_EXPIRABLE;
>>   }
>> +
>> +extern __be32 nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp,
>> +				struct inode *inode);
>>   #endif   /* NFSD4_STATE_H */
>> -- 
>> 2.9.5
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/3] NFSD: handle GETATTR conflict with write delegation
  2023-05-22 23:52 ` [PATCH v5 2/3] NFSD: handle GETATTR conflict with " Dai Ngo
  2023-05-23  2:43   ` Chuck Lever
@ 2023-05-24 15:07   ` Jeff Layton
  2023-05-24 17:49     ` dai.ngo
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff Layton @ 2023-05-24 15:07 UTC (permalink / raw)
  To: Dai Ngo, chuck.lever; +Cc: linux-nfs, linux-fsdevel

On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
> If the GETATTR request on a file that has write delegation in effect
> and the request attributes include the change info and size attribute
> then the write delegation is recalled and NFS4ERR_DELAY is returned
> for the GETATTR.
> 
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/nfsd/nfs4state.c | 37 +++++++++++++++++++++++++++++++++++++
>  fs/nfsd/nfs4xdr.c   |  5 +++++
>  fs/nfsd/state.h     |  3 +++
>  3 files changed, 45 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index b90b74a5e66e..ea9cd781db5f 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8353,3 +8353,40 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
>  {
>  	get_stateid(cstate, &u->write.wr_stateid);
>  }
> +
> +__be32
> +nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp, struct inode *inode)
> +{
> +	struct file_lock_context *ctx;
> +	struct file_lock *fl;
> +	struct nfs4_delegation *dp;
> +
> +	ctx = locks_inode_context(inode);
> +	if (!ctx)
> +		return 0;
> +	spin_lock(&ctx->flc_lock);
> +	list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
> +		if (fl->fl_flags == FL_LAYOUT ||
> +				fl->fl_lmops != &nfsd_lease_mng_ops)
> +			continue;
> +		if (fl->fl_type == F_WRLCK) {
> +			dp = fl->fl_owner;
> +			/*
> +			 * increment the sc_count to prevent the delegation to
> +			 * be freed while sending the CB_RECALL. The refcount is
> +			 * decremented by nfs4_put_stid in nfsd4_cb_recall_release
> +			 * after the request was sent.
> +			 */
> +			if (dp->dl_recall.cb_clp == *(rqstp->rq_lease_breaker) ||
> +					!refcount_inc_not_zero(&dp->dl_stid.sc_count)) {

I still don't get why you're incrementing the refcount of this stateid.
At this point, you know that this stateid is owned by a different client
altogether,  and breaking its lease doesn't require a reference to the
stateid.

I think this will cause a refcount leak.

> +				spin_unlock(&ctx->flc_lock);
> +				return 0;
> +			}
> +			spin_unlock(&ctx->flc_lock);
> +			return nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
> +		}
> +		break;
> +	}
> +	spin_unlock(&ctx->flc_lock);
> +	return 0;
> +}
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 76db2fe29624..ed09b575afac 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -2966,6 +2966,11 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
>  		if (status)
>  			goto out;
>  	}
> +	if (bmval0 & (FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE)) {
> +		status = nfs4_handle_wrdeleg_conflict(rqstp, d_inode(dentry));
> +		if (status)
> +			goto out;
> +	}
>  
>  	err = vfs_getattr(&path, &stat,
>  			  STATX_BASIC_STATS | STATX_BTIME | STATX_CHANGE_COOKIE,
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index d49d3060ed4f..64727a39f0db 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -732,4 +732,7 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
>  	cmpxchg(&clp->cl_state, NFSD4_COURTESY, NFSD4_EXPIRABLE);
>  	return clp->cl_state == NFSD4_EXPIRABLE;
>  }
> +
> +extern __be32 nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp,
> +				struct inode *inode);
>  #endif   /* NFSD4_STATE_H */

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-22 23:52 ` [PATCH v5 3/3] locks: allow support for " Dai Ngo
@ 2023-05-24 15:08   ` Jeff Layton
  2023-05-24 15:09     ` Chuck Lever III
  0 siblings, 1 reply; 20+ messages in thread
From: Jeff Layton @ 2023-05-24 15:08 UTC (permalink / raw)
  To: Dai Ngo, chuck.lever; +Cc: linux-nfs, linux-fsdevel

On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
> Remove the check for F_WRLCK in generic_add_lease to allow file_lock
> to be used for write delegation.
> 
> First consumer is NFSD.
> 
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/locks.c | 7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index df8b26a42524..08fb0b4fd4f8 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
>  	if (is_deleg && !inode_trylock(inode))
>  		return -EAGAIN;
>  
> -	if (is_deleg && arg == F_WRLCK) {
> -		/* Write delegations are not currently supported: */
> -		inode_unlock(inode);
> -		WARN_ON_ONCE(1);
> -		return -EINVAL;
> -	}
> -
>  	percpu_down_read(&file_rwsem);
>  	spin_lock(&ctx->flc_lock);
>  	time_out_leases(inode, &dispose);

I'd probably move this back to the first patch in the series.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/3] NFSD: enable support for write delegation
  2023-05-22 23:52 ` [PATCH v5 1/3] NFSD: enable support for " Dai Ngo
@ 2023-05-24 15:08   ` Jeff Layton
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff Layton @ 2023-05-24 15:08 UTC (permalink / raw)
  To: Dai Ngo, chuck.lever; +Cc: linux-nfs, linux-fsdevel

On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
> This patch grants write delegation for OPEN with NFS4_SHARE_ACCESS_WRITE
> if there is no conflict with other OPENs.
> 
> Write delegation conflict with another OPEN, REMOVE, RENAME and SETATTR
> are handled the same as read delegation using notify_change,
> try_break_deleg.
> 
> Add trace point to track whether read or write delegation is granted.
> 
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/nfsd/nfs4state.c | 28 +++++++++++++++++++---------
>  fs/nfsd/trace.h     |  1 +
>  2 files changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 6e61fa3acaf1..b90b74a5e66e 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1144,7 +1144,7 @@ static void block_delegations(struct knfsd_fh *fh)
>  
>  static struct nfs4_delegation *
>  alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
> -		 struct nfs4_clnt_odstate *odstate)
> +		struct nfs4_clnt_odstate *odstate, u32 dl_type)
>  {
>  	struct nfs4_delegation *dp;
>  	long n;
> @@ -1170,7 +1170,7 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
>  	INIT_LIST_HEAD(&dp->dl_recall_lru);
>  	dp->dl_clnt_odstate = odstate;
>  	get_clnt_odstate(odstate);
> -	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
> +	dp->dl_type = dl_type;
>  	dp->dl_retries = 1;
>  	dp->dl_recalled = false;
>  	nfsd4_init_cb(&dp->dl_recall, dp->dl_stid.sc_client,
> @@ -5451,6 +5451,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>  	struct nfs4_delegation *dp;
>  	struct nfsd_file *nf;
>  	struct file_lock *fl;
> +	u32 dl_type;
>  
>  	/*
>  	 * The fi_had_conflict and nfs_get_existing_delegation checks
> @@ -5460,7 +5461,13 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>  	if (fp->fi_had_conflict)
>  		return ERR_PTR(-EAGAIN);
>  
> -	nf = find_readable_file(fp);
> +	if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
> +		nf = find_writeable_file(fp);
> +		dl_type = NFS4_OPEN_DELEGATE_WRITE;
> +	} else {
> +		nf = find_readable_file(fp);
> +		dl_type = NFS4_OPEN_DELEGATE_READ;
> +	}
>  	if (!nf) {
>  		/*
>  		 * We probably could attempt another open and get a read
> @@ -5491,11 +5498,11 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>  		return ERR_PTR(status);
>  
>  	status = -ENOMEM;
> -	dp = alloc_init_deleg(clp, fp, odstate);
> +	dp = alloc_init_deleg(clp, fp, odstate, dl_type);
>  	if (!dp)
>  		goto out_delegees;
>  
> -	fl = nfs4_alloc_init_lease(dp, NFS4_OPEN_DELEGATE_READ);
> +	fl = nfs4_alloc_init_lease(dp, dl_type);
>  	if (!fl)
>  		goto out_clnt_odstate;
>  
> @@ -5590,8 +5597,6 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>  		case NFS4_OPEN_CLAIM_PREVIOUS:
>  			if (!cb_up)
>  				open->op_recall = 1;
> -			if (open->op_delegate_type != NFS4_OPEN_DELEGATE_READ)
> -				goto out_no_deleg;
>  			break;
>  		case NFS4_OPEN_CLAIM_NULL:
>  			parent = currentfh;
> @@ -5616,8 +5621,13 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>  
>  	memcpy(&open->op_delegate_stateid, &dp->dl_stid.sc_stateid, sizeof(dp->dl_stid.sc_stateid));
>  
> -	trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
> -	open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
> +	if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
> +		open->op_delegate_type = NFS4_OPEN_DELEGATE_WRITE;
> +		trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
> +	} else {
> +		open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
> +		trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
> +	}
>  	nfs4_put_stid(&dp->dl_stid);
>  	return;
>  out_no_deleg:
> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
> index 4183819ea082..a14cf8684255 100644
> --- a/fs/nfsd/trace.h
> +++ b/fs/nfsd/trace.h
> @@ -607,6 +607,7 @@ DEFINE_STATEID_EVENT(layout_recall_release);
>  
>  DEFINE_STATEID_EVENT(open);
>  DEFINE_STATEID_EVENT(deleg_read);
> +DEFINE_STATEID_EVENT(deleg_write);
>  DEFINE_STATEID_EVENT(deleg_return);
>  DEFINE_STATEID_EVENT(deleg_recall);
>  

This one looks fine and is fairly straightforward.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-24 15:08   ` Jeff Layton
@ 2023-05-24 15:09     ` Chuck Lever III
  2023-05-24 16:55       ` Jeff Layton
  0 siblings, 1 reply; 20+ messages in thread
From: Chuck Lever III @ 2023-05-24 15:09 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Dai Ngo, Linux NFS Mailing List, linux-fsdevel



> On May 24, 2023, at 11:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
>> Remove the check for F_WRLCK in generic_add_lease to allow file_lock
>> to be used for write delegation.
>> 
>> First consumer is NFSD.
>> 
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>> fs/locks.c | 7 -------
>> 1 file changed, 7 deletions(-)
>> 
>> diff --git a/fs/locks.c b/fs/locks.c
>> index df8b26a42524..08fb0b4fd4f8 100644
>> --- a/fs/locks.c
>> +++ b/fs/locks.c
>> @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
>> if (is_deleg && !inode_trylock(inode))
>> return -EAGAIN;
>> 
>> - if (is_deleg && arg == F_WRLCK) {
>> - /* Write delegations are not currently supported: */
>> - inode_unlock(inode);
>> - WARN_ON_ONCE(1);
>> - return -EINVAL;
>> - }
>> -
>> percpu_down_read(&file_rwsem);
>> spin_lock(&ctx->flc_lock);
>> time_out_leases(inode, &dispose);
> 
> I'd probably move this back to the first patch in the series.
> 
> Reviewed-by: Jeff Layton <jlayton@kernel.org>

I asked him to move it to the end. Is it safe to take out this
check before write delegation is actually implemented?


--
Chuck Lever



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-24 15:09     ` Chuck Lever III
@ 2023-05-24 16:55       ` Jeff Layton
  2023-05-24 17:41         ` Chuck Lever III
  2023-05-24 17:52         ` dai.ngo
  0 siblings, 2 replies; 20+ messages in thread
From: Jeff Layton @ 2023-05-24 16:55 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: Dai Ngo, Linux NFS Mailing List, linux-fsdevel

On Wed, 2023-05-24 at 15:09 +0000, Chuck Lever III wrote:
> 
> > On May 24, 2023, at 11:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
> > > Remove the check for F_WRLCK in generic_add_lease to allow file_lock
> > > to be used for write delegation.
> > > 
> > > First consumer is NFSD.
> > > 
> > > Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> > > ---
> > > fs/locks.c | 7 -------
> > > 1 file changed, 7 deletions(-)
> > > 
> > > diff --git a/fs/locks.c b/fs/locks.c
> > > index df8b26a42524..08fb0b4fd4f8 100644
> > > --- a/fs/locks.c
> > > +++ b/fs/locks.c
> > > @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
> > > if (is_deleg && !inode_trylock(inode))
> > > return -EAGAIN;
> > > 
> > > - if (is_deleg && arg == F_WRLCK) {
> > > - /* Write delegations are not currently supported: */
> > > - inode_unlock(inode);
> > > - WARN_ON_ONCE(1);
> > > - return -EINVAL;
> > > - }
> > > -
> > > percpu_down_read(&file_rwsem);
> > > spin_lock(&ctx->flc_lock);
> > > time_out_leases(inode, &dispose);
> > 
> > I'd probably move this back to the first patch in the series.
> > 
> > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> 
> I asked him to move it to the end. Is it safe to take out this
> check before write delegation is actually implemented?
> 

I think so, but it don't think it doesn't make much difference either
way. The only real downside of putting it at the end is that you might
have to contend with a WARN_ON_ONCE if you're bisecting.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-24 16:55       ` Jeff Layton
@ 2023-05-24 17:41         ` Chuck Lever III
  2023-05-24 18:05           ` dai.ngo
  2023-05-24 17:52         ` dai.ngo
  1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever III @ 2023-05-24 17:41 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Dai Ngo, Linux NFS Mailing List, linux-fsdevel



> On May 24, 2023, at 12:55 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Wed, 2023-05-24 at 15:09 +0000, Chuck Lever III wrote:
>> 
>>> On May 24, 2023, at 11:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
>>> 
>>> On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
>>>> Remove the check for F_WRLCK in generic_add_lease to allow file_lock
>>>> to be used for write delegation.
>>>> 
>>>> First consumer is NFSD.
>>>> 
>>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>>> ---
>>>> fs/locks.c | 7 -------
>>>> 1 file changed, 7 deletions(-)
>>>> 
>>>> diff --git a/fs/locks.c b/fs/locks.c
>>>> index df8b26a42524..08fb0b4fd4f8 100644
>>>> --- a/fs/locks.c
>>>> +++ b/fs/locks.c
>>>> @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
>>>> if (is_deleg && !inode_trylock(inode))
>>>> return -EAGAIN;
>>>> 
>>>> - if (is_deleg && arg == F_WRLCK) {
>>>> - /* Write delegations are not currently supported: */
>>>> - inode_unlock(inode);
>>>> - WARN_ON_ONCE(1);
>>>> - return -EINVAL;
>>>> - }
>>>> -
>>>> percpu_down_read(&file_rwsem);
>>>> spin_lock(&ctx->flc_lock);
>>>> time_out_leases(inode, &dispose);
>>> 
>>> I'd probably move this back to the first patch in the series.
>>> 
>>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>> 
>> I asked him to move it to the end. Is it safe to take out this
>> check before write delegation is actually implemented?
>> 
> 
> I think so, but it don't think it doesn't make much difference either
> way. The only real downside of putting it at the end is that you might
> have to contend with a WARN_ON_ONCE if you're bisecting.

My main concern is in fact preventing problems during bisection.
I can apply 3/3 and then 1/3, if you're good with that.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/3] NFSD: handle GETATTR conflict with write delegation
  2023-05-24 15:07   ` Jeff Layton
@ 2023-05-24 17:49     ` dai.ngo
  2023-05-25 17:21       ` Chuck Lever III
  0 siblings, 1 reply; 20+ messages in thread
From: dai.ngo @ 2023-05-24 17:49 UTC (permalink / raw)
  To: Jeff Layton, chuck.lever; +Cc: linux-nfs, linux-fsdevel


On 5/24/23 8:07 AM, Jeff Layton wrote:
> On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
>> If the GETATTR request on a file that has write delegation in effect
>> and the request attributes include the change info and size attribute
>> then the write delegation is recalled and NFS4ERR_DELAY is returned
>> for the GETATTR.
>>
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>>   fs/nfsd/nfs4state.c | 37 +++++++++++++++++++++++++++++++++++++
>>   fs/nfsd/nfs4xdr.c   |  5 +++++
>>   fs/nfsd/state.h     |  3 +++
>>   3 files changed, 45 insertions(+)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index b90b74a5e66e..ea9cd781db5f 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -8353,3 +8353,40 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
>>   {
>>   	get_stateid(cstate, &u->write.wr_stateid);
>>   }
>> +
>> +__be32
>> +nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp, struct inode *inode)
>> +{
>> +	struct file_lock_context *ctx;
>> +	struct file_lock *fl;
>> +	struct nfs4_delegation *dp;
>> +
>> +	ctx = locks_inode_context(inode);
>> +	if (!ctx)
>> +		return 0;
>> +	spin_lock(&ctx->flc_lock);
>> +	list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
>> +		if (fl->fl_flags == FL_LAYOUT ||
>> +				fl->fl_lmops != &nfsd_lease_mng_ops)
>> +			continue;
>> +		if (fl->fl_type == F_WRLCK) {
>> +			dp = fl->fl_owner;
>> +			/*
>> +			 * increment the sc_count to prevent the delegation to
>> +			 * be freed while sending the CB_RECALL. The refcount is
>> +			 * decremented by nfs4_put_stid in nfsd4_cb_recall_release
>> +			 * after the request was sent.
>> +			 */
>> +			if (dp->dl_recall.cb_clp == *(rqstp->rq_lease_breaker) ||
>> +					!refcount_inc_not_zero(&dp->dl_stid.sc_count)) {
> I still don't get why you're incrementing the refcount of this stateid.
> At this point, you know that this stateid is owned by a different client
> altogether,  and breaking its lease doesn't require a reference to the
> stateid.

You're right, the intention was to make sure the delegation does not go
away when the recall is being sent. However, this was already done in
nfsd_break_one_deleg where the sc_count is incremented. Incrementing the
sc_count refcount would be needed here if we do the CB_GETATTR. I'll remove
this in next version.

But should we drop the this patch altogether? since there is no value in
recall the write delegation when there is an GETATTR from another client
as I mentioned in the previous email.

-Dai

>
> I think this will cause a refcount leak.
>
>> +				spin_unlock(&ctx->flc_lock);
>> +				return 0;
>> +			}
>> +			spin_unlock(&ctx->flc_lock);
>> +			return nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
>> +		}
>> +		break;
>> +	}
>> +	spin_unlock(&ctx->flc_lock);
>> +	return 0;
>> +}
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index 76db2fe29624..ed09b575afac 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -2966,6 +2966,11 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
>>   		if (status)
>>   			goto out;
>>   	}
>> +	if (bmval0 & (FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE)) {
>> +		status = nfs4_handle_wrdeleg_conflict(rqstp, d_inode(dentry));
>> +		if (status)
>> +			goto out;
>> +	}
>>   
>>   	err = vfs_getattr(&path, &stat,
>>   			  STATX_BASIC_STATS | STATX_BTIME | STATX_CHANGE_COOKIE,
>> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
>> index d49d3060ed4f..64727a39f0db 100644
>> --- a/fs/nfsd/state.h
>> +++ b/fs/nfsd/state.h
>> @@ -732,4 +732,7 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
>>   	cmpxchg(&clp->cl_state, NFSD4_COURTESY, NFSD4_EXPIRABLE);
>>   	return clp->cl_state == NFSD4_EXPIRABLE;
>>   }
>> +
>> +extern __be32 nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp,
>> +				struct inode *inode);
>>   #endif   /* NFSD4_STATE_H */

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/3] NFSD: handle GETATTR conflict with write delegation
  2023-05-23  6:02     ` dai.ngo
@ 2023-05-24 17:51       ` Chuck Lever III
  0 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever III @ 2023-05-24 17:51 UTC (permalink / raw)
  To: Dai Ngo; +Cc: Chuck Lever, Jeff Layton, Linux NFS Mailing List, linux-fsdevel



> On May 23, 2023, at 2:02 AM, Dai Ngo <dai.ngo@oracle.com> wrote:
> 
> 
> On 5/22/23 7:43 PM, Chuck Lever wrote:
>> On Mon, May 22, 2023 at 04:52:39PM -0700, Dai Ngo wrote:
>>> If the GETATTR request on a file that has write delegation in effect
>>> and the request attributes include the change info and size attribute
>>> then the write delegation is recalled and NFS4ERR_DELAY is returned
>>> for the GETATTR.
>> Isn't this yet another case where the server should send the
>> CB_RECALL and wait for it briefly, before resorting to
>> NFS4ERR_DELAY?
> 
> Think about this more, I don't think we even need to recall the
> delegation at all. The Linux client does not flush the dirty file
> data before returning the delegation so the GETATTR still get stale
> attributes at the server. And the spec is not explicitly requires
> the delegation to be recalled.

I'm trying to get some feedback from other server implementers
about this. I'm willing to consider dropping this patch for now.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-24 16:55       ` Jeff Layton
  2023-05-24 17:41         ` Chuck Lever III
@ 2023-05-24 17:52         ` dai.ngo
  1 sibling, 0 replies; 20+ messages in thread
From: dai.ngo @ 2023-05-24 17:52 UTC (permalink / raw)
  To: Jeff Layton, Chuck Lever III; +Cc: Linux NFS Mailing List, linux-fsdevel


On 5/24/23 9:55 AM, Jeff Layton wrote:
> On Wed, 2023-05-24 at 15:09 +0000, Chuck Lever III wrote:
>>> On May 24, 2023, at 11:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
>>>
>>> On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
>>>> Remove the check for F_WRLCK in generic_add_lease to allow file_lock
>>>> to be used for write delegation.
>>>>
>>>> First consumer is NFSD.
>>>>
>>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>>> ---
>>>> fs/locks.c | 7 -------
>>>> 1 file changed, 7 deletions(-)
>>>>
>>>> diff --git a/fs/locks.c b/fs/locks.c
>>>> index df8b26a42524..08fb0b4fd4f8 100644
>>>> --- a/fs/locks.c
>>>> +++ b/fs/locks.c
>>>> @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
>>>> if (is_deleg && !inode_trylock(inode))
>>>> return -EAGAIN;
>>>>
>>>> - if (is_deleg && arg == F_WRLCK) {
>>>> - /* Write delegations are not currently supported: */
>>>> - inode_unlock(inode);
>>>> - WARN_ON_ONCE(1);
>>>> - return -EINVAL;
>>>> - }
>>>> -
>>>> percpu_down_read(&file_rwsem);
>>>> spin_lock(&ctx->flc_lock);
>>>> time_out_leases(inode, &dispose);
>>> I'd probably move this back to the first patch in the series.
>>>
>>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>> I asked him to move it to the end. Is it safe to take out this
>> check before write delegation is actually implemented?
>>
> I think so, but it don't think it doesn't make much difference either
> way. The only real downside of putting it at the end is that you might
> have to contend with a WARN_ON_ONCE if you're bisecting.

I will make this patch to be the 1st patch, we don't want the user to
get WARN_ON_ONCE when bisecting.

-Dai


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-24 17:41         ` Chuck Lever III
@ 2023-05-24 18:05           ` dai.ngo
  2023-05-24 19:03             ` Jeff Layton
  0 siblings, 1 reply; 20+ messages in thread
From: dai.ngo @ 2023-05-24 18:05 UTC (permalink / raw)
  To: Chuck Lever III, Jeff Layton; +Cc: Linux NFS Mailing List, linux-fsdevel


On 5/24/23 10:41 AM, Chuck Lever III wrote:
>
>> On May 24, 2023, at 12:55 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>
>> On Wed, 2023-05-24 at 15:09 +0000, Chuck Lever III wrote:
>>>> On May 24, 2023, at 11:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
>>>>
>>>> On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
>>>>> Remove the check for F_WRLCK in generic_add_lease to allow file_lock
>>>>> to be used for write delegation.
>>>>>
>>>>> First consumer is NFSD.
>>>>>
>>>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>>>> ---
>>>>> fs/locks.c | 7 -------
>>>>> 1 file changed, 7 deletions(-)
>>>>>
>>>>> diff --git a/fs/locks.c b/fs/locks.c
>>>>> index df8b26a42524..08fb0b4fd4f8 100644
>>>>> --- a/fs/locks.c
>>>>> +++ b/fs/locks.c
>>>>> @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
>>>>> if (is_deleg && !inode_trylock(inode))
>>>>> return -EAGAIN;
>>>>>
>>>>> - if (is_deleg && arg == F_WRLCK) {
>>>>> - /* Write delegations are not currently supported: */
>>>>> - inode_unlock(inode);
>>>>> - WARN_ON_ONCE(1);
>>>>> - return -EINVAL;
>>>>> - }
>>>>> -
>>>>> percpu_down_read(&file_rwsem);
>>>>> spin_lock(&ctx->flc_lock);
>>>>> time_out_leases(inode, &dispose);
>>>> I'd probably move this back to the first patch in the series.
>>>>
>>>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>>> I asked him to move it to the end. Is it safe to take out this
>>> check before write delegation is actually implemented?
>>>
>> I think so, but it don't think it doesn't make much difference either
>> way. The only real downside of putting it at the end is that you might
>> have to contend with a WARN_ON_ONCE if you're bisecting.
> My main concern is in fact preventing problems during bisection.
> I can apply 3/3 and then 1/3, if you're good with that.

I'm good with that. You can apply 3/3 then 1/3 and drop 2/3 so I
don't have to send out v6.

-Dai

>
>
> --
> Chuck Lever
>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-24 18:05           ` dai.ngo
@ 2023-05-24 19:03             ` Jeff Layton
  2023-05-24 20:27               ` dai.ngo
  0 siblings, 1 reply; 20+ messages in thread
From: Jeff Layton @ 2023-05-24 19:03 UTC (permalink / raw)
  To: dai.ngo, Chuck Lever III; +Cc: Linux NFS Mailing List, linux-fsdevel

On Wed, 2023-05-24 at 11:05 -0700, dai.ngo@oracle.com wrote:
> On 5/24/23 10:41 AM, Chuck Lever III wrote:
> > 
> > > On May 24, 2023, at 12:55 PM, Jeff Layton <jlayton@kernel.org> wrote:
> > > 
> > > On Wed, 2023-05-24 at 15:09 +0000, Chuck Lever III wrote:
> > > > > On May 24, 2023, at 11:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > > > > 
> > > > > On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
> > > > > > Remove the check for F_WRLCK in generic_add_lease to allow file_lock
> > > > > > to be used for write delegation.
> > > > > > 
> > > > > > First consumer is NFSD.
> > > > > > 
> > > > > > Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> > > > > > ---
> > > > > > fs/locks.c | 7 -------
> > > > > > 1 file changed, 7 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/locks.c b/fs/locks.c
> > > > > > index df8b26a42524..08fb0b4fd4f8 100644
> > > > > > --- a/fs/locks.c
> > > > > > +++ b/fs/locks.c
> > > > > > @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
> > > > > > if (is_deleg && !inode_trylock(inode))
> > > > > > return -EAGAIN;
> > > > > > 
> > > > > > - if (is_deleg && arg == F_WRLCK) {
> > > > > > - /* Write delegations are not currently supported: */
> > > > > > - inode_unlock(inode);
> > > > > > - WARN_ON_ONCE(1);
> > > > > > - return -EINVAL;
> > > > > > - }
> > > > > > -
> > > > > > percpu_down_read(&file_rwsem);
> > > > > > spin_lock(&ctx->flc_lock);
> > > > > > time_out_leases(inode, &dispose);
> > > > > I'd probably move this back to the first patch in the series.
> > > > > 
> > > > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > > > I asked him to move it to the end. Is it safe to take out this
> > > > check before write delegation is actually implemented?
> > > > 
> > > I think so, but it don't think it doesn't make much difference either
> > > way. The only real downside of putting it at the end is that you might
> > > have to contend with a WARN_ON_ONCE if you're bisecting.
> > My main concern is in fact preventing problems during bisection.
> > I can apply 3/3 and then 1/3, if you're good with that.
> 
> I'm good with that. You can apply 3/3 then 1/3 and drop 2/3 so I
> don't have to send out v6.
> 

I'm fine with that too, particularly if other vendors don't recall on a
getattr currently.

I wonder though, if we need some clarification in the spec on
CB_GETATTR?

    https://www.rfc-editor.org/rfc/rfc8881.html#section-10.4.3

In that section, there is a big distinction about the client being able
to tell that the data was modified from the point where the delegation
was handed out.

There is always a point in time where a client has buffered writes that
haven't been flushed to the server yet, but that's true when it doesn't
have a delegation too. Mostly the client tries to start some writeback
fairly quickly so any lag how the in the change attr/size update is
usually short lived.

I don't think the Linux client materially changes its writeback behavior
based on a write delegation, so I guess (as Olga pointed out) the main
benefit from a write delegation is being able to do delegated opens for
write. A getattr's results won't be changed by extra opens or closes, so
yeah...I guess the utility of CB_GETATTR is really limited.

I guess it _might_ be useful in the case where the server has handed out
a write delegation, but hasn't gotten any writes. That would at least
tell the client that something has changed, even if the deleg holder
hasn't gotten around to writing anything back yet. The problem is that
it's common for applications to open O_RDWR and only do reads.

Maybe we ought to take this discussion to the IETF list? It seems like
the spec mandates that you must recall the delegation if you don't
implement CB_GETATTR, but I don't see much in way of harm in ignoring
that.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/3] locks: allow support for write delegation
  2023-05-24 19:03             ` Jeff Layton
@ 2023-05-24 20:27               ` dai.ngo
  0 siblings, 0 replies; 20+ messages in thread
From: dai.ngo @ 2023-05-24 20:27 UTC (permalink / raw)
  To: Jeff Layton, Chuck Lever III; +Cc: Linux NFS Mailing List, linux-fsdevel


On 5/24/23 12:03 PM, Jeff Layton wrote:
> On Wed, 2023-05-24 at 11:05 -0700, dai.ngo@oracle.com wrote:
>> On 5/24/23 10:41 AM, Chuck Lever III wrote:
>>>> On May 24, 2023, at 12:55 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>>>
>>>> On Wed, 2023-05-24 at 15:09 +0000, Chuck Lever III wrote:
>>>>>> On May 24, 2023, at 11:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
>>>>>>
>>>>>> On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
>>>>>>> Remove the check for F_WRLCK in generic_add_lease to allow file_lock
>>>>>>> to be used for write delegation.
>>>>>>>
>>>>>>> First consumer is NFSD.
>>>>>>>
>>>>>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>>>>>> ---
>>>>>>> fs/locks.c | 7 -------
>>>>>>> 1 file changed, 7 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/locks.c b/fs/locks.c
>>>>>>> index df8b26a42524..08fb0b4fd4f8 100644
>>>>>>> --- a/fs/locks.c
>>>>>>> +++ b/fs/locks.c
>>>>>>> @@ -1729,13 +1729,6 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
>>>>>>> if (is_deleg && !inode_trylock(inode))
>>>>>>> return -EAGAIN;
>>>>>>>
>>>>>>> - if (is_deleg && arg == F_WRLCK) {
>>>>>>> - /* Write delegations are not currently supported: */
>>>>>>> - inode_unlock(inode);
>>>>>>> - WARN_ON_ONCE(1);
>>>>>>> - return -EINVAL;
>>>>>>> - }
>>>>>>> -
>>>>>>> percpu_down_read(&file_rwsem);
>>>>>>> spin_lock(&ctx->flc_lock);
>>>>>>> time_out_leases(inode, &dispose);
>>>>>> I'd probably move this back to the first patch in the series.
>>>>>>
>>>>>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>>>>> I asked him to move it to the end. Is it safe to take out this
>>>>> check before write delegation is actually implemented?
>>>>>
>>>> I think so, but it don't think it doesn't make much difference either
>>>> way. The only real downside of putting it at the end is that you might
>>>> have to contend with a WARN_ON_ONCE if you're bisecting.
>>> My main concern is in fact preventing problems during bisection.
>>> I can apply 3/3 and then 1/3, if you're good with that.
>> I'm good with that. You can apply 3/3 then 1/3 and drop 2/3 so I
>> don't have to send out v6.
>>
> I'm fine with that too, particularly if other vendors don't recall on a
> getattr currently.
>
> I wonder though, if we need some clarification in the spec on
> CB_GETATTR?
>
>      https://www.rfc-editor.org/rfc/rfc8881.html#section-10.4.3
>
> In that section, there is a big distinction about the client being able
> to tell that the data was modified from the point where the delegation
> was handed out.
>
> There is always a point in time where a client has buffered writes that
> haven't been flushed to the server yet, but that's true when it doesn't
> have a delegation too. Mostly the client tries to start some writeback
> fairly quickly so any lag how the in the change attr/size update is
> usually short lived.
>
> I don't think the Linux client materially changes its writeback behavior
> based on a write delegation, so I guess (as Olga pointed out) the main
> benefit from a write delegation is being able to do delegated opens for
> write. A getattr's results won't be changed by extra opens or closes, so
> yeah...I guess the utility of CB_GETATTR is really limited.
>
> I guess it _might_ be useful in the case where the server has handed out
> a write delegation, but hasn't gotten any writes. That would at least
> tell the client that something has changed, even if the deleg holder
> hasn't gotten around to writing anything back yet. The problem is that
> it's common for applications to open O_RDWR and only do reads.
>
> Maybe we ought to take this discussion to the IETF list? It seems like
> the spec mandates that you must recall the delegation if you don't
> implement CB_GETATTR, but I don't see much in way of harm in ignoring
> that.

Yes, I think we should, for clarification.
Jeff, would you mind to drive this discussion on IETF since you can
present the issue much clearer than I would.

Thanks,
-Dai


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/3] NFSD: handle GETATTR conflict with write delegation
  2023-05-24 17:49     ` dai.ngo
@ 2023-05-25 17:21       ` Chuck Lever III
  0 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever III @ 2023-05-25 17:21 UTC (permalink / raw)
  To: Dai Ngo; +Cc: Jeff Layton, Linux NFS Mailing List, linux-fsdevel



> On May 24, 2023, at 1:49 PM, Dai Ngo <dai.ngo@oracle.com> wrote:
> 
> 
> On 5/24/23 8:07 AM, Jeff Layton wrote:
>> On Mon, 2023-05-22 at 16:52 -0700, Dai Ngo wrote:
>>> If the GETATTR request on a file that has write delegation in effect
>>> and the request attributes include the change info and size attribute
>>> then the write delegation is recalled and NFS4ERR_DELAY is returned
>>> for the GETATTR.
>>> 
>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>> ---
>>>  fs/nfsd/nfs4state.c | 37 +++++++++++++++++++++++++++++++++++++
>>>  fs/nfsd/nfs4xdr.c   |  5 +++++
>>>  fs/nfsd/state.h     |  3 +++
>>>  3 files changed, 45 insertions(+)
>>> 
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index b90b74a5e66e..ea9cd781db5f 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -8353,3 +8353,40 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
>>>  {
>>>   get_stateid(cstate, &u->write.wr_stateid);
>>>  }
>>> +
>>> +__be32
>>> +nfs4_handle_wrdeleg_conflict(struct svc_rqst *rqstp, struct inode *inode)
>>> +{
>>> + struct file_lock_context *ctx;
>>> + struct file_lock *fl;
>>> + struct nfs4_delegation *dp;
>>> +
>>> + ctx = locks_inode_context(inode);
>>> + if (!ctx)
>>> + return 0;
>>> + spin_lock(&ctx->flc_lock);
>>> + list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
>>> + if (fl->fl_flags == FL_LAYOUT ||
>>> + fl->fl_lmops != &nfsd_lease_mng_ops)
>>> + continue;
>>> + if (fl->fl_type == F_WRLCK) {
>>> + dp = fl->fl_owner;
>>> + /*
>>> +  * increment the sc_count to prevent the delegation to
>>> +  * be freed while sending the CB_RECALL. The refcount is
>>> +  * decremented by nfs4_put_stid in nfsd4_cb_recall_release
>>> +  * after the request was sent.
>>> +  */
>>> + if (dp->dl_recall.cb_clp == *(rqstp->rq_lease_breaker) ||
>>> + !refcount_inc_not_zero(&dp->dl_stid.sc_count)) {
>> I still don't get why you're incrementing the refcount of this stateid.
>> At this point, you know that this stateid is owned by a different client
>> altogether,  and breaking its lease doesn't require a reference to the
>> stateid.
> 
> You're right, the intention was to make sure the delegation does not go
> away when the recall is being sent. However, this was already done in
> nfsd_break_one_deleg where the sc_count is incremented. Incrementing the
> sc_count refcount would be needed here if we do the CB_GETATTR. I'll remove
> this in next version.
> 
> But should we drop the this patch altogether? since there is no value in
> recall the write delegation when there is an GETATTR from another client
> as I mentioned in the previous email.

a. Neither Solaris nor OnTAP do a CB_RECALL or a CB_GETATTR in this case

b. RFC 8881 Section 18.7.4 states that a server MUST NOT proceed with a
   GETATTR response unless it has done one of those callbacks

So we have two examples of non-compliant server implementations that
don't seem to have consequences for not following that MUST.

It doesn't seem unreasonable to recall in this scenario, but it's
unknown what kind of performance impact that will have.

Can you send an updated version of this patch, rebased on nfsd-next
(which now has the other two applied), plus a patch to add a counter
for this type of conflict?


--
Chuck Lever



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/3] NFSD: add support for NFSv4 write delegation
  2023-05-22 23:52 [PATCH v5 0/3] NFSD: add support for NFSv4 write delegation Dai Ngo
                   ` (2 preceding siblings ...)
  2023-05-22 23:52 ` [PATCH v5 3/3] locks: allow support for " Dai Ngo
@ 2023-06-12 16:14 ` Chuck Lever III
  3 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever III @ 2023-06-12 16:14 UTC (permalink / raw)
  To: Dai Ngo, Jeff Layton; +Cc: Linux NFS Mailing List, linux-fsdevel



> On May 22, 2023, at 7:52 PM, Dai Ngo <dai.ngo@oracle.com> wrote:
> 
> NFSD: add support for NFSv4 write delegation
> 
> The NFSv4 server currently supports read delegation using VFS lease
> which is implemented using file_lock. 
> 
> This patch series add write delegation support for NFSv4 server by:
> 
>    . remove the check for F_WRLCK in generic_add_lease to allow
>      file_lock to be used for write delegation.  
> 
>    . grant write delegation for OPEN with NFS4_SHARE_ACCESS_WRITE
>      if there is no conflict with other OPENs.
> 
> Write delegation conflict with another OPEN, REMOVE, RENAME and SETATTR
> are handled the same as read delegation using notify_change, try_break_deleg.
> 
> Changes since v1:
> 
> [PATCH 3/4] NFSD: add supports for CB_GETATTR callback
> - remove WARN_ON_ONCE from encode_bitmap4
> - replace decode_bitmap4 with xdr_stream_decode_uint32_array
> - replace xdr_inline_decode and xdr_decode_hyper in decode_cb_getattr
>   with xdr_stream_decode_u64. Also remove the un-needed likely().
> - modify signature of encode_cb_getattr4args to take pointer to
>   nfs4_cb_fattr
> - replace decode_attr_length with xdr_stream_decode_u32
> - rename decode_cb_getattr to decode_cb_fattr4
> - fold the initialization of cb_cinfo and cb_fsize into decode_cb_fattr4
> - rename ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
>  in fs/nfsd/nfs4xdr.c
> - correct NFS4_dec_cb_getattr_sz and update size description
> 
> [PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
> - change nfs4_handle_wrdeleg_conflict returns __be32 to fix test robot
> - change ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
>  in fs/nfsd/nfs4xdr.c
> 
> Changes since v2:
> 
> [PATCH 2/4] NFSD: enable support for write delegation
> - rename 'deleg' to 'dl_type' in nfs4_set_delegation
> - remove 'wdeleg' in nfs4_open_delegation
> 
> - drop [PATCH 3/4] NFSD: add supports for CB_GETATTR callback
>  and [PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
>  for futher clarification of the benefits of these patches
> 
> Changes since v3:
> 
> - recall write delegation when there is GETATTR from 2nd client
> - add trace point to track when write delegation is granted 
> 
> Changes since v4:
> - squash 4/4 into 2/4
> - apply 1/4 last instead of first
> - combine nfs4_wrdeleg_filelock and nfs4_handle_wrdeleg_conflict to
>  nfsd4_deleg_getattr_conflict and move it to fs/nfsd/nfs4state.c
> - check for lock belongs to delegation before proceed and do it
>  under the fl_lock
> - check and skip FL_LAYOUT file_locks

Dai has identified a few non-trivial issues with the latest set of
NFSD write delegation patches. We've decided to pull these out of
v6.5 and try again during the 6.6 cycle.

Thanks to Dai for his focus on this, and to testers and reviewers.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-06-12 16:15 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-22 23:52 [PATCH v5 0/3] NFSD: add support for NFSv4 write delegation Dai Ngo
2023-05-22 23:52 ` [PATCH v5 1/3] NFSD: enable support for " Dai Ngo
2023-05-24 15:08   ` Jeff Layton
2023-05-22 23:52 ` [PATCH v5 2/3] NFSD: handle GETATTR conflict with " Dai Ngo
2023-05-23  2:43   ` Chuck Lever
2023-05-23  6:02     ` dai.ngo
2023-05-24 17:51       ` Chuck Lever III
2023-05-24 15:07   ` Jeff Layton
2023-05-24 17:49     ` dai.ngo
2023-05-25 17:21       ` Chuck Lever III
2023-05-22 23:52 ` [PATCH v5 3/3] locks: allow support for " Dai Ngo
2023-05-24 15:08   ` Jeff Layton
2023-05-24 15:09     ` Chuck Lever III
2023-05-24 16:55       ` Jeff Layton
2023-05-24 17:41         ` Chuck Lever III
2023-05-24 18:05           ` dai.ngo
2023-05-24 19:03             ` Jeff Layton
2023-05-24 20:27               ` dai.ngo
2023-05-24 17:52         ` dai.ngo
2023-06-12 16:14 ` [PATCH v5 0/3] NFSD: add support for NFSv4 " Chuck Lever III

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).