* [RFC v1 00/17] NFSD support for inter+async COPY
@ 2017-03-02 16:01 Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 01/18] NFSD add ca_source_server<> to COPY Olga Kornievskaia
                   ` (19 more replies)
  0 siblings, 20 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC
  To: bfields; +Cc: linux-nfs

This is server-side support for NFSv4.2 inter-server and asynchronous
COPY, built on top of the existing intra-server synchronous COPY. It
also depends on the NFSv4.2 client code, which does the client side of
the destination server's work in inter-server SSC.

NFSD determines whether a COPY is intra or inter, and whether it is sync
or async. For inter, NFSD uses the NFSv4.1 protocol and creates an
internal mount point (superblock). It destroys the mount point when the
copy is done.
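
As a rough illustration of that decision (a userspace sketch, not the
patch code): patch 01 treats a zero-length ca_source_server<> list as an
intra-server copy, and the sync/async choice follows the request's
synchronous flag. The struct and enum names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the decoded COPY arguments. */
struct copy_args {
	uint32_t nl_nsvr;	/* number of entries in ca_source_server<> */
	bool	 synchronous;	/* ca_synchronous from the request */
};

enum copy_kind {
	COPY_INTRA_SYNC,
	COPY_INTRA_ASYNC,
	COPY_INTER_SYNC,
	COPY_INTER_ASYNC,
};

/* No source servers listed means source and destination are this server. */
static enum copy_kind classify_copy(const struct copy_args *args)
{
	assert(args != NULL);

	if (args->nl_nsvr == 0)
		return args->synchronous ? COPY_INTRA_SYNC : COPY_INTRA_ASYNC;
	return args->synchronous ? COPY_INTER_SYNC : COPY_INTER_ASYNC;
}
```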

To do asynchronous copies, NFSD creates a single-threaded workqueue so
that no NFSD thread is tied up completing the copy. Upon receiving
the COPY, it generates a unique copy stateid (stored in a global list
that tracks copy state so OFFLOAD_STATUS can query it), queues a work
item for the copy, and replies to the client. The nfsd4_copy arguments,
which are allocated on the stack, are copied for the work item.
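
The reason for copying the arguments can be sketched in userspace as
follows. This is only an illustration under assumed names (copy_req,
copy_work, copy_queue are hypothetical, and a plain FIFO stands in for
the kernel workqueue): the work item owns a private copy of the request,
so it stays valid after the function that received the COPY has returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parts of nfsd4_copy a worker needs. */
struct copy_req {
	uint64_t src_pos;
	uint64_t dst_pos;
	uint64_t count;
};

/* One queued unit of work; it owns a private copy of the request. */
struct copy_work {
	struct copy_req	  req;
	struct copy_work *next;
};

/* Plain FIFO, standing in for a single-threaded workqueue. */
struct copy_queue {
	struct copy_work *head;
	struct copy_work *tail;
};

/* Copy the caller's (possibly stack-allocated) request and queue it.
 * Returns 0 on success, -1 if the allocation fails. */
static int queue_copy(struct copy_queue *q, const struct copy_req *req)
{
	struct copy_work *w;

	assert(q != NULL && req != NULL);
	w = malloc(sizeof(*w));
	if (!w)
		return -1;
	w->req = *req;		/* private copy, independent of the caller */
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
	return 0;
}

/* Pop the oldest work item; the caller frees it after running the copy. */
static struct copy_work *dequeue_copy(struct copy_queue *q)
{
	struct copy_work *w = q->head;

	if (!w)
		return NULL;
	q->head = w->next;
	if (!q->head)
		q->tail = NULL;
	w->next = NULL;
	return w;
}
```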

The async copy handler calls into VFS copy_file_range() with 4MB
chunks and loops until it has copied the requested size. If an error
is encountered, it is saved along with the amount of data copied so
far. Once done, the results are queued on the callback workqueue and
sent via CB_OFFLOAD. Currently, the copy state information stored in
the global list is cleaned up when the copy is done, not in the
callback's release function (it could be done there instead, if
needed?).
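
A minimal userspace sketch of that loop, assuming a pluggable chunk
primitive in place of the VFS call (chunk_fn, copy_result and demo_chunk
are hypothetical names, not taken from the patches). It shows the two
things the handler has to report: the error, if any, and how much was
copied before it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COPY_CHUNK (4u * 1024u * 1024u)	/* 4MB per call */

/* Hypothetical chunk primitive: copies up to len bytes and returns the
 * number copied, 0 at end of source, or a negative errno-style value. */
typedef int64_t (*chunk_fn)(void *ctx, uint64_t src_pos, uint64_t dst_pos,
			    size_t len);

struct copy_result {
	uint64_t copied;	/* bytes copied before completion or error */
	int	 error;		/* 0, or the negative value that stopped us */
};

static struct copy_result copy_in_chunks(chunk_fn copy_chunk, void *ctx,
					 uint64_t src_pos, uint64_t dst_pos,
					 uint64_t count)
{
	struct copy_result res = { 0, 0 };

	assert(copy_chunk != NULL);
	while (res.copied < count) {
		uint64_t left = count - res.copied;
		size_t len = left > COPY_CHUNK ? COPY_CHUNK : (size_t)left;
		int64_t n = copy_chunk(ctx, src_pos + res.copied,
				       dst_pos + res.copied, len);

		if (n < 0) {		/* keep the error and the progress */
			res.error = (int)n;
			break;
		}
		if (n == 0)		/* source exhausted early */
			break;
		res.copied += (uint64_t)n;
	}
	return res;
}

/* Demo primitive: pretends to copy, then fails with -5 once the source
 * offset reaches the limit stored in *ctx. */
static int64_t demo_chunk(void *ctx, uint64_t src_pos, uint64_t dst_pos,
			  size_t len)
{
	uint64_t limit = *(uint64_t *)ctx;

	(void)dst_pos;
	if (src_pos >= limit)
		return -5;
	if (src_pos + len > limit)
		len = (size_t)(limit - src_pos);
	return (int64_t)len;
}
```

With demo_chunk and a 6MB limit, a 9MB request stops with the error
saved and 6MB recorded as copied, which is the pair of values a
CB_OFFLOAD reply would need to carry.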

On the source server, upon receiving a COPY_NOTIFY, NFSD generates a
unique stateid that is kept in the global list. Upon receiving a READ
with a stateid, the code checks the normal list of open stateids and
now additionally checks the copy state list before either finding a
match or failing with BAD_STATEID. The stored stateid is only valid if
it is first used within a chosen lease period (90s currently). When the
source server receives an OFFLOAD_CANCEL, it removes the stateid from
the global list. Otherwise, the copy stateid is removed upon the removal
of its "parent" stateid (open/lock/delegation stateid).
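
The lifecycle above can be sketched in userspace roughly as follows.
This is an illustration only: the fixed-size table, the integer id and
all function names are hypothetical, and it assumes the lease applies
only to the first use, so later READs with the same stateid keep
matching until the entry is removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define COPY_LEASE_SECS 90	/* lease period currently chosen */
#define MAX_COPY_STATES 8	/* arbitrary bound for this sketch */

/* Hypothetical entry; the real list would hold stateid_t values. */
struct copy_state {
	bool	 in_use;
	bool	 first_used;	/* a READ has already presented this id */
	uint64_t id;		/* stand-in for the opaque stateid */
	uint64_t issued_at;	/* seconds, when COPY_NOTIFY was handled */
};

static struct copy_state copy_states[MAX_COPY_STATES];

static struct copy_state *copy_state_find(uint64_t id)
{
	for (size_t i = 0; i < MAX_COPY_STATES; i++)
		if (copy_states[i].in_use && copy_states[i].id == id)
			return &copy_states[i];
	return NULL;
}

/* COPY_NOTIFY: remember a new stateid. Returns false if the table is full. */
static bool copy_state_add(uint64_t id, uint64_t now)
{
	assert(copy_state_find(id) == NULL);	/* ids must be unique */
	for (size_t i = 0; i < MAX_COPY_STATES; i++) {
		if (!copy_states[i].in_use) {
			copy_states[i] = (struct copy_state){
				.in_use = true, .id = id, .issued_at = now,
			};
			return true;
		}
	}
	return false;
}

/* READ: false corresponds to failing with BAD_STATEID. */
static bool copy_state_check(uint64_t id, uint64_t now)
{
	struct copy_state *cs = copy_state_find(id);

	if (!cs)
		return false;
	if (!cs->first_used && now - cs->issued_at > COPY_LEASE_SECS)
		return false;	/* first use came after the lease ran out */
	cs->first_used = true;
	return true;
}

/* OFFLOAD_CANCEL, or removal of the parent stateid: forget the entry. */
static void copy_state_remove(uint64_t id)
{
	struct copy_state *cs = copy_state_find(id);

	if (cs)
		cs->in_use = false;
}
```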


Andy Adamson (7):
  NFSD add ca_source_server<> to COPY
  NFSD generalize nfsd4_compound_state flag names
  NFSD: allow inter server COPY to have a STALE source server fh
  NFSD return nfs4_stid in nfs4_preprocess_stateid_op
  NFSD add COPY_NOTIFY operation
  NFSD add nfs4 inter ssc to nfsd4_copy
  NFSD Unique stateid_t for inter server to server COPY authentication

Olga Kornievskaia (10):
  NFSD CB_OFFLOAD xdr
  NFSD OFFLOAD_STATUS xdr
  NFSD OFFLOAD_CANCEL xdr
  NFSD xdr callback stateid in async COPY reply
  NFSD first draft of async copy
  NFSD handle OFFLOAD_CANCEL op
  NFSD stop queued async copies on client shutdown
  NFSD create new stateid for async copy
  NFSD define EBADF in nfserrno
  NFSD support OFFLOAD_STATUS

 fs/nfsd/Kconfig        |  10 +
 fs/nfsd/netns.h        |   8 +
 fs/nfsd/nfs4callback.c |  95 +++++++
 fs/nfsd/nfs4proc.c     | 704 ++++++++++++++++++++++++++++++++++++++++++++++---
 fs/nfsd/nfs4state.c    | 142 +++++++++-
 fs/nfsd/nfs4xdr.c      | 266 ++++++++++++++++++-
 fs/nfsd/nfsctl.c       |   2 +
 fs/nfsd/nfsd.h         |   2 +
 fs/nfsd/nfsproc.c      |   1 +
 fs/nfsd/state.h        |  32 ++-
 fs/nfsd/xdr4.h         |  53 +++-
 fs/nfsd/xdr4cb.h       |  10 +
 include/linux/nfs4.h   |   1 +
 13 files changed, 1273 insertions(+), 53 deletions(-)

-- 
1.8.3.1



* [RFC v1 01/18] NFSD add ca_source_server<> to COPY
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-09-01 19:52   ` J. Bruce Fields
  2017-03-02 16:01 ` [RFC v1 02/18] NFSD add COPY_NOTIFY operation Olga Kornievskaia
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC
  To: bfields; +Cc: linux-nfs

From: Andy Adamson <andros@netapp.com>

Note: followed conventions and have struct nfsd4_compoundargs pointer as a
parameter even though it is unused.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfsd/nfs4xdr.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 fs/nfsd/xdr4.h    |  4 +++
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 382c1fd..f62cbad 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -41,6 +41,7 @@
 #include <linux/utsname.h>
 #include <linux/pagemap.h>
 #include <linux/sunrpc/svcauth_gss.h>
+#include <linux/sunrpc/addr.h>
 
 #include "idmap.h"
 #include "acl.h"
@@ -1726,11 +1727,58 @@ static __be32 nfsd4_decode_reclaim_complete(struct nfsd4_compoundargs *argp, str
 	DECODE_TAIL;
 }
 
+static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
+				      struct nl4_server *ns)
+{
+	DECODE_HEAD;
+	struct nfs42_netaddr *naddr;
+
+	READ_BUF(4);
+	ns->nl4_type = be32_to_cpup(p++);
+
+	/* currently support for 1 inter-server source server */
+	switch (ns->nl4_type) {
+	case NL4_NAME:
+	case NL4_URL:
+		READ_BUF(4);
+		ns->u.nl4_str_sz = be32_to_cpup(p++);
+		if (ns->u.nl4_str_sz > NFS4_OPAQUE_LIMIT)
+			goto xdr_error;
+
+		READ_BUF(ns->u.nl4_str_sz);
+		COPYMEM(ns->u.nl4_str,
+			ns->u.nl4_str_sz);
+		break;
+	case NL4_NETADDR:
+		naddr = &ns->u.nl4_addr;
+
+		READ_BUF(4);
+		naddr->na_netid_len = be32_to_cpup(p++);
+		if (naddr->na_netid_len > RPCBIND_MAXNETIDLEN)
+			goto xdr_error;
+
+		READ_BUF(naddr->na_netid_len + 4); /* 4 for uaddr len */
+		COPYMEM(naddr->na_netid, naddr->na_netid_len);
+
+		naddr->na_uaddr_len = be32_to_cpup(p++);
+		if (naddr->na_uaddr_len > RPCBIND_MAXUADDRLEN)
+			goto xdr_error;
+
+		READ_BUF(naddr->na_uaddr_len);
+		COPYMEM(naddr->na_uaddr, naddr->na_uaddr_len);
+		break;
+	default:
+		goto xdr_error;
+	}
+	DECODE_TAIL;
+}
+
 static __be32
 nfsd4_decode_copy(struct nfsd4_compoundargs *argp, struct nfsd4_copy *copy)
 {
 	DECODE_HEAD;
-	unsigned int tmp;
+	struct nl4_server *ns;
+	int i;
 
 	status = nfsd4_decode_stateid(argp, &copy->cp_src_stateid);
 	if (status)
@@ -1745,8 +1793,29 @@ static __be32 nfsd4_decode_reclaim_complete(struct nfsd4_compoundargs *argp, str
 	p = xdr_decode_hyper(p, &copy->cp_count);
 	copy->cp_consecutive = be32_to_cpup(p++);
 	copy->cp_synchronous = be32_to_cpup(p++);
-	tmp = be32_to_cpup(p); /* Source server list not supported */
+	copy->cp_src.nl_nsvr = be32_to_cpup(p++);
 
+	if (copy->cp_src.nl_nsvr == 0) /* intra-server copy */
+		goto intra;
+
+	/** Support NFSD4_MAX_SSC_SRC number of source servers.
+	 * freed in nfsd4_encode_copy
+	 */
+	if (copy->cp_src.nl_nsvr > NFSD4_MAX_SSC_SRC)
+		copy->cp_src.nl_nsvr = NFSD4_MAX_SSC_SRC;
+	copy->cp_src.nl_svr = kmalloc(copy->cp_src.nl_nsvr *
+					sizeof(struct nl4_server), GFP_KERNEL);
+	if (copy->cp_src.nl_svr == NULL)
+		return nfserrno(-ENOMEM);
+
+	ns = copy->cp_src.nl_svr;
+	for (i = 0; i < copy->cp_src.nl_nsvr; i++) {
+		status = nfsd4_decode_nl4_server(argp, ns);
+		if (status)
+			return status;
+		ns++;
+	}
+intra:
 	DECODE_TAIL;
 }
 
@@ -4295,6 +4364,8 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 		*p++ = cpu_to_be32(copy->cp_consecutive);
 		*p++ = cpu_to_be32(copy->cp_synchronous);
 	}
+	/* allocated in nfsd4_decode_copy */
+	kfree(copy->cp_src.nl_svr);
 	return nfserr;
 }
 
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 8fda4ab..6b1a61fc 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -509,6 +509,9 @@ struct nfsd42_write_res {
 	nfs4_verifier		wr_verifier;
 };
 
+/*  support 1 source server for now */
+#define NFSD4_MAX_SSC_SRC       1
+
 struct nfsd4_copy {
 	/* request */
 	stateid_t	cp_src_stateid;
@@ -516,6 +519,7 @@ struct nfsd4_copy {
 	u64		cp_src_pos;
 	u64		cp_dst_pos;
 	u64		cp_count;
+	struct nl4_servers cp_src;
 
 	/* both */
 	bool		cp_consecutive;
-- 
1.8.3.1



* [RFC v1 02/18] NFSD add COPY_NOTIFY operation
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 01/18] NFSD add ca_source_server<> to COPY Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 03/18] NFSD generalize nfsd4_compound_state flag names Olga Kornievskaia
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC
  To: bfields; +Cc: linux-nfs

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfsd/nfs4proc.c | 116 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/nfs4xdr.c  | 112 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/nfsd/xdr4.h     |  13 ++++++
 3 files changed, 239 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index cbeeda1..bcd3c51 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -35,6 +35,7 @@
 #include <linux/file.h>
 #include <linux/falloc.h>
 #include <linux/slab.h>
+#include <linux/sunrpc/addr.h>
 
 #include "idmap.h"
 #include "cache.h"
@@ -1087,6 +1088,100 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 }
 
 static __be32
+nfsd4_set_src_nl4_netaddr(struct svc_rqst *rqstp, struct nl4_servers *svrs)
+{
+	const struct sockaddr *addr = (struct sockaddr *)&rqstp->rq_daddr;
+	const struct sockaddr_in *sin = (const struct sockaddr_in *)addr;
+	const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)addr;
+	int uaddr_len = rqstp->rq_daddrlen + 4 + 1; /* port (4) and '\0' (1) */
+	struct nfs42_netaddr *naddr;
+	size_t ret;
+	unsigned short port;
+
+	/* freed in nfsd4_encode_copy_notify */
+	svrs->nl_svr = kmalloc_array(svrs->nl_nsvr, sizeof(struct nl4_server),
+				GFP_KERNEL);
+	if (svrs->nl_svr == NULL)
+		return nfserrno(-ENOMEM);
+
+	svrs->nl_svr->nl4_type = NL4_NETADDR;
+	naddr = &svrs->nl_svr->u.nl4_addr;
+
+	switch (addr->sa_family) {
+	case AF_INET:
+		port = ntohs(sin->sin_port);
+		ret = rpc_ntop(addr, naddr->na_uaddr, sizeof(naddr->na_uaddr));
+		snprintf(naddr->na_uaddr + ret, uaddr_len, ".%u.%u",
+			 port >> 8, port & 255);
+		naddr->na_uaddr_len = strlen(naddr->na_uaddr);
+
+		snprintf(naddr->na_netid, 4, "%s", "tcp");
+			naddr->na_netid_len = 3;
+		break;
+	case AF_INET6:
+		port = ntohs(sin6->sin6_port);
+		ret = rpc_ntop(addr, naddr->na_uaddr, sizeof(naddr->na_uaddr));
+		snprintf(naddr->na_uaddr + ret, uaddr_len, ".%u.%u",
+			 port >> 8, port & 255);
+		naddr->na_uaddr_len = strlen(naddr->na_uaddr);
+
+		snprintf(naddr->na_netid, 5, "%s", "tcp6");
+			naddr->na_netid_len = 4;
+		break;
+	default:
+		dprintk("NFSD  nfsd4_set_notify_src: unknown address type: %d",
+			addr->sa_family);
+		kfree(svrs->nl_svr);
+		return nfserrno(-EINVAL);
+	}
+	return 0;
+}
+
+static __be32
+nfsd4_copy_notify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+		  struct nfsd4_copy_notify *cn)
+{
+	__be32 status;
+	struct file *src = NULL;
+	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+
+	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
+					&cn->cpn_src_stateid, RD_STATE, &src,
+					NULL);
+	if (status)
+		return status;
+
+	cn->cpn_sec = nn->nfsd4_lease;
+	cn->cpn_nsec = 0;
+
+
+	/** XXX Save cpn_src_stateid, cpn_src, and any other returned source
+	 * server addresses on which the source server is willing to accept
+	 * connections from the destination e.g. what is returned in cpn_src,
+	 * to verify READ from dest server.
+	 */
+
+	/**
+	 * For now, only return one server address in cpn_src, the
+	 * address used by the client to connect to this server.
+	 */
+	cn->cpn_src.nl_nsvr = 1;
+
+	status = nfsd4_set_src_nl4_netaddr(rqstp, &cn->cpn_src);
+	if (status != 0)
+		goto out;
+
+	dprintk("<-- %s cpn_dst %s:%s nl_nsvr %d nl_svr %s:%s\n", __func__,
+		cn->cpn_dst.u.nl4_addr.na_netid,
+		cn->cpn_dst.u.nl4_addr.na_uaddr,
+		cn->cpn_src.nl_nsvr,
+		cn->cpn_src.nl_svr->u.nl4_addr.na_netid,
+		cn->cpn_src.nl_svr->u.nl4_addr.na_uaddr);
+out:
+	return status;
+}
+
+static __be32
 nfsd4_fallocate(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		struct nfsd4_fallocate *fallocate, int flags)
 {
@@ -2042,6 +2137,21 @@ static inline u32 nfsd4_copy_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
 		1 /* cr_synchronous */) * sizeof(__be32);
 }
 
+static inline u32 nfsd4_copy_notify_rsize(struct svc_rqst *rqstp,
+					struct nfsd4_op *op)
+{
+	return (op_encode_hdr_size +
+		3 /* cnr_lease_time */ +
+		1 /* We support one cnr_source_server */ +
+		1 /* cnr_stateid seq */ +
+		op_encode_stateid_maxsz /* cnr_stateid */ +
+		1 /* num cnr_source_server*/ +
+		1 /* nl4_type */ +
+		1 /* nl4 size */ +
+		XDR_QUADLEN(NFS4_OPAQUE_LIMIT) /*nl4_loc + nl4_loc_sz */)
+		* sizeof(__be32);
+}
+
 #ifdef CONFIG_NFSD_PNFS
 static inline u32 nfsd4_getdeviceinfo_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
 {
@@ -2446,6 +2556,12 @@ static inline u32 nfsd4_seek_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
 		.op_name = "OP_SEEK",
 		.op_rsize_bop = (nfsd4op_rsize)nfsd4_seek_rsize,
 	},
+	[OP_COPY_NOTIFY] = {
+		.op_func = (nfsd4op_func)nfsd4_copy_notify,
+		.op_flags = OP_MODIFIES_SOMETHING | OP_CACHEME,
+		.op_name = "OP_COPY_NOTIFY",
+		.op_rsize_bop = (nfsd4op_rsize)nfsd4_copy_notify_rsize,
+	},
 };
 
 /**
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index f62cbad..c632156 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1820,6 +1820,22 @@ static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
 }
 
 static __be32
+nfsd4_decode_copy_notify(struct nfsd4_compoundargs *argp,
+			 struct nfsd4_copy_notify *cn)
+{
+	int status;
+
+	status = nfsd4_decode_stateid(argp, &cn->cpn_src_stateid);
+	if (status)
+		return status;
+	status = nfsd4_decode_nl4_server(argp, &cn->cpn_dst);
+	if (status)
+		return status;
+
+	return status;
+}
+
+static __be32
 nfsd4_decode_seek(struct nfsd4_compoundargs *argp, struct nfsd4_seek *seek)
 {
 	DECODE_HEAD;
@@ -1920,7 +1936,7 @@ static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
 	/* new operations for NFSv4.2 */
 	[OP_ALLOCATE]		= (nfsd4_dec)nfsd4_decode_fallocate,
 	[OP_COPY]		= (nfsd4_dec)nfsd4_decode_copy,
-	[OP_COPY_NOTIFY]	= (nfsd4_dec)nfsd4_decode_notsupp,
+	[OP_COPY_NOTIFY]	= (nfsd4_dec)nfsd4_decode_copy_notify,
 	[OP_DEALLOCATE]		= (nfsd4_dec)nfsd4_decode_fallocate,
 	[OP_IO_ADVISE]		= (nfsd4_dec)nfsd4_decode_notsupp,
 	[OP_LAYOUTERROR]	= (nfsd4_dec)nfsd4_decode_notsupp,
@@ -4350,6 +4366,52 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 }
 
 static __be32
+nfsd42_encode_nl4_server(struct nfsd4_compoundres *resp, struct nl4_server *ns)
+{
+	struct xdr_stream *xdr = &resp->xdr;
+	struct nfs42_netaddr *addr;
+	__be32 *p;
+
+	p = xdr_reserve_space(xdr, 4);
+	*p++ = cpu_to_be32(ns->nl4_type);
+
+	switch (ns->nl4_type) {
+	case NL4_NAME:
+	case NL4_URL:
+		p = xdr_reserve_space(xdr, 4 /* url or name len */ +
+				      (XDR_QUADLEN(ns->u.nl4_str_sz) * 4));
+		if (!p)
+			return nfserr_resource;
+		*p++ = cpu_to_be32(ns->u.nl4_str_sz);
+		p = xdr_encode_opaque_fixed(p, ns->u.nl4_str, ns->u.nl4_str_sz);
+			break;
+	case NL4_NETADDR:
+		addr = &ns->u.nl4_addr;
+
+		/** netid_len, netid, uaddr_len, uaddr (port included
+		 * in RPCBIND_MAXUADDRLEN)
+		 */
+		p = xdr_reserve_space(xdr,
+			4 /* netid len */ +
+			(XDR_QUADLEN(addr->na_netid_len) * 4) +
+			4 /* uaddr len */ +
+			(XDR_QUADLEN(addr->na_uaddr_len) * 4));
+		if (!p)
+			return nfserr_resource;
+
+		*p++ = cpu_to_be32(addr->na_netid_len);
+		p = xdr_encode_opaque_fixed(p, addr->na_netid,
+					    addr->na_netid_len);
+		*p++ = cpu_to_be32(addr->na_uaddr_len);
+		p = xdr_encode_opaque_fixed(p, addr->na_uaddr,
+					addr->na_uaddr_len);
+		break;
+	}
+
+	return 0;
+}
+
+static __be32
 nfsd4_encode_copy(struct nfsd4_compoundres *resp, __be32 nfserr,
 		  struct nfsd4_copy *copy)
 {
@@ -4370,6 +4432,52 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 }
 
 static __be32
+nfsd4_encode_copy_notify(struct nfsd4_compoundres *resp, __be32 nfserr,
+			 struct nfsd4_copy_notify *cn)
+{
+	struct xdr_stream *xdr = &resp->xdr;
+	struct nl4_server *ns;
+	__be32 *p;
+	int i;
+
+	if (nfserr)
+		return nfserr;
+
+	/* 8 sec, 4 nsec */
+	p = xdr_reserve_space(xdr, 12);
+	if (!p)
+		return nfserr_resource;
+
+	/* cnr_lease_time */
+	p = xdr_encode_hyper(p, cn->cpn_sec);
+	*p++ = cpu_to_be32(cn->cpn_nsec);
+
+	/* cnr_stateid */
+	nfserr = nfsd4_encode_stateid(xdr, &cn->cpn_src_stateid);
+	if (nfserr)
+		return nfserr;
+
+	/* cnr_src.nl_nsvr */
+	p = xdr_reserve_space(xdr, 4);
+	if (!p)
+		return nfserr_resource;
+
+	*p++ = cpu_to_be32(cn->cpn_src.nl_nsvr);
+
+	ns = cn->cpn_src.nl_svr;
+	for (i = 0; i < cn->cpn_src.nl_nsvr; i++) {
+		nfserr = nfsd42_encode_nl4_server(resp, ns);
+		if (nfserr)
+			return nfserr;
+		ns++;
+	}
+
+	/* allocated in nfsd4_copy_notify */
+	kfree(cn->cpn_src.nl_svr);
+	return nfserr;
+}
+
+static __be32
 nfsd4_encode_seek(struct nfsd4_compoundres *resp, __be32 nfserr,
 		  struct nfsd4_seek *seek)
 {
@@ -4469,7 +4577,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	/* NFSv4.2 operations */
 	[OP_ALLOCATE]		= (nfsd4_enc)nfsd4_encode_noop,
 	[OP_COPY]		= (nfsd4_enc)nfsd4_encode_copy,
-	[OP_COPY_NOTIFY]	= (nfsd4_enc)nfsd4_encode_noop,
+	[OP_COPY_NOTIFY]	= (nfsd4_enc)nfsd4_encode_copy_notify,
 	[OP_DEALLOCATE]		= (nfsd4_enc)nfsd4_encode_noop,
 	[OP_IO_ADVISE]		= (nfsd4_enc)nfsd4_encode_noop,
 	[OP_LAYOUTERROR]	= (nfsd4_enc)nfsd4_encode_noop,
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 6b1a61fc..5db7cd8 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -540,6 +540,18 @@ struct nfsd4_seek {
 	loff_t		seek_pos;
 };
 
+struct nfsd4_copy_notify {
+	/* request */
+	stateid_t		cpn_src_stateid;
+	struct nl4_server	cpn_dst;
+
+	/* response */
+	/* Note: cpn_src_stateid is used for cnr_stateid */
+	u64			cpn_sec;
+	u32			cpn_nsec;
+	struct nl4_servers	cpn_src;
+};
+
 struct nfsd4_op {
 	int					opnum;
 	__be32					status;
@@ -595,6 +607,7 @@ struct nfsd4_op {
 		struct nfsd4_fallocate		deallocate;
 		struct nfsd4_clone		clone;
 		struct nfsd4_copy		copy;
+		struct nfsd4_copy_notify	copy_notify;
 		struct nfsd4_seek		seek;
 	} u;
 	struct nfs4_replay *			replay;
-- 
1.8.3.1



* [RFC v1 03/18] NFSD generalize nfsd4_compound_state flag names
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 01/18] NFSD add ca_source_server<> to COPY Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 02/18] NFSD add COPY_NOTIFY operation Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh Olga Kornievskaia
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC
  To: bfields; +Cc: linux-nfs

From: Andy Adamson <andros@netapp.com>

Allow the sid_flags field to be used for non-stateid flags.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfsd/nfs4proc.c  | 8 ++++----
 fs/nfsd/nfs4state.c | 7 ++++---
 fs/nfsd/xdr4.h      | 6 +++---
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index bcd3c51..a680c8c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -522,9 +522,9 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
 		return nfserr_restorefh;
 
 	fh_dup2(&cstate->current_fh, &cstate->save_fh);
-	if (HAS_STATE_ID(cstate, SAVED_STATE_ID_FLAG)) {
+	if (HAS_CSTATE_FLAG(cstate, SAVED_STATE_ID_FLAG)) {
 		memcpy(&cstate->current_stateid, &cstate->save_stateid, sizeof(stateid_t));
-		SET_STATE_ID(cstate, CURRENT_STATE_ID_FLAG);
+		SET_CSTATE_FLAG(cstate, CURRENT_STATE_ID_FLAG);
 	}
 	return nfs_ok;
 }
@@ -537,9 +537,9 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
 		return nfserr_nofilehandle;
 
 	fh_dup2(&cstate->save_fh, &cstate->current_fh);
-	if (HAS_STATE_ID(cstate, CURRENT_STATE_ID_FLAG)) {
+	if (HAS_CSTATE_FLAG(cstate, CURRENT_STATE_ID_FLAG)) {
 		memcpy(&cstate->save_stateid, &cstate->current_stateid, sizeof(stateid_t));
-		SET_STATE_ID(cstate, SAVED_STATE_ID_FLAG);
+		SET_CSTATE_FLAG(cstate, SAVED_STATE_ID_FLAG);
 	}
 	return nfs_ok;
 }
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e9ef50a..1d307a7 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7093,7 +7093,8 @@ static int nfs4_state_create_net(struct net *net)
 static void
 get_stateid(struct nfsd4_compound_state *cstate, stateid_t *stateid)
 {
-	if (HAS_STATE_ID(cstate, CURRENT_STATE_ID_FLAG) && CURRENT_STATEID(stateid))
+	if (HAS_CSTATE_FLAG(cstate, CURRENT_STATE_ID_FLAG) &&
+	    CURRENT_STATEID(stateid))
 		memcpy(stateid, &cstate->current_stateid, sizeof(stateid_t));
 }
 
@@ -7102,14 +7103,14 @@ static int nfs4_state_create_net(struct net *net)
 {
 	if (cstate->minorversion) {
 		memcpy(&cstate->current_stateid, stateid, sizeof(stateid_t));
-		SET_STATE_ID(cstate, CURRENT_STATE_ID_FLAG);
+		SET_CSTATE_FLAG(cstate, CURRENT_STATE_ID_FLAG);
 	}
 }
 
 void
 clear_current_stateid(struct nfsd4_compound_state *cstate)
 {
-	CLEAR_STATE_ID(cstate, CURRENT_STATE_ID_FLAG);
+	CLEAR_CSTATE_FLAG(cstate, CURRENT_STATE_ID_FLAG);
 }
 
 /*
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 5db7cd8..38fcb4f 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -46,9 +46,9 @@
 #define CURRENT_STATE_ID_FLAG (1<<0)
 #define SAVED_STATE_ID_FLAG (1<<1)
 
-#define SET_STATE_ID(c, f) ((c)->sid_flags |= (f))
-#define HAS_STATE_ID(c, f) ((c)->sid_flags & (f))
-#define CLEAR_STATE_ID(c, f) ((c)->sid_flags &= ~(f))
+#define SET_CSTATE_FLAG(c, f) ((c)->sid_flags |= (f))
+#define HAS_CSTATE_FLAG(c, f) ((c)->sid_flags & (f))
+#define CLEAR_CSTATE_FLAG(c, f) ((c)->sid_flags &= ~(f))
 
 struct nfsd4_compound_state {
 	struct svc_fh		current_fh;
-- 
1.8.3.1



* [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (2 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 03/18] NFSD generalize nfsd4_compound_state flag names Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-09-01 20:23   ` J. Bruce Fields
  2017-03-02 16:01 ` [RFC v1 05/18] NFSD add nfs4 inter ssc to nfsd4_copy Olga Kornievskaia
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC
  To: bfields; +Cc: linux-nfs

From: Andy Adamson <andros@netapp.com>

In an inter server to server COPY, the source server filehandle is
guaranteed to be stale on the destination server, because the COPY
compound is sent to the destination server.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfsd/nfs4proc.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 fs/nfsd/nfs4xdr.c  | 26 +++++++++++++++++++++++++-
 fs/nfsd/nfsd.h     |  2 ++
 fs/nfsd/xdr4.h     |  4 ++++
 4 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a680c8c..733a9aa 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -496,11 +496,19 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
 nfsd4_putfh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	    struct nfsd4_putfh *putfh)
 {
+	__be32 ret;
+
 	fh_put(&cstate->current_fh);
 	cstate->current_fh.fh_handle.fh_size = putfh->pf_fhlen;
 	memcpy(&cstate->current_fh.fh_handle.fh_base, putfh->pf_fhval,
 	       putfh->pf_fhlen);
-	return fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
+	ret = fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
+	if (ret == nfserr_stale && HAS_CSTATE_FLAG(cstate, NO_VERIFY_FH)) {
+		CLEAR_CSTATE_FLAG(cstate, NO_VERIFY_FH);
+		SET_CSTATE_FLAG(cstate, IS_STALE_FH);
+		ret = 0;
+	}
+	return ret;
 }
 
 static __be32
@@ -533,6 +541,16 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
 nfsd4_savefh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	     void *arg)
 {
+	/**
+	* This is either an inter COPY (most likely) or an intra COPY with a
+	* stale file handle. If the latter, nfsd4_copy will reset the PUTFH to
+	* return nfserr_stale. No fh_dentry, just copy the file handle
+	* to use with the inter COPY READ.
+	*/
+	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
+		cstate->save_fh = cstate->current_fh;
+		return nfs_ok;
+	}
 	if (!cstate->current_fh.fh_dentry)
 		return nfserr_nofilehandle;
 
@@ -1067,6 +1085,13 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 	if (status)
 		goto out;
 
+	/* Intra copy source fh is stale. PUTFH will fail with ESTALE */
+	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
+		CLEAR_CSTATE_FLAG(cstate, IS_STALE_FH);
+		cstate->status = nfserr_copy_stalefh;
+		goto out_put;
+	}
+
 	bytes = nfsd_copy_file_range(src, copy->cp_src_pos,
 			dst, copy->cp_dst_pos, copy->cp_count);
 
@@ -1081,6 +1106,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 		status = nfs_ok;
 	}
 
+out_put:
 	fput(src);
 	fput(dst);
 out:
@@ -1776,6 +1802,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
 	struct nfsd4_compound_state *cstate = &resp->cstate;
 	struct svc_fh *current_fh = &cstate->current_fh;
 	struct svc_fh *save_fh = &cstate->save_fh;
+	int		i;
 	__be32		status;
 
 	svcxdr_init_encode(rqstp, resp);
@@ -1808,6 +1835,12 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
 		goto encode_op;
 	}
 
+	/* NFSv4.2 COPY source file handle may be from a different server */
+	for (i = 0; i < args->opcnt; i++) {
+		op = &args->ops[i];
+		if (op->opnum == OP_COPY)
+			SET_CSTATE_FLAG(cstate, NO_VERIFY_FH);
+	}
 	while (!status && resp->opcnt < args->opcnt) {
 		op = &args->ops[resp->opcnt++];
 
@@ -1827,6 +1860,9 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
 
 		opdesc = OPDESC(op);
 
+		if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
+			goto call_op;
+
 		if (!current_fh->fh_dentry) {
 			if (!(opdesc->op_flags & ALLOWED_WITHOUT_FH)) {
 				op->status = nfserr_nofilehandle;
@@ -1861,6 +1897,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
 
 		if (opdesc->op_get_currentstateid)
 			opdesc->op_get_currentstateid(cstate, &op->u);
+call_op:
 		op->status = opdesc->op_func(rqstp, cstate, &op->u);
 
 		if (!op->status) {
@@ -1881,6 +1918,14 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
 			status = op->status;
 			goto out;
 		}
+		/* Only from intra COPY */
+		if (cstate->status == nfserr_copy_stalefh) {
+			dprintk("%s NFS4.2 intra COPY stale src filehandle\n",
+				__func__);
+			status = nfserr_stale;
+			nfsd4_adjust_encode(resp);
+			goto out;
+		}
 		if (op->status == nfserr_replay_me) {
 			op->replay = &cstate->replay_owner->so_replay;
 			nfsd4_encode_replay(&resp->xdr, op);
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index c632156..328ff9c 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -4619,15 +4619,28 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
 	return nfserr_rep_too_big;
 }
 
+/** Rewind the encoding to return nfserr_stale on the PUTFH
+ * in this failed Intra COPY compound
+ */
+void
+nfsd4_adjust_encode(struct nfsd4_compoundres *resp)
+{
+	__be32 *p;
+
+	p = resp->cstate.putfh_errp;
+	*p++ = nfserr_stale;
+}
+
 void
 nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
 {
 	struct xdr_stream *xdr = &resp->xdr;
 	struct nfs4_stateowner *so = resp->cstate.replay_owner;
+	struct nfsd4_compound_state *cstate = &resp->cstate;
 	struct svc_rqst *rqstp = resp->rqstp;
 	int post_err_offset;
 	nfsd4_enc encoder;
-	__be32 *p;
+	__be32 *p, *statp;
 
 	p = xdr_reserve_space(xdr, 8);
 	if (!p) {
@@ -4636,9 +4649,20 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
 	}
 	*p++ = cpu_to_be32(op->opnum);
 	post_err_offset = xdr->buf->len;
+	statp = p;
 
 	if (op->opnum == OP_ILLEGAL)
 		goto status;
+
+	/** This is a COPY compound with a stale source server file handle.
+	 * If OP_COPY processing determines that this is an intra server to
+	 * server COPY, then this PUTFH should return nfserr_stale so the
+	 * putfh_errp will be set to nfserr_stale. If this is an inter server
+	 * to server COPY, ignore the nfserr_stale.
+	 */
+	if (op->opnum == OP_PUTFH && HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
+		cstate->putfh_errp = statp;
+
 	BUG_ON(op->opnum < 0 || op->opnum >= ARRAY_SIZE(nfsd4_enc_ops) ||
 	       !nfsd4_enc_ops[op->opnum]);
 	encoder = nfsd4_enc_ops[op->opnum];
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index d966068..8d6fb0f 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -272,6 +272,8 @@ static inline bool nfsd4_spo_must_allow(struct svc_rqst *rqstp)
 #define	nfserr_replay_me	cpu_to_be32(11001)
 /* nfs41 replay detected */
 #define	nfserr_replay_cache	cpu_to_be32(11002)
+/* nfs42 intra copy failed with nfserr_stale */
+#define nfserr_copy_stalefh	cpu_to_be32(1103)
 
 /* Check for dir entries '.' and '..' */
 #define isdotent(n, l)	(l < 3 && n[0] == '.' && (l == 1 || n[1] == '.'))
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 38fcb4f..aa94295 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -45,6 +45,8 @@
 
 #define CURRENT_STATE_ID_FLAG (1<<0)
 #define SAVED_STATE_ID_FLAG (1<<1)
+#define NO_VERIFY_FH (1<<2)
+#define IS_STALE_FH  (1<<3)
 
 #define SET_CSTATE_FLAG(c, f) ((c)->sid_flags |= (f))
 #define HAS_CSTATE_FLAG(c, f) ((c)->sid_flags & (f))
@@ -63,6 +65,7 @@ struct nfsd4_compound_state {
 	size_t			iovlen;
 	u32			minorversion;
 	__be32			status;
+	__be32			*putfh_errp;
 	stateid_t	current_stateid;
 	stateid_t	save_stateid;
 	/* to indicate current and saved state id presents */
@@ -705,6 +708,7 @@ int nfs4svc_decode_compoundargs(struct svc_rqst *, __be32 *,
 int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *,
 		struct nfsd4_compoundres *);
 __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
+void nfsd4_adjust_encode(struct nfsd4_compoundres *);
 void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *);
 void nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op);
 __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
-- 
1.8.3.1



* [RFC v1 05/18] NFSD add nfs4 inter ssc to nfsd4_copy
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (3 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 06/18] NFSD return nfs4_stid in nfs4_preprocess_stateid_op Olga Kornievskaia
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC
  To: bfields; +Cc: linux-nfs

Given a universal address, mount the source server from the destination
server using an internal mount. Call the NFS client's nfs42_ssc_open to
obtain the NFS struct file suitable for nfsd_copy_file_range.

Add Kconfig dependencies for inter server to server copy.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfsd/Kconfig      |  10 ++
 fs/nfsd/nfs4proc.c   | 270 +++++++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/nfs4.h |   1 +
 3 files changed, 271 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
index 20b1c17..37ff3d5 100644
--- a/fs/nfsd/Kconfig
+++ b/fs/nfsd/Kconfig
@@ -131,6 +131,16 @@ config NFSD_FLEXFILELAYOUT
 
 	  If unsure, say N.
 
+config NFSD_V4_2_INTER_SSC
+	bool "NFSv4.2 inter server to server COPY"
+	depends on NFSD_V4 && NFS_V4_1 && NFS_V4_2
+	help
+	  This option enables support for NFSv4.2 inter server to
+	  server copy where the destination server calls the NFSv4.2
+	  client to read the data to copy from the source server.
+
+	  If unsure, say N.
+
 config NFSD_V4_SECURITY_LABEL
 	bool "Provide Security Label support for NFSv4 server"
 	depends on NFSD_V4 && SECURITY
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 733a9aa..52a21bc 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1072,16 +1072,235 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 	return status;
 }
 
+#ifdef CONFIG_NFSD_V4_2_INTER_SSC
+
+extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
+				   struct nfs_fh *src_fh,
+				   nfs4_stateid *stateid);
+extern struct file *nfs42_ssc_close(struct file *filep);
+
+extern void nfs_sb_deactive(struct super_block *sb);
+
+#define NFSD42_INTERSSC_RAWDATA "minorversion=1,vers=4,addr=%s,clientaddr=%s"
+
+/**
+ * Support one copy source server for now.
+ */
+static struct vfsmount *
+nfsd4_interssc_connect(struct nl4_servers *nss, struct svc_rqst *rqstp)
+{
+	struct file_system_type *type;
+	struct vfsmount *ss_mnt;
+	struct nfs42_netaddr *naddr;
+	struct sockaddr_storage tmp_addr;
+	size_t tmp_addrlen, match_netid_len = 3;
+	char *startsep = "", *endsep = "", *match_netid = "tcp";
+	char *ipaddr, *ipaddr2, *raw_data;
+	int len, raw_len, status = -EINVAL;
+
+	/* Currently only one NL4_NETADDR source server is supported */
+	if (nss->nl_svr->nl4_type != NL4_NETADDR) {
+		WARN(nss->nl_svr->nl4_type != NL4_NETADDR,
+			"nfsd4_copy src server not NL4_NETADDR\n");
+		goto out_err;
+	}
+
+	naddr = &nss->nl_svr->u.nl4_addr;
+
+	tmp_addrlen = rpc_uaddr2sockaddr(SVC_NET(rqstp), naddr->na_uaddr,
+					naddr->na_uaddr_len,
+					(struct sockaddr *)&tmp_addr,
+					sizeof(tmp_addr));
+	if (tmp_addrlen == 0)
+		goto out_err;
+
+	if (tmp_addr.ss_family == AF_INET6) {
+		startsep = "[";
+		endsep = "]";
+		match_netid = "tcp6";
+		match_netid_len = 4;
+	}
+
+	if (naddr->na_netid_len != match_netid_len ||
+	    strncmp(naddr->na_netid, match_netid, naddr->na_netid_len))
+		goto out_err;
+
+	/* Construct the raw data for the vfs_kern_mount call */
+	len = RPC_MAX_ADDRBUFLEN + 1;
+	ipaddr = kzalloc(len, GFP_KERNEL);
+	if (!ipaddr)
+		goto out_err;
+
+	rpc_ntop((struct sockaddr *)&tmp_addr, ipaddr, len);
+
+	/* 2 for IPv6 startsep and endsep, 3 for ":/" and trailing '\0' */
+	ipaddr2 = kzalloc(len + 5, GFP_KERNEL);
+	if (!ipaddr2)
+		goto out_free_ipaddr;
+
+	rpc_ntop((struct sockaddr *)&rqstp->rq_daddr, ipaddr2, len + 5);
+
+	raw_len = strlen(NFSD42_INTERSSC_RAWDATA) + strlen(ipaddr) +
+			strlen(ipaddr2);
+	raw_data = kzalloc(raw_len, GFP_KERNEL);
+	if (!raw_data)
+		goto out_free_ipaddr2;
+
+	snprintf(raw_data, raw_len, NFSD42_INTERSSC_RAWDATA, ipaddr,
+		 ipaddr2);
+
+	status = -ENODEV;
+	type = get_fs_type("nfs");
+	if (!type)
+		goto out_free_rawdata;
+
+	/* Set the server:<export> for the vfs_kern_mount call */
+	memset(ipaddr2, 0, len + 5);
+	snprintf(ipaddr2, len + 5, "%s%s%s:/", startsep, ipaddr, endsep);
+
+	dprintk("%s  Raw mount data:  %s server:export %s\n", __func__,
+		raw_data, ipaddr2);
+
+	/* Use an 'internal' mount: MS_KERNMOUNT -> MNT_INTERNAL */
+	ss_mnt = vfs_kern_mount(type, MS_KERNMOUNT, ipaddr2, raw_data);
+	if (IS_ERR(ss_mnt)) {
+		status = PTR_ERR(ss_mnt);
+		goto out_free_rawdata;
+	}
+
+	kfree(raw_data);
+	kfree(ipaddr2);
+	kfree(ipaddr);
+
+	return ss_mnt;
+
+out_free_rawdata:
+	kfree(raw_data);
+out_free_ipaddr2:
+	kfree(ipaddr2);
+out_free_ipaddr:
+	kfree(ipaddr);
+out_err:
+	dprintk("--> %s ERROR %d\n", __func__, status);
+	return ERR_PTR(status);
+}
+
+static void
+nfsd4_interssc_disconnect(struct vfsmount *ss_mnt)
+{
+	struct super_block *sb = ss_mnt->mnt_sb;
+
+	if (!list_empty(&sb->s_inodes)) {
+		dprintk("[%s]: super block=%p has active inodes.\n",
+			__func__, sb);
+		return;
+	}
+
+	mntput(ss_mnt);
+	nfs_sb_deactive(sb);
+}
+
+/**
+ * nfsd4_setup_inter_ssc
+ *
+ * Verify COPY destination stateid.
+ * Connect to the source server with NFSv4.1.
+ * Create the source struct file for nfsd_copy_file_range.
+ * Called with COPY cstate:
+ *    SAVED_FH: source filehandle
+ *    CURRENT_FH: destination filehandle
+ *
+ * Returns the source server vfsmount, or an ERR_PTR errno (not nfserrxxx)
+ */
+static struct vfsmount *
+nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
+			struct nfsd4_compound_state *cstate,
+			struct nfsd4_copy *copy, struct file **src,
+			struct file **dst)
+{
+	struct svc_fh *s_fh = NULL;
+	stateid_t *s_stid = &copy->cp_src_stateid;
+	struct nfs_fh fh;
+	nfs4_stateid stateid;
+	struct file *filp;
+	struct vfsmount *ss_mnt;
+	__be32 status;
+
+	/* Verify the destination stateid and set dst struct file*/
+	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
+					&copy->cp_dst_stateid,
+					WR_STATE, dst, NULL);
+	if (status) {
+		ss_mnt = ERR_PTR(be32_to_cpu(status));
+		goto out;
+	}
+
+	/* Inter copy source fh is always stale */
+	CLEAR_CSTATE_FLAG(cstate, IS_STALE_FH);
+
+	ss_mnt = nfsd4_interssc_connect(&copy->cp_src, rqstp);
+	if (IS_ERR(ss_mnt))
+		goto out;
+
+	s_fh = &cstate->save_fh;
+
+	fh.size = s_fh->fh_handle.fh_size;
+	memcpy(fh.data, &s_fh->fh_handle.fh_base, fh.size);
+	stateid.seqid = s_stid->si_generation;
+	memcpy(stateid.other, (void *)&s_stid->si_opaque,
+		sizeof(stateid_opaque_t));
+
+	filp =  nfs42_ssc_open(ss_mnt, &fh, &stateid);
+	if (IS_ERR(filp)) {
+		nfsd4_interssc_disconnect(ss_mnt);
+		return ERR_CAST(filp);
+	}
+	*src = filp;
+out:
+	return ss_mnt;
+}
+
+static void
+nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt, struct file *src,
+			struct file *dst)
+{
+	nfs42_ssc_close(src);
+	fput(src);
+	fput(dst);
+
+	nfsd4_interssc_disconnect(ss_mnt);
+
+}
+
+#else /* CONFIG_NFSD_V4_2_INTER_SSC */
+
+static struct vfsmount *
+nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
+			struct nfsd4_compound_state *cstate,
+			struct nfsd4_copy *copy, struct file **src,
+			struct file **dst)
+{
+	return ERR_PTR(-EINVAL);
+}
+
+static void
+nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt, struct file *src,
+			struct file *dst)
+{
+}
+
+#endif /* CONFIG_NFSD_V4_2_INTER_SSC */
+
 static __be32
-nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
-		struct nfsd4_copy *copy)
+nfsd4_setup_intra_ssc(struct svc_rqst *rqstp,
+		      struct nfsd4_compound_state *cstate,
+		      struct nfsd4_copy *copy, struct file **src,
+		      struct file **dst)
 {
-	struct file *src, *dst;
 	__be32 status;
-	ssize_t bytes;
 
-	status = nfsd4_verify_copy(rqstp, cstate, &copy->cp_src_stateid, &src,
-				   &copy->cp_dst_stateid, &dst);
+	status = nfsd4_verify_copy(rqstp, cstate, &copy->cp_src_stateid, src,
+				   &copy->cp_dst_stateid, dst);
 	if (status)
 		goto out;
 
@@ -1089,7 +1308,37 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
 		CLEAR_CSTATE_FLAG(cstate, IS_STALE_FH);
 		cstate->status = nfserr_copy_stalefh;
-		goto out_put;
+	}
+out:
+	return status;
+}
+
+static void
+nfsd4_cleanup_intra_ssc(struct file *src, struct file *dst)
+{
+	fput(src);
+	fput(dst);
+}
+
+static __be32
+nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+	struct nfsd4_copy *copy)
+{
+	struct vfsmount *ss_mnt = NULL;
+	struct file *src, *dst;
+	__be32 status;
+	ssize_t bytes;
+
+	if (copy->cp_src.nl_nsvr > 0) { /* Inter server SSC */
+		ss_mnt = nfsd4_setup_inter_ssc(rqstp, cstate, copy, &src, &dst);
+		if (IS_ERR(ss_mnt)) {
+			status = nfserrno(PTR_ERR(ss_mnt));
+			goto out;
+		}
+	} else {
+		status = nfsd4_setup_intra_ssc(rqstp, cstate, copy, &src, &dst);
+		if (status)
+			goto out;
 	}
 
 	bytes = nfsd_copy_file_range(src, copy->cp_src_pos,
@@ -1106,9 +1355,10 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 		status = nfs_ok;
 	}
 
-out_put:
-	fput(src);
-	fput(dst);
+	if (copy->cp_src.nl_nsvr > 0)   /* Inter server SSC */
+		nfsd4_cleanup_inter_ssc(ss_mnt, src, dst);
+	else
+		nfsd4_cleanup_intra_ssc(src, dst);
 out:
 	return status;
 }
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index e43c106..c623486 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -15,6 +15,7 @@
 #include <linux/list.h>
 #include <linux/uidgid.h>
 #include <uapi/linux/nfs4.h>
+#include <linux/nfs.h>
 #include <linux/sunrpc/msg_prot.h>
 
 enum nfs4_acl_whotype {
-- 
1.8.3.1



* [RFC v1 06/18] NFSD return nfs4_stid in nfs4_preprocess_stateid_op
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (4 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 05/18] NFSD add nfs4 inter ssc to nfsd4_copy Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 07/18] NFSD Unique stateid_t for inter server to server COPY authentication Olga Kornievskaia
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

From: Andy Adamson <andros@netapp.com>

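Add an optional nfs4_stid output argument to nfs4_preprocess_stateid_op().
nfsd4_copy_notify needs it to add an nfs4_cp_state to the nfs4_stid; all
other callers pass NULL.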

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfsd/nfs4proc.c  | 22 +++++++++++++---------
 fs/nfsd/nfs4state.c |  8 ++++++--
 fs/nfsd/state.h     |  3 ++-
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 52a21bc..b079904 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -780,7 +780,8 @@ static __be32 nfsd4_do_lookupp(struct svc_rqst *rqstp, struct svc_fh *fh)
 	/* check stateid */
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					&read->rd_stateid, RD_STATE,
-					&read->rd_filp, &read->rd_tmp_file);
+					&read->rd_filp, &read->rd_tmp_file,
+					NULL);
 	if (status) {
 		dprintk("NFSD: nfsd4_read: couldn't process stateid!\n");
 		goto out;
@@ -926,7 +927,7 @@ static __be32 nfsd4_do_lookupp(struct svc_rqst *rqstp, struct svc_fh *fh)
 	if (setattr->sa_iattr.ia_valid & ATTR_SIZE) {
 		status = nfs4_preprocess_stateid_op(rqstp, cstate,
 				&cstate->current_fh, &setattr->sa_stateid,
-				WR_STATE, NULL, NULL);
+				WR_STATE, NULL, NULL, NULL);
 		if (status) {
 			dprintk("NFSD: nfsd4_setattr: couldn't process stateid!\n");
 			return status;
@@ -991,7 +992,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 		return nfserr_inval;
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
-						stateid, WR_STATE, &filp, NULL);
+					stateid, WR_STATE, &filp, NULL, NULL);
 	if (status) {
 		dprintk("NFSD: nfsd4_write: couldn't process stateid!\n");
 		return status;
@@ -1022,14 +1023,16 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 	__be32 status;
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->save_fh,
-					    src_stateid, RD_STATE, src, NULL);
+					    src_stateid, RD_STATE, src, NULL,
+					    NULL);
 	if (status) {
 		dprintk("NFSD: %s: couldn't process src stateid!\n", __func__);
 		goto out;
 	}
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
-					    dst_stateid, WR_STATE, dst, NULL);
+					    dst_stateid, WR_STATE, dst, NULL,
+					    NULL);
 	if (status) {
 		dprintk("NFSD: %s: couldn't process dst stateid!\n", __func__);
 		goto out_put_src;
@@ -1229,7 +1232,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	/* Verify the destination stateid and set dst struct file*/
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					&copy->cp_dst_stateid,
-					WR_STATE, dst, NULL);
+					WR_STATE, dst, NULL, NULL);
 	if (status) {
 		ss_mnt = ERR_PTR(be32_to_cpu(status));
 		goto out;
@@ -1420,10 +1423,11 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	__be32 status;
 	struct file *src = NULL;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+	struct nfs4_stid *stid;
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					&cn->cpn_src_stateid, RD_STATE, &src,
-					NULL);
+					NULL, &stid);
 	if (status)
 		return status;
 
@@ -1466,7 +1470,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					    &fallocate->falloc_stateid,
-					    WR_STATE, &file, NULL);
+					    WR_STATE, &file, NULL, NULL);
 	if (status != nfs_ok) {
 		dprintk("NFSD: nfsd4_fallocate: couldn't process stateid!\n");
 		return status;
@@ -1505,7 +1509,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					    &seek->seek_stateid,
-					    RD_STATE, &file, NULL);
+					    RD_STATE, &file, NULL, NULL);
 	if (status) {
 		dprintk("NFSD: nfsd4_seek: couldn't process stateid!\n");
 		return status;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1d307a7..5ebd992 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4945,7 +4945,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 __be32
 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *cstate, struct svc_fh *fhp,
-		stateid_t *stateid, int flags, struct file **filpp, bool *tmp_file)
+		stateid_t *stateid, int flags, struct file **filpp,
+		bool *tmp_file, struct nfs4_stid **cstid)
 {
 	struct inode *ino = d_inode(fhp->fh_dentry);
 	struct net *net = SVC_NET(rqstp);
@@ -4996,8 +4997,11 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 	if (!status && filpp)
 		status = nfs4_check_file(rqstp, fhp, s, filpp, tmp_file, flags);
 out:
-	if (s)
+	if (s) {
+		if (!status && cstid)
+			*cstid = s;
 		nfs4_put_stid(s);
+	}
 	return status;
 }
 
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 005c911..f5ab89a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -599,7 +599,8 @@ struct nfsd4_blocked_lock {
 
 extern __be32 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *cstate, struct svc_fh *fhp,
-		stateid_t *stateid, int flags, struct file **filp, bool *tmp_file);
+		stateid_t *stateid, int flags, struct file **filp,
+		bool *tmp_file, struct nfs4_stid **cstid);
 __be32 nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 		     stateid_t *stateid, unsigned char typemask,
 		     struct nfs4_stid **s, struct nfsd_net *nn);
-- 
1.8.3.1



* [RFC v1 07/18] NFSD Unique stateid_t for inter server to server COPY authentication
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (5 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 06/18] NFSD return nfs4_stid in nfs4_preprocess_stateid_op Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 08/18] NFSD CB_OFFLOAD xdr Olga Kornievskaia
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

A unique per-COPY stateid_t is required to authenticate the READ that
the destination server (acting as a client) issues to the source server.

Multiple concurrent inter server to server copies of the same source file
are also supported.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfsd/netns.h     |   8 ++++
 fs/nfsd/nfs4proc.c  |  29 +++++++++-----
 fs/nfsd/nfs4state.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/nfs4xdr.c   |   2 +-
 fs/nfsd/nfsctl.c    |   2 +
 fs/nfsd/state.h     |  18 +++++++++
 fs/nfsd/xdr4.h      |   2 +-
 7 files changed, 158 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 3714231..2c88a95 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -119,6 +119,14 @@ struct nfsd_net {
 	u32 clverifier_counter;
 
 	struct svc_serv *nfsd_serv;
+
+	/*
+	 * clientid and stateid data for construction of net unique COPY
+	 * stateids.
+	 */
+	u32		s2s_cp_cl_id;
+	struct idr	s2s_cp_stateids;
+	spinlock_t	s2s_cp_lock;
 };
 
 /* Simple check to find out if a given net was properly initialized */
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index b079904..feea20b 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1416,6 +1416,13 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	return 0;
 }
 
+/*
+ * Use a unique stateid_t as the cnr_stateid so that the source server
+ * can authenticate the inter server to server copy READ from the
+ * destination server.
+ *
+ * Set the cnr_leasetime to the nfsd4_lease.
+ */
 static __be32
 nfsd4_copy_notify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		  struct nfsd4_copy_notify *cn)
@@ -1424,6 +1431,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	struct file *src = NULL;
 	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 	struct nfs4_stid *stid;
+	struct nfs4_cp_state *cps;
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					&cn->cpn_src_stateid, RD_STATE, &src,
@@ -1434,12 +1442,11 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	cn->cpn_sec = nn->nfsd4_lease;
 	cn->cpn_nsec = 0;
 
-
-	/** XXX Save cpn_src_statid, cpn_src, and any other returned source
-	 * server addresses on which the source server is williing to accept
-	 * connections from the destination e.g. what is returned in cpn_src,
-	 * to verify READ from dest server.
-	 */
+	status = nfserrno(-ENOMEM);
+	cps = nfs4_alloc_init_cp_state(nn, nn->nfsd4_lease, stid);
+	if (!cps)
+		goto out;
+	memcpy(&cn->cpn_cnr_stateid, &cps->cp_stateid, sizeof(stateid_t));
 
 	/**
 	 * For now, only return one server address in cpn_src, the
@@ -1448,15 +1455,19 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	cn->cpn_src.nl_nsvr = 1;
 
 	status = nfsd4_set_src_nl4_netaddr(rqstp, &cn->cpn_src);
-	if (status != 0)
+	if (status != 0) {
+		nfs4_free_cp_state(cps);
 		goto out;
+	}
 
-	dprintk("<-- %s cpn_dst %s:%s nl_nsvr %d nl_svr %s:%s\n", __func__,
+	dprintk("<-- %s cpn_dst %s:%s nl_nsvr %d nl_svr %s:%s so_id %d\n",
+		__func__,
 		cn->cpn_dst.u.nl4_addr.na_netid,
 		cn->cpn_dst.u.nl4_addr.na_uaddr,
 		cn->cpn_src.nl_nsvr,
 		cn->cpn_src.nl_svr->u.nl4_addr.na_netid,
-		cn->cpn_src.nl_svr->u.nl4_addr.na_uaddr);
+		cn->cpn_src.nl_svr->u.nl4_addr.na_uaddr,
+		cn->cpn_cnr_stateid.si_opaque.so_id);
 out:
 	return status;
 }
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5ebd992..606009a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -658,6 +658,7 @@ struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *sla
 	/* Will be incremented before return to client: */
 	atomic_set(&stid->sc_count, 1);
 	spin_lock_init(&stid->sc_lock);
+	INIT_LIST_HEAD(&stid->sc_cp_list);
 
 	/*
 	 * It shouldn't be a problem to reuse an opaque stateid value.
@@ -674,6 +675,106 @@ struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *sla
 	return NULL;
 }
 
+/*
+ * Create a unique stateid_t to represent each COPY - returned by
+ * COPY_NOTIFY cnr_stateid and used in the copy READ from the
+ * destination server. Hang the copy stateids off the OPEN/LOCK/DELEG
+ * stateid from the client open of the source file. Keep track of other
+ * copy state such as the cnr_leasetime.
+ */
+struct nfs4_cp_state *nfs4_alloc_init_cp_state(struct nfsd_net *nn,
+						u64 cpn_sec,
+						struct nfs4_stid *p_stid)
+{
+	struct nfs4_cp_state *cps;
+	int new_id;
+
+	cps = kzalloc(sizeof(struct nfs4_cp_state), GFP_KERNEL);
+	if (!cps)
+		return NULL;
+	idr_preload(GFP_KERNEL);
+	spin_lock(&nn->s2s_cp_lock);
+	new_id = idr_alloc_cyclic(&nn->s2s_cp_stateids, cps, 0, 0, GFP_NOWAIT);
+	spin_unlock(&nn->s2s_cp_lock);
+	idr_preload_end();
+	if (new_id < 0)
+		goto out_free;
+	cps->cp_stateid.si_opaque.so_id = new_id;
+	cps->cp_stateid.si_opaque.so_clid.cl_boot = nn->boot_time;
+	cps->cp_stateid.si_opaque.so_clid.cl_id = nn->s2s_cp_cl_id;
+	cps->cp_p_stid = p_stid;
+	cps->cp_active = false;
+	INIT_LIST_HEAD(&cps->cp_list);
+	cps->cp_timeout = jiffies + (cpn_sec * HZ);
+	list_add(&cps->cp_list, &p_stid->sc_cp_list);
+
+	return cps;
+out_free:
+	kfree(cps);
+	return NULL;
+}
+
+void nfs4_free_cp_state(struct nfs4_cp_state *cps)
+{
+	struct nfsd_net *nn;
+
+	dprintk("--> %s freeing cp_state so_id %d\n", __func__,
+		cps->cp_stateid.si_opaque.so_id);
+
+	nn = net_generic(cps->cp_p_stid->sc_client->net, nfsd_net_id);
+	spin_lock(&nn->s2s_cp_lock);
+	idr_remove(&nn->s2s_cp_stateids, cps->cp_stateid.si_opaque.so_id);
+	spin_unlock(&nn->s2s_cp_lock);
+
+	kfree(cps);
+}
+
+static void nfs4_free_cp_statelist(struct nfs4_stid *stid)
+{
+	struct nfs4_cp_state *cps;
+
+	might_sleep();
+
+	while (!list_empty(&stid->sc_cp_list)) {
+		cps = list_first_entry(&stid->sc_cp_list, struct nfs4_cp_state,
+				       cp_list);
+		list_del(&cps->cp_list);
+		nfs4_free_cp_state(cps);
+	}
+}
+
+/*
+ * A READ from an inter server to server COPY will have a
+ * copy stateid. Return the parent nfs4_stid.
+ */
+static __be32 find_cp_state_parent(struct nfsd_net *nn, stateid_t *st,
+				   struct nfs4_stid **stid)
+{
+	struct nfs4_cp_state *cps = NULL;
+
+	if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
+		return nfserr_bad_stateid;
+	spin_lock(&nn->s2s_cp_lock);
+	cps = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
+	spin_unlock(&nn->s2s_cp_lock);
+	if (!cps) {
+		pr_info("NFSD: find_cp_state cl_id %d so_id %d NOT FOUND\n",
+			st->si_opaque.so_clid.cl_id, st->si_opaque.so_id);
+		return nfserr_bad_stateid;
+	}
+
+	/* Did the inter server to server copy start in time? */
+	if (cps->cp_active == false && !time_after(cps->cp_timeout, jiffies))
+		return nfserr_partner_no_auth;
+	else
+		cps->cp_active = true;
+
+	*stid = cps->cp_p_stid;
+	atomic_inc(&cps->cp_p_stid->sc_count);
+
+	return nfs_ok;
+}
+
 static struct nfs4_ol_stateid * nfs4_alloc_open_stateid(struct nfs4_client *clp)
 {
 	struct nfs4_stid *stid;
@@ -819,6 +920,9 @@ static void block_delegations(struct knfsd_fh *fh)
 	}
 	idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
 	spin_unlock(&clp->cl_lock);
+
+	nfs4_free_cp_statelist(s);
+
 	s->sc_free(s);
 	if (fp)
 		put_nfs4_file(fp);
@@ -4970,6 +5074,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 	status = nfsd4_lookup_stateid(cstate, stateid,
 				NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
 				&s, nn);
+	if (status == nfserr_bad_stateid)
+		status = find_cp_state_parent(nn, stateid, &s);
 	if (status)
 		return status;
 	status = check_stateid_generation(stateid, &s->sc_stateid,
@@ -6943,6 +7049,8 @@ static int nfs4_state_create_net(struct net *net)
 	INIT_LIST_HEAD(&nn->close_lru);
 	INIT_LIST_HEAD(&nn->del_recall_lru);
 	spin_lock_init(&nn->client_lock);
+	spin_lock_init(&nn->s2s_cp_lock);
+	idr_init(&nn->s2s_cp_stateids);
 
 	spin_lock_init(&nn->blocked_locks_lock);
 	INIT_LIST_HEAD(&nn->blocked_locks_lru);
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 328ff9c..3235796 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -4453,7 +4453,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	*p++ = cpu_to_be32(cn->cpn_nsec);
 
 	/* cnr_stateid */
-	nfserr = nfsd4_encode_stateid(xdr, &cn->cpn_src_stateid);
+	nfserr = nfsd4_encode_stateid(xdr, &cn->cpn_cnr_stateid);
 	if (nfserr)
 		return nfserr;
 
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 73e75ac..ebe57bb 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1218,6 +1218,8 @@ static __net_init int nfsd_init_net(struct net *net)
 	nn->nfsd4_grace = 90;
 	nn->clverifier_counter = prandom_u32();
 	nn->clientid_counter = prandom_u32();
+	nn->s2s_cp_cl_id = nn->clientid_counter++;
+	pr_info("%s s2s_cp_cl_id %d\n", __func__, nn->s2s_cp_cl_id);
 	return 0;
 
 out_idmap_error:
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index f5ab89a..c084c57 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -93,6 +93,7 @@ struct nfs4_stid {
 #define NFS4_REVOKED_DELEG_STID 16
 #define NFS4_CLOSED_DELEG_STID 32
 #define NFS4_LAYOUT_STID 64
+	struct list_head	sc_cp_list;
 	unsigned char		sc_type;
 	stateid_t		sc_stateid;
 	spinlock_t		sc_lock;
@@ -102,6 +103,20 @@ struct nfs4_stid {
 };
 
 /*
+ * An inter server to server copy stateid that is unique per nfsd_net.
+ * cp_stateid is returned as the COPY_NOTIFY cnr_stateid and presented
+ * by the destination server to the source server as a COPY authenticator.
+ * Used to look up the parent COPY_NOTIFY cna_src_stateid nfs4_stid.
+ */
+struct nfs4_cp_state {
+	stateid_t		cp_stateid;
+	struct list_head	cp_list;	/* per parent nfs4_stid */
+	struct nfs4_stid	*cp_p_stid;	/* pointer to parent */
+	bool			cp_active;	/* has the copy started */
+	unsigned long		cp_timeout;	/* copy timeout */
+};
+
+/*
  * Represents a delegation stateid. The nfs4_client holds references to these
  * and they are put when it is being destroyed or when the delegation is
  * returned by the client:
@@ -606,6 +621,9 @@ __be32 nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 		     struct nfs4_stid **s, struct nfsd_net *nn);
 struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *slab,
 				  void (*sc_free)(struct nfs4_stid *));
+struct nfs4_cp_state *nfs4_alloc_init_cp_state(struct nfsd_net *nn, u64 cpn_sec,
+					       struct nfs4_stid *p_stid);
+void nfs4_free_cp_state(struct nfs4_cp_state *cps);
 void nfs4_unhash_stid(struct nfs4_stid *s);
 void nfs4_put_stid(struct nfs4_stid *s);
 void nfs4_inc_and_copy_stateid(stateid_t *dst, struct nfs4_stid *stid);
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index aa94295..5b38f0a 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -549,7 +549,7 @@ struct nfsd4_copy_notify {
 	struct nl4_server	cpn_dst;
 
 	/* response */
-	/* Note: cpn_src_stateid is used for cnr_stateid */
+	stateid_t		cpn_cnr_stateid;
 	u64			cpn_sec;
 	u32			cpn_nsec;
 	struct nl4_servers	cpn_src;
-- 
1.8.3.1



* [RFC v1 08/18] NFSD CB_OFFLOAD xdr
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (6 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 07/18] NFSD Unique stateid_t for inter server to server COPY authentication Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 09/18] NFSD OFFLOAD_STATUS xdr Olga Kornievskaia
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4callback.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/state.h        |  1 +
 fs/nfsd/xdr4.h         |  6 ++++
 fs/nfsd/xdr4cb.h       | 10 ++++++
 4 files changed, 112 insertions(+)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 0274db6..d36fc05 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -39,6 +39,7 @@
 #include "state.h"
 #include "netns.h"
 #include "xdr4cb.h"
+#include "xdr4.h"
 
 #define NFSDDBG_FACILITY                NFSDDBG_PROC
 
@@ -105,6 +106,7 @@ enum nfs_cb_opnum4 {
 	OP_CB_WANTS_CANCELLED		= 12,
 	OP_CB_NOTIFY_LOCK		= 13,
 	OP_CB_NOTIFY_DEVICEID		= 14,
+	OP_CB_OFFLOAD			= 15,
 	OP_CB_ILLEGAL			= 10044
 };
 
@@ -677,6 +679,98 @@ static int nfs4_xdr_dec_cb_notify_lock(struct rpc_rqst *rqstp,
 }
 
 /*
+ * struct write_response4 {
+ *	stateid4	wr_callback_id<1>;
+ *	length4		wr_count;
+ *	stable_how4	wr_committed;
+ *	verifier4	wr_writeverf;
+ * };
+ * union offload_info4 switch (nfsstat4 coa_status) {
+ *	case NFS4_OK:
+ *		write_response4	coa_resok4;
+ *	default:
+ *	length4		coa_bytes_copied;
+ * };
+ * struct CB_OFFLOAD4args {
+ *	nfs_fh4		coa_fh;
+ *	stateid4	coa_stateid;
+ *	offload_info4	coa_offload_info;
+ * };
+ */
+static void encode_offload_info4(struct xdr_stream *xdr,
+				 __be32 nfserr,
+				 const struct nfsd4_copy *cp)
+{
+	__be32 *p;
+
+	p = xdr_reserve_space(xdr, 4);
+	*p++ = nfserr;
+	if (!nfserr) {
+		p = xdr_reserve_space(xdr, 4 + 8 + 4 + NFS4_VERIFIER_SIZE);
+		p = xdr_encode_empty_array(p);
+		p = xdr_encode_hyper(p, cp->cp_res.wr_bytes_written);
+		*p++ = cpu_to_be32(cp->cp_res.wr_stable_how);
+		p = xdr_encode_opaque_fixed(p, cp->cp_res.wr_verifier.data,
+					    NFS4_VERIFIER_SIZE);
+	} else {
+		p = xdr_reserve_space(xdr, 8);
+		p = xdr_encode_hyper(p, cp->cp_res.wr_bytes_written);
+	}
+}
+
+static void encode_cb_offload4args(struct xdr_stream *xdr,
+				   __be32 nfserr,
+				   const struct knfsd_fh *fh,
+				   const struct nfsd4_copy *cp,
+				   struct nfs4_cb_compound_hdr *hdr)
+{
+	__be32 *p;
+
+	p = xdr_reserve_space(xdr, 4);
+	*p++ = cpu_to_be32(OP_CB_OFFLOAD);
+	encode_nfs_fh4(xdr, fh);
+	encode_stateid4(xdr, &cp->cp_res.cb_stateid);
+	encode_offload_info4(xdr, nfserr, cp);
+
+	hdr->nops++;
+}
+
+static void nfs4_xdr_enc_cb_offload(struct rpc_rqst *req,
+				    struct xdr_stream *xdr,
+				    const struct nfsd4_callback *cb)
+{
+	const struct nfsd4_copy *cp =
+		container_of(cb, struct nfsd4_copy, cp_cb);
+	struct nfs4_cb_compound_hdr hdr = {
+		.ident = 0,
+		.minorversion = cb->cb_clp->cl_minorversion,
+	};
+
+	encode_cb_compound4args(xdr, &hdr);
+	encode_cb_sequence4args(xdr, cb, &hdr);
+	encode_cb_offload4args(xdr, cp->nfserr, &cp->fh, cp, &hdr);
+	encode_cb_nops(&hdr);
+}
+
+static int nfs4_xdr_dec_cb_offload(struct rpc_rqst *rqstp,
+				   struct xdr_stream *xdr,
+				   struct nfsd4_callback *cb)
+{
+	struct nfs4_cb_compound_hdr hdr;
+	int status;
+
+	status = decode_cb_compound4res(xdr, &hdr);
+	if (unlikely(status))
+		return status;
+
+	if (cb) {
+		status = decode_cb_sequence4res(xdr, cb);
+		if (unlikely(status || cb->cb_seq_status))
+			return status;
+	}
+	return decode_cb_op_status(xdr, OP_CB_OFFLOAD, &cb->cb_status);
+}
+/*
  * RPC procedure tables
  */
 #define PROC(proc, call, argtype, restype)				\
@@ -697,6 +791,7 @@ static int nfs4_xdr_dec_cb_notify_lock(struct rpc_rqst *rqstp,
 	PROC(CB_LAYOUT,	COMPOUND,	cb_layout,	cb_layout),
 #endif
 	PROC(CB_NOTIFY_LOCK,	COMPOUND,	cb_notify_lock,	cb_notify_lock),
+	PROC(CB_OFFLOAD,	COMPOUND,	cb_offload,	cb_offload),
 };
 
 static struct rpc_version nfs_cb_version4 = {
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index c084c57..3b0da32 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -585,6 +585,7 @@ enum nfsd4_cb_op {
 	NFSPROC4_CLNT_CB_NULL = 0,
 	NFSPROC4_CLNT_CB_RECALL,
 	NFSPROC4_CLNT_CB_LAYOUT,
+	NFSPROC4_CLNT_CB_OFFLOAD,
 	NFSPROC4_CLNT_CB_SEQUENCE,
 	NFSPROC4_CLNT_CB_NOTIFY_LOCK,
 };
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 5b38f0a..e6fa914 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -510,6 +510,7 @@ struct nfsd42_write_res {
 	u64			wr_bytes_written;
 	u32			wr_stable_how;
 	nfs4_verifier		wr_verifier;
+	stateid_t		cb_stateid;
 };
 
 /*  support 1 source server for now */
@@ -530,6 +531,11 @@ struct nfsd4_copy {
 
 	/* response */
 	struct nfsd42_write_res	cp_res;
+
+	/* for cb_offload */
+	struct nfsd4_callback	cp_cb;
+	__be32			nfserr;
+	struct knfsd_fh		fh;
 };
 
 struct nfsd4_seek {
diff --git a/fs/nfsd/xdr4cb.h b/fs/nfsd/xdr4cb.h
index 49b719d..7e39913 100644
--- a/fs/nfsd/xdr4cb.h
+++ b/fs/nfsd/xdr4cb.h
@@ -37,3 +37,13 @@
 #define NFS4_dec_cb_notify_lock_sz	(cb_compound_dec_hdr_sz  +      \
 					cb_sequence_dec_sz +            \
 					op_dec_sz)
+#define enc_cb_offload_info_sz		(1 + 1 + 2 + 1 +		\
+					XDR_QUADLEN(NFS4_VERIFIER_SIZE))
+#define NFS4_enc_cb_offload_sz		(cb_compound_enc_hdr_sz +       \
+					cb_sequence_enc_sz +            \
+					enc_nfs4_fh_sz +		\
+					enc_stateid_sz +		\
+					enc_cb_offload_info_sz)
+#define NFS4_dec_cb_offload_sz		(cb_compound_dec_hdr_sz  +      \
+					cb_sequence_dec_sz +            \
+					op_dec_sz)
-- 
1.8.3.1



* [RFC v1 09/18] NFSD OFFLOAD_STATUS xdr
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (7 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 08/18] NFSD CB_OFFLOAD xdr Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 10/18] NFSD OFFLOAD_CANCEL xdr Olga Kornievskaia
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4proc.c | 20 ++++++++++++++++++++
 fs/nfsd/nfs4xdr.c  | 30 ++++++++++++++++++++++++++++--
 fs/nfsd/xdr4.h     | 10 ++++++++++
 3 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index feea20b..f564da1 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1471,6 +1471,13 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 out:
 	return status;
 }
+static __be32
+nfsd4_offload_status(struct svc_rqst *rqstp,
+		     struct nfsd4_compound_state *cstate,
+		     struct nfsd4_offload_status *os)
+{
+	return nfserr_notsupp;
+}
 
 static __be32
 nfsd4_fallocate(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
@@ -2462,6 +2469,14 @@ static inline u32 nfsd4_copy_notify_rsize(struct svc_rqst *rqstp,
 		* sizeof(__be32);
 }
 
+static inline u32 nfsd4_offload_status_rsize(struct svc_rqst *rqstp,
+					     struct nfsd4_op *op)
+{
+	return (op_encode_hdr_size +
+		2 /* osr_count */ +
+		1 /* osr_complete<1> optional 0 for now */) * sizeof(__be32);
+}
+
 #ifdef CONFIG_NFSD_PNFS
 static inline u32 nfsd4_getdeviceinfo_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
 {
@@ -2872,6 +2887,11 @@ static inline u32 nfsd4_seek_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
 		.op_name = "OP_COPY_NOTIFY",
 		.op_rsize_bop = (nfsd4op_rsize)nfsd4_copy_notify_rsize,
 	},
+	[OP_OFFLOAD_STATUS] = {
+		.op_func = (nfsd4op_func)nfsd4_offload_status,
+		.op_name = "OP_OFFLOAD_STATUS",
+		.op_rsize_bop = (nfsd4op_rsize)nfsd4_offload_status_rsize,
+	},
 };
 
 /**
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 3235796..6c08c52 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1836,6 +1836,13 @@ static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
 }
 
 static __be32
+nfsd4_decode_offload_status(struct nfsd4_compoundargs *argp,
+			    struct nfsd4_offload_status *os)
+{
+	return nfsd4_decode_stateid(argp, &os->stateid);
+}
+
+static __be32
 nfsd4_decode_seek(struct nfsd4_compoundargs *argp, struct nfsd4_seek *seek)
 {
 	DECODE_HEAD;
@@ -1942,7 +1949,7 @@ static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
 	[OP_LAYOUTERROR]	= (nfsd4_dec)nfsd4_decode_notsupp,
 	[OP_LAYOUTSTATS]	= (nfsd4_dec)nfsd4_decode_notsupp,
 	[OP_OFFLOAD_CANCEL]	= (nfsd4_dec)nfsd4_decode_notsupp,
-	[OP_OFFLOAD_STATUS]	= (nfsd4_dec)nfsd4_decode_notsupp,
+	[OP_OFFLOAD_STATUS]	= (nfsd4_dec)nfsd4_decode_offload_status,
 	[OP_READ_PLUS]		= (nfsd4_dec)nfsd4_decode_notsupp,
 	[OP_SEEK]		= (nfsd4_dec)nfsd4_decode_seek,
 	[OP_WRITE_SAME]		= (nfsd4_dec)nfsd4_decode_notsupp,
@@ -4478,6 +4485,25 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 }
 
 static __be32
+nfsd4_encode_offload_status(struct nfsd4_compoundres *resp, __be32 nfserr,
+			    struct nfsd4_offload_status *os)
+{
+	struct xdr_stream *xdr = &resp->xdr;
+	__be32 *p;
+
+	if (nfserr)
+		return nfserr;
+
+	p = xdr_reserve_space(xdr, 8 + 4);
+	if (!p)
+		return nfserr_resource;
+	p = xdr_encode_hyper(p, os->count);
+	*p++ = cpu_to_be32(0);
+
+	return nfserr;
+}
+
+static __be32
 nfsd4_encode_seek(struct nfsd4_compoundres *resp, __be32 nfserr,
 		  struct nfsd4_seek *seek)
 {
@@ -4583,7 +4609,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	[OP_LAYOUTERROR]	= (nfsd4_enc)nfsd4_encode_noop,
 	[OP_LAYOUTSTATS]	= (nfsd4_enc)nfsd4_encode_noop,
 	[OP_OFFLOAD_CANCEL]	= (nfsd4_enc)nfsd4_encode_noop,
-	[OP_OFFLOAD_STATUS]	= (nfsd4_enc)nfsd4_encode_noop,
+	[OP_OFFLOAD_STATUS]	= (nfsd4_enc)nfsd4_encode_offload_status,
 	[OP_READ_PLUS]		= (nfsd4_enc)nfsd4_encode_noop,
 	[OP_SEEK]		= (nfsd4_enc)nfsd4_encode_seek,
 	[OP_WRITE_SAME]		= (nfsd4_enc)nfsd4_encode_noop,
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index e6fa914..c3e6907 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -561,6 +561,15 @@ struct nfsd4_copy_notify {
 	struct nl4_servers	cpn_src;
 };
 
+struct nfsd4_offload_status {
+	/* request */
+	stateid_t	stateid;
+
+	/* response */
+	u64		count;
+	u32		status;
+};
+
 struct nfsd4_op {
 	int					opnum;
 	__be32					status;
@@ -617,6 +626,7 @@ struct nfsd4_op {
 		struct nfsd4_clone		clone;
 		struct nfsd4_copy		copy;
 		struct nfsd4_copy_notify	copy_notify;
+		struct nfsd4_offload_status	offload_status;
 		struct nfsd4_seek		seek;
 	} u;
 	struct nfs4_replay *			replay;
-- 
1.8.3.1



* [RFC v1 10/18] NFSD OFFLOAD_CANCEL xdr
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (8 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 09/18] NFSD OFFLOAD_STATUS xdr Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 11/18] NFSD xdr callback stateid in async COPY reply Olga Kornievskaia
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4proc.c | 13 +++++++++++++
 fs/nfsd/nfs4xdr.c  |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index f564da1..a3b7954 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1480,6 +1480,14 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 }
 
 static __be32
+nfsd4_offload_cancel(struct svc_rqst *rqstp,
+		     struct nfsd4_compound_state *cstate,
+		     struct nfsd4_offload_status *os)
+{
+	return 0;
+}
+
+static __be32
 nfsd4_fallocate(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		struct nfsd4_fallocate *fallocate, int flags)
 {
@@ -2892,6 +2900,11 @@ static inline u32 nfsd4_seek_rsize(struct svc_rqst *rqstp, struct nfsd4_op *op)
 		.op_name = "OP_OFFLOAD_STATUS",
 		.op_rsize_bop = (nfsd4op_rsize)nfsd4_offload_status_rsize,
 	},
+	[OP_OFFLOAD_CANCEL] = {
+		.op_func = (nfsd4op_func)nfsd4_offload_cancel,
+		.op_name = "OP_OFFLOAD_CANCEL",
+		.op_rsize_bop = (nfsd4op_rsize)nfsd4_only_status_rsize,
+	},
 };
 
 /**
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 6c08c52..26630ae 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1948,7 +1948,7 @@ static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
 	[OP_IO_ADVISE]		= (nfsd4_dec)nfsd4_decode_notsupp,
 	[OP_LAYOUTERROR]	= (nfsd4_dec)nfsd4_decode_notsupp,
 	[OP_LAYOUTSTATS]	= (nfsd4_dec)nfsd4_decode_notsupp,
-	[OP_OFFLOAD_CANCEL]	= (nfsd4_dec)nfsd4_decode_notsupp,
+	[OP_OFFLOAD_CANCEL]	= (nfsd4_dec)nfsd4_decode_offload_status,
 	[OP_OFFLOAD_STATUS]	= (nfsd4_dec)nfsd4_decode_offload_status,
 	[OP_READ_PLUS]		= (nfsd4_dec)nfsd4_decode_notsupp,
 	[OP_SEEK]		= (nfsd4_dec)nfsd4_decode_seek,
-- 
1.8.3.1



* [RFC v1 11/18] NFSD xdr callback stateid in async COPY reply
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (9 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 10/18] NFSD OFFLOAD_CANCEL xdr Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 12/18] NFSD first draft of async copy Olga Kornievskaia
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
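Note (illustration only, not part of the patch): the reply layout produced by the
hunk below can be sketched in userspace C. This is a minimal sketch under
assumptions: the names demo_stateid, demo_write_res, put_be32, put_be64 and
encode_write_res are hypothetical and only mirror the fields touched in the diff,
and the sizes (12-byte opaque stateid part, 8-byte verifier) are assumed to match
the kernel's stateid and NFS4_VERIFIER_SIZE definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_STATEID_OTHER_SIZE	12	/* assumed size of the opaque stateid part */
#define DEMO_VERIFIER_SIZE	8	/* assumed to match NFS4_VERIFIER_SIZE */

/* Hypothetical userspace mirrors of the structures used in the diff. */
struct demo_stateid {
	uint32_t	seqid;
	unsigned char	other[DEMO_STATEID_OTHER_SIZE];
};

struct demo_write_res {
	uint64_t		bytes_written;
	uint32_t		stable_how;
	unsigned char		verifier[DEMO_VERIFIER_SIZE];
	struct demo_stateid	cb_stateid;
};

/* Store a 32-bit value in network (big-endian) byte order. */
static unsigned char *put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
	return p + 4;
}

/* Store a 64-bit value as two big-endian 32-bit words, high word first. */
static unsigned char *put_be64(unsigned char *p, uint64_t v)
{
	p = put_be32(p, (uint32_t)(v >> 32));
	return put_be32(p, (uint32_t)v);
}

/*
 * Encode the write response. A synchronous reply carries an empty
 * callback-id array (count 0); an asynchronous reply carries one stateid
 * (count 1). Returns the number of bytes used, or 0 if buf is too small.
 */
static size_t encode_write_res(unsigned char *buf, size_t len,
			       const struct demo_write_res *w, bool sync)
{
	size_t need = 4 + (sync ? 0 : 4 + DEMO_STATEID_OTHER_SIZE) +
		      8 + 4 + DEMO_VERIFIER_SIZE;
	unsigned char *p = buf;

	assert(buf != NULL && w != NULL);
	if (len < need)
		return 0;

	p = put_be32(p, sync ? 0 : 1);
	if (!sync) {
		p = put_be32(p, w->cb_stateid.seqid);
		memcpy(p, w->cb_stateid.other, DEMO_STATEID_OTHER_SIZE);
		p += DEMO_STATEID_OTHER_SIZE;
	}
	p = put_be64(p, w->bytes_written);
	p = put_be32(p, w->stable_how);
	memcpy(p, w->verifier, DEMO_VERIFIER_SIZE);
	p += DEMO_VERIFIER_SIZE;

	return (size_t)(p - buf);
}
```

Under these assumptions a synchronous reply occupies 24 bytes and an asynchronous
one 40 bytes; the only difference is the leading array count and the stateid that
follows it.
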
 fs/nfsd/nfs4xdr.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 26630ae..a101940 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -4356,15 +4356,27 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 #endif /* CONFIG_NFSD_PNFS */
 
 static __be32
-nfsd42_encode_write_res(struct nfsd4_compoundres *resp, struct nfsd42_write_res *write)
+nfsd42_encode_write_res(struct nfsd4_compoundres *resp,
+		struct nfsd42_write_res *write, bool sync)
 {
 	__be32 *p;
+	p = xdr_reserve_space(&resp->xdr, 4);
+	if (!p)
+		return nfserr_resource;
 
-	p = xdr_reserve_space(&resp->xdr, 4 + 8 + 4 + NFS4_VERIFIER_SIZE);
+	if (sync)
+		*p++ = cpu_to_be32(0);
+	else {
+		__be32 nfserr;
+		*p++ = cpu_to_be32(1);
+		nfserr = nfsd4_encode_stateid(&resp->xdr, &write->cb_stateid);
+		if (nfserr)
+			return nfserr;
+	}
+	p = xdr_reserve_space(&resp->xdr, 8 + 4 + NFS4_VERIFIER_SIZE);
 	if (!p)
 		return nfserr_resource;
 
-	*p++ = cpu_to_be32(0);
 	p = xdr_encode_hyper(p, write->wr_bytes_written);
 	*p++ = cpu_to_be32(write->wr_stable_how);
 	p = xdr_encode_opaque_fixed(p, write->wr_verifier.data,
@@ -4425,7 +4437,8 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	__be32 *p;
 
 	if (!nfserr) {
-		nfserr = nfsd42_encode_write_res(resp, &copy->cp_res);
+		nfserr = nfsd42_encode_write_res(resp, &copy->cp_res,
+				copy->cp_synchronous);
 		if (nfserr)
 			return nfserr;
 
-- 
1.8.3.1



* [RFC v1 12/18] NFSD first draft of async copy
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (10 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 11/18] NFSD xdr callback stateid in async COPY reply Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 13/18] NFSD handle OFFLOAD_CANCEL op Olga Kornievskaia
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Upon receiving a COPY request for an inter-server copy, nfsd
establishes the mount to the source server.

Asynchronous copies are handled by a single-threaded workqueue.
If we get an asynchronous request, make sure to copy the needed
arguments/state from the stack before starting the copy. Then
queue the work and reply to the client indicating that the copy
is asynchronous.

nfsd_copy_file_range() copies in 4MB chunks, so loop over the
total number of bytes that need to be copied. If a failure
happens in the middle, we can return an error as well as how
much we copied so far. Once done, create a work item for the
callback workqueue and send CB_OFFLOAD with the results.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
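Note (illustration only, not part of the patch): the chunked loop described in
the commit message can be sketched in userspace C. This is a minimal sketch under
assumptions: mem_file, copy_chunk, copy_range and copy_result are hypothetical
names, copy_chunk merely stands in for nfsd_copy_file_range(), and the chunk
limit is set to 4 bytes so the behaviour is visible with tiny buffers (the commit
message describes 4MB chunks).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define DEMO_CHUNK_MAX	((size_t)4)	/* demo value; the patch describes 4MB */

/* Hypothetical in-memory "file" used in place of struct file. */
struct mem_file {
	unsigned char	*data;
	size_t		len;
};

struct copy_result {
	uint64_t	bytes_written;	/* progress, kept even on failure */
	int		err;		/* 0 or a negative errno value */
};

/*
 * Stand-in for one copy call: moves at most DEMO_CHUNK_MAX bytes.
 * Returns bytes copied, 0 at end of source, or a negative errno.
 */
static ssize_t copy_chunk(const struct mem_file *src, size_t src_pos,
			  struct mem_file *dst, size_t dst_pos, size_t count)
{
	if (src_pos >= src->len)
		return 0;
	if (dst_pos >= dst->len)
		return -ENOSPC;
	if (count > DEMO_CHUNK_MAX)
		count = DEMO_CHUNK_MAX;
	if (count > src->len - src_pos)
		count = src->len - src_pos;
	if (count > dst->len - dst_pos)
		count = dst->len - dst_pos;
	memcpy(dst->data + dst_pos, src->data + src_pos, count);
	return (ssize_t)count;
}

/* Loop until the request is satisfied, the source ends, or an error occurs. */
static struct copy_result copy_range(const struct mem_file *src, size_t src_pos,
				     struct mem_file *dst, size_t dst_pos,
				     size_t total)
{
	struct copy_result res = { 0, 0 };

	assert(src != NULL && dst != NULL);
	while (total > 0) {
		ssize_t n = copy_chunk(src, src_pos, dst, dst_pos, total);

		if (n < 0) {
			res.err = (int)n;	/* remember the error ... */
			break;			/* ... but keep bytes_written */
		}
		if (n == 0)
			break;
		total -= (size_t)n;
		res.bytes_written += (uint64_t)n;
		src_pos += (size_t)n;
		dst_pos += (size_t)n;
	}
	return res;
}
```

The point of returning both fields is that a failure partway through still
reports how much data reached the destination, which is what the result sent in
CB_OFFLOAD needs.
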
 fs/nfsd/nfs4proc.c  | 194 +++++++++++++++++++++++++++++++++++++++++++---------
 fs/nfsd/nfs4state.c |   9 ++-
 fs/nfsd/state.h     |   2 +
 fs/nfsd/xdr4.h      |   8 +++
 4 files changed, 178 insertions(+), 35 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a3b7954..9dfb20b 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -77,6 +77,21 @@
 { }
 #endif
 
+static struct workqueue_struct *copy_wq;
+
+int nfsd4_create_copy_queue(void)
+{
+	copy_wq = create_singlethread_workqueue("nfsd4_copy");
+	if (!copy_wq)
+		return -ENOMEM;
+	return 0;
+}
+
+void nfsd4_destroy_copy_queue(void)
+{
+	destroy_workqueue(copy_wq);
+}
+
 #define NFSDDBG_FACILITY		NFSDDBG_PROC
 
 static u32 nfsd_attrmask[] = {
@@ -1218,8 +1233,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 static struct vfsmount *
 nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
 			struct nfsd4_compound_state *cstate,
-			struct nfsd4_copy *copy, struct file **src,
-			struct file **dst)
+			struct nfsd4_copy *copy)
 {
 	struct svc_fh *s_fh = NULL;
 	stateid_t *s_stid = &copy->cp_src_stateid;
@@ -1232,7 +1246,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	/* Verify the destination stateid and set dst struct file*/
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					&copy->cp_dst_stateid,
-					WR_STATE, dst, NULL, NULL);
+					WR_STATE, &copy->fh_dst, NULL, NULL);
 	if (status) {
 		ss_mnt = ERR_PTR(be32_to_cpu(status));
 		goto out;
@@ -1258,7 +1272,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 		nfsd4_interssc_disconnect(ss_mnt);
 		return ERR_CAST(filp);
 	}
-	*src = filp;
+	copy->fh_src = filp;
 out:
 	return ss_mnt;
 }
@@ -1280,8 +1294,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 static struct vfsmount *
 nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
 			struct nfsd4_compound_state *cstate,
-			struct nfsd4_copy *copy, struct file **src,
-			struct file **dst)
+			struct nfsd4_copy *copy)
 {
 	return ERR_PTR(-EINVAL);
 }
@@ -1297,13 +1310,13 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 static __be32
 nfsd4_setup_intra_ssc(struct svc_rqst *rqstp,
 		      struct nfsd4_compound_state *cstate,
-		      struct nfsd4_copy *copy, struct file **src,
-		      struct file **dst)
+		      struct nfsd4_copy *copy)
 {
 	__be32 status;
 
-	status = nfsd4_verify_copy(rqstp, cstate, &copy->cp_src_stateid, src,
-				   &copy->cp_dst_stateid, dst);
+	status = nfsd4_verify_copy(rqstp, cstate, &copy->cp_src_stateid,
+				   &copy->fh_src, &copy->cp_dst_stateid,
+				   &copy->fh_dst);
 	if (status)
 		goto out;
 
@@ -1323,47 +1336,160 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	fput(dst);
 }
 
+static void nfsd4_cb_offload_release(struct nfsd4_callback *cb)
+{
+	struct nfsd4_copy *copy = container_of(cb, struct nfsd4_copy, cp_cb);
+
+	kfree(copy);
+}
+
+static int nfsd4_cb_offload_done(struct nfsd4_callback *cb,
+				 struct rpc_task *task)
+{
+	return 1;
+}
+
+static const struct nfsd4_callback_ops nfsd4_cb_offload_ops = {
+	.release = nfsd4_cb_offload_release,
+	.done = nfsd4_cb_offload_done
+};
+
+static int nfsd4_init_copy_res(struct nfsd4_copy *copy, bool sync)
+{
+	memcpy(&copy->cp_res.cb_stateid, &copy->cp_dst_stateid,
+		sizeof(copy->cp_dst_stateid));
+	copy->cp_res.wr_stable_how = NFS_UNSTABLE;
+	copy->cp_consecutive = 1;
+	copy->cp_synchronous = sync;
+	gen_boot_verifier(&copy->cp_res.wr_verifier, copy->net);
+
+	return nfs_ok;
+}
+
+static int _nfsd_copy_file_range(struct nfsd4_copy *copy)
+{
+	ssize_t bytes_copied = 0;
+	size_t bytes_total = copy->cp_count;
+	u64 src_pos = copy->cp_src_pos;
+	u64 dst_pos = copy->cp_dst_pos;
+
+	do {
+		bytes_copied = nfsd_copy_file_range(copy->fh_src, src_pos,
+				copy->fh_dst, dst_pos, bytes_total);
+		if (bytes_copied <= 0)
+			break;
+		bytes_total -= bytes_copied;
+		copy->cp_res.wr_bytes_written += bytes_copied;
+		src_pos += bytes_copied;
+		dst_pos += bytes_copied;
+	} while (bytes_total > 0);
+	return bytes_copied;
+}
+
+static int nfsd4_do_copy(struct nfsd4_copy *copy, bool sync)
+{
+	__be32 status;
+	ssize_t bytes;
+
+	bytes = _nfsd_copy_file_range(copy);
+	if (bytes < 0)
+		status = nfserrno(bytes);
+	else
+		status = nfsd4_init_copy_res(copy, sync);
+
+	if (copy->cp_src.nl_nsvr > 0)   /* Inter server SSC */
+		nfsd4_cleanup_inter_ssc(copy->ss_mnt, copy->fh_src,
+				copy->fh_dst);
+	else
+		nfsd4_cleanup_intra_ssc(copy->fh_src, copy->fh_dst);
+
+	return status;
+}
+
+static void dup_copy_fields(struct nfsd4_copy *src, struct nfsd4_copy *dst)
+{
+	memcpy(&dst->cp_src_stateid, &src->cp_src_stateid, sizeof(stateid_t));
+	memcpy(&dst->cp_dst_stateid, &src->cp_dst_stateid, sizeof(stateid_t));
+	dst->cp_src_pos = src->cp_src_pos;
+	dst->cp_dst_pos = src->cp_dst_pos;
+	dst->cp_count = src->cp_count;
+	memcpy(&dst->cp_src, &src->cp_src, sizeof(struct nl4_servers));
+	dst->cp_consecutive = src->cp_consecutive;
+	dst->cp_synchronous = src->cp_synchronous;
+	memcpy(&dst->cp_res, &src->cp_res, sizeof(src->cp_res));
+	/* skipping nfsd4_callback */
+	memcpy(&dst->fh, &src->fh, sizeof(src->fh));
+	dst->fh = src->fh;
+	dst->cp_clp = src->cp_clp;
+	dst->fh_src = src->fh_src;
+	dst->fh_dst = src->fh_dst;
+	dst->ss_mnt = src->ss_mnt;
+	dst->net = src->net;
+}
+
+static void nfsd4_do_async_copy(struct work_struct *work)
+{
+	struct nfsd4_copy *copy =
+		container_of(work, struct nfsd4_copy, cp_work);
+	struct nfsd4_copy *cb_copy;
+
+	copy->nfserr = nfsd4_do_copy(copy, 0);
+	cb_copy = kzalloc(sizeof(struct nfsd4_copy), GFP_KERNEL);
+	if (!cb_copy)
+		goto out;
+	memcpy(&cb_copy->cp_res, &copy->cp_res, sizeof(copy->cp_res));
+	cb_copy->cp_clp = copy->cp_clp;
+	cb_copy->nfserr = copy->nfserr;
+	memcpy(&cb_copy->fh, &copy->fh, sizeof(copy->fh));
+	nfsd4_init_cb(&cb_copy->cp_cb, cb_copy->cp_clp,
+			&nfsd4_cb_offload_ops, NFSPROC4_CLNT_CB_OFFLOAD);
+	nfsd4_run_cb(&cb_copy->cp_cb);
+out:
+	kfree(copy);
+}
+
 static __be32
 nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfsd4_copy *copy)
 {
-	struct vfsmount *ss_mnt = NULL;
-	struct file *src, *dst;
 	__be32 status;
-	ssize_t bytes;
 
 	if (copy->cp_src.nl_nsvr > 0) { /* Inter server SSC */
-		ss_mnt = nfsd4_setup_inter_ssc(rqstp, cstate, copy, &src, &dst);
-		if (IS_ERR(ss_mnt)) {
-			status = nfserrno(PTR_ERR(ss_mnt));
+		copy->ss_mnt = nfsd4_setup_inter_ssc(rqstp, cstate, copy);
+		if (IS_ERR(copy->ss_mnt)) {
+			status = nfserrno(PTR_ERR(copy->ss_mnt));
 			goto out;
 		}
 	} else {
-		status = nfsd4_setup_intra_ssc(rqstp, cstate, copy, &src, &dst);
+		status = nfsd4_setup_intra_ssc(rqstp, cstate, copy);
 		if (status)
 			goto out;
 	}
 
-	bytes = nfsd_copy_file_range(src, copy->cp_src_pos,
-			dst, copy->cp_dst_pos, copy->cp_count);
-
-	if (bytes < 0)
-		status = nfserrno(bytes);
-	else {
-		copy->cp_res.wr_bytes_written = bytes;
-		copy->cp_res.wr_stable_how = NFS_UNSTABLE;
-		copy->cp_consecutive = 1;
-		copy->cp_synchronous = 1;
-		gen_boot_verifier(&copy->cp_res.wr_verifier, SVC_NET(rqstp));
-		status = nfs_ok;
+	copy->cp_clp = cstate->clp;
+	memcpy(&copy->fh, &cstate->current_fh.fh_handle,
+		sizeof(struct knfsd_fh));
+	copy->net = SVC_NET(rqstp);
+	if (!copy->cp_synchronous) {
+		struct nfsd4_copy *async_copy;
+
+		status = nfsd4_init_copy_res(copy, 0);
+		async_copy = kzalloc(sizeof(struct nfsd4_copy), GFP_KERNEL);
+		if (!async_copy)
+			goto out_err;
+		dup_copy_fields(copy, async_copy);
+		memcpy(&copy->cp_res.cb_stateid, &copy->cp_dst_stateid,
+			sizeof(copy->cp_dst_stateid));
+		INIT_WORK(&async_copy->cp_work, nfsd4_do_async_copy);
+		queue_work(copy_wq, &async_copy->cp_work);
+	} else {
+		status = nfsd4_do_copy(copy, 1);
 	}
-
-	if (copy->cp_src.nl_nsvr > 0)   /* Inter server SSC */
-		nfsd4_cleanup_inter_ssc(ss_mnt, src, dst);
-	else
-		nfsd4_cleanup_intra_ssc(src, dst);
 out:
 	return status;
+out_err:
+	status = nfserrno(-ENOMEM);
+	goto out;
 }
 
 static __be32
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 606009a..ca5e9cd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7136,8 +7136,14 @@ static int nfs4_state_create_net(struct net *net)
 		goto out_free_laundry;
 
 	set_max_delegations();
-	return 0;
 
+	ret = nfsd4_create_copy_queue();
+	if (ret)
+		goto out_free_callback;
+
+	return 0;
+out_free_callback:
+	nfsd4_destroy_callback_queue();
 out_free_laundry:
 	destroy_workqueue(laundry_wq);
 out_cleanup_cred:
@@ -7200,6 +7206,7 @@ static int nfs4_state_create_net(struct net *net)
 	destroy_workqueue(laundry_wq);
 	nfsd4_destroy_callback_queue();
 	cleanup_callback_cred();
+	nfsd4_destroy_copy_queue();
 }
 
 static void
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 3b0da32..2acea23 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -649,6 +649,8 @@ extern void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
 extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
 							struct nfsd_net *nn);
 extern bool nfs4_has_reclaimed_state(const char *name, struct nfsd_net *nn);
+extern int nfsd4_create_copy_queue(void);
+extern void nfsd4_destroy_copy_queue(void);
 
 struct nfs4_file *find_file(struct knfsd_fh *fh);
 void put_nfs4_file(struct nfs4_file *fi);
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index c3e6907..d5b6b40 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -536,6 +536,14 @@ struct nfsd4_copy {
 	struct nfsd4_callback	cp_cb;
 	__be32			nfserr;
 	struct knfsd_fh		fh;
+
+	struct work_struct	cp_work;
+	struct nfs4_client	*cp_clp;
+
+	struct file		*fh_src;
+	struct file		*fh_dst;
+	struct vfsmount		*ss_mnt;
+	struct net		*net;
 };
 
 struct nfsd4_seek {
-- 
1.8.3.1



* [RFC v1 13/18] NFSD handle OFFLOAD_CANCEL op
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (11 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 12/18] NFSD first draft of async copy Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 14/18] NFSD stop queued async copies on client shutdown Olga Kornievskaia
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Upon receiving OFFLOAD_CANCEL, search the list of copy stateids.
If the stateid is found, remove it; subsequent READs that use it
will then fail with ERR_PARTNER_NO_AUTH.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4proc.c  | 11 +++++++++++
 fs/nfsd/nfs4state.c | 30 ++++++++++++++++++++----------
 fs/nfsd/state.h     |  2 ++
 3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 9dfb20b..4b1dcdd 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1610,6 +1610,17 @@ static void nfsd4_do_async_copy(struct work_struct *work)
 		     struct nfsd4_compound_state *cstate,
 		     struct nfsd4_offload_status *os)
 {
+
+	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+	__be32 status;
+	struct nfs4_cp_state *state = NULL;
+
+	status = find_cp_state(nn, &os->stateid, &state);
+	if (!status) {
+		list_del(&state->cp_list);
+		nfs4_free_cp_state(state);
+	}
+
 	return 0;
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ca5e9cd..b77041d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -743,6 +743,22 @@ static void nfs4_free_cp_statelist(struct nfs4_stid *stid)
 	}
 }
 
+__be32 find_cp_state(struct nfsd_net *nn, stateid_t *st,
+			    struct nfs4_cp_state **cps)
+{
+	struct nfs4_cp_state *state = NULL;
+
+	if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
+		return nfserr_bad_stateid;
+	spin_lock(&nn->s2s_cp_lock);
+	state = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
+	spin_unlock(&nn->s2s_cp_lock);
+	if (!state)
+		return nfserr_bad_stateid;
+	*cps = state;
+	return 0;
+}
+
 /*
  * A READ from an inter server to server COPY will have a
  * copy stateid. Return the parent nfs4_stid.
@@ -750,18 +766,12 @@ static void nfs4_free_cp_statelist(struct nfs4_stid *stid)
 static __be32 find_cp_state_parent(struct nfsd_net *nn, stateid_t *st,
 				   struct nfs4_stid **stid)
 {
+	__be32 status;
 	struct nfs4_cp_state *cps = NULL;
 
-	if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
-		return nfserr_bad_stateid;
-	spin_lock(&nn->s2s_cp_lock);
-	cps = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
-	spin_unlock(&nn->s2s_cp_lock);
-	if (!cps) {
-		pr_info("NFSD: find_cp_state cl_id %d so_id %d NOT FOUND\n",
-			st->si_opaque.so_clid.cl_id, st->si_opaque.so_id);
-		return nfserr_bad_stateid;
-	}
+	status = find_cp_state(nn, st, &cps);
+	if (status)
+		return status;
 
 	/* Did the inter server to server copy start in time? */
 	if (cps->cp_active == false && !time_after(cps->cp_timeout, jiffies))
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 2acea23..ebf968d 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -651,6 +651,8 @@ extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
 extern bool nfs4_has_reclaimed_state(const char *name, struct nfsd_net *nn);
 extern int nfsd4_create_copy_queue(void);
 extern void nfsd4_destroy_copy_queue(void);
+extern __be32 find_cp_state(struct nfsd_net *nn, stateid_t *st,
+			struct nfs4_cp_state **cps);
 
 struct nfs4_file *find_file(struct knfsd_fh *fh);
 void put_nfs4_file(struct nfs4_file *fi);
-- 
1.8.3.1



* [RFC v1 14/18] NFSD stop queued async copies on client shutdown
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (12 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 13/18] NFSD handle OFFLOAD_CANCEL op Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 15/18] NFSD create new stateid for async copy Olga Kornievskaia
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

If the client is shutting down while async copies are still in
progress, stop the queued async copies.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4proc.c  | 9 +++++++++
 fs/nfsd/nfs4state.c | 1 +
 fs/nfsd/state.h     | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 4b1dcdd..d26d720 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -92,6 +92,12 @@ void nfsd4_destroy_copy_queue(void)
 	destroy_workqueue(copy_wq);
 }
 
+void nfsd4_shutdown_copy(struct nfs4_client *clp)
+{
+	set_bit(NFSD4_CLIENT_COPY_KILL, &clp->cl_flags);
+	flush_workqueue(copy_wq);
+}
+
 #define NFSDDBG_FACILITY		NFSDDBG_PROC
 
 static u32 nfsd_attrmask[] = {
@@ -1433,6 +1439,9 @@ static void nfsd4_do_async_copy(struct work_struct *work)
 		container_of(work, struct nfsd4_copy, cp_work);
 	struct nfsd4_copy *cb_copy;
 
+	if (test_bit(NFSD4_CLIENT_COPY_KILL, &copy->cp_clp->cl_flags))
+		return;
+
 	copy->nfserr = nfsd4_do_copy(copy, 0);
 	cb_copy = kzalloc(sizeof(struct nfsd4_copy), GFP_KERNEL);
 	if (!cb_copy)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b77041d..16d8509 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1995,6 +1995,7 @@ static __be32 mark_client_expired_locked(struct nfs4_client *clp)
 	}
 	nfsd4_return_all_client_layouts(clp);
 	nfsd4_shutdown_callback(clp);
+	nfsd4_shutdown_copy(clp);
 	if (clp->cl_cb_conn.cb_xprt)
 		svc_xprt_put(clp->cl_cb_conn.cb_xprt);
 	free_client(clp);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index ebf968d..fa749763 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -336,6 +336,7 @@ struct nfs4_client {
 #define NFSD4_CLIENT_RECLAIM_COMPLETE	(3)	/* reclaim_complete done */
 #define NFSD4_CLIENT_CONFIRMED		(4)	/* client is confirmed */
 #define NFSD4_CLIENT_UPCALL_LOCK	(5)	/* upcall serialization */
+#define NFSD4_CLIENT_COPY_KILL		(6)	/* stop copy workqueue */
 #define NFSD4_CLIENT_CB_FLAG_MASK	(1 << NFSD4_CLIENT_CB_UPDATE | \
 					 1 << NFSD4_CLIENT_CB_KILL)
 	unsigned long		cl_flags;
@@ -645,6 +646,7 @@ extern void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
 extern int nfsd4_create_callback_queue(void);
 extern void nfsd4_destroy_callback_queue(void);
 extern void nfsd4_shutdown_callback(struct nfs4_client *);
+extern void nfsd4_shutdown_copy(struct nfs4_client *clp);
 extern void nfsd4_prepare_cb_recall(struct nfs4_delegation *dp);
 extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(const char *name,
 							struct nfsd_net *nn);
-- 
1.8.3.1



* [RFC v1 15/18] NFSD create new stateid for async copy
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (13 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 14/18] NFSD stop queued async copies on client shutdown Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 16/18] NFSD define EBADF in nfserrno Olga Kornievskaia
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Previously the dst copy stateid was used in the reply to the
asynchronous copy. Instead, generate a new stateid; the destination
server keeps a list of these stateids. If it receives a cancel, it
can decide to forgo sending the CB_OFFLOAD.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
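Note (illustration only, not part of the patch): the cancel handling described in
the commit message amounts to polling a shared flag after every chunk and
skipping the completion callback when it is set. A minimal userspace sketch,
under assumptions: copy_state, copy_job, cancel_copy and run_copy are
hypothetical names, a C11 atomic flag replaces the spinlock-protected
cp_cancelled field, and cancel_after exists only to simulate a cancel arriving
mid-copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared between the worker and the cancel path. */
struct copy_state {
	atomic_bool	cancelled;
};

struct copy_job {
	struct copy_state	*state;
	size_t			total;		/* bytes requested */
	size_t			chunk;		/* bytes moved per iteration */
	size_t			written;	/* progress so far */
	size_t			cancel_after;	/* demo only: 0 means never cancel */
	bool			callback_sent;	/* stands in for sending CB_OFFLOAD */
};

/* What an OFFLOAD_CANCEL handler would do in this sketch. */
static void cancel_copy(struct copy_state *st)
{
	atomic_store(&st->cancelled, true);
}

static void run_copy(struct copy_job *job)
{
	size_t left = job->total;
	bool cancelled = false;

	assert(job->chunk > 0);
	while (left > 0 && !cancelled) {
		size_t n = left < job->chunk ? left : job->chunk;

		/* real code would move n bytes here */
		left -= n;
		job->written += n;

		/* demo hook: pretend a cancel request arrives at this offset */
		if (job->cancel_after && job->written >= job->cancel_after)
			cancel_copy(job->state);

		/* re-check the flag once per chunk */
		cancelled = atomic_load(&job->state->cancelled);
	}

	/* report completion only if nobody cancelled the copy */
	job->callback_sent = !atomic_load(&job->state->cancelled);
}
```

Checking once per chunk means a cancel takes effect after at most one more chunk
has been copied, so the chunk size bounds the cancel latency.
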
 fs/nfsd/nfs4proc.c | 71 ++++++++++++++++++++++++++++++++++++++----------------
 fs/nfsd/state.h    |  3 +++
 fs/nfsd/xdr4.h     |  2 ++
 3 files changed, 55 insertions(+), 21 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index d26d720..f07eae1 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1039,7 +1039,8 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 static __be32
 nfsd4_verify_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		  stateid_t *src_stateid, struct file **src,
-		  stateid_t *dst_stateid, struct file **dst)
+		  stateid_t *dst_stateid, struct file **dst,
+		  struct nfs4_stid **stid)
 {
 	__be32 status;
 
@@ -1053,7 +1054,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					    dst_stateid, WR_STATE, dst, NULL,
-					    NULL);
+					    stid);
 	if (status) {
 		dprintk("NFSD: %s: couldn't process dst stateid!\n", __func__);
 		goto out_put_src;
@@ -1083,7 +1084,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
 	__be32 status;
 
 	status = nfsd4_verify_copy(rqstp, cstate, &clone->cl_src_stateid, &src,
-				   &clone->cl_dst_stateid, &dst);
+				   &clone->cl_dst_stateid, &dst, NULL);
 	if (status)
 		goto out;
 
@@ -1252,7 +1253,8 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 	/* Verify the destination stateid and set dst struct file*/
 	status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
 					&copy->cp_dst_stateid,
-					WR_STATE, &copy->fh_dst, NULL, NULL);
+					WR_STATE, &copy->fh_dst, NULL,
+					&copy->stid);
 	if (status) {
 		ss_mnt = ERR_PTR(be32_to_cpu(status));
 		goto out;
@@ -1322,7 +1324,7 @@ extern struct file *nfs42_ssc_open(struct vfsmount *ss_mnt,
 
 	status = nfsd4_verify_copy(rqstp, cstate, &copy->cp_src_stateid,
 				   &copy->fh_src, &copy->cp_dst_stateid,
-				   &copy->fh_dst);
+				   &copy->fh_dst, &copy->stid);
 	if (status)
 		goto out;
 
@@ -1362,8 +1364,6 @@ static int nfsd4_cb_offload_done(struct nfsd4_callback *cb,
 
 static int nfsd4_init_copy_res(struct nfsd4_copy *copy, bool sync)
 {
-	memcpy(&copy->cp_res.cb_stateid, &copy->cp_dst_stateid,
-		sizeof(copy->cp_dst_stateid));
 	copy->cp_res.wr_stable_how = NFS_UNSTABLE;
 	copy->cp_consecutive = 1;
 	copy->cp_synchronous = sync;
@@ -1378,7 +1378,7 @@ static int _nfsd_copy_file_range(struct nfsd4_copy *copy)
 	size_t bytes_total = copy->cp_count;
 	u64 src_pos = copy->cp_src_pos;
 	u64 dst_pos = copy->cp_dst_pos;
-
+	bool cancelled = false;
 	do {
 		bytes_copied = nfsd_copy_file_range(copy->fh_src, src_pos,
 				copy->fh_dst, dst_pos, bytes_total);
@@ -1388,7 +1388,10 @@ static int _nfsd_copy_file_range(struct nfsd4_copy *copy)
 		copy->cp_res.wr_bytes_written += bytes_copied;
 		src_pos += bytes_copied;
 		dst_pos += bytes_copied;
-	} while (bytes_total > 0);
+		spin_lock(&copy->cps->cp_lock);
+		cancelled = copy->cps->cp_cancelled;
+		spin_unlock(&copy->cps->cp_lock);
+	} while (bytes_total > 0 && !cancelled);
 	return bytes_copied;
 }
 
@@ -1431,6 +1434,8 @@ static void dup_copy_fields(struct nfsd4_copy *src, struct nfsd4_copy *dst)
 	dst->fh_dst = src->fh_dst;
 	dst->ss_mnt = src->ss_mnt;
 	dst->net = src->net;
+	dst->stid = src->stid;
+	dst->cps = src->cps;
 }
 
 static void nfsd4_do_async_copy(struct work_struct *work)
@@ -1443,17 +1448,20 @@ static void nfsd4_do_async_copy(struct work_struct *work)
 		return;
 
 	copy->nfserr = nfsd4_do_copy(copy, 0);
-	cb_copy = kzalloc(sizeof(struct nfsd4_copy), GFP_KERNEL);
-	if (!cb_copy)
-		goto out;
-	memcpy(&cb_copy->cp_res, &copy->cp_res, sizeof(copy->cp_res));
-	cb_copy->cp_clp = copy->cp_clp;
-	cb_copy->nfserr = copy->nfserr;
-	memcpy(&cb_copy->fh, &copy->fh, sizeof(copy->fh));
-	nfsd4_init_cb(&cb_copy->cp_cb, cb_copy->cp_clp,
+	if (!copy->cps->cp_cancelled) {
+		cb_copy = kzalloc(sizeof(struct nfsd4_copy), GFP_KERNEL);
+		if (!cb_copy)
+			goto out;
+		memcpy(&cb_copy->cp_res, &copy->cp_res, sizeof(copy->cp_res));
+		cb_copy->cp_clp = copy->cp_clp;
+		cb_copy->nfserr = copy->nfserr;
+		memcpy(&cb_copy->fh, &copy->fh, sizeof(copy->fh));
+		nfsd4_init_cb(&cb_copy->cp_cb, cb_copy->cp_clp,
 			&nfsd4_cb_offload_ops, NFSPROC4_CLNT_CB_OFFLOAD);
-	nfsd4_run_cb(&cb_copy->cp_cb);
+		nfsd4_run_cb(&cb_copy->cp_cb);
+	}
 out:
+	nfs4_put_stid(copy->stid);
 	kfree(copy);
 }
 
@@ -1480,15 +1488,26 @@ static void nfsd4_do_async_copy(struct work_struct *work)
 		sizeof(struct knfsd_fh));
 	copy->net = SVC_NET(rqstp);
 	if (!copy->cp_synchronous) {
+		struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 		struct nfsd4_copy *async_copy;
 
 		status = nfsd4_init_copy_res(copy, 0);
 		async_copy = kzalloc(sizeof(struct nfsd4_copy), GFP_KERNEL);
 		if (!async_copy)
 			goto out_err;
+		copy->cps = nfs4_alloc_init_cp_state(nn, nn->nfsd4_lease,
+				copy->stid);
+		if (!copy->cps)
+			goto out_err;
+		/* take a reference on the parent stateid so it's not
+		 * freed by the copy compound
+		 */
+		atomic_inc(&copy->stid->sc_count);
+		copy->cps->cp_dst_async = true;
+		spin_lock_init(&copy->cps->cp_lock);
+		memcpy(&copy->cp_res.cb_stateid, &copy->cps->cp_stateid,
+			sizeof(copy->cps->cp_stateid));
 		dup_copy_fields(copy, async_copy);
-		memcpy(&copy->cp_res.cb_stateid, &copy->cp_dst_stateid,
-			sizeof(copy->cp_dst_stateid));
 		INIT_WORK(&async_copy->cp_work, nfsd4_do_async_copy);
 		queue_work(copy_wq, &async_copy->cp_work);
 	} else {
@@ -1625,7 +1644,17 @@ static void nfsd4_do_async_copy(struct work_struct *work)
 	struct nfs4_cp_state *state = NULL;
 
 	status = find_cp_state(nn, &os->stateid, &state);
-	if (!status) {
+	/* on the source server, remove stateid from list of acceptable
+	 * stateid to force reads to fail. on the destination server,
+	 * callback offload stateids shouldn't be removed and instead
+	 * mark the offload copy state to be cancelled.
+	 */
+	if (state) {
+		spin_lock(&state->cp_lock);
+		state->cp_cancelled = true;
+		spin_unlock(&state->cp_lock);
+	}
+	if (!status && !state->cp_dst_async) {
 		list_del(&state->cp_list);
 		nfs4_free_cp_state(state);
 	}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index fa749763..70ee3fe 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -114,6 +114,9 @@ struct nfs4_cp_state {
 	struct nfs4_stid	*cp_p_stid;	/* pointer to parent */
 	bool			cp_active;	/* has the copy started */
 	unsigned long		cp_timeout;	/* copy timeout */
+	bool			cp_dst_async;	/* async copy on dst server */
+	bool			cp_cancelled;	/* copy cancelled */
+	spinlock_t		cp_lock;
 };
 
 /*
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index d5b6b40..c6b8e596 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -544,6 +544,8 @@ struct nfsd4_copy {
 	struct file		*fh_dst;
 	struct vfsmount		*ss_mnt;
 	struct net		*net;
+	struct nfs4_stid	*stid;
+	struct nfs4_cp_state	*cps;
 };
 
 struct nfsd4_seek {
-- 
1.8.3.1


* [RFC v1 16/18] NFSD define EBADF in nfserrno
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (14 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 15/18] NFSD create new stateid for async copy Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 17/18] NFSD support OFFLOAD_STATUS Olga Kornievskaia
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

vfs_copy_file_range() can return -EBADF, which nfserrno() does not
currently recognize. Map it to nfserr_badhandle.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfsproc.c | 1 +
 1 file changed, 1 insertion(+)
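
For readers less familiar with this table, a minimal userspace sketch of
the kind of lookup the new entry feeds into. This is not the nfsd code:
the sketch_* names and the three-entry table are invented for
illustration, and falling back to an I/O error for unmapped values is an
assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the wire status codes (illustrative only). */
enum sketch_status {
	SKETCH_OK        = 0,
	SKETCH_IO        = 5,
	SKETCH_BADHANDLE = 10001,
};

struct sketch_errno_map {
	enum sketch_status status;
	int syserr;		/* negative errno, as returned by VFS helpers */
};

static const struct sketch_errno_map sketch_map[] = {
	{ SKETCH_OK, 0 },
	{ SKETCH_IO, -EIO },
	{ SKETCH_BADHANDLE, -EBADF },	/* the kind of entry this patch adds */
};

/* Linear scan of the table; unmapped values fall back to an I/O error. */
static enum sketch_status sketch_errno_to_status(int syserr)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_map) / sizeof(sketch_map[0]); i++) {
		if (sketch_map[i].syserr == syserr)
			return sketch_map[i].status;
	}
	return SKETCH_IO;	/* assumed fallback for unknown errnos */
}
```

Without the -EBADF row, sketch_errno_to_status(-EBADF) would take the
fallback path, which is the situation the patch is meant to avoid.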

diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index fa82b77..89d5f6c 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -786,6 +786,7 @@ struct svc_version	nfsd_version2 = {
 		{ nfserr_serverfault, -ESERVERFAULT },
 		{ nfserr_serverfault, -ENFILE },
 		{ nfserr_io, -EUCLEAN },
+		{ nfserr_badhandle, -EBADF },
 	};
 	int	i;
 
-- 
1.8.3.1


* [RFC v1 17/18] NFSD support OFFLOAD_STATUS
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (15 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 16/18] NFSD define EBADF in nfserrno Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-02 16:01 ` [RFC v1 18/18] NFSD remove copy stateid when vfs_copy_file_range completes Olga Kornievskaia
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Record the number of bytes copied so far in the copy state, and read
that value under the lock when an OFFLOAD_STATUS operation is received.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4proc.c | 14 +++++++++++++-
 fs/nfsd/state.h    |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)
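
A rough userspace model of the locking pattern, in case it helps review.
It is only a sketch: a pthread mutex stands in for the spinlock, the
struct and function names are made up here, and the "copy" just advances
a counter instead of moving data. The worker publishes its progress and
samples the cancel flag in one critical section after each chunk, and a
status query reads the progress under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogue of the per-copy state. */
struct copy_state {
	pthread_mutex_t lock;
	bool cancelled;		/* set by a cancel request */
	size_t bytes_copied;	/* progress, read by a status query */
};

static void copy_state_init(struct copy_state *cs)
{
	pthread_mutex_init(&cs->lock, NULL);
	cs->cancelled = false;
	cs->bytes_copied = 0;
}

/* Worker side: publish progress, return true if the copy should stop. */
static bool copy_state_update(struct copy_state *cs, size_t total_written)
{
	bool cancelled;

	pthread_mutex_lock(&cs->lock);
	cs->bytes_copied = total_written;
	cancelled = cs->cancelled;
	pthread_mutex_unlock(&cs->lock);
	return cancelled;
}

/* Status side: read the progress under the same lock. */
static size_t copy_state_query(struct copy_state *cs)
{
	size_t n;

	pthread_mutex_lock(&cs->lock);
	n = cs->bytes_copied;
	pthread_mutex_unlock(&cs->lock);
	return n;
}

/* Cancel side: only mark the state; the worker notices after a chunk. */
static void copy_state_cancel(struct copy_state *cs)
{
	pthread_mutex_lock(&cs->lock);
	cs->cancelled = true;
	pthread_mutex_unlock(&cs->lock);
}

/* Chunked loop: each iteration stands in for one chunk-sized copy call. */
static size_t copy_in_chunks(struct copy_state *cs, size_t total, size_t chunk)
{
	size_t written = 0;

	while (written < total) {
		size_t n = total - written < chunk ? total - written : chunk;

		written += n;	/* stand-in for the actual data copy */
		if (copy_state_update(cs, written))
			break;
	}
	return written;
}
```

As in the patch, the cancel flag is only checked after a chunk has been
copied, so a copy that is cancelled before it starts still moves one
chunk in this model.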

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index f07eae1..8a1860e 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1390,6 +1390,7 @@ static int _nfsd_copy_file_range(struct nfsd4_copy *copy)
 		dst_pos += bytes_copied;
 		spin_lock(&copy->cps->cp_lock);
 		cancelled = copy->cps->cp_cancelled;
+		copy->cps->cp_bytes_copied = copy->cp_res.wr_bytes_written;
 		spin_unlock(&copy->cps->cp_lock);
 	} while (bytes_total > 0 && !cancelled);
 	return bytes_copied;
@@ -1630,7 +1631,18 @@ static void nfsd4_do_async_copy(struct work_struct *work)
 		     struct nfsd4_compound_state *cstate,
 		     struct nfsd4_offload_status *os)
 {
-	return nfserr_notsupp;
+	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+	__be32 status;
+	struct nfs4_cp_state *state = NULL;
+
+	status = find_cp_state(nn, &os->stateid, &state);
+
+	if (state) {
+		spin_lock(&state->cp_lock);
+		os->count = state->cp_bytes_copied;
+		spin_unlock(&state->cp_lock);
+	}
+	return status;
 }
 
 static __be32
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 70ee3fe..25c5d82 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -117,6 +117,7 @@ struct nfs4_cp_state {
 	bool			cp_dst_async;	/* async copy on dst server */
 	bool			cp_cancelled;	/* copy cancelled */
 	spinlock_t		cp_lock;
+	ssize_t			cp_bytes_copied;/* copy progress */
 };
 
 /*
-- 
1.8.3.1


* [RFC v1 18/18] NFSD remove copy stateid when vfs_copy_file_range completes
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (16 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 17/18] NFSD support OFFLOAD_STATUS Olga Kornievskaia
@ 2017-03-02 16:01 ` Olga Kornievskaia
  2017-03-17 21:21 ` [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
  2017-09-01 19:41 ` J. Bruce Fields
  19 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-02 16:01 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

Before this patch, the copy state information sticks around until the
destination file is closed. However, we can release the resources
earlier, when the copy is done.

In this patch, I choose to remove the stateid after the vfs copy is
done but before CB_OFFLOAD is done. The reason is simple: I then don't
need to keep track of the nfsd_net structure when doing the callback
(the alternative would be to copy the nfsd_net structure for the
callback as well). The drawback is that the copy state information is
available for query by OFFLOAD_STATUS for slightly less time.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4proc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 8a1860e..792cb7a 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1462,6 +1462,8 @@ static void nfsd4_do_async_copy(struct work_struct *work)
 		nfsd4_run_cb(&cb_copy->cp_cb);
 	}
 out:
+	list_del(&copy->cps->cp_list);
+	nfs4_free_cp_state(copy->cps);
 	nfs4_put_stid(copy->stid);
 	kfree(copy);
 }
-- 
1.8.3.1


* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (17 preceding siblings ...)
  2017-03-02 16:01 ` [RFC v1 18/18] NFSD remove copy stateid when vfs_copy_file_range completes Olga Kornievskaia
@ 2017-03-17 21:21 ` Olga Kornievskaia
  2017-03-20 15:30   ` J. Bruce Fields
  2017-09-01 19:41 ` J. Bruce Fields
  19 siblings, 1 reply; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-17 21:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Bruce,

Any comments on this?

Should I separate out the async copy support and send it out like I
did for the client side? I guess the idea is to get the async first
and then add inter when the VFS layer situation is resolved?


* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-03-17 21:21 ` [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
@ 2017-03-20 15:30   ` J. Bruce Fields
  2017-03-27 21:49     ` Olga Kornievskaia
  0 siblings, 1 reply; 36+ messages in thread
From: J. Bruce Fields @ 2017-03-20 15:30 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

On Fri, Mar 17, 2017 at 05:21:52PM -0400, Olga Kornievskaia wrote:
> Any comments on this?

I was hoping to get some more details from Christoph on his objections
to cross-superblock copies.  From the vfs point of view it seems not
that different from splice, so I'm not seeing a fundamental obstacle.

I'll give the patches a look.

> Should I separate out the async copy support and send it out like I
> did for the client side? I guess the idea is to get the async first
> and then add inter when the VFS layer situation is resolved?

If you submit the async stuff first, there needs to be some reason we'd
want it on its own.

The current server behavior is just to truncate and return a short copy
if the client requests a copy of more than 4MB.  That's very easy, but
there might be some situations where it strikes the wrong
balance--either 4MB is too large, and ties up an nfsd thread too long,
or it's too small, and doesn't give the filesystem enough to work with
at a time, or enough to amortize the cost of the round-trip back to the
client which has to submit a new read for the rest of the file.

In which case perhaps there's a smarter way to choose that number, I
don't know.

The asynchronous protocol is significantly more complicated, so before
doing that I'd like evidence that it's necessary--that we can't get the
same performance by breaking up the copy into short copies.
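
To make the trade-off concrete, a rough userspace sketch of a client
resubmitting after each short copy. It is not nfsd or client code: the
sketch_* names are invented for illustration, the "server" only does
arithmetic, and the 4MB cap is the one number taken from the behavior
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Per-request cap used by the hypothetical synchronous server. */
#define SKETCH_MAX_COPY ((size_t)4 << 20)

/* Hypothetical server: copies at most SKETCH_MAX_COPY bytes per request
 * and reports how much it actually copied (a short copy). */
static size_t sketch_server_copy(size_t requested)
{
	return requested < SKETCH_MAX_COPY ? requested : SKETCH_MAX_COPY;
}

/* Hypothetical client: resubmit the remainder until everything is
 * copied, counting one round trip per request. */
static size_t sketch_client_copy(size_t len, unsigned int *round_trips)
{
	size_t done = 0;

	*round_trips = 0;
	while (done < len) {
		size_t n = sketch_server_copy(len - done);

		(*round_trips)++;
		if (n == 0)
			break;	/* no progress; give up rather than spin */
		done += n;
	}
	return done;
}
```

With this cap a 1GB copy costs 256 round trips, so the question is
whether that per-request overhead is measurable next to the copy itself.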

--b.


* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-03-20 15:30   ` J. Bruce Fields
@ 2017-03-27 21:49     ` Olga Kornievskaia
  0 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-03-27 21:49 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Mon, Mar 20, 2017 at 11:30 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> On Fri, Mar 17, 2017 at 05:21:52PM -0400, Olga Kornievskaia wrote:
>> Any comments on this?
>
> I was hoping to get some more details from Christoph on his objections
> to cross-superblock copies.  From the vfs point of view it seems not
> that different from splice, so I'm not seeing a fundamental obstacle.
>
> I'll give the patches a look.
>
>> Should I separate out the async copy support and send it out like I
>> did for the client side? I guess the idea is to get the async first
>> and then add inter when the VFS layer situation is resolved?
>
> If you submit the async stuff first, there needs to be some reason we'd
> want it on its own.
>
> The current server behavior is just to truncate and return a short copy
> if the client requests a copy of more than 4MB.  That's very easy, but
> there might be some situations where it strikes the wrong
> balance--either 4MB is too large, and ties up an nfsd thread too long,
> or it's too small, and doesn't give the filesystem enough to work with
> at a time, or enough to amortize the cost of the round-trip back to the
> client which has to submit a new read for the rest of the file.
>
> In which case perhaps there's a smarter way to choose that number, I
> don't know.
>
> The asynchronous protocol is significantly more complicated, so before
> doing that I'd like evidence that it's necessary--that we can't get the
> same performance by breaking up the copy into short copies.


Here's what I have in my testing setup:
On a Mac Pro laptop: 3 VMs (1 CPU, 2G memory each).

size  | cp      | 4MB sync | async  | inter
------+---------+----------+--------+--------
256MB |  3.233s | 0.198s   | 0.194s |  1.809s
512MB |  6.460s | 0.406s   | 0.373s |  3.615s
1GB   | 12.357s | 1.182s   | 1.036s |  7.445s
2GB   | 26.470s | 2.258s   | 2.159s | 15.901s
4GB   | 52.108s | 4.766s   | 4.150s | 19.421s

In Anna's numbers plain 4.1 copy for 5GB was done in 15s ... in mine
it takes 52s to do 4GB.

"async" intra copy is just very slightly better than the sync intra.
"inter" is significantly faster than "cp".

Again my setup is all virtual. Real hardware testing is needed.


* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
                   ` (18 preceding siblings ...)
  2017-03-17 21:21 ` [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
@ 2017-09-01 19:41 ` J. Bruce Fields
  2017-09-01 19:42   ` J. Bruce Fields
  2017-09-01 19:48   ` Olga Kornievskaia
  19 siblings, 2 replies; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 19:41 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

Apologies, remind me:

	- does wireshark have support for all of this?  (Or are there
	  patches?)
	- How are you testing?
	- What's the status of the client side?
	- Do we have any user documentation?  Does the client or server
	  administrator need to do any special setup, or is it just a
	  matter of exporting and mounting over 4.1 and calling
	  copy_file_range()?
	- what currently happens if you try to copy across krb5 mounts?

--b.


* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-09-01 19:41 ` J. Bruce Fields
@ 2017-09-01 19:42   ` J. Bruce Fields
  2017-09-01 19:48   ` Olga Kornievskaia
  1 sibling, 0 replies; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 19:42 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

On Fri, Sep 01, 2017 at 03:41:30PM -0400, J. Bruce Fields wrote:
> Apologies, remind me:
> 
> 	- does wireshark have support for all of this?  (Or are there
> 	  patches?)
> 	- How are you testing?
> 	- What's the status of the client side?
> 	- Do we have any user documentation?  Does the client or server
> 	  administrator need to do any special setup, or is it just a
> 	  matter of exporting and mounting over 4.1 and calling

(Sorry, 4.2!)

--b.

> 	  copy_file_range()?
> 	- what currently happens if you try to copy across krb5 mounts?
> 
> --b.
> 
> On Thu, Mar 02, 2017 at 11:01:24AM -0500, Olga Kornievskaia wrote:
> > This is server-side support for NFSv4.2 inter and async COPY which is
> > on top of existing intra sync COPY. It also depends on the NFS client
> > piece for NFSv4.2 to do client side of the destination server piece
> > in the inter SSC.
> > 
> > NFSD determines if COPY is intra or inter and if sync or async. For
> > inter, NFSD uses NFSv4.1 protocol and creates an internal mount point
> > (superblock). It will destroy the mount point when copy is done.
> > 
> > To do asynchronous copies, NFSD creates a single threaded workqueue
> > and does not tie up an NFSD thread to complete the copy. Upon receiving
> > the COPY, it generates a unique copy stateid (stored in a global list
> > so that OFFLOAD_STATUS can query the state of the copy),
> > queues up a work item for the copy, and replies back to the client.
> > nfsd4_copy arguments that are allocated on the stack are copied for
> > the work item.
> > 
> > In the async copy handler, it calls into VFS copy_file_range() with
> > 4MB chunks and loops until it completes the requested copy size. If
> > an error is encountered, it is saved along with the amount of data
> > copied so far. Once done, the results are queued for the callback
> > workqueue and sent via CB_OFFLOAD. Currently, the copy state
> > information stored in the global list is cleaned up when the copy is
> > done rather than in the callback's release function (it could be
> > done there instead, if needed).
> > 
> > On the source server, upon receiving a COPY_NOTIFY, it generates a
> > unique stateid that's kept in the global list. Upon receiving a READ
> > with a stateid, the code checks the normal list of open stateid and
> > now additionally, it'll check the copy state list as well before
> > deciding to either fail with BAD_STATEID or find one that matches.
> > The stored stateid is only valid to be used for the first time
> > within a chosen lease period (90s currently). When the source server
> > receives an OFFLOAD_CANCEL, it will remove the stateid from the
> > global list. Otherwise, the copy stateid is removed upon the removal
> > of its "parent" stateid (open/lock/delegation stateid).
> > 
> > 
> > Andy Adamson (7):
> >   NFSD add ca_source_server<> to COPY
> >   NFSD generalize nfsd4_compound_state flag names
> >   NFSD: allow inter server COPY to have a STALE source server fh
> >   NFSD return nfs4_stid in nfs4_preprocess_stateid_op
> >   NFSD add COPY_NOTIFY operation
> >   NFSD add nfs4 inter ssc to nfsd4_copy
> >   NFSD Unique stateid_t for inter server to server COPY authentication
> > 
> > Olga Kornievskaia (10):
> >   NFSD CB_OFFLOAD xdr
> >   NFSD OFFLOAD_STATUS xdr
> >   NFSD OFFLOAD_CANCEL xdr
> >   NFSD xdr callback stateid in async COPY reply
> >   NFSD first draft of async copy
> >   NFSD handle OFFLOAD_CANCEL op
> >   NFSD stop queued async copies on client shutdown
> >   NFSD create new stateid for async copy
> >   NFSD define EBADF in nfserrno
> >   NFSD support OFFLOAD_STATUS
> > 
> >  fs/nfsd/Kconfig        |  10 +
> >  fs/nfsd/netns.h        |   8 +
> >  fs/nfsd/nfs4callback.c |  95 +++++++
> >  fs/nfsd/nfs4proc.c     | 704 ++++++++++++++++++++++++++++++++++++++++++++++---
> >  fs/nfsd/nfs4state.c    | 142 +++++++++-
> >  fs/nfsd/nfs4xdr.c      | 266 ++++++++++++++++++-
> >  fs/nfsd/nfsctl.c       |   2 +
> >  fs/nfsd/nfsd.h         |   2 +
> >  fs/nfsd/nfsproc.c      |   1 +
> >  fs/nfsd/state.h        |  32 ++-
> >  fs/nfsd/xdr4.h         |  53 +++-
> >  fs/nfsd/xdr4cb.h       |  10 +
> >  include/linux/nfs4.h   |   1 +
> >  13 files changed, 1273 insertions(+), 53 deletions(-)
> > 
> > -- 
> > 1.8.3.1
> > 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-09-01 19:41 ` J. Bruce Fields
  2017-09-01 19:42   ` J. Bruce Fields
@ 2017-09-01 19:48   ` Olga Kornievskaia
  2017-09-01 19:53     ` J. Bruce Fields
  1 sibling, 1 reply; 36+ messages in thread
From: Olga Kornievskaia @ 2017-09-01 19:48 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Olga Kornievskaia, linux-nfs

On Fri, Sep 1, 2017 at 3:41 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> Apologies, remind me:
>
>         - does wireshark have support for all of this?  (Or are there
>           patches?)

Yes, I submitted patches for wireshark and the latest version has them.

>         - How are you testing?

Jorge's nfstest utils via nfstest_ssc

>         - What's the status of the client side?

Anna had minor comments in the last review cycle, which I have addressed
but haven't reposted while waiting on the outcome of the server review. The
plan is also to update to 4.13x, after the constifying patches, for the next
submission. The client side has gone through a number of review cycles.

>         - Do we have any user documentation?

No additional documentation is needed beyond using the
copy_file_range() API and calling it between two mount points.

>  Does the client or server
>           administrator need to do any special setup, or is it just a
>           matter of exporting and mounting over 4.1 and calling
>           copy_file_range()?

just calling copy_file_range.

>         - what currently happens if you try to copy across krb5 mounts?

No GSSv3 is included in these patches.  The destination server will
mount the source server using auth_sys.

>
> --b.
>
> On Thu, Mar 02, 2017 at 11:01:24AM -0500, Olga Kornievskaia wrote:
>> This is server-side support for NFSv4.2 inter and async COPY which is
>> on top of existing intra sync COPY. It also depends on the NFS client
>> piece for NFSv4.2 to do client side of the destination server piece
>> in the inter SSC.
>>
>> NFSD determines if COPY is intra or inter and if sync or async. For
>> inter, NFSD uses NFSv4.1 protocol and creates an internal mount point
>> (superblock). It will destroy the mount point when copy is done.
>>
>> To do asynchronous copies, NFSD creates a single threaded workqueue
>> and does not tie up an NFSD thread to complete the copy. Upon receiving
>> the COPY, it generates a unique copy stateid (stored in a global list
>> so that OFFLOAD_STATUS can query the state of the copy),
>> queues up a work item for the copy, and replies back to the client.
>> nfsd4_copy arguments that are allocated on the stack are copied for
>> the work item.
>>
>> In the async copy handler, it calls into VFS copy_file_range() with
>> 4MB chunks and loops until it completes the requested copy size. If
>> an error is encountered, it is saved along with the amount of data
>> copied so far. Once done, the results are queued for the callback
>> workqueue and sent via CB_OFFLOAD. Currently, the copy state
>> information stored in the global list is cleaned up when the copy is
>> done rather than in the callback's release function (it could be
>> done there instead, if needed).
>>
>> On the source server, upon receiving a COPY_NOTIFY, it generates a
>> unique stateid that's kept in the global list. Upon receiving a READ
>> with a stateid, the code checks the normal list of open stateid and
>> now additionally, it'll check the copy state list as well before
>> deciding to either fail with BAD_STATEID or find one that matches.
>> The stored stateid is only valid to be used for the first time
>> within a chosen lease period (90s currently). When the source server
>> receives an OFFLOAD_CANCEL, it will remove the stateid from the
>> global list. Otherwise, the copy stateid is removed upon the removal
>> of its "parent" stateid (open/lock/delegation stateid).
>>
>>
>> Andy Adamson (7):
>>   NFSD add ca_source_server<> to COPY
>>   NFSD generalize nfsd4_compound_state flag names
>>   NFSD: allow inter server COPY to have a STALE source server fh
>>   NFSD return nfs4_stid in nfs4_preprocess_stateid_op
>>   NFSD add COPY_NOTIFY operation
>>   NFSD add nfs4 inter ssc to nfsd4_copy
>>   NFSD Unique stateid_t for inter server to server COPY authentication
>>
>> Olga Kornievskaia (10):
>>   NFSD CB_OFFLOAD xdr
>>   NFSD OFFLOAD_STATUS xdr
>>   NFSD OFFLOAD_CANCEL xdr
>>   NFSD xdr callback stateid in async COPY reply
>>   NFSD first draft of async copy
>>   NFSD handle OFFLOAD_CANCEL op
>>   NFSD stop queued async copies on client shutdown
>>   NFSD create new stateid for async copy
>>   NFSD define EBADF in nfserrno
>>   NFSD support OFFLOAD_STATUS
>>
>>  fs/nfsd/Kconfig        |  10 +
>>  fs/nfsd/netns.h        |   8 +
>>  fs/nfsd/nfs4callback.c |  95 +++++++
>>  fs/nfsd/nfs4proc.c     | 704 ++++++++++++++++++++++++++++++++++++++++++++++---
>>  fs/nfsd/nfs4state.c    | 142 +++++++++-
>>  fs/nfsd/nfs4xdr.c      | 266 ++++++++++++++++++-
>>  fs/nfsd/nfsctl.c       |   2 +
>>  fs/nfsd/nfsd.h         |   2 +
>>  fs/nfsd/nfsproc.c      |   1 +
>>  fs/nfsd/state.h        |  32 ++-
>>  fs/nfsd/xdr4.h         |  53 +++-
>>  fs/nfsd/xdr4cb.h       |  10 +
>>  include/linux/nfs4.h   |   1 +
>>  13 files changed, 1273 insertions(+), 53 deletions(-)
>>
>> --
>> 1.8.3.1
>>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC v1 01/18] NFSD add ca_source_server<> to COPY
  2017-03-02 16:01 ` [RFC v1 01/18] NFSD add ca_source_server<> to COPY Olga Kornievskaia
@ 2017-09-01 19:52   ` J. Bruce Fields
  2017-09-01 20:14     ` Olga Kornievskaia
  0 siblings, 1 reply; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 19:52 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

On Thu, Mar 02, 2017 at 11:01:25AM -0500, Olga Kornievskaia wrote:
> From: Andy Adamson <andros@netapp.com>
> 
> Note: followed conventions and have struct nfsd4_compoundargs pointer as a
> parameter even though it is unused.

I'd understand if nfsd4_decode_nl4_server was an op decoder, but it
looks like it's called by some other decoder?  In which case, there's no
need for the unused argument.

I can't find the definition for struct nl4_server anyway, was this
supposed to apply on top of another set of patches?

So if you send a COPY request with a source server list to the current
(unpatched) server, it looks like you just get back BADXDR?  That sounds
like a bug in the current server.  But I suppose the client may be stuck
with that behavior.  How does the client handle that error from COPY?

--b.

> 
> Signed-off-by: Andy Adamson <andros@netapp.com>
> ---
>  fs/nfsd/nfs4xdr.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  fs/nfsd/xdr4.h    |  4 +++
>  2 files changed, 77 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 382c1fd..f62cbad 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -41,6 +41,7 @@
>  #include <linux/utsname.h>
>  #include <linux/pagemap.h>
>  #include <linux/sunrpc/svcauth_gss.h>
> +#include <linux/sunrpc/addr.h>
>  
>  #include "idmap.h"
>  #include "acl.h"
> @@ -1726,11 +1727,58 @@ static __be32 nfsd4_decode_reclaim_complete(struct nfsd4_compoundargs *argp, str
>  	DECODE_TAIL;
>  }
>  
> +static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
> +				      struct nl4_server *ns)
> +{
> +	DECODE_HEAD;
> +	struct nfs42_netaddr *naddr;
> +
> +	READ_BUF(4);
> +	ns->nl4_type = be32_to_cpup(p++);
> +
> +	/* currently support for 1 inter-server source server */
> +	switch (ns->nl4_type) {
> +	case NL4_NAME:
> +	case NL4_URL:
> +		READ_BUF(4);
> +		ns->u.nl4_str_sz = be32_to_cpup(p++);
> +		if (ns->u.nl4_str_sz > NFS4_OPAQUE_LIMIT)
> +			goto xdr_error;
> +
> +		READ_BUF(ns->u.nl4_str_sz);
> +		COPYMEM(ns->u.nl4_str,
> +			ns->u.nl4_str_sz);
> +		break;
> +	case NL4_NETADDR:
> +		naddr = &ns->u.nl4_addr;
> +
> +		READ_BUF(4);
> +		naddr->na_netid_len = be32_to_cpup(p++);
> +		if (naddr->na_netid_len > RPCBIND_MAXNETIDLEN)
> +			goto xdr_error;
> +
> +		READ_BUF(naddr->na_netid_len + 4); /* 4 for uaddr len */
> +		COPYMEM(naddr->na_netid, naddr->na_netid_len);
> +
> +		naddr->na_uaddr_len = be32_to_cpup(p++);
> +		if (naddr->na_uaddr_len > RPCBIND_MAXUADDRLEN)
> +			goto xdr_error;
> +
> +		READ_BUF(naddr->na_uaddr_len);
> +		COPYMEM(naddr->na_uaddr, naddr->na_uaddr_len);
> +		break;
> +	default:
> +		goto xdr_error;
> +	}
> +	DECODE_TAIL;
> +}
> +
>  static __be32
>  nfsd4_decode_copy(struct nfsd4_compoundargs *argp, struct nfsd4_copy *copy)
>  {
>  	DECODE_HEAD;
> -	unsigned int tmp;
> +	struct nl4_server *ns;
> +	int i;
>  
>  	status = nfsd4_decode_stateid(argp, &copy->cp_src_stateid);
>  	if (status)
> @@ -1745,8 +1793,29 @@ static __be32 nfsd4_decode_reclaim_complete(struct nfsd4_compoundargs *argp, str
>  	p = xdr_decode_hyper(p, &copy->cp_count);
>  	copy->cp_consecutive = be32_to_cpup(p++);
>  	copy->cp_synchronous = be32_to_cpup(p++);
> -	tmp = be32_to_cpup(p); /* Source server list not supported */
> +	copy->cp_src.nl_nsvr = be32_to_cpup(p++);
>  
> +	if (copy->cp_src.nl_nsvr == 0) /* intra-server copy */
> +		goto intra;
> +
> +	/** Support NFSD4_MAX_SSC_SRC number of source servers.
> +	 * freed in nfsd4_encode_copy
> +	 */
> +	if (copy->cp_src.nl_nsvr > NFSD4_MAX_SSC_SRC)
> +		copy->cp_src.nl_nsvr = NFSD4_MAX_SSC_SRC;
> +	copy->cp_src.nl_svr = kmalloc(copy->cp_src.nl_nsvr *
> +					sizeof(struct nl4_server), GFP_KERNEL);
> +	if (copy->cp_src.nl_svr == NULL)
> +		return nfserrno(-ENOMEM);
> +
> +	ns = copy->cp_src.nl_svr;
> +	for (i = 0; i < copy->cp_src.nl_nsvr; i++) {
> +		status = nfsd4_decode_nl4_server(argp, ns);
> +		if (status)
> +			return status;
> +		ns++;
> +	}
> +intra:
>  	DECODE_TAIL;
>  }
>  
> @@ -4295,6 +4364,8 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>  		*p++ = cpu_to_be32(copy->cp_consecutive);
>  		*p++ = cpu_to_be32(copy->cp_synchronous);
>  	}
> +	/* allocated in nfsd4_decode_copy */
> +	kfree(copy->cp_src.nl_svr);
>  	return nfserr;
>  }
>  
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index 8fda4ab..6b1a61fc 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -509,6 +509,9 @@ struct nfsd42_write_res {
>  	nfs4_verifier		wr_verifier;
>  };
>  
> +/*  support 1 source server for now */
> +#define NFSD4_MAX_SSC_SRC       1
> +
>  struct nfsd4_copy {
>  	/* request */
>  	stateid_t	cp_src_stateid;
> @@ -516,6 +519,7 @@ struct nfsd4_copy {
>  	u64		cp_src_pos;
>  	u64		cp_dst_pos;
>  	u64		cp_count;
> +	struct nl4_servers cp_src;
>  
>  	/* both */
>  	bool		cp_consecutive;
> -- 
> 1.8.3.1
> 
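For readers following the NL4_NETADDR branch of the decoder in the patch above, here is a minimal userspace sketch of the same idea: every length read off the wire is checked against a maximum before any bytes are copied. The types and names (struct xdr_cursor, struct netaddr, decode_opaque, decode_netaddr) and the two limits are hypothetical stand-ins, not the kernel's definitions. Each opaque is rounded up to a multiple of four bytes, as XDR requires.

```c
#include <arpa/inet.h>	/* ntohl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limits for this sketch; not the kernel's constants. */
#define MAX_NETID_LEN 4
#define MAX_UADDR_LEN 56

struct netaddr {
	uint32_t netid_len;
	char netid[MAX_NETID_LEN + 1];
	uint32_t uaddr_len;
	char uaddr[MAX_UADDR_LEN + 1];
};

/* Hypothetical cursor over a received XDR buffer. */
struct xdr_cursor {
	const unsigned char *p;
	const unsigned char *end;
};

/* Read one big-endian 32-bit word; 0 on success, -1 if truncated. */
static int decode_u32(struct xdr_cursor *c, uint32_t *out)
{
	uint32_t be;

	if ((size_t)(c->end - c->p) < sizeof(be))
		return -1;
	memcpy(&be, c->p, sizeof(be));
	c->p += sizeof(be);
	*out = ntohl(be);
	return 0;
}

/* Read a length-prefixed opaque of at most max bytes, skipping padding. */
static int decode_opaque(struct xdr_cursor *c, char *dst, uint32_t max,
			 uint32_t *len)
{
	size_t padded;

	if (decode_u32(c, len) || *len > max)
		return -1;	/* truncated, or longer than we accept */
	padded = ((size_t)*len + 3) & ~(size_t)3;
	if ((size_t)(c->end - c->p) < padded)
		return -1;
	memcpy(dst, c->p, *len);
	dst[*len] = '\0';
	c->p += padded;
	return 0;
}

/* Decode netid followed by uaddr; 0 on success, -1 on any XDR error. */
int decode_netaddr(struct xdr_cursor *c, struct netaddr *na)
{
	assert(c != NULL && na != NULL);

	if (decode_opaque(c, na->netid, MAX_NETID_LEN, &na->netid_len))
		return -1;
	return decode_opaque(c, na->uaddr, MAX_UADDR_LEN, &na->uaddr_len);
}
```

Note that the helper takes only the cursor and the output structure, which is the shape a non-op decoder can have when it has no other state to consult.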

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-09-01 19:48   ` Olga Kornievskaia
@ 2017-09-01 19:53     ` J. Bruce Fields
  2017-09-01 20:02       ` Olga Kornievskaia
  0 siblings, 1 reply; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 19:53 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: Olga Kornievskaia, linux-nfs

On Fri, Sep 01, 2017 at 03:48:33PM -0400, Olga Kornievskaia wrote:
> On Fri, Sep 1, 2017 at 3:41 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >         - what currently happens if you try to copy across krb5 mounts?
> 
> No GSSv3 is included in these patches.  The destination server will
> mount the source server using auth_sys.

Assuming that doesn't work--how is the failure handled?

--b.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-09-01 19:53     ` J. Bruce Fields
@ 2017-09-01 20:02       ` Olga Kornievskaia
  2017-09-01 20:09         ` J. Bruce Fields
  0 siblings, 1 reply; 36+ messages in thread
From: Olga Kornievskaia @ 2017-09-01 20:02 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Olga Kornievskaia, linux-nfs


> On Sep 1, 2017, at 3:53 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>
> On Fri, Sep 01, 2017 at 03:48:33PM -0400, Olga Kornievskaia wrote:
>> On Fri, Sep 1, 2017 at 3:41 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>        - what currently happens if you try to copy across krb5 mounts?
>>
>> No GSSv3 is included in these patches.  The destination server will
>> mount the source server using auth_sys.
>
> Assuming that doesn't work--how is the failure handled?

If mount fails? Destination server returns an error in COPY (whatever
 vfs_kern_mount can return). Client calls generic nfs4_handle_exception()
but it’s probably a kind of error it doesn’t handle so it’ll be translated to EIO.
What kind of error are you thinking about?


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-09-01 20:02       ` Olga Kornievskaia
@ 2017-09-01 20:09         ` J. Bruce Fields
  2017-09-01 20:34           ` Olga Kornievskaia
  0 siblings, 1 reply; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 20:09 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: Olga Kornievskaia, linux-nfs

On Fri, Sep 01, 2017 at 04:02:48PM -0400, Olga Kornievskaia wrote:
> 
> > On Sep 1, 2017, at 3:53 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > On Fri, Sep 01, 2017 at 03:48:33PM -0400, Olga Kornievskaia wrote:
> >> On Fri, Sep 1, 2017 at 3:41 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>>        - what currently happens if you try to copy across krb5 mounts?
> >> 
> >> No GSSv3 is included in these patches.  The destination server will
> >> mount the source server using auth_sys.
> > 
> > Assuming that doesn't work--how is the failure handled?
> 
> If mount fails? Destination server returns an error in COPY (whatever
>  vfs_kern_mount can return). Client calls generic nfs4_handle_exception() 
> but it’s probably a kind of error it doesn’t handle so it’ll be translated to EIO. 
> What kind of error are you thinking about?

I just want to make sure that copy_file_range() caller knows what to do
when it encounters this situation.

The typical application probably wants to fall back on a read-write loop
in the case inter-server copy isn't supported between the given mounts?

EIO doesn't sound like the most helpful error to me, but whatever error
it is, it should be documented in the copy_file_range man page so that
callers know how to check for this case.

--b.
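
A minimal userspace sketch of the fallback discussed above, assuming the caller wants to retry with a read/write loop whenever copy_file_range() reports that the offloaded copy is not possible. The helper names (should_fall_back, copy_by_read_write, copy_range_with_fallback) are hypothetical, and the errno list is an assumption: the thread has not settled which error the client should return, and EIO is listed only because that is what the client is said to report today.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Assumption: errno values for which a byte-copy fallback is attempted.
 * EIO is here only because the thread says the client currently reports
 * it when the inter-server copy cannot be set up.
 */
static bool should_fall_back(int err)
{
	return err == EXDEV || err == ENOSYS || err == EOPNOTSUPP ||
	       err == EINVAL || err == EIO;
}

/* Plain read()/write() loop using the current offsets of both fds. */
static ssize_t copy_by_read_write(int fd_in, int fd_out, size_t len)
{
	char buf[64 * 1024];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t n = read(fd_in, buf, want);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			break;	/* end of source file */

		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(fd_out, buf + off, (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		done += (size_t)n;
	}
	return (ssize_t)done;
}

/*
 * Copy up to len bytes from fd_in to fd_out at their current offsets.
 * Returns the number of bytes copied, or -1 with errno set.
 */
ssize_t copy_range_with_fallback(int fd_in, int fd_out, size_t len)
{
	size_t done = 0;

	assert(fd_in >= 0 && fd_out >= 0);

	while (done < len) {
		ssize_t n = copy_file_range(fd_in, NULL, fd_out, NULL,
					    len - done, 0);

		if (n > 0) {
			done += (size_t)n;
			continue;
		}
		if (n == 0)
			break;	/* end of source file */
		if (errno == EINTR)
			continue;
		if (should_fall_back(errno)) {
			/* Offsets already reflect 'done', so just continue. */
			ssize_t rest = copy_by_read_write(fd_in, fd_out,
							  len - done);
			return rest < 0 ? -1 : (ssize_t)(done + (size_t)rest);
		}
		return -1;
	}
	return (ssize_t)done;
}
```

Both descriptors are used at their current file offsets, so a copy that fails partway can resume in the fallback loop without re-copying data.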

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC v1 01/18] NFSD add ca_source_server<> to COPY
  2017-09-01 19:52   ` J. Bruce Fields
@ 2017-09-01 20:14     ` Olga Kornievskaia
  0 siblings, 0 replies; 36+ messages in thread
From: Olga Kornievskaia @ 2017-09-01 20:14 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs


> On Sep 1, 2017, at 3:52 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>
> On Thu, Mar 02, 2017 at 11:01:25AM -0500, Olga Kornievskaia wrote:
>> From: Andy Adamson <andros@netapp.com>
>>
>> Note: followed conventions and have struct nfsd4_compoundargs pointer as a
>> parameter even though it is unused.
>
> I'd understand if nfsd4_decode_nl4_server was an op decoder, but it
> looks like it's called by some other decoder?  In which case, there's no
> need for the unused argument.
>
> I can't find the definition for struct nl4_server anyway, was this
> supposed to apply on top of another set of patches?

nl4_server is defined in patch 16 “NFS NFSD defining nl4_servers structure need by both”

> So if you send a COPY request with a source server list to the current
> (unpatched) server, it looks like you just get back BADXDR?

No it fails with ERR_STALE because unpatched server doesn’t have patch 0027
"NFSD allow inter server COPY to have a STALE source server fh"

> That sounds
> like a bug in the current server.  But I suppose the client may be stuck
> with that behavior.  How does the client handle that error from COPY?

Client fails the copy with EIO.

>
> --b.
>
>>
>> Signed-off-by: Andy Adamson <andros@netapp.com>
>> ---
>> fs/nfsd/nfs4xdr.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>> fs/nfsd/xdr4.h    |  4 +++
>> 2 files changed, 77 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index 382c1fd..f62cbad 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -41,6 +41,7 @@
>> #include <linux/utsname.h>
>> #include <linux/pagemap.h>
>> #include <linux/sunrpc/svcauth_gss.h>
>> +#include <linux/sunrpc/addr.h>
>>
>> #include "idmap.h"
>> #include "acl.h"
>> @@ -1726,11 +1727,58 @@ static __be32 nfsd4_decode_reclaim_complete(struct nfsd4_compoundargs *argp, str
>> 	DECODE_TAIL;
>> }
>>
>> +static __be32 nfsd4_decode_nl4_server(struct nfsd4_compoundargs *argp,
>> +				      struct nl4_server *ns)
>> +{
>> +	DECODE_HEAD;
>> +	struct nfs42_netaddr *naddr;
>> +
>> +	READ_BUF(4);
>> +	ns->nl4_type = be32_to_cpup(p++);
>> +
>> +	/* currently support for 1 inter-server source server */
>> +	switch (ns->nl4_type) {
>> +	case NL4_NAME:
>> +	case NL4_URL:
>> +		READ_BUF(4);
>> +		ns->u.nl4_str_sz = be32_to_cpup(p++);
>> +		if (ns->u.nl4_str_sz > NFS4_OPAQUE_LIMIT)
>> +			goto xdr_error;
>> +
>> +		READ_BUF(ns->u.nl4_str_sz);
>> +		COPYMEM(ns->u.nl4_str,
>> +			ns->u.nl4_str_sz);
>> +		break;
>> +	case NL4_NETADDR:
>> +		naddr = &ns->u.nl4_addr;
>> +
>> +		READ_BUF(4);
>> +		naddr->na_netid_len = be32_to_cpup(p++);
>> +		if (naddr->na_netid_len > RPCBIND_MAXNETIDLEN)
>> +			goto xdr_error;
>> +
>> +		READ_BUF(naddr->na_netid_len + 4); /* 4 for uaddr len */
>> +		COPYMEM(naddr->na_netid, naddr->na_netid_len);
>> +
>> +		naddr->na_uaddr_len = be32_to_cpup(p++);
>> +		if (naddr->na_uaddr_len > RPCBIND_MAXUADDRLEN)
>> +			goto xdr_error;
>> +
>> +		READ_BUF(naddr->na_uaddr_len);
>> +		COPYMEM(naddr->na_uaddr, naddr->na_uaddr_len);
>> +		break;
>> +	default:
>> +		goto xdr_error;
>> +	}
>> +	DECODE_TAIL;
>> +}
>> +
>> static __be32
>> nfsd4_decode_copy(struct nfsd4_compoundargs *argp, struct nfsd4_copy *copy)
>> {
>> 	DECODE_HEAD;
>> -	unsigned int tmp;
>> +	struct nl4_server *ns;
>> +	int i;
>>
>> 	status = nfsd4_decode_stateid(argp, &copy->cp_src_stateid);
>> 	if (status)
>> @@ -1745,8 +1793,29 @@ static __be32 nfsd4_decode_reclaim_complete(struct nfsd4_compoundargs *argp, str
>> 	p = xdr_decode_hyper(p, &copy->cp_count);
>> 	copy->cp_consecutive = be32_to_cpup(p++);
>> 	copy->cp_synchronous = be32_to_cpup(p++);
>> -	tmp = be32_to_cpup(p); /* Source server list not supported */
>> +	copy->cp_src.nl_nsvr = be32_to_cpup(p++);
>>
>> +	if (copy->cp_src.nl_nsvr == 0) /* intra-server copy */
>> +		goto intra;
>> +
>> +	/** Support NFSD4_MAX_SSC_SRC number of source servers.
>> +	 * freed in nfsd4_encode_copy
>> +	 */
>> +	if (copy->cp_src.nl_nsvr > NFSD4_MAX_SSC_SRC)
>> +		copy->cp_src.nl_nsvr = NFSD4_MAX_SSC_SRC;
>> +	copy->cp_src.nl_svr = kmalloc(copy->cp_src.nl_nsvr *
>> +					sizeof(struct nl4_server), GFP_KERNEL);
>> +	if (copy->cp_src.nl_svr == NULL)
>> +		return nfserrno(-ENOMEM);
>> +
>> +	ns = copy->cp_src.nl_svr;
>> +	for (i = 0; i < copy->cp_src.nl_nsvr; i++) {
>> +		status = nfsd4_decode_nl4_server(argp, ns);
>> +		if (status)
>> +			return status;
>> +		ns++;
>> +	}
>> +intra:
>> 	DECODE_TAIL;
>> }
>>
>> @@ -4295,6 +4364,8 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>> 		*p++ = cpu_to_be32(copy->cp_consecutive);
>> 		*p++ = cpu_to_be32(copy->cp_synchronous);
>> 	}
>> +	/* allocated in nfsd4_decode_copy */
>> +	kfree(copy->cp_src.nl_svr);
>> 	return nfserr;
>> }
>>
>> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
>> index 8fda4ab..6b1a61fc 100644
>> --- a/fs/nfsd/xdr4.h
>> +++ b/fs/nfsd/xdr4.h
>> @@ -509,6 +509,9 @@ struct nfsd42_write_res {
>> 	nfs4_verifier		wr_verifier;
>> };
>>
>> +/*  support 1 source server for now */
>> +#define NFSD4_MAX_SSC_SRC       1
>> +
>> struct nfsd4_copy {
>> 	/* request */
>> 	stateid_t	cp_src_stateid;
>> @@ -516,6 +519,7 @@ struct nfsd4_copy {
>> 	u64		cp_src_pos;
>> 	u64		cp_dst_pos;
>> 	u64		cp_count;
>> +	struct nl4_servers cp_src;
>>
>> 	/* both */
>> 	bool		cp_consecutive;
>> -- 
>> 1.8.3.1
>>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh
  2017-03-02 16:01 ` [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh Olga Kornievskaia
@ 2017-09-01 20:23   ` J. Bruce Fields
  2017-09-01 20:25     ` Olga Kornievskaia
  0 siblings, 1 reply; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 20:23 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

On Thu, Mar 02, 2017 at 11:01:28AM -0500, Olga Kornievskaia wrote:
> From: Andy Adamson <andros@netapp.com>
> 
> The inter server to server COPY source server filehandle
> is guaranteed to be stale as the COPY is sent to the destination
> server.

This is definitely not true.  That filehandle could very well mean
something to the source server, even if it's just by accident.

In the case the source filehandle refers to a file on a different
server, nfsd knows that, and should not call fh_verify on it at all.

--b.

> 
> Signed-off-by: Andy Adamson <andros@netapp.com>
> ---
>  fs/nfsd/nfs4proc.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>  fs/nfsd/nfs4xdr.c  | 26 +++++++++++++++++++++++++-
>  fs/nfsd/nfsd.h     |  2 ++
>  fs/nfsd/xdr4.h     |  4 ++++
>  4 files changed, 77 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index a680c8c..733a9aa 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -496,11 +496,19 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
>  nfsd4_putfh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	    struct nfsd4_putfh *putfh)
>  {
> +	__be32 ret;
> +
>  	fh_put(&cstate->current_fh);
>  	cstate->current_fh.fh_handle.fh_size = putfh->pf_fhlen;
>  	memcpy(&cstate->current_fh.fh_handle.fh_base, putfh->pf_fhval,
>  	       putfh->pf_fhlen);
> -	return fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
> +	ret = fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
> +	if (ret == nfserr_stale && HAS_CSTATE_FLAG(cstate, NO_VERIFY_FH)) {
> +		CLEAR_CSTATE_FLAG(cstate, NO_VERIFY_FH);
> +		SET_CSTATE_FLAG(cstate, IS_STALE_FH);
> +		ret = 0;
> +	}
> +	return ret;
>  }
>  
>  static __be32
> @@ -533,6 +541,16 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
>  nfsd4_savefh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	     void *arg)
>  {
> +	/**
> +	* This is either an inter COPY (most likely) or an intra COPY with a
> +	* stale file handle. If the latter, nfsd4_copy will reset the PUTFH to
> +	* return nfserr_stale. No fh_dentry, just copy the file handle
> +	* to use with the inter COPY READ.
> +	*/
> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
> +		cstate->save_fh = cstate->current_fh;
> +		return nfs_ok;
> +	}
>  	if (!cstate->current_fh.fh_dentry)
>  		return nfserr_nofilehandle;
>  
> @@ -1067,6 +1085,13 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
>  	if (status)
>  		goto out;
>  
> +	/* Intra copy source fh is stale. PUTFH will fail with ESTALE */
> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
> +		CLEAR_CSTATE_FLAG(cstate, IS_STALE_FH);
> +		cstate->status = nfserr_copy_stalefh;
> +		goto out_put;
> +	}
> +
>  	bytes = nfsd_copy_file_range(src, copy->cp_src_pos,
>  			dst, copy->cp_dst_pos, copy->cp_count);
>  
> @@ -1081,6 +1106,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
>  		status = nfs_ok;
>  	}
>  
> +out_put:
>  	fput(src);
>  	fput(dst);
>  out:
> @@ -1776,6 +1802,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>  	struct nfsd4_compound_state *cstate = &resp->cstate;
>  	struct svc_fh *current_fh = &cstate->current_fh;
>  	struct svc_fh *save_fh = &cstate->save_fh;
> +	int		i;
>  	__be32		status;
>  
>  	svcxdr_init_encode(rqstp, resp);
> @@ -1808,6 +1835,12 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>  		goto encode_op;
>  	}
>  
> +	/* NFSv4.2 COPY source file handle may be from a different server */
> +	for (i = 0; i < args->opcnt; i++) {
> +		op = &args->ops[i];
> +		if (op->opnum == OP_COPY)
> +			SET_CSTATE_FLAG(cstate, NO_VERIFY_FH);
> +	}
>  	while (!status && resp->opcnt < args->opcnt) {
>  		op = &args->ops[resp->opcnt++];
>  
> @@ -1827,6 +1860,9 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>  
>  		opdesc = OPDESC(op);
>  
> +		if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
> +			goto call_op;
> +
>  		if (!current_fh->fh_dentry) {
>  			if (!(opdesc->op_flags & ALLOWED_WITHOUT_FH)) {
>  				op->status = nfserr_nofilehandle;
> @@ -1861,6 +1897,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>  
>  		if (opdesc->op_get_currentstateid)
>  			opdesc->op_get_currentstateid(cstate, &op->u);
> +call_op:
>  		op->status = opdesc->op_func(rqstp, cstate, &op->u);
>  
>  		if (!op->status) {
> @@ -1881,6 +1918,14 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>  			status = op->status;
>  			goto out;
>  		}
> +		/* Only from intra COPY */
> +		if (cstate->status == nfserr_copy_stalefh) {
> +			dprintk("%s NFS4.2 intra COPY stale src filehandle\n",
> +				__func__);
> +			status = nfserr_stale;
> +			nfsd4_adjust_encode(resp);
> +			goto out;
> +		}
>  		if (op->status == nfserr_replay_me) {
>  			op->replay = &cstate->replay_owner->so_replay;
>  			nfsd4_encode_replay(&resp->xdr, op);
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index c632156..328ff9c 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -4619,15 +4619,28 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
>  	return nfserr_rep_too_big;
>  }
>  
> +/** Rewind the encoding to return nfserr_stale on the PUTFH
> + * in this failed Intra COPY compound
> + */
> +void
> +nfsd4_adjust_encode(struct nfsd4_compoundres *resp)
> +{
> +	__be32 *p;
> +
> +	p = resp->cstate.putfh_errp;
> +	*p++ = nfserr_stale;
> +}
> +
>  void
>  nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
>  {
>  	struct xdr_stream *xdr = &resp->xdr;
>  	struct nfs4_stateowner *so = resp->cstate.replay_owner;
> +	struct nfsd4_compound_state *cstate = &resp->cstate;
>  	struct svc_rqst *rqstp = resp->rqstp;
>  	int post_err_offset;
>  	nfsd4_enc encoder;
> -	__be32 *p;
> +	__be32 *p, *statp;
>  
>  	p = xdr_reserve_space(xdr, 8);
>  	if (!p) {
> @@ -4636,9 +4649,20 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
>  	}
>  	*p++ = cpu_to_be32(op->opnum);
>  	post_err_offset = xdr->buf->len;
> +	statp = p;
>  
>  	if (op->opnum == OP_ILLEGAL)
>  		goto status;
> +
> +	/** This is a COPY compound with a stale source server file handle.
> +	 * If OP_COPY processing determines that this is an intra server to
> +	 * server COPY, then this PUTFH should return nfserr_ stale so the
> +	 * putfh_errp will be set to nfserr_stale. If this is an inter server
> +	 * to server COPY, ignore the nfserr_stale.
> +	 */
> +	if (op->opnum == OP_PUTFH && HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
> +		cstate->putfh_errp = statp;
> +
>  	BUG_ON(op->opnum < 0 || op->opnum >= ARRAY_SIZE(nfsd4_enc_ops) ||
>  	       !nfsd4_enc_ops[op->opnum]);
>  	encoder = nfsd4_enc_ops[op->opnum];
> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> index d966068..8d6fb0f 100644
> --- a/fs/nfsd/nfsd.h
> +++ b/fs/nfsd/nfsd.h
> @@ -272,6 +272,8 @@ static inline bool nfsd4_spo_must_allow(struct svc_rqst *rqstp)
>  #define	nfserr_replay_me	cpu_to_be32(11001)
>  /* nfs41 replay detected */
>  #define	nfserr_replay_cache	cpu_to_be32(11002)
> +/* nfs42 intra copy failed with nfserr_stale */
> +#define nfserr_copy_stalefh	cpu_to_be32(1103)
>  
>  /* Check for dir entries '.' and '..' */
>  #define isdotent(n, l)	(l < 3 && n[0] == '.' && (l == 1 || n[1] == '.'))
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index 38fcb4f..aa94295 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -45,6 +45,8 @@
>  
>  #define CURRENT_STATE_ID_FLAG (1<<0)
>  #define SAVED_STATE_ID_FLAG (1<<1)
> +#define NO_VERIFY_FH (1<<2)
> +#define IS_STALE_FH  (1<<3)
>  
>  #define SET_CSTATE_FLAG(c, f) ((c)->sid_flags |= (f))
>  #define HAS_CSTATE_FLAG(c, f) ((c)->sid_flags & (f))
> @@ -63,6 +65,7 @@ struct nfsd4_compound_state {
>  	size_t			iovlen;
>  	u32			minorversion;
>  	__be32			status;
> +	__be32			*putfh_errp;
>  	stateid_t	current_stateid;
>  	stateid_t	save_stateid;
>  	/* to indicate current and saved state id presents */
> @@ -705,6 +708,7 @@ int nfs4svc_decode_compoundargs(struct svc_rqst *, __be32 *,
>  int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *,
>  		struct nfsd4_compoundres *);
>  __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
> +void nfsd4_adjust_encode(struct nfsd4_compoundres *);
>  void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *);
>  void nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op);
>  __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
> -- 
> 1.8.3.1
> 
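
For readers following the nfs4xdr.c hunk quoted above: the patch remembers
where the PUTFH status word was encoded (statp, stored in putfh_errp), and
nfsd4_adjust_encode() later overwrites that word with nfserr_stale. A minimal
user-space sketch of that back-patching idea follows. It is not the kernel
code: struct reply, encode_op() and adjust_encode() are hypothetical names,
the buffer is a flat array rather than an xdr_stream, and only the numeric
values (PUTFH = 22, NFS4ERR_STALE = 70) come from the NFSv4 protocol.

```c
#include <arpa/inet.h>	/* htonl, ntohl */
#include <stddef.h>
#include <stdint.h>

/* Protocol numbers from NFSv4; everything else here is illustrative. */
enum { OP_PUTFH = 22, NFS4_OK = 0, NFS4ERR_STALE = 70 };

/* Hypothetical stand-in for the reply stream plus per-compound state. */
struct reply {
	uint32_t words[16];	/* flat reply buffer, big-endian words */
	size_t   used;		/* number of words encoded so far */
	uint32_t *putfh_errp;	/* remembered status slot of the PUTFH */
};

/* Encode "opnum, status" and remember where a PUTFH status word lives. */
static void encode_op(struct reply *r, uint32_t opnum, uint32_t status)
{
	r->words[r->used++] = htonl(opnum);
	if (opnum == OP_PUTFH)
		r->putfh_errp = &r->words[r->used];
	r->words[r->used++] = htonl(status);
}

/* Later, overwrite the already-encoded PUTFH status in place. */
static void adjust_encode(struct reply *r)
{
	if (r->putfh_errp)
		*r->putfh_errp = htonl(NFS4ERR_STALE);
}
```

The saved pointer is only meaningful while the buffer it points into is still
in place, which is presumably why the patch keeps it in the per-compound
state rather than anywhere longer-lived.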


* Re: [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh
  2017-09-01 20:23   ` J. Bruce Fields
@ 2017-09-01 20:25     ` Olga Kornievskaia
  2017-09-01 21:16       ` J. Bruce Fields
  0 siblings, 1 reply; 36+ messages in thread
From: Olga Kornievskaia @ 2017-09-01 20:25 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs


> On Sep 1, 2017, at 4:23 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>
> On Thu, Mar 02, 2017 at 11:01:28AM -0500, Olga Kornievskaia wrote:
>> From: Andy Adamson <andros@netapp.com>
>>
>> The inter server to server COPY source server filehandle
>> is guaranteed to be stale as the COPY is sent to the destination
>> server.
>
> This is definitely not true.  That filehandle could very well mean
> something to the source server, even if it's just by accident.
>
> In the case the source filehandle refers to a file on a different
> server, nfsd knows that, and should not call fh_verify on it at all.

At the time of processing the FH it doesn’t know that there is a COPY
operation coming in the compound. So how does it know it refers to
a file on a different server?

>
> --b.
>
>>
>> Signed-off-by: Andy Adamson <andros@netapp.com>
>> ---
>> fs/nfsd/nfs4proc.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>> fs/nfsd/nfs4xdr.c  | 26 +++++++++++++++++++++++++-
>> fs/nfsd/nfsd.h     |  2 ++
>> fs/nfsd/xdr4.h     |  4 ++++
>> 4 files changed, 77 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> index a680c8c..733a9aa 100644
>> --- a/fs/nfsd/nfs4proc.c
>> +++ b/fs/nfsd/nfs4proc.c
>> @@ -496,11 +496,19 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
>> nfsd4_putfh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>> 	    struct nfsd4_putfh *putfh)
>> {
>> +	__be32 ret;
>> +
>> 	fh_put(&cstate->current_fh);
>> 	cstate->current_fh.fh_handle.fh_size = putfh->pf_fhlen;
>> 	memcpy(&cstate->current_fh.fh_handle.fh_base, putfh->pf_fhval,
>> 	       putfh->pf_fhlen);
>> -	return fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
>> +	ret = fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
>> +	if (ret == nfserr_stale && HAS_CSTATE_FLAG(cstate, NO_VERIFY_FH)) {
>> +		CLEAR_CSTATE_FLAG(cstate, NO_VERIFY_FH);
>> +		SET_CSTATE_FLAG(cstate, IS_STALE_FH);
>> +		ret = 0;
>> +	}
>> +	return ret;
>> }
>>
>> static __be32
>> @@ -533,6 +541,16 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
>> nfsd4_savefh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>> 	     void *arg)
>> {
>> +	/**
>> +	* This is either an inter COPY (most likely) or an intra COPY with a
>> +	* stale file handle. If the latter, nfsd4_copy will reset the PUTFH to
>> +	* return nfserr_stale. No fh_dentry, just copy the file handle
>> +	* to use with the inter COPY READ.
>> +	*/
>> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
>> +		cstate->save_fh = cstate->current_fh;
>> +		return nfs_ok;
>> +	}
>> 	if (!cstate->current_fh.fh_dentry)
>> 		return nfserr_nofilehandle;
>>
>> @@ -1067,6 +1085,13 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
>> 	if (status)
>> 		goto out;
>>
>> +	/* Intra copy source fh is stale. PUTFH will fail with ESTALE */
>> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
>> +		CLEAR_CSTATE_FLAG(cstate, IS_STALE_FH);
>> +		cstate->status = nfserr_copy_stalefh;
>> +		goto out_put;
>> +	}
>> +
>> 	bytes = nfsd_copy_file_range(src, copy->cp_src_pos,
>> 			dst, copy->cp_dst_pos, copy->cp_count);
>>
>> @@ -1081,6 +1106,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
>> 		status = nfs_ok;
>> 	}
>>
>> +out_put:
>> 	fput(src);
>> 	fput(dst);
>> out:
>> @@ -1776,6 +1802,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>> 	struct nfsd4_compound_state *cstate = &resp->cstate;
>> 	struct svc_fh *current_fh = &cstate->current_fh;
>> 	struct svc_fh *save_fh = &cstate->save_fh;
>> +	int		i;
>> 	__be32		status;
>>
>> 	svcxdr_init_encode(rqstp, resp);
>> @@ -1808,6 +1835,12 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>> 		goto encode_op;
>> 	}
>>
>> +	/* NFSv4.2 COPY source file handle may be from a different server */
>> +	for (i = 0; i < args->opcnt; i++) {
>> +		op = &args->ops[i];
>> +		if (op->opnum == OP_COPY)
>> +			SET_CSTATE_FLAG(cstate, NO_VERIFY_FH);
>> +	}
>> 	while (!status && resp->opcnt < args->opcnt) {
>> 		op = &args->ops[resp->opcnt++];
>>
>> @@ -1827,6 +1860,9 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>>
>> 		opdesc = OPDESC(op);
>>
>> +		if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
>> +			goto call_op;
>> +
>> 		if (!current_fh->fh_dentry) {
>> 			if (!(opdesc->op_flags & ALLOWED_WITHOUT_FH)) {
>> 				op->status = nfserr_nofilehandle;
>> @@ -1861,6 +1897,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>>
>> 		if (opdesc->op_get_currentstateid)
>> 			opdesc->op_get_currentstateid(cstate, &op->u);
>> +call_op:
>> 		op->status = opdesc->op_func(rqstp, cstate, &op->u);
>>
>> 		if (!op->status) {
>> @@ -1881,6 +1918,14 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
>> 			status = op->status;
>> 			goto out;
>> 		}
>> +		/* Only from intra COPY */
>> +		if (cstate->status == nfserr_copy_stalefh) {
>> +			dprintk("%s NFS4.2 intra COPY stale src filehandle\n",
>> +				__func__);
>> +			status = nfserr_stale;
>> +			nfsd4_adjust_encode(resp);
>> +			goto out;
>> +		}
>> 		if (op->status == nfserr_replay_me) {
>> 			op->replay = &cstate->replay_owner->so_replay;
>> 			nfsd4_encode_replay(&resp->xdr, op);
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index c632156..328ff9c 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -4619,15 +4619,28 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
>> 	return nfserr_rep_too_big;
>> }
>>
>> +/** Rewind the encoding to return nfserr_stale on the PUTFH
>> + * in this failed Intra COPY compound
>> + */
>> +void
>> +nfsd4_adjust_encode(struct nfsd4_compoundres *resp)
>> +{
>> +	__be32 *p;
>> +
>> +	p = resp->cstate.putfh_errp;
>> +	*p++ = nfserr_stale;
>> +}
>> +
>> void
>> nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
>> {
>> 	struct xdr_stream *xdr = &resp->xdr;
>> 	struct nfs4_stateowner *so = resp->cstate.replay_owner;
>> +	struct nfsd4_compound_state *cstate = &resp->cstate;
>> 	struct svc_rqst *rqstp = resp->rqstp;
>> 	int post_err_offset;
>> 	nfsd4_enc encoder;
>> -	__be32 *p;
>> +	__be32 *p, *statp;
>>
>> 	p = xdr_reserve_space(xdr, 8);
>> 	if (!p) {
>> @@ -4636,9 +4649,20 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
>> 	}
>> 	*p++ = cpu_to_be32(op->opnum);
>> 	post_err_offset = xdr->buf->len;
>> +	statp = p;
>>
>> 	if (op->opnum == OP_ILLEGAL)
>> 		goto status;
>> +
>> +	/** This is a COPY compound with a stale source server file handle.
>> +	 * If OP_COPY processing determines that this is an intra server to
>> +	 * server COPY, then this PUTFH should return nfserr_ stale so the
>> +	 * putfh_errp will be set to nfserr_stale. If this is an inter server
>> +	 * to server COPY, ignore the nfserr_stale.
>> +	 */
>> +	if (op->opnum == OP_PUTFH && HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
>> +		cstate->putfh_errp = statp;
>> +
>> 	BUG_ON(op->opnum < 0 || op->opnum >= ARRAY_SIZE(nfsd4_enc_ops) ||
>> 	       !nfsd4_enc_ops[op->opnum]);
>> 	encoder = nfsd4_enc_ops[op->opnum];
>> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
>> index d966068..8d6fb0f 100644
>> --- a/fs/nfsd/nfsd.h
>> +++ b/fs/nfsd/nfsd.h
>> @@ -272,6 +272,8 @@ static inline bool nfsd4_spo_must_allow(struct svc_rqst *rqstp)
>> #define	nfserr_replay_me	cpu_to_be32(11001)
>> /* nfs41 replay detected */
>> #define	nfserr_replay_cache	cpu_to_be32(11002)
>> +/* nfs42 intra copy failed with nfserr_stale */
>> +#define nfserr_copy_stalefh	cpu_to_be32(1103)
>>
>> /* Check for dir entries '.' and '..' */
>> #define isdotent(n, l)	(l < 3 && n[0] == '.' && (l == 1 || n[1] == '.'))
>> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
>> index 38fcb4f..aa94295 100644
>> --- a/fs/nfsd/xdr4.h
>> +++ b/fs/nfsd/xdr4.h
>> @@ -45,6 +45,8 @@
>>
>> #define CURRENT_STATE_ID_FLAG (1<<0)
>> #define SAVED_STATE_ID_FLAG (1<<1)
>> +#define NO_VERIFY_FH (1<<2)
>> +#define IS_STALE_FH  (1<<3)
>>
>> #define SET_CSTATE_FLAG(c, f) ((c)->sid_flags |= (f))
>> #define HAS_CSTATE_FLAG(c, f) ((c)->sid_flags & (f))
>> @@ -63,6 +65,7 @@ struct nfsd4_compound_state {
>> 	size_t			iovlen;
>> 	u32			minorversion;
>> 	__be32			status;
>> +	__be32			*putfh_errp;
>> 	stateid_t	current_stateid;
>> 	stateid_t	save_stateid;
>> 	/* to indicate current and saved state id presents */
>> @@ -705,6 +708,7 @@ int nfs4svc_decode_compoundargs(struct svc_rqst *, __be32 *,
>> int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *,
>> 		struct nfsd4_compoundres *);
>> __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
>> +void nfsd4_adjust_encode(struct nfsd4_compoundres *);
>> void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *);
>> void nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op);
>> __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
>> -- 
>> 1.8.3.1
>>



* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-09-01 20:09         ` J. Bruce Fields
@ 2017-09-01 20:34           ` Olga Kornievskaia
  2017-09-01 21:19             ` J. Bruce Fields
  0 siblings, 1 reply; 36+ messages in thread
From: Olga Kornievskaia @ 2017-09-01 20:34 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Olga Kornievskaia, linux-nfs


> On Sep 1, 2017, at 4:09 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>
> On Fri, Sep 01, 2017 at 04:02:48PM -0400, Olga Kornievskaia wrote:
>>
>>> On Sep 1, 2017, at 3:53 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>
>>> On Fri, Sep 01, 2017 at 03:48:33PM -0400, Olga Kornievskaia wrote:
>>>> On Fri, Sep 1, 2017 at 3:41 PM, J. Bruce Fields <bfields@redhat.com> wrote:
>>>>>       - what currently happens if you try to copy across krb5 mounts?
>>>>
>>>> No GSSv3 is included in these patches.  The destination server will
>>>> mount the source server using auth_sys.
>>>
>>> Assuming that doesn't work--how is the failure handled?
>>
>> If mount fails? Destination server returns an error in COPY (whatever
>> vfs_kern_mount can return). Client calls generic nfs4_handle_exception()
>> but it’s probably a kind of error it doesn’t handle so it’ll be translated to EIO.
>> What kind of error are you thinking about?
>
> I just want to make sure that copy_file_range() caller knows what to do
> when it encounters this situation.
>
> The typical application probably wants to fall back on a read-write loop
> in the case inter-server copy isn't supported between the given mounts?
>
> EIO doesn't sound like the most helpful error to me, but whatever error
> it is, it should be documented in the copy_file_range man page so that
> callers know how to check for this case.

On the client side, if we were to receive an error code that signified a connection
problem, then COPY implementation could map that to something that would
trigger the VFS to just fallback to do_splice.

However, I couldn’t find any errors in 15.1 from rfc5661 or 11.1 from 7862 that
could mean connection errors (that’s typically RPC errors right).


* Re: [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh
  2017-09-01 20:25     ` Olga Kornievskaia
@ 2017-09-01 21:16       ` J. Bruce Fields
  2017-09-01 21:24         ` J. Bruce Fields
  0 siblings, 1 reply; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 21:16 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

On Fri, Sep 01, 2017 at 04:25:53PM -0400, Olga Kornievskaia wrote:
> 
> > On Sep 1, 2017, at 4:23 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > On Thu, Mar 02, 2017 at 11:01:28AM -0500, Olga Kornievskaia wrote:
> >> From: Andy Adamson <andros@netapp.com>
> >> 
> >> The inter server to server COPY source server filehandle
> >> is guaranteed to be stale as the COPY is sent to the destination
> >> server.
> > 
> > This is definitely not true.  That filehandle could very well mean
> > something to the source server, even if it's just by accident.
> > 
> > In the case the source filehandle refers to a file on a different
> > server, nfsd knows that, and should not call fh_verify on it at all.
> 
> At the time of processing the FH it doesn’t know that there is a COPY
> operation coming in the compound. So how does it know it refers to 
> a file on a different server?

Sorry, of course you're right, I missed that these filehandles are
passed by PUTFH and SAVEFH.  Yuch.

I don't like that we're doing a filehandle lookup at all on this
"foreign" filehandle.  But maybe it doesn't cause big problems in
practice.
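
The point conceded above is that PUTFH is processed before the server has
seen the COPY that follows it, so the quoted patch scans the decoded
operation array up front and sets NO_VERIFY_FH if any op is a COPY. A minimal
sketch of that look-ahead, assuming a simplified compound represented as a
plain array of operation numbers (compound_has_copy() is a hypothetical
helper, not an nfsd function; the op numbers are the NFSv4 protocol values):

```c
#include <stdbool.h>
#include <stddef.h>

/* NFSv4 operation numbers; only the ones this sketch needs. */
enum { OP_PUTFH = 22, OP_SAVEFH = 32, OP_COPY = 60 };

/*
 * Look ahead through the whole compound for a COPY before any operation
 * is executed, mirroring the for-loop the patch adds ahead of the main
 * processing loop. Hypothetical helper for illustration only.
 */
static bool compound_has_copy(const unsigned int *opnums, size_t opcnt)
{
	for (size_t i = 0; i < opcnt; i++)
		if (opnums[i] == OP_COPY)
			return true;
	return false;
}
```

For a compound shaped like PUTFH(src), SAVEFH, PUTFH(dst), COPY the helper
returns true before the first PUTFH runs, which is the only reason the
PUTFH handler can know to tolerate a stale handle.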

--b.

> 
> > 
> > --b.
> > 
> >> 
> >> Signed-off-by: Andy Adamson <andros@netapp.com>
> >> ---
> >> fs/nfsd/nfs4proc.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
> >> fs/nfsd/nfs4xdr.c  | 26 +++++++++++++++++++++++++-
> >> fs/nfsd/nfsd.h     |  2 ++
> >> fs/nfsd/xdr4.h     |  4 ++++
> >> 4 files changed, 77 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >> index a680c8c..733a9aa 100644
> >> --- a/fs/nfsd/nfs4proc.c
> >> +++ b/fs/nfsd/nfs4proc.c
> >> @@ -496,11 +496,19 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
> >> nfsd4_putfh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >> 	    struct nfsd4_putfh *putfh)
> >> {
> >> +	__be32 ret;
> >> +
> >> 	fh_put(&cstate->current_fh);
> >> 	cstate->current_fh.fh_handle.fh_size = putfh->pf_fhlen;
> >> 	memcpy(&cstate->current_fh.fh_handle.fh_base, putfh->pf_fhval,
> >> 	       putfh->pf_fhlen);
> >> -	return fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
> >> +	ret = fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
> >> +	if (ret == nfserr_stale && HAS_CSTATE_FLAG(cstate, NO_VERIFY_FH)) {
> >> +		CLEAR_CSTATE_FLAG(cstate, NO_VERIFY_FH);
> >> +		SET_CSTATE_FLAG(cstate, IS_STALE_FH);
> >> +		ret = 0;
> >> +	}
> >> +	return ret;
> >> }
> >> 
> >> static __be32
> >> @@ -533,6 +541,16 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
> >> nfsd4_savefh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >> 	     void *arg)
> >> {
> >> +	/**
> >> +	* This is either an inter COPY (most likely) or an intra COPY with a
> >> +	* stale file handle. If the latter, nfsd4_copy will reset the PUTFH to
> >> +	* return nfserr_stale. No fh_dentry, just copy the file handle
> >> +	* to use with the inter COPY READ.
> >> +	*/
> >> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
> >> +		cstate->save_fh = cstate->current_fh;
> >> +		return nfs_ok;
> >> +	}
> >> 	if (!cstate->current_fh.fh_dentry)
> >> 		return nfserr_nofilehandle;
> >> 
> >> @@ -1067,6 +1085,13 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
> >> 	if (status)
> >> 		goto out;
> >> 
> >> +	/* Intra copy source fh is stale. PUTFH will fail with ESTALE */
> >> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
> >> +		CLEAR_CSTATE_FLAG(cstate, IS_STALE_FH);
> >> +		cstate->status = nfserr_copy_stalefh;
> >> +		goto out_put;
> >> +	}
> >> +
> >> 	bytes = nfsd_copy_file_range(src, copy->cp_src_pos,
> >> 			dst, copy->cp_dst_pos, copy->cp_count);
> >> 
> >> @@ -1081,6 +1106,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
> >> 		status = nfs_ok;
> >> 	}
> >> 
> >> +out_put:
> >> 	fput(src);
> >> 	fput(dst);
> >> out:
> >> @@ -1776,6 +1802,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> >> 	struct nfsd4_compound_state *cstate = &resp->cstate;
> >> 	struct svc_fh *current_fh = &cstate->current_fh;
> >> 	struct svc_fh *save_fh = &cstate->save_fh;
> >> +	int		i;
> >> 	__be32		status;
> >> 
> >> 	svcxdr_init_encode(rqstp, resp);
> >> @@ -1808,6 +1835,12 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> >> 		goto encode_op;
> >> 	}
> >> 
> >> +	/* NFSv4.2 COPY source file handle may be from a different server */
> >> +	for (i = 0; i < args->opcnt; i++) {
> >> +		op = &args->ops[i];
> >> +		if (op->opnum == OP_COPY)
> >> +			SET_CSTATE_FLAG(cstate, NO_VERIFY_FH);
> >> +	}
> >> 	while (!status && resp->opcnt < args->opcnt) {
> >> 		op = &args->ops[resp->opcnt++];
> >> 
> >> @@ -1827,6 +1860,9 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> >> 
> >> 		opdesc = OPDESC(op);
> >> 
> >> +		if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
> >> +			goto call_op;
> >> +
> >> 		if (!current_fh->fh_dentry) {
> >> 			if (!(opdesc->op_flags & ALLOWED_WITHOUT_FH)) {
> >> 				op->status = nfserr_nofilehandle;
> >> @@ -1861,6 +1897,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> >> 
> >> 		if (opdesc->op_get_currentstateid)
> >> 			opdesc->op_get_currentstateid(cstate, &op->u);
> >> +call_op:
> >> 		op->status = opdesc->op_func(rqstp, cstate, &op->u);
> >> 
> >> 		if (!op->status) {
> >> @@ -1881,6 +1918,14 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> >> 			status = op->status;
> >> 			goto out;
> >> 		}
> >> +		/* Only from intra COPY */
> >> +		if (cstate->status == nfserr_copy_stalefh) {
> >> +			dprintk("%s NFS4.2 intra COPY stale src filehandle\n",
> >> +				__func__);
> >> +			status = nfserr_stale;
> >> +			nfsd4_adjust_encode(resp);
> >> +			goto out;
> >> +		}
> >> 		if (op->status == nfserr_replay_me) {
> >> 			op->replay = &cstate->replay_owner->so_replay;
> >> 			nfsd4_encode_replay(&resp->xdr, op);
> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> >> index c632156..328ff9c 100644
> >> --- a/fs/nfsd/nfs4xdr.c
> >> +++ b/fs/nfsd/nfs4xdr.c
> >> @@ -4619,15 +4619,28 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
> >> 	return nfserr_rep_too_big;
> >> }
> >> 
> >> +/** Rewind the encoding to return nfserr_stale on the PUTFH
> >> + * in this failed Intra COPY compound
> >> + */
> >> +void
> >> +nfsd4_adjust_encode(struct nfsd4_compoundres *resp)
> >> +{
> >> +	__be32 *p;
> >> +
> >> +	p = resp->cstate.putfh_errp;
> >> +	*p++ = nfserr_stale;
> >> +}
> >> +
> >> void
> >> nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
> >> {
> >> 	struct xdr_stream *xdr = &resp->xdr;
> >> 	struct nfs4_stateowner *so = resp->cstate.replay_owner;
> >> +	struct nfsd4_compound_state *cstate = &resp->cstate;
> >> 	struct svc_rqst *rqstp = resp->rqstp;
> >> 	int post_err_offset;
> >> 	nfsd4_enc encoder;
> >> -	__be32 *p;
> >> +	__be32 *p, *statp;
> >> 
> >> 	p = xdr_reserve_space(xdr, 8);
> >> 	if (!p) {
> >> @@ -4636,9 +4649,20 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
> >> 	}
> >> 	*p++ = cpu_to_be32(op->opnum);
> >> 	post_err_offset = xdr->buf->len;
> >> +	statp = p;
> >> 
> >> 	if (op->opnum == OP_ILLEGAL)
> >> 		goto status;
> >> +
> >> +	/** This is a COPY compound with a stale source server file handle.
> >> +	 * If OP_COPY processing determines that this is an intra server to
> >> +	 * server COPY, then this PUTFH should return nfserr_ stale so the
> >> +	 * putfh_errp will be set to nfserr_stale. If this is an inter server
> >> +	 * to server COPY, ignore the nfserr_stale.
> >> +	 */
> >> +	if (op->opnum == OP_PUTFH && HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
> >> +		cstate->putfh_errp = statp;
> >> +
> >> 	BUG_ON(op->opnum < 0 || op->opnum >= ARRAY_SIZE(nfsd4_enc_ops) ||
> >> 	       !nfsd4_enc_ops[op->opnum]);
> >> 	encoder = nfsd4_enc_ops[op->opnum];
> >> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> >> index d966068..8d6fb0f 100644
> >> --- a/fs/nfsd/nfsd.h
> >> +++ b/fs/nfsd/nfsd.h
> >> @@ -272,6 +272,8 @@ static inline bool nfsd4_spo_must_allow(struct svc_rqst *rqstp)
> >> #define	nfserr_replay_me	cpu_to_be32(11001)
> >> /* nfs41 replay detected */
> >> #define	nfserr_replay_cache	cpu_to_be32(11002)
> >> +/* nfs42 intra copy failed with nfserr_stale */
> >> +#define nfserr_copy_stalefh	cpu_to_be32(1103)
> >> 
> >> /* Check for dir entries '.' and '..' */
> >> #define isdotent(n, l)	(l < 3 && n[0] == '.' && (l == 1 || n[1] == '.'))
> >> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> >> index 38fcb4f..aa94295 100644
> >> --- a/fs/nfsd/xdr4.h
> >> +++ b/fs/nfsd/xdr4.h
> >> @@ -45,6 +45,8 @@
> >> 
> >> #define CURRENT_STATE_ID_FLAG (1<<0)
> >> #define SAVED_STATE_ID_FLAG (1<<1)
> >> +#define NO_VERIFY_FH (1<<2)
> >> +#define IS_STALE_FH  (1<<3)
> >> 
> >> #define SET_CSTATE_FLAG(c, f) ((c)->sid_flags |= (f))
> >> #define HAS_CSTATE_FLAG(c, f) ((c)->sid_flags & (f))
> >> @@ -63,6 +65,7 @@ struct nfsd4_compound_state {
> >> 	size_t			iovlen;
> >> 	u32			minorversion;
> >> 	__be32			status;
> >> +	__be32			*putfh_errp;
> >> 	stateid_t	current_stateid;
> >> 	stateid_t	save_stateid;
> >> 	/* to indicate current and saved state id presents */
> >> @@ -705,6 +708,7 @@ int nfs4svc_decode_compoundargs(struct svc_rqst *, __be32 *,
> >> int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *,
> >> 		struct nfsd4_compoundres *);
> >> __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
> >> +void nfsd4_adjust_encode(struct nfsd4_compoundres *);
> >> void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *);
> >> void nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op);
> >> __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
> >> -- 
> >> 1.8.3.1
> >> 
> 


* Re: [RFC v1 00/17] NFSD support for inter+async COPY
  2017-09-01 20:34           ` Olga Kornievskaia
@ 2017-09-01 21:19             ` J. Bruce Fields
  0 siblings, 0 replies; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 21:19 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: Olga Kornievskaia, linux-nfs

On Fri, Sep 01, 2017 at 04:34:16PM -0400, Olga Kornievskaia wrote:
> 
> > On Sep 1, 2017, at 4:09 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > 
> > On Fri, Sep 01, 2017 at 04:02:48PM -0400, Olga Kornievskaia wrote:
> >> 
> >>> On Sep 1, 2017, at 3:53 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>> 
> >>> On Fri, Sep 01, 2017 at 03:48:33PM -0400, Olga Kornievskaia wrote:
> >>>> On Fri, Sep 1, 2017 at 3:41 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>>>>       - what currently happens if you try to copy across krb5 mounts?
> >>>> 
> >>>> No GSSv3 is included in these patches.  The destination server will
> >>>> mount the source server using auth_sys.
> >>> 
> >>> Assuming that doesn't work--how is the failure handled?
> >> 
> >> If mount fails? Destination server returns an error in COPY (whatever
> >> vfs_kern_mount can return). Client calls generic nfs4_handle_exception() 
> >> but it’s probably a kind of error it doesn’t handle so it’ll be translated to EIO. 
> >> What kind of error are you thinking about?
> > 
> > I just want to make sure that copy_file_range() caller knows what to do
> > when it encounters this situation.
> > 
> > The typical application probably wants to fall back on a read-write loop
> > in the case inter-server copy isn't supported between the given mounts?
> > 
> > EIO doesn't sound like the most helpful error to me, but whatever error
> > it is, it should be documented in the copy_file_range man page so that
> > callers know how to check for this case.
> 
> On the client side, if we were to receive an error code that signified a connection 
> problem, then COPY implementation could map that to something that would
> trigger the VFS to just fallback to do_splice. 

Oh, right.

> However, I couldn’t find any errors in 15.1 from rfc5661 or 11.1 from 7862 that
> could mean connection errors (that’s typically RPC errors right). 

Looking at 7862....  Shouldn't these sorts of cases result in one of the
new errors described in
https://tools.ietf.org/html/rfc7862#section-11.1.2 ?
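
For the caller side, a minimal user-space sketch of the fallback being
discussed: try copy_file_range() and drop to a read/write loop when the error
suggests the in-kernel copy is simply not available. The set of errno values
in should_fall_back() is an assumption for illustration, not something this
thread or the man page settles, and copy_with_fallback() and copy_rw_loop()
are hypothetical helper names. EIO is deliberately not in the set, since a
caller cannot tell it apart from a real I/O error, which is the concern
raised above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for copy_file_range() in glibc */
#endif
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed set of errors meaning "offloaded copy unavailable, copy by hand". */
static bool should_fall_back(int err)
{
	return err == EXDEV || err == ENOSYS || err == EOPNOTSUPP ||
	       err == EINVAL;
}

/* Plain read/write loop; returns bytes copied, or -1 with errno set. */
static ssize_t copy_rw_loop(int in_fd, int out_fd, size_t len)
{
	char buf[65536];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t n = read(in_fd, buf, want);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			break;	/* EOF before len bytes */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(out_fd, buf + off, (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		done += (size_t)n;
	}
	return (ssize_t)done;
}

/* Try the offloaded copy first; fall back for whatever is left. */
static ssize_t copy_with_fallback(int in_fd, int out_fd, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = copy_file_range(in_fd, NULL, out_fd, NULL,
					    len - done, 0);

		if (n > 0) {
			done += (size_t)n;
			continue;
		}
		if (n == 0)
			break;	/* EOF on the source */
		if (should_fall_back(errno)) {
			/* Both file offsets already reflect the bytes copied. */
			ssize_t rest = copy_rw_loop(in_fd, out_fd, len - done);

			return rest < 0 ? -1 : (ssize_t)done + rest;
		}
		return -1;
	}
	return (ssize_t)done;
}
```

If the server were to return one of the RFC 7862 errors and the client mapped
it to a distinct errno, only should_fall_back() would need to change.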

--b.


* Re: [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh
  2017-09-01 21:16       ` J. Bruce Fields
@ 2017-09-01 21:24         ` J. Bruce Fields
  0 siblings, 0 replies; 36+ messages in thread
From: J. Bruce Fields @ 2017-09-01 21:24 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

On Fri, Sep 01, 2017 at 05:16:14PM -0400, J. Bruce Fields wrote:
> On Fri, Sep 01, 2017 at 04:25:53PM -0400, Olga Kornievskaia wrote:
> > 
> > > On Sep 1, 2017, at 4:23 PM, J. Bruce Fields <bfields@redhat.com> wrote:
> > > 
> > > On Thu, Mar 02, 2017 at 11:01:28AM -0500, Olga Kornievskaia wrote:
> > >> From: Andy Adamson <andros@netapp.com>
> > >> 
> > >> The inter server to server COPY source server filehandle
> > >> is guaranteed to be stale as the COPY is sent to the destination
> > >> server.
> > > 
> > > This is definitely not true.  That filehandle could very well mean
> > > something to the source server, even if it's just by accident.
> > > 
> > > In the case the source filehandle refers to a file on a different
> > > server, nfsd knows that, and should not call fh_verify on it at all.
> > 
> > At the time of processing the FH it doesn’t know that there is a COPY
> > operation coming in the compound. So how does it know it refers to 
> > a file on a different server?
> 
> Sorry, of course you're right, I missed that these filehandles are
> passed by PUTFH and SAVEFH.  Yuch.
> 
> I don't like that we're doing a filehandle lookup at all on this
> "foreign" filehandle.  But maybe it doesn't cause big problems in
> practice.

But could you please check that this does the right thing in practice
if, for example, the source filehandle happens by coincidence to
successfully verify?
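
To make the question concrete, here is a small user-space model of the flag
handling in the quoted nfsd4_putfh() hunk. It is only a sketch:
putfh_model() and struct cstate_model are hypothetical, the verify_status
argument stands in for whatever fh_verify() would return, and only the two
flag bits and the tolerate-stale logic are taken from the patch.

```c
#include <stdint.h>

/* Flag bits as defined in the quoted xdr4.h hunk. */
#define NO_VERIFY_FH (1u << 2)
#define IS_STALE_FH  (1u << 3)

/* NFSv4 status values, in host byte order for simplicity. */
enum { NFS4_OK = 0, NFS4ERR_STALE = 70 };

/* Hypothetical stand-in for the sid_flags part of nfsd4_compound_state. */
struct cstate_model {
	uint32_t sid_flags;
};

/*
 * Model of PUTFH after the patch: a stale result is swallowed once, and
 * only while NO_VERIFY_FH is set; any other result passes through.
 */
static uint32_t putfh_model(struct cstate_model *cs, uint32_t verify_status)
{
	if (verify_status == NFS4ERR_STALE &&
	    (cs->sid_flags & NO_VERIFY_FH)) {
		cs->sid_flags &= ~NO_VERIFY_FH;
		cs->sid_flags |= IS_STALE_FH;
		return NFS4_OK;
	}
	return verify_status;
}
```

In this model, a source filehandle that verifies by coincidence leaves
NO_VERIFY_FH set and IS_STALE_FH clear, so the tolerance carries over to the
next PUTFH in the compound. Whether the real code behaves the same way would
still need to be checked against the full patch.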

--b.

> 
> --b.
> 
> > 
> > > 
> > > --b.
> > > 
> > >> 
> > >> Signed-off-by: Andy Adamson <andros@netapp.com>
> > >> ---
> > >> fs/nfsd/nfs4proc.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
> > >> fs/nfsd/nfs4xdr.c  | 26 +++++++++++++++++++++++++-
> > >> fs/nfsd/nfsd.h     |  2 ++
> > >> fs/nfsd/xdr4.h     |  4 ++++
> > >> 4 files changed, 77 insertions(+), 2 deletions(-)
> > >> 
> > >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > >> index a680c8c..733a9aa 100644
> > >> --- a/fs/nfsd/nfs4proc.c
> > >> +++ b/fs/nfsd/nfs4proc.c
> > >> @@ -496,11 +496,19 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
> > >> nfsd4_putfh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > >> 	    struct nfsd4_putfh *putfh)
> > >> {
> > >> +	__be32 ret;
> > >> +
> > >> 	fh_put(&cstate->current_fh);
> > >> 	cstate->current_fh.fh_handle.fh_size = putfh->pf_fhlen;
> > >> 	memcpy(&cstate->current_fh.fh_handle.fh_base, putfh->pf_fhval,
> > >> 	       putfh->pf_fhlen);
> > >> -	return fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
> > >> +	ret = fh_verify(rqstp, &cstate->current_fh, 0, NFSD_MAY_BYPASS_GSS);
> > >> +	if (ret == nfserr_stale && HAS_CSTATE_FLAG(cstate, NO_VERIFY_FH)) {
> > >> +		CLEAR_CSTATE_FLAG(cstate, NO_VERIFY_FH);
> > >> +		SET_CSTATE_FLAG(cstate, IS_STALE_FH);
> > >> +		ret = 0;
> > >> +	}
> > >> +	return ret;
> > >> }
> > >> 
> > >> static __be32
> > >> @@ -533,6 +541,16 @@ static __be32 nfsd4_open_omfg(struct svc_rqst *rqstp, struct nfsd4_compound_stat
> > >> nfsd4_savefh(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > >> 	     void *arg)
> > >> {
> > >> +	/**
> > >> +	* This is either an inter COPY (most likely) or an intra COPY with a
> > >> +	* stale file handle. If the latter, nfsd4_copy will reset the PUTFH to
> > >> +	* return nfserr_stale. No fh_dentry, just copy the file handle
> > >> +	* to use with the inter COPY READ.
> > >> +	*/
> > >> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
> > >> +		cstate->save_fh = cstate->current_fh;
> > >> +		return nfs_ok;
> > >> +	}
> > >> 	if (!cstate->current_fh.fh_dentry)
> > >> 		return nfserr_nofilehandle;
> > >> 
> > >> @@ -1067,6 +1085,13 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
> > >> 	if (status)
> > >> 		goto out;
> > >> 
> > >> +	/* Intra copy source fh is stale. PUTFH will fail with ESTALE */
> > >> +	if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH)) {
> > >> +		CLEAR_CSTATE_FLAG(cstate, IS_STALE_FH);
> > >> +		cstate->status = nfserr_copy_stalefh;
> > >> +		goto out_put;
> > >> +	}
> > >> +
> > >> 	bytes = nfsd_copy_file_range(src, copy->cp_src_pos,
> > >> 			dst, copy->cp_dst_pos, copy->cp_count);
> > >> 
> > >> @@ -1081,6 +1106,7 @@ static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
> > >> 		status = nfs_ok;
> > >> 	}
> > >> 
> > >> +out_put:
> > >> 	fput(src);
> > >> 	fput(dst);
> > >> out:
> > >> @@ -1776,6 +1802,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> > >> 	struct nfsd4_compound_state *cstate = &resp->cstate;
> > >> 	struct svc_fh *current_fh = &cstate->current_fh;
> > >> 	struct svc_fh *save_fh = &cstate->save_fh;
> > >> +	int		i;
> > >> 	__be32		status;
> > >> 
> > >> 	svcxdr_init_encode(rqstp, resp);
> > >> @@ -1808,6 +1835,12 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> > >> 		goto encode_op;
> > >> 	}
> > >> 
> > >> +	/* NFSv4.2 COPY source file handle may be from a different server */
> > >> +	for (i = 0; i < args->opcnt; i++) {
> > >> +		op = &args->ops[i];
> > >> +		if (op->opnum == OP_COPY)
> > >> +			SET_CSTATE_FLAG(cstate, NO_VERIFY_FH);
> > >> +	}
> > >> 	while (!status && resp->opcnt < args->opcnt) {
> > >> 		op = &args->ops[resp->opcnt++];
> > >> 
> > >> @@ -1827,6 +1860,9 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> > >> 
> > >> 		opdesc = OPDESC(op);
> > >> 
> > >> +		if (HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
> > >> +			goto call_op;
> > >> +
> > >> 		if (!current_fh->fh_dentry) {
> > >> 			if (!(opdesc->op_flags & ALLOWED_WITHOUT_FH)) {
> > >> 				op->status = nfserr_nofilehandle;
> > >> @@ -1861,6 +1897,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> > >> 
> > >> 		if (opdesc->op_get_currentstateid)
> > >> 			opdesc->op_get_currentstateid(cstate, &op->u);
> > >> +call_op:
> > >> 		op->status = opdesc->op_func(rqstp, cstate, &op->u);
> > >> 
> > >> 		if (!op->status) {
> > >> @@ -1881,6 +1918,14 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
> > >> 			status = op->status;
> > >> 			goto out;
> > >> 		}
> > >> +		/* Only from intra COPY */
> > >> +		if (cstate->status == nfserr_copy_stalefh) {
> > >> +			dprintk("%s NFS4.2 intra COPY stale src filehandle\n",
> > >> +				__func__);
> > >> +			status = nfserr_stale;
> > >> +			nfsd4_adjust_encode(resp);
> > >> +			goto out;
> > >> +		}
> > >> 		if (op->status == nfserr_replay_me) {
> > >> 			op->replay = &cstate->replay_owner->so_replay;
> > >> 			nfsd4_encode_replay(&resp->xdr, op);
> > >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> > >> index c632156..328ff9c 100644
> > >> --- a/fs/nfsd/nfs4xdr.c
> > >> +++ b/fs/nfsd/nfs4xdr.c
> > >> @@ -4619,15 +4619,28 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
> > >> 	return nfserr_rep_too_big;
> > >> }
> > >> 
> > >> +/** Rewind the encoding to return nfserr_stale on the PUTFH
> > >> + * in this failed Intra COPY compound
> > >> + */
> > >> +void
> > >> +nfsd4_adjust_encode(struct nfsd4_compoundres *resp)
> > >> +{
> > >> +	__be32 *p;
> > >> +
> > >> +	p = resp->cstate.putfh_errp;
> > >> +	*p++ = nfserr_stale;
> > >> +}
> > >> +
> > >> void
> > >> nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
> > >> {
> > >> 	struct xdr_stream *xdr = &resp->xdr;
> > >> 	struct nfs4_stateowner *so = resp->cstate.replay_owner;
> > >> +	struct nfsd4_compound_state *cstate = &resp->cstate;
> > >> 	struct svc_rqst *rqstp = resp->rqstp;
> > >> 	int post_err_offset;
> > >> 	nfsd4_enc encoder;
> > >> -	__be32 *p;
> > >> +	__be32 *p, *statp;
> > >> 
> > >> 	p = xdr_reserve_space(xdr, 8);
> > >> 	if (!p) {
> > >> @@ -4636,9 +4649,20 @@ __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *resp, u32 respsize)
> > >> 	}
> > >> 	*p++ = cpu_to_be32(op->opnum);
> > >> 	post_err_offset = xdr->buf->len;
> > >> +	statp = p;
> > >> 
> > >> 	if (op->opnum == OP_ILLEGAL)
> > >> 		goto status;
> > >> +
> > >> +	/** This is a COPY compound with a stale source server file handle.
> > >> +	 * If OP_COPY processing determines that this is an intra server to
> > >> +	 * server COPY, then this PUTFH should return nfserr_ stale so the
> > >> +	 * putfh_errp will be set to nfserr_stale. If this is an inter server
> > >> +	 * to server COPY, ignore the nfserr_stale.
> > >> +	 */
> > >> +	if (op->opnum == OP_PUTFH && HAS_CSTATE_FLAG(cstate, IS_STALE_FH))
> > >> +		cstate->putfh_errp = statp;
> > >> +
> > >> 	BUG_ON(op->opnum < 0 || op->opnum >= ARRAY_SIZE(nfsd4_enc_ops) ||
> > >> 	       !nfsd4_enc_ops[op->opnum]);
> > >> 	encoder = nfsd4_enc_ops[op->opnum];
> > >> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> > >> index d966068..8d6fb0f 100644
> > >> --- a/fs/nfsd/nfsd.h
> > >> +++ b/fs/nfsd/nfsd.h
> > >> @@ -272,6 +272,8 @@ static inline bool nfsd4_spo_must_allow(struct svc_rqst *rqstp)
> > >> #define	nfserr_replay_me	cpu_to_be32(11001)
> > >> /* nfs41 replay detected */
> > >> #define	nfserr_replay_cache	cpu_to_be32(11002)
> > >> +/* nfs42 intra copy failed with nfserr_stale */
> > >> +#define nfserr_copy_stalefh	cpu_to_be32(1103)
> > >> 
> > >> /* Check for dir entries '.' and '..' */
> > >> #define isdotent(n, l)	(l < 3 && n[0] == '.' && (l == 1 || n[1] == '.'))
> > >> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> > >> index 38fcb4f..aa94295 100644
> > >> --- a/fs/nfsd/xdr4.h
> > >> +++ b/fs/nfsd/xdr4.h
> > >> @@ -45,6 +45,8 @@
> > >> 
> > >> #define CURRENT_STATE_ID_FLAG (1<<0)
> > >> #define SAVED_STATE_ID_FLAG (1<<1)
> > >> +#define NO_VERIFY_FH (1<<2)
> > >> +#define IS_STALE_FH  (1<<3)
> > >> 
> > >> #define SET_CSTATE_FLAG(c, f) ((c)->sid_flags |= (f))
> > >> #define HAS_CSTATE_FLAG(c, f) ((c)->sid_flags & (f))
> > >> @@ -63,6 +65,7 @@ struct nfsd4_compound_state {
> > >> 	size_t			iovlen;
> > >> 	u32			minorversion;
> > >> 	__be32			status;
> > >> +	__be32			*putfh_errp;
> > >> 	stateid_t	current_stateid;
> > >> 	stateid_t	save_stateid;
> > >> 	/* to indicate current and saved state id presents */
> > >> @@ -705,6 +708,7 @@ int nfs4svc_decode_compoundargs(struct svc_rqst *, __be32 *,
> > >> int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *,
> > >> 		struct nfsd4_compoundres *);
> > >> __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
> > >> +void nfsd4_adjust_encode(struct nfsd4_compoundres *);
> > >> void nfsd4_encode_operation(struct nfsd4_compoundres *, struct nfsd4_op *);
> > >> void nfsd4_encode_replay(struct xdr_stream *xdr, struct nfsd4_op *op);
> > >> __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
> > >> -- 
> > >> 1.8.3.1
> > >> 
> > 
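
The quoted hunks rely on one idea: while encoding the PUTFH result,
remember where its status word lives in the reply (putfh_errp), and if
COPY later turns out to be an intra-server copy, go back and overwrite
that word with nfserr_stale. The sketch below is a user-space
illustration of that technique only. Everything named demo_* or DEMO_*
is hypothetical and is not kernel code. It uses host-order integers
where the patch uses __be32, takes PUTFH = 22 and NFS4ERR_STALE = 70
from the NFSv4 protocol numbering, and adds a NULL check that the
quoted nfsd4_adjust_encode() does not have.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits mirroring the values added to xdr4.h in the quoted patch. */
#define NO_VERIFY_FH (1u << 2)
#define IS_STALE_FH  (1u << 3)

/* Hypothetical helpers modelled on SET_CSTATE_FLAG / HAS_CSTATE_FLAG. */
#define DEMO_SET_FLAG(c, f) ((c)->flags |= (f))
#define DEMO_HAS_FLAG(c, f) (((c)->flags & (f)) != 0)

enum { DEMO_OP_PUTFH = 22, DEMO_OK = 0, DEMO_ERR_STALE = 70 };

/* Hypothetical stand-in for struct nfsd4_compound_state. */
struct demo_state {
	uint32_t flags;
	uint32_t *putfh_errp;	/* saved location of the PUTFH status word */
};

/* Hypothetical stand-in for the reply buffer: (opnum, status) pairs. */
struct demo_reply {
	uint32_t words[16];
	size_t len;
};

/* Append one op result; return a pointer to its status word, or NULL if full. */
static uint32_t *demo_encode_op(struct demo_reply *r, uint32_t opnum,
				uint32_t status)
{
	if (r->len + 2 > sizeof(r->words) / sizeof(r->words[0]))
		return NULL;
	r->words[r->len++] = opnum;
	r->words[r->len] = status;
	return &r->words[r->len++];
}

/* Rewind: patch the saved PUTFH status word to "stale" (assumed NULL check). */
static void demo_adjust_encode(struct demo_state *st)
{
	if (st->putfh_errp)
		*st->putfh_errp = DEMO_ERR_STALE;
}

/* Walk through one compound; return the final PUTFH status in the reply. */
static uint32_t demo_run(int copy_is_intra)
{
	struct demo_state st = { 0, NULL };
	struct demo_reply r = { { 0 }, 0 };
	uint32_t *statp;

	/* PUTFH saw a stale handle, but verification was deferred for COPY. */
	DEMO_SET_FLAG(&st, NO_VERIFY_FH);
	DEMO_SET_FLAG(&st, IS_STALE_FH);

	/* Encode PUTFH as success for now and remember its status slot. */
	statp = demo_encode_op(&r, DEMO_OP_PUTFH, DEMO_OK);
	if (DEMO_HAS_FLAG(&st, IS_STALE_FH))
		st.putfh_errp = statp;

	/* Only an intra-server COPY turns the deferred error into a real one. */
	if (copy_is_intra)
		demo_adjust_encode(&st);

	return r.words[1];
}
```

Calling demo_run(1) models the intra-server case, where the PUTFH
status is rewritten to stale; demo_run(0) models the inter-server case,
where the stale source handle is ignored.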


end of thread, other threads:[~2017-09-01 21:24 UTC | newest]

Thread overview: 36+ messages
2017-03-02 16:01 [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 01/18] NFSD add ca_source_server<> to COPY Olga Kornievskaia
2017-09-01 19:52   ` J. Bruce Fields
2017-09-01 20:14     ` Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 02/18] NFSD add COPY_NOTIFY operation Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 03/18] NFSD generalize nfsd4_compound_state flag names Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 04/18] NFSD: allow inter server COPY to have a STALE source server fh Olga Kornievskaia
2017-09-01 20:23   ` J. Bruce Fields
2017-09-01 20:25     ` Olga Kornievskaia
2017-09-01 21:16       ` J. Bruce Fields
2017-09-01 21:24         ` J. Bruce Fields
2017-03-02 16:01 ` [RFC v1 05/18] NFSD add nfs4 inter ssc to nfsd4_copy Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 06/18] NFSD return nfs4_stid in nfs4_preprocess_stateid_op Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 07/18] NFSD Unique stateid_t for inter server to server COPY authentication Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 08/18] NFSD CB_OFFLOAD xdr Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 09/18] NFSD OFFLOAD_STATUS xdr Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 10/18] NFSD OFFLOAD_CANCEL xdr Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 11/18] NFSD xdr callback stateid in async COPY reply Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 12/18] NFSD first draft of async copy Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 13/18] NFSD handle OFFLOAD_CANCEL op Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 14/18] NFSD stop queued async copies on client shutdown Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 15/18] NFSD create new stateid for async copy Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 16/18] NFSD define EBADF in nfserrno Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 17/18] NFSD support OFFLOAD_STATUS Olga Kornievskaia
2017-03-02 16:01 ` [RFC v1 18/18] NFSD remove copy stateid when vfs_copy_file_range completes Olga Kornievskaia
2017-03-17 21:21 ` [RFC v1 00/17] NFSD support for inter+async COPY Olga Kornievskaia
2017-03-20 15:30   ` J. Bruce Fields
2017-03-27 21:49     ` Olga Kornievskaia
2017-09-01 19:41 ` J. Bruce Fields
2017-09-01 19:42   ` J. Bruce Fields
2017-09-01 19:48   ` Olga Kornievskaia
2017-09-01 19:53     ` J. Bruce Fields
2017-09-01 20:02       ` Olga Kornievskaia
2017-09-01 20:09         ` J. Bruce Fields
2017-09-01 20:34           ` Olga Kornievskaia
2017-09-01 21:19             ` J. Bruce Fields
