All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] cifs: asynchronous writepages support (try #4)
@ 2011-05-19 20:22 Jeff Layton
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

This is the fourth version of this set. The main changes since the 3rd set are:

- it's been rebased on top of Steve's current master branch

- testing showed that find_get_pages_tag would only return a maximum of
  256 pages, which limited the wsize to 1M with 4k PAGE_CACHE_SIZE. So,
  the code now calls find_get_pages_tag multiple times until it gets all
  that it wants, it hits the end of the file, or no more dirty pages are
  present in the range.

- the default wsize is now 1M, but it will negotiate downward depending
  on what the server supports. It can be set as high as ~16M, but some
  cursory testing didn't show any real improvement in throughput with
  a larger wsize than 1M (YMMV of course).

- I dropped the patch to make the maxmpxcount be respected. I think that
  needs more careful consideration and is probably worthy of a separate
  patchset in its own right.

At this point, I believe this set is ready for merge for 2.6.40. Once it
goes in, I'll plan to spin up a patch to clarify the wsize= option in
the mount.cifs manpage.

Once this has had some soak time, I'll look at async readpages support
as well, and maybe consider changing writepage to use an async call.

Jeff Layton (7):
  cifs: consolidate SendReceive response checks
  cifs: make cifs_send_async take a kvec array
  cifs: don't call mid_q_entry->callback under the Global_MidLock
  cifs: add ignore_pend flag to cifs_call_async
  cifs: add cifs_async_writev
  cifs: convert cifs_writepages to use async writes
  cifs: clean up wsize negotiation and allow for larger wsize

 fs/cifs/cifsglob.h  |    6 +-
 fs/cifs/cifsproto.h |   28 +++++-
 fs/cifs/cifssmb.c   |  242 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/cifs/connect.c   |   99 +++++++++++++++------
 fs/cifs/file.c      |  241 +++++++++++++++++++++------------------------------
 fs/cifs/netmisc.c   |    2 +-
 fs/cifs/transport.c |  196 ++++++++++++-----------------------------
 7 files changed, 498 insertions(+), 316 deletions(-)

-- 
1.7.4.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/7] cifs: consolidate SendReceive response checks
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2011-05-19 20:22   ` Jeff Layton
  2011-05-19 20:22   ` [PATCH 2/7] cifs: make cifs_send_async take a kvec array Jeff Layton
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Further consolidate the SendReceive code by moving the checks run over
the packet into a separate function that all the SendReceive variants
can call.

We can also eliminate the check for a receive_len that's too big or too
small. cifs_demultiplex_thread already checks that and disconnects the
socket if that occurs, while setting the midStatus to MALFORMED. It'll
never call this code if that's the case.

Finally do a little cleanup. Use "goto out" on errors so that the flow
of code in the normal case is more evident. Also switch the logErr
variable in map_smb_to_linux_error to a bool.

Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/cifsproto.h |    4 +-
 fs/cifs/netmisc.c   |    2 +-
 fs/cifs/transport.c |  157 +++++++++++++-------------------------------------
 3 files changed, 45 insertions(+), 118 deletions(-)

diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index a1c94d3..bf140c2 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -76,6 +76,8 @@ extern int SendReceive(const unsigned int /* xid */ , struct cifsSesInfo *,
 			int * /* bytes returned */ , const int long_op);
 extern int SendReceiveNoRsp(const unsigned int xid, struct cifsSesInfo *ses,
 			struct smb_hdr *in_buf, int flags);
+extern int cifs_check_receive(struct mid_q_entry *mid,
+			struct TCP_Server_Info *server, bool log_error);
 extern int SendReceive2(const unsigned int /* xid */ , struct cifsSesInfo *,
 			struct kvec *, int /* nvec to send */,
 			int * /* type of buf returned */ , const int flags);
@@ -99,7 +101,7 @@ extern int cifs_convert_address(struct sockaddr *dst, const char *src, int len);
 extern int cifs_set_port(struct sockaddr *addr, const unsigned short int port);
 extern int cifs_fill_sockaddr(struct sockaddr *dst, const char *src, int len,
 				const unsigned short int port);
-extern int map_smb_to_linux_error(struct smb_hdr *smb, int logErr);
+extern int map_smb_to_linux_error(struct smb_hdr *smb, bool logErr);
 extern void header_assemble(struct smb_hdr *, char /* command */ ,
 			    const struct cifsTconInfo *, int /* length of
 			    fixed section (word count) in two byte units */);
diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c
index 79b71c2..73e47e8 100644
--- a/fs/cifs/netmisc.c
+++ b/fs/cifs/netmisc.c
@@ -836,7 +836,7 @@ ntstatus_to_dos(__u32 ntstatus, __u8 *eclass, __u16 *ecode)
 }
 
 int
-map_smb_to_linux_error(struct smb_hdr *smb, int logErr)
+map_smb_to_linux_error(struct smb_hdr *smb, bool logErr)
 {
 	unsigned int i;
 	int rc = -EIO;	/* if transport error smb error may not be set */
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index f2513fb..43633a7 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -502,13 +502,31 @@ send_nt_cancel(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
 }
 
 int
+cifs_check_receive(struct mid_q_entry *mid, struct TCP_Server_Info *server,
+		   bool log_error)
+{
+	dump_smb(mid->resp_buf,
+		 min_t(u32, 92, be32_to_cpu(mid->resp_buf->smb_buf_length)));
+
+	/* convert the length into a more usable form */
+	if (server->secMode & (SECMODE_SIGN_REQUIRED | SECMODE_SIGN_ENABLED)) {
+		/* FIXME: add code to kill session */
+		if (cifs_verify_signature(mid->resp_buf, server,
+					  mid->sequence_number + 1) != 0)
+			cERROR(1, "Unexpected SMB signature");
+	}
+
+	/* BB special case reconnect tid and uid here? */
+	return map_smb_to_linux_error(mid->resp_buf, log_error);
+}
+
+int
 SendReceive2(const unsigned int xid, struct cifsSesInfo *ses,
 	     struct kvec *iov, int n_vec, int *pRespBufType /* ret */,
 	     const int flags)
 {
 	int rc = 0;
 	int long_op;
-	unsigned int receive_len;
 	struct mid_q_entry *midQ;
 	struct smb_hdr *in_buf = iov[0].iov_base;
 
@@ -605,54 +623,24 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses,
 		return rc;
 	}
 
-	receive_len = be32_to_cpu(midQ->resp_buf->smb_buf_length);
-
-	if (receive_len > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE) {
-		cERROR(1, "Frame too large received.  Length: %d  Xid: %d",
-			receive_len, xid);
+	if (!midQ->resp_buf || midQ->midState != MID_RESPONSE_RECEIVED) {
 		rc = -EIO;
+		cFYI(1, "Bad MID state?");
 		goto out;
 	}
 
-	/* rcvd frame is ok */
-
-	if (midQ->resp_buf &&
-	    (midQ->midState == MID_RESPONSE_RECEIVED)) {
-
-		iov[0].iov_base = (char *)midQ->resp_buf;
-		if (midQ->largeBuf)
-			*pRespBufType = CIFS_LARGE_BUFFER;
-		else
-			*pRespBufType = CIFS_SMALL_BUFFER;
-		iov[0].iov_len = receive_len + 4;
-
-		dump_smb(midQ->resp_buf, 80);
-		/* convert the length into a more usable form */
-		if ((receive_len > 24) &&
-		    (ses->server->secMode & (SECMODE_SIGN_REQUIRED |
-					     SECMODE_SIGN_ENABLED))) {
-			rc = cifs_verify_signature(midQ->resp_buf,
-						ses->server,
-						midQ->sequence_number+1);
-			if (rc) {
-				cERROR(1, "Unexpected SMB signature");
-				/* BB FIXME add code to kill session */
-			}
-		}
-
-		/* BB special case reconnect tid and uid here? */
-		rc = map_smb_to_linux_error(midQ->resp_buf,
-					    flags & CIFS_LOG_ERROR);
+	iov[0].iov_base = (char *)midQ->resp_buf;
+	iov[0].iov_len = be32_to_cpu(midQ->resp_buf->smb_buf_length) + 4;
+	if (midQ->largeBuf)
+		*pRespBufType = CIFS_LARGE_BUFFER;
+	else
+		*pRespBufType = CIFS_SMALL_BUFFER;
 
-		if ((flags & CIFS_NO_RESP) == 0)
-			midQ->resp_buf = NULL;  /* mark it so buf will
-						   not be freed by
-						   delete_mid */
-	} else {
-		rc = -EIO;
-		cFYI(1, "Bad MID state?");
-	}
+	rc = cifs_check_receive(midQ, ses->server, flags & CIFS_LOG_ERROR);
 
+	/* mark it so buf will not be freed by delete_mid */
+	if ((flags & CIFS_NO_RESP) == 0)
+		midQ->resp_buf = NULL;
 out:
 	delete_mid(midQ);
 	atomic_dec(&ses->server->inFlight);
@@ -667,7 +655,6 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses,
 	    int *pbytes_returned, const int long_op)
 {
 	int rc = 0;
-	unsigned int receive_len;
 	struct mid_q_entry *midQ;
 
 	if (ses == NULL) {
@@ -757,47 +744,16 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses,
 		return rc;
 	}
 
-	receive_len = be32_to_cpu(midQ->resp_buf->smb_buf_length);
-
-	if (receive_len > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE) {
-		cERROR(1, "Frame too large received.  Length: %d  Xid: %d",
-			receive_len, xid);
-		rc = -EIO;
-		goto out;
-	}
-
-	/* rcvd frame is ok */
-
-	if (midQ->resp_buf && out_buf
-	    && (midQ->midState == MID_RESPONSE_RECEIVED)) {
-		out_buf->smb_buf_length = cpu_to_be32(receive_len);
-		memcpy((char *)out_buf + 4,
-		       (char *)midQ->resp_buf + 4,
-		       receive_len);
-
-		dump_smb(out_buf, 92);
-		/* convert the length into a more usable form */
-		if ((receive_len > 24) &&
-		    (ses->server->secMode & (SECMODE_SIGN_REQUIRED |
-					     SECMODE_SIGN_ENABLED))) {
-			rc = cifs_verify_signature(out_buf,
-						ses->server,
-						midQ->sequence_number+1);
-			if (rc) {
-				cERROR(1, "Unexpected SMB signature");
-				/* BB FIXME add code to kill session */
-			}
-		}
-
-		*pbytes_returned = be32_to_cpu(out_buf->smb_buf_length);
-
-		/* BB special case reconnect tid and uid here? */
-		rc = map_smb_to_linux_error(out_buf, 0 /* no log */ );
-	} else {
+	if (!midQ->resp_buf || !out_buf ||
+	    midQ->midState != MID_RESPONSE_RECEIVED) {
 		rc = -EIO;
 		cERROR(1, "Bad MID state?");
+		goto out;
 	}
 
+	*pbytes_returned = be32_to_cpu(midQ->resp_buf->smb_buf_length);
+	memcpy(out_buf, midQ->resp_buf, *pbytes_returned + 4);
+	rc = cifs_check_receive(midQ, ses->server, 0);
 out:
 	delete_mid(midQ);
 	atomic_dec(&ses->server->inFlight);
@@ -838,7 +794,6 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon,
 {
 	int rc = 0;
 	int rstart = 0;
-	unsigned int receive_len;
 	struct mid_q_entry *midQ;
 	struct cifsSesInfo *ses;
 
@@ -961,46 +916,16 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon,
 	if (rc != 0)
 		return rc;
 
-	receive_len = be32_to_cpu(midQ->resp_buf->smb_buf_length);
-	if (receive_len > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE) {
-		cERROR(1, "Frame too large received.  Length: %d  Xid: %d",
-			receive_len, xid);
-		rc = -EIO;
-		goto out;
-	}
-
 	/* rcvd frame is ok */
-
-	if ((out_buf == NULL) || (midQ->midState != MID_RESPONSE_RECEIVED)) {
+	if (out_buf == NULL || midQ->midState != MID_RESPONSE_RECEIVED) {
 		rc = -EIO;
 		cERROR(1, "Bad MID state?");
 		goto out;
 	}
 
-	out_buf->smb_buf_length = cpu_to_be32(receive_len);
-	memcpy((char *)out_buf + 4,
-	       (char *)midQ->resp_buf + 4,
-	       receive_len);
-
-	dump_smb(out_buf, 92);
-	/* convert the length into a more usable form */
-	if ((receive_len > 24) &&
-	    (ses->server->secMode & (SECMODE_SIGN_REQUIRED |
-				     SECMODE_SIGN_ENABLED))) {
-		rc = cifs_verify_signature(out_buf,
-					   ses->server,
-					   midQ->sequence_number+1);
-		if (rc) {
-			cERROR(1, "Unexpected SMB signature");
-			/* BB FIXME add code to kill session */
-		}
-	}
-
-	*pbytes_returned = be32_to_cpu(out_buf->smb_buf_length);
-
-	/* BB special case reconnect tid and uid here? */
-	rc = map_smb_to_linux_error(out_buf, 0 /* no log */ );
-
+	*pbytes_returned = be32_to_cpu(midQ->resp_buf->smb_buf_length);
+	memcpy(out_buf, midQ->resp_buf, *pbytes_returned + 4);
+	rc = cifs_check_receive(midQ, ses->server, 0);
 out:
 	delete_mid(midQ);
 	if (rstart && rc == -EACCES)
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/7] cifs: make cifs_send_async take a kvec array
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2011-05-19 20:22   ` [PATCH 1/7] cifs: consolidate SendReceive response checks Jeff Layton
@ 2011-05-19 20:22   ` Jeff Layton
  2011-05-19 20:22   ` [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock Jeff Layton
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

We'll need this for async writes, so convert the call to take a kvec
array. CIFSSMBEcho is changed to put a kvec on the stack and pass
in the SMB buffer using that.

Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/cifsproto.h |    4 ++--
 fs/cifs/cifssmb.c   |    6 ++++--
 fs/cifs/transport.c |   13 +++++++------
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index bf140c2..b5407a6 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -67,8 +67,8 @@ extern char *cifs_compose_mount_options(const char *sb_mountdata,
 extern struct mid_q_entry *AllocMidQEntry(const struct smb_hdr *smb_buffer,
 					struct TCP_Server_Info *server);
 extern void DeleteMidQEntry(struct mid_q_entry *midEntry);
-extern int cifs_call_async(struct TCP_Server_Info *server,
-			   struct smb_hdr *in_buf, mid_callback_t *callback,
+extern int cifs_call_async(struct TCP_Server_Info *server, struct kvec *iov,
+			   unsigned int nvec, mid_callback_t *callback,
 			   void *cbdata);
 extern int SendReceive(const unsigned int /* xid */ , struct cifsSesInfo *,
 			struct smb_hdr * /* input */ ,
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index 64c2c91..b998996 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -725,6 +725,7 @@ CIFSSMBEcho(struct TCP_Server_Info *server)
 {
 	ECHO_REQ *smb;
 	int rc = 0;
+	struct kvec iov;
 
 	cFYI(1, "In echo request");
 
@@ -739,9 +740,10 @@ CIFSSMBEcho(struct TCP_Server_Info *server)
 	put_bcc(1, &smb->hdr);
 	smb->Data[0] = 'a';
 	inc_rfc1001_len(smb, 3);
+	iov.iov_base = smb;
+	iov.iov_len = be32_to_cpu(smb->hdr.smb_buf_length) + 4;
 
-	rc = cifs_call_async(server, (struct smb_hdr *)smb,
-				cifs_echo_callback, server);
+	rc = cifs_call_async(server, &iov, 1, cifs_echo_callback, server);
 	if (rc)
 		cFYI(1, "Echo request failed: %d", rc);
 
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 43633a7..67f59c7 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -342,11 +342,12 @@ wait_for_response(struct TCP_Server_Info *server, struct mid_q_entry *midQ)
  * the result. Caller is responsible for dealing with timeouts.
  */
 int
-cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
-		mid_callback_t *callback, void *cbdata)
+cifs_call_async(struct TCP_Server_Info *server, struct kvec *iov,
+		unsigned int nvec, mid_callback_t *callback, void *cbdata)
 {
 	int rc;
 	struct mid_q_entry *mid;
+	struct smb_hdr *hdr = (struct smb_hdr *)iov[0].iov_base;
 
 	rc = wait_for_free_request(server, CIFS_ASYNC_OP);
 	if (rc)
@@ -354,10 +355,10 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
 
 	/* enable signing if server requires it */
 	if (server->secMode & (SECMODE_SIGN_REQUIRED | SECMODE_SIGN_ENABLED))
-		in_buf->Flags2 |= SMBFLG2_SECURITY_SIGNATURE;
+		hdr->Flags2 |= SMBFLG2_SECURITY_SIGNATURE;
 
 	mutex_lock(&server->srv_mutex);
-	mid = AllocMidQEntry(in_buf, server);
+	mid = AllocMidQEntry(hdr, server);
 	if (mid == NULL) {
 		mutex_unlock(&server->srv_mutex);
 		return -ENOMEM;
@@ -368,7 +369,7 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
 	list_add_tail(&mid->qhead, &server->pending_mid_q);
 	spin_unlock(&GlobalMid_Lock);
 
-	rc = cifs_sign_smb(in_buf, server, &mid->sequence_number);
+	rc = cifs_sign_smb2(iov, nvec, server, &mid->sequence_number);
 	if (rc) {
 		mutex_unlock(&server->srv_mutex);
 		goto out_err;
@@ -380,7 +381,7 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
 #ifdef CONFIG_CIFS_STATS2
 	atomic_inc(&server->inSend);
 #endif
-	rc = smb_send(server, in_buf, be32_to_cpu(in_buf->smb_buf_length));
+	rc = smb_sendv(server, iov, nvec);
 #ifdef CONFIG_CIFS_STATS2
 	atomic_dec(&server->inSend);
 	mid->when_sent = jiffies;
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2011-05-19 20:22   ` [PATCH 1/7] cifs: consolidate SendReceive response checks Jeff Layton
  2011-05-19 20:22   ` [PATCH 2/7] cifs: make cifs_send_async take a kvec array Jeff Layton
@ 2011-05-19 20:22   ` Jeff Layton
       [not found]     ` <1305836578-26333-4-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2011-05-19 20:22   ` [PATCH 4/7] cifs: add ignore_pend flag to cifs_call_async Jeff Layton
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Holding the spinlock while we call this function means that it can't
sleep, which really limits what it can do. Taking it out from under
the spinlock also means less contention for this global lock.

Change the semantics such that the Global_MidLock is not held when
the callback is called. To do this requires that we take extra care
not to have sync_mid_result remove the mid from the list when the
mid is in a state where that has already happened. This prevents
list corruption when the mid is sitting on a private list for
reconnect or when cifsd is coming down.

Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/cifsglob.h  |    6 +++---
 fs/cifs/connect.c   |   30 ++++++++++++++++++++++++------
 fs/cifs/transport.c |   23 ++++++++---------------
 3 files changed, 35 insertions(+), 24 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 76b4517..fd877c1 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -543,9 +543,8 @@ struct mid_q_entry;
  * This is the prototype for the mid callback function. When creating one,
  * take special care to avoid deadlocks. Things to bear in mind:
  *
- * - it will be called by cifsd
- * - the GlobalMid_Lock will be held
- * - the mid will be removed from the pending_mid_q list
+ * - it will be called by cifsd, with no locks held
+ * - the mid will be removed from any lists
  */
 typedef void (mid_callback_t)(struct mid_q_entry *mid);
 
@@ -656,6 +655,7 @@ static inline void free_dfs_info_array(struct dfs_info3_param *param,
 #define   MID_RESPONSE_RECEIVED 4
 #define   MID_RETRY_NEEDED      8 /* session closed while this request out */
 #define   MID_RESPONSE_MALFORMED 0x10
+#define   MID_SHUTDOWN		 0x20
 
 /* Types of response buffer returned from SendReceive2 */
 #define   CIFS_NO_BUFFER        0    /* Response buffer not returned */
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index da284e3..ccb3ff8 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -138,6 +138,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
 	struct cifsSesInfo *ses;
 	struct cifsTconInfo *tcon;
 	struct mid_q_entry *mid_entry;
+	struct list_head retry_list;
 
 	spin_lock(&GlobalMid_Lock);
 	if (server->tcpStatus == CifsExiting) {
@@ -189,16 +190,23 @@ cifs_reconnect(struct TCP_Server_Info *server)
 	mutex_unlock(&server->srv_mutex);
 
 	/* mark submitted MIDs for retry and issue callback */
-	cFYI(1, "%s: issuing mid callbacks", __func__);
+	INIT_LIST_HEAD(&retry_list);
+	cFYI(1, "%s: moving mids to private list", __func__);
 	spin_lock(&GlobalMid_Lock);
 	list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
 		mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
 		if (mid_entry->midState == MID_REQUEST_SUBMITTED)
 			mid_entry->midState = MID_RETRY_NEEDED;
+		list_move(&mid_entry->qhead, &retry_list);
+	}
+	spin_unlock(&GlobalMid_Lock);
+
+	cFYI(1, "%s: moving mids to private list", __func__);
+	list_for_each_safe(tmp, tmp2, &retry_list) {
+		mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
 		list_del_init(&mid_entry->qhead);
 		mid_entry->callback(mid_entry);
 	}
-	spin_unlock(&GlobalMid_Lock);
 
 	while (server->tcpStatus == CifsNeedReconnect) {
 		try_to_freeze();
@@ -672,12 +680,12 @@ multi_t2_fnd:
 			mid_entry->when_received = jiffies;
 #endif
 			list_del_init(&mid_entry->qhead);
-			mid_entry->callback(mid_entry);
 			break;
 		}
 		spin_unlock(&GlobalMid_Lock);
 
 		if (mid_entry != NULL) {
+			mid_entry->callback(mid_entry);
 			/* Was previous buf put in mpx struct for multi-rsp? */
 			if (!isMultiRsp) {
 				/* smb buffer will be freed by user thread */
@@ -741,15 +749,25 @@ multi_t2_fnd:
 		cifs_small_buf_release(smallbuf);
 
 	if (!list_empty(&server->pending_mid_q)) {
+		struct list_head dispose_list;
+
+		INIT_LIST_HEAD(&dispose_list);
 		spin_lock(&GlobalMid_Lock);
 		list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
 			mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
-			cFYI(1, "Clearing Mid 0x%x - issuing callback",
-					 mid_entry->mid);
+			cFYI(1, "Clearing mid 0x%x", mid_entry->mid);
+			mid_entry->midState = MID_SHUTDOWN;
+			list_move(&mid_entry->qhead, &dispose_list);
+		}
+		spin_unlock(&GlobalMid_Lock);
+
+		/* now walk dispose list and issue callbacks */
+		list_for_each_safe(tmp, tmp2, &dispose_list) {
+			mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
+			cFYI(1, "Callback mid 0x%x", mid_entry->mid);
 			list_del_init(&mid_entry->qhead);
 			mid_entry->callback(mid_entry);
 		}
-		spin_unlock(&GlobalMid_Lock);
 		/* 1/8th of sec is more than enough time for them to exit */
 		msleep(125);
 	}
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 67f59c7..b3c3c6d 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -425,7 +425,7 @@ SendReceiveNoRsp(const unsigned int xid, struct cifsSesInfo *ses,
 }
 
 static int
-sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
+cifs_sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
 {
 	int rc = 0;
 
@@ -433,28 +433,21 @@ sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
 		mid->mid, mid->midState);
 
 	spin_lock(&GlobalMid_Lock);
-	/* ensure that it's no longer on the pending_mid_q */
-	list_del_init(&mid->qhead);
-
 	switch (mid->midState) {
 	case MID_RESPONSE_RECEIVED:
 		spin_unlock(&GlobalMid_Lock);
 		return rc;
-	case MID_REQUEST_SUBMITTED:
-		/* socket is going down, reject all calls */
-		if (server->tcpStatus == CifsExiting) {
-			cERROR(1, "%s: canceling mid=%d cmd=0x%x state=%d",
-			       __func__, mid->mid, mid->command, mid->midState);
-			rc = -EHOSTDOWN;
-			break;
-		}
 	case MID_RETRY_NEEDED:
 		rc = -EAGAIN;
 		break;
 	case MID_RESPONSE_MALFORMED:
 		rc = -EIO;
 		break;
+	case MID_SHUTDOWN:
+		rc = -EHOSTDOWN;
+		break;
 	default:
+		list_del_init(&mid->qhead);
 		cERROR(1, "%s: invalid mid state mid=%d state=%d", __func__,
 			mid->mid, mid->midState);
 		rc = -EIO;
@@ -617,7 +610,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses,
 
 	cifs_small_buf_release(in_buf);
 
-	rc = sync_mid_result(midQ, ses->server);
+	rc = cifs_sync_mid_result(midQ, ses->server);
 	if (rc != 0) {
 		atomic_dec(&ses->server->inFlight);
 		wake_up(&ses->server->request_q);
@@ -738,7 +731,7 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses,
 		spin_unlock(&GlobalMid_Lock);
 	}
 
-	rc = sync_mid_result(midQ, ses->server);
+	rc = cifs_sync_mid_result(midQ, ses->server);
 	if (rc != 0) {
 		atomic_dec(&ses->server->inFlight);
 		wake_up(&ses->server->request_q);
@@ -913,7 +906,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon,
 		rstart = 1;
 	}
 
-	rc = sync_mid_result(midQ, ses->server);
+	rc = cifs_sync_mid_result(midQ, ses->server);
 	if (rc != 0)
 		return rc;
 
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/7] cifs: add ignore_pend flag to cifs_call_async
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2011-05-19 20:22   ` [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock Jeff Layton
@ 2011-05-19 20:22   ` Jeff Layton
  2011-05-19 20:22   ` [PATCH 5/7] cifs: add cifs_async_writev Jeff Layton
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

The current code always ignores the max_pending limit. Have it instead
only optionally ignore the pending limit. For CIFSSMBEcho, we need to
ignore it to make sure they always can go out. For async reads, writes
and potentially other calls, we need to respect it.

Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/cifsproto.h |    2 +-
 fs/cifs/cifssmb.c   |    2 +-
 fs/cifs/transport.c |    5 +++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index b5407a6..6931631 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -69,7 +69,7 @@ extern struct mid_q_entry *AllocMidQEntry(const struct smb_hdr *smb_buffer,
 extern void DeleteMidQEntry(struct mid_q_entry *midEntry);
 extern int cifs_call_async(struct TCP_Server_Info *server, struct kvec *iov,
 			   unsigned int nvec, mid_callback_t *callback,
-			   void *cbdata);
+			   void *cbdata, bool ignore_pend);
 extern int SendReceive(const unsigned int /* xid */ , struct cifsSesInfo *,
 			struct smb_hdr * /* input */ ,
 			struct smb_hdr * /* out */ ,
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index b998996..0aff051 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -743,7 +743,7 @@ CIFSSMBEcho(struct TCP_Server_Info *server)
 	iov.iov_base = smb;
 	iov.iov_len = be32_to_cpu(smb->hdr.smb_buf_length) + 4;
 
-	rc = cifs_call_async(server, &iov, 1, cifs_echo_callback, server);
+	rc = cifs_call_async(server, &iov, 1, cifs_echo_callback, server, true);
 	if (rc)
 		cFYI(1, "Echo request failed: %d", rc);
 
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index b3c3c6d..d1998b6 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -343,13 +343,14 @@ wait_for_response(struct TCP_Server_Info *server, struct mid_q_entry *midQ)
  */
 int
 cifs_call_async(struct TCP_Server_Info *server, struct kvec *iov,
-		unsigned int nvec, mid_callback_t *callback, void *cbdata)
+		unsigned int nvec, mid_callback_t *callback, void *cbdata,
+		bool ignore_pend)
 {
 	int rc;
 	struct mid_q_entry *mid;
 	struct smb_hdr *hdr = (struct smb_hdr *)iov[0].iov_base;
 
-	rc = wait_for_free_request(server, CIFS_ASYNC_OP);
+	rc = wait_for_free_request(server, ignore_pend ? CIFS_ASYNC_OP : 0);
 	if (rc)
 		return rc;
 
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/7] cifs: add cifs_async_writev
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                     ` (3 preceding siblings ...)
  2011-05-19 20:22   ` [PATCH 4/7] cifs: add ignore_pend flag to cifs_call_async Jeff Layton
@ 2011-05-19 20:22   ` Jeff Layton
  2011-05-19 20:22   ` [PATCH 6/7] cifs: convert cifs_writepages to use async writes Jeff Layton
  2011-05-19 20:22   ` [PATCH 7/7] cifs: clean up wsize negotiation and allow for larger wsize Jeff Layton
  6 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Add the ability for CIFS to do an asynchronous write. The kernel will
set the frame up as it would for a "normal" SMBWrite2 request, and use
cifs_call_async to send it. The mid callback will then be configured to
handle the result.

Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/cifsproto.h |   18 ++++
 fs/cifs/cifssmb.c   |  236 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 254 insertions(+), 0 deletions(-)

diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 6931631..41db348 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -438,4 +438,22 @@ extern int mdfour(unsigned char *, unsigned char *, int);
 extern int E_md4hash(const unsigned char *passwd, unsigned char *p16);
 extern int SMBencrypt(unsigned char *passwd, const unsigned char *c8,
 			unsigned char *p24);
+
+/* asynchronous write support */
+struct cifs_writedata {
+	struct kref			refcount;
+	enum writeback_sync_modes	sync_mode;
+	struct work_struct		work;
+	struct cifsFileInfo		*cfile;
+	__u64				offset;
+	unsigned int			bytes;
+	int				result;
+	unsigned int			nr_pages;
+	struct page			*pages[1];
+};
+
+int cifs_async_writev(struct cifs_writedata *wdata);
+struct cifs_writedata *cifs_writedata_alloc(unsigned int nr_pages);
+void cifs_writedata_release(struct kref *refcount);
+
 #endif			/* _CIFSPROTO_H */
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index 0aff051..690c8ca 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -32,6 +32,7 @@
 #include <linux/vfs.h>
 #include <linux/slab.h>
 #include <linux/posix_acl_xattr.h>
+#include <linux/pagemap.h>
 #include <asm/uaccess.h>
 #include "cifspdu.h"
 #include "cifsglob.h"
@@ -1604,6 +1605,241 @@ CIFSSMBWrite(const int xid, struct cifsTconInfo *tcon,
 	return rc;
 }
 
+void
+cifs_writedata_release(struct kref *refcount)
+{
+	struct cifs_writedata *wdata = container_of(refcount,
+					struct cifs_writedata, refcount);
+
+	if (wdata->cfile)
+		cifsFileInfo_put(wdata->cfile);
+
+	kfree(wdata);
+}
+
+/*
+ * Write failed with a retryable error. Resend the write request. It's also
+ * possible that the page was redirtied so re-clean the page.
+ */
+static void
+cifs_writev_requeue(struct cifs_writedata *wdata)
+{
+	int i, rc;
+	struct inode *inode = wdata->cfile->dentry->d_inode;
+
+	for (i = 0; i < wdata->nr_pages; i++) {
+		lock_page(wdata->pages[i]);
+		clear_page_dirty_for_io(wdata->pages[i]);
+	}
+
+	do {
+		rc = cifs_async_writev(wdata);
+	} while (rc == -EAGAIN);
+
+	for (i = 0; i < wdata->nr_pages; i++) {
+		if (rc != 0)
+			SetPageError(wdata->pages[i]);
+		unlock_page(wdata->pages[i]);
+	}
+
+	mapping_set_error(inode->i_mapping, rc);
+	kref_put(&wdata->refcount, cifs_writedata_release);
+}
+
+static void
+cifs_writev_complete(struct work_struct *work)
+{
+	struct cifs_writedata *wdata = container_of(work,
+						struct cifs_writedata, work);
+	struct inode *inode = wdata->cfile->dentry->d_inode;
+	int i = 0;
+
+	if (wdata->result == 0) {
+		cifs_update_eof(CIFS_I(inode), wdata->offset, wdata->bytes);
+		cifs_stats_bytes_written(tlink_tcon(wdata->cfile->tlink),
+					 wdata->bytes);
+	} else if (wdata->sync_mode == WB_SYNC_ALL && wdata->result == -EAGAIN)
+		return cifs_writev_requeue(wdata);
+
+	for (i = 0; i < wdata->nr_pages; i++) {
+		struct page *page = wdata->pages[i];
+		if (wdata->result == -EAGAIN)
+			__set_page_dirty_nobuffers(page);
+		else if (wdata->result < 0)
+			SetPageError(page);
+		end_page_writeback(page);
+		page_cache_release(page);
+	}
+	if (wdata->result != -EAGAIN)
+		mapping_set_error(inode->i_mapping, wdata->result);
+	kref_put(&wdata->refcount, cifs_writedata_release);
+}
+
+struct cifs_writedata *
+cifs_writedata_alloc(unsigned int nr_pages)
+{
+	struct cifs_writedata *wdata;
+
+	/* this would overflow */
+	if (nr_pages == 0) {
+		cERROR(1, "%s: called with nr_pages == 0!", __func__);
+		return NULL;
+	}
+
+	/* writedata + number of page pointers */
+	wdata = kzalloc(sizeof(*wdata) +
+			sizeof(struct page *) * (nr_pages - 1), GFP_NOFS);
+	if (wdata != NULL) {
+		INIT_WORK(&wdata->work, cifs_writev_complete);
+		kref_init(&wdata->refcount);
+	}
+	return wdata;
+}
+
+/*
+ * Check the midState and signature on received buffer (if any), and queue the
+ * workqueue completion task.
+ */
+static void
+cifs_writev_callback(struct mid_q_entry *mid)
+{
+	struct cifs_writedata *wdata = mid->callback_data;
+	struct cifsTconInfo *tcon = tlink_tcon(wdata->cfile->tlink);
+	unsigned int written;
+	WRITE_RSP *smb = (WRITE_RSP *)mid->resp_buf;
+
+	switch (mid->midState) {
+	case MID_RESPONSE_RECEIVED:
+		wdata->result = cifs_check_receive(mid, tcon->ses->server, 0);
+		if (wdata->result != 0)
+			break;
+
+		written = le16_to_cpu(smb->CountHigh);
+		written <<= 16;
+		written += le16_to_cpu(smb->Count);
+		/*
+		 * Mask off high 16 bits when bytes written as returned
+		 * by the server is greater than bytes requested by the
+		 * client. OS/2 servers are known to set incorrect
+		 * CountHigh values.
+		 */
+		if (written > wdata->bytes)
+			written &= 0xFFFF;
+
+		if (written < wdata->bytes)
+			wdata->result = -ENOSPC;
+		else
+			wdata->bytes = written;
+		break;
+	case MID_REQUEST_SUBMITTED:
+	case MID_RETRY_NEEDED:
+		wdata->result = -EAGAIN;
+		break;
+	default:
+		wdata->result = -EIO;
+		break;
+	}
+
+	queue_work(system_nrt_wq, &wdata->work);
+	DeleteMidQEntry(mid);
+	atomic_dec(&tcon->ses->server->inFlight);
+	wake_up(&tcon->ses->server->request_q);
+}
+
+/* cifs_async_writev - send an async write, and set up mid to handle result */
+int
+cifs_async_writev(struct cifs_writedata *wdata)
+{
+	int i, rc = -EACCES;
+	WRITE_REQ *smb = NULL;
+	int wct;
+	struct cifsTconInfo *tcon = tlink_tcon(wdata->cfile->tlink);
+	struct inode *inode = wdata->cfile->dentry->d_inode;
+	struct kvec *iov = NULL;
+
+	if (tcon->ses->capabilities & CAP_LARGE_FILES) {
+		wct = 14;
+	} else {
+		wct = 12;
+		if (wdata->offset >> 32 > 0) {
+			/* can not handle big offset for old srv */
+			return -EIO;
+		}
+	}
+
+	rc = small_smb_init(SMB_COM_WRITE_ANDX, wct, tcon, (void **)&smb);
+	if (rc)
+		goto async_writev_out;
+
+	/* 1 iov per page + 1 for header */
+	iov = kzalloc((wdata->nr_pages + 1) * sizeof(*iov), GFP_NOFS);
+	if (iov == NULL) {
+		rc = -ENOMEM;
+		goto async_writev_out;
+	}
+
+	smb->AndXCommand = 0xFF;	/* none */
+	smb->Fid = wdata->cfile->netfid;
+	smb->OffsetLow = cpu_to_le32(wdata->offset & 0xFFFFFFFF);
+	if (wct == 14)
+		smb->OffsetHigh = cpu_to_le32(wdata->offset >> 32);
+	smb->Reserved = 0xFFFFFFFF;
+	smb->WriteMode = 0;
+	smb->Remaining = 0;
+
+	smb->DataOffset =
+	    cpu_to_le16(offsetof(struct smb_com_write_req, Data) - 4);
+
+	/* 4 for RFC1001 length + 1 for BCC */
+	iov[0].iov_len = be32_to_cpu(smb->hdr.smb_buf_length) + 4 + 1;
+	iov[0].iov_base = smb;
+
+	/* marshal up the pages into iov array */
+	wdata->bytes = 0;
+	for (i = 0; i < wdata->nr_pages; i++) {
+		iov[i + 1].iov_len = min(inode->i_size -
+				      page_offset(wdata->pages[i]),
+					(loff_t)PAGE_CACHE_SIZE);
+		iov[i + 1].iov_base = kmap(wdata->pages[i]);
+		wdata->bytes += iov[i + 1].iov_len;
+	}
+
+	cFYI(1, "async write at %llu %u bytes", wdata->offset, wdata->bytes);
+
+	smb->DataLengthLow = cpu_to_le16(wdata->bytes & 0xFFFF);
+	smb->DataLengthHigh = cpu_to_le16(wdata->bytes >> 16);
+
+	if (wct == 14) {
+		inc_rfc1001_len(&smb->hdr, wdata->bytes + 1);
+		put_bcc(wdata->bytes + 1, &smb->hdr);
+	} else {
+		/* wct == 12 */
+		struct smb_com_writex_req *smbw =
+				(struct smb_com_writex_req *)smb;
+		inc_rfc1001_len(&smbw->hdr, wdata->bytes + 5);
+		put_bcc(wdata->bytes + 5, &smbw->hdr);
+		iov[0].iov_len += 4; /* pad bigger by four bytes */
+	}
+
+	kref_get(&wdata->refcount);
+	rc = cifs_call_async(tcon->ses->server, iov, wdata->nr_pages + 1,
+			     cifs_writev_callback, wdata, false);
+
+	if (rc == 0)
+		cifs_stats_inc(&tcon->num_writes);
+	else
+		kref_put(&wdata->refcount, cifs_writedata_release);
+
+	/* send is done, unmap pages */
+	for (i = 0; i < wdata->nr_pages; i++)
+		kunmap(wdata->pages[i]);
+
+async_writev_out:
+	cifs_small_buf_release(smb);
+	kfree(iov);
+	return rc;
+}
+
 int
 CIFSSMBWrite2(const int xid, struct cifsTconInfo *tcon,
 	     const int netfid, const unsigned int count,
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/7] cifs: convert cifs_writepages to use async writes
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                     ` (4 preceding siblings ...)
  2011-05-19 20:22   ` [PATCH 5/7] cifs: add cifs_async_writev Jeff Layton
@ 2011-05-19 20:22   ` Jeff Layton
       [not found]     ` <BANLkTik=mmwW1Hssfi4K4mzfhfKZj-9rEg@mail.gmail.com>
  2011-05-19 20:22   ` [PATCH 7/7] cifs: clean up wsize negotiation and allow for larger wsize Jeff Layton
  6 siblings, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Have cifs_writepages issue asynchronous writes instead of waiting on
each write call to complete before issuing another. This also allows us
to return more quickly from writepages. It can just send out all of the
I/Os and not wait around for the replies.

In the WB_SYNC_ALL case, if the write completes with a retryable error,
then the completion workqueue job will resend the write.

This also changes the page locking semantics a little bit. Instead of
holding the page lock until the response is received, release it after
doing the send. This will reduce contention for the page lock and should
prevent processes that have the file mmap'ed from being blocked
unnecessarily.

Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/file.c |  241 +++++++++++++++++++++++---------------------------------
 1 files changed, 99 insertions(+), 142 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 0aeaaf7..b81bbd8 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1092,58 +1092,20 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
 static int cifs_writepages(struct address_space *mapping,
 			   struct writeback_control *wbc)
 {
-	unsigned int bytes_to_write;
-	unsigned int bytes_written;
-	struct cifs_sb_info *cifs_sb;
-	int done = 0;
-	pgoff_t end;
-	pgoff_t index;
-	int range_whole = 0;
-	struct kvec *iov;
-	int len;
-	int n_iov = 0;
-	pgoff_t next;
-	int nr_pages;
-	__u64 offset = 0;
-	struct cifsFileInfo *open_file;
-	struct cifsTconInfo *tcon;
-	struct cifsInodeInfo *cifsi = CIFS_I(mapping->host);
+	struct cifs_sb_info *cifs_sb = CIFS_SB(mapping->host->i_sb);
+	bool done = false, scanned = false, range_whole = false;
+	pgoff_t end, index;
+	struct cifs_writedata *wdata;
 	struct page *page;
-	struct pagevec pvec;
 	int rc = 0;
-	int scanned = 0;
-	int xid;
-
-	cifs_sb = CIFS_SB(mapping->host->i_sb);
 
 	/*
-	 * If wsize is smaller that the page cache size, default to writing
+	 * If wsize is smaller than the page cache size, default to writing
 	 * one page at a time via cifs_writepage
 	 */
 	if (cifs_sb->wsize < PAGE_CACHE_SIZE)
 		return generic_writepages(mapping, wbc);
 
-	iov = kmalloc(32 * sizeof(struct kvec), GFP_KERNEL);
-	if (iov == NULL)
-		return generic_writepages(mapping, wbc);
-
-	/*
-	 * if there's no open file, then this is likely to fail too,
-	 * but it'll at least handle the return. Maybe it should be
-	 * a BUG() instead?
-	 */
-	open_file = find_writable_file(CIFS_I(mapping->host), false);
-	if (!open_file) {
-		kfree(iov);
-		return generic_writepages(mapping, wbc);
-	}
-
-	tcon = tlink_tcon(open_file->tlink);
-	cifsFileInfo_put(open_file);
-
-	xid = GetXid();
-
-	pagevec_init(&pvec, 0);
 	if (wbc->range_cyclic) {
 		index = mapping->writeback_index; /* Start from prev offset */
 		end = -1;
@@ -1151,24 +1113,49 @@ static int cifs_writepages(struct address_space *mapping,
 		index = wbc->range_start >> PAGE_CACHE_SHIFT;
 		end = wbc->range_end >> PAGE_CACHE_SHIFT;
 		if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
-			range_whole = 1;
-		scanned = 1;
+			range_whole = true;
+		scanned = true;
 	}
 retry:
-	while (!done && (index <= end) &&
-	       (nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
-			PAGECACHE_TAG_DIRTY,
-			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1))) {
-		int first;
-		unsigned int i;
-
-		first = -1;
-		next = 0;
-		n_iov = 0;
-		bytes_to_write = 0;
-
-		for (i = 0; i < nr_pages; i++) {
-			page = pvec.pages[i];
+	while (!done && index <= end) {
+		unsigned int i, nr_pages, found_pages;
+		pgoff_t next = 0, tofind;
+		struct page **pages;
+
+		tofind = min((cifs_sb->wsize / PAGE_CACHE_SIZE) - 1,
+				end - index) + 1;
+
+		wdata = cifs_writedata_alloc((unsigned int)tofind);
+		if (!wdata) {
+			rc = -ENOMEM;
+			break;
+		}
+
+		/*
+		 * find_get_pages_tag seems to return a max of 256 on each
+		 * iteration, so we must call it several times in order to
+		 * fill the array or the wsize is effectively limited to
+		 * 256 * PAGE_CACHE_SIZE.
+		 */
+		found_pages = 0;
+		pages = wdata->pages;
+		do {
+			nr_pages = find_get_pages_tag(mapping, &index,
+							PAGECACHE_TAG_DIRTY,
+							tofind, pages);
+			found_pages += nr_pages;
+			tofind -= nr_pages;
+			pages += nr_pages;
+		} while (nr_pages && tofind && index <= end);
+
+		if (found_pages == 0) {
+			kref_put(&wdata->refcount, cifs_writedata_release);
+			break;
+		}
+
+		nr_pages = 0;
+		for (i = 0; i < found_pages; i++) {
+			page = wdata->pages[i];
 			/*
 			 * At this point we hold neither mapping->tree_lock nor
 			 * lock on the page itself: the page may be truncated or
@@ -1177,7 +1164,7 @@ retry:
 			 * mapping
 			 */
 
-			if (first < 0)
+			if (nr_pages == 0)
 				lock_page(page);
 			else if (!trylock_page(page))
 				break;
@@ -1188,7 +1175,7 @@ retry:
 			}
 
 			if (!wbc->range_cyclic && page->index > end) {
-				done = 1;
+				done = true;
 				unlock_page(page);
 				break;
 			}
@@ -1215,119 +1202,89 @@ retry:
 			set_page_writeback(page);
 
 			if (page_offset(page) >= mapping->host->i_size) {
-				done = 1;
+				done = true;
 				unlock_page(page);
 				end_page_writeback(page);
 				break;
 			}
 
-			/*
-			 * BB can we get rid of this?  pages are held by pvec
-			 */
-			page_cache_get(page);
+			wdata->pages[i] = page;
+			next = page->index + 1;
+			++nr_pages;
+		}
 
-			len = min(mapping->host->i_size - page_offset(page),
-				  (loff_t)PAGE_CACHE_SIZE);
+		/* reset index to refind any pages skipped */
+		if (nr_pages == 0)
+			index = wdata->pages[0]->index + 1;
 
-			/* reserve iov[0] for the smb header */
-			n_iov++;
-			iov[n_iov].iov_base = kmap(page);
-			iov[n_iov].iov_len = len;
-			bytes_to_write += len;
+		/* put any pages we aren't going to use */
+		for (i = nr_pages; i < found_pages; i++) {
+			page_cache_release(wdata->pages[i]);
+			wdata->pages[i] = NULL;
+		}
 
-			if (first < 0) {
-				first = i;
-				offset = page_offset(page);
-			}
-			next = page->index + 1;
-			if (bytes_to_write + PAGE_CACHE_SIZE > cifs_sb->wsize)
-				break;
+		/* nothing to write? */
+		if (nr_pages == 0) {
+			kref_put(&wdata->refcount, cifs_writedata_release);
+			continue;
 		}
-		if (n_iov) {
-retry_write:
-			open_file = find_writable_file(CIFS_I(mapping->host),
-							false);
-			if (!open_file) {
-				cERROR(1, "No writable handles for inode");
-				rc = -EBADF;
-			} else {
-				rc = CIFSSMBWrite2(xid, tcon, open_file->netfid,
-						   bytes_to_write, offset,
-						   &bytes_written, iov, n_iov,
-						   0);
-				cifsFileInfo_put(open_file);
-			}
 
-			cFYI(1, "Write2 rc=%d, wrote=%u", rc, bytes_written);
+		wdata->sync_mode = wbc->sync_mode;
+		wdata->nr_pages = nr_pages;
+		wdata->offset = page_offset(wdata->pages[0]);
 
-			/*
-			 * For now, treat a short write as if nothing got
-			 * written. A zero length write however indicates
-			 * ENOSPC or EFBIG. We have no way to know which
-			 * though, so call it ENOSPC for now. EFBIG would
-			 * get translated to AS_EIO anyway.
-			 *
-			 * FIXME: make it take into account the data that did
-			 *	  get written
-			 */
-			if (rc == 0) {
-				if (bytes_written == 0)
-					rc = -ENOSPC;
-				else if (bytes_written < bytes_to_write)
-					rc = -EAGAIN;
+		do {
+			if (wdata->cfile != NULL)
+				cifsFileInfo_put(wdata->cfile);
+			wdata->cfile = find_writable_file(CIFS_I(mapping->host),
+							  false);
+			if (!wdata->cfile) {
+				cERROR(1, "No writable handles for inode");
+				rc = -EBADF;
+				break;
 			}
+			rc = cifs_async_writev(wdata);
+		} while (wbc->sync_mode == WB_SYNC_ALL && rc == -EAGAIN);
 
-			/* retry on data-integrity flush */
-			if (wbc->sync_mode == WB_SYNC_ALL && rc == -EAGAIN)
-				goto retry_write;
+		for (i = 0; i < nr_pages; ++i)
+			unlock_page(wdata->pages[i]);
 
-			/* fix the stats and EOF */
-			if (bytes_written > 0) {
-				cifs_stats_bytes_written(tcon, bytes_written);
-				cifs_update_eof(cifsi, offset, bytes_written);
-			}
-
-			for (i = 0; i < n_iov; i++) {
-				page = pvec.pages[first + i];
-				/* on retryable write error, redirty page */
+		/* send failure -- clean up the mess */
+		if (rc != 0) {
+			for (i = 0; i < nr_pages; ++i) {
 				if (rc == -EAGAIN)
-					redirty_page_for_writepage(wbc, page);
-				else if (rc != 0)
-					SetPageError(page);
-				kunmap(page);
-				unlock_page(page);
-				end_page_writeback(page);
-				page_cache_release(page);
+					redirty_page_for_writepage(wbc,
+							   wdata->pages[i]);
+				else
+					SetPageError(wdata->pages[i]);
+				end_page_writeback(wdata->pages[i]);
+				page_cache_release(wdata->pages[i]);
 			}
-
 			if (rc != -EAGAIN)
 				mapping_set_error(mapping, rc);
-			else
-				rc = 0;
+		}
+		kref_put(&wdata->refcount, cifs_writedata_release);
 
-			if ((wbc->nr_to_write -= n_iov) <= 0)
-				done = 1;
-			index = next;
-		} else
-			/* Need to re-find the pages we skipped */
-			index = pvec.pages[0]->index + 1;
+		wbc->nr_to_write -= nr_pages;
+		if (wbc->nr_to_write <= 0)
+			done = true;
 
-		pagevec_release(&pvec);
+		index = next;
 	}
+
 	if (!scanned && !done) {
 		/*
 		 * We hit the last page and there is more work to be done: wrap
 		 * back to the start of the file
 		 */
-		scanned = 1;
+		scanned = true;
 		index = 0;
 		goto retry;
 	}
+
 	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
 		mapping->writeback_index = index;
 
-	FreeXid(xid);
-	kfree(iov);
 	return rc;
 }
 
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/7] cifs: clean up wsize negotiation and allow for larger wsize
       [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                     ` (5 preceding siblings ...)
  2011-05-19 20:22   ` [PATCH 6/7] cifs: convert cifs_writepages to use async writes Jeff Layton
@ 2011-05-19 20:22   ` Jeff Layton
       [not found]     ` <1305836578-26333-8-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  6 siblings, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2011-05-19 20:22 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Now that we can handle larger wsizes in writepages, fix up the
negotiation of the wsize to allow for that. find_get_pages only seems to
give out a max of 256 pages at a time, so that gives us a reasonable
default of 1M for the wsize.

If the server however does not support large writes via POSIX
extensions, then we cap the wsize to (128k - PAGE_CACHE_SIZE). That
gives us a size that goes up to the max frame size specified in RFC1001.

Finally, if CAP_LARGE_WRITE_AND_X isn't set, then further cap it to the
largest size allowed by the protocol (USHRT_MAX).

Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/connect.c |   69 +++++++++++++++++++++++++++++++++++++---------------
 1 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index ccb3ff8..490fb68 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -2633,23 +2633,6 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info,
 	else /* default */
 		cifs_sb->rsize = CIFSMaxBufSize;
 
-	if (pvolume_info->wsize > PAGEVEC_SIZE * PAGE_CACHE_SIZE) {
-		cERROR(1, "wsize %d too large, using 4096 instead",
-			  pvolume_info->wsize);
-		cifs_sb->wsize = 4096;
-	} else if (pvolume_info->wsize)
-		cifs_sb->wsize = pvolume_info->wsize;
-	else
-		cifs_sb->wsize = min_t(const int,
-					PAGEVEC_SIZE * PAGE_CACHE_SIZE,
-					127*1024);
-		/* old default of CIFSMaxBufSize was too small now
-		   that SMB Write2 can send multiple pages in kvec.
-		   RFC1001 does not describe what happens when frame
-		   bigger than 128K is sent so use that as max in
-		   conjunction with 52K kvec constraint on arch with 4K
-		   page size  */
-
 	if (cifs_sb->rsize < 2048) {
 		cifs_sb->rsize = 2048;
 		/* Windows ME may prefer this */
@@ -2727,6 +2710,53 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info,
 			   "mount option supported");
 }
 
+/*
+ * When the server supports very large writes via POSIX extensions, we can
+ * allow up to 2^24 - PAGE_CACHE_SIZE.
+ *
+ * Note that this might make for "interesting" allocation problems during
+ * writeback however (as we have to allocate an array of pointers for the
+ * pages). A 16M write means ~32kb page array with PAGE_CACHE_SIZE == 4096.
+ */
+#define CIFS_MAX_WSIZE ((1<<24) - PAGE_CACHE_SIZE)
+
+/*
+ * When the server doesn't allow large posix writes, default to a wsize of
+ * 128k - PAGE_CACHE_SIZE -- one page less than the largest frame size
+ * described in RFC1001. This allows space for the header without going over
+ * that by default.
+ */
+#define CIFS_MAX_RFC1001_WSIZE (128 * 1024 - PAGE_CACHE_SIZE)
+
+/*
+ * The default wsize is 1M. find_get_pages seems to return a maximum of 256
+ * pages in a single call. With PAGE_CACHE_SIZE == 4k, this means we can fill
+ * a single wsize request with a single call.
+ */
+#define CIFS_DEFAULT_WSIZE (1024 * 1024)
+
+static unsigned int
+cifs_negotiate_wsize(struct cifsTconInfo *tcon, struct smb_vol *pvolume_info)
+{
+	__u64 unix_cap = le64_to_cpu(tcon->fsUnixInfo.Capability);
+	struct TCP_Server_Info *server = tcon->ses->server;
+	unsigned int wsize = pvolume_info->wsize ? pvolume_info->wsize :
+				CIFS_DEFAULT_WSIZE;
+
+	/* can server support 24-bit write sizes? (via UNIX extensions) */
+	if (!tcon->unix_ext || !(unix_cap & CIFS_UNIX_LARGE_WRITE_CAP))
+		wsize = min_t(unsigned int, wsize, CIFS_MAX_RFC1001_WSIZE);
+
+	/* no CAP_LARGE_WRITE_X? Limit it to 16 bits */
+	if (!(server->capabilities & CAP_LARGE_WRITE_X))
+		wsize = min_t(unsigned int, wsize, USHRT_MAX);
+
+	/* hard limit of CIFS_MAX_WSIZE */
+	wsize = min_t(unsigned int, wsize, CIFS_MAX_WSIZE);
+
+	return wsize;
+}
+
 static int
 is_path_accessible(int xid, struct cifsTconInfo *tcon,
 		   struct cifs_sb_info *cifs_sb, const char *full_path)
@@ -2988,13 +3018,12 @@ try_mount_again:
 		cifs_sb->rsize = 1024 * 127;
 		cFYI(DBG2, "no very large read support, rsize now 127K");
 	}
-	if (!(tcon->ses->capabilities & CAP_LARGE_WRITE_X))
-		cifs_sb->wsize = min(cifs_sb->wsize,
-			       (tcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE));
 	if (!(tcon->ses->capabilities & CAP_LARGE_READ_X))
 		cifs_sb->rsize = min(cifs_sb->rsize,
 			       (tcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE));
 
+	cifs_sb->wsize = cifs_negotiate_wsize(tcon, volume_info);
+
 remote_path_check:
 #ifdef CONFIG_CIFS_DFS_UPCALL
 	/*
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock
       [not found]     ` <1305836578-26333-4-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2011-05-20 18:32       ` Shirish Pargaonkar
       [not found]         ` <BANLkTimzEy+GxOjiC3yhjUD2ynoiRDRkag-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Shirish Pargaonkar @ 2011-05-20 18:32 UTC (permalink / raw)
  To: Jeff Layton
  Cc: smfrench-Re5JQEeQqe8AvxtiuMwx3w, linux-cifs-u79uwXL29TY76Z2rM5mHXA

On Thu, May 19, 2011 at 3:22 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> Holding the spinlock while we call this function means that it can't
> sleep, which really limits what it can do. Taking it out from under
> the spinlock also means less contention for this global lock.
>
> Change the semantics such that the Global_MidLock is not held when
> the callback is called. To do this requires that we take extra care
> not to have sync_mid_result remove the mid from the list when the
> mid is in a state where that has already happened. This prevents
> list corruption when the mid is sitting on a private list for
> reconnect or when cifsd is coming down.
>
> Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
> Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/cifs/cifsglob.h  |    6 +++---
>  fs/cifs/connect.c   |   30 ++++++++++++++++++++++++------
>  fs/cifs/transport.c |   23 ++++++++---------------
>  3 files changed, 35 insertions(+), 24 deletions(-)
>
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
> index 76b4517..fd877c1 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -543,9 +543,8 @@ struct mid_q_entry;
>  * This is the prototype for the mid callback function. When creating one,
>  * take special care to avoid deadlocks. Things to bear in mind:
>  *
> - * - it will be called by cifsd
> - * - the GlobalMid_Lock will be held
> - * - the mid will be removed from the pending_mid_q list
> + * - it will be called by cifsd, with no locks held
> + * - the mid will be removed from any lists
>  */
>  typedef void (mid_callback_t)(struct mid_q_entry *mid);
>
> @@ -656,6 +655,7 @@ static inline void free_dfs_info_array(struct dfs_info3_param *param,
>  #define   MID_RESPONSE_RECEIVED 4
>  #define   MID_RETRY_NEEDED      8 /* session closed while this request out */
>  #define   MID_RESPONSE_MALFORMED 0x10
> +#define   MID_SHUTDOWN          0x20
>
>  /* Types of response buffer returned from SendReceive2 */
>  #define   CIFS_NO_BUFFER        0    /* Response buffer not returned */
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index da284e3..ccb3ff8 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -138,6 +138,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
>        struct cifsSesInfo *ses;
>        struct cifsTconInfo *tcon;
>        struct mid_q_entry *mid_entry;
> +       struct list_head retry_list;
>
>        spin_lock(&GlobalMid_Lock);
>        if (server->tcpStatus == CifsExiting) {
> @@ -189,16 +190,23 @@ cifs_reconnect(struct TCP_Server_Info *server)
>        mutex_unlock(&server->srv_mutex);
>
>        /* mark submitted MIDs for retry and issue callback */
> -       cFYI(1, "%s: issuing mid callbacks", __func__);
> +       INIT_LIST_HEAD(&retry_list);
> +       cFYI(1, "%s: moving mids to private list", __func__);
>        spin_lock(&GlobalMid_Lock);
>        list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
>                mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
>                if (mid_entry->midState == MID_REQUEST_SUBMITTED)
>                        mid_entry->midState = MID_RETRY_NEEDED;
> +               list_move(&mid_entry->qhead, &retry_list);
> +       }
> +       spin_unlock(&GlobalMid_Lock);
> +
> +       cFYI(1, "%s: moving mids to private list", __func__);

Does this comment repeat here?

> +       list_for_each_safe(tmp, tmp2, &retry_list) {
> +               mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
>                list_del_init(&mid_entry->qhead);

And if we called list_move on this entry earlier, does this entry exist
on the qhead list anymore i.e. did not list_move that care of list_del part?

>                mid_entry->callback(mid_entry);
>        }
> -       spin_unlock(&GlobalMid_Lock);
>
>        while (server->tcpStatus == CifsNeedReconnect) {
>                try_to_freeze();
> @@ -672,12 +680,12 @@ multi_t2_fnd:
>                        mid_entry->when_received = jiffies;
>  #endif
>                        list_del_init(&mid_entry->qhead);
> -                       mid_entry->callback(mid_entry);
>                        break;
>                }
>                spin_unlock(&GlobalMid_Lock);
>
>                if (mid_entry != NULL) {
> +                       mid_entry->callback(mid_entry);
>                        /* Was previous buf put in mpx struct for multi-rsp? */
>                        if (!isMultiRsp) {
>                                /* smb buffer will be freed by user thread */
> @@ -741,15 +749,25 @@ multi_t2_fnd:
>                cifs_small_buf_release(smallbuf);
>
>        if (!list_empty(&server->pending_mid_q)) {
> +               struct list_head dispose_list;
> +
> +               INIT_LIST_HEAD(&dispose_list);
>                spin_lock(&GlobalMid_Lock);
>                list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
>                        mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> -                       cFYI(1, "Clearing Mid 0x%x - issuing callback",
> -                                        mid_entry->mid);
> +                       cFYI(1, "Clearing mid 0x%x", mid_entry->mid);
> +                       mid_entry->midState = MID_SHUTDOWN;
> +                       list_move(&mid_entry->qhead, &dispose_list);
> +               }
> +               spin_unlock(&GlobalMid_Lock);
> +
> +               /* now walk dispose list and issue callbacks */
> +               list_for_each_safe(tmp, tmp2, &dispose_list) {
> +                       mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> +                       cFYI(1, "Callback mid 0x%x", mid_entry->mid);
>                        list_del_init(&mid_entry->qhead);
>                        mid_entry->callback(mid_entry);
>                }
> -               spin_unlock(&GlobalMid_Lock);
>                /* 1/8th of sec is more than enough time for them to exit */
>                msleep(125);
>        }
> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> index 67f59c7..b3c3c6d 100644
> --- a/fs/cifs/transport.c
> +++ b/fs/cifs/transport.c
> @@ -425,7 +425,7 @@ SendReceiveNoRsp(const unsigned int xid, struct cifsSesInfo *ses,
>  }
>
>  static int
> -sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
> +cifs_sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
>  {
>        int rc = 0;
>
> @@ -433,28 +433,21 @@ sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
>                mid->mid, mid->midState);
>
>        spin_lock(&GlobalMid_Lock);
> -       /* ensure that it's no longer on the pending_mid_q */
> -       list_del_init(&mid->qhead);
> -
>        switch (mid->midState) {
>        case MID_RESPONSE_RECEIVED:
>                spin_unlock(&GlobalMid_Lock);
>                return rc;
> -       case MID_REQUEST_SUBMITTED:
> -               /* socket is going down, reject all calls */
> -               if (server->tcpStatus == CifsExiting) {
> -                       cERROR(1, "%s: canceling mid=%d cmd=0x%x state=%d",
> -                              __func__, mid->mid, mid->command, mid->midState);
> -                       rc = -EHOSTDOWN;
> -                       break;
> -               }
>        case MID_RETRY_NEEDED:
>                rc = -EAGAIN;
>                break;
>        case MID_RESPONSE_MALFORMED:
>                rc = -EIO;
>                break;
> +       case MID_SHUTDOWN:
> +               rc = -EHOSTDOWN;
> +               break;
>        default:
> +               list_del_init(&mid->qhead);
>                cERROR(1, "%s: invalid mid state mid=%d state=%d", __func__,
>                        mid->mid, mid->midState);
>                rc = -EIO;
> @@ -617,7 +610,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses,
>
>        cifs_small_buf_release(in_buf);
>
> -       rc = sync_mid_result(midQ, ses->server);
> +       rc = cifs_sync_mid_result(midQ, ses->server);
>        if (rc != 0) {
>                atomic_dec(&ses->server->inFlight);
>                wake_up(&ses->server->request_q);
> @@ -738,7 +731,7 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses,
>                spin_unlock(&GlobalMid_Lock);
>        }
>
> -       rc = sync_mid_result(midQ, ses->server);
> +       rc = cifs_sync_mid_result(midQ, ses->server);
>        if (rc != 0) {
>                atomic_dec(&ses->server->inFlight);
>                wake_up(&ses->server->request_q);
> @@ -913,7 +906,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon,
>                rstart = 1;
>        }
>
> -       rc = sync_mid_result(midQ, ses->server);
> +       rc = cifs_sync_mid_result(midQ, ses->server);
>        if (rc != 0)
>                return rc;
>
> --
> 1.7.4.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock
       [not found]         ` <BANLkTimzEy+GxOjiC3yhjUD2ynoiRDRkag-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-05-20 18:57           ` Jeff Layton
  2011-05-22 11:09           ` [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock (try #5) Jeff Layton
  1 sibling, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2011-05-20 18:57 UTC (permalink / raw)
  To: Shirish Pargaonkar
  Cc: smfrench-Re5JQEeQqe8AvxtiuMwx3w, linux-cifs-u79uwXL29TY76Z2rM5mHXA

On Fri, 20 May 2011 13:32:10 -0500
Shirish Pargaonkar <shirishpargaonkar-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> On Thu, May 19, 2011 at 3:22 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > Holding the spinlock while we call this function means that it can't
> > sleep, which really limits what it can do. Taking it out from under
> > the spinlock also means less contention for this global lock.
> >
> > Change the semantics such that the Global_MidLock is not held when
> > the callback is called. To do this requires that we take extra care
> > not to have sync_mid_result remove the mid from the list when the
> > mid is in a state where that has already happened. This prevents
> > list corruption when the mid is sitting on a private list for
> > reconnect or when cifsd is coming down.
> >
> > Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
> > Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> >  fs/cifs/cifsglob.h  |    6 +++---
> >  fs/cifs/connect.c   |   30 ++++++++++++++++++++++++------
> >  fs/cifs/transport.c |   23 ++++++++---------------
> >  3 files changed, 35 insertions(+), 24 deletions(-)
> >

[...]

> >
> >        /* mark submitted MIDs for retry and issue callback */
> > -       cFYI(1, "%s: issuing mid callbacks", __func__);
> > +       INIT_LIST_HEAD(&retry_list);
> > +       cFYI(1, "%s: moving mids to private list", __func__);
> >        spin_lock(&GlobalMid_Lock);
> >        list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
> >                mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> >                if (mid_entry->midState == MID_REQUEST_SUBMITTED)
> >                        mid_entry->midState = MID_RETRY_NEEDED;
> > +               list_move(&mid_entry->qhead, &retry_list);
> > +       }
> > +       spin_unlock(&GlobalMid_Lock);
> > +
> > +       cFYI(1, "%s: moving mids to private list", __func__);
> 
> Does this comment repeat here?
> 

Oops, yes. I'll plan to fix it. If there are no other problems with the
patch, it might be easiest to commit this as-is and I'll do another
patch to fix it. Otherwise, I can respin if Steve prefers.

> > +       list_for_each_safe(tmp, tmp2, &retry_list) {
> > +               mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
> >                list_del_init(&mid_entry->qhead);
> 
> And if we called list_move on this entry earlier, does this entry exist
> on the qhead list anymore i.e. did not list_move that care of list_del part?
> 

list_move moves it from the pending_mid_q to the retry_list.
list_del_init removes it from the list and then makes the list_head
point to itself (i.e. turns it into an empty list).

We have to do the initial list_move with the spinlock held since another
thread could be modifying the pending_mid_q. After they all get moved
to the retry list, we can walk that list without holding the spinlock
since it's private.

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock (try #5)
       [not found]         ` <BANLkTimzEy+GxOjiC3yhjUD2ynoiRDRkag-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2011-05-20 18:57           ` Jeff Layton
@ 2011-05-22 11:09           ` Jeff Layton
  1 sibling, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2011-05-22 11:09 UTC (permalink / raw)
  To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Minor revision to the last version of this patch -- the only difference
is the fix to the cFYI statement in cifs_reconnect.

Holding the spinlock while we call this function means that it can't
sleep, which really limits what it can do. Taking it out from under
the spinlock also means less contention for this global lock.

Change the semantics such that the Global_MidLock is not held when
the callback is called. To do this requires that we take extra care
not to have sync_mid_result remove the mid from the list when the
mid is in a state where that has already happened. This prevents
list corruption when the mid is sitting on a private list for
reconnect or when cifsd is coming down.

Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/cifsglob.h  |    6 +++---
 fs/cifs/connect.c   |   30 ++++++++++++++++++++++++------
 fs/cifs/transport.c |   23 ++++++++---------------
 3 files changed, 35 insertions(+), 24 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 76b4517..fd877c1 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -543,9 +543,8 @@ struct mid_q_entry;
  * This is the prototype for the mid callback function. When creating one,
  * take special care to avoid deadlocks. Things to bear in mind:
  *
- * - it will be called by cifsd
- * - the GlobalMid_Lock will be held
- * - the mid will be removed from the pending_mid_q list
+ * - it will be called by cifsd, with no locks held
+ * - the mid will be removed from any lists
  */
 typedef void (mid_callback_t)(struct mid_q_entry *mid);
 
@@ -656,6 +655,7 @@ static inline void free_dfs_info_array(struct dfs_info3_param *param,
 #define   MID_RESPONSE_RECEIVED 4
 #define   MID_RETRY_NEEDED      8 /* session closed while this request out */
 #define   MID_RESPONSE_MALFORMED 0x10
+#define   MID_SHUTDOWN		 0x20
 
 /* Types of response buffer returned from SendReceive2 */
 #define   CIFS_NO_BUFFER        0    /* Response buffer not returned */
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index da284e3..d4094de 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -138,6 +138,7 @@ cifs_reconnect(struct TCP_Server_Info *server)
 	struct cifsSesInfo *ses;
 	struct cifsTconInfo *tcon;
 	struct mid_q_entry *mid_entry;
+	struct list_head retry_list;
 
 	spin_lock(&GlobalMid_Lock);
 	if (server->tcpStatus == CifsExiting) {
@@ -189,16 +190,23 @@ cifs_reconnect(struct TCP_Server_Info *server)
 	mutex_unlock(&server->srv_mutex);
 
 	/* mark submitted MIDs for retry and issue callback */
-	cFYI(1, "%s: issuing mid callbacks", __func__);
+	INIT_LIST_HEAD(&retry_list);
+	cFYI(1, "%s: moving mids to private list", __func__);
 	spin_lock(&GlobalMid_Lock);
 	list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
 		mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
 		if (mid_entry->midState == MID_REQUEST_SUBMITTED)
 			mid_entry->midState = MID_RETRY_NEEDED;
+		list_move(&mid_entry->qhead, &retry_list);
+	}
+	spin_unlock(&GlobalMid_Lock);
+
+	cFYI(1, "%s: issuing mid callbacks", __func__);
+	list_for_each_safe(tmp, tmp2, &retry_list) {
+		mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
 		list_del_init(&mid_entry->qhead);
 		mid_entry->callback(mid_entry);
 	}
-	spin_unlock(&GlobalMid_Lock);
 
 	while (server->tcpStatus == CifsNeedReconnect) {
 		try_to_freeze();
@@ -672,12 +680,12 @@ multi_t2_fnd:
 			mid_entry->when_received = jiffies;
 #endif
 			list_del_init(&mid_entry->qhead);
-			mid_entry->callback(mid_entry);
 			break;
 		}
 		spin_unlock(&GlobalMid_Lock);
 
 		if (mid_entry != NULL) {
+			mid_entry->callback(mid_entry);
 			/* Was previous buf put in mpx struct for multi-rsp? */
 			if (!isMultiRsp) {
 				/* smb buffer will be freed by user thread */
@@ -741,15 +749,25 @@ multi_t2_fnd:
 		cifs_small_buf_release(smallbuf);
 
 	if (!list_empty(&server->pending_mid_q)) {
+		struct list_head dispose_list;
+
+		INIT_LIST_HEAD(&dispose_list);
 		spin_lock(&GlobalMid_Lock);
 		list_for_each_safe(tmp, tmp2, &server->pending_mid_q) {
 			mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
-			cFYI(1, "Clearing Mid 0x%x - issuing callback",
-					 mid_entry->mid);
+			cFYI(1, "Clearing mid 0x%x", mid_entry->mid);
+			mid_entry->midState = MID_SHUTDOWN;
+			list_move(&mid_entry->qhead, &dispose_list);
+		}
+		spin_unlock(&GlobalMid_Lock);
+
+		/* now walk dispose list and issue callbacks */
+		list_for_each_safe(tmp, tmp2, &dispose_list) {
+			mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
+			cFYI(1, "Callback mid 0x%x", mid_entry->mid);
 			list_del_init(&mid_entry->qhead);
 			mid_entry->callback(mid_entry);
 		}
-		spin_unlock(&GlobalMid_Lock);
 		/* 1/8th of sec is more than enough time for them to exit */
 		msleep(125);
 	}
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 67f59c7..b3c3c6d 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -425,7 +425,7 @@ SendReceiveNoRsp(const unsigned int xid, struct cifsSesInfo *ses,
 }
 
 static int
-sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
+cifs_sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
 {
 	int rc = 0;
 
@@ -433,28 +433,21 @@ sync_mid_result(struct mid_q_entry *mid, struct TCP_Server_Info *server)
 		mid->mid, mid->midState);
 
 	spin_lock(&GlobalMid_Lock);
-	/* ensure that it's no longer on the pending_mid_q */
-	list_del_init(&mid->qhead);
-
 	switch (mid->midState) {
 	case MID_RESPONSE_RECEIVED:
 		spin_unlock(&GlobalMid_Lock);
 		return rc;
-	case MID_REQUEST_SUBMITTED:
-		/* socket is going down, reject all calls */
-		if (server->tcpStatus == CifsExiting) {
-			cERROR(1, "%s: canceling mid=%d cmd=0x%x state=%d",
-			       __func__, mid->mid, mid->command, mid->midState);
-			rc = -EHOSTDOWN;
-			break;
-		}
 	case MID_RETRY_NEEDED:
 		rc = -EAGAIN;
 		break;
 	case MID_RESPONSE_MALFORMED:
 		rc = -EIO;
 		break;
+	case MID_SHUTDOWN:
+		rc = -EHOSTDOWN;
+		break;
 	default:
+		list_del_init(&mid->qhead);
 		cERROR(1, "%s: invalid mid state mid=%d state=%d", __func__,
 			mid->mid, mid->midState);
 		rc = -EIO;
@@ -617,7 +610,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses,
 
 	cifs_small_buf_release(in_buf);
 
-	rc = sync_mid_result(midQ, ses->server);
+	rc = cifs_sync_mid_result(midQ, ses->server);
 	if (rc != 0) {
 		atomic_dec(&ses->server->inFlight);
 		wake_up(&ses->server->request_q);
@@ -738,7 +731,7 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses,
 		spin_unlock(&GlobalMid_Lock);
 	}
 
-	rc = sync_mid_result(midQ, ses->server);
+	rc = cifs_sync_mid_result(midQ, ses->server);
 	if (rc != 0) {
 		atomic_dec(&ses->server->inFlight);
 		wake_up(&ses->server->request_q);
@@ -913,7 +906,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon,
 		rstart = 1;
 	}
 
-	rc = sync_mid_result(midQ, ses->server);
+	rc = cifs_sync_mid_result(midQ, ses->server);
 	if (rc != 0)
 		return rc;
 
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 7/7] cifs: clean up wsize negotiation and allow for larger wsize
       [not found]     ` <1305836578-26333-8-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2011-05-25 10:46       ` Pavel Shilovsky
  0 siblings, 0 replies; 14+ messages in thread
From: Pavel Shilovsky @ 2011-05-25 10:46 UTC (permalink / raw)
  To: Jeff Layton
  Cc: smfrench-Re5JQEeQqe8AvxtiuMwx3w, linux-cifs-u79uwXL29TY76Z2rM5mHXA

2011/5/20 Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>:
> Now that we can handle larger wsizes in writepages, fix up the
> negotiation of the wsize to allow for that. find_get_pages only seems to
> give out a max of 256 pages at a time, so that gives us a reasonable
> default of 1M for the wsize.
>
> If the server however does not support large writes via POSIX
> extensions, then we cap the wsize to (128k - PAGE_CACHE_SIZE). That
> gives us a size that goes up to the max frame size specified in RFC1001.
>
> Finally, if CAP_LARGE_WRITE_AND_X isn't set, then further cap it to the
> largest size allowed by the protocol (USHRT_MAX).
>
> Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/cifs/connect.c |   69 +++++++++++++++++++++++++++++++++++++---------------
>  1 files changed, 49 insertions(+), 20 deletions(-)
>
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index ccb3ff8..490fb68 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -2633,23 +2633,6 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info,
>        else /* default */
>                cifs_sb->rsize = CIFSMaxBufSize;
>
> -       if (pvolume_info->wsize > PAGEVEC_SIZE * PAGE_CACHE_SIZE) {
> -               cERROR(1, "wsize %d too large, using 4096 instead",
> -                         pvolume_info->wsize);
> -               cifs_sb->wsize = 4096;
> -       } else if (pvolume_info->wsize)
> -               cifs_sb->wsize = pvolume_info->wsize;
> -       else
> -               cifs_sb->wsize = min_t(const int,
> -                                       PAGEVEC_SIZE * PAGE_CACHE_SIZE,
> -                                       127*1024);
> -               /* old default of CIFSMaxBufSize was too small now
> -                  that SMB Write2 can send multiple pages in kvec.
> -                  RFC1001 does not describe what happens when frame
> -                  bigger than 128K is sent so use that as max in
> -                  conjunction with 52K kvec constraint on arch with 4K
> -                  page size  */
> -
>        if (cifs_sb->rsize < 2048) {
>                cifs_sb->rsize = 2048;
>                /* Windows ME may prefer this */
> @@ -2727,6 +2710,53 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info,
>                           "mount option supported");
>  }
>
> +/*
> + * When the server supports very large writes via POSIX extensions, we can
> + * allow up to 2^24 - PAGE_CACHE_SIZE.
> + *
> + * Note that this might make for "interesting" allocation problems during
> + * writeback however (as we have to allocate an array of pointers for the
> + * pages). A 16M write means ~32kb page array with PAGE_CACHE_SIZE == 4096.
> + */
> +#define CIFS_MAX_WSIZE ((1<<24) - PAGE_CACHE_SIZE)
> +
> +/*
> + * When the server doesn't allow large posix writes, default to a wsize of
> + * 128k - PAGE_CACHE_SIZE -- one page less than the largest frame size
> + * described in RFC1001. This allows space for the header without going over
> + * that by default.
> + */
> +#define CIFS_MAX_RFC1001_WSIZE (128 * 1024 - PAGE_CACHE_SIZE)
> +
> +/*
> + * The default wsize is 1M. find_get_pages seems to return a maximum of 256
> + * pages in a single call. With PAGE_CACHE_SIZE == 4k, this means we can fill
> + * a single wsize request with a single call.
> + */
> +#define CIFS_DEFAULT_WSIZE (1024 * 1024)
> +
> +static unsigned int
> +cifs_negotiate_wsize(struct cifsTconInfo *tcon, struct smb_vol *pvolume_info)
> +{
> +       __u64 unix_cap = le64_to_cpu(tcon->fsUnixInfo.Capability);
> +       struct TCP_Server_Info *server = tcon->ses->server;
> +       unsigned int wsize = pvolume_info->wsize ? pvolume_info->wsize :
> +                               CIFS_DEFAULT_WSIZE;
> +
> +       /* can server support 24-bit write sizes? (via UNIX extensions) */
> +       if (!tcon->unix_ext || !(unix_cap & CIFS_UNIX_LARGE_WRITE_CAP))
> +               wsize = min_t(unsigned int, wsize, CIFS_MAX_RFC1001_WSIZE);
> +
> +       /* no CAP_LARGE_WRITE_X? Limit it to 16 bits */
> +       if (!(server->capabilities & CAP_LARGE_WRITE_X))
> +               wsize = min_t(unsigned int, wsize, USHRT_MAX);
> +
> +       /* hard limit of CIFS_MAX_WSIZE */
> +       wsize = min_t(unsigned int, wsize, CIFS_MAX_WSIZE);
> +
> +       return wsize;
> +}
> +
>  static int
>  is_path_accessible(int xid, struct cifsTconInfo *tcon,
>                   struct cifs_sb_info *cifs_sb, const char *full_path)
> @@ -2988,13 +3018,12 @@ try_mount_again:
>                cifs_sb->rsize = 1024 * 127;
>                cFYI(DBG2, "no very large read support, rsize now 127K");
>        }
> -       if (!(tcon->ses->capabilities & CAP_LARGE_WRITE_X))
> -               cifs_sb->wsize = min(cifs_sb->wsize,
> -                              (tcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE));
>        if (!(tcon->ses->capabilities & CAP_LARGE_READ_X))
>                cifs_sb->rsize = min(cifs_sb->rsize,
>                               (tcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE));
>
> +       cifs_sb->wsize = cifs_negotiate_wsize(tcon, volume_info);
> +
>  remote_path_check:
>  #ifdef CONFIG_CIFS_DFS_UPCALL
>        /*
> --
> 1.7.4.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Reviewed-and-Tested-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>

-- 
Best regards,
Pavel Shilovsky.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Fwd: [PATCH 6/7] cifs: convert cifs_writepages to use async writes
       [not found]       ` <BANLkTik=mmwW1Hssfi4K4mzfhfKZj-9rEg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-05-25 10:47         ` Pavel Shilovsky
  0 siblings, 0 replies; 14+ messages in thread
From: Pavel Shilovsky @ 2011-05-25 10:47 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Sorry - missed linux-cifs in Cc.


---------- Forwarded message ----------
From: Pavel Shilovsky <piastryyy-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: 2011/5/25
Subject: Re: [PATCH 6/7] cifs: convert cifs_writepages to use async writes
To: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>


2011/5/20 Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>:
> Have cifs_writepages issue asynchronous writes instead of waiting on
> each write call to complete before issuing another. This also allows us
> to return more quickly from writepages. It can just send out all of the
> I/Os and not wait around for the replies.
>
> In the WB_SYNC_ALL case, if the write completes with a retryable error,
> then the completion workqueue job will resend the write.
>
> This also changes the page locking semantics a little bit. Instead of
> holding the page lock until the response is received, release it after
> doing the send. This will reduce contention for the page lock and should
> prevent processes that have the file mmap'ed from being blocked
> unnecessarily.
>
> Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/cifs/file.c |  241 +++++++++++++++++++++++---------------------------------
>  1 files changed, 99 insertions(+), 142 deletions(-)
>
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 0aeaaf7..b81bbd8 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -1092,58 +1092,20 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
>  static int cifs_writepages(struct address_space *mapping,
>                           struct writeback_control *wbc)
>  {
> -       unsigned int bytes_to_write;
> -       unsigned int bytes_written;
> -       struct cifs_sb_info *cifs_sb;
> -       int done = 0;
> -       pgoff_t end;
> -       pgoff_t index;
> -       int range_whole = 0;
> -       struct kvec *iov;
> -       int len;
> -       int n_iov = 0;
> -       pgoff_t next;
> -       int nr_pages;
> -       __u64 offset = 0;
> -       struct cifsFileInfo *open_file;
> -       struct cifsTconInfo *tcon;
> -       struct cifsInodeInfo *cifsi = CIFS_I(mapping->host);
> +       struct cifs_sb_info *cifs_sb = CIFS_SB(mapping->host->i_sb);
> +       bool done = false, scanned = false, range_whole = false;
> +       pgoff_t end, index;
> +       struct cifs_writedata *wdata;
>        struct page *page;
> -       struct pagevec pvec;
>        int rc = 0;
> -       int scanned = 0;
> -       int xid;
> -
> -       cifs_sb = CIFS_SB(mapping->host->i_sb);
>
>        /*
> -        * If wsize is smaller that the page cache size, default to writing
> +        * If wsize is smaller than the page cache size, default to writing
>         * one page at a time via cifs_writepage
>         */
>        if (cifs_sb->wsize < PAGE_CACHE_SIZE)
>                return generic_writepages(mapping, wbc);
>
> -       iov = kmalloc(32 * sizeof(struct kvec), GFP_KERNEL);
> -       if (iov == NULL)
> -               return generic_writepages(mapping, wbc);
> -
> -       /*
> -        * if there's no open file, then this is likely to fail too,
> -        * but it'll at least handle the return. Maybe it should be
> -        * a BUG() instead?
> -        */
> -       open_file = find_writable_file(CIFS_I(mapping->host), false);
> -       if (!open_file) {
> -               kfree(iov);
> -               return generic_writepages(mapping, wbc);
> -       }
> -
> -       tcon = tlink_tcon(open_file->tlink);
> -       cifsFileInfo_put(open_file);
> -
> -       xid = GetXid();
> -
> -       pagevec_init(&pvec, 0);
>        if (wbc->range_cyclic) {
>                index = mapping->writeback_index; /* Start from prev offset */
>                end = -1;
> @@ -1151,24 +1113,49 @@ static int cifs_writepages(struct address_space *mapping,
>                index = wbc->range_start >> PAGE_CACHE_SHIFT;
>                end = wbc->range_end >> PAGE_CACHE_SHIFT;
>                if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
> -                       range_whole = 1;
> -               scanned = 1;
> +                       range_whole = true;
> +               scanned = true;
>        }
>  retry:
> -       while (!done && (index <= end) &&
> -              (nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
> -                       PAGECACHE_TAG_DIRTY,
> -                       min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1))) {
> -               int first;
> -               unsigned int i;
> -
> -               first = -1;
> -               next = 0;
> -               n_iov = 0;
> -               bytes_to_write = 0;
> -
> -               for (i = 0; i < nr_pages; i++) {
> -                       page = pvec.pages[i];
> +       while (!done && index <= end) {
> +               unsigned int i, nr_pages, found_pages;
> +               pgoff_t next = 0, tofind;
> +               struct page **pages;
> +
> +               tofind = min((cifs_sb->wsize / PAGE_CACHE_SIZE) - 1,
> +                               end - index) + 1;
> +
> +               wdata = cifs_writedata_alloc((unsigned int)tofind);
> +               if (!wdata) {
> +                       rc = -ENOMEM;
> +                       break;
> +               }
> +
> +               /*
> +                * find_get_pages_tag seems to return a max of 256 on each
> +                * iteration, so we must call it several times in order to
> +                * fill the array or the wsize is effectively limited to
> +                * 256 * PAGE_CACHE_SIZE.
> +                */
> +               found_pages = 0;
> +               pages = wdata->pages;
> +               do {
> +                       nr_pages = find_get_pages_tag(mapping, &index,
> +                                                       PAGECACHE_TAG_DIRTY,
> +                                                       tofind, pages);
> +                       found_pages += nr_pages;
> +                       tofind -= nr_pages;
> +                       pages += nr_pages;
> +               } while (nr_pages && tofind && index <= end);
> +
> +               if (found_pages == 0) {
> +                       kref_put(&wdata->refcount, cifs_writedata_release);
> +                       break;
> +               }
> +
> +               nr_pages = 0;
> +               for (i = 0; i < found_pages; i++) {
> +                       page = wdata->pages[i];
>                        /*
>                         * At this point we hold neither mapping->tree_lock nor
>                         * lock on the page itself: the page may be truncated or
> @@ -1177,7 +1164,7 @@ retry:
>                         * mapping
>                         */
>
> -                       if (first < 0)
> +                       if (nr_pages == 0)
>                                lock_page(page);
>                        else if (!trylock_page(page))
>                                break;
> @@ -1188,7 +1175,7 @@ retry:
>                        }
>
>                        if (!wbc->range_cyclic && page->index > end) {
> -                               done = 1;
> +                               done = true;
>                                unlock_page(page);
>                                break;
>                        }
> @@ -1215,119 +1202,89 @@ retry:
>                        set_page_writeback(page);
>
>                        if (page_offset(page) >= mapping->host->i_size) {
> -                               done = 1;
> +                               done = true;
>                                unlock_page(page);
>                                end_page_writeback(page);
>                                break;
>                        }
>
> -                       /*
> -                        * BB can we get rid of this?  pages are held by pvec
> -                        */
> -                       page_cache_get(page);
> +                       wdata->pages[i] = page;
> +                       next = page->index + 1;
> +                       ++nr_pages;
> +               }
>
> -                       len = min(mapping->host->i_size - page_offset(page),
> -                                 (loff_t)PAGE_CACHE_SIZE);
> +               /* reset index to refind any pages skipped */
> +               if (nr_pages == 0)
> +                       index = wdata->pages[0]->index + 1;
>
> -                       /* reserve iov[0] for the smb header */
> -                       n_iov++;
> -                       iov[n_iov].iov_base = kmap(page);
> -                       iov[n_iov].iov_len = len;
> -                       bytes_to_write += len;
> +               /* put any pages we aren't going to use */
> +               for (i = nr_pages; i < found_pages; i++) {
> +                       page_cache_release(wdata->pages[i]);
> +                       wdata->pages[i] = NULL;
> +               }
>
> -                       if (first < 0) {
> -                               first = i;
> -                               offset = page_offset(page);
> -                       }
> -                       next = page->index + 1;
> -                       if (bytes_to_write + PAGE_CACHE_SIZE > cifs_sb->wsize)
> -                               break;
> +               /* nothing to write? */
> +               if (nr_pages == 0) {
> +                       kref_put(&wdata->refcount, cifs_writedata_release);
> +                       continue;
>                }
> -               if (n_iov) {
> -retry_write:
> -                       open_file = find_writable_file(CIFS_I(mapping->host),
> -                                                       false);
> -                       if (!open_file) {
> -                               cERROR(1, "No writable handles for inode");
> -                               rc = -EBADF;
> -                       } else {
> -                               rc = CIFSSMBWrite2(xid, tcon, open_file->netfid,
> -                                                  bytes_to_write, offset,
> -                                                  &bytes_written, iov, n_iov,
> -                                                  0);
> -                               cifsFileInfo_put(open_file);
> -                       }
>
> -                       cFYI(1, "Write2 rc=%d, wrote=%u", rc, bytes_written);
> +               wdata->sync_mode = wbc->sync_mode;
> +               wdata->nr_pages = nr_pages;
> +               wdata->offset = page_offset(wdata->pages[0]);
>
> -                       /*
> -                        * For now, treat a short write as if nothing got
> -                        * written. A zero length write however indicates
> -                        * ENOSPC or EFBIG. We have no way to know which
> -                        * though, so call it ENOSPC for now. EFBIG would
> -                        * get translated to AS_EIO anyway.
> -                        *
> -                        * FIXME: make it take into account the data that did
> -                        *        get written
> -                        */
> -                       if (rc == 0) {
> -                               if (bytes_written == 0)
> -                                       rc = -ENOSPC;
> -                               else if (bytes_written < bytes_to_write)
> -                                       rc = -EAGAIN;
> +               do {
> +                       if (wdata->cfile != NULL)
> +                               cifsFileInfo_put(wdata->cfile);
> +                       wdata->cfile = find_writable_file(CIFS_I(mapping->host),
> +                                                         false);
> +                       if (!wdata->cfile) {
> +                               cERROR(1, "No writable handles for inode");
> +                               rc = -EBADF;
> +                               break;
>                        }
> +                       rc = cifs_async_writev(wdata);
> +               } while (wbc->sync_mode == WB_SYNC_ALL && rc == -EAGAIN);
>
> -                       /* retry on data-integrity flush */
> -                       if (wbc->sync_mode == WB_SYNC_ALL && rc == -EAGAIN)
> -                               goto retry_write;
> +               for (i = 0; i < nr_pages; ++i)
> +                       unlock_page(wdata->pages[i]);
>
> -                       /* fix the stats and EOF */
> -                       if (bytes_written > 0) {
> -                               cifs_stats_bytes_written(tcon, bytes_written);
> -                               cifs_update_eof(cifsi, offset, bytes_written);
> -                       }
> -
> -                       for (i = 0; i < n_iov; i++) {
> -                               page = pvec.pages[first + i];
> -                               /* on retryable write error, redirty page */
> +               /* send failure -- clean up the mess */
> +               if (rc != 0) {
> +                       for (i = 0; i < nr_pages; ++i) {
>                                if (rc == -EAGAIN)
> -                                       redirty_page_for_writepage(wbc, page);
> -                               else if (rc != 0)
> -                                       SetPageError(page);
> -                               kunmap(page);
> -                               unlock_page(page);
> -                               end_page_writeback(page);
> -                               page_cache_release(page);
> +                                       redirty_page_for_writepage(wbc,
> +                                                          wdata->pages[i]);
> +                               else
> +                                       SetPageError(wdata->pages[i]);
> +                               end_page_writeback(wdata->pages[i]);
> +                               page_cache_release(wdata->pages[i]);
>                        }
> -
>                        if (rc != -EAGAIN)
>                                mapping_set_error(mapping, rc);
> -                       else
> -                               rc = 0;
> +               }
> +               kref_put(&wdata->refcount, cifs_writedata_release);
>
> -                       if ((wbc->nr_to_write -= n_iov) <= 0)
> -                               done = 1;
> -                       index = next;
> -               } else
> -                       /* Need to re-find the pages we skipped */
> -                       index = pvec.pages[0]->index + 1;
> +               wbc->nr_to_write -= nr_pages;
> +               if (wbc->nr_to_write <= 0)
> +                       done = true;
>
> -               pagevec_release(&pvec);
> +               index = next;
>        }
> +
>        if (!scanned && !done) {
>                /*
>                 * We hit the last page and there is more work to be done: wrap
>                 * back to the start of the file
>                 */
> -               scanned = 1;
> +               scanned = true;
>                index = 0;
>                goto retry;
>        }
> +
>        if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
>                mapping->writeback_index = index;
>
> -       FreeXid(xid);
> -       kfree(iov);
>        return rc;
>  }
>
> --
> 1.7.4.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Looks good. I tested it (writing ~gigabyte file) - It's ~15% faster
that sequential variant.

Reviewed-and-Tested-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>

--
Best regards,
Pavel Shilovsky.



-- 
Best regards,
Pavel Shilovsky.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/7] cifs: make cifs_send_async take a kvec array
       [not found] ` <1302694994-8303-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2011-04-13 11:43   ` Jeff Layton
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2011-04-13 11:43 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

We'll need this for async writes, so convert the call to take a kvec
array. CIFSSMBEcho is changed to put a kvec on the stack and pass
in the SMB buffer using that.

Reviewed-by: Pavel Shilovsky <piastry-7qunaywFIewox3rIn2DAYQ@public.gmane.org>
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/cifsproto.h |    4 ++--
 fs/cifs/cifssmb.c   |    6 ++++--
 fs/cifs/transport.c |   13 +++++++------
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 4af7db8..e255a2b 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -65,8 +65,8 @@ extern struct mid_q_entry *AllocMidQEntry(const struct smb_hdr *smb_buffer,
 					struct TCP_Server_Info *server);
 extern void DeleteMidQEntry(struct mid_q_entry *midEntry);
 extern int wait_for_free_request(struct TCP_Server_Info *sv, const int long_op);
-extern int cifs_call_async(struct TCP_Server_Info *server,
-			   struct smb_hdr *in_buf, mid_callback_t *callback,
+extern int cifs_call_async(struct TCP_Server_Info *server, struct kvec *iov,
+			   unsigned int nvec, mid_callback_t *callback,
 			   void *cbdata);
 extern int SendReceive(const unsigned int /* xid */ , struct cifs_ses *,
 			struct smb_hdr * /* input */ ,
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index ca58aea..79e4881 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -725,6 +725,7 @@ CIFSSMBEcho(struct TCP_Server_Info *server)
 {
 	ECHO_REQ *smb;
 	int rc = 0;
+	struct kvec iov;
 
 	cFYI(1, "In echo request");
 
@@ -739,9 +740,10 @@ CIFSSMBEcho(struct TCP_Server_Info *server)
 	put_bcc(1, &smb->hdr);
 	smb->Data[0] = 'a';
 	inc_rfc1001_len(smb, 3);
+	iov.iov_base = smb;
+	iov.iov_len = be32_to_cpu(smb->hdr.smb_buf_length) + 4;
 
-	rc = cifs_call_async(server, (struct smb_hdr *)smb,
-				cifs_echo_callback, server);
+	rc = cifs_call_async(server, &iov, 1, cifs_echo_callback, server);
 	if (rc)
 		cFYI(1, "Echo request failed: %d", rc);
 
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index d9833fc..7ccf292 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -349,11 +349,12 @@ wait_for_response(struct TCP_Server_Info *server, struct mid_q_entry *midQ)
  * the result. Caller is responsible for dealing with timeouts.
  */
 int
-cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
-		mid_callback_t *callback, void *cbdata)
+cifs_call_async(struct TCP_Server_Info *server, struct kvec *iov,
+		unsigned int nvec, mid_callback_t *callback, void *cbdata)
 {
 	int rc;
 	struct mid_q_entry *mid;
+	struct smb_hdr *hdr = (struct smb_hdr *)iov[0].iov_base;
 
 	rc = wait_for_free_request(server, CIFS_ASYNC_OP);
 	if (rc)
@@ -361,10 +362,10 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
 
 	/* enable signing if server requires it */
 	if (server->sec_mode & (SECMODE_SIGN_REQUIRED | SECMODE_SIGN_ENABLED))
-		in_buf->Flags2 |= SMBFLG2_SECURITY_SIGNATURE;
+		hdr->Flags2 |= SMBFLG2_SECURITY_SIGNATURE;
 
 	mutex_lock(&server->srv_mutex);
-	mid = AllocMidQEntry(in_buf, server);
+	mid = AllocMidQEntry(hdr, server);
 	if (mid == NULL) {
 		mutex_unlock(&server->srv_mutex);
 		return -ENOMEM;
@@ -375,7 +376,7 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
 	list_add_tail(&mid->qhead, &server->pending_mid_q);
 	spin_unlock(&GlobalMid_Lock);
 
-	rc = cifs_sign_smb(in_buf, server, &mid->sequence_number);
+	rc = cifs_sign_smb2(iov, nvec, server, &mid->sequence_number);
 	if (rc) {
 		mutex_unlock(&server->srv_mutex);
 		goto out_err;
@@ -385,7 +386,7 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_hdr *in_buf,
 	mid->callback_data = cbdata;
 	mid->midState = MID_REQUEST_SUBMITTED;
 	cifs_in_send_inc(server);
-	rc = smb_send(server, in_buf, be32_to_cpu(in_buf->smb_buf_length));
+	rc = smb_sendv(server, iov, nvec);
 	cifs_in_send_dec(server);
 	cifs_save_when_sent(mid);
 	mutex_unlock(&server->srv_mutex);
-- 
1.7.4.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2011-05-25 10:47 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-19 20:22 [PATCH 0/7] cifs: asynchronous writepages support (try #4) Jeff Layton
     [not found] ` <1305836578-26333-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2011-05-19 20:22   ` [PATCH 1/7] cifs: consolidate SendReceive response checks Jeff Layton
2011-05-19 20:22   ` [PATCH 2/7] cifs: make cifs_send_async take a kvec array Jeff Layton
2011-05-19 20:22   ` [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock Jeff Layton
     [not found]     ` <1305836578-26333-4-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2011-05-20 18:32       ` Shirish Pargaonkar
     [not found]         ` <BANLkTimzEy+GxOjiC3yhjUD2ynoiRDRkag-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-05-20 18:57           ` Jeff Layton
2011-05-22 11:09           ` [PATCH 3/7] cifs: don't call mid_q_entry->callback under the Global_MidLock (try #5) Jeff Layton
2011-05-19 20:22   ` [PATCH 4/7] cifs: add ignore_pend flag to cifs_call_async Jeff Layton
2011-05-19 20:22   ` [PATCH 5/7] cifs: add cifs_async_writev Jeff Layton
2011-05-19 20:22   ` [PATCH 6/7] cifs: convert cifs_writepages to use async writes Jeff Layton
     [not found]     ` <BANLkTik=mmwW1Hssfi4K4mzfhfKZj-9rEg@mail.gmail.com>
     [not found]       ` <BANLkTik=mmwW1Hssfi4K4mzfhfKZj-9rEg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-05-25 10:47         ` Fwd: " Pavel Shilovsky
2011-05-19 20:22   ` [PATCH 7/7] cifs: clean up wsize negotiation and allow for larger wsize Jeff Layton
     [not found]     ` <1305836578-26333-8-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2011-05-25 10:46       ` Pavel Shilovsky
  -- strict thread matches above, loose matches on Subject: below --
2011-04-13 11:43 [PATCH 0/7] cifs: asynchronous writepages support (try #2) Jeff Layton
     [not found] ` <1302694994-8303-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2011-04-13 11:43   ` [PATCH 2/7] cifs: make cifs_send_async take a kvec array Jeff Layton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.