All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/4] move layout get/return synchronization to rpc prepare
@ 2010-09-14 10:52 Benny Halevy
  2010-09-14 10:53 ` [PATCH 1/4] SQUASHME: pnfs-post-submit: remove take_ref and only_valid params from pnfs_has_layout Benny Halevy
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Benny Halevy @ 2010-09-14 10:52 UTC (permalink / raw)
  To: NFS list

The following patchset introduces a per-inode rpc wait queue
to be used to synchronize layout gets and returns during
their respective rpc prepare phase.

Essentially, layout gets wait on layout segments marked as invalid,
currently meaning there is a layout return in progress.
It should be ok to send the layout get anyway but waiting avoids
the race rather than having to possibly resolve it on the done path.

In the future layout gets could insert the layout segment they ask
for as "invalid" so that parallel layout gets would wait too.
However, since the server may return a different segment than
requested (bigger or smaller), there's no guarantee that the
parallel layoutget will overlap the one in progress.

Layout returns wait on layout segments while they are being used
(e.g. for I/O operations).

[PATCH 1/4] SQUASHME: pnfs-post-submit: remove take_ref and only_valid params from pnfs_has_layout
[PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs
[PATCH 3/4] pnfs: allow nfs4_proc_layoutreturn to sleep on barrier
[PATCH 4/4] SQUASHME: pnfs: get rid of lo_waitq

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] SQUASHME: pnfs-post-submit: remove take_ref and only_valid params from pnfs_has_layout
  2010-09-14 10:52 [RFC 0/4] move layout get/return synchronization to rpc prepare Benny Halevy
@ 2010-09-14 10:53 ` Benny Halevy
  2010-09-14 10:53 ` [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs Benny Halevy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Benny Halevy @ 2010-09-14 10:53 UTC (permalink / raw)
  To: linux-nfs

They are always evaluated to treu and false, respectively as the lseg ptr is
always needed.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
---
 fs/nfs/pnfs.c |   23 ++++++++---------------
 1 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index b576470..bc4f0c1 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1001,9 +1001,7 @@ has_matching_lseg(struct pnfs_layout_segment *lseg,
  */
 static struct pnfs_layout_segment *
 pnfs_has_layout(struct pnfs_layout_hdr *lo,
-		struct pnfs_layout_range *range,
-		bool take_ref,
-		bool only_valid)
+		struct pnfs_layout_range *range)
 {
 	struct pnfs_layout_segment *lseg, *ret = NULL;
 
@@ -1011,19 +1009,17 @@ pnfs_has_layout(struct pnfs_layout_hdr *lo,
 
 	BUG_ON_UNLOCKED_LO(lo);
 	list_for_each_entry (lseg, &lo->segs, fi_list) {
-		if (has_matching_lseg(lseg, range) &&
-		    (lseg->valid || !only_valid)) {
+		if (has_matching_lseg(lseg, range)) {
 			ret = lseg;
-			if (take_ref)
-				get_lseg(ret);
+			get_lseg(ret);
 			break;
 		}
 		if (cmp_layout(range, &lseg->range) > 0)
 			break;
 	}
 
-	dprintk("%s:Return lseg %p take_ref %d ref %d valid %d\n",
-		__func__, ret, take_ref,
+	dprintk("%s:Return lseg %p ref %d valid %d\n",
+		__func__, ret,
 		ret ? atomic_read(&ret->kref.refcount) : 0,
 		ret ? ret->valid : 0);
 	return ret;
@@ -1050,10 +1046,8 @@ _pnfs_update_layout(struct inode *ino,
 	struct nfs_inode *nfsi = NFS_I(ino);
 	struct pnfs_layout_hdr *lo;
 	struct pnfs_layout_segment *lseg = NULL;
-	bool take_ref = (lsegpp != NULL);
 
-	if (take_ref)
-		*lsegpp = NULL;
+	*lsegpp = NULL;
 	spin_lock(&ino->i_lock);
 	lo = pnfs_alloc_layout(ino);
 	if (lo == NULL) {
@@ -1062,10 +1056,9 @@ _pnfs_update_layout(struct inode *ino,
 	}
 
 	/* Check to see if the layout for the given range already exists */
-	lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref);
+	lseg = pnfs_has_layout(lo, &arg);
 	if (lseg && !lseg->valid) {
-		if (take_ref)
-			put_lseg_locked(lseg);
+		put_lseg_locked(lseg);
 		/* someone is cleaning the layout */
 		lseg = NULL;
 		goto out_unlock;
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs
  2010-09-14 10:52 [RFC 0/4] move layout get/return synchronization to rpc prepare Benny Halevy
  2010-09-14 10:53 ` [PATCH 1/4] SQUASHME: pnfs-post-submit: remove take_ref and only_valid params from pnfs_has_layout Benny Halevy
@ 2010-09-14 10:53 ` Benny Halevy
  2010-09-14 12:00   ` Fred Isaman
  2010-09-14 10:53 ` [PATCH 3/4] pnfs: allow nfs4_proc_layoutreturn to sleep on barrier Benny Halevy
  2010-09-14 10:53 ` [PATCH 4/4] SQUASHME: pnfs: get rid of lo_waitq Benny Halevy
  3 siblings, 1 reply; 9+ messages in thread
From: Benny Halevy @ 2010-09-14 10:53 UTC (permalink / raw)
  To: linux-nfs

Introduce lo_rpcwaitq and sleep on it while there's an lseg marked invalid,
i.e. it is being returned (in the future, it could also be layoutget in progress).

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
---
 fs/nfs/inode.c         |    1 +
 fs/nfs/nfs4proc.c      |   25 ++++++++++++++++++++++---
 fs/nfs/pnfs.c          |   37 ++++++++++++++++++-------------------
 fs/nfs/pnfs.h          |    3 +++
 include/linux/nfs_fs.h |    1 +
 5 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 7621cfc..059bdf7 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1462,6 +1462,7 @@ static inline void nfs4_init_once(struct nfs_inode *nfsi)
 	init_rwsem(&nfsi->rwsem);
 #ifdef CONFIG_NFS_V4_1
 	init_waitqueue_head(&nfsi->lo_waitq);
+	rpc_init_wait_queue(&nfsi->lo_rpcwaitq, "pNFS Layout");
 	nfsi->pnfs_layout_suspend = 0;
 	nfsi->layout = NULL;
 #endif /* CONFIG_NFS_V4_1 */
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 48a763e..f12568d 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5442,13 +5442,32 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
 {
 	struct nfs4_layoutget *lgp = calldata;
 	struct inode *ino = lgp->args.inode;
+	struct nfs_inode *nfsi = NFS_I(ino);
 	struct nfs_server *server = NFS_SERVER(ino);
+	struct pnfs_layout_segment *lseg;
 
 	dprintk("--> %s\n", __func__);
-	if (nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
-				&lgp->res.seq_res, 0, task))
+	spin_lock(&ino->i_lock);
+	lseg = pnfs_has_layout(nfsi->layout, &lgp->args.range);
+	if (likely(!lseg)) {
+		spin_unlock(&ino->i_lock);
+		dprintk("%s: no lseg found, proceeding\n", __func__);
+		if (!nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
+					 &lgp->res.seq_res, 0, task))
+			rpc_call_start(task);
 		return;
-	rpc_call_start(task);
+	}
+	if (!lseg->valid) {
+		put_lseg_locked(lseg);
+		spin_unlock(&ino->i_lock);
+		dprintk("%s: invalid lseg found, waiting\n", __func__);
+		rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
+		return;
+	}
+	*lgp->lsegpp = lseg;
+	spin_unlock(&ino->i_lock);
+	dprintk("%s: valid lseg found, no rpc required\n", __func__);
+	rpc_exit(task, NFS4_OK);
 }
 
 static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index bc4f0c1..2450383 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -430,7 +430,7 @@ destroy_lseg(struct kref *kref)
 	PNFS_LD_IO_OPS(lseg->layout)->free_lseg(lseg);
 }
 
-static void
+void
 put_lseg_locked(struct pnfs_layout_segment *lseg)
 {
 	bool do_wake_up;
@@ -444,8 +444,10 @@ put_lseg_locked(struct pnfs_layout_segment *lseg)
 	do_wake_up = !lseg->valid;
 	nfsi = PNFS_NFS_INODE(lseg->layout);
 	kref_put(&lseg->kref, destroy_lseg);
-	if (do_wake_up)
+	if (do_wake_up) {
 		wake_up(&nfsi->lo_waitq);
+		rpc_wake_up(&nfsi->lo_rpcwaitq);
+	}
 }
 
 void
@@ -999,7 +1001,7 @@ has_matching_lseg(struct pnfs_layout_segment *lseg,
 /*
  * lookup range in layout
  */
-static struct pnfs_layout_segment *
+struct pnfs_layout_segment *
 pnfs_has_layout(struct pnfs_layout_hdr *lo,
 		struct pnfs_layout_range *range)
 {
@@ -1057,22 +1059,20 @@ _pnfs_update_layout(struct inode *ino,
 
 	/* Check to see if the layout for the given range already exists */
 	lseg = pnfs_has_layout(lo, &arg);
-	if (lseg && !lseg->valid) {
-		put_lseg_locked(lseg);
-		/* someone is cleaning the layout */
-		lseg = NULL;
-		goto out_unlock;
-	}
-
 	if (lseg) {
-		dprintk("%s: Using cached lseg %p for %llu@%llu iomode %d)\n",
-			__func__,
-			lseg,
-			arg.length,
-			arg.offset,
-			arg.iomode);
+		if (lseg->valid) {
+			dprintk("%s: Using cached lseg %p for %llu@%llu "
+				"iomode %d)\n",
+				__func__,
+				lseg,
+				arg.length,
+				arg.offset,
+				arg.iomode);
 
-		goto out_unlock;
+			goto out_unlock;
+		}
+		/* someone is cleaning the layout */
+		put_lseg_locked(lseg);
 	}
 
 	/* if get layout already failed once goto out */
@@ -1097,8 +1097,7 @@ out:
 		nfsi->layout->state, lseg);
 	return;
 out_unlock:
-	if (lsegpp)
-		*lsegpp = lseg;
+	*lsegpp = lseg;
 	spin_unlock(&ino->i_lock);
 	goto out;
 }
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index ea70385..f7f567e 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -34,6 +34,9 @@ extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
 /* pnfs.c */
 extern const nfs4_stateid zero_stateid;
 
+void put_lseg_locked(struct pnfs_layout_segment *);
+struct pnfs_layout_segment *pnfs_has_layout(struct pnfs_layout_hdr *,
+					    struct pnfs_layout_range *);
 void _pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
 	loff_t pos, u64 count, enum pnfs_iomode access_type,
 	struct pnfs_layout_segment **lsegpp);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 5bafa95..196c0c6 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -213,6 +213,7 @@ struct nfs_inode {
 	/* pNFS layout information */
 #if defined(CONFIG_NFS_V4_1)
 	wait_queue_head_t lo_waitq;
+	struct rpc_wait_queue lo_rpcwaitq;
 	struct pnfs_layout_hdr *layout;
 	time_t pnfs_layout_suspend;
 #endif /* CONFIG_NFS_V4_1 */
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] pnfs: allow nfs4_proc_layoutreturn to sleep on barrier
  2010-09-14 10:52 [RFC 0/4] move layout get/return synchronization to rpc prepare Benny Halevy
  2010-09-14 10:53 ` [PATCH 1/4] SQUASHME: pnfs-post-submit: remove take_ref and only_valid params from pnfs_has_layout Benny Halevy
  2010-09-14 10:53 ` [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs Benny Halevy
@ 2010-09-14 10:53 ` Benny Halevy
  2010-09-14 10:53 ` [PATCH 4/4] SQUASHME: pnfs: get rid of lo_waitq Benny Halevy
  3 siblings, 0 replies; 9+ messages in thread
From: Benny Halevy @ 2010-09-14 10:53 UTC (permalink / raw)
  To: linux-nfs

use lo_rpcwaitq and sleep on it in the rpc prepare phase while
there's need to wait on the layoutreturn barrier.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
---
 fs/nfs/nfs4proc.c |    8 ++++++++
 fs/nfs/pnfs.c     |   12 +-----------
 fs/nfs/pnfs.h     |    1 +
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f12568d..b21a711 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5683,9 +5683,17 @@ nfs4_layoutreturn_prepare(struct rpc_task *task, void *calldata)
 {
 	struct nfs4_layoutreturn *lrp = calldata;
 	struct inode *ino = lrp->args.inode;
+	struct nfs_inode *nfsi = NFS_I(ino);
 	struct nfs_server *server = NFS_SERVER(ino);
 
 	dprintk("--> %s\n", __func__);
+	if ((lrp->args.return_type == RETURN_FILE) &&
+	    pnfs_return_layout_barrier(nfsi, &lrp->args.range)) {
+		dprintk("%s: waiting on barrier\n", __func__);
+		rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
+		return;
+	}
+
 	if (nfs4_setup_sequence(server, NULL, &lrp->args.seq_args,
 				&lrp->res.seq_res, 0, task))
 		return;
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 2450383..0e8bb6f 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -714,7 +714,7 @@ pnfs_free_layout(struct pnfs_layout_hdr *lo,
 	dprintk("%s:Return\n", __func__);
 }
 
-static bool
+bool
 pnfs_return_layout_barrier(struct nfs_inode *nfsi,
 			   struct pnfs_layout_range *range)
 {
@@ -804,16 +804,6 @@ _pnfs_return_layout(struct inode *ino, struct pnfs_layout_range *range,
 
 		spin_unlock(&ino->i_lock);
 
-		if (pnfs_return_layout_barrier(nfsi, &arg)) {
-			if (stateid) { /* callback */
-				status = -EAGAIN;
-				goto out_put;
-			}
-			dprintk("%s: waiting\n", __func__);
-			wait_event(nfsi->lo_waitq,
-				   !pnfs_return_layout_barrier(nfsi, &arg));
-		}
-
 		if (layoutcommit_needed(nfsi)) {
 			if (stateid && !wait) { /* callback */
 				dprintk("%s: layoutcommit pending\n", __func__);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index f7f567e..81534aa 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -41,6 +41,7 @@ void _pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
 	loff_t pos, u64 count, enum pnfs_iomode access_type,
 	struct pnfs_layout_segment **lsegpp);
 
+bool pnfs_return_layout_barrier(struct nfs_inode *, struct pnfs_layout_range *);
 int _pnfs_return_layout(struct inode *, struct pnfs_layout_range *,
 			const nfs4_stateid *stateid, /* optional */
 			enum pnfs_layoutreturn_type, bool wait);
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] SQUASHME: pnfs: get rid of lo_waitq
  2010-09-14 10:52 [RFC 0/4] move layout get/return synchronization to rpc prepare Benny Halevy
                   ` (2 preceding siblings ...)
  2010-09-14 10:53 ` [PATCH 3/4] pnfs: allow nfs4_proc_layoutreturn to sleep on barrier Benny Halevy
@ 2010-09-14 10:53 ` Benny Halevy
  3 siblings, 0 replies; 9+ messages in thread
From: Benny Halevy @ 2010-09-14 10:53 UTC (permalink / raw)
  To: linux-nfs

it has no more users at this point.

SPLITME: refactor put_lseg{,_locked}

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
---
 fs/nfs/inode.c         |    1 -
 fs/nfs/pnfs.c          |   32 ++++++++++++++------------------
 include/linux/nfs_fs.h |    1 -
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 059bdf7..8e5fbb5 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1461,7 +1461,6 @@ static inline void nfs4_init_once(struct nfs_inode *nfsi)
 	nfsi->delegation_state = 0;
 	init_rwsem(&nfsi->rwsem);
 #ifdef CONFIG_NFS_V4_1
-	init_waitqueue_head(&nfsi->lo_waitq);
 	rpc_init_wait_queue(&nfsi->lo_rpcwaitq, "pNFS Layout");
 	nfsi->pnfs_layout_suspend = 0;
 	nfsi->layout = NULL;
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 0e8bb6f..e6261a3 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -360,7 +360,6 @@ pnfs_layout_release(struct pnfs_layout_hdr *lo,
 	 */
 	put_layout_locked(lo);
 	spin_unlock(&nfsi->vfs_inode.i_lock);
-	wake_up_all(&nfsi->lo_waitq);
 }
 
 void
@@ -430,44 +429,41 @@ destroy_lseg(struct kref *kref)
 	PNFS_LD_IO_OPS(lseg->layout)->free_lseg(lseg);
 }
 
-void
-put_lseg_locked(struct pnfs_layout_segment *lseg)
+static void
+put_lseg_common(struct nfs_inode *nfsi, struct pnfs_layout_segment *lseg)
 {
 	bool do_wake_up;
-	struct nfs_inode *nfsi;
-
-	if (!lseg)
-		return;
 
 	dprintk("%s: lseg %p ref %d valid %d\n", __func__, lseg,
 		atomic_read(&lseg->kref.refcount), lseg->valid);
 	do_wake_up = !lseg->valid;
-	nfsi = PNFS_NFS_INODE(lseg->layout);
 	kref_put(&lseg->kref, destroy_lseg);
-	if (do_wake_up) {
-		wake_up(&nfsi->lo_waitq);
+	if (do_wake_up)
 		rpc_wake_up(&nfsi->lo_rpcwaitq);
-	}
+}
+
+void
+put_lseg_locked(struct pnfs_layout_segment *lseg)
+{
+	if (!lseg)
+		return;
+
+	BUG_ON_UNLOCKED_LO(lseg->layout);
+	put_lseg_common(PNFS_NFS_INODE(lseg->layout), lseg);
 }
 
 void
 put_lseg(struct pnfs_layout_segment *lseg)
 {
-	bool do_wake_up;
 	struct nfs_inode *nfsi;
 
 	if (!lseg)
 		return;
 
-	dprintk("%s: lseg %p ref %d valid %d\n", __func__, lseg,
-		atomic_read(&lseg->kref.refcount), lseg->valid);
-	do_wake_up = !lseg->valid;
 	nfsi = PNFS_NFS_INODE(lseg->layout);
 	spin_lock(&nfsi->vfs_inode.i_lock);
-	kref_put(&lseg->kref, destroy_lseg);
+	put_lseg_common(nfsi, lseg);
 	spin_unlock(&nfsi->vfs_inode.i_lock);
-	if (do_wake_up)
-		wake_up(&nfsi->lo_waitq);
 }
 EXPORT_SYMBOL(put_lseg);
 
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 196c0c6..745d14e 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -212,7 +212,6 @@ struct nfs_inode {
 
 	/* pNFS layout information */
 #if defined(CONFIG_NFS_V4_1)
-	wait_queue_head_t lo_waitq;
 	struct rpc_wait_queue lo_rpcwaitq;
 	struct pnfs_layout_hdr *layout;
 	time_t pnfs_layout_suspend;
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs
  2010-09-14 10:53 ` [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs Benny Halevy
@ 2010-09-14 12:00   ` Fred Isaman
  2010-09-14 12:40     ` Benny Halevy
  0 siblings, 1 reply; 9+ messages in thread
From: Fred Isaman @ 2010-09-14 12:00 UTC (permalink / raw)
  To: Benny Halevy; +Cc: linux-nfs

Note that this waits on any range that overlaps both an outstanding
LAYOUTRETURN and a current layout.  Whereas what you really want is to
wait on any range that overlaps an outstanding LAYOUTRETURN.  One way
to get the desired behavior would be to have LAYOUTRETURN insert an
"invalid" lseg covering its range.

Fred


On Tue, Sep 14, 2010 at 3:53 AM, Benny Halevy <bhalevy@panasas.com> wrote:
> Introduce lo_rpcwaitq and sleep on it while there's an lseg marked invalid,
> i.e. it is being returned (in the future, it could also be layoutget in progress).
>
> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
> ---
>  fs/nfs/inode.c         |    1 +
>  fs/nfs/nfs4proc.c      |   25 ++++++++++++++++++++++---
>  fs/nfs/pnfs.c          |   37 ++++++++++++++++++-------------------
>  fs/nfs/pnfs.h          |    3 +++
>  include/linux/nfs_fs.h |    1 +
>  5 files changed, 45 insertions(+), 22 deletions(-)
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 7621cfc..059bdf7 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1462,6 +1462,7 @@ static inline void nfs4_init_once(struct nfs_inode *nfsi)
>        init_rwsem(&nfsi->rwsem);
>  #ifdef CONFIG_NFS_V4_1
>        init_waitqueue_head(&nfsi->lo_waitq);
> +       rpc_init_wait_queue(&nfsi->lo_rpcwaitq, "pNFS Layout");
>        nfsi->pnfs_layout_suspend = 0;
>        nfsi->layout = NULL;
>  #endif /* CONFIG_NFS_V4_1 */
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 48a763e..f12568d 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5442,13 +5442,32 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
>  {
>        struct nfs4_layoutget *lgp = calldata;
>        struct inode *ino = lgp->args.inode;
> +       struct nfs_inode *nfsi = NFS_I(ino);
>        struct nfs_server *server = NFS_SERVER(ino);
> +       struct pnfs_layout_segment *lseg;
>
>        dprintk("--> %s\n", __func__);
> -       if (nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
> -                               &lgp->res.seq_res, 0, task))
> +       spin_lock(&ino->i_lock);
> +       lseg = pnfs_has_layout(nfsi->layout, &lgp->args.range);
> +       if (likely(!lseg)) {
> +               spin_unlock(&ino->i_lock);
> +               dprintk("%s: no lseg found, proceeding\n", __func__);
> +               if (!nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
> +                                        &lgp->res.seq_res, 0, task))
> +                       rpc_call_start(task);
>                return;
> -       rpc_call_start(task);
> +       }
> +       if (!lseg->valid) {
> +               put_lseg_locked(lseg);
> +               spin_unlock(&ino->i_lock);
> +               dprintk("%s: invalid lseg found, waiting\n", __func__);
> +               rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
> +               return;
> +       }
> +       *lgp->lsegpp = lseg;
> +       spin_unlock(&ino->i_lock);
> +       dprintk("%s: valid lseg found, no rpc required\n", __func__);
> +       rpc_exit(task, NFS4_OK);
>  }
>
>  static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index bc4f0c1..2450383 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -430,7 +430,7 @@ destroy_lseg(struct kref *kref)
>        PNFS_LD_IO_OPS(lseg->layout)->free_lseg(lseg);
>  }
>
> -static void
> +void
>  put_lseg_locked(struct pnfs_layout_segment *lseg)
>  {
>        bool do_wake_up;
> @@ -444,8 +444,10 @@ put_lseg_locked(struct pnfs_layout_segment *lseg)
>        do_wake_up = !lseg->valid;
>        nfsi = PNFS_NFS_INODE(lseg->layout);
>        kref_put(&lseg->kref, destroy_lseg);
> -       if (do_wake_up)
> +       if (do_wake_up) {
>                wake_up(&nfsi->lo_waitq);
> +               rpc_wake_up(&nfsi->lo_rpcwaitq);
> +       }
>  }
>
>  void
> @@ -999,7 +1001,7 @@ has_matching_lseg(struct pnfs_layout_segment *lseg,
>  /*
>  * lookup range in layout
>  */
> -static struct pnfs_layout_segment *
> +struct pnfs_layout_segment *
>  pnfs_has_layout(struct pnfs_layout_hdr *lo,
>                struct pnfs_layout_range *range)
>  {
> @@ -1057,22 +1059,20 @@ _pnfs_update_layout(struct inode *ino,
>
>        /* Check to see if the layout for the given range already exists */
>        lseg = pnfs_has_layout(lo, &arg);
> -       if (lseg && !lseg->valid) {
> -               put_lseg_locked(lseg);
> -               /* someone is cleaning the layout */
> -               lseg = NULL;
> -               goto out_unlock;
> -       }
> -
>        if (lseg) {
> -               dprintk("%s: Using cached lseg %p for %llu@%llu iomode %d)\n",
> -                       __func__,
> -                       lseg,
> -                       arg.length,
> -                       arg.offset,
> -                       arg.iomode);
> +               if (lseg->valid) {
> +                       dprintk("%s: Using cached lseg %p for %llu@%llu "
> +                               "iomode %d)\n",
> +                               __func__,
> +                               lseg,
> +                               arg.length,
> +                               arg.offset,
> +                               arg.iomode);
>
> -               goto out_unlock;
> +                       goto out_unlock;
> +               }
> +               /* someone is cleaning the layout */
> +               put_lseg_locked(lseg);
>        }
>
>        /* if get layout already failed once goto out */
> @@ -1097,8 +1097,7 @@ out:
>                nfsi->layout->state, lseg);
>        return;
>  out_unlock:
> -       if (lsegpp)
> -               *lsegpp = lseg;
> +       *lsegpp = lseg;
>        spin_unlock(&ino->i_lock);
>        goto out;
>  }
> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
> index ea70385..f7f567e 100644
> --- a/fs/nfs/pnfs.h
> +++ b/fs/nfs/pnfs.h
> @@ -34,6 +34,9 @@ extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
>  /* pnfs.c */
>  extern const nfs4_stateid zero_stateid;
>
> +void put_lseg_locked(struct pnfs_layout_segment *);
> +struct pnfs_layout_segment *pnfs_has_layout(struct pnfs_layout_hdr *,
> +                                           struct pnfs_layout_range *);
>  void _pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
>        loff_t pos, u64 count, enum pnfs_iomode access_type,
>        struct pnfs_layout_segment **lsegpp);
> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> index 5bafa95..196c0c6 100644
> --- a/include/linux/nfs_fs.h
> +++ b/include/linux/nfs_fs.h
> @@ -213,6 +213,7 @@ struct nfs_inode {
>        /* pNFS layout information */
>  #if defined(CONFIG_NFS_V4_1)
>        wait_queue_head_t lo_waitq;
> +       struct rpc_wait_queue lo_rpcwaitq;
>        struct pnfs_layout_hdr *layout;
>        time_t pnfs_layout_suspend;
>  #endif /* CONFIG_NFS_V4_1 */
> --
> 1.7.2.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs
  2010-09-14 12:00   ` Fred Isaman
@ 2010-09-14 12:40     ` Benny Halevy
  2010-09-14 12:47       ` Fred Isaman
  0 siblings, 1 reply; 9+ messages in thread
From: Benny Halevy @ 2010-09-14 12:40 UTC (permalink / raw)
  To: Fred Isaman; +Cc: linux-nfs

On 2010-09-14 14:00, Fred Isaman wrote:
> Note that this waits on any range that overlaps both an outstanding
> LAYOUTRETURN and a current layout.  Whereas what you really want is to
> wait on any range that overlaps an outstanding LAYOUTRETURN.  One way
> to get the desired behavior would be to have LAYOUTRETURN insert an
> "invalid" lseg covering its range.

What do you mean by "current layout"?
The way it works now is that layout return does mark the layout segments
it's about to return as invalid (in pnfs_return_layout_barrier).
layout get inserts a valid layout segment only upon completion
(in pnfs_layout_process) and it does not yet insert an invalid one
before sending the RPC, something we may wan to consider in the future
to prevent superfluous parallel gets.

Benny

> 
> Fred
> 
> 
> On Tue, Sep 14, 2010 at 3:53 AM, Benny Halevy <bhalevy@panasas.com> wrote:
>> Introduce lo_rpcwaitq and sleep on it while there's an lseg marked invalid,
>> i.e. it is being returned (in the future, it could also be layoutget in progress).
>>
>> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
>> ---
>>  fs/nfs/inode.c         |    1 +
>>  fs/nfs/nfs4proc.c      |   25 ++++++++++++++++++++++---
>>  fs/nfs/pnfs.c          |   37 ++++++++++++++++++-------------------
>>  fs/nfs/pnfs.h          |    3 +++
>>  include/linux/nfs_fs.h |    1 +
>>  5 files changed, 45 insertions(+), 22 deletions(-)
>>
>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>> index 7621cfc..059bdf7 100644
>> --- a/fs/nfs/inode.c
>> +++ b/fs/nfs/inode.c
>> @@ -1462,6 +1462,7 @@ static inline void nfs4_init_once(struct nfs_inode *nfsi)
>>        init_rwsem(&nfsi->rwsem);
>>  #ifdef CONFIG_NFS_V4_1
>>        init_waitqueue_head(&nfsi->lo_waitq);
>> +       rpc_init_wait_queue(&nfsi->lo_rpcwaitq, "pNFS Layout");
>>        nfsi->pnfs_layout_suspend = 0;
>>        nfsi->layout = NULL;
>>  #endif /* CONFIG_NFS_V4_1 */
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 48a763e..f12568d 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -5442,13 +5442,32 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
>>  {
>>        struct nfs4_layoutget *lgp = calldata;
>>        struct inode *ino = lgp->args.inode;
>> +       struct nfs_inode *nfsi = NFS_I(ino);
>>        struct nfs_server *server = NFS_SERVER(ino);
>> +       struct pnfs_layout_segment *lseg;
>>
>>        dprintk("--> %s\n", __func__);
>> -       if (nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
>> -                               &lgp->res.seq_res, 0, task))
>> +       spin_lock(&ino->i_lock);
>> +       lseg = pnfs_has_layout(nfsi->layout, &lgp->args.range);
>> +       if (likely(!lseg)) {
>> +               spin_unlock(&ino->i_lock);
>> +               dprintk("%s: no lseg found, proceeding\n", __func__);
>> +               if (!nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
>> +                                        &lgp->res.seq_res, 0, task))
>> +                       rpc_call_start(task);
>>                return;
>> -       rpc_call_start(task);
>> +       }
>> +       if (!lseg->valid) {
>> +               put_lseg_locked(lseg);
>> +               spin_unlock(&ino->i_lock);
>> +               dprintk("%s: invalid lseg found, waiting\n", __func__);
>> +               rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
>> +               return;
>> +       }
>> +       *lgp->lsegpp = lseg;
>> +       spin_unlock(&ino->i_lock);
>> +       dprintk("%s: valid lseg found, no rpc required\n", __func__);
>> +       rpc_exit(task, NFS4_OK);
>>  }
>>
>>  static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>> index bc4f0c1..2450383 100644
>> --- a/fs/nfs/pnfs.c
>> +++ b/fs/nfs/pnfs.c
>> @@ -430,7 +430,7 @@ destroy_lseg(struct kref *kref)
>>        PNFS_LD_IO_OPS(lseg->layout)->free_lseg(lseg);
>>  }
>>
>> -static void
>> +void
>>  put_lseg_locked(struct pnfs_layout_segment *lseg)
>>  {
>>        bool do_wake_up;
>> @@ -444,8 +444,10 @@ put_lseg_locked(struct pnfs_layout_segment *lseg)
>>        do_wake_up = !lseg->valid;
>>        nfsi = PNFS_NFS_INODE(lseg->layout);
>>        kref_put(&lseg->kref, destroy_lseg);
>> -       if (do_wake_up)
>> +       if (do_wake_up) {
>>                wake_up(&nfsi->lo_waitq);
>> +               rpc_wake_up(&nfsi->lo_rpcwaitq);
>> +       }
>>  }
>>
>>  void
>> @@ -999,7 +1001,7 @@ has_matching_lseg(struct pnfs_layout_segment *lseg,
>>  /*
>>  * lookup range in layout
>>  */
>> -static struct pnfs_layout_segment *
>> +struct pnfs_layout_segment *
>>  pnfs_has_layout(struct pnfs_layout_hdr *lo,
>>                struct pnfs_layout_range *range)
>>  {
>> @@ -1057,22 +1059,20 @@ _pnfs_update_layout(struct inode *ino,
>>
>>        /* Check to see if the layout for the given range already exists */
>>        lseg = pnfs_has_layout(lo, &arg);
>> -       if (lseg && !lseg->valid) {
>> -               put_lseg_locked(lseg);
>> -               /* someone is cleaning the layout */
>> -               lseg = NULL;
>> -               goto out_unlock;
>> -       }
>> -
>>        if (lseg) {
>> -               dprintk("%s: Using cached lseg %p for %llu@%llu iomode %d)\n",
>> -                       __func__,
>> -                       lseg,
>> -                       arg.length,
>> -                       arg.offset,
>> -                       arg.iomode);
>> +               if (lseg->valid) {
>> +                       dprintk("%s: Using cached lseg %p for %llu@%llu "
>> +                               "iomode %d)\n",
>> +                               __func__,
>> +                               lseg,
>> +                               arg.length,
>> +                               arg.offset,
>> +                               arg.iomode);
>>
>> -               goto out_unlock;
>> +                       goto out_unlock;
>> +               }
>> +               /* someone is cleaning the layout */
>> +               put_lseg_locked(lseg);
>>        }
>>
>>        /* if get layout already failed once goto out */
>> @@ -1097,8 +1097,7 @@ out:
>>                nfsi->layout->state, lseg);
>>        return;
>>  out_unlock:
>> -       if (lsegpp)
>> -               *lsegpp = lseg;
>> +       *lsegpp = lseg;
>>        spin_unlock(&ino->i_lock);
>>        goto out;
>>  }
>> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>> index ea70385..f7f567e 100644
>> --- a/fs/nfs/pnfs.h
>> +++ b/fs/nfs/pnfs.h
>> @@ -34,6 +34,9 @@ extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
>>  /* pnfs.c */
>>  extern const nfs4_stateid zero_stateid;
>>
>> +void put_lseg_locked(struct pnfs_layout_segment *);
>> +struct pnfs_layout_segment *pnfs_has_layout(struct pnfs_layout_hdr *,
>> +                                           struct pnfs_layout_range *);
>>  void _pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
>>        loff_t pos, u64 count, enum pnfs_iomode access_type,
>>        struct pnfs_layout_segment **lsegpp);
>> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
>> index 5bafa95..196c0c6 100644
>> --- a/include/linux/nfs_fs.h
>> +++ b/include/linux/nfs_fs.h
>> @@ -213,6 +213,7 @@ struct nfs_inode {
>>        /* pNFS layout information */
>>  #if defined(CONFIG_NFS_V4_1)
>>        wait_queue_head_t lo_waitq;
>> +       struct rpc_wait_queue lo_rpcwaitq;
>>        struct pnfs_layout_hdr *layout;
>>        time_t pnfs_layout_suspend;
>>  #endif /* CONFIG_NFS_V4_1 */
>> --
>> 1.7.2.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs
  2010-09-14 12:40     ` Benny Halevy
@ 2010-09-14 12:47       ` Fred Isaman
  2010-09-14 13:13         ` Benny Halevy
  0 siblings, 1 reply; 9+ messages in thread
From: Fred Isaman @ 2010-09-14 12:47 UTC (permalink / raw)
  To: Benny Halevy; +Cc: linux-nfs

On Tue, Sep 14, 2010 at 5:40 AM, Benny Halevy <bhalevy@panasas.com> wrote:
> On 2010-09-14 14:00, Fred Isaman wrote:
>> Note that this waits on any range that overlaps both an outstanding
>> LAYOUTRETURN and a current layout.  Whereas what you really want is to
>> wait on any range that overlaps an outstanding LAYOUTRETURN.  One way
>> to get the desired behavior would be to have LAYOUTRETURN insert an
>> "invalid" lseg covering its range.
>
> What do you mean by "current layout"?

Consider the case where the server asks for a LAYOUTRETURN of the
range 0-1000, but the client currently only has lsegs covering 0-500.
The client should wait for the LAYOUTRETURN to complete before sending
any LAYOUTGET for the range 500-1000, but there is currently no
mechanism to do this.

Fred

> The way it works now is that layout return does mark the layout segments
> it's about to return as invalid (in pnfs_return_layout_barrier).
> layout get inserts a valid layout segment only upon completion
> (in pnfs_layout_process) and it does not yet insert an invalid one
> before sending the RPC, something we may wan to consider in the future
> to prevent superfluous parallel gets.
>
> Benny
>
>>
>> Fred
>>
>>
>> On Tue, Sep 14, 2010 at 3:53 AM, Benny Halevy <bhalevy@panasas.com> wrote:
>>> Introduce lo_rpcwaitq and sleep on it while there's an lseg marked invalid,
>>> i.e. it is being returned (in the future, it could also be layoutget in progress).
>>>
>>> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
>>> ---
>>>  fs/nfs/inode.c         |    1 +
>>>  fs/nfs/nfs4proc.c      |   25 ++++++++++++++++++++++---
>>>  fs/nfs/pnfs.c          |   37 ++++++++++++++++++-------------------
>>>  fs/nfs/pnfs.h          |    3 +++
>>>  include/linux/nfs_fs.h |    1 +
>>>  5 files changed, 45 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>>> index 7621cfc..059bdf7 100644
>>> --- a/fs/nfs/inode.c
>>> +++ b/fs/nfs/inode.c
>>> @@ -1462,6 +1462,7 @@ static inline void nfs4_init_once(struct nfs_inode *nfsi)
>>>        init_rwsem(&nfsi->rwsem);
>>>  #ifdef CONFIG_NFS_V4_1
>>>        init_waitqueue_head(&nfsi->lo_waitq);
>>> +       rpc_init_wait_queue(&nfsi->lo_rpcwaitq, "pNFS Layout");
>>>        nfsi->pnfs_layout_suspend = 0;
>>>        nfsi->layout = NULL;
>>>  #endif /* CONFIG_NFS_V4_1 */
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index 48a763e..f12568d 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -5442,13 +5442,32 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
>>>  {
>>>        struct nfs4_layoutget *lgp = calldata;
>>>        struct inode *ino = lgp->args.inode;
>>> +       struct nfs_inode *nfsi = NFS_I(ino);
>>>        struct nfs_server *server = NFS_SERVER(ino);
>>> +       struct pnfs_layout_segment *lseg;
>>>
>>>        dprintk("--> %s\n", __func__);
>>> -       if (nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
>>> -                               &lgp->res.seq_res, 0, task))
>>> +       spin_lock(&ino->i_lock);
>>> +       lseg = pnfs_has_layout(nfsi->layout, &lgp->args.range);
>>> +       if (likely(!lseg)) {
>>> +               spin_unlock(&ino->i_lock);
>>> +               dprintk("%s: no lseg found, proceeding\n", __func__);
>>> +               if (!nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
>>> +                                        &lgp->res.seq_res, 0, task))
>>> +                       rpc_call_start(task);
>>>                return;
>>> -       rpc_call_start(task);
>>> +       }
>>> +       if (!lseg->valid) {
>>> +               put_lseg_locked(lseg);
>>> +               spin_unlock(&ino->i_lock);
>>> +               dprintk("%s: invalid lseg found, waiting\n", __func__);
>>> +               rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
>>> +               return;
>>> +       }
>>> +       *lgp->lsegpp = lseg;
>>> +       spin_unlock(&ino->i_lock);
>>> +       dprintk("%s: valid lseg found, no rpc required\n", __func__);
>>> +       rpc_exit(task, NFS4_OK);
>>>  }
>>>
>>>  static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>> index bc4f0c1..2450383 100644
>>> --- a/fs/nfs/pnfs.c
>>> +++ b/fs/nfs/pnfs.c
>>> @@ -430,7 +430,7 @@ destroy_lseg(struct kref *kref)
>>>        PNFS_LD_IO_OPS(lseg->layout)->free_lseg(lseg);
>>>  }
>>>
>>> -static void
>>> +void
>>>  put_lseg_locked(struct pnfs_layout_segment *lseg)
>>>  {
>>>        bool do_wake_up;
>>> @@ -444,8 +444,10 @@ put_lseg_locked(struct pnfs_layout_segment *lseg)
>>>        do_wake_up = !lseg->valid;
>>>        nfsi = PNFS_NFS_INODE(lseg->layout);
>>>        kref_put(&lseg->kref, destroy_lseg);
>>> -       if (do_wake_up)
>>> +       if (do_wake_up) {
>>>                wake_up(&nfsi->lo_waitq);
>>> +               rpc_wake_up(&nfsi->lo_rpcwaitq);
>>> +       }
>>>  }
>>>
>>>  void
>>> @@ -999,7 +1001,7 @@ has_matching_lseg(struct pnfs_layout_segment *lseg,
>>>  /*
>>>  * lookup range in layout
>>>  */
>>> -static struct pnfs_layout_segment *
>>> +struct pnfs_layout_segment *
>>>  pnfs_has_layout(struct pnfs_layout_hdr *lo,
>>>                struct pnfs_layout_range *range)
>>>  {
>>> @@ -1057,22 +1059,20 @@ _pnfs_update_layout(struct inode *ino,
>>>
>>>        /* Check to see if the layout for the given range already exists */
>>>        lseg = pnfs_has_layout(lo, &arg);
>>> -       if (lseg && !lseg->valid) {
>>> -               put_lseg_locked(lseg);
>>> -               /* someone is cleaning the layout */
>>> -               lseg = NULL;
>>> -               goto out_unlock;
>>> -       }
>>> -
>>>        if (lseg) {
>>> -               dprintk("%s: Using cached lseg %p for %llu@%llu iomode %d)\n",
>>> -                       __func__,
>>> -                       lseg,
>>> -                       arg.length,
>>> -                       arg.offset,
>>> -                       arg.iomode);
>>> +               if (lseg->valid) {
>>> +                       dprintk("%s: Using cached lseg %p for %llu@%llu "
>>> +                               "iomode %d)\n",
>>> +                               __func__,
>>> +                               lseg,
>>> +                               arg.length,
>>> +                               arg.offset,
>>> +                               arg.iomode);
>>>
>>> -               goto out_unlock;
>>> +                       goto out_unlock;
>>> +               }
>>> +               /* someone is cleaning the layout */
>>> +               put_lseg_locked(lseg);
>>>        }
>>>
>>>        /* if get layout already failed once goto out */
>>> @@ -1097,8 +1097,7 @@ out:
>>>                nfsi->layout->state, lseg);
>>>        return;
>>>  out_unlock:
>>> -       if (lsegpp)
>>> -               *lsegpp = lseg;
>>> +       *lsegpp = lseg;
>>>        spin_unlock(&ino->i_lock);
>>>        goto out;
>>>  }
>>> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>>> index ea70385..f7f567e 100644
>>> --- a/fs/nfs/pnfs.h
>>> +++ b/fs/nfs/pnfs.h
>>> @@ -34,6 +34,9 @@ extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
>>>  /* pnfs.c */
>>>  extern const nfs4_stateid zero_stateid;
>>>
>>> +void put_lseg_locked(struct pnfs_layout_segment *);
>>> +struct pnfs_layout_segment *pnfs_has_layout(struct pnfs_layout_hdr *,
>>> +                                           struct pnfs_layout_range *);
>>>  void _pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
>>>        loff_t pos, u64 count, enum pnfs_iomode access_type,
>>>        struct pnfs_layout_segment **lsegpp);
>>> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
>>> index 5bafa95..196c0c6 100644
>>> --- a/include/linux/nfs_fs.h
>>> +++ b/include/linux/nfs_fs.h
>>> @@ -213,6 +213,7 @@ struct nfs_inode {
>>>        /* pNFS layout information */
>>>  #if defined(CONFIG_NFS_V4_1)
>>>        wait_queue_head_t lo_waitq;
>>> +       struct rpc_wait_queue lo_rpcwaitq;
>>>        struct pnfs_layout_hdr *layout;
>>>        time_t pnfs_layout_suspend;
>>>  #endif /* CONFIG_NFS_V4_1 */
>>> --
>>> 1.7.2.2
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs
  2010-09-14 12:47       ` Fred Isaman
@ 2010-09-14 13:13         ` Benny Halevy
  0 siblings, 0 replies; 9+ messages in thread
From: Benny Halevy @ 2010-09-14 13:13 UTC (permalink / raw)
  To: Fred Isaman; +Cc: linux-nfs

On 2010-09-14 14:47, Fred Isaman wrote:
> On Tue, Sep 14, 2010 at 5:40 AM, Benny Halevy <bhalevy@panasas.com> wrote:
>> On 2010-09-14 14:00, Fred Isaman wrote:
>>> Note that this waits on any range that overlaps both an outstanding
>>> LAYOUTRETURN and a current layout.  Whereas what you really want is to
>>> wait on any range that overlaps an outstanding LAYOUTRETURN.  One way
>>> to get the desired behavior would be to have LAYOUTRETURN insert an
>>> "invalid" lseg covering its range.
>>
>> What do you mean by "current layout"?
> 
> Consider the case where the server asks for a LAYOUTRETURN of the
> range 0-1000, but the client currently only has lsegs covering 0-500.
> The client should wait for the LAYOUTRETURN to complete before sending
> any LAYOUTGET for the range 500-1000, but there is currently no
> mechanism to do this.
> 

Yes, this could be yet another optimization to avoid
getting NFS4ERR_RECALLCONFLICT.

Benny

> Fred
> 
>> The way it works now is that layout return does mark the layout segments
>> it's about to return as invalid (in pnfs_return_layout_barrier).
>> layout get inserts a valid layout segment only upon completion
>> (in pnfs_layout_process) and it does not yet insert an invalid one
>> before sending the RPC, something we may wan to consider in the future
>> to prevent superfluous parallel gets.
>>
>> Benny
>>
>>>
>>> Fred
>>>
>>>
>>> On Tue, Sep 14, 2010 at 3:53 AM, Benny Halevy <bhalevy@panasas.com> wrote:
>>>> Introduce lo_rpcwaitq and sleep on it while there's an lseg marked invalid,
>>>> i.e. it is being returned (in the future, it could also be layoutget in progress).
>>>>
>>>> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
>>>> ---
>>>>  fs/nfs/inode.c         |    1 +
>>>>  fs/nfs/nfs4proc.c      |   25 ++++++++++++++++++++++---
>>>>  fs/nfs/pnfs.c          |   37 ++++++++++++++++++-------------------
>>>>  fs/nfs/pnfs.h          |    3 +++
>>>>  include/linux/nfs_fs.h |    1 +
>>>>  5 files changed, 45 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>>>> index 7621cfc..059bdf7 100644
>>>> --- a/fs/nfs/inode.c
>>>> +++ b/fs/nfs/inode.c
>>>> @@ -1462,6 +1462,7 @@ static inline void nfs4_init_once(struct nfs_inode *nfsi)
>>>>        init_rwsem(&nfsi->rwsem);
>>>>  #ifdef CONFIG_NFS_V4_1
>>>>        init_waitqueue_head(&nfsi->lo_waitq);
>>>> +       rpc_init_wait_queue(&nfsi->lo_rpcwaitq, "pNFS Layout");
>>>>        nfsi->pnfs_layout_suspend = 0;
>>>>        nfsi->layout = NULL;
>>>>  #endif /* CONFIG_NFS_V4_1 */
>>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>>> index 48a763e..f12568d 100644
>>>> --- a/fs/nfs/nfs4proc.c
>>>> +++ b/fs/nfs/nfs4proc.c
>>>> @@ -5442,13 +5442,32 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
>>>>  {
>>>>        struct nfs4_layoutget *lgp = calldata;
>>>>        struct inode *ino = lgp->args.inode;
>>>> +       struct nfs_inode *nfsi = NFS_I(ino);
>>>>        struct nfs_server *server = NFS_SERVER(ino);
>>>> +       struct pnfs_layout_segment *lseg;
>>>>
>>>>        dprintk("--> %s\n", __func__);
>>>> -       if (nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
>>>> -                               &lgp->res.seq_res, 0, task))
>>>> +       spin_lock(&ino->i_lock);
>>>> +       lseg = pnfs_has_layout(nfsi->layout, &lgp->args.range);
>>>> +       if (likely(!lseg)) {
>>>> +               spin_unlock(&ino->i_lock);
>>>> +               dprintk("%s: no lseg found, proceeding\n", __func__);
>>>> +               if (!nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
>>>> +                                        &lgp->res.seq_res, 0, task))
>>>> +                       rpc_call_start(task);
>>>>                return;
>>>> -       rpc_call_start(task);
>>>> +       }
>>>> +       if (!lseg->valid) {
>>>> +               put_lseg_locked(lseg);
>>>> +               spin_unlock(&ino->i_lock);
>>>> +               dprintk("%s: invalid lseg found, waiting\n", __func__);
>>>> +               rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
>>>> +               return;
>>>> +       }
>>>> +       *lgp->lsegpp = lseg;
>>>> +       spin_unlock(&ino->i_lock);
>>>> +       dprintk("%s: valid lseg found, no rpc required\n", __func__);
>>>> +       rpc_exit(task, NFS4_OK);
>>>>  }
>>>>
>>>>  static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
>>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>>> index bc4f0c1..2450383 100644
>>>> --- a/fs/nfs/pnfs.c
>>>> +++ b/fs/nfs/pnfs.c
>>>> @@ -430,7 +430,7 @@ destroy_lseg(struct kref *kref)
>>>>        PNFS_LD_IO_OPS(lseg->layout)->free_lseg(lseg);
>>>>  }
>>>>
>>>> -static void
>>>> +void
>>>>  put_lseg_locked(struct pnfs_layout_segment *lseg)
>>>>  {
>>>>        bool do_wake_up;
>>>> @@ -444,8 +444,10 @@ put_lseg_locked(struct pnfs_layout_segment *lseg)
>>>>        do_wake_up = !lseg->valid;
>>>>        nfsi = PNFS_NFS_INODE(lseg->layout);
>>>>        kref_put(&lseg->kref, destroy_lseg);
>>>> -       if (do_wake_up)
>>>> +       if (do_wake_up) {
>>>>                wake_up(&nfsi->lo_waitq);
>>>> +               rpc_wake_up(&nfsi->lo_rpcwaitq);
>>>> +       }
>>>>  }
>>>>
>>>>  void
>>>> @@ -999,7 +1001,7 @@ has_matching_lseg(struct pnfs_layout_segment *lseg,
>>>>  /*
>>>>  * lookup range in layout
>>>>  */
>>>> -static struct pnfs_layout_segment *
>>>> +struct pnfs_layout_segment *
>>>>  pnfs_has_layout(struct pnfs_layout_hdr *lo,
>>>>                struct pnfs_layout_range *range)
>>>>  {
>>>> @@ -1057,22 +1059,20 @@ _pnfs_update_layout(struct inode *ino,
>>>>
>>>>        /* Check to see if the layout for the given range already exists */
>>>>        lseg = pnfs_has_layout(lo, &arg);
>>>> -       if (lseg && !lseg->valid) {
>>>> -               put_lseg_locked(lseg);
>>>> -               /* someone is cleaning the layout */
>>>> -               lseg = NULL;
>>>> -               goto out_unlock;
>>>> -       }
>>>> -
>>>>        if (lseg) {
>>>> -               dprintk("%s: Using cached lseg %p for %llu@%llu iomode %d)\n",
>>>> -                       __func__,
>>>> -                       lseg,
>>>> -                       arg.length,
>>>> -                       arg.offset,
>>>> -                       arg.iomode);
>>>> +               if (lseg->valid) {
>>>> +                       dprintk("%s: Using cached lseg %p for %llu@%llu "
>>>> +                               "iomode %d)\n",
>>>> +                               __func__,
>>>> +                               lseg,
>>>> +                               arg.length,
>>>> +                               arg.offset,
>>>> +                               arg.iomode);
>>>>
>>>> -               goto out_unlock;
>>>> +                       goto out_unlock;
>>>> +               }
>>>> +               /* someone is cleaning the layout */
>>>> +               put_lseg_locked(lseg);
>>>>        }
>>>>
>>>>        /* if get layout already failed once goto out */
>>>> @@ -1097,8 +1097,7 @@ out:
>>>>                nfsi->layout->state, lseg);
>>>>        return;
>>>>  out_unlock:
>>>> -       if (lsegpp)
>>>> -               *lsegpp = lseg;
>>>> +       *lsegpp = lseg;
>>>>        spin_unlock(&ino->i_lock);
>>>>        goto out;
>>>>  }
>>>> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>>>> index ea70385..f7f567e 100644
>>>> --- a/fs/nfs/pnfs.h
>>>> +++ b/fs/nfs/pnfs.h
>>>> @@ -34,6 +34,9 @@ extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
>>>>  /* pnfs.c */
>>>>  extern const nfs4_stateid zero_stateid;
>>>>
>>>> +void put_lseg_locked(struct pnfs_layout_segment *);
>>>> +struct pnfs_layout_segment *pnfs_has_layout(struct pnfs_layout_hdr *,
>>>> +                                           struct pnfs_layout_range *);
>>>>  void _pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
>>>>        loff_t pos, u64 count, enum pnfs_iomode access_type,
>>>>        struct pnfs_layout_segment **lsegpp);
>>>> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
>>>> index 5bafa95..196c0c6 100644
>>>> --- a/include/linux/nfs_fs.h
>>>> +++ b/include/linux/nfs_fs.h
>>>> @@ -213,6 +213,7 @@ struct nfs_inode {
>>>>        /* pNFS layout information */
>>>>  #if defined(CONFIG_NFS_V4_1)
>>>>        wait_queue_head_t lo_waitq;
>>>> +       struct rpc_wait_queue lo_rpcwaitq;
>>>>        struct pnfs_layout_hdr *layout;
>>>>        time_t pnfs_layout_suspend;
>>>>  #endif /* CONFIG_NFS_V4_1 */
>>>> --
>>>> 1.7.2.2
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-09-14 13:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-14 10:52 [RFC 0/4] move layout get/return synchronization to rpc prepare Benny Halevy
2010-09-14 10:53 ` [PATCH 1/4] SQUASHME: pnfs-post-submit: remove take_ref and only_valid params from pnfs_has_layout Benny Halevy
2010-09-14 10:53 ` [PATCH 2/4] pnfs: allow nfs4_proc_layoutget to sleep on invalid lsegs Benny Halevy
2010-09-14 12:00   ` Fred Isaman
2010-09-14 12:40     ` Benny Halevy
2010-09-14 12:47       ` Fred Isaman
2010-09-14 13:13         ` Benny Halevy
2010-09-14 10:53 ` [PATCH 3/4] pnfs: allow nfs4_proc_layoutreturn to sleep on barrier Benny Halevy
2010-09-14 10:53 ` [PATCH 4/4] SQUASHME: pnfs: get rid of lo_waitq Benny Halevy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.