* [PATCH Version 2 0/4] NFSv4.1 enable fencing of file layout data servers
@ 2012-06-20 19:03 andros
  2012-06-20 19:03 ` [PATCH Version 2 1/4] NFSv4.1 return the LAYOUT for each file with failed DS connection I/O andros
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: andros @ 2012-06-20 19:03 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs, Andy Adamson

From: Andy Adamson <andros@netapp.com>

When I/O sent to a file layout data server has a connection error, the I/O
is resent to the MDS. As with all multipath I/O, we want to fence one path,
the DS I/O path, ASAP to prevent data corruption. pNFS provides the
LAYOUTRETURN call to fence file layout data servers.

The current implementation of _pnfs_return_layout does not take a layout range
but returns all layouts. It also does not wait for all layout segments to
be marked invalid prior to sending a LAYOUTRETURN. This works well for the
file layout driver. Other layout drivers can address these issues if needed.

The first patch calls LAYOUTRETURN for each failed data server I/O.
The second patch blocks sending a LAYOUTCOMMIT on the returned layouts.

Because LAYOUTRETURN is called on each failed data server request, it can be
called multiple times. The third patch addresses this by setting a flag. The
flag is cleared on subsequent LAYOUTGETs.
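
The flag behaviour described above can be sketched in user-space C. This is
only a single-threaded model under stated assumptions: struct layout_model,
model_return_layout() and model_layoutget() are hypothetical names invented
for illustration, and the kernel code does the equivalent checks on
plh_flags under the inode's i_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the layout header; not the kernel struct. */
struct layout_model {
	bool returned;      /* models the "already returned" flag */
	int  returns_sent;  /* LAYOUTRETURNs that would go on the wire */
};

/*
 * Models one failed DS request asking for a LAYOUTRETURN.
 * Returns true only if a LAYOUTRETURN would actually be sent.
 */
static bool model_return_layout(struct layout_model *lo)
{
	assert(lo != NULL);
	if (lo->returned)
		return false;   /* an earlier failure already returned it */
	lo->returned = true;
	lo->returns_sent++;
	return true;
}

/* Models a subsequent LAYOUTGET, which re-enables LAYOUTRETURN. */
static void model_layoutget(struct layout_model *lo)
{
	assert(lo != NULL);
	lo->returned = false;
}
```

In this model, any number of failed requests between two LAYOUTGETs results
in a single LAYOUTRETURN.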

The DS connection error also causes the deviceid using the DS to be marked
invalid. This prevents any new layouts from subsequent LAYOUTGETs that use the
invalid deviceid from being cached - but the layout hdr is not reaped,
resulting in a pnfs_layout_hdr with an empty plh_segs list.
New LAYOUTGETs are allowed because they can use a new valid deviceid, and in
fact, this is currently the only way an MDS can use pNFS again for files with
layouts referring to an invalid deviceid.
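
As a rough illustration of how an empty plh_segs list can arise, the sketch
below models a header that refuses to cache a segment whose deviceid is
marked invalid, while still accepting one that uses a valid deviceid. All
names (devid_model, hdr_model, model_cache_lseg) are hypothetical and do not
appear in the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a deviceid and its "invalid" mark. */
struct devid_model {
	bool invalid;
};

/* Hypothetical model of a layout header; nsegs stands in for plh_segs. */
struct hdr_model {
	int nsegs;
};

/*
 * Models caching a segment obtained by LAYOUTGET.
 * Returns true if the segment was cached, false if it was dropped
 * because its deviceid is marked invalid. The header itself survives
 * either way, possibly with zero segments.
 */
static bool model_cache_lseg(struct hdr_model *hdr,
			     const struct devid_model *dev)
{
	assert(hdr != NULL && dev != NULL);
	if (dev->invalid)
		return false;
	hdr->nsegs++;
	return true;
}
```

A LAYOUTGET that comes back with the invalid deviceid leaves nsegs at 0; one
that comes back with a new, valid deviceid populates the list again.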

pnfs_layout_hdrs with empty plh_segs lists are prevented from sending a
LAYOUTRETURN (on evict inode) by the last patch.
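
The idea behind the last patch can be sketched as follows, assuming (as its
changelog states) that the invalidation step reports the same result for an
initially empty list and for a list it has just emptied. The caller therefore
has to record emptiness before invalidating. The names seg_list_model,
model_invalidate_all and model_should_send_return are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the segment list; nsegs stands in for plh_segs. */
struct seg_list_model {
	int nsegs;
};

/*
 * Models invalidating every segment. Returns the number of segments
 * left, which is 0 whether or not the list had anything in it.
 */
static int model_invalidate_all(struct seg_list_model *l)
{
	assert(l != NULL);
	l->nsegs = 0;
	return l->nsegs;
}

/*
 * Decides whether a LAYOUTRETURN should be sent. Emptiness is sampled
 * before invalidation because the return value above cannot tell the
 * two cases apart.
 */
static bool model_should_send_return(struct seg_list_model *l)
{
	bool was_empty;

	assert(l != NULL);
	was_empty = (l->nsegs == 0);
	(void)model_invalidate_all(l);
	return !was_empty;
}
```

Only a list that held segments before invalidation triggers a LAYOUTRETURN
in this model; a second call on the now-empty list does not.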

Testing: against the pynfs file layout server, with the client
network-partitioned from the DS.


Andy Adamson (4):
  NFSv4.1 return the LAYOUT for each file with failed DS connection I/O
  NFSv4.1 don't send LAYOUTCOMMIT if data resent through MDS
  NFSv4.1 mark layout when already returned
  NFSv4.1 do not send LAYOUTRETURN on emtpy plh_segs list

 fs/nfs/nfs4filelayout.c |    4 ++--
 fs/nfs/pnfs.c           |   33 +++++++++++++++++++++++++++------
 fs/nfs/pnfs.h           |   19 +++++++++++++++++++
 3 files changed, 48 insertions(+), 8 deletions(-)

-- 
1.7.6.4



* [PATCH Version 2 1/4] NFSv4.1 return the LAYOUT for each file with failed DS connection I/O
  2012-06-20 19:03 [PATCH Version 2 0/4] NFSv4.1 enable fencing of file layout data servers andros
@ 2012-06-20 19:03 ` andros
  2012-06-20 19:03 ` [PATCH Version 2 2/4] NFSv4.1 don't send LAYOUTCOMMIT if data resent through MDS andros
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: andros @ 2012-06-20 19:03 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs, Andy Adamson

From: Andy Adamson <andros@netapp.com>

First mark the deviceid invalid to prevent any future use. Then fence all
files involved in I/O to a DS with a connection error by sending a
LAYOUTRETURN.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfs/nfs4filelayout.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 85b7063..26b96de 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -205,9 +205,8 @@ static int filelayout_async_handle_error(struct rpc_task *task,
 	case -EPIPE:
 		dprintk("%s DS connection error %d\n", __func__,
 			task->tk_status);
-		if (!filelayout_test_devid_invalid(devid))
-			_pnfs_return_layout(inode);
 		filelayout_mark_devid_invalid(devid);
+		_pnfs_return_layout(inode);
 		rpc_wake_up(&tbl->slot_tbl_waitq);
 		nfs4_ds_disconnect(clp);
 		/* fall through */
-- 
1.7.6.4



* [PATCH Version 2 2/4] NFSv4.1 don't send LAYOUTCOMMIT if data resent through MDS
  2012-06-20 19:03 [PATCH Version 2 0/4] NFSv4.1 enable fencing of file layout data servers andros
  2012-06-20 19:03 ` [PATCH Version 2 1/4] NFSv4.1 return the LAYOUT for each file with failed DS connection I/O andros
@ 2012-06-20 19:03 ` andros
  2012-06-20 19:03 ` [PATCH Version 2 3/4] NFSv4.1 mark layout when already returned andros
  2012-06-20 19:03 ` [PATCH Version 2 4/4] NFSv4.1 do not send LAYOUTRETURN on emtpy plh_segs list andros
  3 siblings, 0 replies; 5+ messages in thread
From: andros @ 2012-06-20 19:03 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs, Andy Adamson

From: Andy Adamson <andros@netapp.com>

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfs/nfs4filelayout.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 26b96de..53f94d9 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -206,6 +206,7 @@ static int filelayout_async_handle_error(struct rpc_task *task,
 		dprintk("%s DS connection error %d\n", __func__,
 			task->tk_status);
 		filelayout_mark_devid_invalid(devid);
+		clear_bit(NFS_INO_LAYOUTCOMMIT, &NFS_I(inode)->flags);
 		_pnfs_return_layout(inode);
 		rpc_wake_up(&tbl->slot_tbl_waitq);
 		nfs4_ds_disconnect(clp);
-- 
1.7.6.4



* [PATCH Version 2 3/4] NFSv4.1 mark layout when already returned
  2012-06-20 19:03 [PATCH Version 2 0/4] NFSv4.1 enable fencing of file layout data servers andros
  2012-06-20 19:03 ` [PATCH Version 2 1/4] NFSv4.1 return the LAYOUT for each file with failed DS connection I/O andros
  2012-06-20 19:03 ` [PATCH Version 2 2/4] NFSv4.1 don't send LAYOUTCOMMIT if data resent through MDS andros
@ 2012-06-20 19:03 ` andros
  2012-06-20 19:03 ` [PATCH Version 2 4/4] NFSv4.1 do not send LAYOUTRETURN on emtpy plh_segs list andros
  3 siblings, 0 replies; 5+ messages in thread
From: andros @ 2012-06-20 19:03 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs, Andy Adamson

From: Andy Adamson <andros@netapp.com>

When the file layout driver is fencing a DS, _pnfs_return_layout can be
called multiple times per inode due to in-flight I/O referencing lsegs on its
plh_segs list.

Remember that LAYOUTRETURN has been called, and do not call it again.
Allow LAYOUTRETURNs after a subsequent LAYOUTGET.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfs/pnfs.c |   10 ++++++++--
 fs/nfs/pnfs.h |   19 +++++++++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index bbc49ca..3dfd0a6 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -662,11 +662,11 @@ _pnfs_return_layout(struct inode *ino)
 	nfs4_stateid stateid;
 	int status = 0;
 
-	dprintk("--> %s\n", __func__);
+	dprintk("NFS: %s for inode %lu\n", __func__, ino->i_ino);
 
 	spin_lock(&ino->i_lock);
 	lo = nfsi->layout;
-	if (!lo) {
+	if (!lo || pnfs_test_layout_returned(lo)) {
 		spin_unlock(&ino->i_lock);
 		dprintk("%s: no layout to return\n", __func__);
 		return status;
@@ -676,6 +676,7 @@ _pnfs_return_layout(struct inode *ino)
 	get_layout_hdr(lo);
 	mark_matching_lsegs_invalid(lo, &tmp_list, NULL);
 	lo->plh_block_lgets++;
+	pnfs_mark_layout_returned(lo);
 	spin_unlock(&ino->i_lock);
 	pnfs_free_lseg_list(&tmp_list);
 
@@ -686,6 +687,7 @@ _pnfs_return_layout(struct inode *ino)
 		status = -ENOMEM;
 		set_bit(NFS_LAYOUT_RW_FAILED, &lo->plh_flags);
 		set_bit(NFS_LAYOUT_RO_FAILED, &lo->plh_flags);
+		pnfs_clear_layout_returned(lo);
 		put_layout_hdr(lo);
 		goto out;
 	}
@@ -1075,6 +1077,10 @@ pnfs_update_layout(struct inode *ino,
 	get_layout_hdr(lo);
 	if (list_empty(&lo->plh_segs))
 		first = true;
+
+	/* Enable LAYOUTRETURNs */
+	pnfs_clear_layout_returned(lo);
+
 	spin_unlock(&ino->i_lock);
 	if (first) {
 		/* The lo must be on the clp list if there is any
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 64f90d8..f9bc98c 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -64,6 +64,7 @@ enum {
 	NFS_LAYOUT_ROC,			/* some lseg had roc bit set */
 	NFS_LAYOUT_DESTROYED,		/* no new use of layout allowed */
 	NFS_LAYOUT_INVALID,		/* layout is being destroyed */
+	NFS_LAYOUT_RETURNED,		/* layout has already been returned */
 };
 
 enum layoutdriver_policy_flags {
@@ -255,6 +256,24 @@ struct nfs4_deviceid_node *nfs4_insert_deviceid_node(struct nfs4_deviceid_node *
 bool nfs4_put_deviceid_node(struct nfs4_deviceid_node *);
 void nfs4_deviceid_purge_client(const struct nfs_client *);
 
+static inline void
+pnfs_mark_layout_returned(struct pnfs_layout_hdr *lo)
+{
+	set_bit(NFS_LAYOUT_RETURNED, &lo->plh_flags);
+}
+
+static inline void
+pnfs_clear_layout_returned(struct pnfs_layout_hdr *lo)
+{
+	clear_bit(NFS_LAYOUT_RETURNED, &lo->plh_flags);
+}
+
+static inline bool
+pnfs_test_layout_returned(struct pnfs_layout_hdr *lo)
+{
+	return test_bit(NFS_LAYOUT_RETURNED, &lo->plh_flags);
+}
+
 static inline int lo_fail_bit(u32 iomode)
 {
 	return iomode == IOMODE_RW ?
-- 
1.7.6.4



* [PATCH Version 2 4/4] NFSv4.1 do not send LAYOUTRETURN on emtpy plh_segs list
  2012-06-20 19:03 [PATCH Version 2 0/4] NFSv4.1 enable fencing of file layout data servers andros
                   ` (2 preceding siblings ...)
  2012-06-20 19:03 ` [PATCH Version 2 3/4] NFSv4.1 mark layout when already returned andros
@ 2012-06-20 19:03 ` andros
  3 siblings, 0 replies; 5+ messages in thread
From: andros @ 2012-06-20 19:03 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs, Andy Adamson

From: Andy Adamson <andros@netapp.com>

mark_matching_lsegs_invalid() resets the mds_threshold counters and can
dereference the layout hdr on an initially empty plh_segs list. It returns 0
both for an initially empty list and for a non-empty list that was cleared
by calls to mark_lseg_invalid.

Don't send a LAYOUTRETURN if the list was initially empty.

Signed-off-by: Andy Adamson <andros@netapp.com>
---
 fs/nfs/pnfs.c |   23 +++++++++++++++++++----
 1 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 3dfd0a6..2669710 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -651,7 +651,14 @@ out_err_free:
 	return NULL;
 }
 
-/* Initiates a LAYOUTRETURN(FILE) */
+/*
+ * Initiates a LAYOUTRETURN(FILE), and removes the pnfs_layout_hdr
+ * when the layout segment list is empty.
+ *
+ * Note that a pnfs_layout_hdr can exist with an empty layout segment
+ * list when LAYOUTGET has failed, or when LAYOUTGET succeeded, but the
+ * deviceid is marked invalid.
+ */
 int
 _pnfs_return_layout(struct inode *ino)
 {
@@ -660,7 +667,7 @@ _pnfs_return_layout(struct inode *ino)
 	LIST_HEAD(tmp_list);
 	struct nfs4_layoutreturn *lrp;
 	nfs4_stateid stateid;
-	int status = 0;
+	int status = 0, empty;
 
 	dprintk("NFS: %s for inode %lu\n", __func__, ino->i_ino);
 
@@ -668,13 +675,21 @@ _pnfs_return_layout(struct inode *ino)
 	lo = nfsi->layout;
 	if (!lo || pnfs_test_layout_returned(lo)) {
 		spin_unlock(&ino->i_lock);
-		dprintk("%s: no layout to return\n", __func__);
-		return status;
+		dprintk("NFS: %s no layout to return\n", __func__);
+		goto out;
 	}
 	stateid = nfsi->layout->plh_stateid;
 	/* Reference matched in nfs4_layoutreturn_release */
 	get_layout_hdr(lo);
+	empty = list_empty(&lo->plh_segs);
 	mark_matching_lsegs_invalid(lo, &tmp_list, NULL);
+	/* Don't send a LAYOUTRETURN if list was initially empty */
+	if (empty) {
+		spin_unlock(&ino->i_lock);
+		put_layout_hdr(lo);
+		dprintk("NFS: %s no layout segments to return\n", __func__);
+		goto out;
+	}
 	lo->plh_block_lgets++;
 	pnfs_mark_layout_returned(lo);
 	spin_unlock(&ino->i_lock);
-- 
1.7.6.4


