All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] Lustre fixes
@ 2015-03-26  1:53 green
  2015-03-26  1:53 ` [PATCH 01/10] staging/lustre/osc: shorten IO calling path green
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Oleg Drokin

From: Oleg Drokin <green@linuxhacker.ru>

A number of lustre fixes.

Andriy Skulysh (1):
  staging/lustre/ptlrpc: fix import state during replay

Bobi Jam (2):
  staging/lustre/osc: shorten IO calling path
  staging/lustre/mgc: detach MGC dev on error

Christopher J. Morrone (1):
  staging/lustre/mdc: Handle empty but non-zero acl xattr

Hongchao Zhang (1):
  staging/lustre/mgc: check the import stat for lprocfs

Lai Siyao (1):
  staging/lustre/xattr: xattr data may be gone with lock held

Li Dongyang (1):
  staging/lustre/llite: glimpse the inode before doing fiemap

Liang Zhen (1):
  staging/lustre/ptlrpc: false alarm in AT network latency measuring

Niu Yawei (1):
  staging/lustre: update timestamps after buiding rpc

Yang Sheng (1):
  staging/lustre/lov: don't crash accessing LOV object with FID{0,0}

 .../staging/lustre/lustre/include/lustre_import.h  |  2 +
 drivers/staging/lustre/lustre/llite/file.c         |  6 +++
 drivers/staging/lustre/lustre/llite/xattr_cache.c  | 19 +++++----
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  6 +++
 drivers/staging/lustre/lustre/lov/lov_internal.h   | 12 ++++++
 drivers/staging/lustre/lustre/lov/lov_io.c         | 12 ++++++
 drivers/staging/lustre/lustre/lov/lov_lock.c       | 26 ++++++++++---
 drivers/staging/lustre/lustre/lov/lov_obd.c        | 32 ++++++++++++---
 drivers/staging/lustre/lustre/lov/lov_object.c     |  7 ++++
 drivers/staging/lustre/lustre/lov/lov_request.c    |  9 +++++
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |  3 ++
 drivers/staging/lustre/lustre/mgc/mgc_request.c    | 45 +++++++++++++++-------
 drivers/staging/lustre/lustre/osc/osc_cache.c      |  2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |  3 ++
 drivers/staging/lustre/lustre/ptlrpc/client.c      | 25 +++++++++---
 drivers/staging/lustre/lustre/ptlrpc/import.c      | 15 +++++++-
 16 files changed, 184 insertions(+), 40 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 01/10] staging/lustre/osc: shorten IO calling path
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
@ 2015-03-26  1:53 ` green
  2015-03-26  2:04   ` [PATCH v2 " green
  2015-03-26  1:53 ` [PATCH 02/10] staging/lustre/mdc: Handle empty but non-zero acl xattr green
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Bobi Jam, Bob Glossman, Oleg Drokin

From: Bobi Jam <bobijam.xu@intel.com>

By using osc_io_unplug_aync() for osc_queue_sync_pages() to shorten
the IO calling path, to reduce the chance of stack overflow.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com
Reviewed-on: http://review.whamcloud.com/11612
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3188
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/osc/osc_cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 7022ed4..d44b3d4 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -2613,7 +2613,7 @@ int osc_queue_sync_pages(const struct lu_env *env, struct osc_object *obj,
 	}
 	osc_object_unlock(obj);
 
-	osc_io_unplug(env, cli, obj, PDL_POLICY_ROUND);
+	osc_io_unplug_async(env, cli, obj);
 	return 0;
 }
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 02/10] staging/lustre/mdc: Handle empty but non-zero acl xattr
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
  2015-03-26  1:53 ` [PATCH 01/10] staging/lustre/osc: shorten IO calling path green
@ 2015-03-26  1:53 ` green
  2015-03-26  1:53 ` [PATCH 03/10] staging/lustre/ptlrpc: false alarm in AT network latency measuring green
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Christopher J. Morrone, Oleg Drokin

From: "Christopher J. Morrone" <morrone2@llnl.gov>

We have found that posix_acl_access can have a value
of \002\000\000\000.  In that case body->aclsize is
non-zero, but the there are no actuall acls stored
in the xattr.

In mdc_unpack_acl(), it only checks IS_ERR() on the
pointer returned by posix_acl_from_xattr(), it does not
check for NULL.  Because of the above situation, the
xattr aclsize can be non-zero, but posic_acl_from_xattr()
still returns NULL.  Passing NULL to posix_acl_valid()
crashes the kernel.

We add a check to properly handle the NULL return value.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/11989
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5150
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index c04eec5..f8ef5fe 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -481,6 +481,9 @@ static int mdc_unpack_acl(struct ptlrpc_request *req, struct lustre_md *md)
 		return -EPROTO;
 
 	acl = posix_acl_from_xattr(&init_user_ns, buf, body->aclsize);
+	if (acl == NULL)
+		return 0;
+
 	if (IS_ERR(acl)) {
 		rc = PTR_ERR(acl);
 		CERROR("convert xattr to acl: %d\n", rc);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 03/10] staging/lustre/ptlrpc: false alarm in AT network latency measuring
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
  2015-03-26  1:53 ` [PATCH 01/10] staging/lustre/osc: shorten IO calling path green
  2015-03-26  1:53 ` [PATCH 02/10] staging/lustre/mdc: Handle empty but non-zero acl xattr green
@ 2015-03-26  1:53 ` green
  2015-03-26  1:53 ` [PATCH 04/10] staging/lustre/mgc: check the import stat for lprocfs green
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Liang Zhen, Oleg Drokin

From: Liang Zhen <liang.zhen@intel.com>

If early reply of client RPC is lost and client RPC is expired and
resent, server will drop the resent RPC because it's already in
processing, server may also send reply or early reply to client,
which can still match reply buffer of the original request.
In this case, client is measuring time from resent time, but server
is reporting service time of original RPC, which is longer than
the time measured by client.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-on: http://review.whamcloud.com/12855
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5545
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/ptlrpc/client.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 0fe6a64..0357f1d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -285,14 +285,27 @@ static void ptlrpc_at_adj_net_latency(struct ptlrpc_request *req,
 	time_t now = get_seconds();
 
 	LASSERT(req->rq_import);
-	at = &req->rq_import->imp_at;
+
+	if (service_time > now - req->rq_sent + 3) {
+		/* bz16408, however, this can also happen if early reply
+		 * is lost and client RPC is expired and resent, early reply
+		 * or reply of original RPC can still be fit in reply buffer
+		 * of resent RPC, now client is measuring time from the
+		 * resent time, but server sent back service time of original
+		 * RPC.
+		 */
+		CDEBUG((lustre_msg_get_flags(req->rq_reqmsg) & MSG_RESENT) ?
+		       D_ADAPTTO : D_WARNING,
+		       "Reported service time %u > total measured time "
+		       CFS_DURATION_T"\n", service_time,
+		       cfs_time_sub(now, req->rq_sent));
+		return;
+	}
 
 	/* Network latency is total time less server processing time */
-	nl = max_t(int, now - req->rq_sent - service_time, 0) + 1/*st rounding*/;
-	if (service_time > now - req->rq_sent + 3 /* bz16408 */)
-		CWARN("Reported service time %u > total measured time "
-		      CFS_DURATION_T"\n", service_time,
-		      cfs_time_sub(now, req->rq_sent));
+	nl = max_t(int, now - req->rq_sent -
+			service_time, 0) + 1; /* st rounding */
+	at = &req->rq_import->imp_at;
 
 	oldnl = at_measured(&at->iat_net_latency, nl);
 	if (oldnl != 0)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 04/10] staging/lustre/mgc: check the import stat for lprocfs
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
                   ` (2 preceding siblings ...)
  2015-03-26  1:53 ` [PATCH 03/10] staging/lustre/ptlrpc: false alarm in AT network latency measuring green
@ 2015-03-26  1:53 ` green
  2015-03-26  1:53 ` [PATCH 05/10] staging/lustre/mgc: detach MGC dev on error green
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Hongchao Zhang, Oleg Drokin

From: Hongchao Zhang <hongchao.zhang@intel.com>

in lprocfs_mgc_rd_ir_state, the import state should be checked
the validity before doing further work.

Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/12896
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5650
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/mgc/mgc_request.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 0375bfe..3f00e77 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -452,10 +452,14 @@ static int config_log_end(char *logname, struct config_llog_instance *cfg)
 int lprocfs_mgc_rd_ir_state(struct seq_file *m, void *data)
 {
 	struct obd_device       *obd = data;
-	struct obd_import       *imp = obd->u.cli.cl_import;
-	struct obd_connect_data *ocd = &imp->imp_connect_data;
+	struct obd_import       *imp;
+	struct obd_connect_data *ocd;
 	struct config_llog_data *cld;
 
+	LPROCFS_CLIMP_CHECK(obd);
+	imp = obd->u.cli.cl_import;
+	ocd = &imp->imp_connect_data;
+
 	seq_printf(m, "imperative_recovery: %s\n",
 		      OCD_HAS_FLAG(ocd, IMP_RECOV) ? "ENABLED" : "DISABLED");
 	seq_printf(m, "client_state:\n");
@@ -470,6 +474,7 @@ int lprocfs_mgc_rd_ir_state(struct seq_file *m, void *data)
 	}
 	spin_unlock(&config_list_lock);
 
+	LPROCFS_CLIMP_EXIT(obd);
 	return 0;
 }
 #endif
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 05/10] staging/lustre/mgc: detach MGC dev on error
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
                   ` (3 preceding siblings ...)
  2015-03-26  1:53 ` [PATCH 04/10] staging/lustre/mgc: check the import stat for lprocfs green
@ 2015-03-26  1:53 ` green
  2015-03-26  1:53 ` [PATCH 06/10] staging/lustre/lov: don't crash accessing LOV object with FID{0,0} green
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Bobi Jam, Oleg Drokin

From: Bobi Jam <bobijam.xu@intel.com>

lustre_start_mgc() creates MGC device, if error happens later on
ll_fill_super(), this device is still attached, and later mount
fails by keep complaining that the MGC device's already in the
client node.

It turns out that the device was referenced by mgc config llog data
which is arranged in the mgc lock requeue thread re-trying to get its
mgc lock, and in normal case, this llog reference only released in
mgc_blocking_ast() when the system is umount.

This patch make mgc_precleanup() to wake up requeue thread to handle
the config llog data.

This patch also makes mgc_setup() wait for mgc_requeue_thread() start
before moving on.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Reviewed-on: http://review.whamcloud.com/11765
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4943
Reviewed-by: Ryan Haasken <haasken@cray.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/mgc/mgc_request.c | 36 +++++++++++++++++--------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 3f00e77..8496d25 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -484,9 +484,11 @@ int lprocfs_mgc_rd_ir_state(struct seq_file *m, void *data)
 #define RQ_NOW     0x2
 #define RQ_LATER   0x4
 #define RQ_STOP    0x8
+#define RQ_PRECLEANUP  0x10
 static int rq_state;
 static wait_queue_head_t	    rq_waitq;
 static DECLARE_COMPLETION(rq_exit);
+static DECLARE_COMPLETION(rq_start);
 
 static void do_requeue(struct config_llog_data *cld)
 {
@@ -515,6 +517,8 @@ static void do_requeue(struct config_llog_data *cld)
 
 static int mgc_requeue_thread(void *data)
 {
+	bool first = true;
+
 	CDEBUG(D_MGC, "Starting requeue thread\n");
 
 	/* Keep trying failed locks periodically */
@@ -531,13 +535,19 @@ static int mgc_requeue_thread(void *data)
 		rq_state &= ~(RQ_NOW | RQ_LATER);
 		spin_unlock(&config_list_lock);
 
+		if (first) {
+			first = false;
+			complete(&rq_start);
+		}
+
 		/* Always wait a few seconds to allow the server who
 		   caused the lock revocation to finish its setup, plus some
 		   random so everyone doesn't try to reconnect at once. */
 		to = MGC_TIMEOUT_MIN_SECONDS * HZ;
 		to += rand * HZ / 100; /* rand is centi-seconds */
 		lwi = LWI_TIMEOUT(to, NULL, NULL);
-		l_wait_event(rq_waitq, rq_state & RQ_STOP, &lwi);
+		l_wait_event(rq_waitq, rq_state & (RQ_STOP | RQ_PRECLEANUP),
+			     &lwi);
 
 		/*
 		 * iterate & processing through the list. for each cld, process
@@ -550,6 +560,7 @@ static int mgc_requeue_thread(void *data)
 		cld_prev = NULL;
 
 		spin_lock(&config_list_lock);
+		rq_state &= ~RQ_PRECLEANUP;
 		list_for_each_entry(cld, &config_llog_list,
 					cld_list_chain) {
 			if (!cld->cld_lostlock)
@@ -666,24 +677,26 @@ static atomic_t mgc_count = ATOMIC_INIT(0);
 static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 {
 	int rc = 0;
+	int temp;
 
 	switch (stage) {
 	case OBD_CLEANUP_EARLY:
 		break;
 	case OBD_CLEANUP_EXPORTS:
 		if (atomic_dec_and_test(&mgc_count)) {
-			int running;
+			LASSERT(rq_state & RQ_RUNNING);
 			/* stop requeue thread */
-			spin_lock(&config_list_lock);
-			running = rq_state & RQ_RUNNING;
-			if (running)
-				rq_state |= RQ_STOP;
-			spin_unlock(&config_list_lock);
-			if (running) {
-				wake_up(&rq_waitq);
-				wait_for_completion(&rq_exit);
-			}
+			temp = RQ_STOP;
+		} else {
+			/* wakeup requeue thread to clean our cld */
+			temp = RQ_NOW | RQ_PRECLEANUP;
 		}
+		spin_lock(&config_list_lock);
+		rq_state |= temp;
+		spin_unlock(&config_list_lock);
+		wake_up(&rq_waitq);
+		if (temp & RQ_STOP)
+			wait_for_completion(&rq_exit);
 		obd_cleanup_client_import(obd);
 		rc = mgc_llog_fini(NULL, obd);
 		if (rc != 0)
@@ -742,6 +755,7 @@ static int mgc_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 		}
 		/* rc is the task_struct pointer of mgc_requeue_thread. */
 		rc = 0;
+		wait_for_completion(&rq_start);
 	}
 
 	return rc;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 06/10] staging/lustre/lov: don't crash accessing LOV object with FID{0,0}
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
                   ` (4 preceding siblings ...)
  2015-03-26  1:53 ` [PATCH 05/10] staging/lustre/mgc: detach MGC dev on error green
@ 2015-03-26  1:53 ` green
  2015-03-26  1:53 ` [PATCH 07/10] staging/lustre/ptlrpc: fix import state during replay green
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Yang Sheng, Fan Yong, Oleg Drokin

From: Yang Sheng <yang.sheng@intel.com>

Some object maybe has a corrupted LOV EA or a hole in
LOV EA. We should not crash client in such case.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-on: http://review.whamcloud.com/12740
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4958
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/lov/lov_ea.c       |  6 +++++
 drivers/staging/lustre/lustre/lov/lov_internal.h | 12 +++++++++
 drivers/staging/lustre/lustre/lov/lov_io.c       | 12 +++++++++
 drivers/staging/lustre/lustre/lov/lov_lock.c     | 26 ++++++++++++++-----
 drivers/staging/lustre/lustre/lov/lov_obd.c      | 32 +++++++++++++++++++-----
 drivers/staging/lustre/lustre/lov/lov_object.c   |  7 ++++++
 drivers/staging/lustre/lustre/lov/lov_request.c  |  9 +++++++
 7 files changed, 92 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index d1b0417..2bcfaea 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -227,6 +227,9 @@ static int lsm_unpackmd_v1(struct lov_obd *lov, struct lov_stripe_md *lsm,
 		ostid_le_to_cpu(&lmm->lmm_objects[i].l_ost_oi, &loi->loi_oi);
 		loi->loi_ost_idx = le32_to_cpu(lmm->lmm_objects[i].l_ost_idx);
 		loi->loi_ost_gen = le32_to_cpu(lmm->lmm_objects[i].l_ost_gen);
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
 		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
 			CERROR("OST index %d more than OST count %d\n",
 			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
@@ -314,6 +317,9 @@ static int lsm_unpackmd_v3(struct lov_obd *lov, struct lov_stripe_md *lsm,
 		ostid_le_to_cpu(&lmm->lmm_objects[i].l_ost_oi, &loi->loi_oi);
 		loi->loi_ost_idx = le32_to_cpu(lmm->lmm_objects[i].l_ost_idx);
 		loi->loi_ost_gen = le32_to_cpu(lmm->lmm_objects[i].l_ost_gen);
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
 		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
 			CERROR("OST index %d more than OST count %d\n",
 			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 8c8508b..b644acc 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -304,4 +304,16 @@ static inline struct lov_stripe_md *lsm_addref(struct lov_stripe_md *lsm)
 	return lsm;
 }
 
+static inline bool lov_oinfo_is_dummy(const struct lov_oinfo *loi)
+{
+	if (unlikely(loi->loi_oi.oi.oi_id == 0 &&
+		     loi->loi_oi.oi.oi_seq == 0 &&
+		     loi->loi_ost_idx == 0 &&
+		     loi->loi_ost_gen == 0))
+		return true;
+
+	return false;
+}
+
+
 #endif
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index f1f6db3..80c2ef6 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -148,6 +148,9 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 	LASSERT(sub->sub_env == NULL);
 	LASSERT(sub->sub_stripe < lio->lis_stripe_count);
 
+	if (unlikely(lov_r0(lov)->lo_sub[stripe] == NULL))
+		return -EIO;
+
 	result = 0;
 	sub->sub_io_initialized = 0;
 	sub->sub_borrowed = 0;
@@ -391,6 +394,15 @@ static int lov_io_iter_init(const struct lu_env *env,
 					   endpos, &start, &end))
 			continue;
 
+		if (unlikely(lov_r0(lio->lis_object)->lo_sub[stripe] == NULL)) {
+			if (ios->cis_io->ci_type == CIT_READ ||
+			    ios->cis_io->ci_type == CIT_WRITE ||
+			    ios->cis_io->ci_type == CIT_FAULT)
+				return -EIO;
+
+			continue;
+		}
+
 		end = lov_offset_mod(end, +1);
 		sub = lov_sub_get(env, lio, stripe);
 		if (!IS_ERR(sub)) {
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index 49e6942..f2eca56 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -308,7 +308,8 @@ static int lov_lock_sub_init(const struct lu_env *env,
 		 * XXX for wide striping smarter algorithm is desirable,
 		 * breaking out of the loop, early.
 		 */
-		if (lov_stripe_intersects(loo->lo_lsm, i,
+		if (likely(r0->lo_sub[i] != NULL) &&
+		    lov_stripe_intersects(loo->lo_lsm, i,
 					  file_start, file_end, &start, &end))
 			nr++;
 	}
@@ -326,7 +327,8 @@ static int lov_lock_sub_init(const struct lu_env *env,
 	 * top-lock.
 	 */
 	for (i = 0, nr = 0; i < r0->lo_nr; ++i) {
-		if (lov_stripe_intersects(loo->lo_lsm, i,
+		if (likely(r0->lo_sub[i] != NULL) &&
+		    lov_stripe_intersects(loo->lo_lsm, i,
 					  file_start, file_end, &start, &end)) {
 			struct cl_lock_descr *descr;
 
@@ -914,10 +916,22 @@ static int lov_lock_stripe_is_matching(const struct lu_env *env,
 	 */
 	start = cl_offset(&lov->lo_cl, descr->cld_start);
 	end   = cl_offset(&lov->lo_cl, descr->cld_end + 1) - 1;
-	result = end - start <= lsm->lsm_stripe_size &&
-		 stripe == lov_stripe_number(lsm, start) &&
-		 stripe == lov_stripe_number(lsm, end);
-	if (result) {
+	result = 0;
+	/* glimpse should work on the object with LOV EA hole. */
+	if (end - start <= lsm->lsm_stripe_size) {
+		int idx;
+
+		idx = lov_stripe_number(lsm, start);
+		if (idx == stripe ||
+		    unlikely(lov_r0(lov)->lo_sub[idx] == NULL)) {
+			idx = lov_stripe_number(lsm, end);
+			if (idx == stripe ||
+			    unlikely(lov_r0(lov)->lo_sub[idx] == NULL))
+				result = 1;
+		}
+	}
+
+	if (result != 0) {
 		struct cl_lock_descr *subd = &lov_env_info(env)->lti_ldescr;
 		u64 sub_start;
 		u64 sub_end;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index b2ed52f..0278157 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1011,9 +1011,13 @@ static int lov_recreate(struct obd_export *exp, struct obdo *src_oa,
 	}
 
 	for (i = 0; i < lsm->lsm_stripe_count; i++) {
-		if (lsm->lsm_oinfo[i]->loi_ost_idx == ost_idx) {
-			if (ostid_id(&lsm->lsm_oinfo[i]->loi_oi) !=
-					ostid_id(&src_oa->o_oi)) {
+		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
+
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
+		if (loi->loi_ost_idx == ost_idx) {
+			if (ostid_id(&loi->loi_oi) != ostid_id(&src_oa->o_oi)) {
 				rc = -EINVAL;
 				goto out;
 			}
@@ -1305,10 +1309,14 @@ static int lov_find_cbdata(struct obd_export *exp,
 		struct lov_stripe_md submd;
 		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
 
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
 		if (!lov->lov_tgts[loi->loi_ost_idx]) {
-			CDEBUG(D_HA, "lov idx %d NULL \n", loi->loi_ost_idx);
+			CDEBUG(D_HA, "lov idx %d NULL\n", loi->loi_ost_idx);
 			continue;
 		}
+
 		submd.lsm_oi = loi->loi_oi;
 		submd.lsm_stripe_count = 0;
 		rc = obd_find_cbdata(lov->lov_tgts[loi->loi_ost_idx]->ltd_exp,
@@ -1616,8 +1624,12 @@ static u64 fiemap_calc_fm_end_offset(struct ll_user_fiemap *fiemap,
 
 	/* Find out stripe_no from ost_index saved in the fe_device */
 	for (i = 0; i < lsm->lsm_stripe_count; i++) {
-		if (lsm->lsm_oinfo[i]->loi_ost_idx ==
-					fiemap->fm_extents[0].fe_device) {
+		struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
+
+		if (lov_oinfo_is_dummy(oinfo))
+			continue;
+
+		if (oinfo->loi_ost_idx == fiemap->fm_extents[0].fe_device) {
 			stripe_no = i;
 			break;
 		}
@@ -1795,6 +1807,11 @@ static int lov_fiemap(struct lov_obd *lov, __u32 keylen, void *key,
 					   &lun_start, &obd_object_end)) == 0)
 			continue;
 
+		if (lov_oinfo_is_dummy(lsm->lsm_oinfo[cur_stripe])) {
+			rc = -EIO;
+			goto out;
+		}
+
 		/* If this is a continuation FIEMAP call and we are on
 		 * starting stripe then lun_start needs to be set to
 		 * fm_end_offset */
@@ -1985,6 +2002,9 @@ static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
 		 * be NULL and won't match the lock's export. */
 		for (i = 0; i < lsm->lsm_stripe_count; i++) {
 			loi = lsm->lsm_oinfo[i];
+			if (lov_oinfo_is_dummy(loi))
+				continue;
+
 			if (!lov->lov_tgts[loi->loi_ost_idx])
 				continue;
 			if (lov->lov_tgts[loi->loi_ost_idx]->ltd_exp ==
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index d69a035..1fb1dfa 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -230,6 +230,9 @@ static int lov_init_raid0(const struct lu_env *env,
 			struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
 			int ost_idx = oinfo->loi_ost_idx;
 
+			if (lov_oinfo_is_dummy(oinfo))
+				continue;
+
 			result = ostid_to_fid(ofid, &oinfo->loi_oi,
 					      oinfo->loi_ost_idx);
 			if (result != 0)
@@ -973,6 +976,10 @@ int lov_read_and_clear_async_rc(struct cl_object *clob)
 			LASSERT(lsm != NULL);
 			for (i = 0; i < lsm->lsm_stripe_count; i++) {
 				struct lov_oinfo *loi = lsm->lsm_oinfo[i];
+
+				if (lov_oinfo_is_dummy(loi))
+					continue;
+
 				if (loi->loi_ar.ar_rc && !rc)
 					rc = loi->loi_ar.ar_rc;
 				loi->loi_ar.ar_rc = 0;
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 7358b9d..933e2d1 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -299,6 +299,9 @@ int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
 		struct lov_request *req;
 
 		loi = oinfo->oi_md->lsm_oinfo[i];
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
 		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
 			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
 			if (oinfo->oi_oa->o_valid & OBD_MD_FLEPOCH) {
@@ -384,6 +387,9 @@ int lov_prep_destroy_set(struct obd_export *exp, struct obd_info *oinfo,
 		struct lov_request *req;
 
 		loi = lsm->lsm_oinfo[i];
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
 		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
 			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
 			continue;
@@ -497,6 +503,9 @@ int lov_prep_setattr_set(struct obd_export *exp, struct obd_info *oinfo,
 		struct lov_oinfo *loi = oinfo->oi_md->lsm_oinfo[i];
 		struct lov_request *req;
 
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
 		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
 			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
 			continue;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 07/10] staging/lustre/ptlrpc: fix import state during replay
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
                   ` (5 preceding siblings ...)
  2015-03-26  1:53 ` [PATCH 06/10] staging/lustre/lov: don't crash accessing LOV object with FID{0,0} green
@ 2015-03-26  1:53 ` green
  2015-03-26  2:07   ` [PATCH v2 " green
  2015-03-26  1:53 ` [PATCH 08/10] staging/lustre/llite: glimpse the inode before doing fiemap green
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Andriy Skulysh, Bob Glossman, Oleg Drokin

From: Andriy Skulysh <Andriy_Skulysh@xyratex.com>

Client doesn't restore import state correctly
on reconnect during replay. It resends lock replay
when final ping was queued by server.
Server fails with "target_queue_recovery_request())
ASSERTION( req->rq_export->exp_lock_replay_needed ) failed"

Add imp_replay_state to store last replay state.
imp_state is restored from imp_replay_state
during reconnect.

Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Xyratex-bug-id: MRP-2022
Reviewed-on: http://review.whamcloud.com/12163
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5651
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/include/lustre_import.h |  2 ++
 drivers/staging/lustre/lustre/ptlrpc/import.c         | 15 ++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_import.h b/drivers/staging/lustre/lustre/include/lustre_import.h
index 51f3e98..dcc8076 100644
--- a/drivers/staging/lustre/lustre/include/lustre_import.h
+++ b/drivers/staging/lustre/lustre/include/lustre_import.h
@@ -218,6 +218,8 @@ struct obd_import {
 	atomic_t	      imp_timeouts;
 	/** Current import state */
 	enum lustre_imp_state     imp_state;
+	/** Last replay state */
+	enum lustre_imp_state	  imp_replay_state;
 	/** History of import states */
 	struct import_state_hist  imp_state_hist[IMP_STATE_HIST_LEN];
 	int		       imp_state_hist_idx;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index 4ceb90d..d637c65 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -63,6 +63,19 @@ struct ptlrpc_connect_async_args {
 static void __import_set_state(struct obd_import *imp,
 			       enum lustre_imp_state state)
 {
+	switch (state) {
+	case LUSTRE_IMP_CLOSED:
+	case LUSTRE_IMP_NEW:
+	case LUSTRE_IMP_DISCON:
+	case LUSTRE_IMP_CONNECTING:
+		break;
+	case LUSTRE_IMP_REPLAY_WAIT:
+		imp->imp_replay_state = LUSTRE_IMP_REPLAY_LOCKS;
+		break;
+	default:
+		imp->imp_replay_state = LUSTRE_IMP_REPLAY;
+       }
+
 	imp->imp_state = state;
 	imp->imp_state_hist[imp->imp_state_hist_idx].ish_state = state;
 	imp->imp_state_hist[imp->imp_state_hist_idx].ish_time =
@@ -966,7 +979,7 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
 			imp->imp_resend_replay = 1;
 			spin_unlock(&imp->imp_lock);
 
-			IMPORT_SET_STATE(imp, LUSTRE_IMP_REPLAY);
+			IMPORT_SET_STATE(imp, imp->imp_replay_state);
 		} else {
 			IMPORT_SET_STATE(imp, LUSTRE_IMP_RECOVER);
 		}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 08/10] staging/lustre/llite: glimpse the inode before doing fiemap
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
                   ` (6 preceding siblings ...)
  2015-03-26  1:53 ` [PATCH 07/10] staging/lustre/ptlrpc: fix import state during replay green
@ 2015-03-26  1:53 ` green
  2015-03-26  1:53 ` [PATCH 09/10] staging/lustre: update timestamps after buiding rpc green
  2015-03-26  1:53 ` [PATCH 10/10] staging/lustre/xattr: xattr data may be gone with lock held green
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Li Dongyang, Bob Glossman, Oleg Drokin

From: Li Dongyang <dongyang.li@anu.edu.au>

For a new inode, the i_size is 0 until a stat, which will yield
an empty fiemap result.
Fix the issue by glimpsing the size before doing fiemap.

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-on: http://review.whamcloud.com/13439
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6091
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/llite/file.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 6dab3ba..85e74d1 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1712,6 +1712,12 @@ static int ll_do_fiemap(struct inode *inode, struct ll_user_fiemap *fiemap,
 	fm_key.oa.o_oi = lsm->lsm_oi;
 	fm_key.oa.o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
 
+	if (i_size_read(inode) == 0) {
+		rc = ll_glimpse_size(inode);
+		if (rc)
+			goto out;
+	}
+
 	obdo_from_inode(&fm_key.oa, inode, OBD_MD_FLSIZE);
 	obdo_set_parent_fid(&fm_key.oa, &ll_i2info(inode)->lli_fid);
 	/* If filesize is 0, then there would be no objects for mapping */
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 09/10] staging/lustre: update timestamps after buiding rpc
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
                   ` (7 preceding siblings ...)
  2015-03-26  1:53 ` [PATCH 08/10] staging/lustre/llite: glimpse the inode before doing fiemap green
@ 2015-03-26  1:53 ` green
  2015-03-26  1:53 ` [PATCH 10/10] staging/lustre/xattr: xattr data may be gone with lock held green
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Niu Yawei, Oleg Drokin

From: Niu Yawei <yawei.niu@intel.com>

The mtime/atime/ctime in the write RPC has to be updated after
the RPC is built (where xid is generated), otherwise, it could
race with the setattr and updating wrong timestamps on OST side.

Seems this regression was introduced when landing clio code.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-on: http://review.whamcloud.com/13261
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5951
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/osc/osc_request.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 5e27e0a..d7a9b65 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -1880,6 +1880,7 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
 	int				page_count = 0;
 	int				i;
 	int				rc;
+	struct ost_body			*body;
 	LIST_HEAD(rpc_list);
 
 	LASSERT(!list_empty(ext_list));
@@ -1981,6 +1982,8 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
 	 * later setattr before earlier BRW (as determined by the request xid),
 	 * the OST will not use BRW timestamps.  Sadly, there is no obvious
 	 * way to do this in a single call.  bug 10150 */
+	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
+	crattr->cra_oa = &body->oa;
 	cl_req_attr_set(env, clerq, crattr,
 			OBD_MD_FLMTIME|OBD_MD_FLCTIME|OBD_MD_FLATIME);
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 10/10] staging/lustre/xattr: xattr data may be gone with lock held
  2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
                   ` (8 preceding siblings ...)
  2015-03-26  1:53 ` [PATCH 09/10] staging/lustre: update timestamps after buiding rpc green
@ 2015-03-26  1:53 ` green
  9 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  1:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Lai Siyao, Oleg Drokin

From: Lai Siyao <lai.siyao@intel.com>

Xattr cached data may be gone, but lock still held, in this case,
refetch xattr from server, otherwise client will return error.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-on: http://review.whamcloud.com/12952
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/llite/xattr_cache.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/xattr_cache.c b/drivers/staging/lustre/lustre/llite/xattr_cache.c
index da190f9..69ea92a 100644
--- a/drivers/staging/lustre/lustre/llite/xattr_cache.c
+++ b/drivers/staging/lustre/lustre/llite/xattr_cache.c
@@ -295,13 +295,18 @@ static int ll_xattr_find_get_lock(struct inode *inode,
 
 
 	mutex_lock(&lli->lli_xattrs_enq_lock);
-	/* Try matching first. */
-	mode = ll_take_md_lock(inode, MDS_INODELOCK_XATTR, &lockh, 0, LCK_PR);
-	if (mode != 0) {
-		/* fake oit in mdc_revalidate_lock() manner */
-		oit->d.lustre.it_lock_handle = lockh.cookie;
-		oit->d.lustre.it_lock_mode = mode;
-		goto out;
+	/* inode may have been shrunk and recreated, so data is gone, match lock
+	 * only when data exists. */
+	if (ll_xattr_cache_valid(lli)) {
+		/* Try matching first. */
+		mode = ll_take_md_lock(inode, MDS_INODELOCK_XATTR, &lockh, 0,
+				       LCK_PR);
+		if (mode != 0) {
+			/* fake oit in mdc_revalidate_lock() manner */
+			oit->d.lustre.it_lock_handle = lockh.cookie;
+			oit->d.lustre.it_lock_mode = mode;
+			goto out;
+		}
 	}
 
 	/* Enqueue if the lock isn't cached locally. */
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 01/10] staging/lustre/osc: shorten IO calling path
  2015-03-26  1:53 ` [PATCH 01/10] staging/lustre/osc: shorten IO calling path green
@ 2015-03-26  2:04   ` green
  2015-03-26 10:09     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 14+ messages in thread
From: green @ 2015-03-26  2:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Bobi Jam, Bob Glossman, Oleg Drokin

From: Bobi Jam <bobijam.xu@intel.com>

By using osc_io_unplug_aync() for osc_queue_sync_pages() to shorten
the IO calling path, to reduce the chance of stack overflow.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: http://review.whamcloud.com/11612
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3188
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
Had one signed-off-by borrked in the original one somehow
 drivers/staging/lustre/lustre/osc/osc_cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 7022ed4..d44b3d4 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -2613,7 +2613,7 @@ int osc_queue_sync_pages(const struct lu_env *env, struct osc_object *obj,
 	}
 	osc_object_unlock(obj);
 
-	osc_io_unplug(env, cli, obj, PDL_POLICY_ROUND);
+	osc_io_unplug_async(env, cli, obj);
 	return 0;
 }
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 07/10] staging/lustre/ptlrpc: fix import state during replay
  2015-03-26  1:53 ` [PATCH 07/10] staging/lustre/ptlrpc: fix import state during replay green
@ 2015-03-26  2:07   ` green
  0 siblings, 0 replies; 14+ messages in thread
From: green @ 2015-03-26  2:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger
  Cc: Linux Kernel Mailing List, Andriy Skulysh, Bob Glossman, Oleg Drokin

From: Andriy Skulysh <Andriy_Skulysh@xyratex.com>

Client doesn't restore import state correctly
on reconnect during replay. It resends lock replay
when final ping was queued by server.
Server fails with "target_queue_recovery_request())
ASSERTION( req->rq_export->exp_lock_replay_needed ) failed"

Add imp_replay_state to store last replay state.
imp_state is restored from imp_replay_state
during reconnect.

Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Xyratex-bug-id: MRP-2022
Reviewed-on: http://review.whamcloud.com/12163
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5651
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
Preious version had a borked whitespace on one line that I noticed too late

 drivers/staging/lustre/lustre/include/lustre_import.h |  2 ++
 drivers/staging/lustre/lustre/ptlrpc/import.c         | 15 ++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_import.h b/drivers/staging/lustre/lustre/include/lustre_import.h
index 51f3e98..dcc8076 100644
--- a/drivers/staging/lustre/lustre/include/lustre_import.h
+++ b/drivers/staging/lustre/lustre/include/lustre_import.h
@@ -218,6 +218,8 @@ struct obd_import {
 	atomic_t	      imp_timeouts;
 	/** Current import state */
 	enum lustre_imp_state     imp_state;
+	/** Last replay state */
+	enum lustre_imp_state	  imp_replay_state;
 	/** History of import states */
 	struct import_state_hist  imp_state_hist[IMP_STATE_HIST_LEN];
 	int		       imp_state_hist_idx;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index 4ceb90d..d5fc689 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -63,6 +63,19 @@ struct ptlrpc_connect_async_args {
 static void __import_set_state(struct obd_import *imp,
 			       enum lustre_imp_state state)
 {
+	switch (state) {
+	case LUSTRE_IMP_CLOSED:
+	case LUSTRE_IMP_NEW:
+	case LUSTRE_IMP_DISCON:
+	case LUSTRE_IMP_CONNECTING:
+		break;
+	case LUSTRE_IMP_REPLAY_WAIT:
+		imp->imp_replay_state = LUSTRE_IMP_REPLAY_LOCKS;
+		break;
+	default:
+		imp->imp_replay_state = LUSTRE_IMP_REPLAY;
+	}
+
 	imp->imp_state = state;
 	imp->imp_state_hist[imp->imp_state_hist_idx].ish_state = state;
 	imp->imp_state_hist[imp->imp_state_hist_idx].ish_time =
@@ -966,7 +979,7 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
 			imp->imp_resend_replay = 1;
 			spin_unlock(&imp->imp_lock);
 
-			IMPORT_SET_STATE(imp, LUSTRE_IMP_REPLAY);
+			IMPORT_SET_STATE(imp, imp->imp_replay_state);
 		} else {
 			IMPORT_SET_STATE(imp, LUSTRE_IMP_RECOVER);
 		}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 01/10] staging/lustre/osc: shorten IO calling path
  2015-03-26  2:04   ` [PATCH v2 " green
@ 2015-03-26 10:09     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 14+ messages in thread
From: Greg Kroah-Hartman @ 2015-03-26 10:09 UTC (permalink / raw)
  To: green
  Cc: devel, Andreas Dilger, Bob Glossman, Oleg Drokin, Bobi Jam,
	Linux Kernel Mailing List

On Wed, Mar 25, 2015 at 10:04:53PM -0400, green@linuxhacker.ru wrote:
> From: Bobi Jam <bobijam.xu@intel.com>

Meta-comment:  In the future, when doing a v2 patch, put the "v2" after
the patch number, so I can properly sort things in my email client.

In other words, the subject here should be:
	[PATCH 01/10 v2]

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-03-26 10:09 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-26  1:53 [PATCH 00/10] Lustre fixes green
2015-03-26  1:53 ` [PATCH 01/10] staging/lustre/osc: shorten IO calling path green
2015-03-26  2:04   ` [PATCH v2 " green
2015-03-26 10:09     ` Greg Kroah-Hartman
2015-03-26  1:53 ` [PATCH 02/10] staging/lustre/mdc: Handle empty but non-zero acl xattr green
2015-03-26  1:53 ` [PATCH 03/10] staging/lustre/ptlrpc: false alarm in AT network latency measuring green
2015-03-26  1:53 ` [PATCH 04/10] staging/lustre/mgc: check the import stat for lprocfs green
2015-03-26  1:53 ` [PATCH 05/10] staging/lustre/mgc: detach MGC dev on error green
2015-03-26  1:53 ` [PATCH 06/10] staging/lustre/lov: don't crash accessing LOV object with FID{0,0} green
2015-03-26  1:53 ` [PATCH 07/10] staging/lustre/ptlrpc: fix import state during replay green
2015-03-26  2:07   ` [PATCH v2 " green
2015-03-26  1:53 ` [PATCH 08/10] staging/lustre/llite: glimpse the inode before doing fiemap green
2015-03-26  1:53 ` [PATCH 09/10] staging/lustre: update timestamps after buiding rpc green
2015-03-26  1:53 ` [PATCH 10/10] staging/lustre/xattr: xattr data may be gone with lock held green

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.