linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/16] staging/lustre: sync with external tree, set 2
@ 2013-11-26  2:04 Peng Tao
  2013-11-26  2:04 ` [PATCH 01/16] staging/lustre/server: use unified request handler for MGS Peng Tao
                   ` (15 more replies)
  0 siblings, 16 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Peng Tao, Andreas Dilger

Hi Greg,

Following 16 patches were ported from Lustre external tree. Please see if
it is ok to apply them.

Thanks,
Tao

Cc: Andreas Dilger <andreas.dilger@intel.com>

Amir Shehata (1):
  staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer

Andrew Perepechko (1):
  staging/lustre/llite: extended attribute cache

Andriy Skulysh (1):
  staging/lustre/ptlrpc: Fix race during exp_flock_hash creation

Dmitry Eremin (2):
  staging/lustre/build: clean up unused variables and dead code
  staging/lustre/build: fix compilation issue with is_compat_task

Fan Yong (2):
  staging/lustre/scrub: OI scrub on OST
  staging/lustre/scrub: control OI scrub on OST from user space

JC Lafoucriere (2):
  staging/lustre/mdt: HSM coordinator client interface
  staging/lustre/mdt: HSM coordinator agent interface

Jinshan Xiong (1):
  staging/lustre/hsm: Add hsm_release feature.

John L. Hammond (3):
  staging/lustre/mdc: prevent fall through in mdc_iocontrol()
  staging/lustre/lu: shrink lu_object by 8 bytes on x86_64
  staging/lustre/llite: don't check for O_CREAT in it_create_mode

Mikhail Pershin (2):
  staging/lustre/server: use unified request handler for MGS
  staging/lustre/llog: MGC to use OSD API for backup logs

Patrick Farrell (1):
  staging/lustre/nfs: writing to new files will return ENOENT

 .../staging/lustre/include/linux/libcfs/libcfs.h   |    2 -
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 -
 drivers/staging/lustre/lnet/selftest/selftest.h    |    3 -
 drivers/staging/lustre/lnet/selftest/timer.c       |    6 +-
 drivers/staging/lustre/lustre/include/dt_object.h  |    2 +-
 .../lustre/lustre/include/linux/lustre_compat25.h  |    4 +-
 .../lustre/lustre/include/linux/lustre_lite.h      |    1 +
 drivers/staging/lustre/lustre/include/lu_object.h  |   19 -
 .../lustre/lustre/include/lustre/lustre_idl.h      |   44 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |   24 +-
 drivers/staging/lustre/lustre/include/lustre_ha.h  |    3 -
 drivers/staging/lustre/lustre/include/lustre_log.h |   13 +-
 drivers/staging/lustre/lustre/include/lustre_net.h |    2 -
 .../lustre/lustre/include/lustre_req_layout.h      |    7 +
 drivers/staging/lustre/lustre/include/md_object.h  |    4 +-
 drivers/staging/lustre/lustre/include/obd.h        |   15 +-
 .../staging/lustre/lustre/include/obd_support.h    |   10 +
 .../staging/lustre/lustre/lclient/lcommon_misc.c   |    4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   28 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    2 +
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |    8 +
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    3 -
 drivers/staging/lustre/lustre/libcfs/workitem.c    |   56 +-
 drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
 drivers/staging/lustre/lustre/llite/dcache.c       |   36 +-
 drivers/staging/lustre/lustre/llite/dir.c          |   24 +-
 drivers/staging/lustre/lustre/llite/file.c         |  105 +++-
 .../staging/lustre/lustre/llite/llite_internal.h   |   46 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   22 +-
 drivers/staging/lustre/lustre/llite/lproc_llite.c  |   36 ++
 drivers/staging/lustre/lustre/llite/namei.c        |   15 +-
 drivers/staging/lustre/lustre/llite/super25.c      |    4 +
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    2 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |  102 ++--
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  640 ++++++++++++++++++++
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |   16 +
 drivers/staging/lustre/lustre/lov/lov_object.c     |   35 +-
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    1 +
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   26 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |   63 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   54 +-
 drivers/staging/lustre/lustre/mgc/libmgc.c         |    3 -
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |  401 +++++++-----
 drivers/staging/lustre/lustre/obdclass/llog.c      |  212 ++++---
 .../staging/lustre/lustre/obdclass/local_storage.c |    9 +-
 .../staging/lustre/lustre/obdclass/local_storage.h |    3 +
 .../lustre/lustre/obdclass/lprocfs_status.c        |    4 -
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   26 +-
 drivers/staging/lustre/lustre/obdclass/md_attrs.c  |    4 +-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    3 +
 drivers/staging/lustre/lustre/ptlrpc/client.c      |   10 +-
 drivers/staging/lustre/lustre/ptlrpc/import.c      |    2 -
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   60 +-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |   16 +-
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    7 +
 drivers/staging/lustre/lustre/ptlrpc/pinger.c      |   65 --
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c     |   44 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |    4 +
 59 files changed, 1750 insertions(+), 618 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/llite/xattr_cache.c

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 01/16] staging/lustre/server: use unified request handler for MGS
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
@ 2013-11-26  2:04 ` Peng Tao
  2013-11-26  2:04 ` [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs Peng Tao
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Mikhail Pershin, Mikhail Pershin, Peng Tao, Andreas Dilger

From: Mikhail Pershin <tappro@whamcloud.com>

- Unify request handler. It finds target for particular request and
  calls appropriate handler for it. Generic handlers are moved to
  the unified target code. The tgt_session_info is introduced to
  store all request-related data and passed to all handlers.
- Pack reply in llog server functions early and use err_serious()
- remove obsoleted llog_origin_handle_cancel(), it is not used
  anymore
- remove push_ctxt/pop_ctxt from llog server function, it is based
  on OSD now.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2145
Lustre-change: http://review.whamcloud.com/4826
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre_req_layout.h      |    2 ++
 .../staging/lustre/lustre/include/obd_support.h    |    7 +++++++
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    7 ++++++-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c    |    4 ++--
 4 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index f4d3820..2832569 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -245,6 +245,8 @@ extern struct req_format RQF_LLOG_ORIGIN_HANDLE_PREV_BLOCK;
 extern struct req_format RQF_LLOG_ORIGIN_HANDLE_READ_HEADER;
 extern struct req_format RQF_LLOG_ORIGIN_CONNECT;
 
+extern struct req_format RQF_CONNECT;
+
 extern struct req_msg_field RMF_GENERIC_DATA;
 extern struct req_msg_field RMF_PTLRPC_BODY;
 extern struct req_msg_field RMF_MDT_BODY;
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 9697e7f..825a0c9 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -416,6 +416,13 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_MGC_PAUSE_PROCESS_LOG   0x903
 #define OBD_FAIL_MGS_PAUSE_REQ	   0x904
 #define OBD_FAIL_MGS_PAUSE_TARGET_REG    0x905
+#define OBD_FAIL_MGS_CONNECT_NET	 0x906
+#define OBD_FAIL_MGS_DISCONNECT_NET	 0x907
+#define OBD_FAIL_MGS_SET_INFO_NET	 0x908
+#define OBD_FAIL_MGS_EXCEPTION_NET	 0x909
+#define OBD_FAIL_MGS_TARGET_REG_NET	 0x90a
+#define OBD_FAIL_MGS_TARGET_DEL_NET	 0x90b
+#define OBD_FAIL_MGS_CONFIG_READ_NET	 0x90c
 
 #define OBD_FAIL_QUOTA_DQACQ_NET			0xA01
 #define OBD_FAIL_QUOTA_EDQUOT	    0xA02
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 504682d..4d6ac686 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -738,7 +738,8 @@ static struct req_format *req_formats[] = {
 	&RQF_LLOG_ORIGIN_HANDLE_NEXT_BLOCK,
 	&RQF_LLOG_ORIGIN_HANDLE_PREV_BLOCK,
 	&RQF_LLOG_ORIGIN_HANDLE_READ_HEADER,
-	&RQF_LLOG_ORIGIN_CONNECT
+	&RQF_LLOG_ORIGIN_CONNECT,
+	&RQF_CONNECT,
 };
 
 struct req_msg_field {
@@ -1504,6 +1505,10 @@ struct req_format RQF_LLOG_ORIGIN_CONNECT =
 	DEFINE_REQ_FMT0("LLOG_ORIGIN_CONNECT", llogd_conn_body_only, empty);
 EXPORT_SYMBOL(RQF_LLOG_ORIGIN_CONNECT);
 
+struct req_format RQF_CONNECT =
+	DEFINE_REQ_FMT0("CONNECT", obd_connect_client, obd_connect_server);
+EXPORT_SYMBOL(RQF_CONNECT);
+
 struct req_format RQF_OST_CONNECT =
 	DEFINE_REQ_FMT0("OST_CONNECT",
 			obd_connect_client, obd_connect_server);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c b/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c
index 65451f7..1be9786 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c
@@ -114,10 +114,10 @@ struct ll_rpc_opcode {
 	{ MGS_SET_INFO,     "mgs_set_info" },
 	{ MGS_CONFIG_READ,  "mgs_config_read" },
 	{ OBD_PING,	 "obd_ping" },
-	{ OBD_LOG_CANCEL,   "llog_origin_handle_cancel" },
+	{ OBD_LOG_CANCEL,	"llog_cancel" },
 	{ OBD_QC_CALLBACK,  "obd_quota_callback" },
 	{ OBD_IDX_READ,	    "dt_index_read" },
-	{ LLOG_ORIGIN_HANDLE_CREATE,     "llog_origin_handle_create" },
+	{ LLOG_ORIGIN_HANDLE_CREATE,	 "llog_origin_handle_open" },
 	{ LLOG_ORIGIN_HANDLE_NEXT_BLOCK, "llog_origin_handle_next_block" },
 	{ LLOG_ORIGIN_HANDLE_READ_HEADER,"llog_origin_handle_read_header" },
 	{ LLOG_ORIGIN_HANDLE_WRITE_REC,  "llog_origin_handle_write_rec" },
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
  2013-11-26  2:04 ` [PATCH 01/16] staging/lustre/server: use unified request handler for MGS Peng Tao
@ 2013-11-26  2:04 ` Peng Tao
  2013-11-26  3:14   ` Greg Kroah-Hartman
  2013-11-26  2:04 ` [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Mikhail Pershin, Peng Tao, Andreas Dilger

From: Mikhail Pershin <mike.pershin@intel.com>

MGC uses lvfs API to access local llogs blocking removal of old code

- MGS is converted to use OSD API for local llogs
- llog_is_empty() and llog_backup() are introduced
- initial OSD start initialize run ctxt so llog can use it too
- shrink dcache after initial scrub in osd-ldiskfs to avoid holding
  data of unlinked objects until umount.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2059
Lustre-change: http://review.whamcloud.com/5049
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/dt_object.h  |    2 +-
 drivers/staging/lustre/lustre/include/lustre_log.h |   13 +-
 drivers/staging/lustre/lustre/include/obd.h        |    4 +-
 drivers/staging/lustre/lustre/mgc/libmgc.c         |    3 -
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |  401 ++++++++++++--------
 drivers/staging/lustre/lustre/obdclass/llog.c      |  212 +++++++----
 .../staging/lustre/lustre/obdclass/local_storage.c |    9 +-
 .../staging/lustre/lustre/obdclass/local_storage.h |    3 +
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    2 +-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    3 +
 10 files changed, 408 insertions(+), 244 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/dt_object.h b/drivers/staging/lustre/lustre/include/dt_object.h
index e116bb2..9304c26 100644
--- a/drivers/staging/lustre/lustre/include/dt_object.h
+++ b/drivers/staging/lustre/lustre/include/dt_object.h
@@ -692,7 +692,7 @@ struct local_oid_storage {
 	struct dt_object *los_obj;
 
 	/* data used to generate new fids */
-	struct mutex	 los_id_lock;
+	struct mutex	  los_id_lock;
 	__u64		  los_seq;
 	__u32		  los_last_oid;
 };
diff --git a/drivers/staging/lustre/lustre/include/lustre_log.h b/drivers/staging/lustre/lustre/include/lustre_log.h
index 721aa05..896c757 100644
--- a/drivers/staging/lustre/lustre/include/lustre_log.h
+++ b/drivers/staging/lustre/lustre/include/lustre_log.h
@@ -136,7 +136,11 @@ int llog_open(const struct lu_env *env, struct llog_ctxt *ctxt,
 	      struct llog_handle **lgh, struct llog_logid *logid,
 	      char *name, enum llog_open_param open_param);
 int llog_close(const struct lu_env *env, struct llog_handle *cathandle);
-int llog_get_size(struct llog_handle *loghandle);
+int llog_is_empty(const struct lu_env *env, struct llog_ctxt *ctxt,
+		  char *name);
+int llog_backup(const struct lu_env *env, struct obd_device *obd,
+		struct llog_ctxt *ctxt, struct llog_ctxt *bak_ctxt,
+		char *name, char *backup);
 
 /* llog_process flags */
 #define LLOG_FLAG_NODEAMON 0x0001
@@ -382,6 +386,13 @@ static inline int llog_data_len(int len)
 	return cfs_size_round(len);
 }
 
+static inline int llog_get_size(struct llog_handle *loghandle)
+{
+	if (loghandle && loghandle->lgh_hdr)
+		return loghandle->lgh_hdr->llh_count;
+	return 0;
+}
+
 static inline struct llog_ctxt *llog_ctxt_get(struct llog_ctxt *ctxt)
 {
 	atomic_inc(&ctxt->loc_refcount);
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index d0aea15..f0c1773 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -399,8 +399,8 @@ struct client_obd {
 
 	/* mgc datastruct */
 	struct semaphore	 cl_mgc_sem;
-	struct vfsmount	 *cl_mgc_vfsmnt;
-	struct dentry	   *cl_mgc_configs_dir;
+	struct local_oid_storage *cl_mgc_los;
+	struct dt_object	*cl_mgc_configs_dir;
 	atomic_t	     cl_mgc_refcount;
 	struct obd_export       *cl_mgc_mgsexp;
 
diff --git a/drivers/staging/lustre/lustre/mgc/libmgc.c b/drivers/staging/lustre/lustre/mgc/libmgc.c
index 7b4947c..9b40c57 100644
--- a/drivers/staging/lustre/lustre/mgc/libmgc.c
+++ b/drivers/staging/lustre/lustre/mgc/libmgc.c
@@ -99,11 +99,8 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 
 static int mgc_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
 	int rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt == NULL);
-
 	ptlrpcd_decref();
 
 	rc = client_obd_cleanup(obd);
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 93b601d..bf4d8e9 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -41,17 +41,14 @@
 #define DEBUG_SUBSYSTEM S_MGC
 #define D_MGC D_CONFIG /*|D_WARNING*/
 
-# include <linux/module.h>
-# include <linux/pagemap.h>
-# include <linux/miscdevice.h>
-# include <linux/init.h>
-
+#include <linux/module.h>
 #include <obd_class.h>
 #include <lustre_dlm.h>
 #include <lprocfs_status.h>
 #include <lustre_log.h>
-#include <lustre_fsfilt.h>
 #include <lustre_disk.h>
+#include <dt_object.h>
+
 #include "mgc_internal.h"
 
 static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
@@ -578,97 +575,175 @@ static void mgc_requeue_add(struct config_llog_data *cld)
 }
 
 /********************** class fns **********************/
+static int mgc_local_llog_init(const struct lu_env *env,
+			       struct obd_device *obd,
+			       struct obd_device *disk)
+{
+	struct llog_ctxt	*ctxt;
+	int			 rc;
+
+	rc = llog_setup(env, obd, &obd->obd_olg, LLOG_CONFIG_ORIG_CTXT, disk,
+			&llog_osd_ops);
+	if (rc)
+		return rc;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
+	LASSERT(ctxt);
+	ctxt->loc_dir = obd->u.cli.cl_mgc_configs_dir;
+	llog_ctxt_put(ctxt);
 
-static int mgc_fs_setup(struct obd_device *obd, struct super_block *sb,
-			struct vfsmount *mnt)
+	return 0;
+}
+
+static int mgc_local_llog_fini(const struct lu_env *env,
+			       struct obd_device *obd)
 {
-	struct lvfs_run_ctxt saved;
-	struct lustre_sb_info *lsi = s2lsi(sb);
-	struct client_obd *cli = &obd->u.cli;
-	struct dentry *dentry;
-	char *label;
-	int err = 0;
+	struct llog_ctxt *ctxt;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
+	llog_cleanup(env, ctxt);
+
+	return 0;
+}
+
+static int mgc_fs_setup(struct obd_device *obd, struct super_block *sb)
+{
+	struct lustre_sb_info	*lsi = s2lsi(sb);
+	struct client_obd	*cli = &obd->u.cli;
+	struct lu_fid		 rfid, fid;
+	struct dt_object	*root, *dto;
+	struct lu_env		*env;
+	int			 rc = 0;
 
 	LASSERT(lsi);
-	LASSERT(lsi->lsi_srv_mnt == mnt);
+	LASSERT(lsi->lsi_dt_dev);
+
+	OBD_ALLOC_PTR(env);
+	if (env == NULL)
+		return -ENOMEM;
 
 	/* The mgc fs exclusion sem. Only one fs can be setup at a time. */
 	down(&cli->cl_mgc_sem);
 
 	cfs_cleanup_group_info();
 
-	obd->obd_fsops = fsfilt_get_ops(lsi->lsi_fstype);
-	if (IS_ERR(obd->obd_fsops)) {
-		up(&cli->cl_mgc_sem);
-		CERROR("%s: No fstype %s: rc = %ld\n", lsi->lsi_fstype,
-		       obd->obd_name, PTR_ERR(obd->obd_fsops));
-		return PTR_ERR(obd->obd_fsops);
-	}
+	/* Setup the configs dir */
+	rc = lu_env_init(env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(out_err, rc);
 
-	cli->cl_mgc_vfsmnt = mnt;
-	err = fsfilt_setup(obd, mnt->mnt_sb);
-	if (err)
-		GOTO(err_ops, err);
-
-	OBD_SET_CTXT_MAGIC(&obd->obd_lvfs_ctxt);
-	obd->obd_lvfs_ctxt.pwdmnt = mnt;
-	obd->obd_lvfs_ctxt.pwd = mnt->mnt_root;
-	obd->obd_lvfs_ctxt.fs = get_ds();
-
-	push_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-	dentry = ll_lookup_one_len(MOUNT_CONFIGS_DIR, cfs_fs_pwd(current->fs),
-				   strlen(MOUNT_CONFIGS_DIR));
-	pop_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-	if (IS_ERR(dentry)) {
-		err = PTR_ERR(dentry);
-		CERROR("cannot lookup %s directory: rc = %d\n",
-		       MOUNT_CONFIGS_DIR, err);
-		GOTO(err_ops, err);
-	}
-	cli->cl_mgc_configs_dir = dentry;
+	fid.f_seq = FID_SEQ_LOCAL_NAME;
+	fid.f_oid = 1;
+	fid.f_ver = 0;
+	rc = local_oid_storage_init(env, lsi->lsi_dt_dev, &fid,
+				    &cli->cl_mgc_los);
+	if (rc)
+		GOTO(out_env, rc);
+
+	rc = dt_root_get(env, lsi->lsi_dt_dev, &rfid);
+	if (rc)
+		GOTO(out_env, rc);
+
+	root = dt_locate_at(env, lsi->lsi_dt_dev, &rfid,
+			    &cli->cl_mgc_los->los_dev->dd_lu_dev);
+	if (unlikely(IS_ERR(root)))
+		GOTO(out_los, rc = PTR_ERR(root));
+
+	dto = local_file_find_or_create(env, cli->cl_mgc_los, root,
+					MOUNT_CONFIGS_DIR,
+					S_IFDIR | S_IRUGO | S_IWUSR | S_IXUGO);
+	lu_object_put_nocache(env, &root->do_lu);
+	if (IS_ERR(dto))
+		GOTO(out_los, rc = PTR_ERR(dto));
+
+	cli->cl_mgc_configs_dir = dto;
+
+	LASSERT(lsi->lsi_osd_exp->exp_obd->obd_lvfs_ctxt.dt);
+	rc = mgc_local_llog_init(env, obd, lsi->lsi_osd_exp->exp_obd);
+	if (rc)
+		GOTO(out_llog, rc);
 
 	/* We take an obd ref to insure that we can't get to mgc_cleanup
-	   without calling mgc_fs_cleanup first. */
+	 * without calling mgc_fs_cleanup first. */
 	class_incref(obd, "mgc_fs", obd);
 
-	label = fsfilt_get_label(obd, mnt->mnt_sb);
-	if (label)
-		CDEBUG(D_MGC, "MGC using disk labelled=%s\n", label);
-
 	/* We keep the cl_mgc_sem until mgc_fs_cleanup */
-	return 0;
-
-err_ops:
-	fsfilt_put_ops(obd->obd_fsops);
-	obd->obd_fsops = NULL;
-	cli->cl_mgc_vfsmnt = NULL;
-	up(&cli->cl_mgc_sem);
-	return err;
+out_llog:
+	if (rc) {
+		lu_object_put(env, &cli->cl_mgc_configs_dir->do_lu);
+		cli->cl_mgc_configs_dir = NULL;
+	}
+out_los:
+	if (rc < 0) {
+		local_oid_storage_fini(env, cli->cl_mgc_los);
+		cli->cl_mgc_los = NULL;
+		up(&cli->cl_mgc_sem);
+	}
+out_env:
+	lu_env_fini(env);
+out_err:
+	OBD_FREE_PTR(env);
+	return rc;
 }
 
 static int mgc_fs_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
-	int rc = 0;
+	struct lu_env		 env;
+	struct client_obd	*cli = &obd->u.cli;
+	int			 rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt != NULL);
+	LASSERT(cli->cl_mgc_los != NULL);
 
-	if (cli->cl_mgc_configs_dir != NULL) {
-		struct lvfs_run_ctxt saved;
-		push_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-		l_dput(cli->cl_mgc_configs_dir);
-		cli->cl_mgc_configs_dir = NULL;
-		pop_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-		class_decref(obd, "mgc_fs", obd);
-	}
+	rc = lu_env_init(&env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(unlock, rc);
+
+	mgc_local_llog_fini(&env, obd);
 
-	cli->cl_mgc_vfsmnt = NULL;
-	if (obd->obd_fsops)
-		fsfilt_put_ops(obd->obd_fsops);
+	lu_object_put_nocache(&env, &cli->cl_mgc_configs_dir->do_lu);
+	cli->cl_mgc_configs_dir = NULL;
 
+	local_oid_storage_fini(&env, cli->cl_mgc_los);
+	cli->cl_mgc_los = NULL;
+	lu_env_fini(&env);
+
+unlock:
+	class_decref(obd, "mgc_fs", obd);
 	up(&cli->cl_mgc_sem);
 
-	return rc;
+	return 0;
+}
+
+static int mgc_llog_init(const struct lu_env *env, struct obd_device *obd)
+{
+	struct llog_ctxt	*ctxt;
+	int			 rc;
+
+	/* setup only remote ctxt, the local disk context is switched per each
+	 * filesystem during mgc_fs_setup() */
+	rc = llog_setup(env, obd, &obd->obd_olg, LLOG_CONFIG_REPL_CTXT, obd,
+			&llog_client_ops);
+	if (rc)
+		return rc;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
+	LASSERT(ctxt);
+
+	llog_initiator_connect(ctxt);
+	llog_ctxt_put(ctxt);
+
+	return 0;
+}
+
+static int mgc_llog_fini(const struct lu_env *env, struct obd_device *obd)
+{
+	struct llog_ctxt *ctxt;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
+	if (ctxt)
+		llog_cleanup(env, ctxt);
+
+	return 0;
 }
 
 static atomic_t mgc_count = ATOMIC_INIT(0);
@@ -694,7 +769,7 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 			}
 		}
 		obd_cleanup_client_import(obd);
-		rc = obd_llog_finish(obd, 0);
+		rc = mgc_llog_fini(NULL, obd);
 		if (rc != 0)
 			CERROR("failed to cleanup llogging subsystems\n");
 		break;
@@ -704,11 +779,8 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 
 static int mgc_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
 	int rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt == NULL);
-
 	/* COMPAT_146 - old config logs may have added profiles we don't
 	   know about */
 	if (obd->obd_type->typ_refcnt <= 1)
@@ -733,7 +805,7 @@ static int mgc_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	if (rc)
 		GOTO(err_decref, rc);
 
-	rc = obd_llog_init(obd, &obd->obd_olg, obd, NULL);
+	rc = mgc_llog_init(NULL, obd);
 	if (rc) {
 		CERROR("failed to setup llogging subsystems\n");
 		GOTO(err_cleanup, rc);
@@ -1011,23 +1083,17 @@ int mgc_set_info_async(const struct lu_env *env, struct obd_export *exp,
 	}
 	if (KEY_IS(KEY_SET_FS)) {
 		struct super_block *sb = (struct super_block *)val;
-		struct lustre_sb_info *lsi;
+
 		if (vallen != sizeof(struct super_block))
 			return -EINVAL;
-		lsi = s2lsi(sb);
-		rc = mgc_fs_setup(exp->exp_obd, sb, lsi->lsi_srv_mnt);
-		if (rc) {
-			CERROR("set_fs got %d\n", rc);
-		}
+
+		rc = mgc_fs_setup(exp->exp_obd, sb);
 		return rc;
 	}
 	if (KEY_IS(KEY_CLEAR_FS)) {
 		if (vallen != 0)
 			return -EINVAL;
 		rc = mgc_fs_cleanup(exp->exp_obd);
-		if (rc) {
-			CERROR("clear_fs got %d\n", rc);
-		}
 		return rc;
 	}
 	if (KEY_IS(KEY_SET_INFO)) {
@@ -1145,49 +1211,6 @@ static int mgc_import_event(struct obd_device *obd,
 	return rc;
 }
 
-static int mgc_llog_init(struct obd_device *obd, struct obd_llog_group *olg,
-			 struct obd_device *tgt, int *index)
-{
-	struct llog_ctxt *ctxt;
-	int rc;
-
-	LASSERT(olg == &obd->obd_olg);
-
-
-	rc = llog_setup(NULL, obd, olg, LLOG_CONFIG_REPL_CTXT, tgt,
-			&llog_client_ops);
-	if (rc)
-		GOTO(out, rc);
-
-	ctxt = llog_group_get_ctxt(olg, LLOG_CONFIG_REPL_CTXT);
-	if (!ctxt)
-		GOTO(out, rc = -ENODEV);
-
-	llog_initiator_connect(ctxt);
-	llog_ctxt_put(ctxt);
-
-	return 0;
-out:
-	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-	return rc;
-}
-
-static int mgc_llog_finish(struct obd_device *obd, int count)
-{
-	struct llog_ctxt *ctxt;
-
-	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-
-	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-	return 0;
-}
-
 enum {
 	CONFIG_READ_NRPAGES_INIT = 1 << (20 - PAGE_CACHE_SHIFT),
 	CONFIG_READ_NRPAGES      = 4
@@ -1540,17 +1563,58 @@ out:
 	return rc;
 }
 
+static int mgc_llog_local_copy(const struct lu_env *env,
+			       struct obd_device *obd,
+			       struct llog_ctxt *rctxt,
+			       struct llog_ctxt *lctxt, char *logname)
+{
+	char	*temp_log;
+	int	 rc;
+
+
+
+	/*
+	 * - copy it to backup using llog_backup()
+	 * - copy remote llog to logname using llog_backup()
+	 * - if failed then move bakup to logname again
+	 */
+
+	OBD_ALLOC(temp_log, strlen(logname) + 1);
+	if (!temp_log)
+		return -ENOMEM;
+	sprintf(temp_log, "%sT", logname);
+
+	/* make a copy of local llog at first */
+	rc = llog_backup(env, obd, lctxt, lctxt, logname, temp_log);
+	if (rc < 0 && rc != -ENOENT)
+		GOTO(out, rc);
+	/* copy remote llog to the local copy */
+	rc = llog_backup(env, obd, rctxt, lctxt, logname, logname);
+	if (rc == -ENOENT) {
+		/* no remote llog, delete local one too */
+		llog_erase(env, lctxt, NULL, logname);
+	} else if (rc < 0) {
+		/* error during backup, get local one back from the copy */
+		llog_backup(env, obd, lctxt, lctxt, temp_log, logname);
+out:
+		CERROR("%s: failed to copy remote log %s: rc = %d\n",
+		       obd->obd_name, logname, rc);
+	}
+	llog_erase(env, lctxt, NULL, temp_log);
+	OBD_FREE(temp_log, strlen(logname) + 1);
+	return rc;
+}
 
 /* local_only means it cannot get remote llogs */
 static int mgc_process_cfg_log(struct obd_device *mgc,
-			       struct config_llog_data *cld,
-			       int local_only)
+			       struct config_llog_data *cld, int local_only)
 {
-	struct llog_ctxt *ctxt, *lctxt = NULL;
-	struct lvfs_run_ctxt *saved_ctxt;
-	struct lustre_sb_info *lsi = NULL;
-	int rc = 0, must_pop = 0;
-	bool sptlrpc_started = false;
+	struct llog_ctxt	*ctxt, *lctxt = NULL;
+	struct client_obd	*cli = &mgc->u.cli;
+	struct lustre_sb_info	*lsi = NULL;
+	int			 rc = 0;
+	bool			 sptlrpc_started = false;
+	struct lu_env		*env;
 
 	LASSERT(cld);
 	LASSERT(mutex_is_locked(&cld->cld_lock));
@@ -1565,20 +1629,49 @@ static int mgc_process_cfg_log(struct obd_device *mgc,
 	if (cld->cld_cfg.cfg_sb)
 		lsi = s2lsi(cld->cld_cfg.cfg_sb);
 
-	ctxt = llog_get_context(mgc, LLOG_CONFIG_REPL_CTXT);
-	if (!ctxt) {
-		CERROR("missing llog context\n");
-		return -EINVAL;
-	}
-
-	OBD_ALLOC_PTR(saved_ctxt);
-	if (saved_ctxt == NULL)
+	OBD_ALLOC_PTR(env);
+	if (env == NULL)
 		return -ENOMEM;
 
+	rc = lu_env_init(env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(out_free, rc);
+
+	ctxt = llog_get_context(mgc, LLOG_CONFIG_REPL_CTXT);
+	LASSERT(ctxt);
+
 	lctxt = llog_get_context(mgc, LLOG_CONFIG_ORIG_CTXT);
 
-		if (local_only) { /* no local log at client side */
-		GOTO(out_pop, rc = -EIO);
+	/* Copy the setup log locally if we can. Don't mess around if we're
+	 * running an MGS though (logs are already local). */
+	if (lctxt && lsi && IS_SERVER(lsi) && !IS_MGS(lsi) &&
+	    cli->cl_mgc_configs_dir != NULL &&
+	    lu2dt_dev(cli->cl_mgc_configs_dir->do_lu.lo_dev) ==
+	    lsi->lsi_dt_dev) {
+		if (!local_only)
+			/* Only try to copy log if we have the lock. */
+			rc = mgc_llog_local_copy(env, mgc, ctxt, lctxt,
+						 cld->cld_logname);
+		if (local_only || rc) {
+			if (llog_is_empty(env, lctxt, cld->cld_logname)) {
+				LCONSOLE_ERROR_MSG(0x13a,
+						   "Failed to get MGS log %s and no local copy.\n",
+						   cld->cld_logname);
+				GOTO(out_pop, rc = -ENOTCONN);
+			}
+			CDEBUG(D_MGC,
+			       "Failed to get MGS log %s, using local copy for now, will try to update later.\n",
+			       cld->cld_logname);
+		}
+		/* Now, whether we copied or not, start using the local llog.
+		 * If we failed to copy, we'll start using whatever the old
+		 * log has. */
+		llog_ctxt_put(ctxt);
+		ctxt = lctxt;
+		lctxt = NULL;
+	} else {
+		if (local_only) /* no local log at client side */
+			GOTO(out_pop, rc = -EIO);
 	}
 
 	if (cld_is_sptlrpc(cld)) {
@@ -1587,19 +1680,16 @@ static int mgc_process_cfg_log(struct obd_device *mgc,
 	}
 
 	/* logname and instance info should be the same, so use our
-	   copy of the instance for the update.  The cfg_last_idx will
-	   be updated here. */
-	rc = class_config_parse_llog(NULL, ctxt, cld->cld_logname,
+	 * copy of the instance for the update.  The cfg_last_idx will
+	 * be updated here. */
+	rc = class_config_parse_llog(env, ctxt, cld->cld_logname,
 				     &cld->cld_cfg);
 
 out_pop:
-	llog_ctxt_put(ctxt);
+	__llog_ctxt_put(env, ctxt);
 	if (lctxt)
-		llog_ctxt_put(lctxt);
-	if (must_pop)
-		pop_ctxt(saved_ctxt, &mgc->obd_lvfs_ctxt, NULL);
+		__llog_ctxt_put(env, lctxt);
 
-	OBD_FREE_PTR(saved_ctxt);
 	/*
 	 * update settings on existing OBDs. doing it inside
 	 * of llog_process_lock so no device is attaching/detaching
@@ -1614,6 +1704,9 @@ out_pop:
 					  strlen("-sptlrpc"));
 	}
 
+	lu_env_fini(env);
+out_free:
+	OBD_FREE_PTR(env);
 	return rc;
 }
 
@@ -1801,8 +1894,6 @@ struct obd_ops mgc_obd_ops = {
 	.o_set_info_async = mgc_set_info_async,
 	.o_get_info       = mgc_get_info,
 	.o_import_event = mgc_import_event,
-	.o_llog_init    = mgc_llog_init,
-	.o_llog_finish  = mgc_llog_finish,
 	.o_process_config = mgc_process_config,
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
index b4dad34..67513be 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog.c
@@ -265,31 +265,6 @@ out:
 }
 EXPORT_SYMBOL(llog_init_handle);
 
-int llog_copy_handler(const struct lu_env *env,
-		      struct llog_handle *llh,
-		      struct llog_rec_hdr *rec,
-		      void *data)
-{
-	struct llog_rec_hdr local_rec = *rec;
-	struct llog_handle *local_llh = (struct llog_handle *)data;
-	char *cfg_buf = (char*) (rec + 1);
-	struct lustre_cfg *lcfg;
-	int rc = 0;
-
-	/* Append all records */
-	local_rec.lrh_len -= sizeof(*rec) + sizeof(struct llog_rec_tail);
-	rc = llog_write(env, local_llh, &local_rec, NULL, 0,
-			(void *)cfg_buf, -1);
-
-	lcfg = (struct lustre_cfg *)cfg_buf;
-	CDEBUG(D_INFO, "idx=%d, rc=%d, len=%d, cmd %x %s %s\n",
-	       rec->lrh_index, rc, rec->lrh_len, lcfg->lcfg_command,
-	       lustre_cfg_string(lcfg, 0), lustre_cfg_string(lcfg, 1));
-
-	return rc;
-}
-EXPORT_SYMBOL(llog_copy_handler);
-
 static int llog_process_thread(void *arg)
 {
 	struct llog_process_info	*lpi = arg;
@@ -493,14 +468,6 @@ int llog_process(const struct lu_env *env, struct llog_handle *loghandle,
 }
 EXPORT_SYMBOL(llog_process);
 
-inline int llog_get_size(struct llog_handle *loghandle)
-{
-	if (loghandle && loghandle->lgh_hdr)
-		return loghandle->lgh_hdr->llh_count;
-	return 0;
-}
-EXPORT_SYMBOL(llog_get_size);
-
 int llog_reverse_process(const struct lu_env *env,
 			 struct llog_handle *loghandle, llog_cb_t cb,
 			 void *data, void *catdata)
@@ -767,8 +734,9 @@ int llog_open_create(const struct lu_env *env, struct llog_ctxt *ctxt,
 		     struct llog_handle **res, struct llog_logid *logid,
 		     char *name)
 {
-	struct thandle	*th;
-	int		 rc;
+	struct dt_device	*d;
+	struct thandle		*th;
+	int			 rc;
 
 	rc = llog_open(env, ctxt, res, logid, name, LLOG_OPEN_NEW);
 	if (rc)
@@ -777,27 +745,21 @@ int llog_open_create(const struct lu_env *env, struct llog_ctxt *ctxt,
 	if (llog_exist(*res))
 		return 0;
 
-	if ((*res)->lgh_obj != NULL) {
-		struct dt_device *d;
+	LASSERT((*res)->lgh_obj != NULL);
 
-		d = lu2dt_dev((*res)->lgh_obj->do_lu.lo_dev);
+	d = lu2dt_dev((*res)->lgh_obj->do_lu.lo_dev);
 
-		th = dt_trans_create(env, d);
-		if (IS_ERR(th))
-			GOTO(out, rc = PTR_ERR(th));
+	th = dt_trans_create(env, d);
+	if (IS_ERR(th))
+		GOTO(out, rc = PTR_ERR(th));
 
-		rc = llog_declare_create(env, *res, th);
-		if (rc == 0) {
-			rc = dt_trans_start_local(env, d, th);
-			if (rc == 0)
-				rc = llog_create(env, *res, th);
-		}
-		dt_trans_stop(env, d, th);
-	} else {
-		/* lvfs compat code */
-		LASSERT((*res)->lgh_file == NULL);
-		rc = llog_create(env, *res, NULL);
+	rc = llog_declare_create(env, *res, th);
+	if (rc == 0) {
+		rc = dt_trans_start_local(env, d, th);
+		if (rc == 0)
+			rc = llog_create(env, *res, th);
 	}
+	dt_trans_stop(env, d, th);
 out:
 	if (rc)
 		llog_close(env, *res);
@@ -842,41 +804,34 @@ int llog_write(const struct lu_env *env, struct llog_handle *loghandle,
 	       struct llog_rec_hdr *rec, struct llog_cookie *reccookie,
 	       int cookiecount, void *buf, int idx)
 {
-	int rc;
+	struct dt_device	*dt;
+	struct thandle		*th;
+	int			 rc;
 
 	LASSERT(loghandle);
 	LASSERT(loghandle->lgh_ctxt);
+	LASSERT(loghandle->lgh_obj != NULL);
 
-	if (loghandle->lgh_obj != NULL) {
-		struct dt_device	*dt;
-		struct thandle		*th;
-
-		dt = lu2dt_dev(loghandle->lgh_obj->do_lu.lo_dev);
+	dt = lu2dt_dev(loghandle->lgh_obj->do_lu.lo_dev);
 
-		th = dt_trans_create(env, dt);
-		if (IS_ERR(th))
-			return PTR_ERR(th);
+	th = dt_trans_create(env, dt);
+	if (IS_ERR(th))
+		return PTR_ERR(th);
 
-		rc = llog_declare_write_rec(env, loghandle, rec, idx, th);
-		if (rc)
-			GOTO(out_trans, rc);
+	rc = llog_declare_write_rec(env, loghandle, rec, idx, th);
+	if (rc)
+		GOTO(out_trans, rc);
 
-		rc = dt_trans_start_local(env, dt, th);
-		if (rc)
-			GOTO(out_trans, rc);
+	rc = dt_trans_start_local(env, dt, th);
+	if (rc)
+		GOTO(out_trans, rc);
 
-		down_write(&loghandle->lgh_lock);
-		rc = llog_write_rec(env, loghandle, rec, reccookie,
-				    cookiecount, buf, idx, th);
-		up_write(&loghandle->lgh_lock);
+	down_write(&loghandle->lgh_lock);
+	rc = llog_write_rec(env, loghandle, rec, reccookie,
+			    cookiecount, buf, idx, th);
+	up_write(&loghandle->lgh_lock);
 out_trans:
-		dt_trans_stop(env, dt, th);
-	} else { /* lvfs compatibility */
-		down_write(&loghandle->lgh_lock);
-		rc = llog_write_rec(env, loghandle, rec, reccookie,
-				    cookiecount, buf, idx, NULL);
-		up_write(&loghandle->lgh_lock);
-	}
+	dt_trans_stop(env, dt, th);
 	return rc;
 }
 EXPORT_SYMBOL(llog_write);
@@ -932,3 +887,104 @@ out:
 	return rc;
 }
 EXPORT_SYMBOL(llog_close);
+
+int llog_is_empty(const struct lu_env *env, struct llog_ctxt *ctxt,
+		  char *name)
+{
+	struct llog_handle	*llh;
+	int			 rc = 0;
+
+	rc = llog_open(env, ctxt, &llh, NULL, name, LLOG_OPEN_EXISTS);
+	if (rc < 0) {
+		if (likely(rc == -ENOENT))
+			rc = 0;
+		GOTO(out, rc);
+	}
+
+	rc = llog_init_handle(env, llh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_close, rc);
+	rc = llog_get_size(llh);
+
+out_close:
+	llog_close(env, llh);
+out:
+	/* header is record 1 */
+	return (rc <= 1);
+}
+EXPORT_SYMBOL(llog_is_empty);
+
+int llog_copy_handler(const struct lu_env *env, struct llog_handle *llh,
+		      struct llog_rec_hdr *rec, void *data)
+{
+	struct llog_handle	*copy_llh = data;
+
+	/* Append all records */
+	return llog_write(env, copy_llh, rec, NULL, 0, NULL, -1);
+}
+EXPORT_SYMBOL(llog_copy_handler);
+
+/* backup plain llog */
+int llog_backup(const struct lu_env *env, struct obd_device *obd,
+		struct llog_ctxt *ctxt, struct llog_ctxt *bctxt,
+		char *name, char *backup)
+{
+	struct llog_handle	*llh, *bllh;
+	int			 rc;
+
+
+
+	/* open original log */
+	rc = llog_open(env, ctxt, &llh, NULL, name, LLOG_OPEN_EXISTS);
+	if (rc < 0) {
+		/* the -ENOENT case is also reported to the caller
+		 * but silently so it should handle that if needed.
+		 */
+		if (rc != -ENOENT)
+			CERROR("%s: failed to open log %s: rc = %d\n",
+			       obd->obd_name, name, rc);
+		return rc;
+	}
+
+	rc = llog_init_handle(env, llh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_close, rc);
+
+	/* Make sure there's no old backup log */
+	rc = llog_erase(env, bctxt, NULL, backup);
+	if (rc < 0 && rc != -ENOENT)
+		GOTO(out_close, rc);
+
+	/* open backup log */
+	rc = llog_open_create(env, bctxt, &bllh, NULL, backup);
+	if (rc) {
+		CERROR("%s: failed to open backup logfile %s: rc = %d\n",
+		       obd->obd_name, backup, rc);
+		GOTO(out_close, rc);
+	}
+
+	/* check that backup llog is not the same object as original one */
+	if (llh->lgh_obj == bllh->lgh_obj) {
+		CERROR("%s: backup llog %s to itself (%s), objects %p/%p\n",
+		       obd->obd_name, name, backup, llh->lgh_obj,
+		       bllh->lgh_obj);
+		GOTO(out_backup, rc = -EEXIST);
+	}
+
+	rc = llog_init_handle(env, bllh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_backup, rc);
+
+	/* Copy log record by record */
+	rc = llog_process_or_fork(env, llh, llog_copy_handler, (void *)bllh,
+				  NULL, false);
+	if (rc)
+		CERROR("%s: failed to backup log %s: rc = %d\n",
+		       obd->obd_name, name, rc);
+out_backup:
+	llog_close(env, bllh);
+out_close:
+	llog_close(env, llh);
+	return rc;
+}
+EXPORT_SYMBOL(llog_backup);
diff --git a/drivers/staging/lustre/lustre/obdclass/local_storage.c b/drivers/staging/lustre/lustre/obdclass/local_storage.c
index 51ab7f4..e79e4be 100644
--- a/drivers/staging/lustre/lustre/obdclass/local_storage.c
+++ b/drivers/staging/lustre/lustre/obdclass/local_storage.c
@@ -855,9 +855,12 @@ out_los:
 		(*los)->los_seq = fid_seq(first_fid);
 		(*los)->los_last_oid = le64_to_cpu(lastid);
 		(*los)->los_obj = o;
-		/* read value should not be less than initial one */
-		LASSERTF((*los)->los_last_oid >= first_oid, "%u < %u\n",
-			 (*los)->los_last_oid, first_oid);
+		/* Read value should not be less than initial one
+		 * but possible after upgrade from older fs.
+		 * In this case just switch to the first_oid in memory and
+		 * it will be updated on disk with first object generated */
+		if ((*los)->los_last_oid < first_oid)
+			(*los)->los_last_oid = first_oid;
 	}
 out:
 	mutex_unlock(&ls->ls_los_mutex);
diff --git a/drivers/staging/lustre/lustre/obdclass/local_storage.h b/drivers/staging/lustre/lustre/obdclass/local_storage.h
index d553c37..0f63b8c 100644
--- a/drivers/staging/lustre/lustre/obdclass/local_storage.h
+++ b/drivers/staging/lustre/lustre/obdclass/local_storage.h
@@ -29,6 +29,8 @@
  *
  * Author: Mikhail Pershin <mike.pershin@intel.com>
  */
+#ifndef __LOCAL_STORAGE_H
+#define __LOCAL_STORAGE_H
 
 #include <dt_object.h>
 #include <obd.h>
@@ -86,3 +88,4 @@ struct los_ondisk {
 };
 
 #define LOS_MAGIC	0xdecafbee
+#endif
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index d61b0db..00c7e9c 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -423,7 +423,7 @@ LU_KEY_INIT_FINI(lu_global, struct lu_cdebug_data);
  */
 struct lu_context_key lu_global_key = {
 	.lct_tags = LCT_MD_THREAD | LCT_DT_THREAD |
-		    LCT_MG_THREAD | LCT_CL_THREAD,
+		    LCT_MG_THREAD | LCT_CL_THREAD | LCT_LOCAL,
 	.lct_init = lu_global_key_init,
 	.lct_fini = lu_global_key_fini
 };
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index 68a4d6a..a69a630 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -631,6 +631,9 @@ int lustre_put_lsi(struct super_block *sb)
 	CDEBUG(D_MOUNT, "put %p %d\n", sb, atomic_read(&lsi->lsi_mounts));
 	if (atomic_dec_and_test(&lsi->lsi_mounts)) {
 		if (IS_SERVER(lsi) && lsi->lsi_osd_exp) {
+			lu_device_put(&lsi->lsi_dt_dev->dd_lu_dev);
+			lsi->lsi_osd_exp->exp_obd->obd_lvfs_ctxt.dt = NULL;
+			lsi->lsi_dt_dev = NULL;
 			obd_disconnect(lsi->lsi_osd_exp);
 			/* wait till OSD is gone */
 			obd_zombie_barrier();
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
  2013-11-26  2:04 ` [PATCH 01/16] staging/lustre/server: use unified request handler for MGS Peng Tao
  2013-11-26  2:04 ` [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs Peng Tao
@ 2013-11-26  2:04 ` Peng Tao
  2013-11-26  6:45   ` Patrick Farrell
  2013-11-26  2:04 ` [PATCH 04/16] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation Peng Tao
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Patrick Farrell, Cheng Shao, Peng Tao, Andreas Dilger

From: Patrick Farrell <paf@cray.com>

This happend with SLES11SP2 Lustre client, which in turn acts as an
NFS server, exporting a subtree of an Lustre fs through NFS.

We detected that whenever we are writing to a new file using, fx,
'echo blah > newfile', it will return ENOENT error. We found
out that this was caused by the anonymous dentry. In SLESS11SP2,
anonymous dentries are assigned '/' as the name, instead of an
empty string. When MDT handles the intent_open call, it will look
up the obj by the name if it is not an empty string, and thus
couldn't find it.

As MDS_OPEN_BY_FID is always set on this request, we never need
to send the name in this request.  The fid is already available
and should be used in case the file has been renamed.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
Lustre-change: http://review.whamcloud.com/6920
Signed-off-by: Cheng Shao <cheng_shao@xyratex.com>
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alexey Shvetsov <alexxy@gentoo.org>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/file.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 82248e9..f36c5d8 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -368,8 +368,6 @@ static int ll_intent_file_open(struct file *file, void *lmm,
 {
 	struct ll_sb_info *sbi = ll_i2sbi(file->f_dentry->d_inode);
 	struct dentry *parent = file->f_dentry->d_parent;
-	const char *name = file->f_dentry->d_name.name;
-	const int len = file->f_dentry->d_name.len;
 	struct md_op_data *op_data;
 	struct ptlrpc_request *req;
 	__u32 opc = LUSTRE_OPC_ANY;
@@ -394,8 +392,9 @@ static int ll_intent_file_open(struct file *file, void *lmm,
 	}
 
 	op_data  = ll_prep_md_op_data(NULL, parent->d_inode,
-				      file->f_dentry->d_inode, name, len,
+				      file->f_dentry->d_inode, NULL, 0,
 				      O_RDWR, opc, NULL);
+
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 04/16] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (2 preceding siblings ...)
  2013-11-26  2:04 ` [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
@ 2013-11-26  2:04 ` Peng Tao
  2013-11-26  2:04 ` [PATCH 05/16] staging/lustre/mdc: prevent fall through in mdc_iocontrol() Peng Tao
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Andriy Skulysh, Peng Tao, Andreas Dilger

From: Andriy Skulysh <Andriy_Skulysh@xyratex.com>

During race exp_flock_hash can be created 2 times.
It is created & assigned without any lock.

Move hash initialization from ldlm_flock_blocking_link()
to ldlm_init_export()

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2835
Lustre-change: http://review.whamcloud.com/5471
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexander Boyko <Alexander_Boyko@xyratex.com>
Reviewed-by: Vitaly Fertman <Vitaly_Fertman@xyratex.com>
Tested-by: Kyrylo Shatskyy <kyrylo_shatskyy@xyratex.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c |   28 +++++++----------------
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c |    8 +++++++
 2 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
index 456d5aa..c9aae13 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
@@ -95,20 +95,12 @@ ldlm_flocks_overlap(struct ldlm_lock *lock, struct ldlm_lock *new)
 		lock->l_policy_data.l_flock.start));
 }
 
-static inline int ldlm_flock_blocking_link(struct ldlm_lock *req,
-					   struct ldlm_lock *lock)
+static inline void ldlm_flock_blocking_link(struct ldlm_lock *req,
+					    struct ldlm_lock *lock)
 {
-	int rc = 0;
-
 	/* For server only */
 	if (req->l_export == NULL)
-		return 0;
-
-	if (unlikely(req->l_export->exp_flock_hash == NULL)) {
-		rc = ldlm_init_flock_export(req->l_export);
-		if (rc)
-			goto error;
-	}
+		return;
 
 	LASSERT(hlist_unhashed(&req->l_exp_flock_hash));
 
@@ -121,8 +113,6 @@ static inline int ldlm_flock_blocking_link(struct ldlm_lock *req,
 	cfs_hash_add(req->l_export->exp_flock_hash,
 		     &req->l_policy_data.l_flock.owner,
 		     &req->l_exp_flock_hash);
-error:
-	return rc;
 }
 
 static inline void ldlm_flock_blocking_unlink(struct ldlm_lock *req)
@@ -250,7 +240,6 @@ ldlm_process_flock_lock(struct ldlm_lock *req, __u64 *flags, int first_enq,
 	int overlaps = 0;
 	int splitted = 0;
 	const struct ldlm_callback_suite null_cbs = { NULL };
-	int rc;
 
 	CDEBUG(D_DLMTRACE, "flags %#llx owner "LPU64" pid %u mode %u start "
 	       LPU64" end "LPU64"\n", *flags,
@@ -328,12 +317,8 @@ reprocess:
 
 			/* add lock to blocking list before deadlock
 			 * check to prevent race */
-			rc = ldlm_flock_blocking_link(req, lock);
-			if (rc) {
-				ldlm_flock_destroy(req, mode, *flags);
-				*err = rc;
-				return LDLM_ITER_STOP;
-			}
+			ldlm_flock_blocking_link(req, lock);
+
 			if (ldlm_flock_deadlock(req, lock)) {
 				ldlm_flock_blocking_unlink(req);
 				ldlm_flock_destroy(req, mode, *flags);
@@ -813,6 +798,9 @@ static cfs_hash_ops_t ldlm_export_flock_ops = {
 
 int ldlm_init_flock_export(struct obd_export *exp)
 {
+	if (strcmp(exp->exp_obd->obd_type->typ_name, LUSTRE_MDT_NAME) != 0)
+		return 0;
+
 	exp->exp_flock_hash =
 		cfs_hash_create(obd_uuid2str(&exp->exp_client_uuid),
 				HASH_EXP_LOCK_CUR_BITS,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
index bbf4291..85f5e7e 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
@@ -964,6 +964,7 @@ static cfs_hash_ops_t ldlm_export_lock_ops = {
 
 int ldlm_init_export(struct obd_export *exp)
 {
+	int rc;
 	exp->exp_lock_hash =
 		cfs_hash_create(obd_uuid2str(&exp->exp_client_uuid),
 				HASH_EXP_LOCK_CUR_BITS,
@@ -977,7 +978,14 @@ int ldlm_init_export(struct obd_export *exp)
 	if (!exp->exp_lock_hash)
 		return -ENOMEM;
 
+	rc = ldlm_init_flock_export(exp);
+	if (rc)
+		GOTO(err, rc);
+
 	return 0;
+err:
+	ldlm_destroy_export(exp);
+	return rc;
 }
 EXPORT_SYMBOL(ldlm_init_export);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 05/16] staging/lustre/mdc: prevent fall through in mdc_iocontrol()
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (3 preceding siblings ...)
  2013-11-26  2:04 ` [PATCH 04/16] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation Peng Tao
@ 2013-11-26  2:04 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 06/16] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64 Peng Tao
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

In mdc_iocontrol() add a goto to the end of the LL_IOC_HSM_STATE_SET
case, preventing fall through into the next case. In the same
function, replace the return statement in OBD_IOC_QUOTACTL with a
goto, so that a reference to the module is not leaked.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3576
Lustre-change: http://review.whamcloud.com/6962
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c |   27 +++++++++++------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index ed3a7a0..a09a5c3 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1743,6 +1743,7 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		GOTO(out, rc);
 	case LL_IOC_HSM_STATE_SET:
 		rc = mdc_ioc_hsm_state_set(exp, karg);
+		GOTO(out, rc);
 	case LL_IOC_HSM_ACTION:
 		rc = mdc_ioc_hsm_current_action(exp, karg);
 		GOTO(out, rc);
@@ -1814,8 +1815,8 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		struct obd_quotactl *oqctl;
 
 		OBD_ALLOC_PTR(oqctl);
-		if (!oqctl)
-			return -ENOMEM;
+		if (oqctl == NULL)
+			GOTO(out, rc = -ENOMEM);
 
 		QCTL_COPY(oqctl, qctl);
 		rc = obd_quotactl(exp, oqctl);
@@ -1824,23 +1825,21 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 			qctl->qc_valid = QC_MDTIDX;
 			qctl->obd_uuid = obd->u.cli.cl_target_uuid;
 		}
+
 		OBD_FREE_PTR(oqctl);
-		break;
+		GOTO(out, rc);
 	}
-	case LL_IOC_GET_CONNECT_FLAGS: {
-		if (copy_to_user(uarg,
-				     exp_connect_flags_ptr(exp),
-				     sizeof(__u64)))
+	case LL_IOC_GET_CONNECT_FLAGS:
+		if (copy_to_user(uarg, exp_connect_flags_ptr(exp),
+				 sizeof(*exp_connect_flags_ptr(exp))))
 			GOTO(out, rc = -EFAULT);
-		else
-			GOTO(out, rc = 0);
-	}
-	case LL_IOC_LOV_SWAP_LAYOUTS: {
+
+		GOTO(out, rc = 0);
+	case LL_IOC_LOV_SWAP_LAYOUTS:
 		rc = mdc_ioc_swap_layouts(exp, karg);
-		break;
-	}
+		GOTO(out, rc);
 	default:
-		CERROR("mdc_ioctl(): unrecognised ioctl %#x\n", cmd);
+		CERROR("unrecognised ioctl: cmd = %#x\n", cmd);
 		GOTO(out, rc = -ENOTTY);
 	}
 out:
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 06/16] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (4 preceding siblings ...)
  2013-11-26  2:04 ` [PATCH 05/16] staging/lustre/mdc: prevent fall through in mdc_iocontrol() Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 07/16] staging/lustre/mdt: HSM coordinator client interface Peng Tao
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

Remove the lo_depth member from struct lu_object.  This field is never
set and only read in lu_object_print().  Remove the lo_flags member.
This field was only used in lu_object_alloc() and can be replaced with
an on-stack mask to keep trace of which layers have been allocated.

Lustre-change: http://review.whamcloud.com/5890
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3059
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/lu_object.h  |   19 ----------------
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   24 +++++++++++++-------
 2 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index d5b8225..6773bca 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -398,17 +398,6 @@ static inline int lu_device_is_md(const struct lu_device *d)
 }
 
 /**
- * Flags for the object layers.
- */
-enum lu_object_flags {
-	/**
-	 * this flags is set if lu_object_operations::loo_object_init() has
-	 * been called for this layer. Used by lu_object_alloc().
-	 */
-	LU_OBJECT_ALLOCATED = (1 << 0)
-};
-
-/**
  * Common object attributes.
  */
 struct lu_attr {
@@ -486,14 +475,6 @@ struct lu_object {
 	 */
 	struct list_head			 lo_linkage;
 	/**
-	 * Depth. Top level layer depth is 0.
-	 */
-	int				lo_depth;
-	/**
-	 * Flags from enum lu_object_flags.
-	 */
-	__u32					lo_flags;
-	/**
 	 * Link to the device, for debugging.
 	 */
 	struct lu_ref_link                 lo_dev_ref;
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 00c7e9c..9887d8f 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -200,6 +200,8 @@ static struct lu_object *lu_object_alloc(const struct lu_env *env,
 	struct lu_object *scan;
 	struct lu_object *top;
 	struct list_head *layers;
+	unsigned int init_mask = 0;
+	unsigned int init_flag;
 	int clean;
 	int result;
 
@@ -218,15 +220,17 @@ static struct lu_object *lu_object_alloc(const struct lu_env *env,
 	 */
 	top->lo_header->loh_fid = *f;
 	layers = &top->lo_header->loh_layers;
+
 	do {
 		/*
 		 * Call ->loo_object_init() repeatedly, until no more new
 		 * object slices are created.
 		 */
 		clean = 1;
+		init_flag = 1;
 		list_for_each_entry(scan, layers, lo_linkage) {
-			if (scan->lo_flags & LU_OBJECT_ALLOCATED)
-				continue;
+			if (init_mask & init_flag)
+				goto next;
 			clean = 0;
 			scan->lo_header = top->lo_header;
 			result = scan->lo_ops->loo_object_init(env, scan, conf);
@@ -234,7 +238,9 @@ static struct lu_object *lu_object_alloc(const struct lu_env *env,
 				lu_object_free(env, top);
 				return ERR_PTR(result);
 			}
-			scan->lo_flags |= LU_OBJECT_ALLOCATED;
+			init_mask |= init_flag;
+next:
+			init_flag <<= 1;
 		}
 	} while (!clean);
 
@@ -487,23 +493,25 @@ void lu_object_print(const struct lu_env *env, void *cookie,
 {
 	static const char ruler[] = "........................................";
 	struct lu_object_header *top;
-	int depth;
+	int depth = 4;
 
 	top = o->lo_header;
 	lu_object_header_print(env, cookie, printer, top);
-	(*printer)(env, cookie, "{ \n");
-	list_for_each_entry(o, &top->loh_layers, lo_linkage) {
-		depth = o->lo_depth + 4;
+	(*printer)(env, cookie, "{\n");
 
+	list_for_each_entry(o, &top->loh_layers, lo_linkage) {
 		/*
 		 * print `.' \a depth times followed by type name and address
 		 */
 		(*printer)(env, cookie, "%*.*s%s@%p", depth, depth, ruler,
 			   o->lo_dev->ld_type->ldt_name, o);
+
 		if (o->lo_ops->loo_object_print != NULL)
-			o->lo_ops->loo_object_print(env, cookie, printer, o);
+			(*o->lo_ops->loo_object_print)(env, cookie, printer, o);
+
 		(*printer)(env, cookie, "\n");
 	}
+
 	(*printer)(env, cookie, "} header@%p\n", top);
 }
 EXPORT_SYMBOL(lu_object_print);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 07/16] staging/lustre/mdt: HSM coordinator client interface
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (5 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 06/16] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64 Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 08/16] staging/lustre/mdt: HSM coordinator agent interface Peng Tao
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, JC Lafoucriere, Peng Tao, Andreas Dilger

From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>

This patch implements the HSM coordinator methods
used by client to add/remove/list HSM actions on
FID.

Lustre-change: http://review.whamcloud.com/6532
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3341
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |    7 +++----
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    4 +---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 9fd1d3b..9436166 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -1121,11 +1121,10 @@ static inline int hal_size(struct hsm_action_list *hal)
 
 	sz = sizeof(*hal) + cfs_size_round(strlen(hal->hal_fsname));
 	hai = hai_zero(hal);
-	for (i = 0 ; i < hal->hal_count ; i++) {
+	for (i = 0; i < hal->hal_count; i++, hai = hai_next(hai))
 		sz += cfs_size_round(hai->hai_len);
-		hai = hai_next(hai);
-	}
-	return(sz);
+
+	return sz;
 }
 
 /* Copytool progress reporting */
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index a09a5c3..8373778 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1919,10 +1919,8 @@ static void lustre_swab_hal(struct hsm_action_list *h)
 	__swab32s(&h->hal_archive_id);
 	__swab64s(&h->hal_flags);
 	hai = hai_zero(h);
-	for (i = 0; i < h->hal_count; i++) {
+	for (i = 0; i < h->hal_count; i++, hai = hai_next(hai))
 		lustre_swab_hai(hai);
-		hai = hai_next(hai);
-	}
 }
 
 static void lustre_swab_kuch(struct kuc_hdr *l)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 08/16] staging/lustre/mdt: HSM coordinator agent interface
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (6 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 07/16] staging/lustre/mdt: HSM coordinator client interface Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  3:30   ` Greg Kroah-Hartman
  2013-11-26  2:05 ` [PATCH 09/16] staging/lustre/scrub: OI scrub on OST Peng Tao
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, JC Lafoucriere, Peng Tao, Andreas Dilger

From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>

To move data with external storage, HSM coordinator
uses a Copy Tool running on a client named agent.
This patch implements the interface for these agents.

Lustre-change: http://review.whamcloud.com/6534
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3342
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 9436166..631f026 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -428,8 +428,8 @@ struct obd_uuid {
 	char uuid[UUID_MAX];
 };
 
-static inline int obd_uuid_equals(const struct obd_uuid *u1,
-				  const struct obd_uuid *u2)
+static inline bool obd_uuid_equals(const struct obd_uuid *u1,
+				   const struct obd_uuid *u2)
 {
 	return strcmp((char *)u1->uuid, (char *)u2->uuid) == 0;
 }
@@ -446,7 +446,7 @@ static inline void obd_str2uuid(struct obd_uuid *uuid, const char *tmp)
 }
 
 /* For printf's only, make sure uuid is terminated */
-static inline char *obd_uuid2str(struct obd_uuid *uuid)
+static inline char *obd_uuid2str(const struct obd_uuid *uuid)
 {
 	if (uuid->uuid[sizeof(*uuid) - 1] != '\0') {
 		/* Obviously not safe, but for printfs, no real harm done...
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 09/16] staging/lustre/scrub: OI scrub on OST
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (7 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 08/16] staging/lustre/mdt: HSM coordinator agent interface Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 10/16] staging/lustre/scrub: control OI scrub on OST from user space Peng Tao
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Fan Yong, Peng Tao, Andreas Dilger

From: Fan Yong <fan.yong@intel.com>

OI scrub should has the ability to handle kinds of OI, including
both the OI files on MDT and the /O directory on OST.

We trust the FID in LMA for both MDT objects and OST objects. So
if some /O sub-item does not match related LMA, then the /O will
be updated, instead of the LMA.

To guarantee that the OI scrub can run without MDT0 involved for
FLDB, the OST object needs to store some flag in its LMA to tell
OI scrub that it is for an OST object, no need to query the MDT0.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3335
Lustre-change: http://review.whamcloud.com/6669
Signed-off-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   18 +++++++++++-------
 .../staging/lustre/lustre/include/obd_support.h    |    1 +
 drivers/staging/lustre/lustre/obdclass/md_attrs.c  |    4 ++--
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |    4 ++++
 4 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 7dfb925..e4cb1ad 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -321,8 +321,11 @@ static inline int range_compare_loc(const struct lu_seq_range *r1,
  * xattr.
  */
 enum lma_compat {
-	LMAC_HSM = 0x00000001,
-	LMAC_SOM = 0x00000002,
+	LMAC_HSM	= 0x00000001,
+	LMAC_SOM	= 0x00000002,
+	LMAC_NOT_IN_OI	= 0x00000004, /* the object does NOT need OI mapping */
+	LMAC_FID_ON_OST = 0x00000008, /* For OST-object, its OI mapping is
+				       * under /O/<seq>/d<x>. */
 };
 
 /**
@@ -331,16 +334,17 @@ enum lma_compat {
  * This information is stored in lustre_mdt_attrs::lma_incompat.
  */
 enum lma_incompat {
-	LMAI_RELEASED = 0x0000001, /* file is released */
-	LMAI_AGENT = 0x00000002, /* agent inode */
-	LMAI_REMOTE_PARENT = 0x00000004, /* the parent of the object
-					    is on the remote MDT */
+	LMAI_RELEASED		= 0x00000001, /* file is released */
+	LMAI_AGENT		= 0x00000002, /* agent inode */
+	LMAI_REMOTE_PARENT	= 0x00000004, /* the parent of the object
+						 is on the remote MDT */
 };
 #define LMA_INCOMPAT_SUPP	(LMAI_AGENT | LMAI_REMOTE_PARENT)
 
 extern void lustre_lma_swab(struct lustre_mdt_attrs *lma);
 extern void lustre_lma_init(struct lustre_mdt_attrs *lma,
-			    const struct lu_fid *fid, __u32 incompat);
+			    const struct lu_fid *fid,
+			    __u32 compat, __u32 incompat);
 /**
  * SOM on-disk attributes stored in a separate xattr.
  */
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 825a0c9..1748985 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -256,6 +256,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_OSD_SCRUB_FATAL			0x192
 #define OBD_FAIL_OSD_FID_MAPPING			0x193
 #define OBD_FAIL_OSD_LMA_INCOMPAT			0x194
+#define OBD_FAIL_OSD_COMPAT_INVALID_ENTRY		0x195
 
 #define OBD_FAIL_OST		     0x200
 #define OBD_FAIL_OST_CONNECT_NET	 0x201
diff --git a/drivers/staging/lustre/lustre/obdclass/md_attrs.c b/drivers/staging/lustre/lustre/obdclass/md_attrs.c
index f718782..91eb30b 100644
--- a/drivers/staging/lustre/lustre/obdclass/md_attrs.c
+++ b/drivers/staging/lustre/lustre/obdclass/md_attrs.c
@@ -39,9 +39,9 @@
  * \param incompat - features that MDS must understand to access object
  */
 void lustre_lma_init(struct lustre_mdt_attrs *lma, const struct lu_fid *fid,
-		     __u32 incompat)
+		     __u32 compat, __u32 incompat)
 {
-	lma->lma_compat   = 0;
+	lma->lma_compat   = compat;
 	lma->lma_incompat = incompat;
 	lma->lma_self_fid = *fid;
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index e3f02c7..7160a23 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -433,6 +433,10 @@ void lustre_assert_wire_constants(void)
 		(unsigned)LMAC_HSM);
 	LASSERTF(LMAC_SOM == 0x00000002UL, "found 0x%.8xUL\n",
 		(unsigned)LMAC_SOM);
+	LASSERTF(LMAC_NOT_IN_OI == 0x00000004UL, "found 0x%.8xUL\n",
+		(unsigned)LMAC_NOT_IN_OI);
+	LASSERTF(LMAC_FID_ON_OST == 0x00000008UL, "found 0x%.8xUL\n",
+		(unsigned)LMAC_FID_ON_OST);
 	LASSERTF(OBJ_CREATE == 1, "found %lld\n",
 		 (long long)OBJ_CREATE);
 	LASSERTF(OBJ_DESTROY == 2, "found %lld\n",
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 10/16] staging/lustre/scrub: control OI scrub on OST from user space
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (8 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 09/16] staging/lustre/scrub: OI scrub on OST Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 11/16] staging/lustre/llite: don't check for O_CREAT in it_create_mode Peng Tao
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Fan Yong, Peng Tao, Andreas Dilger

From: Fan Yong <fan.yong@intel.com>

Not all the OI inconsistency can be detected automatically, such as /O
entry lost case. So the OI scrub on OST should can be triggered by the
administrator manually.

Lustre-change: http://review.whamcloud.com/6698
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3335
Signed-off-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/lustre/include/obd_support.h    |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 1748985..4d1f62c 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -257,6 +257,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_OSD_FID_MAPPING			0x193
 #define OBD_FAIL_OSD_LMA_INCOMPAT			0x194
 #define OBD_FAIL_OSD_COMPAT_INVALID_ENTRY		0x195
+#define OBD_FAIL_OSD_COMPAT_NO_ENTRY			0x196
 
 #define OBD_FAIL_OST		     0x200
 #define OBD_FAIL_OST_CONNECT_NET	 0x201
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 11/16] staging/lustre/llite: don't check for O_CREAT in it_create_mode
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (9 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 10/16] staging/lustre/scrub: control OI scrub on OST from user space Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 12/16] staging/lustre/build: clean up unused variables and dead code Peng Tao
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

ll_lookup_it() checks for O_CREAT in struct lookup_intent's
it_create_mode member which is nonsensical, as it_create_mode is used
for file mode bits (S_IFREG, S_IRUSR, ...). Fix this by just checking
for IT_CREATE in it_op and do the same in llu_lookup_it(). This will
not affect the behavior of either function, since if O_CREATE (0100)
is actually set in o_create_mode then IT_CREATE must have been set in
it_op. In ll_atomic_open() check for O_CREAT in the open_flags
parameter rather than testing mode.

Lustre-change: http://review.whamcloud.com/6786
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3517
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/namei.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 8377468..e47e2b1 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -523,8 +523,7 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 	icbd.icbd_childp = &dentry;
 	icbd.icbd_parent = parent;
 
-	if (it->it_op & IT_CREAT ||
-	    (it->it_op & IT_OPEN && it->it_create_mode & O_CREAT))
+	if (it->it_op & IT_CREAT)
 		opc = LUSTRE_OPC_CREATE;
 	else
 		opc = LUSTRE_OPC_ANY;
@@ -623,7 +622,7 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 		return -ENOMEM;
 
 	it->it_op = IT_OPEN;
-	if (mode) {
+	if (open_flags & O_CREAT) {
 		it->it_op |= IT_CREAT;
 		lookup_flags |= LOOKUP_CREATE;
 	}
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 12/16] staging/lustre/build: clean up unused variables and dead code
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (10 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 11/16] staging/lustre/llite: don't check for O_CREAT in it_create_mode Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 13/16] staging/lustre/build: fix compilation issue with is_compat_task Peng Tao
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Dmitry Eremin, Ned Bass, Peng Tao, Andreas Dilger

From: Dmitry Eremin <dmitry.eremin@intel.com>

Clean up the code from really unused variables and rearrange other
to remove SET_BUT_UNUSED and UNUSED macro usage if possible.

Clean up or comment on some dead code.

Removed never implemented suspend timeouts functionality from pinger.c
which was commented out since 2007 and going to be replaced by adaptive
timeouts. Also removed all references to this functionality from
ldlm_lockd.c, ldlm_request.c and import.c which actually nevers executes
or do nothing.

 pinger.c commit d2d56f38da01001c92a09afc6b52b5acbd9bc13c
 Author: tappro <tappro>
 Date:   Mon Jul 30 21:08:59 2007 +0000
    - make HEAD from b_post_cmd3

Lustre-change: http://review.whamcloud.com/6139
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3204
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/include/linux/libcfs/libcfs.h   |    2 -
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 -
 drivers/staging/lustre/lnet/selftest/selftest.h    |    3 -
 drivers/staging/lustre/lnet/selftest/timer.c       |    6 +-
 .../lustre/lustre/include/linux/lustre_compat25.h  |    4 +-
 drivers/staging/lustre/lustre/include/lustre_ha.h  |    3 -
 drivers/staging/lustre/lustre/include/lustre_net.h |    2 -
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    3 -
 drivers/staging/lustre/lustre/libcfs/workitem.c    |   56 ++++++++---------
 drivers/staging/lustre/lustre/llite/dcache.c       |   36 ++++-------
 drivers/staging/lustre/lustre/llite/namei.c        |    6 +-
 .../lustre/lustre/obdclass/lprocfs_status.c        |    4 --
 drivers/staging/lustre/lustre/ptlrpc/client.c      |   10 ++-
 drivers/staging/lustre/lustre/ptlrpc/import.c      |    2 -
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |   10 ++-
 drivers/staging/lustre/lustre/ptlrpc/pinger.c      |   65 --------------------
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c     |   44 ++++++-------
 17 files changed, 76 insertions(+), 182 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
index 687dbab..4a6c7da 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
@@ -181,8 +181,6 @@ static inline void *__container_of(void *ptr, unsigned long shift)
 #define container_of0(ptr, type, member) \
 	((type *)__container_of((void *)(ptr), offsetof(type, member)))
 
-#define SET_BUT_UNUSED(a) do { } while(sizeof(a) - sizeof(a))
-
 #define _LIBCFS_H
 
 #endif /* _LIBCFS_H */
diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index 7659a26..5ae59d2 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -124,7 +124,6 @@ srpc_bulk_t *
 srpc_alloc_bulk(int cpt, unsigned bulk_npg, unsigned bulk_len, int sink)
 {
 	srpc_bulk_t  *bk;
-	struct page  **pages;
 	int	      i;
 
 	LASSERT(bulk_npg > 0 && bulk_npg <= LNET_MAX_IOV);
@@ -140,7 +139,6 @@ srpc_alloc_bulk(int cpt, unsigned bulk_npg, unsigned bulk_len, int sink)
 	bk->bk_sink   = sink;
 	bk->bk_len    = bulk_len;
 	bk->bk_niov   = bulk_npg;
-	UNUSED(pages);
 
 	for (i = 0; i < bulk_npg; i++) {
 		struct page *pg;
diff --git a/drivers/staging/lustre/lnet/selftest/selftest.h b/drivers/staging/lustre/lnet/selftest/selftest.h
index 8053b05..cd53926 100644
--- a/drivers/staging/lustre/lnet/selftest/selftest.h
+++ b/drivers/staging/lustre/lnet/selftest/selftest.h
@@ -572,9 +572,6 @@ swi_state2str (int state)
 #undef STATE2STR
 }
 
-#define UNUSED(x)       ( (void)(x) )
-
-
 #define selftest_wait_events()	cfs_pause(cfs_time_seconds(1) / 10)
 
 
diff --git a/drivers/staging/lustre/lnet/selftest/timer.c b/drivers/staging/lustre/lnet/selftest/timer.c
index 82fd363..5dc8e3d 100644
--- a/drivers/staging/lustre/lnet/selftest/timer.c
+++ b/drivers/staging/lustre/lnet/selftest/timer.c
@@ -172,9 +172,6 @@ int
 stt_timer_main(void *arg)
 {
 	int rc = 0;
-	UNUSED(arg);
-
-	SET_BUT_UNUSED(rc);
 
 	cfs_block_allsigs();
 
@@ -184,12 +181,13 @@ stt_timer_main(void *arg)
 		rc = wait_event_timeout(stt_data.stt_waitq,
 					stt_data.stt_shuttingdown,
 					cfs_time_seconds(STTIMER_SLOTTIME));
+		rc = 0; /* Discard jiffies remaining before timeout. */
 	}
 
 	spin_lock(&stt_data.stt_lock);
 	stt_data.stt_nthreads--;
 	spin_unlock(&stt_data.stt_lock);
-	return 0;
+	return rc;
 }
 
 int
diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index 359c6c1..2e1b7f1 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -171,8 +171,8 @@ static inline int ll_quota_off(struct super_block *sb, int off, int remount)
 #define ll_d_hlist_empty(list) hlist_empty(list)
 #define ll_d_hlist_entry(ptr, type, name) hlist_entry(ptr.first, type, name)
 #define ll_d_hlist_for_each(tmp, i_dentry) hlist_for_each(tmp, i_dentry)
-#define ll_d_hlist_for_each_entry(dentry, p, i_dentry, alias) \
-	p = NULL; hlist_for_each_entry(dentry, i_dentry, alias)
+#define ll_d_hlist_for_each_entry(dentry, i_dentry, alias) \
+	hlist_for_each_entry(dentry, i_dentry, alias)
 
 
 #define bio_hw_segments(q, bio) 0
diff --git a/drivers/staging/lustre/lustre/include/lustre_ha.h b/drivers/staging/lustre/lustre/include/lustre_ha.h
index 105f6d6..f3ae02b 100644
--- a/drivers/staging/lustre/lustre/include/lustre_ha.h
+++ b/drivers/staging/lustre/lustre/include/lustre_ha.h
@@ -58,9 +58,6 @@ void ptlrpc_activate_import(struct obd_import *imp);
 void ptlrpc_deactivate_import(struct obd_import *imp);
 void ptlrpc_invalidate_import(struct obd_import *imp);
 void ptlrpc_fail_import(struct obd_import *imp, __u32 conn_cnt);
-int ptlrpc_check_suspend(void);
-void ptlrpc_activate_timeouts(struct obd_import *imp);
-void ptlrpc_deactivate_timeouts(struct obd_import *imp);
 
 /** @} ha */
 
diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 91f28e3..6c5479b 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -3403,10 +3403,8 @@ int ptlrpc_del_timeout_client(struct list_head *obd_list,
 			      enum timeout_event event);
 struct ptlrpc_request * ptlrpc_prep_ping(struct obd_import *imp);
 int ptlrpc_obd_ping(struct obd_device *obd);
-cfs_time_t ptlrpc_suspend_wakeup_time(void);
 void ping_evictor_start(void);
 void ping_evictor_stop(void);
-int ptlrpc_check_and_wait_suspend(struct ptlrpc_request *req);
 void ptlrpc_pinger_ir_up(void);
 void ptlrpc_pinger_ir_down(void);
 /** @} */
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 4974bec..a8f8c1c 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -97,9 +97,6 @@ int ldlm_expired_completion_wait(void *data)
 	if (lock->l_conn_export == NULL) {
 		static cfs_time_t next_dump = 0, last_dump = 0;
 
-		if (ptlrpc_check_suspend())
-			return 0;
-
 		LCONSOLE_WARN("lock timed out (enqueued at "CFS_TIME_T", "
 			      CFS_DURATION_T"s ago)\n",
 			      lock->l_last_activity,
diff --git a/drivers/staging/lustre/lustre/libcfs/workitem.c b/drivers/staging/lustre/lustre/libcfs/workitem.c
index 1a55c81..f216deb 100644
--- a/drivers/staging/lustre/lustre/libcfs/workitem.c
+++ b/drivers/staging/lustre/lustre/libcfs/workitem.c
@@ -306,8 +306,6 @@ cfs_wi_scheduler (void *arg)
 void
 cfs_wi_sched_destroy(struct cfs_wi_sched *sched)
 {
-	int	i;
-
 	LASSERT(cfs_wi_data.wi_init);
 	LASSERT(!cfs_wi_data.wi_stopping);
 
@@ -324,18 +322,22 @@ cfs_wi_sched_destroy(struct cfs_wi_sched *sched)
 
 	spin_unlock(&cfs_wi_data.wi_glock);
 
-	i = 2;
 	wake_up_all(&sched->ws_waitq);
 
 	spin_lock(&cfs_wi_data.wi_glock);
-	while (sched->ws_nthreads > 0) {
-		CDEBUG(IS_PO2(++i) ? D_WARNING : D_NET,
-		       "waiting for %d threads of WI sched[%s] to terminate\n",
-		       sched->ws_nthreads, sched->ws_name);
+	{
+		int i = 2;
 
-		spin_unlock(&cfs_wi_data.wi_glock);
-		cfs_pause(cfs_time_seconds(1) / 20);
-		spin_lock(&cfs_wi_data.wi_glock);
+		while (sched->ws_nthreads > 0) {
+			CDEBUG(IS_PO2(++i) ? D_WARNING : D_NET,
+			       "waiting for %d threads of WI sched[%s] to "
+			       "terminate\n", sched->ws_nthreads,
+			       sched->ws_name);
+
+			spin_unlock(&cfs_wi_data.wi_glock);
+			cfs_pause(cfs_time_seconds(1) / 20);
+			spin_lock(&cfs_wi_data.wi_glock);
+		}
 	}
 
 	list_del(&sched->ws_list);
@@ -352,7 +354,6 @@ cfs_wi_sched_create(char *name, struct cfs_cpt_table *cptab,
 		    int cpt, int nthrs, struct cfs_wi_sched **sched_pp)
 {
 	struct cfs_wi_sched	*sched;
-	int			rc;
 
 	LASSERT(cfs_wi_data.wi_init);
 	LASSERT(!cfs_wi_data.wi_stopping);
@@ -373,9 +374,8 @@ cfs_wi_sched_create(char *name, struct cfs_cpt_table *cptab,
 	INIT_LIST_HEAD(&sched->ws_rerunq);
 	INIT_LIST_HEAD(&sched->ws_list);
 
-	rc = 0;
-	while (nthrs > 0)  {
-		char	name[16];
+	for (; nthrs > 0; nthrs--)  {
+		char		name[16];
 		struct task_struct *task;
 
 		spin_lock(&cfs_wi_data.wi_glock);
@@ -402,21 +402,19 @@ cfs_wi_sched_create(char *name, struct cfs_cpt_table *cptab,
 			nthrs--;
 			continue;
 		}
-		rc = PTR_ERR(task);
-
-		CERROR("Failed to create thread for WI scheduler %s: %d\n",
-		       name, rc);
-
-		spin_lock(&cfs_wi_data.wi_glock);
-
-		/* make up for cfs_wi_sched_destroy */
-		list_add(&sched->ws_list, &cfs_wi_data.wi_scheds);
-		sched->ws_starting--;
-
-		spin_unlock(&cfs_wi_data.wi_glock);
-
-		cfs_wi_sched_destroy(sched);
-		return rc;
+		if (IS_ERR(task)) {
+			int rc = PTR_ERR(task);
+			CERROR(
+			       "Failed to create thread for WI scheduler %s: %d\n",
+			       name, rc);
+			spin_lock(&cfs_wi_data.wi_glock);
+			/* make up for cfs_wi_sched_destroy */
+			list_add(&sched->ws_list, &cfs_wi_data.wi_scheds);
+			sched->ws_starting--;
+			spin_unlock(&cfs_wi_data.wi_glock);
+			cfs_wi_sched_destroy(sched);
+			return rc;
+		}
 	}
 	spin_lock(&cfs_wi_data.wi_glock);
 	list_add(&sched->ws_list, &cfs_wi_data.wi_scheds);
diff --git a/drivers/staging/lustre/lustre/llite/dcache.c b/drivers/staging/lustre/lustre/llite/dcache.c
index e7629be..cc7befe 100644
--- a/drivers/staging/lustre/lustre/llite/dcache.c
+++ b/drivers/staging/lustre/lustre/llite/dcache.c
@@ -270,7 +270,6 @@ void ll_intent_release(struct lookup_intent *it)
 void ll_invalidate_aliases(struct inode *inode)
 {
 	struct dentry *dentry;
-	struct ll_d_hlist_node *p;
 
 	LASSERT(inode != NULL);
 
@@ -278,7 +277,7 @@ void ll_invalidate_aliases(struct inode *inode)
 	       inode->i_ino, inode->i_generation, inode);
 
 	ll_lock_dcache(inode);
-	ll_d_hlist_for_each_entry(dentry, p, &inode->i_dentry, d_alias) {
+	ll_d_hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
 		CDEBUG(D_DENTRY, "dentry in drop %.*s (%p) parent %p "
 		       "inode %p flags %d\n", dentry->d_name.len,
 		       dentry->d_name.name, dentry, dentry->d_parent,
@@ -404,7 +403,6 @@ int ll_revalidate_it(struct dentry *de, int lookup_flags,
 		struct inode *inode = de->d_inode;
 		struct ll_inode_info *lli = ll_i2info(inode);
 		struct obd_client_handle **och_p;
-		__u64 *och_usecount;
 		__u64 ibits;
 
 		/*
@@ -417,38 +415,30 @@ int ll_revalidate_it(struct dentry *de, int lookup_flags,
 		 * change, LOOKUP lock is revoked.
 		 */
 
-
-		if (it->it_flags & FMODE_WRITE) {
+		if (it->it_flags & FMODE_WRITE)
 			och_p = &lli->lli_mds_write_och;
-			och_usecount = &lli->lli_open_fd_write_count;
-		} else if (it->it_flags & FMODE_EXEC) {
+		else if (it->it_flags & FMODE_EXEC)
 			och_p = &lli->lli_mds_exec_och;
-			och_usecount = &lli->lli_open_fd_exec_count;
-		} else {
+		else
 			och_p = &lli->lli_mds_read_och;
-			och_usecount = &lli->lli_open_fd_read_count;
-		}
 		/* Check for the proper lock. */
 		ibits = MDS_INODELOCK_LOOKUP;
 		if (!ll_have_md_lock(inode, &ibits, LCK_MINMODE))
 			goto do_lock;
 		mutex_lock(&lli->lli_och_mutex);
 		if (*och_p) { /* Everything is open already, do nothing */
-			/*(*och_usecount)++;  Do not let them steal our open
-			  handle from under us */
-			SET_BUT_UNUSED(och_usecount);
-			/* XXX The code above was my original idea, but in case
-			   we have the handle, but we cannot use it due to later
-			   checks (e.g. O_CREAT|O_EXCL flags set), nobody
-			   would decrement counter increased here. So we just
-			   hope the lock won't be invalidated in between. But
-			   if it would be, we'll reopen the open request to
-			   MDS later during file open path */
+			/* Originally it was idea to do not let them steal our
+			 * open handle from under us by (*och_usecount)++ here.
+			 * But in case we have the handle, but we cannot use it
+			 * due to later checks (e.g. O_CREAT|O_EXCL flags set),
+			 * nobody would decrement counter increased here. So we
+			 * just hope the lock won't be invalidated in between.
+			 * But if it would be, we'll reopen the open request to
+			 * MDS later during file open path. */
 			mutex_unlock(&lli->lli_och_mutex);
 			return 1;
-		} else {
-			mutex_unlock(&lli->lli_och_mutex);
 		}
+		mutex_unlock(&lli->lli_och_mutex);
 	}
 
 	if (it->it_op == IT_GETATTR) {
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index e47e2b1..0000530 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -172,10 +172,9 @@ struct inode *ll_iget(struct super_block *sb, ino_t hash,
 static void ll_invalidate_negative_children(struct inode *dir)
 {
 	struct dentry *dentry, *tmp_subdir;
-	struct ll_d_hlist_node *p;
 
 	ll_lock_dcache(dir);
-	ll_d_hlist_for_each_entry(dentry, p, &dir->i_dentry, d_alias) {
+	ll_d_hlist_for_each_entry(dentry, &dir->i_dentry, d_alias) {
 		spin_lock(&dentry->d_lock);
 		if (!list_empty(&dentry->d_subdirs)) {
 			struct dentry *child;
@@ -352,7 +351,6 @@ void ll_i2gids(__u32 *suppgids, struct inode *i1, struct inode *i2)
 static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
 {
 	struct dentry *alias, *discon_alias, *invalid_alias;
-	struct ll_d_hlist_node *p;
 
 	if (ll_d_hlist_empty(&inode->i_dentry))
 		return NULL;
@@ -360,7 +358,7 @@ static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
 	discon_alias = invalid_alias = NULL;
 
 	ll_lock_dcache(inode);
-	ll_d_hlist_for_each_entry(alias, p, &inode->i_dentry, d_alias) {
+	ll_d_hlist_for_each_entry(alias, &inode->i_dentry, d_alias) {
 		LASSERT(alias != dentry);
 
 		spin_lock(&alias->d_lock);
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 0f0750b..8828419 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -420,7 +420,6 @@ void lprocfs_stats_collect(struct lprocfs_stats *stats, int idx,
 {
 	unsigned int			num_entry;
 	struct lprocfs_counter		*percpu_cntr;
-	struct lprocfs_counter_header	*cntr_header;
 	int				i;
 	unsigned long			flags = 0;
 
@@ -439,7 +438,6 @@ void lprocfs_stats_collect(struct lprocfs_stats *stats, int idx,
 	for (i = 0; i < num_entry; i++) {
 		if (stats->ls_percpu[i] == NULL)
 			continue;
-		cntr_header = &stats->ls_cnt_header[idx];
 		percpu_cntr = lprocfs_stats_counter_get(stats, i, idx);
 
 		cnt->lc_count += percpu_cntr->lc_count;
@@ -999,7 +997,6 @@ EXPORT_SYMBOL(lprocfs_free_stats);
 void lprocfs_clear_stats(struct lprocfs_stats *stats)
 {
 	struct lprocfs_counter		*percpu_cntr;
-	struct lprocfs_counter_header	*header;
 	int				i;
 	int				j;
 	unsigned int			num_entry;
@@ -1011,7 +1008,6 @@ void lprocfs_clear_stats(struct lprocfs_stats *stats)
 		if (stats->ls_percpu[i] == NULL)
 			continue;
 		for (j = 0; j < stats->ls_num; j++) {
-			header = &stats->ls_cnt_header[j];
 			percpu_cntr = lprocfs_stats_counter_get(stats, i, j);
 			percpu_cntr->lc_count		= 0;
 			percpu_cntr->lc_min		= LC_MIN_INIT;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index d90efe4..684000e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -2289,7 +2289,6 @@ EXPORT_SYMBOL(ptlrpc_req_xid);
 int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async)
 {
 	int		rc;
-	wait_queue_head_t       *wq;
 	struct l_wait_info lwi;
 
 	/*
@@ -2334,12 +2333,11 @@ int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async)
 	 * a chance to run reply_in_callback(), and to make sure we've
 	 * unlinked before returning a req to the pool.
 	 */
-	if (request->rq_set != NULL)
-		wq = &request->rq_set->set_waitq;
-	else
-		wq = &request->rq_reply_waitq;
-
 	for (;;) {
+		/* The wq argument is ignored by user-space wait_event macros */
+		wait_queue_head_t *wq = (request->rq_set != NULL) ?
+				  &request->rq_set->set_waitq :
+				  &request->rq_reply_waitq;
 		/* Network access will complete in finite time but the HUGE
 		 * timeout lets us CWARN for visibility of sluggish NALs */
 		lwi = LWI_TIMEOUT_INTERVAL(cfs_time_seconds(LONG_UNLINK),
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index 283173a..f465547 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -170,7 +170,6 @@ int ptlrpc_set_import_discon(struct obd_import *imp, __u32 conn_cnt)
 			       target_len, target_start,
 			       libcfs_nid2str(imp->imp_connection->c_peer.nid));
 		}
-		ptlrpc_deactivate_timeouts(imp);
 		IMPORT_SET_STATE_NOLOCK(imp, LUSTRE_IMP_DISCON);
 		spin_unlock(&imp->imp_lock);
 
@@ -383,7 +382,6 @@ void ptlrpc_activate_import(struct obd_import *imp)
 
 	spin_lock(&imp->imp_lock);
 	imp->imp_invalid = 0;
-	ptlrpc_activate_timeouts(imp);
 	spin_unlock(&imp->imp_lock);
 	obd_import_event(obd, imp, IMP_EVENT_ACTIVE);
 }
diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index 5f2aa7a..499e4be 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -243,7 +243,6 @@ EXPORT_SYMBOL(ptlrpc_register_bulk);
 int ptlrpc_unregister_bulk(struct ptlrpc_request *req, int async)
 {
 	struct ptlrpc_bulk_desc *desc = req->rq_bulk;
-	wait_queue_head_t	     *wq;
 	struct l_wait_info       lwi;
 	int		      rc;
 
@@ -275,12 +274,11 @@ int ptlrpc_unregister_bulk(struct ptlrpc_request *req, int async)
 	if (async)
 		return 0;
 
-	if (req->rq_set != NULL)
-		wq = &req->rq_set->set_waitq;
-	else
-		wq = &req->rq_reply_waitq;
-
 	for (;;) {
+		/* The wq argument is ignored by user-space wait_event macros */
+		wait_queue_head_t *wq = (req->rq_set != NULL) ?
+				  &req->rq_set->set_waitq :
+				  &req->rq_reply_waitq;
 		/* Network access will complete in finite time but the HUGE
 		 * timeout lets us CWARN for visibility of sluggish NALs */
 		lwi = LWI_TIMEOUT_INTERVAL(cfs_time_seconds(LONG_UNLINK),
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pinger.c b/drivers/staging/lustre/lustre/ptlrpc/pinger.c
index 7c05e82..aaceafa 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pinger.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pinger.c
@@ -141,9 +141,6 @@ static inline int ptlrpc_next_reconnect(struct obd_import *imp)
 		return cfs_time_shift(obd_timeout);
 }
 
-static atomic_t suspend_timeouts = ATOMIC_INIT(0);
-static cfs_time_t suspend_wakeup_time = 0;
-
 cfs_duration_t pinger_check_timeout(cfs_time_t time)
 {
 	struct timeout_item *item;
@@ -163,67 +160,6 @@ cfs_duration_t pinger_check_timeout(cfs_time_t time)
 					 cfs_time_current());
 }
 
-static wait_queue_head_t suspend_timeouts_waitq;
-
-cfs_time_t ptlrpc_suspend_wakeup_time(void)
-{
-	return suspend_wakeup_time;
-}
-
-void ptlrpc_deactivate_timeouts(struct obd_import *imp)
-{
-	/*XXX: disabled for now, will be replaced by adaptive timeouts */
-#if 0
-	if (imp->imp_no_timeout)
-		return;
-	imp->imp_no_timeout = 1;
-	atomic_inc(&suspend_timeouts);
-	CDEBUG(D_HA|D_WARNING, "deactivate timeouts %u\n",
-	       atomic_read(&suspend_timeouts));
-#endif
-}
-
-void ptlrpc_activate_timeouts(struct obd_import *imp)
-{
-	/*XXX: disabled for now, will be replaced by adaptive timeouts */
-#if 0
-	if (!imp->imp_no_timeout)
-		return;
-	imp->imp_no_timeout = 0;
-	LASSERT(atomic_read(&suspend_timeouts) > 0);
-	if (atomic_dec_and_test(&suspend_timeouts)) {
-		suspend_wakeup_time = cfs_time_current();
-		wake_up(&suspend_timeouts_waitq);
-	}
-	CDEBUG(D_HA|D_WARNING, "activate timeouts %u\n",
-	       atomic_read(&suspend_timeouts));
-#endif
-}
-
-int ptlrpc_check_suspend(void)
-{
-	if (atomic_read(&suspend_timeouts))
-		return 1;
-	return 0;
-}
-
-int ptlrpc_check_and_wait_suspend(struct ptlrpc_request *req)
-{
-	struct l_wait_info lwi;
-
-	if (atomic_read(&suspend_timeouts)) {
-		DEBUG_REQ(D_NET, req, "-- suspend %d regular timeout",
-			  atomic_read(&suspend_timeouts));
-		lwi = LWI_INTR(NULL, NULL);
-		l_wait_event(suspend_timeouts_waitq,
-			     atomic_read(&suspend_timeouts) == 0, &lwi);
-		DEBUG_REQ(D_NET, req, "-- recharge regular timeout");
-		return 1;
-	}
-	return 0;
-}
-
-
 static bool ir_up;
 
 void ptlrpc_pinger_ir_up(void)
@@ -378,7 +314,6 @@ int ptlrpc_start_pinger(void)
 		return -EALREADY;
 
 	init_waitqueue_head(&pinger_thread.t_ctl_waitq);
-	init_waitqueue_head(&suspend_timeouts_waitq);
 
 	strcpy(pinger_thread.t_name, "ll_ping");
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index 2ebdb1b..8a95fa5 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -600,7 +600,6 @@ static int ptlrpcd_bind(int index, int max)
 int ptlrpcd_start(int index, int max, const char *name, struct ptlrpcd_ctl *pc)
 {
 	int rc;
-	int env = 0;
 
 	/*
 	 * Do not allow start second thread for one pc.
@@ -619,6 +618,7 @@ int ptlrpcd_start(int index, int max, const char *name, struct ptlrpcd_ctl *pc)
 	pc->pc_set = ptlrpc_prep_set();
 	if (pc->pc_set == NULL)
 		GOTO(out, rc = -ENOMEM);
+
 	/*
 	 * So far only "client" ptlrpcd uses an environment. In the future,
 	 * ptlrpcd thread (or a thread-set) has to be given an argument,
@@ -626,40 +626,40 @@ int ptlrpcd_start(int index, int max, const char *name, struct ptlrpcd_ctl *pc)
 	 */
 	rc = lu_context_init(&pc->pc_env.le_ctx, LCT_CL_THREAD|LCT_REMEMBER);
 	if (rc != 0)
-		GOTO(out, rc);
+		GOTO(out_set, rc);
 
-	env = 1;
 	{
 		struct task_struct *task;
-
 		if (index >= 0) {
 			rc = ptlrpcd_bind(index, max);
 			if (rc < 0)
-				GOTO(out, rc);
+				GOTO(out_env, rc);
 		}
 
-		task = kthread_run(ptlrpcd, pc, "%s", pc->pc_name);
+		task = kthread_run(ptlrpcd, pc, pc->pc_name);
 		if (IS_ERR(task))
-			GOTO(out, rc = PTR_ERR(task));
+			GOTO(out_env, rc = PTR_ERR(task));
 
-		rc = 0;
 		wait_for_completion(&pc->pc_starting);
 	}
-out:
-	if (rc) {
-		if (pc->pc_set != NULL) {
-			struct ptlrpc_request_set *set = pc->pc_set;
-
-			spin_lock(&pc->pc_lock);
-			pc->pc_set = NULL;
-			spin_unlock(&pc->pc_lock);
-			ptlrpc_set_destroy(set);
-		}
-		if (env != 0)
-			lu_context_fini(&pc->pc_env.le_ctx);
-		clear_bit(LIOD_BIND, &pc->pc_flags);
-		clear_bit(LIOD_START, &pc->pc_flags);
+	return 0;
+
+out_env:
+	lu_context_fini(&pc->pc_env.le_ctx);
+
+out_set:
+	if (pc->pc_set != NULL) {
+		struct ptlrpc_request_set *set = pc->pc_set;
+
+		spin_lock(&pc->pc_lock);
+		pc->pc_set = NULL;
+		spin_unlock(&pc->pc_lock);
+		ptlrpc_set_destroy(set);
 	}
+	clear_bit(LIOD_BIND, &pc->pc_flags);
+
+out:
+	clear_bit(LIOD_START, &pc->pc_flags);
 	return rc;
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 13/16] staging/lustre/build: fix compilation issue with is_compat_task
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (11 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 12/16] staging/lustre/build: clean up unused variables and dead code Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 14/16] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer Peng Tao
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Dmitry Eremin, Peng Tao, Andreas Dilger

From: Dmitry Eremin <dmitry.eremin@intel.com>

After removing LIBCFS_HAVE_IS_COMPAT_TASK test we have a
compilation issue with kernels configured without CONFIG_COMPAT.

Lustre-change: http://review.whamcloud.com/7118
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2800
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/lustre/llite/llite_internal.h   |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index c326ff2..b6a22b6 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -650,7 +650,12 @@ static inline int ll_need_32bit_api(struct ll_sb_info *sbi)
 #if BITS_PER_LONG == 32
 	return 1;
 #else
-	return unlikely(is_compat_task() || (sbi->ll_flags & LL_SBI_32BIT_API));
+	return unlikely(
+#ifdef CONFIG_COMPAT
+		is_compat_task() ||
+#endif
+		(sbi->ll_flags & LL_SBI_32BIT_API)
+	);
 #endif
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 14/16] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (12 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 13/16] staging/lustre/build: fix compilation issue with is_compat_task Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH 15/16] staging/lustre/hsm: Add hsm_release feature Peng Tao
  2013-11-26  2:05 ` [PATCH] staging/lustre/llite: extended attribute cache Peng Tao
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Amir Shehata, Peng Tao, Andreas Dilger

From: Amir Shehata <amir.shehata@intel.com>

When a system runs out of memory and the function
ptlrpc_register_bulk() is called from ptl_send_rpc() the call to
LNetMEAttach() fails due to failure to allocate memory.  This forces
the code into an error path, which most probably previously went
untested.  The error path:
if (rc != 0) {
        CERROR("%s: LNetMEAttach failed x"LPU64"/%d: rc = %dn",
                desc->bd_export->exp_obd->obd_name, xid,
                posted_md, rc);
        break;
}
This print assumes that desc->bd_export is not NULL.  However, it is.
In fact it is expected to be NULL.  desc->bd_import is the correct
structure to access in this case.

Lustre-change: http://review.whamcloud.com/7121
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3585
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index 499e4be..23b259d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -180,7 +180,7 @@ int ptlrpc_register_bulk(struct ptlrpc_request *req)
 				  LNET_UNLINK, LNET_INS_AFTER, &me_h);
 		if (rc != 0) {
 			CERROR("%s: LNetMEAttach failed x"LPU64"/%d: rc = %d\n",
-			       desc->bd_export->exp_obd->obd_name, xid,
+			       desc->bd_import->imp_obd->obd_name, xid,
 			       posted_md, rc);
 			break;
 		}
@@ -190,7 +190,7 @@ int ptlrpc_register_bulk(struct ptlrpc_request *req)
 				  &desc->bd_mds[posted_md]);
 		if (rc != 0) {
 			CERROR("%s: LNetMDAttach failed x"LPU64"/%d: rc = %d\n",
-			       desc->bd_export->exp_obd->obd_name, xid,
+			       desc->bd_import->imp_obd->obd_name, xid,
 			       posted_md, rc);
 			rc2 = LNetMEUnlink(me_h);
 			LASSERT(rc2 == 0);
@@ -220,7 +220,7 @@ int ptlrpc_register_bulk(struct ptlrpc_request *req)
 	/* Holler if peer manages to touch buffers before he knows the xid */
 	if (desc->bd_md_count != total_md)
 		CWARN("%s: Peer %s touched %d buffers while I registered\n",
-		      desc->bd_export->exp_obd->obd_name, libcfs_id2str(peer),
+		      desc->bd_import->imp_obd->obd_name, libcfs_id2str(peer),
 		      total_md - desc->bd_md_count);
 	spin_unlock(&desc->bd_lock);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 15/16] staging/lustre/hsm: Add hsm_release feature.
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (13 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 14/16] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  2013-11-26  2:05 ` [PATCH] staging/lustre/llite: extended attribute cache Peng Tao
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Jinshan Xiong, Aurelien Degremont, Peng Tao,
	Andreas Dilger

From: Jinshan Xiong <jinshan.xiong@intel.com>

HSM Release is one of the key feature of HSM. To perform HSM
release, clients need to acquire the file lease exclusivelt and
flush dirty cache from clients. A special close REQ will be sent
to the MDT to release the lease and get rid of OST objects.

Lustre-change: http://review.whamcloud.com/7028
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1333
Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   14 +++-
 .../lustre/lustre/include/lustre/lustre_user.h     |   11 ++-
 .../lustre/lustre/include/lustre_req_layout.h      |    2 +
 drivers/staging/lustre/lustre/include/obd.h        |    6 +-
 .../staging/lustre/lustre/lclient/lcommon_misc.c   |    4 +-
 drivers/staging/lustre/lustre/llite/dir.c          |   24 +++++-
 drivers/staging/lustre/lustre/llite/file.c         |   87 +++++++++++++++++---
 .../staging/lustre/lustre/llite/llite_internal.h   |    3 +-
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    2 +-
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |   16 ++++
 drivers/staging/lustre/lustre/lov/lov_object.c     |   35 +++++---
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   26 +++++-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   23 +++++-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   19 +++++
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    7 ++
 15 files changed, 242 insertions(+), 37 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index e4cb1ad..c173bba 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1762,6 +1762,7 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 #define OBD_MD_FLRMTRGETFACL (0x0008000000000000ULL) /* lfs rgetfacl case */
 
 #define OBD_MD_FLDATAVERSION (0x0010000000000000ULL) /* iversion sum */
+#define OBD_MD_FLRELEASED    (0x0020000000000000ULL) /* file released */
 
 #define OBD_MD_FLGETATTR (OBD_MD_FLID    | OBD_MD_FLATIME | OBD_MD_FLMTIME | \
 			  OBD_MD_FLCTIME | OBD_MD_FLSIZE  | OBD_MD_FLBLKSZ | \
@@ -2398,6 +2399,7 @@ extern void lustre_swab_mdt_rec_setattr (struct mdt_rec_setattr *sa);
 					      * delegation, succeed if it's not
 					      * being opened with conflict mode.
 					      */
+#define MDS_OPEN_RELEASE   02000000000000ULL /* Open the file for HSM release */
 
 /* permission for create non-directory file */
 #define MAY_CREATE      (1 << 7)
@@ -2416,7 +2418,7 @@ extern void lustre_swab_mdt_rec_setattr (struct mdt_rec_setattr *sa);
 /* lfs rgetfacl permission check */
 #define MAY_RGETFACL    (1 << 14)
 
-enum {
+enum mds_op_bias {
 	MDS_CHECK_SPLIT		= 1 << 0,
 	MDS_CROSS_REF		= 1 << 1,
 	MDS_VTX_BYPASS		= 1 << 2,
@@ -2429,6 +2431,7 @@ enum {
 	MDS_DATA_MODIFIED	= 1 << 9,
 	MDS_CREATE_VOLATILE	= 1 << 10,
 	MDS_OWNEROVERRIDE	= 1 << 11,
+	MDS_HSM_RELEASE		= 1 << 12,
 };
 
 /* instance of mdt_reint_rec */
@@ -3752,5 +3755,14 @@ struct mdc_swap_layouts {
 
 void lustre_swab_swap_layouts(struct mdc_swap_layouts *msl);
 
+struct close_data {
+	struct lustre_handle	cd_handle;
+	struct lu_fid		cd_fid;
+	__u64			cd_data_version;
+	__u64			cd_reserved[8];
+};
+
+void lustre_swab_close_data(struct close_data *data);
+
 #endif
 /** @} lustreidl */
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 631f026..c7abb80 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -623,10 +623,13 @@ struct if_quotactl {
 };
 
 /* swap layout flags */
-#define	SWAP_LAYOUTS_CHECK_DV1		(1 << 0)
-#define	SWAP_LAYOUTS_CHECK_DV2		(1 << 1)
-#define	SWAP_LAYOUTS_KEEP_MTIME		(1 << 2)
-#define	SWAP_LAYOUTS_KEEP_ATIME		(1 << 3)
+#define SWAP_LAYOUTS_CHECK_DV1		(1 << 0)
+#define SWAP_LAYOUTS_CHECK_DV2		(1 << 1)
+#define SWAP_LAYOUTS_KEEP_MTIME		(1 << 2)
+#define SWAP_LAYOUTS_KEEP_ATIME		(1 << 3)
+
+/* Swap XATTR_NAME_HSM as well, only on the MDT so far */
+#define SWAP_LAYOUTS_MDS_HSM		(1 << 31)
 struct lustre_swap_layouts {
 	__u64	sl_flags;
 	__u32	sl_fd;
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index 2832569..a75f4c6 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -164,6 +164,7 @@ extern struct req_format RQF_UPDATE_OBJ;
  */
 extern struct req_format RQF_MDS_GETATTR_NAME;
 extern struct req_format RQF_MDS_CLOSE;
+extern struct req_format RQF_MDS_RELEASE_CLOSE;
 extern struct req_format RQF_MDS_PIN;
 extern struct req_format RQF_MDS_UNPIN;
 extern struct req_format RQF_MDS_CONNECT;
@@ -262,6 +263,7 @@ extern struct req_msg_field RMF_GETINFO_VAL;
 extern struct req_msg_field RMF_GETINFO_VALLEN;
 extern struct req_msg_field RMF_GETINFO_KEY;
 extern struct req_msg_field RMF_IDX_INFO;
+extern struct req_msg_field RMF_CLOSE_DATA;
 
 /*
  * connection handle received in MDS_CONNECT request.
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f0c1773..3247d1d 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -1070,7 +1070,7 @@ struct md_op_data {
 	struct obd_capa	*op_capa2;
 
 	/* Various operation flags. */
-	__u32		   op_bias;
+	enum mds_op_bias        op_bias;
 
 	/* Operation type */
 	__u32		   op_opc;
@@ -1084,6 +1084,10 @@ struct md_op_data {
 	/* used to transfer info between the stacks of MD client
 	 * see enum op_cli_flags */
 	__u32			op_cli_flags;
+
+	/* File object data version for HSM release, on client */
+	__u64			op_data_version;
+	struct lustre_handle	op_lease_handle;
 };
 
 enum op_cli_flags {
diff --git a/drivers/staging/lustre/lustre/lclient/lcommon_misc.c b/drivers/staging/lustre/lustre/lclient/lcommon_misc.c
index 2b4dbee..e04c2d3 100644
--- a/drivers/staging/lustre/lustre/lclient/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/lclient/lcommon_misc.c
@@ -140,7 +140,9 @@ int cl_get_grouplock(struct cl_object *obj, unsigned long gid, int nonblock,
 
 	rc = cl_io_init(env, io, CIT_MISC, io->ci_obj);
 	if (rc) {
-		LASSERT(rc < 0);
+		/* Does not make sense to take GL for released layout */
+		if (rc > 0)
+			rc = -ENOTSUPP;
 		cl_env_put(env, &refcheck);
 		return rc;
 	}
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 1f07903..22d0acc9 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1809,8 +1809,28 @@ out_rmdir:
 			return -EFAULT;
 		}
 
-		rc = obd_iocontrol(cmd, ll_i2mdexp(inode), totalsize,
-				   hur, NULL);
+		if (hur->hur_request.hr_action == HUA_RELEASE) {
+			const struct lu_fid *fid;
+			struct inode *f;
+			int i;
+
+			for (i = 0; i < hur->hur_request.hr_itemcount; i++) {
+				fid = &hur->hur_user_item[i].hui_fid;
+				f = search_inode_for_lustre(inode->i_sb, fid);
+				if (IS_ERR(f)) {
+					rc = PTR_ERR(f);
+					break;
+				}
+
+				rc = ll_hsm_release(f);
+				iput(f);
+				if (rc != 0)
+					break;
+			}
+		} else {
+			rc = obd_iocontrol(cmd, ll_i2mdexp(inode), totalsize,
+					   hur, NULL);
+		}
 
 		OBD_FREE_LARGE(hur, totalsize);
 
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index f36c5d8..7a93936 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -115,7 +115,8 @@ out:
 
 static int ll_close_inode_openhandle(struct obd_export *md_exp,
 				     struct inode *inode,
-				     struct obd_client_handle *och)
+				     struct obd_client_handle *och,
+				     const __u64 *data_version)
 {
 	struct obd_export *exp = ll_i2mdexp(inode);
 	struct md_op_data *op_data;
@@ -139,6 +140,13 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		GOTO(out, rc = -ENOMEM); // XXX We leak openhandle and request here.
 
 	ll_prepare_close(inode, op_data, och);
+	if (data_version != NULL) {
+		/* Pass in data_version implies release. */
+		op_data->op_bias |= MDS_HSM_RELEASE;
+		op_data->op_data_version = *data_version;
+		op_data->op_lease_handle = och->och_lease_handle;
+		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
+	}
 	epoch_close = (op_data->op_flags & MF_EPOCH_CLOSE);
 	rc = md_close(md_exp, op_data, och->och_mod, &req);
 	if (rc == -EAGAIN) {
@@ -167,14 +175,20 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		spin_unlock(&lli->lli_lock);
 	}
 
-	ll_finish_md_op_data(op_data);
-
 	if (rc == 0) {
 		rc = ll_objects_destroy(req, inode);
 		if (rc)
 			CERROR("inode %lu ll_objects destroy: rc = %d\n",
 			       inode->i_ino, rc);
 	}
+	if (rc == 0 && op_data->op_bias & MDS_HSM_RELEASE) {
+		struct mdt_body *body;
+		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
+		if (!(body->valid & OBD_MD_FLRELEASED))
+			rc = -EBUSY;
+	}
+
+	ll_finish_md_op_data(op_data);
 
 out:
 	if (exp_connect_som(exp) && !epoch_close &&
@@ -224,7 +238,7 @@ int ll_md_real_close(struct inode *inode, int flags)
 	if (och) { /* There might be a race and somebody have freed this och
 		      already */
 		rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-					       inode, och);
+					       inode, och, NULL);
 	}
 
 	return rc;
@@ -254,7 +268,7 @@ int ll_md_close(struct obd_export *md_exp, struct inode *inode,
 	}
 
 	if (fd->fd_och != NULL) {
-		rc = ll_close_inode_openhandle(md_exp, inode, fd->fd_och);
+		rc = ll_close_inode_openhandle(md_exp, inode, fd->fd_och, NULL);
 		fd->fd_och = NULL;
 		GOTO(out, rc);
 	}
@@ -718,7 +732,7 @@ static int ll_md_blocking_lease_ast(struct ldlm_lock *lock,
  * Acquire a lease and open the file.
  */
 struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
-					fmode_t fmode)
+					fmode_t fmode, __u64 open_flags)
 {
 	struct lookup_intent it = { .it_op = IT_OPEN };
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
@@ -786,7 +800,8 @@ struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
 	/* To tell the MDT this openhandle is from the same owner */
 	op_data->op_handle = old_handle;
 
-	it.it_flags = fmode | MDS_OPEN_LOCK | MDS_OPEN_BY_FID | MDS_OPEN_LEASE;
+	it.it_flags = fmode | open_flags;
+	it.it_flags |= MDS_OPEN_LOCK | MDS_OPEN_BY_FID | MDS_OPEN_LEASE;
 	rc = md_intent_lock(sbi->ll_md_exp, op_data, NULL, 0, &it, 0, &req,
 				ll_md_blocking_lease_ast,
 	/* LDLM_FL_NO_LRU: To not put the lease lock into LRU list, otherwise
@@ -832,7 +847,7 @@ struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
 	return och;
 
 out_close:
-	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, inode, och);
+	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, inode, och, NULL);
 	if (rc2)
 		CERROR("Close openhandle returned %d\n", rc2);
 
@@ -877,7 +892,8 @@ int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 	if (lease_broken != NULL)
 		*lease_broken = cancelled;
 
-	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och);
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och,
+				       NULL);
 	return rc;
 }
 EXPORT_SYMBOL(ll_lease_close);
@@ -1686,8 +1702,8 @@ int ll_release_openhandle(struct dentry *dentry, struct lookup_intent *it)
 	ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
 
 	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-				       inode, och);
- out:
+				       inode, och, NULL);
+out:
 	/* this one is in place of ll_file_open */
 	if (it_disposition(it, DISP_ENQ_OPEN_REF)) {
 		ptlrpc_req_finished(it->d.lustre.it_data);
@@ -1892,6 +1908,53 @@ out:
 	return rc;
 }
 
+/*
+ * Trigger a HSM release request for the provided inode.
+ */
+int ll_hsm_release(struct inode *inode)
+{
+	struct cl_env_nest nest;
+	struct lu_env *env;
+	struct obd_client_handle *och = NULL;
+	__u64 data_version = 0;
+	int rc;
+
+
+	CDEBUG(D_INODE, "%s: Releasing file "DFID".\n",
+	       ll_get_fsname(inode->i_sb, NULL, 0),
+	       PFID(&ll_i2info(inode)->lli_fid));
+
+	och = ll_lease_open(inode, NULL, FMODE_WRITE, MDS_OPEN_RELEASE);
+	if (IS_ERR(och))
+		GOTO(out, rc = PTR_ERR(och));
+
+	/* Grab latest data_version and [am]time values */
+	rc = ll_data_version(inode, &data_version, 1);
+	if (rc != 0)
+		GOTO(out, rc);
+
+	env = cl_env_nested_get(&nest);
+	if (IS_ERR(env))
+		GOTO(out, rc = PTR_ERR(env));
+
+	ll_merge_lvb(env, inode);
+	cl_env_nested_put(&nest, env);
+
+	/* Release the file.
+	 * NB: lease lock handle is released in mdc_hsm_release_pack() because
+	 * we still need it to pack l_remote_handle to MDT. */
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och,
+				       &data_version);
+	och = NULL;
+
+
+out:
+	if (och != NULL && !IS_ERR(och)) /* close the file */
+		ll_lease_close(och, inode, NULL);
+
+	return rc;
+}
+
 struct ll_swap_stack {
 	struct iattr		 ia1, ia2;
 	__u64			 dv1, dv2;
@@ -2319,7 +2382,7 @@ long ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		CDEBUG(D_INODE, "Set lease with mode %d\n", mode);
 
 		/* apply for lease */
-		och = ll_lease_open(inode, file, mode);
+		och = ll_lease_open(inode, file, mode, 0);
 		if (IS_ERR(och))
 			return PTR_ERR(och);
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index b6a22b6..e479a69 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -787,9 +787,10 @@ int ll_get_grouplock(struct inode *inode, struct file *file, unsigned long arg);
 int ll_put_grouplock(struct inode *inode, struct file *file, unsigned long arg);
 int ll_fid2path(struct inode *inode, void *arg);
 int ll_data_version(struct inode *inode, __u64 *data_version, int extent_lock);
+int ll_hsm_release(struct inode *inode);
 
 struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
-					fmode_t mode);
+					fmode_t mode, __u64 flags);
 int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 		   bool *lease_broken);
 
diff --git a/drivers/staging/lustre/lustre/llite/vvp_object.c b/drivers/staging/lustre/lustre/llite/vvp_object.c
index 33173fc..25973de 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_object.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_object.c
@@ -138,7 +138,7 @@ int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 			lli->lli_layout_gen,
 			conf->u.coc_md->lsm->lsm_layout_gen);
 
-		lli->lli_has_smd = true;
+		lli->lli_has_smd = lsm_has_objects(conf->u.coc_md->lsm);
 		lli->lli_layout_gen = conf->u.coc_md->lsm->lsm_layout_gen;
 	} else {
 		CDEBUG(D_VFSTRACE, "layout lock destroyed: %u.\n",
diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index 4276124..3965d5e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -168,6 +168,22 @@ enum lov_layout_type {
 	LLT_NR
 };
 
+static inline char *llt2str(enum lov_layout_type llt)
+{
+	switch (llt) {
+	case LLT_EMPTY:
+		return "EMPTY";
+	case LLT_RAID0:
+		return "RAID0";
+	case LLT_RELEASED:
+		return "RELEASED";
+	case LLT_NR:
+		LBUG();
+	}
+	LBUG();
+	return "";
+}
+
 /**
  * lov-specific file state.
  *
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index cf2fa8a..df8b5b5 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -393,13 +393,13 @@ static int lov_print_empty(const struct lu_env *env, void *cookie,
 static int lov_print_raid0(const struct lu_env *env, void *cookie,
 			   lu_printer_t p, const struct lu_object *o)
 {
-	struct lov_object       *lov = lu2lov(o);
-	struct lov_layout_raid0 *r0  = lov_r0(lov);
-	struct lov_stripe_md    *lsm = lov->lo_lsm;
-	int i;
+	struct lov_object	*lov = lu2lov(o);
+	struct lov_layout_raid0	*r0  = lov_r0(lov);
+	struct lov_stripe_md	*lsm = lov->lo_lsm;
+	int			 i;
 
-	(*p)(env, cookie, "stripes: %d, %svalid, lsm{%p 0x%08X %d %u %u}: \n",
-		r0->lo_nr, lov->lo_layout_invalid ? "in" : "", lsm,
+	(*p)(env, cookie, "stripes: %d, %s, lsm{%p 0x%08X %d %u %u}:\n",
+		r0->lo_nr, lov->lo_layout_invalid ? "invalid" : "valid", lsm,
 		lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
 		lsm->lsm_stripe_count, lsm->lsm_layout_gen);
 	for (i = 0; i < r0->lo_nr; ++i) {
@@ -408,8 +408,9 @@ static int lov_print_raid0(const struct lu_env *env, void *cookie,
 		if (r0->lo_sub[i] != NULL) {
 			sub = lovsub2lu(r0->lo_sub[i]);
 			lu_object_print(env, cookie, p, sub);
-		} else
+		} else {
 			(*p)(env, cookie, "sub %d absent\n", i);
+		}
 	}
 	return 0;
 }
@@ -417,7 +418,14 @@ static int lov_print_raid0(const struct lu_env *env, void *cookie,
 static int lov_print_released(const struct lu_env *env, void *cookie,
 				lu_printer_t p, const struct lu_object *o)
 {
-	(*p)(env, cookie, "released\n");
+	struct lov_object	*lov = lu2lov(o);
+	struct lov_stripe_md	*lsm = lov->lo_lsm;
+
+	(*p)(env, cookie,
+		"released: %s, lsm{%p 0x%08X %d %u %u}:\n",
+		lov->lo_layout_invalid ? "invalid" : "valid", lsm,
+		lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
+		lsm->lsm_stripe_count, lsm->lsm_layout_gen);
 	return 0;
 }
 
@@ -662,6 +670,10 @@ static int lov_layout_change(const struct lu_env *unused,
 		return PTR_ERR(env);
 	}
 
+	CDEBUG(D_INODE, DFID" from %s to %s\n",
+	       PFID(lu_object_fid(lov2lu(lov))),
+	       llt2str(lov->lo_type), llt2str(llt));
+
 	old_ops = &lov_dispatch[lov->lo_type];
 	new_ops = &lov_dispatch[llt];
 
@@ -750,8 +762,9 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 	if (conf->u.coc_md != NULL)
 		lsm = conf->u.coc_md->lsm;
 	if ((lsm == NULL && lov->lo_lsm == NULL) ||
-	    (lsm != NULL && lov->lo_lsm != NULL &&
-	     lov->lo_lsm->lsm_layout_gen == lsm->lsm_layout_gen)) {
+	    ((lsm != NULL && lov->lo_lsm != NULL) &&
+	     (lov->lo_lsm->lsm_layout_gen == lsm->lsm_layout_gen) &&
+	     (lov->lo_lsm->lsm_pattern == lsm->lsm_pattern))) {
 		/* same version of layout */
 		lov->lo_layout_invalid = false;
 		GOTO(out, result = 0);
@@ -767,6 +780,8 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 
 out:
 	lov_conf_unlock(lov);
+	CDEBUG(D_INODE, DFID" lo_layout_invalid=%d\n",
+	       PFID(lu_object_fid(lov2lu(lov))), lov->lo_layout_invalid);
 	return result;
 }
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index 3e77020..91f6876 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -179,7 +179,8 @@ static __u64 mds_pack_open_flags(__u64 flags, __u32 mode)
 	__u64 cr_flags = (flags & (FMODE_READ | FMODE_WRITE |
 				   MDS_OPEN_HAS_EA | MDS_OPEN_HAS_OBJS |
 				   MDS_OPEN_OWNEROVERRIDE | MDS_OPEN_LOCK |
-				   MDS_OPEN_BY_FID | MDS_OPEN_LEASE));
+				   MDS_OPEN_BY_FID | MDS_OPEN_LEASE |
+				   MDS_OPEN_RELEASE));
 	if (flags & O_CREAT)
 		cr_flags |= MDS_OPEN_CREAT;
 	if (flags & O_EXCL)
@@ -490,6 +491,28 @@ void mdc_getattr_pack(struct ptlrpc_request *req, __u64 valid, int flags,
 	}
 }
 
+static void mdc_hsm_release_pack(struct ptlrpc_request *req,
+				 struct md_op_data *op_data)
+{
+	if (op_data->op_bias & MDS_HSM_RELEASE) {
+		struct close_data *data;
+		struct ldlm_lock *lock;
+
+		data = req_capsule_client_get(&req->rq_pill, &RMF_CLOSE_DATA);
+		LASSERT(data != NULL);
+
+		lock = ldlm_handle2lock(&op_data->op_lease_handle);
+		if (lock != NULL) {
+			data->cd_handle = lock->l_remote_handle;
+			ldlm_lock_put(lock);
+		}
+		ldlm_cli_cancel(&op_data->op_lease_handle, LCF_LOCAL);
+
+		data->cd_data_version = op_data->op_data_version;
+		data->cd_fid = op_data->op_fid2;
+	}
+}
+
 void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 {
 	struct mdt_ioepoch *epoch;
@@ -501,6 +524,7 @@ void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 	mdc_setattr_pack_rec(rec, op_data);
 	mdc_pack_capa(req, &RMF_CAPA1, op_data->op_capa1);
 	mdc_ioepoch_pack(epoch, op_data);
+	mdc_hsm_release_pack(req, op_data);
 }
 
 static int mdc_req_avail(struct client_obd *cli, struct mdc_cache_waiter *mcw)
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 8373778..0b43251 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -800,10 +800,27 @@ int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 {
 	struct obd_device     *obd = class_exp2obd(exp);
 	struct ptlrpc_request *req;
-	int		    rc;
+	struct req_format     *req_fmt;
+	int                    rc;
+	int		       saved_rc = 0;
+
+
+	req_fmt = &RQF_MDS_CLOSE;
+	if (op_data->op_bias & MDS_HSM_RELEASE) {
+		req_fmt = &RQF_MDS_RELEASE_CLOSE;
+
+		/* allocate a FID for volatile file */
+		rc = mdc_fid_alloc(exp, &op_data->op_fid2, op_data);
+		if (rc < 0) {
+			CERROR("%s: "DFID" failed to allocate FID: %d\n",
+			       obd->obd_name, PFID(&op_data->op_fid1), rc);
+			/* save the errcode and proceed to close */
+			saved_rc = rc;
+		}
+	}
 
 	*request = NULL;
-	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_MDS_CLOSE);
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
 	if (req == NULL)
 		return -ENOMEM;
 
@@ -893,7 +910,7 @@ int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 	}
 	*request = req;
 	mdc_close_handle_reply(req, op_data, rc);
-	return rc;
+	return rc < 0 ? rc : saved_rc;
 }
 
 int mdc_done_writing(struct obd_export *exp, struct md_op_data *op_data,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 4d6ac686..c4381eb 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -145,6 +145,14 @@ static const struct req_msg_field *mdt_close_client[] = {
 	&RMF_CAPA1
 };
 
+static const struct req_msg_field *mdt_release_close_client[] = {
+	&RMF_PTLRPC_BODY,
+	&RMF_MDT_EPOCH,
+	&RMF_REC_REINT,
+	&RMF_CAPA1,
+	&RMF_CLOSE_DATA
+};
+
 static const struct req_msg_field *obd_statfs_server[] = {
 	&RMF_PTLRPC_BODY,
 	&RMF_OBD_STATFS
@@ -666,6 +674,7 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_GETXATTR,
 	&RQF_MDS_SYNC,
 	&RQF_MDS_CLOSE,
+	&RQF_MDS_RELEASE_CLOSE,
 	&RQF_MDS_PIN,
 	&RQF_MDS_UNPIN,
 	&RQF_MDS_READPAGE,
@@ -885,6 +894,11 @@ struct req_msg_field RMF_PTLRPC_BODY =
 		    sizeof(struct ptlrpc_body), lustre_swab_ptlrpc_body, NULL);
 EXPORT_SYMBOL(RMF_PTLRPC_BODY);
 
+struct req_msg_field RMF_CLOSE_DATA =
+	DEFINE_MSGF("data_version", 0,
+		    sizeof(struct close_data), lustre_swab_close_data, NULL);
+EXPORT_SYMBOL(RMF_CLOSE_DATA);
+
 struct req_msg_field RMF_OBD_STATFS =
 	DEFINE_MSGF("obd_statfs", 0,
 		    sizeof(struct obd_statfs), lustre_swab_obd_statfs, NULL);
@@ -1412,6 +1426,11 @@ struct req_format RQF_MDS_CLOSE =
 			mdt_close_client, mds_last_unlink_server);
 EXPORT_SYMBOL(RQF_MDS_CLOSE);
 
+struct req_format RQF_MDS_RELEASE_CLOSE =
+	DEFINE_REQ_FMT0("MDS_CLOSE",
+			mdt_release_close_client, mds_last_unlink_server);
+EXPORT_SYMBOL(RQF_MDS_RELEASE_CLOSE);
+
 struct req_format RQF_MDS_PIN =
 	DEFINE_REQ_FMT0("MDS_PIN",
 			mdt_body_capa, mdt_body_only);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index d831bd7..464479c 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -2565,3 +2565,10 @@ void lustre_swab_swap_layouts(struct mdc_swap_layouts *msl)
 	__swab64s(&msl->msl_flags);
 }
 EXPORT_SYMBOL(lustre_swab_swap_layouts);
+
+void lustre_swab_close_data(struct close_data *cd)
+{
+	lustre_swab_lu_fid(&cd->cd_fid);
+	__swab64s(&cd->cd_data_version);
+}
+EXPORT_SYMBOL(lustre_swab_close_data);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH] staging/lustre/llite: extended attribute cache
  2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
                   ` (14 preceding siblings ...)
  2013-11-26  2:05 ` [PATCH 15/16] staging/lustre/hsm: Add hsm_release feature Peng Tao
@ 2013-11-26  2:05 ` Peng Tao
  15 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  2:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Andrew Perepechko, Peng Tao, Andreas Dilger

From: Andrew Perepechko <andrew_perepechko@xyratex.com>

This patch implements an extended attribute cache for
a Lustre client. It is organized as a write-through
cache: reads are performed from cache, updates are sent
synchronously to the MDS. An additional inode bit
MDS_INODELOCK_XATTR is added to protect the cache.

Lustre-change: http://review.whamcloud.com/5537
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2869
Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/linux/lustre_lite.h      |    1 +
 .../lustre/lustre/include/lustre/lustre_idl.h      |   12 +-
 .../lustre/lustre/include/lustre_req_layout.h      |    3 +
 drivers/staging/lustre/lustre/include/md_object.h  |    4 +-
 drivers/staging/lustre/lustre/include/obd.h        |    5 +
 .../staging/lustre/lustre/include/obd_support.h    |    1 +
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    2 +
 drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |   13 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   36 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   23 +-
 drivers/staging/lustre/lustre/llite/lproc_llite.c  |   37 ++
 drivers/staging/lustre/lustre/llite/namei.c        |    4 +
 drivers/staging/lustre/lustre/llite/super25.c      |    4 +
 drivers/staging/lustre/lustre/llite/xattr.c        |  102 ++--
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  640 ++++++++++++++++++++
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    1 +
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |   63 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   34 ++
 19 files changed, 925 insertions(+), 62 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/llite/xattr_cache.c

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_lite.h b/drivers/staging/lustre/lustre/include/linux/lustre_lite.h
index 9e5df8d..df93912 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_lite.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_lite.h
@@ -88,6 +88,7 @@ enum {
 	 LPROC_LL_ALLOC_INODE,
 	 LPROC_LL_SETXATTR,
 	 LPROC_LL_GETXATTR,
+	 LPROC_LL_GETXATTR_HITS,
 	 LPROC_LL_LISTXATTR,
 	 LPROC_LL_REMOVEXATTR,
 	 LPROC_LL_INODE_PERM,
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index c173bba..500f5d2 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1369,7 +1369,7 @@ extern void lustre_swab_ptlrpc_body(struct ptlrpc_body *pb);
 				OBD_CONNECT_EINPROGRESS | \
 				OBD_CONNECT_LIGHTWEIGHT | OBD_CONNECT_UMASK | \
 				OBD_CONNECT_LVB_TYPE | OBD_CONNECT_LAYOUTLOCK |\
-				OBD_CONNECT_PINGLESS)
+				OBD_CONNECT_PINGLESS | OBD_CONNECT_MAX_EASIZE)
 #define OST_CONNECT_SUPPORTED  (OBD_CONNECT_SRVLOCK | OBD_CONNECT_GRANT | \
 				OBD_CONNECT_REQPORTAL | OBD_CONNECT_VERSION | \
 				OBD_CONNECT_TRUNCLOCK | OBD_CONNECT_INDEX | \
@@ -1753,7 +1753,9 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 #define OBD_MD_FLCKSPLIT     (0x0000080000000000ULL) /* Check split on server */
 #define OBD_MD_FLCROSSREF    (0x0000100000000000ULL) /* Cross-ref case */
 #define OBD_MD_FLGETATTRLOCK (0x0000200000000000ULL) /* Get IOEpoch attributes
-						      * under lock */
+						      * under lock; for xattr
+						      * requests means the
+						      * client holds the lock */
 #define OBD_MD_FLOBJCOUNT    (0x0000400000000000ULL) /* for multiple destroy */
 
 #define OBD_MD_FLRMTLSETFACL (0x0001000000000000ULL) /* lfs lsetfacl case */
@@ -1770,6 +1772,9 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 			  OBD_MD_FLGID   | OBD_MD_FLFLAGS | OBD_MD_FLNLINK | \
 			  OBD_MD_FLGENER | OBD_MD_FLRDEV  | OBD_MD_FLGROUP)
 
+#define OBD_MD_FLXATTRLOCKED OBD_MD_FLGETATTRLOCK
+#define OBD_MD_FLXATTRALL (OBD_MD_FLXATTR | OBD_MD_FLXATTRLS)
+
 /* don't forget obdo_fid which is way down at the bottom so it can
  * come after the definition of llog_cookie */
 
@@ -2142,8 +2147,9 @@ extern void lustre_swab_generic_32s (__u32 *val);
 #define MDS_INODELOCK_OPEN   0x000004       /* For opened files */
 #define MDS_INODELOCK_LAYOUT 0x000008       /* for layout */
 #define MDS_INODELOCK_PERM   0x000010       /* for permission */
+#define MDS_INODELOCK_XATTR  0x000020       /* extended attributes */
 
-#define MDS_INODELOCK_MAXSHIFT 4
+#define MDS_INODELOCK_MAXSHIFT 5
 /* This FULL lock is useful to take on unlink sort of operations */
 #define MDS_INODELOCK_FULL ((1<<(MDS_INODELOCK_MAXSHIFT+1))-1)
 
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index a75f4c6..a83db61 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -230,6 +230,7 @@ extern struct req_format RQF_LDLM_INTENT_GETATTR;
 extern struct req_format RQF_LDLM_INTENT_OPEN;
 extern struct req_format RQF_LDLM_INTENT_CREATE;
 extern struct req_format RQF_LDLM_INTENT_UNLINK;
+extern struct req_format RQF_LDLM_INTENT_GETXATTR;
 extern struct req_format RQF_LDLM_INTENT_QUOTA;
 extern struct req_format RQF_LDLM_CANCEL;
 extern struct req_format RQF_LDLM_CALLBACK;
@@ -279,6 +280,8 @@ extern struct req_msg_field RMF_LAYOUT_INTENT;
 extern struct req_msg_field RMF_MDT_MD;
 extern struct req_msg_field RMF_REC_REINT;
 extern struct req_msg_field RMF_EADATA;
+extern struct req_msg_field RMF_EAVALS;
+extern struct req_msg_field RMF_EAVALS_LENS;
 extern struct req_msg_field RMF_ACL;
 extern struct req_msg_field RMF_LOGCOOKIES;
 extern struct req_msg_field RMF_CAPA1;
diff --git a/drivers/staging/lustre/lustre/include/md_object.h b/drivers/staging/lustre/lustre/include/md_object.h
index daf93af..7b45b47 100644
--- a/drivers/staging/lustre/lustre/include/md_object.h
+++ b/drivers/staging/lustre/lustre/include/md_object.h
@@ -352,8 +352,8 @@ struct md_device_operations {
 	int (*mdo_root_get)(const struct lu_env *env, struct md_device *m,
 			    struct lu_fid *f);
 
-	int (*mdo_maxsize_get)(const struct lu_env *env, struct md_device *m,
-			       int *md_size, int *cookie_size);
+	int (*mdo_maxeasize_get)(const struct lu_env *env, struct md_device *m,
+				int *easize);
 
 	int (*mdo_statfs)(const struct lu_env *env, struct md_device *m,
 			  struct obd_statfs *sfs);
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 3247d1d..c3470ce 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -1022,6 +1022,7 @@ struct lu_context;
 #define IT_LAYOUT   (1 << 10)
 #define IT_QUOTA_DQACQ (1 << 11)
 #define IT_QUOTA_CONN  (1 << 12)
+#define IT_SETXATTR (1 << 13)
 
 static inline int it_to_lock_mode(struct lookup_intent *it)
 {
@@ -1031,6 +1032,10 @@ static inline int it_to_lock_mode(struct lookup_intent *it)
 	else if (it->it_op & (IT_READDIR | IT_GETATTR | IT_OPEN | IT_LOOKUP |
 			      IT_LAYOUT))
 		return LCK_CR;
+	else if (it->it_op &  IT_GETXATTR)
+		return LCK_PR;
+	else if (it->it_op &  IT_SETXATTR)
+		return LCK_PW;
 
 	LASSERTF(0, "Invalid it_op: %d\n", it->it_op);
 	return -EINVAL;
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 4d1f62c..36bdeab 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -466,6 +466,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_LOCK_STATE_WAIT_INTR	       0x1402
 #define OBD_FAIL_LOV_INIT			    0x1403
 #define OBD_FAIL_GLIMPSE_DELAY			    0x1404
+#define OBD_FAIL_LLITE_XATTR_ENOMEM		    0x1405
 
 #define OBD_FAIL_FID_INDIR	0x1501
 #define OBD_FAIL_FID_INLMA	0x1502
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index ef826e9..95eff79 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -145,6 +145,8 @@ char *ldlm_it2str(int it)
 		return "getxattr";
 	case IT_LAYOUT:
 		return "layout";
+	case IT_SETXATTR:
+		return "setxattr";
 	default:
 		CERROR("Unknown intent %d\n", it);
 		return "UNKNOWN";
diff --git a/drivers/staging/lustre/lustre/llite/Makefile b/drivers/staging/lustre/lustre/llite/Makefile
index f493e07..bb34b8b 100644
--- a/drivers/staging/lustre/lustre/llite/Makefile
+++ b/drivers/staging/lustre/lustre/llite/Makefile
@@ -2,7 +2,7 @@ obj-$(CONFIG_LUSTRE_FS) += lustre.o
 obj-$(CONFIG_LUSTRE_LLITE_LLOOP) += llite_lloop.o
 lustre-y := dcache.o dir.o file.o llite_close.o llite_lib.o llite_nfs.o \
 	    rw.o lproc_llite.o namei.o symlink.o llite_mmap.o \
-	    xattr.o remote_perm.o llite_rmtacl.o llite_capa.o \
+	    xattr.o xattr_cache.o remote_perm.o llite_rmtacl.o llite_capa.o \
 	    rw26.o super25.o statahead.o \
 	    ../lclient/glimpse.o ../lclient/lcommon_cl.o ../lclient/lcommon_misc.o \
 	    vvp_dev.o vvp_page.o vvp_lock.o vvp_io.o vvp_object.o
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 7a93936..1423da2 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2783,7 +2783,8 @@ int ll_have_md_lock(struct inode *inode, __u64 *bits,  ldlm_mode_t l_req_mode)
 }
 
 ldlm_mode_t ll_take_md_lock(struct inode *inode, __u64 bits,
-			    struct lustre_handle *lockh, __u64 flags)
+			    struct lustre_handle *lockh, __u64 flags,
+			    ldlm_mode_t mode)
 {
 	ldlm_policy_data_t policy = { .l_inodebits = {bits}};
 	struct lu_fid *fid;
@@ -2793,8 +2794,8 @@ ldlm_mode_t ll_take_md_lock(struct inode *inode, __u64 bits,
 	CDEBUG(D_INFO, "trying to match res "DFID"\n", PFID(fid));
 
 	rc = md_lock_match(ll_i2mdexp(inode), LDLM_FL_BLOCK_GRANTED|flags,
-			   fid, LDLM_IBITS, &policy,
-			   LCK_CR|LCK_CW|LCK_PR|LCK_PW, lockh);
+			   fid, LDLM_IBITS, &policy, mode, lockh);
+
 	return rc;
 }
 
@@ -3470,7 +3471,8 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 
 	/* mostly layout lock is caching on the local side, so try to match
 	 * it before grabbing layout lock mutex. */
-	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0);
+	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
+			       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
 	if (mode != 0) { /* hit cached lock */
 		rc = ll_layout_lock_set(&lockh, mode, inode, gen, false);
 		if (rc == 0)
@@ -3485,7 +3487,8 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 
 again:
 	/* try again. Maybe somebody else has done this. */
-	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0);
+	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
+			       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
 	if (mode != 0) { /* hit cached lock */
 		rc = ll_layout_lock_set(&lockh, mode, inode, gen, true);
 		if (rc == -EAGAIN)
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index e479a69..1355de4 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -127,6 +127,8 @@ enum lli_flags {
 	LLIF_DATA_MODIFIED      = (1 << 6),
 	/* File is being restored */
 	LLIF_FILE_RESTORING	= (1 << 7),
+	/* Xattr cache is attached to the file */
+	LLIF_XATTR_CACHE	= (1 << 8),
 };
 
 struct ll_inode_info {
@@ -279,8 +281,27 @@ struct ll_inode_info {
 	struct mutex			lli_layout_mutex;
 	/* valid only inside LAYOUT ibits lock, protected by lli_layout_mutex */
 	__u32				lli_layout_gen;
+
+	struct rw_semaphore		lli_xattrs_list_rwsem;
+	struct mutex			lli_xattrs_enq_lock;
+	struct list_head		lli_xattrs;/* ll_xattr_entry->xe_list */
 };
 
+int ll_xattr_cache_destroy(struct inode *inode);
+
+int ll_xattr_cache_get(struct inode *inode,
+			const char *name,
+			char *buffer,
+			size_t size,
+			__u64 valid);
+
+int ll_xattr_cache_update(struct inode *inode,
+			const char *name,
+			const char *newval,
+			size_t size,
+			__u64 valid,
+			int flags);
+
 /*
  * Locking to guarantee consistency of non-atomic updates to long long i_size,
  * consistency between file size and KMS.
@@ -402,6 +423,7 @@ enum stats_track_type {
 #define LL_SBI_VERBOSE	0x10000 /* verbose mount/umount */
 #define LL_SBI_LAYOUT_LOCK    0x20000 /* layout lock support */
 #define LL_SBI_USER_FID2PATH  0x40000 /* allow fid2path by unprivileged users */
+#define LL_SBI_XATTR_CACHE    0x80000 /* support for xattr cache */
 
 #define LL_SBI_FLAGS {	\
 	"nolck",	\
@@ -409,6 +431,7 @@ enum stats_track_type {
 	"flock",	\
 	"xattr",	\
 	"acl",		\
+	"???",		\
 	"rmt_client",	\
 	"mds_capa",	\
 	"oss_capa",	\
@@ -421,7 +444,9 @@ enum stats_track_type {
 	"agl",		\
 	"verbose",	\
 	"layout",	\
-	"user_fid2path" }
+	"user_fid2path",\
+	"xattr",	\
+}
 
 /* default value for ll_sb_info->contention_time */
 #define SBI_DEFAULT_CONTENTION_SECONDS     60
@@ -461,7 +486,8 @@ struct ll_sb_info {
 	struct lu_fid	     ll_root_fid; /* root object fid */
 
 	int		       ll_flags;
-	unsigned int			  ll_umounting:1;
+	unsigned int		  ll_umounting:1,
+				  ll_xattr_cache_enabled:1;
 	struct list_head		ll_conn_chain; /* per-conn chain of SBs */
 	struct lustre_client_ocd  ll_lco;
 
@@ -732,7 +758,8 @@ extern int ll_inode_revalidate_it(struct dentry *, struct lookup_intent *,
 extern int ll_have_md_lock(struct inode *inode, __u64 *bits,
 			   ldlm_mode_t l_req_mode);
 extern ldlm_mode_t ll_take_md_lock(struct inode *inode, __u64 bits,
-				   struct lustre_handle *lockh, __u64 flags);
+				   struct lustre_handle *lockh, __u64 flags,
+				   ldlm_mode_t mode);
 int __ll_inode_revalidate_it(struct dentry *, struct lookup_intent *,
 			     __u64 bits);
 int ll_revalidate_nd(struct dentry *dentry, unsigned int flags);
@@ -1598,4 +1625,7 @@ int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf);
 int ll_layout_refresh(struct inode *inode, __u32 *gen);
 int ll_layout_restore(struct inode *inode);
 
+int ll_xattr_init(void);
+void ll_xattr_fini(void);
+
 #endif /* LLITE_INTERNAL_H */
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index facc391..154bdb2 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -209,7 +209,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				  OBD_CONNECT_FULL20   | OBD_CONNECT_64BITHASH|
 				  OBD_CONNECT_EINPROGRESS |
 				  OBD_CONNECT_JOBSTATS | OBD_CONNECT_LVB_TYPE |
-				  OBD_CONNECT_LAYOUTLOCK | OBD_CONNECT_PINGLESS;
+				  OBD_CONNECT_LAYOUTLOCK |
+				  OBD_CONNECT_PINGLESS | OBD_CONNECT_MAX_EASIZE;
 
 	if (sbi->ll_flags & LL_SBI_SOM_PREVIEW)
 		data->ocd_connect_flags |= OBD_CONNECT_SOM;
@@ -383,6 +384,17 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 		sbi->ll_flags |= LL_SBI_LAYOUT_LOCK;
 	}
 
+	if (data->ocd_ibits_known & MDS_INODELOCK_XATTR) {
+		if (!(data->ocd_connect_flags & OBD_CONNECT_MAX_EASIZE)) {
+			LCONSOLE_INFO(
+				"%s: disabling xattr cache due to unknown maximum xattr size.\n",
+				dt);
+		} else {
+			sbi->ll_flags |= LL_SBI_XATTR_CACHE;
+			sbi->ll_xattr_cache_enabled = 1;
+		}
+	}
+
 	obd = class_name2obd(dt);
 	if (!obd) {
 		CERROR("DT %s: not setup or attached\n", dt);
@@ -922,6 +934,9 @@ void ll_lli_init(struct ll_inode_info *lli)
 	lli->lli_layout_gen = LL_LAYOUT_GEN_NONE;
 	lli->lli_clob = NULL;
 
+	init_rwsem(&lli->lli_xattrs_list_rwsem);
+	mutex_init(&lli->lli_xattrs_enq_lock);
+
 	LASSERT(lli->lli_vfs_inode.i_mode != 0);
 	if (S_ISDIR(lli->lli_vfs_inode.i_mode)) {
 		mutex_init(&lli->lli_readdir_mutex);
@@ -1194,6 +1209,8 @@ void ll_clear_inode(struct inode *inode)
 		lli->lli_symlink_name = NULL;
 	}
 
+	ll_xattr_cache_destroy(inode);
+
 	if (sbi->ll_flags & LL_SBI_RMT_CLIENT) {
 		LASSERT(lli->lli_posix_acl == NULL);
 		if (lli->lli_remote_perms) {
@@ -1752,7 +1769,9 @@ void ll_update_inode(struct inode *inode, struct lustre_md *md)
 			 * lock on the client and set LLIF_MDS_SIZE_LOCK holding
 			 * it. */
 			mode = ll_take_md_lock(inode, MDS_INODELOCK_UPDATE,
-					       &lockh, LDLM_FL_CBPENDING);
+					       &lockh, LDLM_FL_CBPENDING,
+					       LCK_CR | LCK_CW |
+					       LCK_PR | LCK_PW);
 			if (mode) {
 				if (lli->lli_flags & (LLIF_DONE_WRITING |
 						      LLIF_EPOCH_PENDING |
diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c b/drivers/staging/lustre/lustre/llite/lproc_llite.c
index 4bf09c4..1ded16a 100644
--- a/drivers/staging/lustre/lustre/llite/lproc_llite.c
+++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c
@@ -723,6 +723,41 @@ static int ll_sbi_flags_seq_show(struct seq_file *m, void *v)
 }
 LPROC_SEQ_FOPS_RO(ll_sbi_flags);
 
+static int ll_xattr_cache_seq_show(struct seq_file *m, void *v)
+{
+	struct super_block *sb = m->private;
+	struct ll_sb_info *sbi = ll_s2sbi(sb);
+	int rc;
+
+	rc = seq_printf(m, "%u\n", sbi->ll_xattr_cache_enabled);
+
+	return rc;
+}
+
+static ssize_t ll_xattr_cache_seq_write(struct file *file, const char *buffer,
+					size_t count, loff_t *off)
+{
+	struct seq_file *seq = file->private_data;
+	struct super_block *sb = seq->private;
+	struct ll_sb_info *sbi = ll_s2sbi(sb);
+	int val, rc;
+
+	rc = lprocfs_write_helper(buffer, count, &val);
+	if (rc)
+		return rc;
+
+	if (val != 0 && val != 1)
+		return -ERANGE;
+
+	if (val == 1 && !(sbi->ll_flags & LL_SBI_XATTR_CACHE))
+		return -ENOTSUPP;
+
+	sbi->ll_xattr_cache_enabled = val;
+
+	return count;
+}
+LPROC_SEQ_FOPS(ll_xattr_cache);
+
 static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
 	{ "uuid",	  &ll_sb_uuid_fops,	  0, 0 },
 	//{ "mntpt_path",   ll_rd_path,	     0, 0 },
@@ -751,6 +786,7 @@ static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
 	{ "lazystatfs",       &ll_lazystatfs_fops, 0 },
 	{ "max_easize",       &ll_maxea_size_fops, 0, 0 },
 	{ "sbi_flags",	      &ll_sbi_flags_fops, 0, 0 },
+	{ "xattr_cache",      &ll_xattr_cache_fops, 0, 0 },
 	{ 0 }
 };
 
@@ -802,6 +838,7 @@ struct llite_file_opcode {
 	{ LPROC_LL_ALLOC_INODE,    LPROCFS_TYPE_REGS, "alloc_inode" },
 	{ LPROC_LL_SETXATTR,       LPROCFS_TYPE_REGS, "setxattr" },
 	{ LPROC_LL_GETXATTR,       LPROCFS_TYPE_REGS, "getxattr" },
+	{ LPROC_LL_GETXATTR_HITS,  LPROCFS_TYPE_REGS, "getxattr_hits" },
 	{ LPROC_LL_LISTXATTR,      LPROCFS_TYPE_REGS, "listxattr" },
 	{ LPROC_LL_REMOVEXATTR,    LPROCFS_TYPE_REGS, "removexattr" },
 	{ LPROC_LL_INODE_PERM,     LPROCFS_TYPE_REGS, "inode_permission" },
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 0000530..dfcfbb8 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -222,6 +222,10 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 			break;
 
 		LASSERT(lock->l_flags & LDLM_FL_CANCELING);
+
+		if (bits & MDS_INODELOCK_XATTR)
+			ll_xattr_cache_destroy(inode);
+
 		/* For OPEN locks we differentiate between lock modes
 		 * LCK_CR, LCK_CW, LCK_PR - bug 22891 */
 		if (bits & (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE |
diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
index 0beaf4e..e21e1c7 100644
--- a/drivers/staging/lustre/lustre/llite/super25.c
+++ b/drivers/staging/lustre/lustre/llite/super25.c
@@ -187,11 +187,15 @@ static int __init init_lustre_lite(void)
 	if (rc == 0)
 		rc = vvp_global_init();
 
+	if (rc == 0)
+		rc = ll_xattr_init();
+
 	return rc;
 }
 
 static void __exit exit_lustre_lite(void)
 {
+	ll_xattr_fini();
 	vvp_global_fini();
 	del_timer(&ll_capa_timer);
 	ll_capa_thread_stop();
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index bcf86ba..ee95855 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -109,7 +109,7 @@ int ll_setxattr_common(struct inode *inode, const char *name,
 		       int flags, __u64 valid)
 {
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
-	struct ptlrpc_request *req;
+	struct ptlrpc_request *req = NULL;
 	int xattr_type, rc;
 	struct obd_capa *oc;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -183,11 +183,17 @@ int ll_setxattr_common(struct inode *inode, const char *name,
 		valid |= rce_ops2valid(rce->rce_ops);
 	}
 #endif
-	oc = ll_mdscapa_get(inode);
-	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
-			 valid, name, pv, size, 0, flags, ll_i2suppgid(inode),
-			 &req);
-	capa_put(oc);
+	if (sbi->ll_xattr_cache_enabled &&
+	    (rce == NULL || rce->rce_ops == RMT_LSETFACL)) {
+		rc = ll_xattr_cache_update(inode, name, pv, size, valid, flags);
+	} else {
+		oc = ll_mdscapa_get(inode);
+		rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
+				valid, name, pv, size, 0, flags,
+				ll_i2suppgid(inode), &req);
+		capa_put(oc);
+	}
+
 #ifdef CONFIG_FS_POSIX_ACL
 	if (new_value != NULL)
 		lustre_posix_acl_xattr_free(new_value, size);
@@ -352,48 +358,54 @@ int ll_getxattr_common(struct inode *inode, const char *name,
 #endif
 
 do_getxattr:
-	oc = ll_mdscapa_get(inode);
-	rc = md_getxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
-			 valid | (rce ? rce_ops2valid(rce->rce_ops) : 0),
-			 name, NULL, 0, size, 0, &req);
-	capa_put(oc);
-	if (rc) {
-		if (rc == -EOPNOTSUPP && xattr_type == XATTR_USER_T) {
-			LCONSOLE_INFO("Disabling user_xattr feature because "
-				      "it is not supported on the server\n");
-			sbi->ll_flags &= ~LL_SBI_USER_XATTR;
-		}
-		return rc;
-	}
+	if (sbi->ll_xattr_cache_enabled && (rce == NULL ||
+					    rce->rce_ops == RMT_LGETFACL ||
+					    rce->rce_ops == RMT_LSETFACL)) {
+		rc = ll_xattr_cache_get(inode, name, buffer, size, valid);
+		if (rc < 0)
+			GOTO(out_xattr, rc);
+	} else {
+		oc = ll_mdscapa_get(inode);
+		rc = md_getxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
+				valid | (rce ? rce_ops2valid(rce->rce_ops) : 0),
+				name, NULL, 0, size, 0, &req);
+		capa_put(oc);
 
-	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-	LASSERT(body);
+		if (rc < 0)
+			GOTO(out_xattr, rc);
 
-	/* only detect the xattr size */
-	if (size == 0)
-		GOTO(out, rc = body->eadatasize);
+		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
+		LASSERT(body);
 
-	if (size < body->eadatasize) {
-		CERROR("server bug: replied size %u > %u\n",
-		       body->eadatasize, (int)size);
-		GOTO(out, rc = -ERANGE);
-	}
+		/* only detect the xattr size */
+		if (size == 0)
+			GOTO(out, rc = body->eadatasize);
 
-	if (body->eadatasize == 0)
-		GOTO(out, rc = -ENODATA);
+		if (size < body->eadatasize) {
+			CERROR("server bug: replied size %u > %u\n",
+				body->eadatasize, (int)size);
+			GOTO(out, rc = -ERANGE);
+		}
+
+		if (body->eadatasize == 0)
+			GOTO(out, rc = -ENODATA);
 
-	/* do not need swab xattr data */
-	xdata = req_capsule_server_sized_get(&req->rq_pill, &RMF_EADATA,
-					     body->eadatasize);
-	if (!xdata)
-		GOTO(out, rc = -EFAULT);
+		/* do not need swab xattr data */
+		xdata = req_capsule_server_sized_get(&req->rq_pill, &RMF_EADATA,
+							body->eadatasize);
+		if (!xdata)
+			GOTO(out, rc = -EFAULT);
+
+		memcpy(buffer, xdata, body->eadatasize);
+		rc = body->eadatasize;
+	}
 
 #ifdef CONFIG_FS_POSIX_ACL
-	if (body->eadatasize >= 0 && rce && rce->rce_ops == RMT_LSETFACL) {
+	if (rce && rce->rce_ops == RMT_LSETFACL) {
 		ext_acl_xattr_header *acl;
 
-		acl = lustre_posix_acl_xattr_2ext((posix_acl_xattr_header *)xdata,
-						  body->eadatasize);
+		acl = lustre_posix_acl_xattr_2ext(
+					(posix_acl_xattr_header *)buffer, rc);
 		if (IS_ERR(acl))
 			GOTO(out, rc = PTR_ERR(acl));
 
@@ -406,12 +418,12 @@ do_getxattr:
 	}
 #endif
 
-	if (body->eadatasize == 0) {
-		rc = -ENODATA;
-	} else {
-		LASSERT(buffer);
-		memcpy(buffer, xdata, body->eadatasize);
-		rc = body->eadatasize;
+out_xattr:
+	if (rc == -EOPNOTSUPP && xattr_type == XATTR_USER_T) {
+		LCONSOLE_INFO(
+			"%s: disabling user_xattr feature because it is not supported on the server: rc = %d\n",
+			ll_get_fsname(inode->i_sb, NULL, 0), rc);
+		sbi->ll_flags &= ~LL_SBI_USER_XATTR;
 	}
 out:
 	ptlrpc_req_finished(req);
diff --git a/drivers/staging/lustre/lustre/llite/xattr_cache.c b/drivers/staging/lustre/lustre/llite/xattr_cache.c
new file mode 100644
index 0000000..4bd544a
--- /dev/null
+++ b/drivers/staging/lustre/lustre/llite/xattr_cache.c
@@ -0,0 +1,640 @@
+/* GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see http://www.gnu.org/licenses
+ *
+ * Please  visit http://www.xyratex.com/contact if you need additional
+ * information or have any questions.
+ *
+ * GPL HEADER END
+ */
+
+/*
+ * Copyright 2012 Xyratex Technology Limited
+ *
+ * Author: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
+ *
+ */
+
+#define DEBUG_SUBSYSTEM S_LLITE
+
+#include <linux/fs.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <obd_support.h>
+#include <lustre_lite.h>
+#include <lustre_dlm.h>
+#include <lustre_ver.h>
+#include "llite_internal.h"
+
+/* If we ever have hundreds of extended attributes, we might want to consider
+ * using a hash or a tree structure instead of list for faster lookups.
+ */
+struct ll_xattr_entry {
+	struct list_head	xe_list;    /* protected with
+					     * lli_xattrs_list_rwsem */
+	char			*xe_name;   /* xattr name, \0-terminated */
+	char			*xe_value;  /* xattr value */
+	unsigned		xe_namelen; /* strlen(xe_name) + 1 */
+	unsigned		xe_vallen;  /* xattr value length */
+};
+
+static struct kmem_cache *xattr_kmem;
+static struct lu_kmem_descr xattr_caches[] = {
+	{
+		.ckd_cache = &xattr_kmem,
+		.ckd_name  = "xattr_kmem",
+		.ckd_size  = sizeof(struct ll_xattr_entry)
+	},
+	{
+		.ckd_cache = NULL
+	}
+};
+
+int ll_xattr_init(void)
+{
+	return lu_kmem_init(xattr_caches);
+}
+
+void ll_xattr_fini(void)
+{
+	lu_kmem_fini(xattr_caches);
+}
+
+/**
+ * Initializes xattr cache for an inode.
+ *
+ * This initializes the xattr list and marks cache presence.
+ */
+static void ll_xattr_cache_init(struct ll_inode_info *lli)
+{
+
+
+	LASSERT(lli != NULL);
+
+	INIT_LIST_HEAD(&lli->lli_xattrs);
+	lli->lli_flags |= LLIF_XATTR_CACHE;
+}
+
+/**
+ *  This looks for a specific extended attribute.
+ *
+ *  Find in @cache and return @xattr_name attribute in @xattr,
+ *  for the NULL @xattr_name return the first cached @xattr.
+ *
+ *  \retval 0        success
+ *  \retval -ENODATA if not found
+ */
+static int ll_xattr_cache_find(struct list_head *cache,
+			       const char *xattr_name,
+			       struct ll_xattr_entry **xattr)
+{
+	struct ll_xattr_entry *entry;
+
+
+
+	list_for_each_entry(entry, cache, xe_list) {
+		/* xattr_name == NULL means look for any entry */
+		if (xattr_name == NULL ||
+		    strcmp(xattr_name, entry->xe_name) == 0) {
+			*xattr = entry;
+			CDEBUG(D_CACHE, "find: [%s]=%.*s\n",
+			       entry->xe_name, entry->xe_vallen,
+			       entry->xe_value);
+			return 0;
+		}
+	}
+
+	return -ENODATA;
+}
+
+/**
+ * This adds or updates an xattr.
+ *
+ * Add @xattr_name attr with @xattr_val value and @xattr_val_len length,
+ * if the attribute already exists, then update its value.
+ *
+ * \retval 0       success
+ * \retval -ENOMEM if no memory could be allocated for the cached attr
+ */
+static int ll_xattr_cache_add(struct list_head *cache,
+			      const char *xattr_name,
+			      const char *xattr_val,
+			      unsigned xattr_val_len)
+{
+	struct ll_xattr_entry *xattr;
+
+
+
+	if (ll_xattr_cache_find(cache, xattr_name, &xattr) == 0) {
+		/* Found a cached EA, update it */
+
+		if (xattr_val_len != xattr->xe_vallen) {
+			char *val;
+			OBD_ALLOC(val, xattr_val_len);
+			if (val == NULL) {
+				CDEBUG(D_CACHE,
+				       "failed to allocate %u bytes for xattr %s update\n",
+				       xattr_val_len, xattr_name);
+				return -ENOMEM;
+			}
+			OBD_FREE(xattr->xe_value, xattr->xe_vallen);
+			xattr->xe_value = val;
+			xattr->xe_vallen = xattr_val_len;
+		}
+		memcpy(xattr->xe_value, xattr_val, xattr_val_len);
+
+		CDEBUG(D_CACHE, "update: [%s]=%.*s\n", xattr_name,
+			xattr_val_len, xattr_val);
+
+		return 0;
+	}
+
+	OBD_SLAB_ALLOC_PTR_GFP(xattr, xattr_kmem, __GFP_IO);
+	if (xattr == NULL) {
+		CDEBUG(D_CACHE, "failed to allocate xattr\n");
+		return -ENOMEM;
+	}
+
+	xattr->xe_namelen = strlen(xattr_name) + 1;
+
+	OBD_ALLOC(xattr->xe_name, xattr->xe_namelen);
+	if (!xattr->xe_name) {
+		CDEBUG(D_CACHE, "failed to alloc xattr name %u\n",
+		       xattr->xe_namelen);
+		goto err_name;
+	}
+	OBD_ALLOC(xattr->xe_value, xattr_val_len);
+	if (!xattr->xe_value) {
+		CDEBUG(D_CACHE, "failed to alloc xattr value %d\n",
+		       xattr_val_len);
+		goto err_value;
+	}
+
+	memcpy(xattr->xe_name, xattr_name, xattr->xe_namelen);
+	memcpy(xattr->xe_value, xattr_val, xattr_val_len);
+	xattr->xe_vallen = xattr_val_len;
+	list_add(&xattr->xe_list, cache);
+
+	CDEBUG(D_CACHE, "set: [%s]=%.*s\n", xattr_name,
+		xattr_val_len, xattr_val);
+
+	return 0;
+err_value:
+	OBD_FREE(xattr->xe_name, xattr->xe_namelen);
+err_name:
+	OBD_SLAB_FREE_PTR(xattr, xattr_kmem);
+
+	return -ENOMEM;
+}
+
+/**
+ * This removes an extended attribute from cache.
+ *
+ * Remove @xattr_name attribute from @cache.
+ *
+ * \retval 0        success
+ * \retval -ENODATA if @xattr_name is not cached
+ */
+static int ll_xattr_cache_del(struct list_head *cache,
+			      const char *xattr_name)
+{
+	struct ll_xattr_entry *xattr;
+
+
+
+	CDEBUG(D_CACHE, "del xattr: %s\n", xattr_name);
+
+	if (ll_xattr_cache_find(cache, xattr_name, &xattr) == 0) {
+		list_del(&xattr->xe_list);
+		OBD_FREE(xattr->xe_name, xattr->xe_namelen);
+		OBD_FREE(xattr->xe_value, xattr->xe_vallen);
+		OBD_SLAB_FREE_PTR(xattr, xattr_kmem);
+
+		return 0;
+	}
+
+	return -ENODATA;
+}
+
+/**
+ * This iterates cached extended attributes.
+ *
+ * Walk over cached attributes in @cache and
+ * fill in @xld_buffer or only calculate buffer
+ * size if @xld_buffer is NULL.
+ *
+ * \retval >= 0     buffer list size
+ * \retval -ENODATA if the list cannot fit @xld_size buffer
+ */
+static int ll_xattr_cache_list(struct list_head *cache,
+			       char *xld_buffer,
+			       int xld_size)
+{
+	struct ll_xattr_entry *xattr, *tmp;
+	int xld_tail = 0;
+
+
+
+	list_for_each_entry_safe(xattr, tmp, cache, xe_list) {
+		CDEBUG(D_CACHE, "list: buffer=%p[%d] name=%s\n",
+			xld_buffer, xld_tail, xattr->xe_name);
+
+		if (xld_buffer) {
+			xld_size -= xattr->xe_namelen;
+			if (xld_size < 0)
+				break;
+			memcpy(&xld_buffer[xld_tail],
+			       xattr->xe_name, xattr->xe_namelen);
+		}
+		xld_tail += xattr->xe_namelen;
+	}
+
+	if (xld_size < 0)
+		return -ERANGE;
+
+	return xld_tail;
+}
+
+/**
+ * Check if the xattr cache is initialized (filled).
+ *
+ * \retval 0 @cache is not initialized
+ * \retval 1 @cache is initialized
+ */
+int ll_xattr_cache_valid(struct ll_inode_info *lli)
+{
+	return !!(lli->lli_flags & LLIF_XATTR_CACHE);
+}
+
+/**
+ * This finalizes the xattr cache.
+ *
+ * Free all xattr memory. @lli is the inode info pointer.
+ *
+ * \retval 0 no error occured
+ */
+static int ll_xattr_cache_destroy_locked(struct ll_inode_info *lli)
+{
+
+
+	if (!ll_xattr_cache_valid(lli))
+		return 0;
+
+	while (ll_xattr_cache_del(&lli->lli_xattrs, NULL) == 0)
+		; /* empty loop */
+	lli->lli_flags &= ~LLIF_XATTR_CACHE;
+
+	return 0;
+}
+
+int ll_xattr_cache_destroy(struct inode *inode)
+{
+	struct ll_inode_info *lli = ll_i2info(inode);
+	int rc;
+
+
+
+	down_write(&lli->lli_xattrs_list_rwsem);
+	rc = ll_xattr_cache_destroy_locked(lli);
+	up_write(&lli->lli_xattrs_list_rwsem);
+
+	return rc;
+}
+
+/**
+ * Match or enqueue a PR or PW LDLM lock.
+ *
+ * Find or request an LDLM lock with xattr data.
+ * Since LDLM does not provide API for atomic match_or_enqueue,
+ * the function handles it with a separate enq lock.
+ * If successful, the function exits with the list lock held.
+ *
+ * \retval 0       no error occured
+ * \retval -ENOMEM not enough memory
+ */
+static int ll_xattr_find_get_lock(struct inode *inode,
+				  struct lookup_intent *oit,
+				  struct ptlrpc_request **req)
+{
+	ldlm_mode_t mode;
+	struct lustre_handle lockh = { 0 };
+	struct md_op_data *op_data;
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct ldlm_enqueue_info einfo = { .ei_type = LDLM_IBITS,
+					   .ei_mode = it_to_lock_mode(oit),
+					   .ei_cb_bl = ll_md_blocking_ast,
+					   .ei_cb_cp = ldlm_completion_ast };
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct obd_export *exp = sbi->ll_md_exp;
+	int rc;
+
+
+
+	mutex_lock(&lli->lli_xattrs_enq_lock);
+	/* Try matching first. */
+	mode = ll_take_md_lock(inode, MDS_INODELOCK_XATTR, &lockh, 0,
+			       oit->it_op == IT_SETXATTR ? LCK_PW :
+							   (LCK_PR | LCK_PW));
+	if (mode != 0) {
+		/* fake oit in mdc_revalidate_lock() manner */
+		oit->d.lustre.it_lock_handle = lockh.cookie;
+		oit->d.lustre.it_lock_mode = mode;
+		goto out;
+	}
+
+	/* Enqueue if the lock isn't cached locally. */
+	op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0,
+				     LUSTRE_OPC_ANY, NULL);
+	if (IS_ERR(op_data)) {
+		mutex_unlock(&lli->lli_xattrs_enq_lock);
+		return PTR_ERR(op_data);
+	}
+
+	op_data->op_valid = OBD_MD_FLXATTR | OBD_MD_FLXATTRLS |
+			    OBD_MD_FLXATTRLOCKED;
+#ifdef CONFIG_FS_POSIX_ACL
+	/* If working with ACLs, we would like to cache local ACLs */
+	if (sbi->ll_flags & LL_SBI_RMT_CLIENT)
+		op_data->op_valid |= OBD_MD_FLRMTLGETFACL;
+#endif
+
+	rc = md_enqueue(exp, &einfo, oit, op_data, &lockh, NULL, 0, NULL, 0);
+	ll_finish_md_op_data(op_data);
+
+	if (rc < 0) {
+		CDEBUG(D_CACHE,
+		       "md_intent_lock failed with %d for fid "DFID"\n",
+		       rc, PFID(ll_inode2fid(inode)));
+		mutex_unlock(&lli->lli_xattrs_enq_lock);
+		return rc;
+	}
+
+	*req = (struct ptlrpc_request *)oit->d.lustre.it_data;
+out:
+	down_write(&lli->lli_xattrs_list_rwsem);
+	mutex_unlock(&lli->lli_xattrs_enq_lock);
+
+	return 0;
+}
+
+/**
+ * Refill the xattr cache.
+ *
+ * Fetch and cache the whole of xattrs for @inode, acquiring
+ * a read or a write xattr lock depending on operation in @oit.
+ * Intent is dropped on exit unless the operation is setxattr.
+ *
+ * \retval 0       no error occured
+ * \retval -EPROTO network protocol error
+ * \retval -ENOMEM not enough memory for the cache
+ */
+static int ll_xattr_cache_refill(struct inode *inode, struct lookup_intent *oit)
+{
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct ptlrpc_request *req = NULL;
+	const char *xdata, *xval, *xtail, *xvtail;
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct mdt_body *body;
+	__u32 *xsizes;
+	int rc = 0, i;
+
+
+
+	rc = ll_xattr_find_get_lock(inode, oit, &req);
+	if (rc)
+		GOTO(out_no_unlock, rc);
+
+	/* Do we have the data at this point? */
+	if (ll_xattr_cache_valid(lli)) {
+		ll_stats_ops_tally(sbi, LPROC_LL_GETXATTR_HITS, 1);
+		GOTO(out_maybe_drop, rc = 0);
+	}
+
+	/* Matched but no cache? Cancelled on error by a parallel refill. */
+	if (unlikely(req == NULL)) {
+		CDEBUG(D_CACHE, "cancelled by a parallel getxattr\n");
+		GOTO(out_maybe_drop, rc = -EIO);
+	}
+
+	if (oit->d.lustre.it_status < 0) {
+		CDEBUG(D_CACHE, "getxattr intent returned %d for fid "DFID"\n",
+		       oit->d.lustre.it_status, PFID(ll_inode2fid(inode)));
+		GOTO(out_destroy, rc = oit->d.lustre.it_status);
+	}
+
+	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
+	if (body == NULL) {
+		CERROR("no MDT BODY in the refill xattr reply\n");
+		GOTO(out_destroy, rc = -EPROTO);
+	}
+	/* do not need swab xattr data */
+	xdata = req_capsule_server_sized_get(&req->rq_pill, &RMF_EADATA,
+						body->eadatasize);
+	xval = req_capsule_server_sized_get(&req->rq_pill, &RMF_EAVALS,
+						body->aclsize);
+	xsizes = req_capsule_server_sized_get(&req->rq_pill, &RMF_EAVALS_LENS,
+					      body->max_mdsize * sizeof(__u32));
+	if (xdata == NULL || xval == NULL || xsizes == NULL) {
+		CERROR("wrong setxattr reply\n");
+		GOTO(out_destroy, rc = -EPROTO);
+	}
+
+	xtail = xdata + body->eadatasize;
+	xvtail = xval + body->aclsize;
+
+	CDEBUG(D_CACHE, "caching: xdata=%p xtail=%p\n", xdata, xtail);
+
+	ll_xattr_cache_init(lli);
+
+	for (i = 0; i < body->max_mdsize; i++) {
+		CDEBUG(D_CACHE, "caching [%s]=%.*s\n", xdata, *xsizes, xval);
+		/* Perform consistency checks: attr names and vals in pill */
+		if (memchr(xdata, 0, xtail - xdata) == NULL) {
+			CERROR("xattr protocol violation (names are broken)\n");
+			rc = -EPROTO;
+		} else if (xval + *xsizes > xvtail) {
+			CERROR("xattr protocol violation (vals are broken)\n");
+			rc = -EPROTO;
+		} else if (OBD_FAIL_CHECK(OBD_FAIL_LLITE_XATTR_ENOMEM)) {
+			rc = -ENOMEM;
+		} else {
+			rc = ll_xattr_cache_add(&lli->lli_xattrs, xdata, xval,
+						*xsizes);
+		}
+		if (rc < 0) {
+			ll_xattr_cache_destroy_locked(lli);
+			GOTO(out_destroy, rc);
+		}
+		xdata += strlen(xdata) + 1;
+		xval  += *xsizes;
+		xsizes++;
+	}
+
+	if (xdata != xtail || xval != xvtail)
+		CERROR("a hole in xattr data\n");
+
+	ll_set_lock_data(sbi->ll_md_exp, inode, oit, NULL);
+
+	GOTO(out_maybe_drop, rc);
+out_maybe_drop:
+	/* drop lock on error or getxattr */
+	if (rc != 0 || oit->it_op != IT_SETXATTR)
+		ll_intent_drop_lock(oit);
+
+	if (rc != 0)
+		up_write(&lli->lli_xattrs_list_rwsem);
+out_no_unlock:
+	ptlrpc_req_finished(req);
+
+	return rc;
+
+out_destroy:
+	up_write(&lli->lli_xattrs_list_rwsem);
+
+	ldlm_lock_decref_and_cancel((struct lustre_handle *)
+					&oit->d.lustre.it_lock_handle,
+					oit->d.lustre.it_lock_mode);
+
+	goto out_no_unlock;
+}
+
+/**
+ * Get an xattr value or list xattrs using the write-through cache.
+ *
+ * Get the xattr value (@valid has OBD_MD_FLXATTR set) of @name or
+ * list xattr names (@valid has OBD_MD_FLXATTRLS set) for @inode.
+ * The resulting value/list is stored in @buffer if the former
+ * is not larger than @size.
+ *
+ * \retval 0        no error occured
+ * \retval -EPROTO  network protocol error
+ * \retval -ENOMEM  not enough memory for the cache
+ * \retval -ERANGE  the buffer is not large enough
+ * \retval -ENODATA no such attr or the list is empty
+ */
+int ll_xattr_cache_get(struct inode *inode,
+			const char *name,
+			char *buffer,
+			size_t size,
+			__u64 valid)
+{
+	struct lookup_intent oit = { .it_op = IT_GETXATTR };
+	struct ll_inode_info *lli = ll_i2info(inode);
+	int rc = 0;
+
+
+
+	LASSERT(!!(valid & OBD_MD_FLXATTR) ^ !!(valid & OBD_MD_FLXATTRLS));
+
+	down_read(&lli->lli_xattrs_list_rwsem);
+	if (!ll_xattr_cache_valid(lli)) {
+		up_read(&lli->lli_xattrs_list_rwsem);
+		rc = ll_xattr_cache_refill(inode, &oit);
+		if (rc)
+			return rc;
+		downgrade_write(&lli->lli_xattrs_list_rwsem);
+	} else {
+		ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_GETXATTR_HITS, 1);
+	}
+
+	if (valid & OBD_MD_FLXATTR) {
+		struct ll_xattr_entry *xattr;
+
+		rc = ll_xattr_cache_find(&lli->lli_xattrs, name, &xattr);
+		if (rc == 0) {
+			rc = xattr->xe_vallen;
+			/* zero size means we are only requested size in rc */
+			if (size != 0) {
+				if (size >= xattr->xe_vallen)
+					memcpy(buffer, xattr->xe_value,
+						xattr->xe_vallen);
+				else
+					rc = -ERANGE;
+			}
+		}
+	} else if (valid & OBD_MD_FLXATTRLS) {
+		rc = ll_xattr_cache_list(&lli->lli_xattrs,
+					 size ? buffer : NULL, size);
+	}
+
+	GOTO(out, rc);
+out:
+	up_read(&lli->lli_xattrs_list_rwsem);
+
+	return rc;
+}
+
+
+/**
+ * Set/update an xattr value or remove xattr using the write-through cache.
+ *
+ * Set/update the xattr value (if @valid has OBD_MD_FLXATTR) of @name to @newval
+ * or
+ * remove the xattr @name (@valid has OBD_MD_FLXATTRRM set) from @inode.
+ * @flags is either XATTR_CREATE or XATTR_REPLACE as defined by setxattr(2)
+ *
+ * \retval 0        no error occured
+ * \retval -EPROTO  network protocol error
+ * \retval -ENOMEM  not enough memory for the cache
+ * \retval -ERANGE  the buffer is not large enough
+ * \retval -ENODATA no such attr (in the removal case)
+ */
+int ll_xattr_cache_update(struct inode *inode,
+			const char *name,
+			const char *newval,
+			size_t size,
+			__u64 valid,
+			int flags)
+{
+	struct lookup_intent oit = { .it_op = IT_SETXATTR };
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct ptlrpc_request *req = NULL;
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct obd_capa *oc;
+	int rc;
+
+
+
+	LASSERT(!!(valid & OBD_MD_FLXATTR) ^ !!(valid & OBD_MD_FLXATTRRM));
+
+	rc = ll_xattr_cache_refill(inode, &oit);
+	if (rc)
+		return rc;
+
+	oc = ll_mdscapa_get(inode);
+	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
+			valid | OBD_MD_FLXATTRLOCKED, name, newval,
+			size, 0, flags, ll_i2suppgid(inode), &req);
+	capa_put(oc);
+
+	if (rc) {
+		ll_intent_drop_lock(&oit);
+		GOTO(out, rc);
+	}
+
+	if (valid & OBD_MD_FLXATTR)
+		rc = ll_xattr_cache_add(&lli->lli_xattrs, name, newval, size);
+	else if (valid & OBD_MD_FLXATTRRM)
+		rc = ll_xattr_cache_del(&lli->lli_xattrs, name);
+
+	ll_intent_drop_lock(&oit);
+	GOTO(out, rc);
+out:
+	up_write(&lli->lli_xattrs_list_rwsem);
+	ptlrpc_req_finished(req);
+
+	return rc;
+}
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_internal.h b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
index b995af6..5069829 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_internal.h
+++ b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
@@ -72,6 +72,7 @@ void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 		   __u32 mode, __u64 rdev, __u64 flags, const void *data,
 		   int datalen);
 void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
+void mdc_getxattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
 void mdc_link_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
 void mdc_rename_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 		     const char *old, int oldlen, const char *new, int newlen);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 09dee11..cc5490c 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -360,6 +360,62 @@ static struct ptlrpc_request *mdc_intent_open_pack(struct obd_export *exp,
 	return req;
 }
 
+static struct ptlrpc_request *
+mdc_intent_getxattr_pack(struct obd_export *exp,
+			 struct lookup_intent *it,
+			 struct md_op_data *op_data)
+{
+	struct ptlrpc_request	*req;
+	struct ldlm_intent	*lit;
+	int			rc, count = 0, maxdata;
+	LIST_HEAD(cancels);
+
+
+
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp),
+					&RQF_LDLM_INTENT_GETXATTR);
+	if (req == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	mdc_set_capa_size(req, &RMF_CAPA1, op_data->op_capa1);
+
+	if (it->it_op == IT_SETXATTR)
+		/* If we want to upgrade to LCK_PW, let's cancel LCK_PR
+		 * locks now. This avoids unnecessary ASTs. */
+		count = mdc_resource_get_unused(exp, &op_data->op_fid1,
+						&cancels, LCK_PW,
+						MDS_INODELOCK_XATTR);
+
+	rc = ldlm_prep_enqueue_req(exp, req, &cancels, count);
+	if (rc) {
+		ptlrpc_request_free(req);
+		return ERR_PTR(rc);
+	}
+
+	/* pack the intent */
+	lit = req_capsule_client_get(&req->rq_pill, &RMF_LDLM_INTENT);
+	lit->opc = IT_GETXATTR;
+
+	maxdata = class_exp2cliimp(exp)->imp_connect_data.ocd_max_easize;
+
+	/* pack the intended request */
+	mdc_pack_body(req, &op_data->op_fid1, op_data->op_capa1,
+			op_data->op_valid, maxdata, -1, 0);
+
+	req_capsule_set_size(&req->rq_pill, &RMF_EADATA,
+				RCL_SERVER, maxdata);
+
+	req_capsule_set_size(&req->rq_pill, &RMF_EAVALS,
+				RCL_SERVER, maxdata);
+
+	req_capsule_set_size(&req->rq_pill, &RMF_EAVALS_LENS,
+				RCL_SERVER, maxdata);
+
+	ptlrpc_request_set_replen(req);
+
+	return req;
+}
+
 static struct ptlrpc_request *mdc_intent_unlink_pack(struct obd_export *exp,
 						     struct lookup_intent *it,
 						     struct md_op_data *op_data)
@@ -735,6 +791,8 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 			    { .l_inodebits = { MDS_INODELOCK_UPDATE } };
 	static const ldlm_policy_data_t layout_policy =
 			    { .l_inodebits = { MDS_INODELOCK_LAYOUT } };
+	static const ldlm_policy_data_t getxattr_policy = {
+			      .l_inodebits = { MDS_INODELOCK_XATTR } };
 	ldlm_policy_data_t const *policy = &lookup_policy;
 	int		    generation, resends = 0;
 	struct ldlm_reply     *lockrep;
@@ -751,6 +809,8 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 			policy = &update_policy;
 		else if (it->it_op & IT_LAYOUT)
 			policy = &layout_policy;
+		else if (it->it_op & (IT_GETXATTR | IT_SETXATTR))
+			policy = &getxattr_policy;
 	}
 
 	LASSERT(reqp == NULL);
@@ -781,9 +841,10 @@ resend:
 	} else if (it->it_op & IT_LAYOUT) {
 		if (!imp_connect_lvb_type(class_exp2cliimp(exp)))
 			return -EOPNOTSUPP;
-
 		req = mdc_intent_layout_pack(exp, it, op_data);
 		lvb_type = LVB_T_LAYOUT;
+	} else if (it->it_op & (IT_GETXATTR | IT_SETXATTR)) {
+		req = mdc_intent_getxattr_pack(exp, it, op_data);
 	} else {
 		LBUG();
 		return -EINVAL;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index c4381eb..eee8874 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -462,6 +462,25 @@ static const struct req_msg_field *ldlm_intent_unlink_client[] = {
 	&RMF_NAME
 };
 
+static const struct req_msg_field *ldlm_intent_getxattr_client[] = {
+	&RMF_PTLRPC_BODY,
+	&RMF_DLM_REQ,
+	&RMF_LDLM_INTENT,
+	&RMF_MDT_BODY,
+	&RMF_CAPA1,
+};
+
+static const struct req_msg_field *ldlm_intent_getxattr_server[] = {
+	&RMF_PTLRPC_BODY,
+	&RMF_DLM_REP,
+	&RMF_MDT_BODY,
+	&RMF_MDT_MD,
+	&RMF_ACL, /* for req_capsule_extend/mdt_intent_policy */
+	&RMF_EADATA,
+	&RMF_EAVALS,
+	&RMF_EAVALS_LENS
+};
+
 static const struct req_msg_field *mds_getxattr_client[] = {
 	&RMF_PTLRPC_BODY,
 	&RMF_MDT_BODY,
@@ -739,6 +758,7 @@ static struct req_format *req_formats[] = {
 	&RQF_LDLM_INTENT_OPEN,
 	&RQF_LDLM_INTENT_CREATE,
 	&RQF_LDLM_INTENT_UNLINK,
+	&RQF_LDLM_INTENT_GETXATTR,
 	&RQF_LDLM_INTENT_QUOTA,
 	&RQF_QUOTA_DQACQ,
 	&RQF_LOG_CANCEL,
@@ -1013,6 +1033,9 @@ struct req_msg_field RMF_EADATA = DEFINE_MSGF("eadata", 0, -1,
 						    NULL, NULL);
 EXPORT_SYMBOL(RMF_EADATA);
 
+struct req_msg_field RMF_EAVALS = DEFINE_MSGF("eavals", 0, -1, NULL, NULL);
+EXPORT_SYMBOL(RMF_EAVALS);
+
 struct req_msg_field RMF_ACL =
 	DEFINE_MSGF("acl", RMF_F_NO_SIZE_CHECK,
 		    LUSTRE_POSIX_ACL_MAX_SIZE, NULL, NULL);
@@ -1064,6 +1087,11 @@ struct req_msg_field RMF_RCS =
 		    lustre_swab_generic_32s, dump_rcs);
 EXPORT_SYMBOL(RMF_RCS);
 
+struct req_msg_field RMF_EAVALS_LENS =
+	DEFINE_MSGF("eavals_lens", RMF_F_STRUCT_ARRAY, sizeof(__u32),
+		lustre_swab_generic_32s, NULL);
+EXPORT_SYMBOL(RMF_EAVALS_LENS);
+
 struct req_msg_field RMF_OBD_ID =
 	DEFINE_MSGF("obd_id", 0,
 		    sizeof(obd_id), lustre_swab_ost_last_id, NULL);
@@ -1421,6 +1449,12 @@ struct req_format RQF_LDLM_INTENT_UNLINK =
 			ldlm_intent_unlink_client, ldlm_intent_server);
 EXPORT_SYMBOL(RQF_LDLM_INTENT_UNLINK);
 
+struct req_format RQF_LDLM_INTENT_GETXATTR =
+	DEFINE_REQ_FMT0("LDLM_INTENT_GETXATTR",
+			ldlm_intent_getxattr_client,
+			ldlm_intent_getxattr_server);
+EXPORT_SYMBOL(RQF_LDLM_INTENT_GETXATTR);
+
 struct req_format RQF_MDS_CLOSE =
 	DEFINE_REQ_FMT0("MDS_CLOSE",
 			mdt_close_client, mds_last_unlink_server);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs
  2013-11-26  2:04 ` [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs Peng Tao
@ 2013-11-26  3:14   ` Greg Kroah-Hartman
  2013-11-26  3:25     ` Peng Tao
  0 siblings, 1 reply; 25+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-26  3:14 UTC (permalink / raw)
  To: Peng Tao; +Cc: linux-kernel, Mikhail Pershin, Andreas Dilger

On Tue, Nov 26, 2013 at 10:04:56AM +0800, Peng Tao wrote:
> From: Mikhail Pershin <mike.pershin@intel.com>
> 
> MGC uses lvfs API to access local llogs blocking removal of old code
> 
> - MGS is converted to use OSD API for local llogs
> - llog_is_empty() and llog_backup() are introduced
> - initial OSD start initialize run ctxt so llog can use it too
> - shrink dcache after initial scrub in osd-ldiskfs to avoid holding
>   data of unlinked objects until umount.

It is not checkpatch clean :(

Come on, it's not hard to do this yourself, why do you make me find
these errors for you?  It does nothing but make me not want to look at
lustre patches for a long time...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs
  2013-11-26  3:14   ` Greg Kroah-Hartman
@ 2013-11-26  3:25     ` Peng Tao
  2013-11-26  3:34       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 25+ messages in thread
From: Peng Tao @ 2013-11-26  3:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux Kernel Mailing List, Mikhail Pershin, Andreas Dilger

On Tue, Nov 26, 2013 at 11:14 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Tue, Nov 26, 2013 at 10:04:56AM +0800, Peng Tao wrote:
>> From: Mikhail Pershin <mike.pershin@intel.com>
>>
>> MGC uses lvfs API to access local llogs blocking removal of old code
>>
>> - MGS is converted to use OSD API for local llogs
>> - llog_is_empty() and llog_backup() are introduced
>> - initial OSD start initialize run ctxt so llog can use it too
>> - shrink dcache after initial scrub in osd-ldiskfs to avoid holding
>>   data of unlinked objects until umount.
>
> It is not checkpatch clean :(
>
> Come on, it's not hard to do this yourself, why do you make me find
> these errors for you?  It does nothing but make me not want to look at
> lustre patches for a long time...

Strange. I did run checkpatch before sending the patchset out.

Just tried again. Still pass.

[X61@linux-lustre]$./scripts/checkpatch.pl
0002-staging-lustre-llog-MGC-to-use-OSD-API-for-backup-lo.patch total:
0 errors, 0 warnings, 883 lines checked

0002-staging-lustre-llog-MGC-to-use-OSD-API-for-backup-lo.patch has no
obvious style problems and is ready for submission.

What message did you see?

Thanks,
Tao

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 08/16] staging/lustre/mdt: HSM coordinator agent interface
  2013-11-26  2:05 ` [PATCH 08/16] staging/lustre/mdt: HSM coordinator agent interface Peng Tao
@ 2013-11-26  3:30   ` Greg Kroah-Hartman
  2013-11-26  4:09     ` Peng Tao
  0 siblings, 1 reply; 25+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-26  3:30 UTC (permalink / raw)
  To: Peng Tao; +Cc: linux-kernel, JC Lafoucriere, Andreas Dilger

On Tue, Nov 26, 2013 at 10:05:02AM +0800, Peng Tao wrote:
> From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
> 
> To move data with external storage, HSM coordinator
> uses a Copy Tool running on a client named agent.
> This patch implements the interface for these agents.

Interesting text here...

> Lustre-change: http://review.whamcloud.com/6534
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3342
> Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> Reviewed-by: John L. Hammond <john.hammond@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: Peng Tao <bergwolf@gmail.com>
> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
> ---
>  .../lustre/lustre/include/lustre/lustre_user.h     |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
> index 9436166..631f026 100644
> --- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
> +++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
> @@ -428,8 +428,8 @@ struct obd_uuid {
>  	char uuid[UUID_MAX];
>  };
>  
> -static inline int obd_uuid_equals(const struct obd_uuid *u1,
> -				  const struct obd_uuid *u2)
> +static inline bool obd_uuid_equals(const struct obd_uuid *u1,
> +				   const struct obd_uuid *u2)
>  {
>  	return strcmp((char *)u1->uuid, (char *)u2->uuid) == 0;
>  }
> @@ -446,7 +446,7 @@ static inline void obd_str2uuid(struct obd_uuid *uuid, const char *tmp)
>  }
>  
>  /* For printf's only, make sure uuid is terminated */
> -static inline char *obd_uuid2str(struct obd_uuid *uuid)
> +static inline char *obd_uuid2str(const struct obd_uuid *uuid)
>  {
>  	if (uuid->uuid[sizeof(*uuid) - 1] != '\0') {
>  		/* Obviously not safe, but for printfs, no real harm done...

Too bad it doesn't describe the changes made in the code at all.

How can so many people review a patch that is not the same as what it
says it really is?

I'm stopping here in the series, sorry.  Please fix up and resend the
rest when you can.

greg k-h

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs
  2013-11-26  3:25     ` Peng Tao
@ 2013-11-26  3:34       ` Greg Kroah-Hartman
  2013-11-26  4:05         ` Peng Tao
  0 siblings, 1 reply; 25+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-26  3:34 UTC (permalink / raw)
  To: Peng Tao; +Cc: Linux Kernel Mailing List, Mikhail Pershin, Andreas Dilger

On Tue, Nov 26, 2013 at 11:25:24AM +0800, Peng Tao wrote:
> On Tue, Nov 26, 2013 at 11:14 AM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > On Tue, Nov 26, 2013 at 10:04:56AM +0800, Peng Tao wrote:
> >> From: Mikhail Pershin <mike.pershin@intel.com>
> >>
> >> MGC uses lvfs API to access local llogs blocking removal of old code
> >>
> >> - MGS is converted to use OSD API for local llogs
> >> - llog_is_empty() and llog_backup() are introduced
> >> - initial OSD start initialize run ctxt so llog can use it too
> >> - shrink dcache after initial scrub in osd-ldiskfs to avoid holding
> >>   data of unlinked objects until umount.
> >
> > It is not checkpatch clean :(
> >
> > Come on, it's not hard to do this yourself, why do you make me find
> > these errors for you?  It does nothing but make me not want to look at
> > lustre patches for a long time...
> 
> Strange. I did run checkpatch before sending the patchset out.
> 
> Just tried again. Still pass.
> 
> [X61@linux-lustre]$./scripts/checkpatch.pl
> 0002-staging-lustre-llog-MGC-to-use-OSD-API-for-backup-lo.patch total:
> 0 errors, 0 warnings, 883 lines checked
> 
> 0002-staging-lustre-llog-MGC-to-use-OSD-API-for-backup-lo.patch has no
> obvious style problems and is ready for submission.
> 
> What message did you see?

One about return();

greg k-h

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs
  2013-11-26  3:34       ` Greg Kroah-Hartman
@ 2013-11-26  4:05         ` Peng Tao
  0 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  4:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux Kernel Mailing List, Mikhail Pershin, Andreas Dilger

[-- Attachment #1: Type: text/plain, Size: 1570 bytes --]

On 11/26/2013 11:34 AM, Greg Kroah-Hartman wrote:
> On Tue, Nov 26, 2013 at 11:25:24AM +0800, Peng Tao wrote:
>> On Tue, Nov 26, 2013 at 11:14 AM, Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>>> On Tue, Nov 26, 2013 at 10:04:56AM +0800, Peng Tao wrote:
>>>> From: Mikhail Pershin <mike.pershin@intel.com>
>>>>
>>>> MGC uses lvfs API to access local llogs blocking removal of old code
>>>>
>>>> - MGS is converted to use OSD API for local llogs
>>>> - llog_is_empty() and llog_backup() are introduced
>>>> - initial OSD start initialize run ctxt so llog can use it too
>>>> - shrink dcache after initial scrub in osd-ldiskfs to avoid holding
>>>>    data of unlinked objects until umount.
>>>
>>> It is not checkpatch clean :(
>>>
>>> Come on, it's not hard to do this yourself, why do you make me find
>>> these errors for you?  It does nothing but make me not want to look at
>>> lustre patches for a long time...
>>
>> Strange. I did run checkpatch before sending the patchset out.
>>
>> Just tried again. Still pass.
>>
>> [X61@linux-lustre]$./scripts/checkpatch.pl
>> 0002-staging-lustre-llog-MGC-to-use-OSD-API-for-backup-lo.patch total:
>> 0 errors, 0 warnings, 883 lines checked
>>
>> 0002-staging-lustre-llog-MGC-to-use-OSD-API-for-backup-lo.patch has no
>> obvious style problems and is ready for submission.
>>
>> What message did you see?
>
> One about return();
>
OK. Looks like a latest checkpatch.pl change. After rebasing my working 
copy to latest staging-next, I can see the same error. I'll fix up. 
Sorry for the noise.

Thanks,
Tao





[-- Attachment #2: 0002-staging-lustre-llog-MGC-to-use-OSD-API-for-backup-lo.patch --]
[-- Type: text/x-patch, Size: 29486 bytes --]

>From 93455fbe9b61a66218225c2ec4c14fc2a86bcf67 Mon Sep 17 00:00:00 2001
From: Mikhail Pershin <mike.pershin@intel.com>
Date: Thu, 31 Oct 2013 17:06:29 +0800
Subject: [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup
 logs

MGC uses lvfs API to access local llogs blocking removal of old code

- MGS is converted to use OSD API for local llogs
- llog_is_empty() and llog_backup() are introduced
- initial OSD start initialize run ctxt so llog can use it too
- shrink dcache after initial scrub in osd-ldiskfs to avoid holding
  data of unlinked objects until umount.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2059
Lustre-change: http://review.whamcloud.com/5049
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/dt_object.h  |    2 +-
 drivers/staging/lustre/lustre/include/lustre_log.h |   13 +-
 drivers/staging/lustre/lustre/include/obd.h        |    4 +-
 drivers/staging/lustre/lustre/mgc/libmgc.c         |    3 -
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |  401 ++++++++++++--------
 drivers/staging/lustre/lustre/obdclass/llog.c      |  212 +++++++----
 .../staging/lustre/lustre/obdclass/local_storage.c |    9 +-
 .../staging/lustre/lustre/obdclass/local_storage.h |    3 +
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    2 +-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    3 +
 10 files changed, 408 insertions(+), 244 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/dt_object.h b/drivers/staging/lustre/lustre/include/dt_object.h
index e116bb2..9304c26 100644
--- a/drivers/staging/lustre/lustre/include/dt_object.h
+++ b/drivers/staging/lustre/lustre/include/dt_object.h
@@ -692,7 +692,7 @@ struct local_oid_storage {
 	struct dt_object *los_obj;
 
 	/* data used to generate new fids */
-	struct mutex	 los_id_lock;
+	struct mutex	  los_id_lock;
 	__u64		  los_seq;
 	__u32		  los_last_oid;
 };
diff --git a/drivers/staging/lustre/lustre/include/lustre_log.h b/drivers/staging/lustre/lustre/include/lustre_log.h
index 721aa05..896c757 100644
--- a/drivers/staging/lustre/lustre/include/lustre_log.h
+++ b/drivers/staging/lustre/lustre/include/lustre_log.h
@@ -136,7 +136,11 @@ int llog_open(const struct lu_env *env, struct llog_ctxt *ctxt,
 	      struct llog_handle **lgh, struct llog_logid *logid,
 	      char *name, enum llog_open_param open_param);
 int llog_close(const struct lu_env *env, struct llog_handle *cathandle);
-int llog_get_size(struct llog_handle *loghandle);
+int llog_is_empty(const struct lu_env *env, struct llog_ctxt *ctxt,
+		  char *name);
+int llog_backup(const struct lu_env *env, struct obd_device *obd,
+		struct llog_ctxt *ctxt, struct llog_ctxt *bak_ctxt,
+		char *name, char *backup);
 
 /* llog_process flags */
 #define LLOG_FLAG_NODEAMON 0x0001
@@ -382,6 +386,13 @@ static inline int llog_data_len(int len)
 	return cfs_size_round(len);
 }
 
+static inline int llog_get_size(struct llog_handle *loghandle)
+{
+	if (loghandle && loghandle->lgh_hdr)
+		return loghandle->lgh_hdr->llh_count;
+	return 0;
+}
+
 static inline struct llog_ctxt *llog_ctxt_get(struct llog_ctxt *ctxt)
 {
 	atomic_inc(&ctxt->loc_refcount);
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index d0aea15..f0c1773 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -399,8 +399,8 @@ struct client_obd {
 
 	/* mgc datastruct */
 	struct semaphore	 cl_mgc_sem;
-	struct vfsmount	 *cl_mgc_vfsmnt;
-	struct dentry	   *cl_mgc_configs_dir;
+	struct local_oid_storage *cl_mgc_los;
+	struct dt_object	*cl_mgc_configs_dir;
 	atomic_t	     cl_mgc_refcount;
 	struct obd_export       *cl_mgc_mgsexp;
 
diff --git a/drivers/staging/lustre/lustre/mgc/libmgc.c b/drivers/staging/lustre/lustre/mgc/libmgc.c
index 7b4947c..9b40c57 100644
--- a/drivers/staging/lustre/lustre/mgc/libmgc.c
+++ b/drivers/staging/lustre/lustre/mgc/libmgc.c
@@ -99,11 +99,8 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 
 static int mgc_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
 	int rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt == NULL);
-
 	ptlrpcd_decref();
 
 	rc = client_obd_cleanup(obd);
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 93b601d..bf4d8e9 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -41,17 +41,14 @@
 #define DEBUG_SUBSYSTEM S_MGC
 #define D_MGC D_CONFIG /*|D_WARNING*/
 
-# include <linux/module.h>
-# include <linux/pagemap.h>
-# include <linux/miscdevice.h>
-# include <linux/init.h>
-
+#include <linux/module.h>
 #include <obd_class.h>
 #include <lustre_dlm.h>
 #include <lprocfs_status.h>
 #include <lustre_log.h>
-#include <lustre_fsfilt.h>
 #include <lustre_disk.h>
+#include <dt_object.h>
+
 #include "mgc_internal.h"
 
 static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
@@ -578,97 +575,175 @@ static void mgc_requeue_add(struct config_llog_data *cld)
 }
 
 /********************** class fns **********************/
+static int mgc_local_llog_init(const struct lu_env *env,
+			       struct obd_device *obd,
+			       struct obd_device *disk)
+{
+	struct llog_ctxt	*ctxt;
+	int			 rc;
+
+	rc = llog_setup(env, obd, &obd->obd_olg, LLOG_CONFIG_ORIG_CTXT, disk,
+			&llog_osd_ops);
+	if (rc)
+		return rc;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
+	LASSERT(ctxt);
+	ctxt->loc_dir = obd->u.cli.cl_mgc_configs_dir;
+	llog_ctxt_put(ctxt);
 
-static int mgc_fs_setup(struct obd_device *obd, struct super_block *sb,
-			struct vfsmount *mnt)
+	return 0;
+}
+
+static int mgc_local_llog_fini(const struct lu_env *env,
+			       struct obd_device *obd)
 {
-	struct lvfs_run_ctxt saved;
-	struct lustre_sb_info *lsi = s2lsi(sb);
-	struct client_obd *cli = &obd->u.cli;
-	struct dentry *dentry;
-	char *label;
-	int err = 0;
+	struct llog_ctxt *ctxt;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
+	llog_cleanup(env, ctxt);
+
+	return 0;
+}
+
+static int mgc_fs_setup(struct obd_device *obd, struct super_block *sb)
+{
+	struct lustre_sb_info	*lsi = s2lsi(sb);
+	struct client_obd	*cli = &obd->u.cli;
+	struct lu_fid		 rfid, fid;
+	struct dt_object	*root, *dto;
+	struct lu_env		*env;
+	int			 rc = 0;
 
 	LASSERT(lsi);
-	LASSERT(lsi->lsi_srv_mnt == mnt);
+	LASSERT(lsi->lsi_dt_dev);
+
+	OBD_ALLOC_PTR(env);
+	if (env == NULL)
+		return -ENOMEM;
 
 	/* The mgc fs exclusion sem. Only one fs can be setup at a time. */
 	down(&cli->cl_mgc_sem);
 
 	cfs_cleanup_group_info();
 
-	obd->obd_fsops = fsfilt_get_ops(lsi->lsi_fstype);
-	if (IS_ERR(obd->obd_fsops)) {
-		up(&cli->cl_mgc_sem);
-		CERROR("%s: No fstype %s: rc = %ld\n", lsi->lsi_fstype,
-		       obd->obd_name, PTR_ERR(obd->obd_fsops));
-		return PTR_ERR(obd->obd_fsops);
-	}
+	/* Setup the configs dir */
+	rc = lu_env_init(env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(out_err, rc);
 
-	cli->cl_mgc_vfsmnt = mnt;
-	err = fsfilt_setup(obd, mnt->mnt_sb);
-	if (err)
-		GOTO(err_ops, err);
-
-	OBD_SET_CTXT_MAGIC(&obd->obd_lvfs_ctxt);
-	obd->obd_lvfs_ctxt.pwdmnt = mnt;
-	obd->obd_lvfs_ctxt.pwd = mnt->mnt_root;
-	obd->obd_lvfs_ctxt.fs = get_ds();
-
-	push_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-	dentry = ll_lookup_one_len(MOUNT_CONFIGS_DIR, cfs_fs_pwd(current->fs),
-				   strlen(MOUNT_CONFIGS_DIR));
-	pop_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-	if (IS_ERR(dentry)) {
-		err = PTR_ERR(dentry);
-		CERROR("cannot lookup %s directory: rc = %d\n",
-		       MOUNT_CONFIGS_DIR, err);
-		GOTO(err_ops, err);
-	}
-	cli->cl_mgc_configs_dir = dentry;
+	fid.f_seq = FID_SEQ_LOCAL_NAME;
+	fid.f_oid = 1;
+	fid.f_ver = 0;
+	rc = local_oid_storage_init(env, lsi->lsi_dt_dev, &fid,
+				    &cli->cl_mgc_los);
+	if (rc)
+		GOTO(out_env, rc);
+
+	rc = dt_root_get(env, lsi->lsi_dt_dev, &rfid);
+	if (rc)
+		GOTO(out_env, rc);
+
+	root = dt_locate_at(env, lsi->lsi_dt_dev, &rfid,
+			    &cli->cl_mgc_los->los_dev->dd_lu_dev);
+	if (unlikely(IS_ERR(root)))
+		GOTO(out_los, rc = PTR_ERR(root));
+
+	dto = local_file_find_or_create(env, cli->cl_mgc_los, root,
+					MOUNT_CONFIGS_DIR,
+					S_IFDIR | S_IRUGO | S_IWUSR | S_IXUGO);
+	lu_object_put_nocache(env, &root->do_lu);
+	if (IS_ERR(dto))
+		GOTO(out_los, rc = PTR_ERR(dto));
+
+	cli->cl_mgc_configs_dir = dto;
+
+	LASSERT(lsi->lsi_osd_exp->exp_obd->obd_lvfs_ctxt.dt);
+	rc = mgc_local_llog_init(env, obd, lsi->lsi_osd_exp->exp_obd);
+	if (rc)
+		GOTO(out_llog, rc);
 
 	/* We take an obd ref to insure that we can't get to mgc_cleanup
-	   without calling mgc_fs_cleanup first. */
+	 * without calling mgc_fs_cleanup first. */
 	class_incref(obd, "mgc_fs", obd);
 
-	label = fsfilt_get_label(obd, mnt->mnt_sb);
-	if (label)
-		CDEBUG(D_MGC, "MGC using disk labelled=%s\n", label);
-
 	/* We keep the cl_mgc_sem until mgc_fs_cleanup */
-	return 0;
-
-err_ops:
-	fsfilt_put_ops(obd->obd_fsops);
-	obd->obd_fsops = NULL;
-	cli->cl_mgc_vfsmnt = NULL;
-	up(&cli->cl_mgc_sem);
-	return err;
+out_llog:
+	if (rc) {
+		lu_object_put(env, &cli->cl_mgc_configs_dir->do_lu);
+		cli->cl_mgc_configs_dir = NULL;
+	}
+out_los:
+	if (rc < 0) {
+		local_oid_storage_fini(env, cli->cl_mgc_los);
+		cli->cl_mgc_los = NULL;
+		up(&cli->cl_mgc_sem);
+	}
+out_env:
+	lu_env_fini(env);
+out_err:
+	OBD_FREE_PTR(env);
+	return rc;
 }
 
 static int mgc_fs_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
-	int rc = 0;
+	struct lu_env		 env;
+	struct client_obd	*cli = &obd->u.cli;
+	int			 rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt != NULL);
+	LASSERT(cli->cl_mgc_los != NULL);
 
-	if (cli->cl_mgc_configs_dir != NULL) {
-		struct lvfs_run_ctxt saved;
-		push_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-		l_dput(cli->cl_mgc_configs_dir);
-		cli->cl_mgc_configs_dir = NULL;
-		pop_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-		class_decref(obd, "mgc_fs", obd);
-	}
+	rc = lu_env_init(&env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(unlock, rc);
+
+	mgc_local_llog_fini(&env, obd);
 
-	cli->cl_mgc_vfsmnt = NULL;
-	if (obd->obd_fsops)
-		fsfilt_put_ops(obd->obd_fsops);
+	lu_object_put_nocache(&env, &cli->cl_mgc_configs_dir->do_lu);
+	cli->cl_mgc_configs_dir = NULL;
 
+	local_oid_storage_fini(&env, cli->cl_mgc_los);
+	cli->cl_mgc_los = NULL;
+	lu_env_fini(&env);
+
+unlock:
+	class_decref(obd, "mgc_fs", obd);
 	up(&cli->cl_mgc_sem);
 
-	return rc;
+	return 0;
+}
+
+static int mgc_llog_init(const struct lu_env *env, struct obd_device *obd)
+{
+	struct llog_ctxt	*ctxt;
+	int			 rc;
+
+	/* setup only remote ctxt, the local disk context is switched per each
+	 * filesystem during mgc_fs_setup() */
+	rc = llog_setup(env, obd, &obd->obd_olg, LLOG_CONFIG_REPL_CTXT, obd,
+			&llog_client_ops);
+	if (rc)
+		return rc;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
+	LASSERT(ctxt);
+
+	llog_initiator_connect(ctxt);
+	llog_ctxt_put(ctxt);
+
+	return 0;
+}
+
+static int mgc_llog_fini(const struct lu_env *env, struct obd_device *obd)
+{
+	struct llog_ctxt *ctxt;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
+	if (ctxt)
+		llog_cleanup(env, ctxt);
+
+	return 0;
 }
 
 static atomic_t mgc_count = ATOMIC_INIT(0);
@@ -694,7 +769,7 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 			}
 		}
 		obd_cleanup_client_import(obd);
-		rc = obd_llog_finish(obd, 0);
+		rc = mgc_llog_fini(NULL, obd);
 		if (rc != 0)
 			CERROR("failed to cleanup llogging subsystems\n");
 		break;
@@ -704,11 +779,8 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 
 static int mgc_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
 	int rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt == NULL);
-
 	/* COMPAT_146 - old config logs may have added profiles we don't
 	   know about */
 	if (obd->obd_type->typ_refcnt <= 1)
@@ -733,7 +805,7 @@ static int mgc_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	if (rc)
 		GOTO(err_decref, rc);
 
-	rc = obd_llog_init(obd, &obd->obd_olg, obd, NULL);
+	rc = mgc_llog_init(NULL, obd);
 	if (rc) {
 		CERROR("failed to setup llogging subsystems\n");
 		GOTO(err_cleanup, rc);
@@ -1011,23 +1083,17 @@ int mgc_set_info_async(const struct lu_env *env, struct obd_export *exp,
 	}
 	if (KEY_IS(KEY_SET_FS)) {
 		struct super_block *sb = (struct super_block *)val;
-		struct lustre_sb_info *lsi;
+
 		if (vallen != sizeof(struct super_block))
 			return -EINVAL;
-		lsi = s2lsi(sb);
-		rc = mgc_fs_setup(exp->exp_obd, sb, lsi->lsi_srv_mnt);
-		if (rc) {
-			CERROR("set_fs got %d\n", rc);
-		}
+
+		rc = mgc_fs_setup(exp->exp_obd, sb);
 		return rc;
 	}
 	if (KEY_IS(KEY_CLEAR_FS)) {
 		if (vallen != 0)
 			return -EINVAL;
 		rc = mgc_fs_cleanup(exp->exp_obd);
-		if (rc) {
-			CERROR("clear_fs got %d\n", rc);
-		}
 		return rc;
 	}
 	if (KEY_IS(KEY_SET_INFO)) {
@@ -1145,49 +1211,6 @@ static int mgc_import_event(struct obd_device *obd,
 	return rc;
 }
 
-static int mgc_llog_init(struct obd_device *obd, struct obd_llog_group *olg,
-			 struct obd_device *tgt, int *index)
-{
-	struct llog_ctxt *ctxt;
-	int rc;
-
-	LASSERT(olg == &obd->obd_olg);
-
-
-	rc = llog_setup(NULL, obd, olg, LLOG_CONFIG_REPL_CTXT, tgt,
-			&llog_client_ops);
-	if (rc)
-		GOTO(out, rc);
-
-	ctxt = llog_group_get_ctxt(olg, LLOG_CONFIG_REPL_CTXT);
-	if (!ctxt)
-		GOTO(out, rc = -ENODEV);
-
-	llog_initiator_connect(ctxt);
-	llog_ctxt_put(ctxt);
-
-	return 0;
-out:
-	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-	return rc;
-}
-
-static int mgc_llog_finish(struct obd_device *obd, int count)
-{
-	struct llog_ctxt *ctxt;
-
-	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-
-	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-	return 0;
-}
-
 enum {
 	CONFIG_READ_NRPAGES_INIT = 1 << (20 - PAGE_CACHE_SHIFT),
 	CONFIG_READ_NRPAGES      = 4
@@ -1540,17 +1563,58 @@ out:
 	return rc;
 }
 
+static int mgc_llog_local_copy(const struct lu_env *env,
+			       struct obd_device *obd,
+			       struct llog_ctxt *rctxt,
+			       struct llog_ctxt *lctxt, char *logname)
+{
+	char	*temp_log;
+	int	 rc;
+
+
+
+	/*
+	 * - copy it to backup using llog_backup()
+	 * - copy remote llog to logname using llog_backup()
+	 * - if failed then move bakup to logname again
+	 */
+
+	OBD_ALLOC(temp_log, strlen(logname) + 1);
+	if (!temp_log)
+		return -ENOMEM;
+	sprintf(temp_log, "%sT", logname);
+
+	/* make a copy of local llog at first */
+	rc = llog_backup(env, obd, lctxt, lctxt, logname, temp_log);
+	if (rc < 0 && rc != -ENOENT)
+		GOTO(out, rc);
+	/* copy remote llog to the local copy */
+	rc = llog_backup(env, obd, rctxt, lctxt, logname, logname);
+	if (rc == -ENOENT) {
+		/* no remote llog, delete local one too */
+		llog_erase(env, lctxt, NULL, logname);
+	} else if (rc < 0) {
+		/* error during backup, get local one back from the copy */
+		llog_backup(env, obd, lctxt, lctxt, temp_log, logname);
+out:
+		CERROR("%s: failed to copy remote log %s: rc = %d\n",
+		       obd->obd_name, logname, rc);
+	}
+	llog_erase(env, lctxt, NULL, temp_log);
+	OBD_FREE(temp_log, strlen(logname) + 1);
+	return rc;
+}
 
 /* local_only means it cannot get remote llogs */
 static int mgc_process_cfg_log(struct obd_device *mgc,
-			       struct config_llog_data *cld,
-			       int local_only)
+			       struct config_llog_data *cld, int local_only)
 {
-	struct llog_ctxt *ctxt, *lctxt = NULL;
-	struct lvfs_run_ctxt *saved_ctxt;
-	struct lustre_sb_info *lsi = NULL;
-	int rc = 0, must_pop = 0;
-	bool sptlrpc_started = false;
+	struct llog_ctxt	*ctxt, *lctxt = NULL;
+	struct client_obd	*cli = &mgc->u.cli;
+	struct lustre_sb_info	*lsi = NULL;
+	int			 rc = 0;
+	bool			 sptlrpc_started = false;
+	struct lu_env		*env;
 
 	LASSERT(cld);
 	LASSERT(mutex_is_locked(&cld->cld_lock));
@@ -1565,20 +1629,49 @@ static int mgc_process_cfg_log(struct obd_device *mgc,
 	if (cld->cld_cfg.cfg_sb)
 		lsi = s2lsi(cld->cld_cfg.cfg_sb);
 
-	ctxt = llog_get_context(mgc, LLOG_CONFIG_REPL_CTXT);
-	if (!ctxt) {
-		CERROR("missing llog context\n");
-		return -EINVAL;
-	}
-
-	OBD_ALLOC_PTR(saved_ctxt);
-	if (saved_ctxt == NULL)
+	OBD_ALLOC_PTR(env);
+	if (env == NULL)
 		return -ENOMEM;
 
+	rc = lu_env_init(env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(out_free, rc);
+
+	ctxt = llog_get_context(mgc, LLOG_CONFIG_REPL_CTXT);
+	LASSERT(ctxt);
+
 	lctxt = llog_get_context(mgc, LLOG_CONFIG_ORIG_CTXT);
 
-		if (local_only) { /* no local log at client side */
-		GOTO(out_pop, rc = -EIO);
+	/* Copy the setup log locally if we can. Don't mess around if we're
+	 * running an MGS though (logs are already local). */
+	if (lctxt && lsi && IS_SERVER(lsi) && !IS_MGS(lsi) &&
+	    cli->cl_mgc_configs_dir != NULL &&
+	    lu2dt_dev(cli->cl_mgc_configs_dir->do_lu.lo_dev) ==
+	    lsi->lsi_dt_dev) {
+		if (!local_only)
+			/* Only try to copy log if we have the lock. */
+			rc = mgc_llog_local_copy(env, mgc, ctxt, lctxt,
+						 cld->cld_logname);
+		if (local_only || rc) {
+			if (llog_is_empty(env, lctxt, cld->cld_logname)) {
+				LCONSOLE_ERROR_MSG(0x13a,
+						   "Failed to get MGS log %s and no local copy.\n",
+						   cld->cld_logname);
+				GOTO(out_pop, rc = -ENOTCONN);
+			}
+			CDEBUG(D_MGC,
+			       "Failed to get MGS log %s, using local copy for now, will try to update later.\n",
+			       cld->cld_logname);
+		}
+		/* Now, whether we copied or not, start using the local llog.
+		 * If we failed to copy, we'll start using whatever the old
+		 * log has. */
+		llog_ctxt_put(ctxt);
+		ctxt = lctxt;
+		lctxt = NULL;
+	} else {
+		if (local_only) /* no local log at client side */
+			GOTO(out_pop, rc = -EIO);
 	}
 
 	if (cld_is_sptlrpc(cld)) {
@@ -1587,19 +1680,16 @@ static int mgc_process_cfg_log(struct obd_device *mgc,
 	}
 
 	/* logname and instance info should be the same, so use our
-	   copy of the instance for the update.  The cfg_last_idx will
-	   be updated here. */
-	rc = class_config_parse_llog(NULL, ctxt, cld->cld_logname,
+	 * copy of the instance for the update.  The cfg_last_idx will
+	 * be updated here. */
+	rc = class_config_parse_llog(env, ctxt, cld->cld_logname,
 				     &cld->cld_cfg);
 
 out_pop:
-	llog_ctxt_put(ctxt);
+	__llog_ctxt_put(env, ctxt);
 	if (lctxt)
-		llog_ctxt_put(lctxt);
-	if (must_pop)
-		pop_ctxt(saved_ctxt, &mgc->obd_lvfs_ctxt, NULL);
+		__llog_ctxt_put(env, lctxt);
 
-	OBD_FREE_PTR(saved_ctxt);
 	/*
 	 * update settings on existing OBDs. doing it inside
 	 * of llog_process_lock so no device is attaching/detaching
@@ -1614,6 +1704,9 @@ out_pop:
 					  strlen("-sptlrpc"));
 	}
 
+	lu_env_fini(env);
+out_free:
+	OBD_FREE_PTR(env);
 	return rc;
 }
 
@@ -1801,8 +1894,6 @@ struct obd_ops mgc_obd_ops = {
 	.o_set_info_async = mgc_set_info_async,
 	.o_get_info       = mgc_get_info,
 	.o_import_event = mgc_import_event,
-	.o_llog_init    = mgc_llog_init,
-	.o_llog_finish  = mgc_llog_finish,
 	.o_process_config = mgc_process_config,
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
index b4dad34..67513be 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog.c
@@ -265,31 +265,6 @@ out:
 }
 EXPORT_SYMBOL(llog_init_handle);
 
-int llog_copy_handler(const struct lu_env *env,
-		      struct llog_handle *llh,
-		      struct llog_rec_hdr *rec,
-		      void *data)
-{
-	struct llog_rec_hdr local_rec = *rec;
-	struct llog_handle *local_llh = (struct llog_handle *)data;
-	char *cfg_buf = (char*) (rec + 1);
-	struct lustre_cfg *lcfg;
-	int rc = 0;
-
-	/* Append all records */
-	local_rec.lrh_len -= sizeof(*rec) + sizeof(struct llog_rec_tail);
-	rc = llog_write(env, local_llh, &local_rec, NULL, 0,
-			(void *)cfg_buf, -1);
-
-	lcfg = (struct lustre_cfg *)cfg_buf;
-	CDEBUG(D_INFO, "idx=%d, rc=%d, len=%d, cmd %x %s %s\n",
-	       rec->lrh_index, rc, rec->lrh_len, lcfg->lcfg_command,
-	       lustre_cfg_string(lcfg, 0), lustre_cfg_string(lcfg, 1));
-
-	return rc;
-}
-EXPORT_SYMBOL(llog_copy_handler);
-
 static int llog_process_thread(void *arg)
 {
 	struct llog_process_info	*lpi = arg;
@@ -493,14 +468,6 @@ int llog_process(const struct lu_env *env, struct llog_handle *loghandle,
 }
 EXPORT_SYMBOL(llog_process);
 
-inline int llog_get_size(struct llog_handle *loghandle)
-{
-	if (loghandle && loghandle->lgh_hdr)
-		return loghandle->lgh_hdr->llh_count;
-	return 0;
-}
-EXPORT_SYMBOL(llog_get_size);
-
 int llog_reverse_process(const struct lu_env *env,
 			 struct llog_handle *loghandle, llog_cb_t cb,
 			 void *data, void *catdata)
@@ -767,8 +734,9 @@ int llog_open_create(const struct lu_env *env, struct llog_ctxt *ctxt,
 		     struct llog_handle **res, struct llog_logid *logid,
 		     char *name)
 {
-	struct thandle	*th;
-	int		 rc;
+	struct dt_device	*d;
+	struct thandle		*th;
+	int			 rc;
 
 	rc = llog_open(env, ctxt, res, logid, name, LLOG_OPEN_NEW);
 	if (rc)
@@ -777,27 +745,21 @@ int llog_open_create(const struct lu_env *env, struct llog_ctxt *ctxt,
 	if (llog_exist(*res))
 		return 0;
 
-	if ((*res)->lgh_obj != NULL) {
-		struct dt_device *d;
+	LASSERT((*res)->lgh_obj != NULL);
 
-		d = lu2dt_dev((*res)->lgh_obj->do_lu.lo_dev);
+	d = lu2dt_dev((*res)->lgh_obj->do_lu.lo_dev);
 
-		th = dt_trans_create(env, d);
-		if (IS_ERR(th))
-			GOTO(out, rc = PTR_ERR(th));
+	th = dt_trans_create(env, d);
+	if (IS_ERR(th))
+		GOTO(out, rc = PTR_ERR(th));
 
-		rc = llog_declare_create(env, *res, th);
-		if (rc == 0) {
-			rc = dt_trans_start_local(env, d, th);
-			if (rc == 0)
-				rc = llog_create(env, *res, th);
-		}
-		dt_trans_stop(env, d, th);
-	} else {
-		/* lvfs compat code */
-		LASSERT((*res)->lgh_file == NULL);
-		rc = llog_create(env, *res, NULL);
+	rc = llog_declare_create(env, *res, th);
+	if (rc == 0) {
+		rc = dt_trans_start_local(env, d, th);
+		if (rc == 0)
+			rc = llog_create(env, *res, th);
 	}
+	dt_trans_stop(env, d, th);
 out:
 	if (rc)
 		llog_close(env, *res);
@@ -842,41 +804,34 @@ int llog_write(const struct lu_env *env, struct llog_handle *loghandle,
 	       struct llog_rec_hdr *rec, struct llog_cookie *reccookie,
 	       int cookiecount, void *buf, int idx)
 {
-	int rc;
+	struct dt_device	*dt;
+	struct thandle		*th;
+	int			 rc;
 
 	LASSERT(loghandle);
 	LASSERT(loghandle->lgh_ctxt);
+	LASSERT(loghandle->lgh_obj != NULL);
 
-	if (loghandle->lgh_obj != NULL) {
-		struct dt_device	*dt;
-		struct thandle		*th;
-
-		dt = lu2dt_dev(loghandle->lgh_obj->do_lu.lo_dev);
+	dt = lu2dt_dev(loghandle->lgh_obj->do_lu.lo_dev);
 
-		th = dt_trans_create(env, dt);
-		if (IS_ERR(th))
-			return PTR_ERR(th);
+	th = dt_trans_create(env, dt);
+	if (IS_ERR(th))
+		return PTR_ERR(th);
 
-		rc = llog_declare_write_rec(env, loghandle, rec, idx, th);
-		if (rc)
-			GOTO(out_trans, rc);
+	rc = llog_declare_write_rec(env, loghandle, rec, idx, th);
+	if (rc)
+		GOTO(out_trans, rc);
 
-		rc = dt_trans_start_local(env, dt, th);
-		if (rc)
-			GOTO(out_trans, rc);
+	rc = dt_trans_start_local(env, dt, th);
+	if (rc)
+		GOTO(out_trans, rc);
 
-		down_write(&loghandle->lgh_lock);
-		rc = llog_write_rec(env, loghandle, rec, reccookie,
-				    cookiecount, buf, idx, th);
-		up_write(&loghandle->lgh_lock);
+	down_write(&loghandle->lgh_lock);
+	rc = llog_write_rec(env, loghandle, rec, reccookie,
+			    cookiecount, buf, idx, th);
+	up_write(&loghandle->lgh_lock);
 out_trans:
-		dt_trans_stop(env, dt, th);
-	} else { /* lvfs compatibility */
-		down_write(&loghandle->lgh_lock);
-		rc = llog_write_rec(env, loghandle, rec, reccookie,
-				    cookiecount, buf, idx, NULL);
-		up_write(&loghandle->lgh_lock);
-	}
+	dt_trans_stop(env, dt, th);
 	return rc;
 }
 EXPORT_SYMBOL(llog_write);
@@ -932,3 +887,104 @@ out:
 	return rc;
 }
 EXPORT_SYMBOL(llog_close);
+
+int llog_is_empty(const struct lu_env *env, struct llog_ctxt *ctxt,
+		  char *name)
+{
+	struct llog_handle	*llh;
+	int			 rc = 0;
+
+	rc = llog_open(env, ctxt, &llh, NULL, name, LLOG_OPEN_EXISTS);
+	if (rc < 0) {
+		if (likely(rc == -ENOENT))
+			rc = 0;
+		GOTO(out, rc);
+	}
+
+	rc = llog_init_handle(env, llh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_close, rc);
+	rc = llog_get_size(llh);
+
+out_close:
+	llog_close(env, llh);
+out:
+	/* header is record 1 */
+	return (rc <= 1);
+}
+EXPORT_SYMBOL(llog_is_empty);
+
+int llog_copy_handler(const struct lu_env *env, struct llog_handle *llh,
+		      struct llog_rec_hdr *rec, void *data)
+{
+	struct llog_handle	*copy_llh = data;
+
+	/* Append all records */
+	return llog_write(env, copy_llh, rec, NULL, 0, NULL, -1);
+}
+EXPORT_SYMBOL(llog_copy_handler);
+
+/* backup plain llog */
+int llog_backup(const struct lu_env *env, struct obd_device *obd,
+		struct llog_ctxt *ctxt, struct llog_ctxt *bctxt,
+		char *name, char *backup)
+{
+	struct llog_handle	*llh, *bllh;
+	int			 rc;
+
+
+
+	/* open original log */
+	rc = llog_open(env, ctxt, &llh, NULL, name, LLOG_OPEN_EXISTS);
+	if (rc < 0) {
+		/* the -ENOENT case is also reported to the caller
+		 * but silently so it should handle that if needed.
+		 */
+		if (rc != -ENOENT)
+			CERROR("%s: failed to open log %s: rc = %d\n",
+			       obd->obd_name, name, rc);
+		return rc;
+	}
+
+	rc = llog_init_handle(env, llh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_close, rc);
+
+	/* Make sure there's no old backup log */
+	rc = llog_erase(env, bctxt, NULL, backup);
+	if (rc < 0 && rc != -ENOENT)
+		GOTO(out_close, rc);
+
+	/* open backup log */
+	rc = llog_open_create(env, bctxt, &bllh, NULL, backup);
+	if (rc) {
+		CERROR("%s: failed to open backup logfile %s: rc = %d\n",
+		       obd->obd_name, backup, rc);
+		GOTO(out_close, rc);
+	}
+
+	/* check that backup llog is not the same object as original one */
+	if (llh->lgh_obj == bllh->lgh_obj) {
+		CERROR("%s: backup llog %s to itself (%s), objects %p/%p\n",
+		       obd->obd_name, name, backup, llh->lgh_obj,
+		       bllh->lgh_obj);
+		GOTO(out_backup, rc = -EEXIST);
+	}
+
+	rc = llog_init_handle(env, bllh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_backup, rc);
+
+	/* Copy log record by record */
+	rc = llog_process_or_fork(env, llh, llog_copy_handler, (void *)bllh,
+				  NULL, false);
+	if (rc)
+		CERROR("%s: failed to backup log %s: rc = %d\n",
+		       obd->obd_name, name, rc);
+out_backup:
+	llog_close(env, bllh);
+out_close:
+	llog_close(env, llh);
+	return rc;
+}
+EXPORT_SYMBOL(llog_backup);
diff --git a/drivers/staging/lustre/lustre/obdclass/local_storage.c b/drivers/staging/lustre/lustre/obdclass/local_storage.c
index 51ab7f4..e79e4be 100644
--- a/drivers/staging/lustre/lustre/obdclass/local_storage.c
+++ b/drivers/staging/lustre/lustre/obdclass/local_storage.c
@@ -855,9 +855,12 @@ out_los:
 		(*los)->los_seq = fid_seq(first_fid);
 		(*los)->los_last_oid = le64_to_cpu(lastid);
 		(*los)->los_obj = o;
-		/* read value should not be less than initial one */
-		LASSERTF((*los)->los_last_oid >= first_oid, "%u < %u\n",
-			 (*los)->los_last_oid, first_oid);
+		/* Read value should not be less than initial one
+		 * but possible after upgrade from older fs.
+		 * In this case just switch to the first_oid in memory and
+		 * it will be updated on disk with first object generated */
+		if ((*los)->los_last_oid < first_oid)
+			(*los)->los_last_oid = first_oid;
 	}
 out:
 	mutex_unlock(&ls->ls_los_mutex);
diff --git a/drivers/staging/lustre/lustre/obdclass/local_storage.h b/drivers/staging/lustre/lustre/obdclass/local_storage.h
index d553c37..0f63b8c 100644
--- a/drivers/staging/lustre/lustre/obdclass/local_storage.h
+++ b/drivers/staging/lustre/lustre/obdclass/local_storage.h
@@ -29,6 +29,8 @@
  *
  * Author: Mikhail Pershin <mike.pershin@intel.com>
  */
+#ifndef __LOCAL_STORAGE_H
+#define __LOCAL_STORAGE_H
 
 #include <dt_object.h>
 #include <obd.h>
@@ -86,3 +88,4 @@ struct los_ondisk {
 };
 
 #define LOS_MAGIC	0xdecafbee
+#endif
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index d61b0db..00c7e9c 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -423,7 +423,7 @@ LU_KEY_INIT_FINI(lu_global, struct lu_cdebug_data);
  */
 struct lu_context_key lu_global_key = {
 	.lct_tags = LCT_MD_THREAD | LCT_DT_THREAD |
-		    LCT_MG_THREAD | LCT_CL_THREAD,
+		    LCT_MG_THREAD | LCT_CL_THREAD | LCT_LOCAL,
 	.lct_init = lu_global_key_init,
 	.lct_fini = lu_global_key_fini
 };
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index 68a4d6a..a69a630 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -631,6 +631,9 @@ int lustre_put_lsi(struct super_block *sb)
 	CDEBUG(D_MOUNT, "put %p %d\n", sb, atomic_read(&lsi->lsi_mounts));
 	if (atomic_dec_and_test(&lsi->lsi_mounts)) {
 		if (IS_SERVER(lsi) && lsi->lsi_osd_exp) {
+			lu_device_put(&lsi->lsi_dt_dev->dd_lu_dev);
+			lsi->lsi_osd_exp->exp_obd->obd_lvfs_ctxt.dt = NULL;
+			lsi->lsi_dt_dev = NULL;
 			obd_disconnect(lsi->lsi_osd_exp);
 			/* wait till OSD is gone */
 			obd_zombie_barrier();
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 08/16] staging/lustre/mdt: HSM coordinator agent interface
  2013-11-26  3:30   ` Greg Kroah-Hartman
@ 2013-11-26  4:09     ` Peng Tao
  0 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26  4:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, JC Lafoucriere, Andreas Dilger

On 11/26/2013 11:30 AM, Greg Kroah-Hartman wrote:
> On Tue, Nov 26, 2013 at 10:05:02AM +0800, Peng Tao wrote:
>> From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
>>
>> To move data with external storage, HSM coordinator
>> uses a Copy Tool running on a client named agent.
>> This patch implements the interface for these agents.
>
> Interesting text here...
>
>> Lustre-change: http://review.whamcloud.com/6534
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3342
>> Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
>> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
>> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
>> Reviewed-by: John L. Hammond <john.hammond@intel.com>
>> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
>> Signed-off-by: Peng Tao <bergwolf@gmail.com>
>> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
>> ---
>>   .../lustre/lustre/include/lustre/lustre_user.h     |    6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
>> index 9436166..631f026 100644
>> --- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
>> +++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
>> @@ -428,8 +428,8 @@ struct obd_uuid {
>>   	char uuid[UUID_MAX];
>>   };
>>
>> -static inline int obd_uuid_equals(const struct obd_uuid *u1,
>> -				  const struct obd_uuid *u2)
>> +static inline bool obd_uuid_equals(const struct obd_uuid *u1,
>> +				   const struct obd_uuid *u2)
>>   {
>>   	return strcmp((char *)u1->uuid, (char *)u2->uuid) == 0;
>>   }
>> @@ -446,7 +446,7 @@ static inline void obd_str2uuid(struct obd_uuid *uuid, const char *tmp)
>>   }
>>
>>   /* For printf's only, make sure uuid is terminated */
>> -static inline char *obd_uuid2str(struct obd_uuid *uuid)
>> +static inline char *obd_uuid2str(const struct obd_uuid *uuid)
>>   {
>>   	if (uuid->uuid[sizeof(*uuid) - 1] != '\0') {
>>   		/* Obviously not safe, but for printfs, no real harm done...
>
> Too bad it doesn't describe the changes made in the code at all.
>
> How can so many people review a patch that is not the same as what it
> says it really is?
>
The main part of the original commit is dropped as it only touches 
server code. I should have modified the commit message to say what 
exactly the patch does rather than just copying the original commit 
message. I'll fix up. Sorry.

> I'm stopping here in the series, sorry.  Please fix up and resend the
> rest when you can.
>
Will do. Thanks for reviewing!

Thanks,
Tao



^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT
  2013-11-26  2:04 ` [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
@ 2013-11-26  6:45   ` Patrick Farrell
  2013-11-26 14:09     ` Peng Tao
  0 siblings, 1 reply; 25+ messages in thread
From: Patrick Farrell @ 2013-11-26  6:45 UTC (permalink / raw)
  To: Peng Tao, Greg Kroah-Hartman; +Cc: linux-kernel, Cheng Shao, Andreas Dilger

Peng,

I'm sorry to say, this patch was reverted due to interoperability problems with 2.1 servers (This either a slightly later [but still broken] version of the patch we discussed recently, or the same one.).  If Greg's accepted this upstream (and it looks like he has), it'll have to be reverted.

Sorry for the trouble here!

- Patrick
________________________________________
From: Peng Tao [bergwolf@gmail.com]
Sent: Monday, November 25, 2013 8:04 PM
To: Greg Kroah-Hartman
Cc: linux-kernel@vger.kernel.org; Patrick Farrell; Cheng Shao; Peng Tao; Andreas Dilger
Subject: [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT

From: Patrick Farrell <paf@cray.com>

This happend with SLES11SP2 Lustre client, which in turn acts as an
NFS server, exporting a subtree of an Lustre fs through NFS.

We detected that whenever we are writing to a new file using, fx,
'echo blah > newfile', it will return ENOENT error. We found
out that this was caused by the anonymous dentry. In SLESS11SP2,
anonymous dentries are assigned '/' as the name, instead of an
empty string. When MDT handles the intent_open call, it will look
up the obj by the name if it is not an empty string, and thus
couldn't find it.

As MDS_OPEN_BY_FID is always set on this request, we never need
to send the name in this request.  The fid is already available
and should be used in case the file has been renamed.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
Lustre-change: http://review.whamcloud.com/6920
Signed-off-by: Cheng Shao <cheng_shao@xyratex.com>
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alexey Shvetsov <alexxy@gentoo.org>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/file.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 82248e9..f36c5d8 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -368,8 +368,6 @@ static int ll_intent_file_open(struct file *file, void *lmm,
 {
        struct ll_sb_info *sbi = ll_i2sbi(file->f_dentry->d_inode);
        struct dentry *parent = file->f_dentry->d_parent;
-       const char *name = file->f_dentry->d_name.name;
-       const int len = file->f_dentry->d_name.len;
        struct md_op_data *op_data;
        struct ptlrpc_request *req;
        __u32 opc = LUSTRE_OPC_ANY;
@@ -394,8 +392,9 @@ static int ll_intent_file_open(struct file *file, void *lmm,
        }

        op_data  = ll_prep_md_op_data(NULL, parent->d_inode,
-                                     file->f_dentry->d_inode, name, len,
+                                     file->f_dentry->d_inode, NULL, 0,
                                      O_RDWR, opc, NULL);
+
        if (IS_ERR(op_data))
                return PTR_ERR(op_data);

--
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT
  2013-11-26  6:45   ` Patrick Farrell
@ 2013-11-26 14:09     ` Peng Tao
  0 siblings, 0 replies; 25+ messages in thread
From: Peng Tao @ 2013-11-26 14:09 UTC (permalink / raw)
  To: Patrick Farrell
  Cc: Greg Kroah-Hartman, linux-kernel, Cheng Shao, Andreas Dilger

On Tue, Nov 26, 2013 at 2:45 PM, Patrick Farrell <paf@cray.com> wrote:
> Peng,
>
> I'm sorry to say, this patch was reverted due to interoperability problems with 2.1 servers (This either a slightly later [but still broken] version of the patch we discussed recently, or the same one.).  If Greg's accepted this upstream (and it looks like he has), it'll have to be reverted.
>
Sorry, my bad. I forgot to remove it from my patch queue after we
discussed last time. I'll send a reverting patch. Sorry for the noise.

Thanks,
Tao

> Sorry for the trouble here!
>
> - Patrick
> ________________________________________
> From: Peng Tao [bergwolf@gmail.com]
> Sent: Monday, November 25, 2013 8:04 PM
> To: Greg Kroah-Hartman
> Cc: linux-kernel@vger.kernel.org; Patrick Farrell; Cheng Shao; Peng Tao; Andreas Dilger
> Subject: [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT
>
> From: Patrick Farrell <paf@cray.com>
>
> This happend with SLES11SP2 Lustre client, which in turn acts as an
> NFS server, exporting a subtree of an Lustre fs through NFS.
>
> We detected that whenever we are writing to a new file using, fx,
> 'echo blah > newfile', it will return ENOENT error. We found
> out that this was caused by the anonymous dentry. In SLESS11SP2,
> anonymous dentries are assigned '/' as the name, instead of an
> empty string. When MDT handles the intent_open call, it will look
> up the obj by the name if it is not an empty string, and thus
> couldn't find it.
>
> As MDS_OPEN_BY_FID is always set on this request, we never need
> to send the name in this request.  The fid is already available
> and should be used in case the file has been renamed.
>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
> Lustre-change: http://review.whamcloud.com/6920
> Signed-off-by: Cheng Shao <cheng_shao@xyratex.com>
> Signed-off-by: Patrick Farrell <paf@cray.com>
> Reviewed-by: Bob Glossman <bob.glossman@intel.com>
> Reviewed-by: Alexey Shvetsov <alexxy@gentoo.org>
> Reviewed-by: Lai Siyao <lai.siyao@intel.com>
> Reviewed-by: James Simmons <uja.ornl@gmail.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: Peng Tao <bergwolf@gmail.com>
> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
> ---
>  drivers/staging/lustre/lustre/llite/file.c |    5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
> index 82248e9..f36c5d8 100644
> --- a/drivers/staging/lustre/lustre/llite/file.c
> +++ b/drivers/staging/lustre/lustre/llite/file.c
> @@ -368,8 +368,6 @@ static int ll_intent_file_open(struct file *file, void *lmm,
>  {
>         struct ll_sb_info *sbi = ll_i2sbi(file->f_dentry->d_inode);
>         struct dentry *parent = file->f_dentry->d_parent;
> -       const char *name = file->f_dentry->d_name.name;
> -       const int len = file->f_dentry->d_name.len;
>         struct md_op_data *op_data;
>         struct ptlrpc_request *req;
>         __u32 opc = LUSTRE_OPC_ANY;
> @@ -394,8 +392,9 @@ static int ll_intent_file_open(struct file *file, void *lmm,
>         }
>
>         op_data  = ll_prep_md_op_data(NULL, parent->d_inode,
> -                                     file->f_dentry->d_inode, name, len,
> +                                     file->f_dentry->d_inode, NULL, 0,
>                                       O_RDWR, opc, NULL);
> +
>         if (IS_ERR(op_data))
>                 return PTR_ERR(op_data);
>
> --
> 1.7.9.5
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-11-26 14:10 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-26  2:04 [PATCH 00/16] staging/lustre: sync with external tree, set 2 Peng Tao
2013-11-26  2:04 ` [PATCH 01/16] staging/lustre/server: use unified request handler for MGS Peng Tao
2013-11-26  2:04 ` [PATCH 02/16] staging/lustre/llog: MGC to use OSD API for backup logs Peng Tao
2013-11-26  3:14   ` Greg Kroah-Hartman
2013-11-26  3:25     ` Peng Tao
2013-11-26  3:34       ` Greg Kroah-Hartman
2013-11-26  4:05         ` Peng Tao
2013-11-26  2:04 ` [PATCH 03/16] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
2013-11-26  6:45   ` Patrick Farrell
2013-11-26 14:09     ` Peng Tao
2013-11-26  2:04 ` [PATCH 04/16] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation Peng Tao
2013-11-26  2:04 ` [PATCH 05/16] staging/lustre/mdc: prevent fall through in mdc_iocontrol() Peng Tao
2013-11-26  2:05 ` [PATCH 06/16] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64 Peng Tao
2013-11-26  2:05 ` [PATCH 07/16] staging/lustre/mdt: HSM coordinator client interface Peng Tao
2013-11-26  2:05 ` [PATCH 08/16] staging/lustre/mdt: HSM coordinator agent interface Peng Tao
2013-11-26  3:30   ` Greg Kroah-Hartman
2013-11-26  4:09     ` Peng Tao
2013-11-26  2:05 ` [PATCH 09/16] staging/lustre/scrub: OI scrub on OST Peng Tao
2013-11-26  2:05 ` [PATCH 10/16] staging/lustre/scrub: control OI scrub on OST from user space Peng Tao
2013-11-26  2:05 ` [PATCH 11/16] staging/lustre/llite: don't check for O_CREAT in it_create_mode Peng Tao
2013-11-26  2:05 ` [PATCH 12/16] staging/lustre/build: clean up unused variables and dead code Peng Tao
2013-11-26  2:05 ` [PATCH 13/16] staging/lustre/build: fix compilation issue with is_compat_task Peng Tao
2013-11-26  2:05 ` [PATCH 14/16] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer Peng Tao
2013-11-26  2:05 ` [PATCH 15/16] staging/lustre/hsm: Add hsm_release feature Peng Tao
2013-11-26  2:05 ` [PATCH] staging/lustre/llite: extended attribute cache Peng Tao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).