linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/40] staging/lustre: patch bomb 1
@ 2013-11-14 16:13 Peng Tao
  2013-11-14 16:13 ` [PATCH 01/40] staging/lustre/llite: restore ll_fiemap Peng Tao
                   ` (40 more replies)
  0 siblings, 41 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Peng Tao, Andreas Dilger

[sadly git-send-email died sending the patchset... so resending]

Hi Greg,

Here are 40 patches. Three cleanup patches and 37 patches ported
from Lustre master to commit
LU-2940 llite: Fix for oops in vvp_pgcache_show()

There are two more patchsets that I will send out following up.

Thanks,
Tao

Cc: Andreas Dilger <andreas.dilger@intel.com>

Amir Shehata (2):
  staging/lustre/lnet: Fix assert on empty group in selftest module
  staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer

Andreas Dilger (2):
  staging/lustre/ldlm: fix resource/fid check, use DLDLMRES
  staging/lustre/idl: remove LASSERT/CLASSERT from lustre_idl.h

Andrew Perepechko (1):
  staging/lustre/llite: extended attribute cache

Andriy Skulysh (2):
  staging/lustre/ptlrpc: Fix race during exp_flock_hash creation
  staging/lustre/ptlrpc: flock deadlock detection does not work

Artem Blagodarenko (1):
  staging/lustre/mgs: set_param -P option that sets value permanently

Dmitry Eremin (2):
  staging/lustre/build: clean up unused variables and dead code
  staging/lustre/build: fix compilation issue with is_compat_task

Doug Oucharek (1):
  staging/lustre/lnet: Add LNet Router Priority parameter

Fan Yong (2):
  staging/lustre/scrub: OI scrub on OST
  staging/lustre/scrub: control OI scrub on OST from user space

JC Lafoucriere (5):
  staging/lustre/llite: Access to released file trigs a restore
  staging/lustre/mdt: HSM coordinator client interface
  staging/lustre/mdt: HSM coordinator agent interface
  staging/lustre/api: HSM import uses new released pattern
  staging/lustre/utils: HSM Posix CopyTool

James Simmons (2):
  staging/lustre/autoconf: remove vectored fops tests
  staging/lustre/autoconf: remove LIBCFS_HAVE_IS_COMPAT_TASK test

Jinshan Xiong (2):
  staging/lustre/hsm: Implementation of exclusive open
  staging/lustre/hsm: Add hsm_release feature.

John L. Hammond (7):
  staging/lustre: validate open handle cookies
  staging/lustre/llite: use correct FID in ll_och_fill()
  staging/lustre/lov: convert magic to host-endian in lov_dump_lmm()
  staging/lustre/mdc: prevent fall through in mdc_iocontrol()
  staging/lustre/lu: shrink lu_object by 8 bytes on x86_64
  staging/lustre/llite: don't check for O_CREAT in it_create_mode
  staging/lustre/llite: pass correct pointer to obd_iocontrol()

Lai Siyao (1):
  staging/lustre/llite: remove ll_d_root_ops

Li Xi (1):
  staging/lustre/llog: fix return value of llog_alloc_handle

Mikhail Pershin (4):
  staging/lustre/server: use unified request handler for MGS
  staging/lustre/llog: MGC to use OSD API for backup logs
  staging/lustre/target: move OUT to the unified target code
  staging/lustre/seq: unified SEQ handler

Nathaniel Clark (1):
  staging/lustre/xattr: separate ACL and XATTR caches

Patrick Farrell (1):
  staging/lustre/nfs: writing to new files will return ENOENT

Peng Tao (3):
  staging/lustre/llite: restore ll_fiemap
  staging/lustre: remove lu_target.h
  staging/lustre: remove llog_server.c

 .../staging/lustre/include/linux/libcfs/curproc.h  |    1 -
 .../staging/lustre/include/linux/libcfs/libcfs.h   |    2 -
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |    1 +
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    5 +-
 .../staging/lustre/include/linux/lnet/lib-types.h  |    2 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    5 +-
 drivers/staging/lustre/lnet/lnet/config.c          |   39 +-
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    6 +
 drivers/staging/lustre/lnet/lnet/router.c          |   19 +-
 drivers/staging/lustre/lnet/lnet/router_proc.c     |   16 +-
 drivers/staging/lustre/lnet/selftest/conctl.c      |   56 +-
 drivers/staging/lustre/lnet/selftest/conrpc.c      |    2 +-
 drivers/staging/lustre/lnet/selftest/console.c     |  105 ++--
 drivers/staging/lustre/lnet/selftest/console.h     |    6 +-
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 -
 drivers/staging/lustre/lnet/selftest/selftest.h    |    3 -
 drivers/staging/lustre/lnet/selftest/timer.c       |    6 +-
 drivers/staging/lustre/lustre/include/cl_object.h  |    6 +-
 drivers/staging/lustre/lustre/include/dt_object.h  |    2 +-
 .../lustre/lustre/include/linux/lustre_compat25.h  |    4 +-
 .../lustre/lustre/include/linux/lustre_intent.h    |    2 +-
 .../lustre/lustre/include/linux/lustre_lite.h      |    1 +
 drivers/staging/lustre/lustre/include/lu_object.h  |   19 -
 drivers/staging/lustre/lustre/include/lu_target.h  |   91 ---
 .../lustre/lustre/include/lustre/lustre_idl.h      |  111 ++--
 .../lustre/lustre/include/lustre/lustre_user.h     |   63 +-
 .../lustre/lustre/include/lustre/lustreapi.h       |  208 ++++---
 drivers/staging/lustre/lustre/include/lustre_cfg.h |    2 +
 .../staging/lustre/lustre/include/lustre_disk.h    |    2 +
 .../lustre/lustre/include/lustre_dlm_flags.h       |   24 +-
 drivers/staging/lustre/lustre/include/lustre_fid.h |    6 -
 drivers/staging/lustre/lustre/include/lustre_ha.h  |    3 -
 .../staging/lustre/lustre/include/lustre_handles.h |    5 +-
 drivers/staging/lustre/lustre/include/lustre_lib.h |   11 +-
 drivers/staging/lustre/lustre/include/lustre_log.h |   13 +-
 drivers/staging/lustre/lustre/include/lustre_net.h |   19 +-
 .../lustre/lustre/include/lustre_req_layout.h      |    7 +
 drivers/staging/lustre/lustre/include/md_object.h  |    4 +-
 drivers/staging/lustre/lustre/include/obd.h        |   15 +-
 drivers/staging/lustre/lustre/include/obd_class.h  |    9 +-
 .../staging/lustre/lustre/include/obd_support.h    |   10 +
 drivers/staging/lustre/lustre/lclient/lcommon_cl.c |    6 +
 .../staging/lustre/lustre/lclient/lcommon_misc.c   |    4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   69 ++-
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    7 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |   47 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |   39 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |   19 +-
 .../lustre/lustre/libcfs/linux/linux-curproc.c     |   13 -
 drivers/staging/lustre/lustre/libcfs/workitem.c    |   55 +-
 drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
 drivers/staging/lustre/lustre/llite/dcache.c       |   77 +--
 drivers/staging/lustre/lustre/llite/dir.c          |   47 +-
 drivers/staging/lustre/lustre/llite/file.c         |  622 ++++++++++++++++++--
 .../staging/lustre/lustre/llite/llite_internal.h   |   59 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   91 ++-
 drivers/staging/lustre/lustre/llite/llite_nfs.c    |    6 +-
 drivers/staging/lustre/lustre/llite/lproc_llite.c  |   36 ++
 drivers/staging/lustre/lustre/llite/namei.c        |   53 +-
 drivers/staging/lustre/lustre/llite/statahead.c    |    7 +-
 drivers/staging/lustre/lustre/llite/super25.c      |    4 +
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   54 +-
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    2 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |  105 ++--
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  559 ++++++++++++++++++
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |   16 +
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    4 +-
 drivers/staging/lustre/lustre/lov/lov_io.c         |   15 +-
 drivers/staging/lustre/lustre/lov/lov_object.c     |   35 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c       |   20 +-
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    5 +-
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   31 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |  104 +++-
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |    2 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |  116 ++--
 drivers/staging/lustre/lustre/mgc/libmgc.c         |    3 -
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |  492 ++++++++++------
 drivers/staging/lustre/lustre/obdclass/genops.c    |    2 +-
 drivers/staging/lustre/lustre/obdclass/llog.c      |  214 ++++---
 .../staging/lustre/lustre/obdclass/local_storage.c |    9 +-
 .../staging/lustre/lustre/obdclass/local_storage.h |    3 +
 .../lustre/lustre/obdclass/lprocfs_status.c        |    5 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   26 +-
 .../lustre/lustre/obdclass/lustre_handles.c        |    4 +-
 drivers/staging/lustre/lustre/obdclass/md_attrs.c  |    4 +-
 .../staging/lustre/lustre/obdclass/obd_config.c    |   50 +-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    3 +
 .../staging/lustre/lustre/obdecho/echo_client.c    |    2 +-
 drivers/staging/lustre/lustre/ptlrpc/Makefile      |    2 +-
 drivers/staging/lustre/lustre/ptlrpc/client.c      |   10 +-
 drivers/staging/lustre/lustre/ptlrpc/import.c      |    2 -
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   63 +-
 drivers/staging/lustre/lustre/ptlrpc/llog_server.c |  450 --------------
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |   16 +-
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   59 +-
 drivers/staging/lustre/lustre/ptlrpc/pinger.c      |   65 --
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c     |   44 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   60 +-
 99 files changed, 3043 insertions(+), 1793 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/include/lu_target.h
 create mode 100644 drivers/staging/lustre/lustre/llite/xattr_cache.c
 delete mode 100644 drivers/staging/lustre/lustre/ptlrpc/llog_server.c

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 01/40] staging/lustre/llite: restore ll_fiemap
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 02/40] staging/lustre: remove lu_target.h Peng Tao
                   ` (39 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Peng Tao, Peng Tao, Andreas Dilger

From: Peng Tao <tao.peng@emc.com>

It was removed by coan by mistake when first porting the code.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/file.c |   33 ++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index fb85a58..c5b721c 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2628,6 +2628,38 @@ int ll_getattr(struct vfsmount *mnt, struct dentry *de, struct kstat *stat)
 	return ll_getattr_it(mnt, de, &it, stat);
 }
 
+int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+		__u64 start, __u64 len)
+{
+	int rc;
+	size_t num_bytes;
+	struct ll_user_fiemap *fiemap;
+	unsigned int extent_count = fieinfo->fi_extents_max;
+
+	num_bytes = sizeof(*fiemap) + (extent_count *
+				       sizeof(struct ll_fiemap_extent));
+	OBD_ALLOC_LARGE(fiemap, num_bytes);
+
+	if (fiemap == NULL)
+		return -ENOMEM;
+
+	fiemap->fm_flags = fieinfo->fi_flags;
+	fiemap->fm_extent_count = fieinfo->fi_extents_max;
+	fiemap->fm_start = start;
+	fiemap->fm_length = len;
+	memcpy(&fiemap->fm_extents[0], fieinfo->fi_extents_start,
+	       sizeof(struct ll_fiemap_extent));
+
+	rc = ll_do_fiemap(inode, fiemap, num_bytes);
+
+	fieinfo->fi_flags = fiemap->fm_flags;
+	fieinfo->fi_extents_mapped = fiemap->fm_mapped_extents;
+	memcpy(fieinfo->fi_extents_start, &fiemap->fm_extents[0],
+	       fiemap->fm_mapped_extents * sizeof(struct ll_fiemap_extent));
+
+	OBD_FREE_LARGE(fiemap, num_bytes);
+	return rc;
+}
 
 struct posix_acl * ll_get_acl(struct inode *inode, int type)
 {
@@ -2740,6 +2772,7 @@ struct inode_operations ll_file_inode_operations = {
 	.getxattr	= ll_getxattr,
 	.listxattr	= ll_listxattr,
 	.removexattr	= ll_removexattr,
+	.fiemap		= ll_fiemap,
 	.get_acl	= ll_get_acl,
 };
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 02/40] staging/lustre: remove lu_target.h
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
  2013-11-14 16:13 ` [PATCH 01/40] staging/lustre/llite: restore ll_fiemap Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 03/40] staging/lustre: remove llog_server.c Peng Tao
                   ` (38 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Peng Tao, Peng Tao, Andreas Dilger

From: Peng Tao <tao.peng@emc.com>

It is only needed by server code.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/lu_target.h |   91 ---------------------
 1 file changed, 91 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/include/lu_target.h

diff --git a/drivers/staging/lustre/lustre/include/lu_target.h b/drivers/staging/lustre/lustre/include/lu_target.h
deleted file mode 100644
index 8d48cf4..0000000
--- a/drivers/staging/lustre/lustre/include/lu_target.h
+++ /dev/null
@@ -1,91 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.sun.com/software/products/lustre/docs/GPLv2.pdf
- *
- * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
- * CA 95054 USA or visit www.sun.com if you need additional information or
- * have any questions.
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2009, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2011, 2012, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- */
-
-#ifndef _LUSTRE_LU_TARGET_H
-#define _LUSTRE_LU_TARGET_H
-
-#include <dt_object.h>
-#include <lustre_disk.h>
-
-struct lu_target {
-	struct obd_device       *lut_obd;
-	struct dt_device	*lut_bottom;
-	/** last_rcvd file */
-	struct dt_object	*lut_last_rcvd;
-	/* transaction callbacks */
-	struct dt_txn_callback   lut_txn_cb;
-	/** server data in last_rcvd file */
-	struct lr_server_data    lut_lsd;
-	/** Server last transaction number */
-	__u64		    lut_last_transno;
-	/** Lock protecting last transaction number */
-	spinlock_t		 lut_translock;
-	/** Lock protecting client bitmap */
-	spinlock_t		 lut_client_bitmap_lock;
-	/** Bitmap of known clients */
-	unsigned long	   *lut_client_bitmap;
-};
-
-typedef void (*tgt_cb_t)(struct lu_target *lut, __u64 transno,
-			 void *data, int err);
-struct tgt_commit_cb {
-	tgt_cb_t  tgt_cb_func;
-	void     *tgt_cb_data;
-};
-
-void tgt_boot_epoch_update(struct lu_target *lut);
-int tgt_last_commit_cb_add(struct thandle *th, struct lu_target *lut,
-			   struct obd_export *exp, __u64 transno);
-int tgt_new_client_cb_add(struct thandle *th, struct obd_export *exp);
-int tgt_init(const struct lu_env *env, struct lu_target *lut,
-	     struct obd_device *obd, struct dt_device *dt);
-void tgt_fini(const struct lu_env *env, struct lu_target *lut);
-int tgt_client_alloc(struct obd_export *exp);
-void tgt_client_free(struct obd_export *exp);
-int tgt_client_del(const struct lu_env *env, struct obd_export *exp);
-int tgt_client_add(const struct lu_env *env, struct obd_export *exp, int);
-int tgt_client_new(const struct lu_env *env, struct obd_export *exp);
-int tgt_client_data_read(const struct lu_env *env, struct lu_target *tg,
-			 struct lsd_client_data *lcd, loff_t *off, int index);
-int tgt_client_data_write(const struct lu_env *env, struct lu_target *tg,
-			  struct lsd_client_data *lcd, loff_t *off, struct thandle *th);
-int tgt_server_data_read(const struct lu_env *env, struct lu_target *tg);
-int tgt_server_data_write(const struct lu_env *env, struct lu_target *tg,
-			  struct thandle *th);
-int tgt_server_data_update(const struct lu_env *env, struct lu_target *tg, int sync);
-int tgt_truncate_last_rcvd(const struct lu_env *env, struct lu_target *tg, loff_t off);
-
-#endif /* __LUSTRE_LU_TARGET_H */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 03/40] staging/lustre: remove llog_server.c
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
  2013-11-14 16:13 ` [PATCH 01/40] staging/lustre/llite: restore ll_fiemap Peng Tao
  2013-11-14 16:13 ` [PATCH 02/40] staging/lustre: remove lu_target.h Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore Peng Tao
                   ` (37 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Peng Tao, Peng Tao, Andreas Dilger

From: Peng Tao <tao.peng@emc.com>

It is only used by server.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/lustre_net.h |    9 -
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |   39 --
 drivers/staging/lustre/lustre/ptlrpc/Makefile      |    2 +-
 drivers/staging/lustre/lustre/ptlrpc/llog_server.c |  450 --------------------
 4 files changed, 1 insertion(+), 499 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/ptlrpc/llog_server.c

diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 72edf01..91f28e3 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -3470,15 +3470,6 @@ static inline void ptlrpc_lprocfs_brw(struct ptlrpc_request *req, int bytes) {}
 #endif
 /** @} */
 
-/* ptlrpc/llog_server.c */
-int llog_origin_handle_open(struct ptlrpc_request *req);
-int llog_origin_handle_destroy(struct ptlrpc_request *req);
-int llog_origin_handle_prev_block(struct ptlrpc_request *req);
-int llog_origin_handle_next_block(struct ptlrpc_request *req);
-int llog_origin_handle_read_header(struct ptlrpc_request *req);
-int llog_origin_handle_close(struct ptlrpc_request *req);
-int llog_origin_handle_cancel(struct ptlrpc_request *req);
-
 /* ptlrpc/llog_client.c */
 extern struct llog_operations llog_client_ops;
 
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
index fde9bcd..40c58b7 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
@@ -597,45 +597,6 @@ static int ldlm_callback_handler(struct ptlrpc_request *req)
 		rc = ldlm_handle_setinfo(req);
 		ldlm_callback_reply(req, rc);
 		return 0;
-	case OBD_LOG_CANCEL: /* remove this eventually - for 1.4.0 compat */
-		CERROR("shouldn't be handling OBD_LOG_CANCEL on DLM thread\n");
-		req_capsule_set(&req->rq_pill, &RQF_LOG_CANCEL);
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_LOG_CANCEL_NET))
-			return 0;
-		rc = llog_origin_handle_cancel(req);
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_LOG_CANCEL_REP))
-			return 0;
-		ldlm_callback_reply(req, rc);
-		return 0;
-	case LLOG_ORIGIN_HANDLE_CREATE:
-		req_capsule_set(&req->rq_pill, &RQF_LLOG_ORIGIN_HANDLE_CREATE);
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_LOGD_NET))
-			return 0;
-		rc = llog_origin_handle_open(req);
-		ldlm_callback_reply(req, rc);
-		return 0;
-	case LLOG_ORIGIN_HANDLE_NEXT_BLOCK:
-		req_capsule_set(&req->rq_pill,
-				&RQF_LLOG_ORIGIN_HANDLE_NEXT_BLOCK);
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_LOGD_NET))
-			return 0;
-		rc = llog_origin_handle_next_block(req);
-		ldlm_callback_reply(req, rc);
-		return 0;
-	case LLOG_ORIGIN_HANDLE_READ_HEADER:
-		req_capsule_set(&req->rq_pill,
-				&RQF_LLOG_ORIGIN_HANDLE_READ_HEADER);
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_LOGD_NET))
-			return 0;
-		rc = llog_origin_handle_read_header(req);
-		ldlm_callback_reply(req, rc);
-		return 0;
-	case LLOG_ORIGIN_HANDLE_CLOSE:
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_LOGD_NET))
-			return 0;
-		rc = llog_origin_handle_close(req);
-		ldlm_callback_reply(req, rc);
-		return 0;
 	case OBD_QC_CALLBACK:
 		req_capsule_set(&req->rq_pill, &RQF_QC_CALLBACK);
 		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_QC_CALLBACK_NET))
diff --git a/drivers/staging/lustre/lustre/ptlrpc/Makefile b/drivers/staging/lustre/lustre/ptlrpc/Makefile
index 6d78b80..2ec0c24 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/Makefile
+++ b/drivers/staging/lustre/lustre/ptlrpc/Makefile
@@ -10,7 +10,7 @@ ldlm_objs += $(LDLM)ldlm_pool.o
 ldlm_objs += $(LDLM)interval_tree.o
 ptlrpc_objs := client.o recover.o connection.o niobuf.o pack_generic.o
 ptlrpc_objs += events.o ptlrpc_module.o service.o pinger.o
-ptlrpc_objs += llog_net.o llog_client.o llog_server.o import.o ptlrpcd.o
+ptlrpc_objs += llog_net.o llog_client.o import.o ptlrpcd.o
 ptlrpc_objs += pers.o lproc_ptlrpc.o wiretest.o layout.o
 ptlrpc_objs += sec.o sec_bulk.o sec_gc.o sec_config.o sec_lproc.o
 ptlrpc_objs += sec_null.o sec_plain.o nrs.o nrs_fifo.o
diff --git a/drivers/staging/lustre/lustre/ptlrpc/llog_server.c b/drivers/staging/lustre/lustre/ptlrpc/llog_server.c
deleted file mode 100644
index af9d2ac..0000000
--- a/drivers/staging/lustre/lustre/ptlrpc/llog_server.c
+++ /dev/null
@@ -1,450 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.sun.com/software/products/lustre/docs/GPLv2.pdf
- *
- * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
- * CA 95054 USA or visit www.sun.com if you need additional information or
- * have any questions.
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2011, 2012, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- *
- * lustre/ptlrpc/llog_server.c
- *
- * remote api for llog - server side
- *
- * Author: Andreas Dilger <adilger@clusterfs.com>
- */
-
-#define DEBUG_SUBSYSTEM S_LOG
-
-
-#include <obd_class.h>
-#include <lustre_log.h>
-#include <lustre_net.h>
-#include <lustre_fsfilt.h>
-
-#if  defined(LUSTRE_LOG_SERVER)
-static int llog_origin_close(const struct lu_env *env, struct llog_handle *lgh)
-{
-	if (lgh->lgh_hdr != NULL && lgh->lgh_hdr->llh_flags & LLOG_F_IS_CAT)
-		return llog_cat_close(env, lgh);
-	else
-		return llog_close(env, lgh);
-}
-
-/* Only open is supported, no new llog can be created remotely */
-int llog_origin_handle_open(struct ptlrpc_request *req)
-{
-	struct obd_export	*exp = req->rq_export;
-	struct obd_device	*obd = exp->exp_obd;
-	struct obd_device	*disk_obd;
-	struct lvfs_run_ctxt	 saved;
-	struct llog_handle	*loghandle;
-	struct llogd_body	*body;
-	struct llog_logid	*logid = NULL;
-	struct llog_ctxt	*ctxt;
-	char			*name = NULL;
-	int			 rc;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	if (body == NULL)
-		return -EFAULT;
-
-	if (ostid_id(&body->lgd_logid.lgl_oi) > 0)
-		logid = &body->lgd_logid;
-
-	if (req_capsule_field_present(&req->rq_pill, &RMF_NAME, RCL_CLIENT)) {
-		name = req_capsule_client_get(&req->rq_pill, &RMF_NAME);
-		if (name == NULL)
-			return -EFAULT;
-		CDEBUG(D_INFO, "%s: opening log %s\n", obd->obd_name, name);
-	}
-
-	ctxt = llog_get_context(obd, body->lgd_ctxt_idx);
-	if (ctxt == NULL) {
-		CDEBUG(D_WARNING, "%s: no ctxt. group=%p idx=%d name=%s\n",
-		       obd->obd_name, &obd->obd_olg, body->lgd_ctxt_idx, name);
-		return -ENODEV;
-	}
-	disk_obd = ctxt->loc_exp->exp_obd;
-	push_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-
-	rc = llog_open(req->rq_svc_thread->t_env, ctxt, &loghandle, logid,
-		       name, LLOG_OPEN_EXISTS);
-	if (rc)
-		GOTO(out_pop, rc);
-
-	rc = req_capsule_server_pack(&req->rq_pill);
-	if (rc)
-		GOTO(out_close, rc = -ENOMEM);
-
-	body = req_capsule_server_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	body->lgd_logid = loghandle->lgh_id;
-
-out_close:
-	llog_origin_close(req->rq_svc_thread->t_env, loghandle);
-out_pop:
-	pop_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-	llog_ctxt_put(ctxt);
-	return rc;
-}
-EXPORT_SYMBOL(llog_origin_handle_open);
-
-int llog_origin_handle_destroy(struct ptlrpc_request *req)
-{
-	struct obd_device	*disk_obd;
-	struct lvfs_run_ctxt	 saved;
-	struct llogd_body	*body;
-	struct llog_logid	*logid = NULL;
-	struct llog_ctxt	*ctxt;
-	int			 rc;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	if (body == NULL)
-		return -EFAULT;
-
-	if (ostid_id(&body->lgd_logid.lgl_oi) > 0)
-		logid = &body->lgd_logid;
-
-	if (!(body->lgd_llh_flags & LLOG_F_IS_PLAIN))
-		CERROR("%s: wrong llog flags %x\n",
-		       req->rq_export->exp_obd->obd_name, body->lgd_llh_flags);
-
-	ctxt = llog_get_context(req->rq_export->exp_obd, body->lgd_ctxt_idx);
-	if (ctxt == NULL)
-		return -ENODEV;
-
-	disk_obd = ctxt->loc_exp->exp_obd;
-	push_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-
-	rc = req_capsule_server_pack(&req->rq_pill);
-	/* erase only if no error and logid is valid */
-	if (rc == 0)
-		rc = llog_erase(req->rq_svc_thread->t_env, ctxt, logid, NULL);
-	pop_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-	llog_ctxt_put(ctxt);
-	return rc;
-}
-EXPORT_SYMBOL(llog_origin_handle_destroy);
-
-int llog_origin_handle_next_block(struct ptlrpc_request *req)
-{
-	struct obd_device   *disk_obd;
-	struct llog_handle  *loghandle;
-	struct llogd_body   *body;
-	struct llogd_body   *repbody;
-	struct lvfs_run_ctxt saved;
-	struct llog_ctxt    *ctxt;
-	__u32		flags;
-	void		*ptr;
-	int		  rc;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	if (body == NULL)
-		return -EFAULT;
-
-	ctxt = llog_get_context(req->rq_export->exp_obd, body->lgd_ctxt_idx);
-	if (ctxt == NULL)
-		return -ENODEV;
-
-	disk_obd = ctxt->loc_exp->exp_obd;
-	push_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-
-	rc = llog_open(req->rq_svc_thread->t_env, ctxt, &loghandle,
-		       &body->lgd_logid, NULL, LLOG_OPEN_EXISTS);
-	if (rc)
-		GOTO(out_pop, rc);
-
-	flags = body->lgd_llh_flags;
-	rc = llog_init_handle(req->rq_svc_thread->t_env, loghandle, flags,
-			      NULL);
-	if (rc)
-		GOTO(out_close, rc);
-
-	req_capsule_set_size(&req->rq_pill, &RMF_EADATA, RCL_SERVER,
-			     LLOG_CHUNK_SIZE);
-	rc = req_capsule_server_pack(&req->rq_pill);
-	if (rc)
-		GOTO(out_close, rc = -ENOMEM);
-
-	repbody = req_capsule_server_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	*repbody = *body;
-
-	ptr = req_capsule_server_get(&req->rq_pill, &RMF_EADATA);
-	rc = llog_next_block(req->rq_svc_thread->t_env, loghandle,
-			     &repbody->lgd_saved_index, repbody->lgd_index,
-			     &repbody->lgd_cur_offset, ptr, LLOG_CHUNK_SIZE);
-	if (rc)
-		GOTO(out_close, rc);
-out_close:
-	llog_origin_close(req->rq_svc_thread->t_env, loghandle);
-out_pop:
-	pop_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-	llog_ctxt_put(ctxt);
-	return rc;
-}
-EXPORT_SYMBOL(llog_origin_handle_next_block);
-
-int llog_origin_handle_prev_block(struct ptlrpc_request *req)
-{
-	struct llog_handle   *loghandle;
-	struct llogd_body    *body;
-	struct llogd_body    *repbody;
-	struct obd_device    *disk_obd;
-	struct lvfs_run_ctxt  saved;
-	struct llog_ctxt     *ctxt;
-	__u32		 flags;
-	void		 *ptr;
-	int		   rc;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	if (body == NULL)
-		return -EFAULT;
-
-	ctxt = llog_get_context(req->rq_export->exp_obd, body->lgd_ctxt_idx);
-	if (ctxt == NULL)
-		return -ENODEV;
-
-	disk_obd = ctxt->loc_exp->exp_obd;
-	push_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-
-	rc = llog_open(req->rq_svc_thread->t_env, ctxt, &loghandle,
-			 &body->lgd_logid, NULL, LLOG_OPEN_EXISTS);
-	if (rc)
-		GOTO(out_pop, rc);
-
-	flags = body->lgd_llh_flags;
-	rc = llog_init_handle(req->rq_svc_thread->t_env, loghandle, flags,
-			      NULL);
-	if (rc)
-		GOTO(out_close, rc);
-
-	req_capsule_set_size(&req->rq_pill, &RMF_EADATA, RCL_SERVER,
-			     LLOG_CHUNK_SIZE);
-	rc = req_capsule_server_pack(&req->rq_pill);
-	if (rc)
-		GOTO(out_close, rc = -ENOMEM);
-
-	repbody = req_capsule_server_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	*repbody = *body;
-
-	ptr = req_capsule_server_get(&req->rq_pill, &RMF_EADATA);
-	rc = llog_prev_block(req->rq_svc_thread->t_env, loghandle,
-			     body->lgd_index, ptr, LLOG_CHUNK_SIZE);
-	if (rc)
-		GOTO(out_close, rc);
-
-out_close:
-	llog_origin_close(req->rq_svc_thread->t_env, loghandle);
-out_pop:
-	pop_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-	llog_ctxt_put(ctxt);
-	return rc;
-}
-EXPORT_SYMBOL(llog_origin_handle_prev_block);
-
-int llog_origin_handle_read_header(struct ptlrpc_request *req)
-{
-	struct obd_device    *disk_obd;
-	struct llog_handle   *loghandle;
-	struct llogd_body    *body;
-	struct llog_log_hdr  *hdr;
-	struct lvfs_run_ctxt  saved;
-	struct llog_ctxt     *ctxt;
-	__u32		 flags;
-	int		   rc;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_LLOGD_BODY);
-	if (body == NULL)
-		return -EFAULT;
-
-	ctxt = llog_get_context(req->rq_export->exp_obd, body->lgd_ctxt_idx);
-	if (ctxt == NULL)
-		return -ENODEV;
-
-	disk_obd = ctxt->loc_exp->exp_obd;
-	push_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-
-	rc = llog_open(req->rq_svc_thread->t_env, ctxt, &loghandle,
-		       &body->lgd_logid, NULL, LLOG_OPEN_EXISTS);
-	if (rc)
-		GOTO(out_pop, rc);
-
-	/*
-	 * llog_init_handle() reads the llog header
-	 */
-	flags = body->lgd_llh_flags;
-	rc = llog_init_handle(req->rq_svc_thread->t_env, loghandle, flags,
-			      NULL);
-	if (rc)
-		GOTO(out_close, rc);
-	flags = loghandle->lgh_hdr->llh_flags;
-
-	rc = req_capsule_server_pack(&req->rq_pill);
-	if (rc)
-		GOTO(out_close, rc = -ENOMEM);
-
-	hdr = req_capsule_server_get(&req->rq_pill, &RMF_LLOG_LOG_HDR);
-	*hdr = *loghandle->lgh_hdr;
-out_close:
-	llog_origin_close(req->rq_svc_thread->t_env, loghandle);
-out_pop:
-	pop_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-	llog_ctxt_put(ctxt);
-	return rc;
-}
-EXPORT_SYMBOL(llog_origin_handle_read_header);
-
-int llog_origin_handle_close(struct ptlrpc_request *req)
-{
-	/* Nothing to do */
-	return 0;
-}
-EXPORT_SYMBOL(llog_origin_handle_close);
-
-int llog_origin_handle_cancel(struct ptlrpc_request *req)
-{
-	int num_cookies, rc = 0, err, i, failed = 0;
-	struct obd_device *disk_obd;
-	struct llog_cookie *logcookies;
-	struct llog_ctxt *ctxt = NULL;
-	struct lvfs_run_ctxt saved;
-	struct llog_handle *cathandle;
-	struct inode *inode;
-	void *handle;
-
-	logcookies = req_capsule_client_get(&req->rq_pill, &RMF_LOGCOOKIES);
-	num_cookies = req_capsule_get_size(&req->rq_pill, &RMF_LOGCOOKIES,
-					   RCL_CLIENT) / sizeof(*logcookies);
-	if (logcookies == NULL || num_cookies == 0) {
-		DEBUG_REQ(D_HA, req, "No llog cookies sent");
-		return -EFAULT;
-	}
-
-	ctxt = llog_get_context(req->rq_export->exp_obd,
-				logcookies->lgc_subsys);
-	if (ctxt == NULL)
-		return -ENODEV;
-
-	disk_obd = ctxt->loc_exp->exp_obd;
-	push_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-	for (i = 0; i < num_cookies; i++, logcookies++) {
-		cathandle = ctxt->loc_handle;
-		LASSERT(cathandle != NULL);
-		inode = cathandle->lgh_file->f_dentry->d_inode;
-
-		handle = fsfilt_start_log(disk_obd, inode,
-					  FSFILT_OP_CANCEL_UNLINK, NULL, 1);
-		if (IS_ERR(handle)) {
-			CERROR("fsfilt_start_log() failed: %ld\n",
-			       PTR_ERR(handle));
-			GOTO(pop_ctxt, rc = PTR_ERR(handle));
-		}
-
-		rc = llog_cat_cancel_records(req->rq_svc_thread->t_env,
-					     cathandle, 1, logcookies);
-
-		/*
-		 * Do not raise -ENOENT errors for resent rpcs. This rec already
-		 * might be killed.
-		 */
-		if (rc == -ENOENT &&
-		    (lustre_msg_get_flags(req->rq_reqmsg) & MSG_RESENT)) {
-			/*
-			 * Do not change this message, reply-single.sh test_59b
-			 * expects to find this in log.
-			 */
-			CDEBUG(D_RPCTRACE, "RESENT cancel req %p - ignored\n",
-			       req);
-			rc = 0;
-		} else if (rc == 0) {
-			CDEBUG(D_RPCTRACE, "Canceled %d llog-records\n",
-			       num_cookies);
-		}
-
-		err = fsfilt_commit(disk_obd, inode, handle, 0);
-		if (err) {
-			CERROR("Error committing transaction: %d\n", err);
-			if (!rc)
-				rc = err;
-			failed++;
-			GOTO(pop_ctxt, rc);
-		} else if (rc)
-			failed++;
-	}
-	GOTO(pop_ctxt, rc);
-pop_ctxt:
-	pop_ctxt(&saved, &disk_obd->obd_lvfs_ctxt, NULL);
-	if (rc)
-		CERROR("Cancel %d of %d llog-records failed: %d\n",
-		       failed, num_cookies, rc);
-
-	llog_ctxt_put(ctxt);
-	return rc;
-}
-EXPORT_SYMBOL(llog_origin_handle_cancel);
-
-#else /* !__KERNEL__ */
-int llog_origin_handle_open(struct ptlrpc_request *req)
-{
-	LBUG();
-	return 0;
-}
-
-int llog_origin_handle_destroy(struct ptlrpc_request *req)
-{
-	LBUG();
-	return 0;
-}
-
-int llog_origin_handle_next_block(struct ptlrpc_request *req)
-{
-	LBUG();
-	return 0;
-}
-int llog_origin_handle_prev_block(struct ptlrpc_request *req)
-{
-	LBUG();
-	return 0;
-}
-int llog_origin_handle_read_header(struct ptlrpc_request *req)
-{
-	LBUG();
-	return 0;
-}
-int llog_origin_handle_close(struct ptlrpc_request *req)
-{
-	LBUG();
-	return 0;
-}
-int llog_origin_handle_cancel(struct ptlrpc_request *req)
-{
-	LBUG();
-	return 0;
-}
-#endif
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (2 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 03/40] staging/lustre: remove llog_server.c Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-15  4:09   ` Greg Kroah-Hartman
  2013-11-14 16:13 ` [PATCH 05/40] staging/lustre: validate open handle cookies Peng Tao
                   ` (36 subsequent siblings)
  40 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, JC Lafoucriere, Peng Tao, Andreas Dilger

From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>

When a client accesses data in a released file,
or truncate it, client must trig a restore request.
During this restore, the client must not glimpse and
must use size from MDT. To bring the "restore is running"
information on the client we add a new t_state bit field
to mdt_info which will be used to carry transient file state.
To memorise this information in the inode we add a new flag
LLIF_FILE_RESTORING.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3432
Lustre-change: http://review.whamcloud.com/6537
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |    6 ++-
 .../lustre/lustre/include/lustre/lustre_idl.h      |   14 +++--
 drivers/staging/lustre/lustre/lclient/lcommon_cl.c |    6 +++
 drivers/staging/lustre/lustre/llite/file.c         |   40 ++++++++++++++-
 .../staging/lustre/lustre/llite/llite_internal.h   |    3 ++
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   39 +++++++++++++-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   54 ++++++++++++++++++--
 drivers/staging/lustre/lustre/lov/lov_io.c         |   15 ++++--
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   52 +++++++++----------
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   14 ++---
 10 files changed, 194 insertions(+), 49 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index c485206..4d692dc 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -2388,7 +2388,11 @@ struct cl_io {
 	 * Right now, only two opertaions need to verify layout: glimpse
 	 * and setattr.
 	 */
-			     ci_verify_layout:1;
+			     ci_verify_layout:1,
+	/**
+	 * file is released, restore has to to be triggered by vvp layer
+	 */
+			     ci_restore_needed:1;
 	/**
 	 * Number of pages owned by this IO. For invariant checking.
 	 */
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 5ca18d0..7d60896 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1725,10 +1725,7 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 #define OBD_MD_MDS	 (0x0000000100000000ULL) /* where an inode lives on */
 #define OBD_MD_REINT       (0x0000000200000000ULL) /* reintegrate oa */
 #define OBD_MD_MEA	 (0x0000000400000000ULL) /* CMD split EA  */
-
-/* OBD_MD_MDTIDX is used to get MDT index, but it is never been used overwire,
- * and it is already obsolete since 2.3 */
-/* #define OBD_MD_MDTIDX      (0x0000000800000000ULL) */
+#define OBD_MD_TSTATE      (0x0000000800000000ULL) /* transient state field */
 
 #define OBD_MD_FLXATTR       (0x0000001000000000ULL) /* xattr */
 #define OBD_MD_FLXATTRLS     (0x0000002000000000ULL) /* xattr list */
@@ -2207,6 +2204,11 @@ static inline int ll_inode_to_ext_flags(int iflags)
 		((iflags & S_IMMUTABLE) ? LUSTRE_IMMUTABLE_FL : 0));
 }
 
+/* 64 possible states */
+enum md_transient_state {
+	MS_RESTORE	= (1 << 0),	/* restore is running */
+};
+
 struct mdt_body {
 	struct lu_fid  fid1;
 	struct lu_fid  fid2;
@@ -2218,7 +2220,9 @@ struct mdt_body {
        obd_time	ctime;
 	__u64	  blocks; /* XID, in the case of MDS_READPAGE */
 	__u64	  ioepoch;
-	__u64	       unused1; /* was "ino" until 2.4.0 */
+	__u64	       t_state; /* transient file state defined in
+				 * enum md_transient_state
+				 * was "ino" until 2.4.0 */
 	__u32	  fsuid;
 	__u32	  fsgid;
 	__u32	  capability;
diff --git a/drivers/staging/lustre/lustre/lclient/lcommon_cl.c b/drivers/staging/lustre/lustre/lclient/lcommon_cl.c
index e60c04d..1c628e3 100644
--- a/drivers/staging/lustre/lustre/lclient/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/lclient/lcommon_cl.c
@@ -1006,6 +1006,12 @@ again:
 	cl_io_fini(env, io);
 	if (unlikely(io->ci_need_restart))
 		goto again;
+	/* HSM import case: file is released, cannot be restored
+	 * no need to fail except if restore registration failed
+	 * with -ENODATA */
+	if (result == -ENODATA && io->ci_restore_needed &&
+	    io->ci_result != -ENODATA)
+		result = 0;
 	cl_env_put(env, &refcheck);
 	return result;
 }
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index c5b721c..56d9d36 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -905,7 +905,7 @@ out:
 	cl_io_fini(env, io);
 	/* If any bit been read/written (result != 0), we just return
 	 * short read/write instead of restart io. */
-	if (result == 0 && io->ci_need_restart) {
+	if ((result == 0 || result == -ENODATA) && io->ci_need_restart) {
 		CDEBUG(D_VFSTRACE, "Restart %s on %s from %lld, count:%zd\n",
 		       iot == CIT_READ ? "read" : "write",
 		       file->f_dentry->d_name.name, *ppos, count);
@@ -2581,7 +2581,15 @@ int ll_inode_revalidate_it(struct dentry *dentry, struct lookup_intent *it,
 		LTIME_S(inode->i_mtime) = ll_i2info(inode)->lli_lvb.lvb_mtime;
 		LTIME_S(inode->i_ctime) = ll_i2info(inode)->lli_lvb.lvb_ctime;
 	} else {
-		rc = ll_glimpse_size(inode);
+		/* In case of restore, the MDT has the right size and has
+		 * already send it back without granting the layout lock,
+		 * inode is up-to-date so glimpse is useless.
+		 * Also to glimpse we need the layout, in case of a running
+		 * restore the MDT holds the layout lock so the glimpse will
+		 * block up to the end of restore (getattr will block)
+		 */
+		if (!(ll_i2info(inode)->lli_flags & LLIF_FILE_RESTORING))
+			rc = ll_glimpse_size(inode);
 	}
 	return rc;
 }
@@ -3006,6 +3014,7 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, ldlm_mode_t mode,
 	unlock_res_and_lock(lock);
 	/* checking lvb_ready is racy but this is okay. The worst case is
 	 * that multi processes may configure the file on the same time. */
+
 	if (lvb_ready || !reconf) {
 		rc = -ENODATA;
 		if (lvb_ready) {
@@ -3183,3 +3192,30 @@ again:
 
 	return rc;
 }
+
+/**
+ *  This function send a restore request to the MDT
+ */
+int ll_layout_restore(struct inode *inode)
+{
+	struct hsm_user_request	*hur;
+	int			 len, rc;
+
+	len = sizeof(struct hsm_user_request) +
+	      sizeof(struct hsm_user_item);
+	OBD_ALLOC(hur, len);
+	if (hur == NULL)
+		return -ENOMEM;
+
+	hur->hur_request.hr_action = HUA_RESTORE;
+	hur->hur_request.hr_archive_id = 0;
+	hur->hur_request.hr_flags = 0;
+	memcpy(&hur->hur_user_item[0].hui_fid, &ll_i2info(inode)->lli_fid,
+	       sizeof(hur->hur_user_item[0].hui_fid));
+	hur->hur_user_item[0].hui_extent.length = -1;
+	hur->hur_request.hr_itemcount = 1;
+	rc = obd_iocontrol(LL_IOC_HSM_REQUEST, cl_i2sbi(inode)->ll_md_exp,
+			   len, hur, NULL);
+	OBD_FREE(hur, len);
+	return rc;
+}
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 47e443d..3fedf1b 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -124,6 +124,8 @@ enum lli_flags {
 	LLIF_SRVLOCK	    = (1 << 5),
 	/* File data is modified. */
 	LLIF_DATA_MODIFIED      = (1 << 6),
+	/* File is being restored */
+	LLIF_FILE_RESTORING	= (1 << 7),
 };
 
 struct ll_inode_info {
@@ -1578,5 +1580,6 @@ enum {
 
 int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf);
 int ll_layout_refresh(struct inode *inode, __u32 *gen);
+int ll_layout_restore(struct inode *inode);
 
 #endif /* LLITE_INTERNAL_H */
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index fd584ff..478fc44 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1353,6 +1353,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr)
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct md_op_data *op_data = NULL;
 	struct md_open_data *mod = NULL;
+	bool file_is_released = false;
 	int rc = 0, rc1 = 0;
 
 	CDEBUG(D_VFSTRACE, "%s: setattr inode %p/fid:"DFID" from %llu to %llu, "
@@ -1436,10 +1437,40 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr)
 	    (attr->ia_valid & (ATTR_SIZE | ATTR_MTIME | ATTR_MTIME_SET)))
 		op_data->op_flags = MF_EPOCH_OPEN;
 
+	/* truncate on a released file must failed with -ENODATA,
+	 * so size must not be set on MDS for released file
+	 * but other attributes must be set
+	 */
+	if (S_ISREG(inode->i_mode)) {
+		struct lov_stripe_md *lsm;
+		__u32 gen;
+
+		ll_layout_refresh(inode, &gen);
+		lsm = ccc_inode_lsm_get(inode);
+		if (lsm && lsm->lsm_pattern & LOV_PATTERN_F_RELEASED)
+			file_is_released = true;
+		ccc_inode_lsm_put(inode, lsm);
+	}
+
+	/* clear size attr for released file
+	 * we clear the attribute send to MDT in op_data, not the original
+	 * received from caller in attr which is used later to
+	 * decide return code */
+	if (file_is_released && (attr->ia_valid & ATTR_SIZE))
+		op_data->op_attr.ia_valid &= ~ATTR_SIZE;
+
 	rc = ll_md_setattr(dentry, op_data, &mod);
 	if (rc)
 		GOTO(out, rc);
 
+	/* truncate failed, others succeed */
+	if (file_is_released) {
+		if (attr->ia_valid & ATTR_SIZE)
+			GOTO(out, rc = -ENODATA);
+		else
+			GOTO(out, rc = 0);
+	}
+
 	/* RPC to MDT is sent, cancel data modification flag */
 	if (rc == 0 && (op_data->op_bias & MDS_DATA_MODIFIED)) {
 		spin_lock(&lli->lli_lock);
@@ -1453,7 +1484,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr)
 
 	if (attr->ia_valid & (ATTR_SIZE |
 			      ATTR_ATIME | ATTR_ATIME_SET |
-			      ATTR_MTIME | ATTR_MTIME_SET))
+			      ATTR_MTIME | ATTR_MTIME_SET)) {
 		/* For truncate and utimes sending attributes to OSTs, setting
 		 * mtime/atime to the past will be performed under PW [0:EOF]
 		 * extent lock (new_size:EOF for truncate).  It may seem
@@ -1461,6 +1492,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr)
 		 * setting times to past, but it is necessary due to possible
 		 * time de-synchronization between MDT inode and OST objects */
 		rc = ll_setattr_ost(inode, attr);
+	}
 out:
 	if (op_data) {
 		if (op_data->op_ioepoch) {
@@ -1761,6 +1793,11 @@ void ll_update_inode(struct inode *inode, struct lustre_md *md)
 		LASSERT(md->oss_capa);
 		ll_add_capa(inode, md->oss_capa);
 	}
+
+	if (body->valid & OBD_MD_TSTATE) {
+		if (body->t_state & MS_RESTORE)
+			lli->lli_flags |= LLIF_FILE_RESTORING;
+	}
 }
 
 void ll_read_inode2(struct inode *inode, void *opaque)
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 3ff664c..7053cc8 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -121,8 +121,38 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 
 	CLOBINVRNT(env, obj, ccc_object_invariant(obj));
 
-	CDEBUG(D_VFSTRACE, "ignore/verify layout %d/%d, layout version %d.\n",
-		io->ci_ignore_layout, io->ci_verify_layout, cio->cui_layout_gen);
+	CDEBUG(D_VFSTRACE, DFID" ignore/verify layout %d/%d, layout version %d "
+			   "restore needed %d\n",
+	       PFID(lu_object_fid(&obj->co_lu)),
+	       io->ci_ignore_layout, io->ci_verify_layout,
+	       cio->cui_layout_gen, io->ci_restore_needed);
+
+	if (io->ci_restore_needed == 1) {
+		int	rc;
+
+		/* file was detected release, we need to restore it
+		 * before finishing the io
+		 */
+		rc = ll_layout_restore(ccc_object_inode(obj));
+		/* if restore registration failed, no restart,
+		 * we will return -ENODATA */
+		/* The layout will change after restore, so we need to
+		 * block on layout lock hold by the MDT
+		 * as MDT will not send new layout in lvb (see LU-3124)
+		 * we have to explicitly fetch it, all this will be done
+		 * by ll_layout_refresh()
+		 */
+		if (rc == 0) {
+			io->ci_restore_needed = 0;
+			io->ci_need_restart = 1;
+			io->ci_verify_layout = 1;
+		} else {
+			io->ci_restore_needed = 1;
+			io->ci_need_restart = 0;
+			io->ci_verify_layout = 0;
+			io->ci_result = rc;
+		}
+	}
 
 	if (!io->ci_ignore_layout && io->ci_verify_layout) {
 		__u32 gen = 0;
@@ -130,9 +160,17 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 		/* check layout version */
 		ll_layout_refresh(ccc_object_inode(obj), &gen);
 		io->ci_need_restart = cio->cui_layout_gen != gen;
-		if (io->ci_need_restart)
-			CDEBUG(D_VFSTRACE, "layout changed from %d to %d.\n",
-				cio->cui_layout_gen, gen);
+		if (io->ci_need_restart) {
+			CDEBUG(D_VFSTRACE,
+			       DFID" layout changed from %d to %d.\n",
+			       PFID(lu_object_fid(&obj->co_lu)),
+			       cio->cui_layout_gen, gen);
+			/* today successful restore is the only possible
+			 * case */
+			/* restore was done, clear restoring state */
+			ll_i2info(ccc_object_inode(obj))->lli_flags &=
+				~LLIF_FILE_RESTORING;
+		}
 	}
 }
 
@@ -1111,6 +1149,12 @@ int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
 
 	CLOBINVRNT(env, obj, ccc_object_invariant(obj));
 
+	CDEBUG(D_VFSTRACE, DFID" ignore/verify layout %d/%d, layout version %d "
+			   "restore needed %d\n",
+	       PFID(lu_object_fid(&obj->co_lu)),
+	       io->ci_ignore_layout, io->ci_verify_layout,
+	       cio->cui_layout_gen, io->ci_restore_needed);
+
 	CL_IO_SLICE_CLEAN(cio, cui_cl);
 	cl_io_slice_add(io, &cio->cui_cl, obj, &vvp_io_ops);
 	vio->cui_ra_window_set = 0;
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 2792fa5..5a6ab70 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -947,14 +947,23 @@ int lov_io_init_released(const struct lu_env *env, struct cl_object *obj,
 		LASSERTF(0, "invalid type %d\n", io->ci_type);
 	case CIT_MISC:
 	case CIT_FSYNC:
-		result = +1;
+		result = 1;
 		break;
 	case CIT_SETATTR:
+		/* the truncate to 0 is managed by MDT:
+		 * - in open, for open O_TRUNC
+		 * - in setattr, for truncate
+		 */
+		/* the truncate is for size > 0 so triggers a restore */
+		if (cl_io_is_trunc(io))
+			io->ci_restore_needed = 1;
+		result = -ENODATA;
+		break;
 	case CIT_READ:
 	case CIT_WRITE:
 	case CIT_FAULT:
-		/* TODO: need to restore the file. */
-		result = -EBADF;
+		io->ci_restore_needed = 1;
+		result = -ENODATA;
 		break;
 	}
 	if (result == 0) {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index cd2611a3..8ec08d9 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1863,34 +1863,34 @@ EXPORT_SYMBOL(lustre_swab_lquota_lvb);
 
 void lustre_swab_mdt_body (struct mdt_body *b)
 {
-	lustre_swab_lu_fid (&b->fid1);
-	lustre_swab_lu_fid (&b->fid2);
+	lustre_swab_lu_fid(&b->fid1);
+	lustre_swab_lu_fid(&b->fid2);
 	/* handle is opaque */
-	__swab64s (&b->valid);
-	__swab64s (&b->size);
-	__swab64s (&b->mtime);
-	__swab64s (&b->atime);
-	__swab64s (&b->ctime);
-	__swab64s (&b->blocks);
-	__swab64s (&b->ioepoch);
-	CLASSERT(offsetof(typeof(*b), unused1) != 0);
-	__swab32s (&b->fsuid);
-	__swab32s (&b->fsgid);
-	__swab32s (&b->capability);
-	__swab32s (&b->mode);
-	__swab32s (&b->uid);
-	__swab32s (&b->gid);
-	__swab32s (&b->flags);
-	__swab32s (&b->rdev);
-	__swab32s (&b->nlink);
+	__swab64s(&b->valid);
+	__swab64s(&b->size);
+	__swab64s(&b->mtime);
+	__swab64s(&b->atime);
+	__swab64s(&b->ctime);
+	__swab64s(&b->blocks);
+	__swab64s(&b->ioepoch);
+	__swab64s(&b->t_state);
+	__swab32s(&b->fsuid);
+	__swab32s(&b->fsgid);
+	__swab32s(&b->capability);
+	__swab32s(&b->mode);
+	__swab32s(&b->uid);
+	__swab32s(&b->gid);
+	__swab32s(&b->flags);
+	__swab32s(&b->rdev);
+	__swab32s(&b->nlink);
 	CLASSERT(offsetof(typeof(*b), unused2) != 0);
-	__swab32s (&b->suppgid);
-	__swab32s (&b->eadatasize);
-	__swab32s (&b->aclsize);
-	__swab32s (&b->max_mdsize);
-	__swab32s (&b->max_cookiesize);
-	__swab32s (&b->uid_h);
-	__swab32s (&b->gid_h);
+	__swab32s(&b->suppgid);
+	__swab32s(&b->eadatasize);
+	__swab32s(&b->aclsize);
+	__swab32s(&b->max_mdsize);
+	__swab32s(&b->max_cookiesize);
+	__swab32s(&b->uid_h);
+	__swab32s(&b->gid_h);
 	CLASSERT(offsetof(typeof(*b), padding_5) != 0);
 }
 EXPORT_SYMBOL(lustre_swab_mdt_body);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 9890bd9..912f194 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -49,8 +49,8 @@ void lustre_assert_wire_constants(void)
 {
 	 /* Wire protocol assertions generated by 'wirecheck'
 	  * (make -C lustre/utils newwiretest)
-	  * running on Linux deva 2.6.32.279.lustre #5 SMP Tue Apr 9 22:52:17 CST 2013 x86_64 x86_64 x
-	  * with gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC)  */
+	  * running on Linux centos6-bis 2.6.32-358.0.1.el6-head #3 SMP Wed Apr 17 17:37:43 CEST 2013
+	  * with gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)  */
 
 
 	/* Constants... */
@@ -1335,6 +1335,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_MD_REINT);
 	LASSERTF(OBD_MD_MEA == (0x0000000400000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_MEA);
+	LASSERTF(OBD_MD_TSTATE == (0x0000000800000000ULL), "found 0x%.16llxULL\n",
+		 OBD_MD_TSTATE);
 	LASSERTF(OBD_MD_FLXATTR == (0x0000001000000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLXATTR);
 	LASSERTF(OBD_MD_FLXATTRLS == (0x0000002000000000ULL), "found 0x%.16llxULL\n",
@@ -1918,10 +1920,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct mdt_body, blocks));
 	LASSERTF((int)sizeof(((struct mdt_body *)0)->blocks) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct mdt_body *)0)->blocks));
-	LASSERTF((int)offsetof(struct mdt_body, unused1) == 96, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_body, unused1));
-	LASSERTF((int)sizeof(((struct mdt_body *)0)->unused1) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_body *)0)->unused1));
+	LASSERTF((int)offsetof(struct mdt_body, t_state) == 96, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_body, t_state));
+	LASSERTF((int)sizeof(((struct mdt_body *)0)->t_state) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_body *)0)->t_state));
 	LASSERTF((int)offsetof(struct mdt_body, fsuid) == 104, "found %lld\n",
 		 (long long)(int)offsetof(struct mdt_body, fsuid));
 	LASSERTF((int)sizeof(((struct mdt_body *)0)->fsuid) == 4, "found %lld\n",
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (3 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-15  4:13   ` Greg Kroah-Hartman
  2013-11-14 16:13 ` [PATCH 06/40] staging/lustre/llite: use correct FID in ll_och_fill() Peng Tao
                   ` (35 subsequent siblings)
  40 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

Add a const void *h_owner member to struct portals_handle. Add a const
void *owner parameter to class_handle2object() which must be matched
by the h_owner member of the handle in addition to the cookie.  Adjust
the callers of class_handle2object() accordingly, using NULL as the
argument to the owner parameter, except in the case of
mdt_handle2mfd() where we add an explicit mdt_export_data parameter
which we use as the owner when searching for a MFD. When allocating a
new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
store it in h_owner. This allows the MDT to validate that the client
has not sent the wrong open handle cookie, or sent the right cookie to
the wrong MDT.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3233
Lustre-change: http://review.whamcloud.com/6938
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/lustre/include/lustre_handles.h |    5 +++--
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    2 +-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    4 ++--
 drivers/staging/lustre/lustre/obdclass/genops.c    |    2 +-
 .../lustre/lustre/obdclass/lustre_handles.c        |    4 ++--
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_handles.h b/drivers/staging/lustre/lustre/include/lustre_handles.h
index fcd40f3..5671f62 100644
--- a/drivers/staging/lustre/lustre/include/lustre_handles.h
+++ b/drivers/staging/lustre/lustre/include/lustre_handles.h
@@ -66,7 +66,8 @@ struct portals_handle_ops {
 struct portals_handle {
 	struct list_head			h_link;
 	__u64				h_cookie;
-	struct portals_handle_ops	*h_ops;
+	const void		       *h_owner;
+	struct portals_handle_ops      *h_ops;
 
 	/* newly added fields to handle the RCU issue. -jxiong */
 	cfs_rcu_head_t			h_rcu;
@@ -83,7 +84,7 @@ void class_handle_hash(struct portals_handle *,
 		       struct portals_handle_ops *ops);
 void class_handle_unhash(struct portals_handle *);
 void class_handle_hash_back(struct portals_handle *);
-void *class_handle2object(__u64 cookie);
+void *class_handle2object(__u64 cookie, const void *owner);
 void class_handle_free_cb(cfs_rcu_head_t *);
 int class_handle_init(void);
 void class_handle_cleanup(void);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index 3900a69..d81ca5c 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -570,7 +570,7 @@ struct ldlm_lock *__ldlm_handle2lock(const struct lustre_handle *handle,
 
 	LASSERT(handle);
 
-	lock = class_handle2object(handle->cookie);
+	lock = class_handle2object(handle->cookie, NULL);
 	if (lock == NULL)
 		return NULL;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 796da89..79ac458 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -107,10 +107,10 @@ static inline void lov_put_reqset(struct lov_request_set *set)
 }
 
 static inline struct lov_lock_handles *
-lov_handle2llh(struct lustre_handle *handle)
+lov_handle2llh(const struct lustre_handle *handle)
 {
 	LASSERT(handle != NULL);
-	return(class_handle2object(handle->cookie));
+	return class_handle2object(handle->cookie, NULL);
 }
 
 static inline void lov_llh_put(struct lov_lock_handles *llh)
diff --git a/drivers/staging/lustre/lustre/obdclass/genops.c b/drivers/staging/lustre/lustre/obdclass/genops.c
index f6fae16..0da0034 100644
--- a/drivers/staging/lustre/lustre/obdclass/genops.c
+++ b/drivers/staging/lustre/lustre/obdclass/genops.c
@@ -701,7 +701,7 @@ struct obd_export *class_conn2export(struct lustre_handle *conn)
 	}
 
 	CDEBUG(D_INFO, "looking for export cookie "LPX64"\n", conn->cookie);
-	export = class_handle2object(conn->cookie);
+	export = class_handle2object(conn->cookie, NULL);
 	return export;
 }
 EXPORT_SYMBOL(class_conn2export);
diff --git a/drivers/staging/lustre/lustre/obdclass/lustre_handles.c b/drivers/staging/lustre/lustre/obdclass/lustre_handles.c
index be31d32..c6406c3 100644
--- a/drivers/staging/lustre/lustre/obdclass/lustre_handles.c
+++ b/drivers/staging/lustre/lustre/obdclass/lustre_handles.c
@@ -147,7 +147,7 @@ void class_handle_hash_back(struct portals_handle *h)
 }
 EXPORT_SYMBOL(class_handle_hash_back);
 
-void *class_handle2object(__u64 cookie)
+void *class_handle2object(__u64 cookie, const void *owner)
 {
 	struct handle_bucket *bucket;
 	struct portals_handle *h;
@@ -161,7 +161,7 @@ void *class_handle2object(__u64 cookie)
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(h, &bucket->head, h_link) {
-		if (h->h_cookie != cookie)
+		if (h->h_cookie != cookie || h->h_owner != owner)
 			continue;
 
 		spin_lock(&h->h_lock);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 06/40] staging/lustre/llite: use correct FID in ll_och_fill()
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (4 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 05/40] staging/lustre: validate open handle cookies Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 07/40] staging/lustre/hsm: Implementation of exclusive open Peng Tao
                   ` (34 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

When ll_intent_file_open() is called on a file with a stale dentry,
ll_och_fill() may incorrectly use the FID from the struct
ll_inode_info rather than the FID from the response body (which is the
correct FID for the close). Fix this, remove the ll_inode_info
parameter from ll_och_fill(), and move the call to ll_ioepoch_open()
from ll_och_fill() to ll_local_open().

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3233
Lustre-change: http://review.whamcloud.com/6695
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/file.c |   25 ++++++++-----------------
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 56d9d36..eab9f03 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -431,22 +431,17 @@ void ll_ioepoch_open(struct ll_inode_info *lli, __u64 ioepoch)
 	}
 }
 
-static int ll_och_fill(struct obd_export *md_exp, struct ll_inode_info *lli,
-		       struct lookup_intent *it, struct obd_client_handle *och)
+static int ll_och_fill(struct obd_export *md_exp, struct lookup_intent *it,
+		       struct obd_client_handle *och)
 {
 	struct ptlrpc_request *req = it->d.lustre.it_data;
 	struct mdt_body *body;
 
-	LASSERT(och);
-
 	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-	LASSERT(body != NULL);		      /* reply already checked out */
-
-	memcpy(&och->och_fh, &body->handle, sizeof(body->handle));
+	och->och_fh = body->handle;
+	och->och_fid = body->fid1;
 	och->och_magic = OBD_CLIENT_HANDLE_MAGIC;
-	och->och_fid = lli->lli_fid;
 	och->och_flags = it->it_flags;
-	ll_ioepoch_open(lli, body->ioepoch);
 
 	return md_set_open_replay_data(md_exp, och, req);
 }
@@ -466,15 +461,12 @@ int ll_local_open(struct file *file, struct lookup_intent *it,
 		struct mdt_body *body;
 		int rc;
 
-		rc = ll_och_fill(ll_i2sbi(inode)->ll_md_exp, lli, it, och);
-		if (rc)
+		rc = ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
+		if (rc != 0)
 			return rc;
 
 		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-		if ((it->it_flags & FMODE_WRITE) &&
-		    (body->valid & OBD_MD_FLSIZE))
-			CDEBUG(D_INODE, "Epoch "LPU64" opened on "DFID"\n",
-			       lli->lli_ioepoch, PFID(&lli->lli_fid));
+		ll_ioepoch_open(lli, body->ioepoch);
 	}
 
 	LUSTRE_FPRIVATE(file) = fd;
@@ -1482,8 +1474,7 @@ int ll_release_openhandle(struct dentry *dentry, struct lookup_intent *it)
 	if (!och)
 		GOTO(out, rc = -ENOMEM);
 
-	ll_och_fill(ll_i2sbi(inode)->ll_md_exp,
-		    ll_i2info(inode), it, och);
+	ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
 
 	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
 				       inode, och);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 07/40] staging/lustre/hsm: Implementation of exclusive open
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (5 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 06/40] staging/lustre/llite: use correct FID in ll_och_fill() Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-15  4:17   ` Greg Kroah-Hartman
  2013-11-14 16:13 ` [PATCH 08/40] staging/lustre/lnet: Fix assert on empty group in selftest module Peng Tao
                   ` (33 subsequent siblings)
  40 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Jinshan Xiong, John L. Hammond, Peng Tao, Andreas Dilger

From: Jinshan Xiong <jinshan.xiong@intel.com>

In this patch, a framework of lease is implemented. However,
only exclusive lease is supported right now.

To apply a lease, MDS_OPEN_LEASE must be set to open the file, EX
mode open lock is returned to the client side to hold a lease. From
that time on, if this file is opened again by other processes, the
open lock will be revoked so the client who holds the lease will
know the lease is already broken by checking that open lock.

To release a lease, normal close is used. The client will revoke the
open lock before sending CLOSE request.

Lease can be applied in two ways. ll_lease_open()/close() can be
called directly if the lease holder is in kernel space; or if the
lease holder lives in user space, it has to open the file first and
then use ioctl() with command LL_IOC_SET_LEASE to apply a lease. The
lease holder has to poll the lease status itself.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2919
Lustre-change: http://review.whamcloud.com/6730
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/linux/lustre_intent.h    |    2 +-
 .../lustre/lustre/include/lustre/lustre_idl.h      |    5 +
 .../lustre/lustre/include/lustre/lustre_user.h     |   13 +-
 .../lustre/lustre/include/lustre_dlm_flags.h       |   12 +-
 drivers/staging/lustre/lustre/include/lustre_lib.h |   11 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    5 +
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    4 +-
 drivers/staging/lustre/lustre/llite/file.c         |  296 +++++++++++++++++++-
 .../staging/lustre/lustre/llite/llite_internal.h   |   11 +-
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    2 +-
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |    7 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |   39 ++-
 12 files changed, 373 insertions(+), 34 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_intent.h b/drivers/staging/lustre/lustre/include/linux/lustre_intent.h
index b10ddfa..c491d52 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_intent.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_intent.h
@@ -52,8 +52,8 @@ struct lustre_intent_data {
 
 struct lookup_intent {
 	int     it_op;
-	int     it_flags;
 	int     it_create_mode;
+	__u64   it_flags;
 	union {
 		struct lustre_intent_data lustre;
 	} d;
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 7d60896..4d8d8c3 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -2117,6 +2117,7 @@ extern void lustre_swab_generic_32s (__u32 *val);
 #define DISP_ENQ_OPEN_REF    0x00800000
 #define DISP_ENQ_CREATE_REF  0x01000000
 #define DISP_OPEN_LOCK       0x02000000
+#define DISP_OPEN_LEASE      0x04000000
 
 /* INODE LOCK PARTS */
 #define MDS_INODELOCK_LOOKUP 0x000001       /* dentry, mode, owner, group */
@@ -2377,6 +2378,10 @@ extern void lustre_swab_mdt_rec_setattr (struct mdt_rec_setattr *sa);
 					      * hsm restore) */
 #define MDS_OPEN_VOLATILE   0400000000000ULL /* File is volatile = created
 						unlinked */
+#define MDS_OPEN_LEASE	   01000000000000ULL /* Open the file and grant lease
+					      * delegation, succeed if it's not
+					      * being opened with conflict mode.
+					      */
 
 /* permission for create non-directory file */
 #define MAY_CREATE      (1 << 7)
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index c7bd447..9891ace 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -241,12 +241,15 @@ struct ost_id {
 						struct hsm_current_action)
 /* see <lustre_lib.h> for ioctl numbers 221-232 */
 
-#define LL_IOC_LMV_SETSTRIPE	    _IOWR('f', 240, struct lmv_user_md)
-#define LL_IOC_LMV_GETSTRIPE	    _IOWR('f', 241, struct lmv_user_md)
-#define LL_IOC_REMOVE_ENTRY	    _IOWR('f', 242, __u64)
+#define LL_IOC_LMV_SETSTRIPE		_IOWR('f', 240, struct lmv_user_md)
+#define LL_IOC_LMV_GETSTRIPE		_IOWR('f', 241, struct lmv_user_md)
+#define LL_IOC_REMOVE_ENTRY		_IOWR('f', 242, __u64)
 
-#define LL_STATFS_LMV	   1
-#define LL_STATFS_LOV	   2
+#define LL_IOC_SET_LEASE		_IOWR('f', 243, long)
+#define LL_IOC_GET_LEASE		_IO('f', 244)
+
+#define LL_STATFS_LMV		1
+#define LL_STATFS_LOV		2
 #define LL_STATFS_NODELAY	4
 
 #define IOC_MDC_TYPE	    'i'
diff --git a/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h b/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h
index 8c34d9d..a632217 100644
--- a/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h
+++ b/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h
@@ -35,7 +35,7 @@
 #ifndef LDLM_ALL_FLAGS_MASK
 
 /** l_flags bits marked as "all_flags" bits */
-#define LDLM_FL_ALL_FLAGS_MASK          0x007FFFFFC08F132FULL
+#define LDLM_FL_ALL_FLAGS_MASK          0x00FFFFFFC08F132FULL
 
 /** l_flags bits marked as "ast" bits */
 #define LDLM_FL_AST_MASK                0x0000000080000000ULL
@@ -53,7 +53,7 @@
 #define LDLM_FL_INHERIT_MASK            0x0000000000800000ULL
 
 /** l_flags bits marked as "local_only" bits */
-#define LDLM_FL_LOCAL_ONLY_MASK         0x007FFFFF00000000ULL
+#define LDLM_FL_LOCAL_ONLY_MASK         0x00FFFFFF00000000ULL
 
 /** l_flags bits marked as "on_wire" bits */
 #define LDLM_FL_ON_WIRE_MASK            0x00000000C08F132FULL
@@ -358,6 +358,12 @@
 #define ldlm_set_ns_srv(_l)             LDLM_SET_FLAG((  _l), 1ULL << 54)
 #define ldlm_clear_ns_srv(_l)           LDLM_CLEAR_FLAG((_l), 1ULL << 54)
 
+/** Flag whether this lock can be reused. Used by exclusive open. */
+#define LDLM_FL_EXCL                    0x0080000000000000ULL // bit  55
+#define ldlm_is_excl(_l)                LDLM_TEST_FLAG(( _l), 1ULL << 55)
+#define ldlm_set_excl(_l)               LDLM_SET_FLAG((  _l), 1ULL << 55)
+#define ldlm_clear_excl(_l)             LDLM_CLEAR_FLAG((_l), 1ULL << 55)
+
 /** test for ldlm_lock flag bit set */
 #define LDLM_TEST_FLAG(_l, _b)        (((_l)->l_flags & (_b)) != 0)
 
@@ -414,6 +420,7 @@ static int hf_lustre_ldlm_fl_server_lock         = -1;
 static int hf_lustre_ldlm_fl_res_locked          = -1;
 static int hf_lustre_ldlm_fl_waited              = -1;
 static int hf_lustre_ldlm_fl_ns_srv              = -1;
+static int hf_lustre_ldlm_fl_excl                = -1;
 
 const value_string lustre_ldlm_flags_vals[] = {
   {LDLM_FL_LOCK_CHANGED,        "LDLM_FL_LOCK_CHANGED"},
@@ -454,6 +461,7 @@ const value_string lustre_ldlm_flags_vals[] = {
   {LDLM_FL_RES_LOCKED,          "LDLM_FL_RES_LOCKED"},
   {LDLM_FL_WAITED,              "LDLM_FL_WAITED"},
   {LDLM_FL_NS_SRV,              "LDLM_FL_NS_SRV"},
+  {LDLM_FL_EXCL,                "LDLM_FL_EXCL"},
   { 0, NULL }
 };
 #endif /*  WIRESHARK_COMPILE */
diff --git a/drivers/staging/lustre/lustre/include/lustre_lib.h b/drivers/staging/lustre/lustre/include/lustre_lib.h
index 5e11107..d576b7f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_lib.h
+++ b/drivers/staging/lustre/lustre/include/lustre_lib.h
@@ -81,11 +81,12 @@ struct client_obd *client_conn2cli(struct lustre_handle *conn);
 
 struct md_open_data;
 struct obd_client_handle {
-	struct lustre_handle  och_fh;
-	struct lu_fid	 och_fid;
-	struct md_open_data  *och_mod;
-	__u32 och_magic;
-	int och_flags;
+	struct lustre_handle	 och_fh;
+	struct lu_fid		 och_fid;
+	struct md_open_data	*och_mod;
+	struct lustre_handle	 och_lease_handle; /* open lock for lease */
+	__u32			 och_magic;
+	int			 och_flags;
 };
 #define OBD_CLIENT_HANDLE_MAGIC 0xd15ea5ed
 
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index d81ca5c..c4fa9f9 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -1129,6 +1129,11 @@ static struct ldlm_lock *search_queue(struct list_head *queue,
 		if (lock == old_lock)
 			break;
 
+		/* Check if this lock can be matched.
+		 * Used by LU-2919(exclusive open) for open lease lock */
+		if (ldlm_is_excl(lock))
+			continue;
+
 		/* llite sometimes wants to match locks that will be
 		 * canceled when their users drop, but we allow it to match
 		 * if it passes in CBPENDING and the lock still has users.
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index dcc2784..1e2c0dd 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -910,7 +910,7 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp,
 	lock->l_conn_export = exp;
 	lock->l_export = NULL;
 	lock->l_blocking_ast = einfo->ei_cb_bl;
-	lock->l_flags |= (*flags & LDLM_FL_NO_LRU);
+	lock->l_flags |= (*flags & (LDLM_FL_NO_LRU | LDLM_FL_EXCL));
 
 	/* lock not sent to server yet */
 
@@ -1333,7 +1333,7 @@ int ldlm_cli_cancel(struct lustre_handle *lockh,
 	}
 
 	rc = ldlm_cli_cancel_local(lock);
-	if (rc == LDLM_FL_LOCAL_ONLY) {
+	if (rc == LDLM_FL_LOCAL_ONLY || cancel_flags & LCF_LOCAL) {
 		LDLM_LOCK_RELEASE(lock);
 		return 0;
 	}
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index eab9f03..f97c600 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -241,6 +241,24 @@ int ll_md_close(struct obd_export *md_exp, struct inode *inode,
 	if (unlikely(fd->fd_flags & LL_FILE_GROUP_LOCKED))
 		ll_put_grouplock(inode, file, fd->fd_grouplock.cg_gid);
 
+	if (fd->fd_lease_och != NULL) {
+		bool lease_broken;
+
+		/* Usually the lease is not released when the
+		 * application crashed, we need to release here. */
+		rc = ll_lease_close(fd->fd_lease_och, inode, &lease_broken);
+		CDEBUG(rc ? D_ERROR : D_INODE, "Clean up lease "DFID" %d/%d\n",
+			PFID(&lli->lli_fid), rc, lease_broken);
+
+		fd->fd_lease_och = NULL;
+	}
+
+	if (fd->fd_och != NULL) {
+		rc = ll_close_inode_openhandle(md_exp, inode, fd->fd_och);
+		fd->fd_och = NULL;
+		GOTO(out, rc);
+	}
+
 	/* Let's see if we have good enough OPEN lock on the file and if
 	   we can skip talking to MDS */
 	if (file->f_dentry->d_inode) { /* Can this ever be false? */
@@ -277,6 +295,7 @@ int ll_md_close(struct obd_export *md_exp, struct inode *inode,
 		       file, file->f_dentry, file->f_dentry->d_name.name);
 	}
 
+out:
 	LUSTRE_FPRIVATE(file) = NULL;
 	ll_file_data_put(fd);
 	ll_capa_close(inode);
@@ -440,6 +459,7 @@ static int ll_och_fill(struct obd_export *md_exp, struct lookup_intent *it,
 	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
 	och->och_fh = body->handle;
 	och->och_fid = body->fid1;
+	och->och_lease_handle.cookie = it->d.lustre.it_lock_handle;
 	och->och_magic = OBD_CLIENT_HANDLE_MAGIC;
 	och->och_flags = it->it_flags;
 
@@ -471,7 +491,7 @@ int ll_local_open(struct file *file, struct lookup_intent *it,
 
 	LUSTRE_FPRIVATE(file) = fd;
 	ll_readahead_init(inode, &fd->fd_ras);
-	fd->fd_omode = it->it_flags;
+	fd->fd_omode = it->it_flags & (FMODE_READ | FMODE_WRITE | FMODE_EXEC);
 	return 0;
 }
 
@@ -673,6 +693,196 @@ out_openerr:
 	return rc;
 }
 
+static int ll_md_blocking_lease_ast(struct ldlm_lock *lock,
+			struct ldlm_lock_desc *desc, void *data, int flag)
+{
+	int rc;
+	struct lustre_handle lockh;
+
+	switch (flag) {
+	case LDLM_CB_BLOCKING:
+		ldlm_lock2handle(lock, &lockh);
+		rc = ldlm_cli_cancel(&lockh, LCF_ASYNC);
+		if (rc < 0) {
+			CDEBUG(D_INODE, "ldlm_cli_cancel: %d\n", rc);
+			return rc;
+		}
+		break;
+	case LDLM_CB_CANCELING:
+		/* do nothing */
+		break;
+	}
+	return 0;
+}
+
+/**
+ * Acquire a lease and open the file.
+ */
+struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
+					fmode_t fmode)
+{
+	struct lookup_intent it = { .it_op = IT_OPEN };
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct md_op_data *op_data;
+	struct ptlrpc_request *req;
+	struct lustre_handle old_handle = { 0 };
+	struct obd_client_handle *och = NULL;
+	int rc;
+	int rc2;
+
+	if (fmode != FMODE_WRITE && fmode != FMODE_READ)
+		return ERR_PTR(-EINVAL);
+
+	if (file != NULL) {
+		struct ll_inode_info *lli = ll_i2info(inode);
+		struct ll_file_data *fd = LUSTRE_FPRIVATE(file);
+		struct obd_client_handle **och_p;
+		__u64 *och_usecount;
+
+		if (!(fmode & file->f_mode) || (file->f_mode & FMODE_EXEC))
+			return ERR_PTR(-EPERM);
+
+		/* Get the openhandle of the file */
+		rc = -EBUSY;
+		mutex_lock(&lli->lli_och_mutex);
+		if (fd->fd_lease_och != NULL) {
+			mutex_unlock(&lli->lli_och_mutex);
+			return ERR_PTR(rc);
+		}
+
+		if (fd->fd_och == NULL) {
+			if (file->f_mode & FMODE_WRITE) {
+				LASSERT(lli->lli_mds_write_och != NULL);
+				och_p = &lli->lli_mds_write_och;
+				och_usecount = &lli->lli_open_fd_write_count;
+			} else {
+				LASSERT(lli->lli_mds_read_och != NULL);
+				och_p = &lli->lli_mds_read_och;
+				och_usecount = &lli->lli_open_fd_read_count;
+			}
+			if (*och_usecount == 1) {
+				fd->fd_och = *och_p;
+				*och_p = NULL;
+				*och_usecount = 0;
+				rc = 0;
+			}
+		}
+		mutex_unlock(&lli->lli_och_mutex);
+		if (rc < 0) /* more than 1 opener */
+			return ERR_PTR(rc);
+
+		LASSERT(fd->fd_och != NULL);
+		old_handle = fd->fd_och->och_fh;
+	}
+
+	OBD_ALLOC_PTR(och);
+	if (och == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	op_data = ll_prep_md_op_data(NULL, inode, inode, NULL, 0, 0,
+					LUSTRE_OPC_ANY, NULL);
+	if (IS_ERR(op_data))
+		GOTO(out, rc = PTR_ERR(op_data));
+
+	/* To tell the MDT this openhandle is from the same owner */
+	op_data->op_handle = old_handle;
+
+	it.it_flags = fmode | MDS_OPEN_LOCK | MDS_OPEN_BY_FID | MDS_OPEN_LEASE;
+	rc = md_intent_lock(sbi->ll_md_exp, op_data, NULL, 0, &it, 0, &req,
+				ll_md_blocking_lease_ast,
+	/* LDLM_FL_NO_LRU: To not put the lease lock into LRU list, otherwise
+	 * it can be cancelled which may mislead applications that the lease is
+	 * broken;
+	 * LDLM_FL_EXCL: Set this flag so that it won't be matched by normal
+	 * open in ll_md_blocking_ast(). Otherwise as ll_md_blocking_lease_ast
+	 * doesn't deal with openhandle, so normal openhandle will be leaked. */
+				LDLM_FL_NO_LRU | LDLM_FL_EXCL);
+	ll_finish_md_op_data(op_data);
+	if (req != NULL) {
+		ptlrpc_req_finished(req);
+		it_clear_disposition(&it, DISP_ENQ_COMPLETE);
+	}
+	if (rc < 0)
+		GOTO(out_release_it, rc);
+
+	if (it_disposition(&it, DISP_LOOKUP_NEG))
+		GOTO(out_release_it, rc = -ENOENT);
+
+	rc = it_open_error(DISP_OPEN_OPEN, &it);
+	if (rc)
+		GOTO(out_release_it, rc);
+
+	LASSERT(it_disposition(&it, DISP_ENQ_OPEN_REF));
+	ll_och_fill(sbi->ll_md_exp, &it, och);
+
+	if (!it_disposition(&it, DISP_OPEN_LEASE)) /* old server? */
+		GOTO(out_close, rc = -EOPNOTSUPP);
+
+	/* already get lease, handle lease lock */
+	ll_set_lock_data(sbi->ll_md_exp, inode, &it, NULL);
+	if (it.d.lustre.it_lock_mode == 0 ||
+	    it.d.lustre.it_lock_bits != MDS_INODELOCK_OPEN) {
+		/* open lock must return for lease */
+		CERROR(DFID "lease granted but no open lock, %d/%Lu.\n",
+			PFID(ll_inode2fid(inode)), it.d.lustre.it_lock_mode,
+			it.d.lustre.it_lock_bits);
+		GOTO(out_close, rc = -EPROTO);
+	}
+
+	ll_intent_release(&it);
+	return och;
+
+out_close:
+	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, inode, och);
+	if (rc2)
+		CERROR("Close openhandle returned %d\n", rc2);
+
+	/* cancel open lock */
+	if (it.d.lustre.it_lock_mode != 0) {
+		ldlm_lock_decref_and_cancel(&och->och_lease_handle,
+						it.d.lustre.it_lock_mode);
+		it.d.lustre.it_lock_mode = 0;
+	}
+out_release_it:
+	ll_intent_release(&it);
+out:
+	OBD_FREE_PTR(och);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL(ll_lease_open);
+
+/**
+ * Release lease and close the file.
+ * It will check if the lease has ever broken.
+ */
+int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
+			bool *lease_broken)
+{
+	struct ldlm_lock *lock;
+	bool cancelled = true;
+	int rc;
+
+	lock = ldlm_handle2lock(&och->och_lease_handle);
+	if (lock != NULL) {
+		lock_res_and_lock(lock);
+		cancelled = ldlm_is_cancel(lock);
+		unlock_res_and_lock(lock);
+		ldlm_lock_put(lock);
+	}
+
+	CDEBUG(D_INODE, "lease for "DFID" broken? %d\n",
+		PFID(&ll_i2info(inode)->lli_fid), cancelled);
+
+	if (!cancelled)
+		ldlm_cli_cancel(&och->och_lease_handle, 0);
+	if (lease_broken != NULL)
+		*lease_broken = cancelled;
+
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och);
+	return rc;
+}
+EXPORT_SYMBOL(ll_lease_close);
+
 /* Fills the obdo with the attributes for the lsm */
 static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
 			  struct obd_capa *capa, struct obdo *obdo,
@@ -2066,6 +2276,90 @@ long ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		OBD_FREE_PTR(hca);
 		return rc;
 	}
+	case LL_IOC_SET_LEASE: {
+		struct ll_inode_info *lli = ll_i2info(inode);
+		struct obd_client_handle *och = NULL;
+		bool lease_broken;
+		fmode_t mode = 0;
+
+		switch (arg) {
+		case F_WRLCK:
+			if (!(file->f_mode & FMODE_WRITE))
+				return -EPERM;
+			mode = FMODE_WRITE;
+			break;
+		case F_RDLCK:
+			if (!(file->f_mode & FMODE_READ))
+				return -EPERM;
+			mode = FMODE_READ;
+			break;
+		case F_UNLCK:
+			mutex_lock(&lli->lli_och_mutex);
+			if (fd->fd_lease_och != NULL) {
+				och = fd->fd_lease_och;
+				fd->fd_lease_och = NULL;
+			}
+			mutex_unlock(&lli->lli_och_mutex);
+
+			if (och != NULL) {
+				mode = och->och_flags &(FMODE_READ|FMODE_WRITE);
+				rc = ll_lease_close(och, inode, &lease_broken);
+				if (rc == 0 && lease_broken)
+					mode = 0;
+			} else {
+				rc = -ENOLCK;
+			}
+
+			/* return the type of lease or error */
+			return rc < 0 ? rc : (int)mode;
+		default:
+			return -EINVAL;
+		}
+
+		CDEBUG(D_INODE, "Set lease with mode %d\n", mode);
+
+		/* apply for lease */
+		och = ll_lease_open(inode, file, mode);
+		if (IS_ERR(och))
+			return PTR_ERR(och);
+
+		rc = 0;
+		mutex_lock(&lli->lli_och_mutex);
+		if (fd->fd_lease_och == NULL) {
+			fd->fd_lease_och = och;
+			och = NULL;
+		}
+		mutex_unlock(&lli->lli_och_mutex);
+		if (och != NULL) {
+			/* impossible now that only excl is supported for now */
+			ll_lease_close(och, inode, &lease_broken);
+			rc = -EBUSY;
+		}
+		return rc;
+	}
+	case LL_IOC_GET_LEASE: {
+		struct ll_inode_info *lli = ll_i2info(inode);
+		struct ldlm_lock *lock = NULL;
+
+		rc = 0;
+		mutex_lock(&lli->lli_och_mutex);
+		if (fd->fd_lease_och != NULL) {
+			struct obd_client_handle *och = fd->fd_lease_och;
+
+			lock = ldlm_handle2lock(&och->och_lease_handle);
+			if (lock != NULL) {
+				lock_res_and_lock(lock);
+				if (!ldlm_is_cancel(lock))
+					rc = och->och_flags &
+						(FMODE_READ | FMODE_WRITE);
+				unlock_res_and_lock(lock);
+				ldlm_lock_put(lock);
+			}
+		}
+		mutex_unlock(&lli->lli_och_mutex);
+
+		return rc;
+	}
 	default: {
 		int err;
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 3fedf1b..2debb24 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -609,10 +609,14 @@ extern struct kmem_cache *ll_file_data_slab;
 struct lustre_handle;
 struct ll_file_data {
 	struct ll_readahead_state fd_ras;
-	int fd_omode;
 	struct ccc_grouplock fd_grouplock;
 	__u64 lfd_pos;
 	__u32 fd_flags;
+	fmode_t fd_omode;
+	/* openhandle if lease exists for this file.
+	 * Borrow lli->lli_och_mutex to protect assignment */
+	struct obd_client_handle *fd_lease_och;
+	struct obd_client_handle *fd_och;
 	struct file *fd_file;
 	/* Indicate whether need to report failure when close.
 	 * true: failure is known, not report again.
@@ -778,6 +782,11 @@ int ll_put_grouplock(struct inode *inode, struct file *file, unsigned long arg);
 int ll_fid2path(struct inode *inode, void *arg);
 int ll_data_version(struct inode *inode, __u64 *data_version, int extent_lock);
 
+struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
+					fmode_t mode);
+int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
+		   bool *lease_broken);
+
 /* llite/dcache.c */
 
 int ll_dops_init(struct dentry *de, int block, int init_sa);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_internal.h b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
index 2aeff0e..b995af6 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_internal.h
+++ b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
@@ -69,7 +69,7 @@ void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 		     const void *data, int datalen, __u32 mode, __u32 uid,
 		     __u32 gid, cfs_cap_t capability, __u64 rdev);
 void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
-		   __u32 mode, __u64 rdev, __u32 flags, const void *data,
+		   __u32 mode, __u64 rdev, __u64 flags, const void *data,
 		   int datalen);
 void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
 void mdc_link_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index b2de478..3e77020 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -174,12 +174,12 @@ void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	}
 }
 
-static __u64 mds_pack_open_flags(__u32 flags, __u32 mode)
+static __u64 mds_pack_open_flags(__u64 flags, __u32 mode)
 {
 	__u64 cr_flags = (flags & (FMODE_READ | FMODE_WRITE |
 				   MDS_OPEN_HAS_EA | MDS_OPEN_HAS_OBJS |
 				   MDS_OPEN_OWNEROVERRIDE | MDS_OPEN_LOCK |
-				   MDS_OPEN_BY_FID));
+				   MDS_OPEN_BY_FID | MDS_OPEN_LEASE));
 	if (flags & O_CREAT)
 		cr_flags |= MDS_OPEN_CREAT;
 	if (flags & O_EXCL)
@@ -207,7 +207,7 @@ static __u64 mds_pack_open_flags(__u32 flags, __u32 mode)
 
 /* packing of MDS records */
 void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
-		   __u32 mode, __u64 rdev, __u32 flags, const void *lmm,
+		   __u32 mode, __u64 rdev, __u64 flags, const void *lmm,
 		   int lmmlen)
 {
 	struct mdt_rec_create *rec;
@@ -234,6 +234,7 @@ void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec->cr_suppgid2 = op_data->op_suppgids[1];
 	rec->cr_bias     = op_data->op_bias;
 	rec->cr_umask    = current_umask();
+	rec->cr_old_handle = op_data->op_handle;
 
 	mdc_pack_capa(req, &RMF_CAPA1, op_data->op_capa1);
 	/* the next buffer is child capa, which is used for replay,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index fb5a995..3fc0697 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -75,6 +75,12 @@ EXPORT_SYMBOL(it_clear_disposition);
 
 int it_open_error(int phase, struct lookup_intent *it)
 {
+	if (it_disposition(it, DISP_OPEN_LEASE)) {
+		if (phase >= DISP_OPEN_LEASE)
+			return it->d.lustre.it_status;
+		else
+			return 0;
+	}
 	if (it_disposition(it, DISP_OPEN_OPEN)) {
 		if (phase >= DISP_OPEN_OPEN)
 			return it->d.lustre.it_status;
@@ -281,14 +287,21 @@ static struct ptlrpc_request *mdc_intent_open_pack(struct obd_export *exp,
 	/* XXX: openlock is not cancelled for cross-refs. */
 	/* If inode is known, cancel conflicting OPEN locks. */
 	if (fid_is_sane(&op_data->op_fid2)) {
-		if (it->it_flags & (FMODE_WRITE|MDS_OPEN_TRUNC))
-			mode = LCK_CW;
+		if (it->it_flags & MDS_OPEN_LEASE) { /* try to get lease */
+			if (it->it_flags & FMODE_WRITE)
+				mode = LCK_EX;
+			else
+				mode = LCK_PR;
+		} else {
+			if (it->it_flags & (FMODE_WRITE|MDS_OPEN_TRUNC))
+				mode = LCK_CW;
 #ifdef FMODE_EXEC
-		else if (it->it_flags & FMODE_EXEC)
-			mode = LCK_PR;
+			else if (it->it_flags & FMODE_EXEC)
+				mode = LCK_PR;
 #endif
-		else
-			mode = LCK_CR;
+			else
+				mode = LCK_CR;
+		}
 		count = mdc_resource_get_unused(exp, &op_data->op_fid2,
 						&cancels, mode,
 						MDS_INODELOCK_OPEN);
@@ -1065,10 +1078,10 @@ int mdc_intent_lock(struct obd_export *exp, struct md_op_data *op_data,
 	LASSERT(it);
 
 	CDEBUG(D_DLMTRACE, "(name: %.*s,"DFID") in obj "DFID
-	       ", intent: %s flags %#o\n", op_data->op_namelen,
-	       op_data->op_name, PFID(&op_data->op_fid2),
-	       PFID(&op_data->op_fid1), ldlm_it2str(it->it_op),
-	       it->it_flags);
+		", intent: %s flags %#Lo\n", op_data->op_namelen,
+		op_data->op_name, PFID(&op_data->op_fid2),
+		PFID(&op_data->op_fid1), ldlm_it2str(it->it_op),
+		it->it_flags);
 
 	lockh.cookie = 0;
 	if (fid_is_sane(&op_data->op_fid2) &&
@@ -1194,9 +1207,9 @@ int mdc_intent_getattr_async(struct obd_export *exp,
 	int		      rc = 0;
 	__u64		    flags = LDLM_FL_HAS_INTENT;
 
-	CDEBUG(D_DLMTRACE,"name: %.*s in inode "DFID", intent: %s flags %#o\n",
-	       op_data->op_namelen, op_data->op_name, PFID(&op_data->op_fid1),
-	       ldlm_it2str(it->it_op), it->it_flags);
+	CDEBUG(D_DLMTRACE,"name: %.*s in inode "DFID", intent: %s flags %#Lo\n",
+		op_data->op_namelen, op_data->op_name, PFID(&op_data->op_fid1),
+		ldlm_it2str(it->it_op), it->it_flags);
 
 	fid_build_reg_res_name(&op_data->op_fid1, &res_id);
 	req = mdc_intent_getattr_pack(exp, it, op_data);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 08/40] staging/lustre/lnet: Fix assert on empty group in selftest module
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (6 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 07/40] staging/lustre/hsm: Implementation of exclusive open Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 09/40] staging/lustre/ldlm: fix resource/fid check, use DLDLMRES Peng Tao
                   ` (32 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Amir Shehata, Peng Tao, Andreas Dilger

From: Amir Shehata <amir.shehata@intel.com>

The core of the issue is that the selftest module doesn't sanitize its
own API, but it depends on lst utility to do such checks.  As a result
this issue manifests itself in this particular LU through an assert
on an empty group.  If the NID is misspelled then an empty group is
added.  An error output is provided, but if that's never checked in a
batch script, as is the case with this issue, then the script will try
to add an empty group to a test to run in a batch, and that will cause
an assert

The fix is two fold.  Ensure that lst utility checks that a group is
added with at least one node.  If not the group is subsequently
deleted.  And the add_test command would fail, since the group no
longer exists.

The second fix is to ensure that the kernel module itself sanitizes
its own API in this particular case, so that if a different utility is
used other than lst to communicate with the selftest kernel module
then this error would be caught.  This fix looks up the batch and the
groups, src and dst, in the ioctl handle and sanitizes that input at
this point.  If the group looked up either doesn't exist or doesn't
have at least one ACTIVE node, then the command fails.

NOTE:there are many other cases in the code where the selftest kernel
module doesn't check for sanity of the input, but depends totally on
the lst module to do such checks.  Particularly around length of
strings passed in.  Thus it is possible to crash the selftest module
if someone tries to create another userspace app to communicate with
the selftest kernel module without ensuring sanity of the params sent
to the kernel module.  In effect, it's always assumed that lst is the
front end for selftest and no other front end is to be used.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3093
Lustre-change: http://review.whamcloud.com/6092
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lnet/selftest/conctl.c  |   56 ++++++-------
 drivers/staging/lustre/lnet/selftest/conrpc.c  |    2 +-
 drivers/staging/lustre/lnet/selftest/console.c |  105 ++++++++++++++++--------
 drivers/staging/lustre/lnet/selftest/console.h |    6 +-
 4 files changed, 102 insertions(+), 67 deletions(-)

diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
index bce3d3b..cbc416d 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -723,12 +723,12 @@ lst_stat_query_ioctl(lstio_stat_args_t *args)
 
 int lst_test_add_ioctl(lstio_test_args_t *args)
 {
-	char	   *name;
-	char	   *srcgrp = NULL;
-	char	   *dstgrp = NULL;
-	void	   *param = NULL;
-	int	     ret = 0;
-	int	     rc = -ENOMEM;
+	char		*batch_name;
+	char		*src_name = NULL;
+	char		*dst_name = NULL;
+	void		*param = NULL;
+	int		ret = 0;
+	int		rc = -ENOMEM;
 
 	if (args->lstio_tes_resultp == NULL ||
 	    args->lstio_tes_retp == NULL ||
@@ -755,16 +755,16 @@ int lst_test_add_ioctl(lstio_test_args_t *args)
 	     args->lstio_tes_param_len > PAGE_CACHE_SIZE - sizeof(lstcon_test_t)))
 		return -EINVAL;
 
-	LIBCFS_ALLOC(name, args->lstio_tes_bat_nmlen + 1);
-	if (name == NULL)
+	LIBCFS_ALLOC(batch_name, args->lstio_tes_bat_nmlen + 1);
+	if (batch_name == NULL)
 		return rc;
 
-	LIBCFS_ALLOC(srcgrp, args->lstio_tes_sgrp_nmlen + 1);
-	if (srcgrp == NULL)
+	LIBCFS_ALLOC(src_name, args->lstio_tes_sgrp_nmlen + 1);
+	if (src_name == NULL)
 		goto out;
 
-	LIBCFS_ALLOC(dstgrp, args->lstio_tes_dgrp_nmlen + 1);
-	 if (dstgrp == NULL)
+	LIBCFS_ALLOC(dst_name, args->lstio_tes_dgrp_nmlen + 1);
+	 if (dst_name == NULL)
 		goto out;
 
 	if (args->lstio_tes_param != NULL) {
@@ -774,39 +774,37 @@ int lst_test_add_ioctl(lstio_test_args_t *args)
 	}
 
 	rc = -EFAULT;
-	if (copy_from_user(name,
-			      args->lstio_tes_bat_name,
-			      args->lstio_tes_bat_nmlen) ||
-	    copy_from_user(srcgrp,
-			      args->lstio_tes_sgrp_name,
-			      args->lstio_tes_sgrp_nmlen) ||
-	    copy_from_user(dstgrp,
-			      args->lstio_tes_dgrp_name,
-			      args->lstio_tes_dgrp_nmlen) ||
+	if (copy_from_user(batch_name, args->lstio_tes_bat_name,
+			   args->lstio_tes_bat_nmlen) ||
+	    copy_from_user(src_name, args->lstio_tes_sgrp_name,
+			   args->lstio_tes_sgrp_nmlen) ||
+	    copy_from_user(dst_name, args->lstio_tes_dgrp_name,
+			   args->lstio_tes_dgrp_nmlen) ||
 	    copy_from_user(param, args->lstio_tes_param,
 			      args->lstio_tes_param_len))
 		goto out;
 
-	rc = lstcon_test_add(name,
+	rc = lstcon_test_add(batch_name,
 			    args->lstio_tes_type,
 			    args->lstio_tes_loop,
 			    args->lstio_tes_concur,
 			    args->lstio_tes_dist, args->lstio_tes_span,
-			    srcgrp, dstgrp, param, args->lstio_tes_param_len,
+			    src_name, dst_name, param,
+			    args->lstio_tes_param_len,
 			    &ret, args->lstio_tes_resultp);
 
 	if (ret != 0)
 		rc = (copy_to_user(args->lstio_tes_retp, &ret,
 				       sizeof(ret))) ? -EFAULT : 0;
 out:
-	if (name != NULL)
-		LIBCFS_FREE(name, args->lstio_tes_bat_nmlen + 1);
+	if (batch_name != NULL)
+		LIBCFS_FREE(batch_name, args->lstio_tes_bat_nmlen + 1);
 
-	if (srcgrp != NULL)
-		LIBCFS_FREE(srcgrp, args->lstio_tes_sgrp_nmlen + 1);
+	if (src_name != NULL)
+		LIBCFS_FREE(src_name, args->lstio_tes_sgrp_nmlen + 1);
 
-	if (dstgrp != NULL)
-		LIBCFS_FREE(dstgrp, args->lstio_tes_dgrp_nmlen + 1);
+	if (dst_name != NULL)
+		LIBCFS_FREE(dst_name, args->lstio_tes_dgrp_nmlen + 1);
 
 	if (param != NULL)
 		LIBCFS_FREE(param, args->lstio_tes_param_len);
diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c b/drivers/staging/lustre/lnet/selftest/conrpc.c
index 9a52f25..4d48c77 100644
--- a/drivers/staging/lustre/lnet/selftest/conrpc.c
+++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
@@ -311,7 +311,7 @@ lstcon_rpc_trans_abort(lstcon_rpc_trans_t *trans, int error)
 
 		sfw_abort_rpc(rpc);
 
-		if  (error != ETIMEDOUT)
+		if (error != -ETIMEDOUT)
 			continue;
 
 		nd = crpc->crp_node;
diff --git a/drivers/staging/lustre/lnet/selftest/console.c b/drivers/staging/lustre/lnet/selftest/console.c
index f1152e4..d498e3e 100644
--- a/drivers/staging/lustre/lnet/selftest/console.c
+++ b/drivers/staging/lustre/lnet/selftest/console.c
@@ -265,7 +265,7 @@ lstcon_group_decref(lstcon_group_t *grp)
 }
 
 static int
-lstcon_group_find(char *name, lstcon_group_t **grpp)
+lstcon_group_find(const char *name, lstcon_group_t **grpp)
 {
 	lstcon_group_t   *grp;
 
@@ -831,7 +831,7 @@ lstcon_group_info(char *name, lstcon_ndlist_ent_t *gents_p,
 }
 
 int
-lstcon_batch_find(char *name, lstcon_batch_t **batpp)
+lstcon_batch_find(const char *name, lstcon_batch_t **batpp)
 {
 	lstcon_batch_t   *bat;
 
@@ -1237,41 +1237,77 @@ again:
 	goto again;
 }
 
-int
-lstcon_test_add(char *name, int type, int loop, int concur,
-		int dist, int span, char *src_name, char * dst_name,
-		void *param, int paramlen, int *retp,
-		struct list_head *result_up)
+static int
+lstcon_verify_batch(const char *name, lstcon_batch_t **batch)
 {
-	lstcon_group_t  *src_grp = NULL;
-	lstcon_group_t  *dst_grp = NULL;
-	lstcon_test_t   *test    = NULL;
-	lstcon_batch_t  *batch;
-	int	      rc;
+	int rc;
 
-	rc = lstcon_batch_find(name, &batch);
+	rc = lstcon_batch_find(name, batch);
 	if (rc != 0) {
 		CDEBUG(D_NET, "Can't find batch %s\n", name);
 		return rc;
 	}
 
-	if (batch->bat_state != LST_BATCH_IDLE) {
+	if ((*batch)->bat_state != LST_BATCH_IDLE) {
 		CDEBUG(D_NET, "Can't change running batch %s\n", name);
-		return rc;
+		return -EINVAL;
 	}
 
-	rc = lstcon_group_find(src_name, &src_grp);
+	return 0;
+}
+
+static int
+lstcon_verify_group(const char *name, lstcon_group_t **grp)
+{
+	int			rc;
+	lstcon_ndlink_t		*ndl;
+
+	rc = lstcon_group_find(name, grp);
 	if (rc != 0) {
-		CDEBUG(D_NET, "Can't find group %s\n", src_name);
-		goto out;
+		CDEBUG(D_NET, "can't find group %s\n", name);
+		return rc;
 	}
 
-	rc = lstcon_group_find(dst_name, &dst_grp);
-	if (rc != 0) {
-		CDEBUG(D_NET, "Can't find group %s\n", dst_name);
-		goto out;
+	list_for_each_entry(ndl, &(*grp)->grp_ndl_list, ndl_link) {
+		if (ndl->ndl_node->nd_state == LST_NODE_ACTIVE)
+			return 0;
 	}
 
+	CDEBUG(D_NET, "Group %s has no ACTIVE nodes\n", name);
+
+	return -EINVAL;
+}
+
+int
+lstcon_test_add(char *batch_name, int type, int loop,
+		int concur, int dist, int span,
+		char *src_name, char *dst_name,
+		void *param, int paramlen, int *retp,
+		struct list_head *result_up)
+{
+	lstcon_test_t	 *test	 = NULL;
+	int		 rc;
+	lstcon_group_t	 *src_grp = NULL;
+	lstcon_group_t	 *dst_grp = NULL;
+	lstcon_batch_t	 *batch = NULL;
+
+	/*
+	 * verify that a batch of the given name exists, and the groups
+	 * that will be part of the batch exist and have at least one
+	 * active node
+	 */
+	rc = lstcon_verify_batch(batch_name, &batch);
+	if (rc != 0)
+		goto out;
+
+	rc = lstcon_verify_group(src_name, &src_grp);
+	if (rc != 0)
+		goto out;
+
+	rc = lstcon_verify_group(dst_name, &dst_grp);
+	if (rc != 0)
+		goto out;
+
 	if (dst_grp->grp_userland)
 		*retp = 1;
 
@@ -1284,18 +1320,18 @@ lstcon_test_add(char *name, int type, int loop, int concur,
 	}
 
 	memset(test, 0, offsetof(lstcon_test_t, tes_param[paramlen]));
-	test->tes_hdr.tsb_id    = batch->bat_hdr.tsb_id;
-	test->tes_batch	 = batch;
-	test->tes_type	  = type;
-	test->tes_oneside       = 0; /* TODO */
-	test->tes_loop	  = loop;
+	test->tes_hdr.tsb_id	= batch->bat_hdr.tsb_id;
+	test->tes_batch		= batch;
+	test->tes_type		= type;
+	test->tes_oneside	= 0; /* TODO */
+	test->tes_loop		= loop;
 	test->tes_concur	= concur;
-	test->tes_stop_onerr    = 1; /* TODO */
-	test->tes_span	  = span;
-	test->tes_dist	  = dist;
+	test->tes_stop_onerr	= 1; /* TODO */
+	test->tes_span		= span;
+	test->tes_dist		= dist;
 	test->tes_cliidx	= 0; /* just used for creating RPC */
-	test->tes_src_grp       = src_grp;
-	test->tes_dst_grp       = dst_grp;
+	test->tes_src_grp	= src_grp;
+	test->tes_dst_grp	= dst_grp;
 	INIT_LIST_HEAD(&test->tes_trans_list);
 
 	if (param != NULL) {
@@ -1310,12 +1346,13 @@ lstcon_test_add(char *name, int type, int loop, int concur,
 
 	if (lstcon_trans_stat()->trs_rpc_errno != 0 ||
 	    lstcon_trans_stat()->trs_fwk_errno != 0)
-		CDEBUG(D_NET, "Failed to add test %d to batch %s\n", type, name);
+		CDEBUG(D_NET, "Failed to add test %d to batch %s\n", type,
+		       batch_name);
 
 	/* add to test list anyway, so user can check what's going on */
 	list_add_tail(&test->tes_link, &batch->bat_test_list);
 
-	batch->bat_ntest ++;
+	batch->bat_ntest++;
 	test->tes_hdr.tsb_index = batch->bat_ntest;
 
 	/*  hold groups so nobody can change them */
diff --git a/drivers/staging/lustre/lnet/selftest/console.h b/drivers/staging/lustre/lnet/selftest/console.h
index e61b266..b57dbd8 100644
--- a/drivers/staging/lustre/lnet/selftest/console.h
+++ b/drivers/staging/lustre/lnet/selftest/console.h
@@ -224,9 +224,9 @@ extern int lstcon_group_stat(char *grp_name, int timeout,
 			     struct list_head *result_up);
 extern int lstcon_nodes_stat(int count, lnet_process_id_t *ids_up,
 			     int timeout, struct list_head *result_up);
-extern int lstcon_test_add(char *name, int type, int loop, int concur,
-			   int dist, int span, char *src_name, char * dst_name,
+extern int lstcon_test_add(char *batch_name, int type, int loop,
+			   int concur, int dist, int span,
+			   char *src_name, char *dst_name,
 			   void *param, int paramlen, int *retp,
 			   struct list_head *result_up);
-
 #endif
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 09/40] staging/lustre/ldlm: fix resource/fid check, use DLDLMRES
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (7 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 08/40] staging/lustre/lnet: Fix assert on empty group in selftest module Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 10/40] staging/lustre/server: use unified request handler for MGS Peng Tao
                   ` (31 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Andreas Dilger, Lai Siyao, Peng Tao

From: Andreas Dilger <andreas.dilger@intel.com>

In ll_md_blocking_ast() the FID/resource comparison is incorrectly
checking for fid_ver() stored in res_id.name[2] instead of name[1]
changed since http://review.whamcloud.com/2271 (commit 4f91d5161d00)
landed.  This does not impact current clients, since name[2] and
fid_ver() are always zero, but it could cause problems in the future.

In ldlm_cli_enqueue_fini() use ldlm_res_eq() instead of comparing
each of the resource fields separately.

Use DLDLMRES/PLDLMRES when printing resource names everywhere.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2901
Lustre-change: http://review.whamcloud.com/6592
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |   32 +++++++-------------
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |   19 +++---------
 drivers/staging/lustre/lustre/llite/namei.c        |    5 +--
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |    9 ++----
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    4 +--
 5 files changed, 21 insertions(+), 48 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 1e2c0dd..1ddcca3 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -610,18 +610,12 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req,
 			lock->l_req_mode = newmode;
 		}
 
-		if (memcmp(reply->lock_desc.l_resource.lr_name.name,
-			  lock->l_resource->lr_name.name,
-			  sizeof(struct ldlm_res_id))) {
-			CDEBUG(D_INFO, "remote intent success, locking "
-					"(%ld,%ld,%ld) instead of "
-					"(%ld,%ld,%ld)\n",
-			      (long)reply->lock_desc.l_resource.lr_name.name[0],
-			      (long)reply->lock_desc.l_resource.lr_name.name[1],
-			      (long)reply->lock_desc.l_resource.lr_name.name[2],
-			      (long)lock->l_resource->lr_name.name[0],
-			      (long)lock->l_resource->lr_name.name[1],
-			      (long)lock->l_resource->lr_name.name[2]);
+		if (!ldlm_res_eq(&reply->lock_desc.l_resource.lr_name,
+				 &lock->l_resource->lr_name)) {
+			CDEBUG(D_INFO, "remote intent success, locking "DLDLMRES
+				       " instead of "DLDLMRES"\n",
+			       PLDLMRES(&reply->lock_desc.l_resource),
+			       PLDLMRES(lock->l_resource));
 
 			rc = ldlm_lock_change_resource(ns, lock,
 					&reply->lock_desc.l_resource.lr_name);
@@ -1912,7 +1906,8 @@ int ldlm_cli_cancel_unused_resource(struct ldlm_namespace *ns,
 					   0, flags | LCF_BL_AST, opaque);
 	rc = ldlm_cli_cancel_list(&cancels, count, NULL, flags);
 	if (rc != ELDLM_OK)
-		CERROR("ldlm_cli_cancel_unused_resource: %d\n", rc);
+		CERROR("canceling unused lock "DLDLMRES": rc = %d\n",
+		       PLDLMRES(res), rc);
 
 	LDLM_RESOURCE_DELREF(res);
 	ldlm_resource_putref(res);
@@ -1930,15 +1925,10 @@ static int ldlm_cli_hash_cancel_unused(struct cfs_hash *hs, struct cfs_hash_bd *
 {
 	struct ldlm_resource	   *res = cfs_hash_object(hs, hnode);
 	struct ldlm_cli_cancel_arg     *lc = arg;
-	int			     rc;
 
-	rc = ldlm_cli_cancel_unused_resource(ldlm_res_to_ns(res), &res->lr_name,
-					     NULL, LCK_MINMODE,
-					     lc->lc_flags, lc->lc_opaque);
-	if (rc != 0) {
-		CERROR("ldlm_cli_cancel_unused ("LPU64"): %d\n",
-		       res->lr_name.name[0], rc);
-	}
+	ldlm_cli_cancel_unused_resource(ldlm_res_to_ns(res), &res->lr_name,
+					NULL, LCK_MINMODE,
+					lc->lc_flags, lc->lc_opaque);
 	/* must return 0 for hash iteration */
 	return 0;
 }
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index 77e022b..2fdd938 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -762,16 +762,9 @@ static int ldlm_resource_complain(struct cfs_hash *hs, struct cfs_hash_bd *bd,
 	struct ldlm_resource  *res = cfs_hash_object(hs, hnode);
 
 	lock_res(res);
-	CERROR("Namespace %s resource refcount nonzero "
-	       "(%d) after lock cleanup; forcing "
-	       "cleanup.\n",
-	       ldlm_ns_name(ldlm_res_to_ns(res)),
-	       atomic_read(&res->lr_refcount) - 1);
-
-	CERROR("Resource: %p ("LPU64"/"LPU64"/"LPU64"/"
-	       LPU64") (rc: %d)\n", res,
-	       res->lr_name.name[0], res->lr_name.name[1],
-	       res->lr_name.name[2], res->lr_name.name[3],
+	CERROR("%s: namespace resource "DLDLMRES" (%p) refcount nonzero "
+	       "(%d) after lock cleanup; forcing cleanup.\n",
+	       ldlm_ns_name(ldlm_res_to_ns(res)), PLDLMRES(res), res,
 	       atomic_read(&res->lr_refcount) - 1);
 
 	ldlm_resource_dump(D_ERROR, res);
@@ -1403,10 +1396,8 @@ void ldlm_resource_dump(int level, struct ldlm_resource *res)
 	if (!((libcfs_debug | D_ERROR) & level))
 		return;
 
-	CDEBUG(level, "--- Resource: %p ("LPU64"/"LPU64"/"LPU64"/"LPU64
-	       ") (rc: %d)\n", res, res->lr_name.name[0], res->lr_name.name[1],
-	       res->lr_name.name[2], res->lr_name.name[3],
-	       atomic_read(&res->lr_refcount));
+	CDEBUG(level, "--- Resource: "DLDLMRES" (%p) refcount = %d\n",
+	       PLDLMRES(res), res, atomic_read(&res->lr_refcount));
 
 	if (!list_empty(&res->lr_granted)) {
 		CDEBUG(level, "Granted locks (in reverse order):\n");
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 34815b5..8377468 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -233,12 +233,9 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 			ll_have_md_lock(inode, &bits, mode);
 
 		fid = ll_inode2fid(inode);
-		if (lock->l_resource->lr_name.name[0] != fid_seq(fid) ||
-		    lock->l_resource->lr_name.name[1] != fid_oid(fid) ||
-		    lock->l_resource->lr_name.name[2] != fid_ver(fid)) {
+		if (!fid_res_name_eq(fid, &lock->l_resource->lr_name))
 			LDLM_ERROR(lock, "data mismatch with object "
 				   DFID" (%p)", PFID(fid), inode);
-		}
 
 		if (bits & MDS_INODELOCK_OPEN) {
 			int flags = 0;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 3fc0697..8bc9007 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -971,13 +971,8 @@ static int mdc_finish_intent_lock(struct obd_export *exp,
 
 		LASSERTF(fid_res_name_eq(&mdt_body->fid1,
 					 &lock->l_resource->lr_name),
-			 "Lock res_id: %lu/%lu/%lu, fid: %lu/%lu/%lu.\n",
-			 (unsigned long)lock->l_resource->lr_name.name[0],
-			 (unsigned long)lock->l_resource->lr_name.name[1],
-			 (unsigned long)lock->l_resource->lr_name.name[2],
-			 (unsigned long)fid_seq(&mdt_body->fid1),
-			 (unsigned long)fid_oid(&mdt_body->fid1),
-			 (unsigned long)fid_ver(&mdt_body->fid1));
+			 "Lock res_id: "DLDLMRES", fid: "DFID"\n",
+			 PLDLMRES(lock->l_resource), PFID(&mdt_body->fid1));
 		LDLM_LOCK_PUT(lock);
 
 		memcpy(&old_lock, lockh, sizeof(*lockh));
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 12a9ede..93b601d 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -788,8 +788,8 @@ static int mgc_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 		/* We've given up the lock, prepare ourselves to update. */
 		LDLM_DEBUG(lock, "MGC cancel CB");
 
-		CDEBUG(D_MGC, "Lock res "LPX64" (%.8s)\n",
-		       lock->l_resource->lr_name.name[0],
+		CDEBUG(D_MGC, "Lock res "DLDLMRES" (%.8s)\n",
+		       PLDLMRES(lock->l_resource),
 		       (char *)&lock->l_resource->lr_name.name[0]);
 
 		if (!cld) {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 10/40] staging/lustre/server: use unified request handler for MGS
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (8 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 09/40] staging/lustre/ldlm: fix resource/fid check, use DLDLMRES Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 11/40] staging/lustre/llog: MGC to use OSD API for backup logs Peng Tao
                   ` (30 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Mikhail Pershin, Mikhail Pershin, Peng Tao, Andreas Dilger

From: Mikhail Pershin <tappro@whamcloud.com>

- Unify request handler. It finds target for particular request and
  calls appropriate handler for it. Generic handlers are moved to
  the unified target code. The tgt_session_info is introduced to
  store all request-related data and passed to all handlers.
- Pack reply in llog server functions early and use err_serious()
- remove obsoleted llog_origin_handle_cancel(), it is not used
  anymore
- remove push_ctxt/pop_ctxt from llog server function, it is based
  on OSD now.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2145
Lustre-change: http://review.whamcloud.com/4826
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre_req_layout.h      |    2 ++
 .../staging/lustre/lustre/include/obd_support.h    |    7 +++++++
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    7 ++++++-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c    |    4 ++--
 4 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index f4d3820..2832569 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -245,6 +245,8 @@ extern struct req_format RQF_LLOG_ORIGIN_HANDLE_PREV_BLOCK;
 extern struct req_format RQF_LLOG_ORIGIN_HANDLE_READ_HEADER;
 extern struct req_format RQF_LLOG_ORIGIN_CONNECT;
 
+extern struct req_format RQF_CONNECT;
+
 extern struct req_msg_field RMF_GENERIC_DATA;
 extern struct req_msg_field RMF_PTLRPC_BODY;
 extern struct req_msg_field RMF_MDT_BODY;
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 9697e7f..825a0c9 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -416,6 +416,13 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_MGC_PAUSE_PROCESS_LOG   0x903
 #define OBD_FAIL_MGS_PAUSE_REQ	   0x904
 #define OBD_FAIL_MGS_PAUSE_TARGET_REG    0x905
+#define OBD_FAIL_MGS_CONNECT_NET	 0x906
+#define OBD_FAIL_MGS_DISCONNECT_NET	 0x907
+#define OBD_FAIL_MGS_SET_INFO_NET	 0x908
+#define OBD_FAIL_MGS_EXCEPTION_NET	 0x909
+#define OBD_FAIL_MGS_TARGET_REG_NET	 0x90a
+#define OBD_FAIL_MGS_TARGET_DEL_NET	 0x90b
+#define OBD_FAIL_MGS_CONFIG_READ_NET	 0x90c
 
 #define OBD_FAIL_QUOTA_DQACQ_NET			0xA01
 #define OBD_FAIL_QUOTA_EDQUOT	    0xA02
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index d0a6e56..e7524f9 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -738,7 +738,8 @@ static struct req_format *req_formats[] = {
 	&RQF_LLOG_ORIGIN_HANDLE_NEXT_BLOCK,
 	&RQF_LLOG_ORIGIN_HANDLE_PREV_BLOCK,
 	&RQF_LLOG_ORIGIN_HANDLE_READ_HEADER,
-	&RQF_LLOG_ORIGIN_CONNECT
+	&RQF_LLOG_ORIGIN_CONNECT,
+	&RQF_CONNECT,
 };
 
 struct req_msg_field {
@@ -1504,6 +1505,10 @@ struct req_format RQF_LLOG_ORIGIN_CONNECT =
 	DEFINE_REQ_FMT0("LLOG_ORIGIN_CONNECT", llogd_conn_body_only, empty);
 EXPORT_SYMBOL(RQF_LLOG_ORIGIN_CONNECT);
 
+struct req_format RQF_CONNECT =
+	DEFINE_REQ_FMT0("CONNECT", obd_connect_client, obd_connect_server);
+EXPORT_SYMBOL(RQF_CONNECT);
+
 struct req_format RQF_OST_CONNECT =
 	DEFINE_REQ_FMT0("OST_CONNECT",
 			obd_connect_client, obd_connect_server);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c b/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c
index bea44a3..537c540 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c
@@ -114,10 +114,10 @@ struct ll_rpc_opcode {
 	{ MGS_SET_INFO,     "mgs_set_info" },
 	{ MGS_CONFIG_READ,  "mgs_config_read" },
 	{ OBD_PING,	 "obd_ping" },
-	{ OBD_LOG_CANCEL,   "llog_origin_handle_cancel" },
+	{ OBD_LOG_CANCEL,	"llog_cancel" },
 	{ OBD_QC_CALLBACK,  "obd_quota_callback" },
 	{ OBD_IDX_READ,	    "dt_index_read" },
-	{ LLOG_ORIGIN_HANDLE_CREATE,     "llog_origin_handle_create" },
+	{ LLOG_ORIGIN_HANDLE_CREATE,	 "llog_origin_handle_open" },
 	{ LLOG_ORIGIN_HANDLE_NEXT_BLOCK, "llog_origin_handle_next_block" },
 	{ LLOG_ORIGIN_HANDLE_READ_HEADER,"llog_origin_handle_read_header" },
 	{ LLOG_ORIGIN_HANDLE_WRITE_REC,  "llog_origin_handle_write_rec" },
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 11/40] staging/lustre/llog: MGC to use OSD API for backup logs
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (9 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 10/40] staging/lustre/server: use unified request handler for MGS Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
                   ` (29 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Mikhail Pershin, Peng Tao, Andreas Dilger

From: Mikhail Pershin <mike.pershin@intel.com>

MGC uses lvfs API to access local llogs blocking removal of old code

- MGS is converted to use OSD API for local llogs
- llog_is_empty() and llog_backup() are introduced
- initial OSD start initialize run ctxt so llog can use it too
- shrink dcache after initial scrub in osd-ldiskfs to avoid holding
  data of unlinked objects until umount.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2059
Lustre-change: http://review.whamcloud.com/5049
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/dt_object.h  |    2 +-
 drivers/staging/lustre/lustre/include/lustre_log.h |   13 +-
 drivers/staging/lustre/lustre/include/obd.h        |    4 +-
 drivers/staging/lustre/lustre/mgc/libmgc.c         |    3 -
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |  401 ++++++++++++--------
 drivers/staging/lustre/lustre/obdclass/llog.c      |  212 +++++++----
 .../staging/lustre/lustre/obdclass/local_storage.c |    9 +-
 .../staging/lustre/lustre/obdclass/local_storage.h |    3 +
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    2 +-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    3 +
 10 files changed, 408 insertions(+), 244 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/dt_object.h b/drivers/staging/lustre/lustre/include/dt_object.h
index e116bb2..9304c26 100644
--- a/drivers/staging/lustre/lustre/include/dt_object.h
+++ b/drivers/staging/lustre/lustre/include/dt_object.h
@@ -692,7 +692,7 @@ struct local_oid_storage {
 	struct dt_object *los_obj;
 
 	/* data used to generate new fids */
-	struct mutex	 los_id_lock;
+	struct mutex	  los_id_lock;
 	__u64		  los_seq;
 	__u32		  los_last_oid;
 };
diff --git a/drivers/staging/lustre/lustre/include/lustre_log.h b/drivers/staging/lustre/lustre/include/lustre_log.h
index 721aa05..896c757 100644
--- a/drivers/staging/lustre/lustre/include/lustre_log.h
+++ b/drivers/staging/lustre/lustre/include/lustre_log.h
@@ -136,7 +136,11 @@ int llog_open(const struct lu_env *env, struct llog_ctxt *ctxt,
 	      struct llog_handle **lgh, struct llog_logid *logid,
 	      char *name, enum llog_open_param open_param);
 int llog_close(const struct lu_env *env, struct llog_handle *cathandle);
-int llog_get_size(struct llog_handle *loghandle);
+int llog_is_empty(const struct lu_env *env, struct llog_ctxt *ctxt,
+		  char *name);
+int llog_backup(const struct lu_env *env, struct obd_device *obd,
+		struct llog_ctxt *ctxt, struct llog_ctxt *bak_ctxt,
+		char *name, char *backup);
 
 /* llog_process flags */
 #define LLOG_FLAG_NODEAMON 0x0001
@@ -382,6 +386,13 @@ static inline int llog_data_len(int len)
 	return cfs_size_round(len);
 }
 
+static inline int llog_get_size(struct llog_handle *loghandle)
+{
+	if (loghandle && loghandle->lgh_hdr)
+		return loghandle->lgh_hdr->llh_count;
+	return 0;
+}
+
 static inline struct llog_ctxt *llog_ctxt_get(struct llog_ctxt *ctxt)
 {
 	atomic_inc(&ctxt->loc_refcount);
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index d0aea15..f0c1773 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -399,8 +399,8 @@ struct client_obd {
 
 	/* mgc datastruct */
 	struct semaphore	 cl_mgc_sem;
-	struct vfsmount	 *cl_mgc_vfsmnt;
-	struct dentry	   *cl_mgc_configs_dir;
+	struct local_oid_storage *cl_mgc_los;
+	struct dt_object	*cl_mgc_configs_dir;
 	atomic_t	     cl_mgc_refcount;
 	struct obd_export       *cl_mgc_mgsexp;
 
diff --git a/drivers/staging/lustre/lustre/mgc/libmgc.c b/drivers/staging/lustre/lustre/mgc/libmgc.c
index 7b4947c..9b40c57 100644
--- a/drivers/staging/lustre/lustre/mgc/libmgc.c
+++ b/drivers/staging/lustre/lustre/mgc/libmgc.c
@@ -99,11 +99,8 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 
 static int mgc_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
 	int rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt == NULL);
-
 	ptlrpcd_decref();
 
 	rc = client_obd_cleanup(obd);
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 93b601d..9325f91 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -41,17 +41,14 @@
 #define DEBUG_SUBSYSTEM S_MGC
 #define D_MGC D_CONFIG /*|D_WARNING*/
 
-# include <linux/module.h>
-# include <linux/pagemap.h>
-# include <linux/miscdevice.h>
-# include <linux/init.h>
-
+#include <linux/module.h>
 #include <obd_class.h>
 #include <lustre_dlm.h>
 #include <lprocfs_status.h>
 #include <lustre_log.h>
-#include <lustre_fsfilt.h>
 #include <lustre_disk.h>
+#include <dt_object.h>
+
 #include "mgc_internal.h"
 
 static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
@@ -578,97 +575,175 @@ static void mgc_requeue_add(struct config_llog_data *cld)
 }
 
 /********************** class fns **********************/
+static int mgc_local_llog_init(const struct lu_env *env,
+			       struct obd_device *obd,
+			       struct obd_device *disk)
+{
+	struct llog_ctxt	*ctxt;
+	int			 rc;
+
+	rc = llog_setup(env, obd, &obd->obd_olg, LLOG_CONFIG_ORIG_CTXT, disk,
+			&llog_osd_ops);
+	if (rc)
+		return rc;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
+	LASSERT(ctxt);
+	ctxt->loc_dir = obd->u.cli.cl_mgc_configs_dir;
+	llog_ctxt_put(ctxt);
 
-static int mgc_fs_setup(struct obd_device *obd, struct super_block *sb,
-			struct vfsmount *mnt)
+	return 0;
+}
+
+static int mgc_local_llog_fini(const struct lu_env *env,
+			       struct obd_device *obd)
 {
-	struct lvfs_run_ctxt saved;
-	struct lustre_sb_info *lsi = s2lsi(sb);
-	struct client_obd *cli = &obd->u.cli;
-	struct dentry *dentry;
-	char *label;
-	int err = 0;
+	struct llog_ctxt *ctxt;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
+	llog_cleanup(env, ctxt);
+
+	return 0;
+}
+
+static int mgc_fs_setup(struct obd_device *obd, struct super_block *sb)
+{
+	struct lustre_sb_info	*lsi = s2lsi(sb);
+	struct client_obd	*cli = &obd->u.cli;
+	struct lu_fid		 rfid, fid;
+	struct dt_object	*root, *dto;
+	struct lu_env		*env;
+	int			 rc = 0;
 
 	LASSERT(lsi);
-	LASSERT(lsi->lsi_srv_mnt == mnt);
+	LASSERT(lsi->lsi_dt_dev);
+
+	OBD_ALLOC_PTR(env);
+	if (env == NULL)
+		return -ENOMEM;
 
 	/* The mgc fs exclusion sem. Only one fs can be setup at a time. */
 	down(&cli->cl_mgc_sem);
 
 	cfs_cleanup_group_info();
 
-	obd->obd_fsops = fsfilt_get_ops(lsi->lsi_fstype);
-	if (IS_ERR(obd->obd_fsops)) {
-		up(&cli->cl_mgc_sem);
-		CERROR("%s: No fstype %s: rc = %ld\n", lsi->lsi_fstype,
-		       obd->obd_name, PTR_ERR(obd->obd_fsops));
-		return PTR_ERR(obd->obd_fsops);
-	}
+	/* Setup the configs dir */
+	rc = lu_env_init(env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(out_err, rc);
 
-	cli->cl_mgc_vfsmnt = mnt;
-	err = fsfilt_setup(obd, mnt->mnt_sb);
-	if (err)
-		GOTO(err_ops, err);
-
-	OBD_SET_CTXT_MAGIC(&obd->obd_lvfs_ctxt);
-	obd->obd_lvfs_ctxt.pwdmnt = mnt;
-	obd->obd_lvfs_ctxt.pwd = mnt->mnt_root;
-	obd->obd_lvfs_ctxt.fs = get_ds();
-
-	push_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-	dentry = ll_lookup_one_len(MOUNT_CONFIGS_DIR, cfs_fs_pwd(current->fs),
-				   strlen(MOUNT_CONFIGS_DIR));
-	pop_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-	if (IS_ERR(dentry)) {
-		err = PTR_ERR(dentry);
-		CERROR("cannot lookup %s directory: rc = %d\n",
-		       MOUNT_CONFIGS_DIR, err);
-		GOTO(err_ops, err);
-	}
-	cli->cl_mgc_configs_dir = dentry;
+	fid.f_seq = FID_SEQ_LOCAL_NAME;
+	fid.f_oid = 1;
+	fid.f_ver = 0;
+	rc = local_oid_storage_init(env, lsi->lsi_dt_dev, &fid,
+				    &cli->cl_mgc_los);
+	if (rc)
+		GOTO(out_env, rc);
+
+	rc = dt_root_get(env, lsi->lsi_dt_dev, &rfid);
+	if (rc)
+		GOTO(out_env, rc);
+
+	root = dt_locate_at(env, lsi->lsi_dt_dev, &rfid,
+			    &cli->cl_mgc_los->los_dev->dd_lu_dev);
+	if (unlikely(IS_ERR(root)))
+		GOTO(out_los, rc = PTR_ERR(root));
+
+	dto = local_file_find_or_create(env, cli->cl_mgc_los, root,
+					MOUNT_CONFIGS_DIR,
+					S_IFDIR | S_IRUGO | S_IWUSR | S_IXUGO);
+	lu_object_put_nocache(env, &root->do_lu);
+	if (IS_ERR(dto))
+		GOTO(out_los, rc = PTR_ERR(dto));
+
+	cli->cl_mgc_configs_dir = dto;
+
+	LASSERT(lsi->lsi_osd_exp->exp_obd->obd_lvfs_ctxt.dt);
+	rc = mgc_local_llog_init(env, obd, lsi->lsi_osd_exp->exp_obd);
+	if (rc)
+		GOTO(out_llog, rc);
 
 	/* We take an obd ref to insure that we can't get to mgc_cleanup
-	   without calling mgc_fs_cleanup first. */
+	 * without calling mgc_fs_cleanup first. */
 	class_incref(obd, "mgc_fs", obd);
 
-	label = fsfilt_get_label(obd, mnt->mnt_sb);
-	if (label)
-		CDEBUG(D_MGC, "MGC using disk labelled=%s\n", label);
-
 	/* We keep the cl_mgc_sem until mgc_fs_cleanup */
-	return 0;
-
-err_ops:
-	fsfilt_put_ops(obd->obd_fsops);
-	obd->obd_fsops = NULL;
-	cli->cl_mgc_vfsmnt = NULL;
-	up(&cli->cl_mgc_sem);
-	return err;
+out_llog:
+	if (rc) {
+		lu_object_put(env, &cli->cl_mgc_configs_dir->do_lu);
+		cli->cl_mgc_configs_dir = NULL;
+	}
+out_los:
+	if (rc < 0) {
+		local_oid_storage_fini(env, cli->cl_mgc_los);
+		cli->cl_mgc_los = NULL;
+		up(&cli->cl_mgc_sem);
+	}
+out_env:
+	lu_env_fini(env);
+out_err:
+	OBD_FREE_PTR(env);
+	return rc;
 }
 
 static int mgc_fs_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
-	int rc = 0;
+	struct lu_env		 env;
+	struct client_obd	*cli = &obd->u.cli;
+	int			 rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt != NULL);
+	LASSERT(cli->cl_mgc_los != NULL);
 
-	if (cli->cl_mgc_configs_dir != NULL) {
-		struct lvfs_run_ctxt saved;
-		push_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-		l_dput(cli->cl_mgc_configs_dir);
-		cli->cl_mgc_configs_dir = NULL;
-		pop_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
-		class_decref(obd, "mgc_fs", obd);
-	}
+	rc = lu_env_init(&env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(unlock, rc);
+
+	mgc_local_llog_fini(&env, obd);
 
-	cli->cl_mgc_vfsmnt = NULL;
-	if (obd->obd_fsops)
-		fsfilt_put_ops(obd->obd_fsops);
+	lu_object_put_nocache(&env, &cli->cl_mgc_configs_dir->do_lu);
+	cli->cl_mgc_configs_dir = NULL;
 
+	local_oid_storage_fini(&env, cli->cl_mgc_los);
+	cli->cl_mgc_los = NULL;
+	lu_env_fini(&env);
+
+unlock:
+	class_decref(obd, "mgc_fs", obd);
 	up(&cli->cl_mgc_sem);
 
-	return rc;
+	return 0;
+}
+
+static int mgc_llog_init(const struct lu_env *env, struct obd_device *obd)
+{
+	struct llog_ctxt	*ctxt;
+	int			 rc;
+
+	/* setup only remote ctxt, the local disk context is switched per each
+	 * filesystem during mgc_fs_setup() */
+	rc = llog_setup(env, obd, &obd->obd_olg, LLOG_CONFIG_REPL_CTXT, obd,
+			&llog_client_ops);
+	if (rc)
+		return rc;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
+	LASSERT(ctxt);
+
+	llog_initiator_connect(ctxt);
+	llog_ctxt_put(ctxt);
+
+	return 0;
+}
+
+static int mgc_llog_fini(const struct lu_env *env, struct obd_device *obd)
+{
+	struct llog_ctxt *ctxt;
+
+	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
+	if (ctxt)
+		llog_cleanup(env, ctxt);
+
+	return 0;
 }
 
 static atomic_t mgc_count = ATOMIC_INIT(0);
@@ -694,7 +769,7 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 			}
 		}
 		obd_cleanup_client_import(obd);
-		rc = obd_llog_finish(obd, 0);
+		rc = mgc_llog_fini(NULL, obd);
 		if (rc != 0)
 			CERROR("failed to cleanup llogging subsystems\n");
 		break;
@@ -704,11 +779,8 @@ static int mgc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 
 static int mgc_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
 	int rc;
 
-	LASSERT(cli->cl_mgc_vfsmnt == NULL);
-
 	/* COMPAT_146 - old config logs may have added profiles we don't
 	   know about */
 	if (obd->obd_type->typ_refcnt <= 1)
@@ -733,7 +805,7 @@ static int mgc_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	if (rc)
 		GOTO(err_decref, rc);
 
-	rc = obd_llog_init(obd, &obd->obd_olg, obd, NULL);
+	rc = mgc_llog_init(NULL, obd);
 	if (rc) {
 		CERROR("failed to setup llogging subsystems\n");
 		GOTO(err_cleanup, rc);
@@ -1011,23 +1083,17 @@ int mgc_set_info_async(const struct lu_env *env, struct obd_export *exp,
 	}
 	if (KEY_IS(KEY_SET_FS)) {
 		struct super_block *sb = (struct super_block *)val;
-		struct lustre_sb_info *lsi;
+
 		if (vallen != sizeof(struct super_block))
 			return -EINVAL;
-		lsi = s2lsi(sb);
-		rc = mgc_fs_setup(exp->exp_obd, sb, lsi->lsi_srv_mnt);
-		if (rc) {
-			CERROR("set_fs got %d\n", rc);
-		}
+
+		rc = mgc_fs_setup(exp->exp_obd, sb);
 		return rc;
 	}
 	if (KEY_IS(KEY_CLEAR_FS)) {
 		if (vallen != 0)
 			return -EINVAL;
 		rc = mgc_fs_cleanup(exp->exp_obd);
-		if (rc) {
-			CERROR("clear_fs got %d\n", rc);
-		}
 		return rc;
 	}
 	if (KEY_IS(KEY_SET_INFO)) {
@@ -1145,49 +1211,6 @@ static int mgc_import_event(struct obd_device *obd,
 	return rc;
 }
 
-static int mgc_llog_init(struct obd_device *obd, struct obd_llog_group *olg,
-			 struct obd_device *tgt, int *index)
-{
-	struct llog_ctxt *ctxt;
-	int rc;
-
-	LASSERT(olg == &obd->obd_olg);
-
-
-	rc = llog_setup(NULL, obd, olg, LLOG_CONFIG_REPL_CTXT, tgt,
-			&llog_client_ops);
-	if (rc)
-		GOTO(out, rc);
-
-	ctxt = llog_group_get_ctxt(olg, LLOG_CONFIG_REPL_CTXT);
-	if (!ctxt)
-		GOTO(out, rc = -ENODEV);
-
-	llog_initiator_connect(ctxt);
-	llog_ctxt_put(ctxt);
-
-	return 0;
-out:
-	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-	return rc;
-}
-
-static int mgc_llog_finish(struct obd_device *obd, int count)
-{
-	struct llog_ctxt *ctxt;
-
-	ctxt = llog_get_context(obd, LLOG_CONFIG_REPL_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-
-	ctxt = llog_get_context(obd, LLOG_CONFIG_ORIG_CTXT);
-	if (ctxt)
-		llog_cleanup(NULL, ctxt);
-	return 0;
-}
-
 enum {
 	CONFIG_READ_NRPAGES_INIT = 1 << (20 - PAGE_CACHE_SHIFT),
 	CONFIG_READ_NRPAGES      = 4
@@ -1540,17 +1563,58 @@ out:
 	return rc;
 }
 
+static int mgc_llog_local_copy(const struct lu_env *env,
+			       struct obd_device *obd,
+			       struct llog_ctxt *rctxt,
+			       struct llog_ctxt *lctxt, char *logname)
+{
+	char	*temp_log;
+	int	 rc;
+
+
+
+	/*
+	 * - copy it to backup using llog_backup()
+	 * - copy remote llog to logname using llog_backup()
+	 * - if failed then move bakup to logname again
+	 */
+
+	OBD_ALLOC(temp_log, strlen(logname) + 1);
+	if (!temp_log)
+		return -ENOMEM;
+	sprintf(temp_log, "%sT", logname);
+
+	/* make a copy of local llog at first */
+	rc = llog_backup(env, obd, lctxt, lctxt, logname, temp_log);
+	if (rc < 0 && rc != -ENOENT)
+		GOTO(out, rc);
+	/* copy remote llog to the local copy */
+	rc = llog_backup(env, obd, rctxt, lctxt, logname, logname);
+	if (rc == -ENOENT) {
+		/* no remote llog, delete local one too */
+		llog_erase(env, lctxt, NULL, logname);
+	} else if (rc < 0) {
+		/* error during backup, get local one back from the copy */
+		llog_backup(env, obd, lctxt, lctxt, temp_log, logname);
+out:
+		CERROR("%s: failed to copy remote log %s: rc = %d\n",
+		       obd->obd_name, logname, rc);
+	}
+	llog_erase(env, lctxt, NULL, temp_log);
+	OBD_FREE(temp_log, strlen(logname) + 1);
+	return rc;
+}
 
 /* local_only means it cannot get remote llogs */
 static int mgc_process_cfg_log(struct obd_device *mgc,
-			       struct config_llog_data *cld,
-			       int local_only)
+			       struct config_llog_data *cld, int local_only)
 {
-	struct llog_ctxt *ctxt, *lctxt = NULL;
-	struct lvfs_run_ctxt *saved_ctxt;
-	struct lustre_sb_info *lsi = NULL;
-	int rc = 0, must_pop = 0;
-	bool sptlrpc_started = false;
+	struct llog_ctxt	*ctxt, *lctxt = NULL;
+	struct client_obd	*cli = &mgc->u.cli;
+	struct lustre_sb_info	*lsi = NULL;
+	int			 rc = 0;
+	bool			 sptlrpc_started = false;
+	struct lu_env		*env;
 
 	LASSERT(cld);
 	LASSERT(mutex_is_locked(&cld->cld_lock));
@@ -1565,20 +1629,49 @@ static int mgc_process_cfg_log(struct obd_device *mgc,
 	if (cld->cld_cfg.cfg_sb)
 		lsi = s2lsi(cld->cld_cfg.cfg_sb);
 
-	ctxt = llog_get_context(mgc, LLOG_CONFIG_REPL_CTXT);
-	if (!ctxt) {
-		CERROR("missing llog context\n");
-		return -EINVAL;
-	}
-
-	OBD_ALLOC_PTR(saved_ctxt);
-	if (saved_ctxt == NULL)
+	OBD_ALLOC_PTR(env);
+	if (env == NULL)
 		return -ENOMEM;
 
+	rc = lu_env_init(env, LCT_MG_THREAD);
+	if (rc)
+		GOTO(out_free, rc);
+
+	ctxt = llog_get_context(mgc, LLOG_CONFIG_REPL_CTXT);
+	LASSERT(ctxt);
+
 	lctxt = llog_get_context(mgc, LLOG_CONFIG_ORIG_CTXT);
 
-		if (local_only) { /* no local log at client side */
-		GOTO(out_pop, rc = -EIO);
+	/* Copy the setup log locally if we can. Don't mess around if we're
+	 * running an MGS though (logs are already local). */
+	if (lctxt && lsi && IS_SERVER(lsi) && !IS_MGS(lsi) &&
+	    cli->cl_mgc_configs_dir != NULL &&
+	    lu2dt_dev(cli->cl_mgc_configs_dir->do_lu.lo_dev) ==
+	    lsi->lsi_dt_dev) {
+		if (!local_only)
+			/* Only try to copy log if we have the lock. */
+			rc = mgc_llog_local_copy(env, mgc, ctxt, lctxt,
+						 cld->cld_logname);
+		if (local_only || rc) {
+			if (llog_is_empty(env, lctxt, cld->cld_logname)) {
+				LCONSOLE_ERROR_MSG(0x13a, "Failed to get MGS "
+						   "log %s and no local copy."
+						   "\n", cld->cld_logname);
+				GOTO(out_pop, rc = -ENOTCONN);
+			}
+			CDEBUG(D_MGC, "Failed to get MGS log %s, using local "
+			       "copy for now, will try to update later.\n",
+			       cld->cld_logname);
+		}
+		/* Now, whether we copied or not, start using the local llog.
+		 * If we failed to copy, we'll start using whatever the old
+		 * log has. */
+		llog_ctxt_put(ctxt);
+		ctxt = lctxt;
+		lctxt = NULL;
+	} else {
+		if (local_only) /* no local log at client side */
+			GOTO(out_pop, rc = -EIO);
 	}
 
 	if (cld_is_sptlrpc(cld)) {
@@ -1587,19 +1680,16 @@ static int mgc_process_cfg_log(struct obd_device *mgc,
 	}
 
 	/* logname and instance info should be the same, so use our
-	   copy of the instance for the update.  The cfg_last_idx will
-	   be updated here. */
-	rc = class_config_parse_llog(NULL, ctxt, cld->cld_logname,
+	 * copy of the instance for the update.  The cfg_last_idx will
+	 * be updated here. */
+	rc = class_config_parse_llog(env, ctxt, cld->cld_logname,
 				     &cld->cld_cfg);
 
 out_pop:
-	llog_ctxt_put(ctxt);
+	__llog_ctxt_put(env, ctxt);
 	if (lctxt)
-		llog_ctxt_put(lctxt);
-	if (must_pop)
-		pop_ctxt(saved_ctxt, &mgc->obd_lvfs_ctxt, NULL);
+		__llog_ctxt_put(env, lctxt);
 
-	OBD_FREE_PTR(saved_ctxt);
 	/*
 	 * update settings on existing OBDs. doing it inside
 	 * of llog_process_lock so no device is attaching/detaching
@@ -1614,6 +1704,9 @@ out_pop:
 					  strlen("-sptlrpc"));
 	}
 
+	lu_env_fini(env);
+out_free:
+	OBD_FREE_PTR(env);
 	return rc;
 }
 
@@ -1801,8 +1894,6 @@ struct obd_ops mgc_obd_ops = {
 	.o_set_info_async = mgc_set_info_async,
 	.o_get_info       = mgc_get_info,
 	.o_import_event = mgc_import_event,
-	.o_llog_init    = mgc_llog_init,
-	.o_llog_finish  = mgc_llog_finish,
 	.o_process_config = mgc_process_config,
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
index 0cb4428..e390a42 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog.c
@@ -265,31 +265,6 @@ out:
 }
 EXPORT_SYMBOL(llog_init_handle);
 
-int llog_copy_handler(const struct lu_env *env,
-		      struct llog_handle *llh,
-		      struct llog_rec_hdr *rec,
-		      void *data)
-{
-	struct llog_rec_hdr local_rec = *rec;
-	struct llog_handle *local_llh = (struct llog_handle *)data;
-	char *cfg_buf = (char*) (rec + 1);
-	struct lustre_cfg *lcfg;
-	int rc = 0;
-
-	/* Append all records */
-	local_rec.lrh_len -= sizeof(*rec) + sizeof(struct llog_rec_tail);
-	rc = llog_write(env, local_llh, &local_rec, NULL, 0,
-			(void *)cfg_buf, -1);
-
-	lcfg = (struct lustre_cfg *)cfg_buf;
-	CDEBUG(D_INFO, "idx=%d, rc=%d, len=%d, cmd %x %s %s\n",
-	       rec->lrh_index, rc, rec->lrh_len, lcfg->lcfg_command,
-	       lustre_cfg_string(lcfg, 0), lustre_cfg_string(lcfg, 1));
-
-	return rc;
-}
-EXPORT_SYMBOL(llog_copy_handler);
-
 static int llog_process_thread(void *arg)
 {
 	struct llog_process_info	*lpi = arg;
@@ -493,14 +468,6 @@ int llog_process(const struct lu_env *env, struct llog_handle *loghandle,
 }
 EXPORT_SYMBOL(llog_process);
 
-inline int llog_get_size(struct llog_handle *loghandle)
-{
-	if (loghandle && loghandle->lgh_hdr)
-		return loghandle->lgh_hdr->llh_count;
-	return 0;
-}
-EXPORT_SYMBOL(llog_get_size);
-
 int llog_reverse_process(const struct lu_env *env,
 			 struct llog_handle *loghandle, llog_cb_t cb,
 			 void *data, void *catdata)
@@ -767,8 +734,9 @@ int llog_open_create(const struct lu_env *env, struct llog_ctxt *ctxt,
 		     struct llog_handle **res, struct llog_logid *logid,
 		     char *name)
 {
-	struct thandle	*th;
-	int		 rc;
+	struct dt_device	*d;
+	struct thandle		*th;
+	int			 rc;
 
 	rc = llog_open(env, ctxt, res, logid, name, LLOG_OPEN_NEW);
 	if (rc)
@@ -777,27 +745,21 @@ int llog_open_create(const struct lu_env *env, struct llog_ctxt *ctxt,
 	if (llog_exist(*res))
 		return 0;
 
-	if ((*res)->lgh_obj != NULL) {
-		struct dt_device *d;
+	LASSERT((*res)->lgh_obj != NULL);
 
-		d = lu2dt_dev((*res)->lgh_obj->do_lu.lo_dev);
+	d = lu2dt_dev((*res)->lgh_obj->do_lu.lo_dev);
 
-		th = dt_trans_create(env, d);
-		if (IS_ERR(th))
-			GOTO(out, rc = PTR_ERR(th));
+	th = dt_trans_create(env, d);
+	if (IS_ERR(th))
+		GOTO(out, rc = PTR_ERR(th));
 
-		rc = llog_declare_create(env, *res, th);
-		if (rc == 0) {
-			rc = dt_trans_start_local(env, d, th);
-			if (rc == 0)
-				rc = llog_create(env, *res, th);
-		}
-		dt_trans_stop(env, d, th);
-	} else {
-		/* lvfs compat code */
-		LASSERT((*res)->lgh_file == NULL);
-		rc = llog_create(env, *res, NULL);
+	rc = llog_declare_create(env, *res, th);
+	if (rc == 0) {
+		rc = dt_trans_start_local(env, d, th);
+		if (rc == 0)
+			rc = llog_create(env, *res, th);
 	}
+	dt_trans_stop(env, d, th);
 out:
 	if (rc)
 		llog_close(env, *res);
@@ -842,41 +804,34 @@ int llog_write(const struct lu_env *env, struct llog_handle *loghandle,
 	       struct llog_rec_hdr *rec, struct llog_cookie *reccookie,
 	       int cookiecount, void *buf, int idx)
 {
-	int rc;
+	struct dt_device	*dt;
+	struct thandle		*th;
+	int			 rc;
 
 	LASSERT(loghandle);
 	LASSERT(loghandle->lgh_ctxt);
+	LASSERT(loghandle->lgh_obj != NULL);
 
-	if (loghandle->lgh_obj != NULL) {
-		struct dt_device	*dt;
-		struct thandle		*th;
-
-		dt = lu2dt_dev(loghandle->lgh_obj->do_lu.lo_dev);
+	dt = lu2dt_dev(loghandle->lgh_obj->do_lu.lo_dev);
 
-		th = dt_trans_create(env, dt);
-		if (IS_ERR(th))
-			return PTR_ERR(th);
+	th = dt_trans_create(env, dt);
+	if (IS_ERR(th))
+		return PTR_ERR(th);
 
-		rc = llog_declare_write_rec(env, loghandle, rec, idx, th);
-		if (rc)
-			GOTO(out_trans, rc);
+	rc = llog_declare_write_rec(env, loghandle, rec, idx, th);
+	if (rc)
+		GOTO(out_trans, rc);
 
-		rc = dt_trans_start_local(env, dt, th);
-		if (rc)
-			GOTO(out_trans, rc);
+	rc = dt_trans_start_local(env, dt, th);
+	if (rc)
+		GOTO(out_trans, rc);
 
-		down_write(&loghandle->lgh_lock);
-		rc = llog_write_rec(env, loghandle, rec, reccookie,
-				    cookiecount, buf, idx, th);
-		up_write(&loghandle->lgh_lock);
+	down_write(&loghandle->lgh_lock);
+	rc = llog_write_rec(env, loghandle, rec, reccookie,
+			    cookiecount, buf, idx, th);
+	up_write(&loghandle->lgh_lock);
 out_trans:
-		dt_trans_stop(env, dt, th);
-	} else { /* lvfs compatibility */
-		down_write(&loghandle->lgh_lock);
-		rc = llog_write_rec(env, loghandle, rec, reccookie,
-				    cookiecount, buf, idx, NULL);
-		up_write(&loghandle->lgh_lock);
-	}
+	dt_trans_stop(env, dt, th);
 	return rc;
 }
 EXPORT_SYMBOL(llog_write);
@@ -932,3 +887,104 @@ out:
 	return rc;
 }
 EXPORT_SYMBOL(llog_close);
+
+int llog_is_empty(const struct lu_env *env, struct llog_ctxt *ctxt,
+		  char *name)
+{
+	struct llog_handle	*llh;
+	int			 rc = 0;
+
+	rc = llog_open(env, ctxt, &llh, NULL, name, LLOG_OPEN_EXISTS);
+	if (rc < 0) {
+		if (likely(rc == -ENOENT))
+			rc = 0;
+		GOTO(out, rc);
+	}
+
+	rc = llog_init_handle(env, llh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_close, rc);
+	rc = llog_get_size(llh);
+
+out_close:
+	llog_close(env, llh);
+out:
+	/* header is record 1 */
+	return (rc <= 1);
+}
+EXPORT_SYMBOL(llog_is_empty);
+
+int llog_copy_handler(const struct lu_env *env, struct llog_handle *llh,
+		      struct llog_rec_hdr *rec, void *data)
+{
+	struct llog_handle	*copy_llh = data;
+
+	/* Append all records */
+	return llog_write(env, copy_llh, rec, NULL, 0, NULL, -1);
+}
+EXPORT_SYMBOL(llog_copy_handler);
+
+/* backup plain llog */
+int llog_backup(const struct lu_env *env, struct obd_device *obd,
+		struct llog_ctxt *ctxt, struct llog_ctxt *bctxt,
+		char *name, char *backup)
+{
+	struct llog_handle	*llh, *bllh;
+	int			 rc;
+
+
+
+	/* open original log */
+	rc = llog_open(env, ctxt, &llh, NULL, name, LLOG_OPEN_EXISTS);
+	if (rc < 0) {
+		/* the -ENOENT case is also reported to the caller
+		 * but silently so it should handle that if needed.
+		 */
+		if (rc != -ENOENT)
+			CERROR("%s: failed to open log %s: rc = %d\n",
+			       obd->obd_name, name, rc);
+		return rc;
+	}
+
+	rc = llog_init_handle(env, llh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_close, rc);
+
+	/* Make sure there's no old backup log */
+	rc = llog_erase(env, bctxt, NULL, backup);
+	if (rc < 0 && rc != -ENOENT)
+		GOTO(out_close, rc);
+
+	/* open backup log */
+	rc = llog_open_create(env, bctxt, &bllh, NULL, backup);
+	if (rc) {
+		CERROR("%s: failed to open backup logfile %s: rc = %d\n",
+		       obd->obd_name, backup, rc);
+		GOTO(out_close, rc);
+	}
+
+	/* check that backup llog is not the same object as original one */
+	if (llh->lgh_obj == bllh->lgh_obj) {
+		CERROR("%s: backup llog %s to itself (%s), objects %p/%p\n",
+		       obd->obd_name, name, backup, llh->lgh_obj,
+		       bllh->lgh_obj);
+		GOTO(out_backup, rc = -EEXIST);
+	}
+
+	rc = llog_init_handle(env, bllh, LLOG_F_IS_PLAIN, NULL);
+	if (rc)
+		GOTO(out_backup, rc);
+
+	/* Copy log record by record */
+	rc = llog_process_or_fork(env, llh, llog_copy_handler, (void *)bllh,
+				  NULL, false);
+	if (rc)
+		CERROR("%s: failed to backup log %s: rc = %d\n",
+		       obd->obd_name, name, rc);
+out_backup:
+	llog_close(env, bllh);
+out_close:
+	llog_close(env, llh);
+	return rc;
+}
+EXPORT_SYMBOL(llog_backup);
diff --git a/drivers/staging/lustre/lustre/obdclass/local_storage.c b/drivers/staging/lustre/lustre/obdclass/local_storage.c
index cc19fba..295bb7c 100644
--- a/drivers/staging/lustre/lustre/obdclass/local_storage.c
+++ b/drivers/staging/lustre/lustre/obdclass/local_storage.c
@@ -855,9 +855,12 @@ out_los:
 		(*los)->los_seq = fid_seq(first_fid);
 		(*los)->los_last_oid = le64_to_cpu(lastid);
 		(*los)->los_obj = o;
-		/* read value should not be less than initial one */
-		LASSERTF((*los)->los_last_oid >= first_oid, "%u < %u\n",
-			 (*los)->los_last_oid, first_oid);
+		/* Read value should not be less than initial one
+		 * but possible after upgrade from older fs.
+		 * In this case just switch to the first_oid in memory and
+		 * it will be updated on disk with first object generated */
+		if ((*los)->los_last_oid < first_oid)
+			(*los)->los_last_oid = first_oid;
 	}
 out:
 	mutex_unlock(&ls->ls_los_mutex);
diff --git a/drivers/staging/lustre/lustre/obdclass/local_storage.h b/drivers/staging/lustre/lustre/obdclass/local_storage.h
index d553c37..0f63b8c 100644
--- a/drivers/staging/lustre/lustre/obdclass/local_storage.h
+++ b/drivers/staging/lustre/lustre/obdclass/local_storage.h
@@ -29,6 +29,8 @@
  *
  * Author: Mikhail Pershin <mike.pershin@intel.com>
  */
+#ifndef __LOCAL_STORAGE_H
+#define __LOCAL_STORAGE_H
 
 #include <dt_object.h>
 #include <obd.h>
@@ -86,3 +88,4 @@ struct los_ondisk {
 };
 
 #define LOS_MAGIC	0xdecafbee
+#endif
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 212823a..346a7b2 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -423,7 +423,7 @@ LU_KEY_INIT_FINI(lu_global, struct lu_cdebug_data);
  */
 struct lu_context_key lu_global_key = {
 	.lct_tags = LCT_MD_THREAD | LCT_DT_THREAD |
-		    LCT_MG_THREAD | LCT_CL_THREAD,
+		    LCT_MG_THREAD | LCT_CL_THREAD | LCT_LOCAL,
 	.lct_init = lu_global_key_init,
 	.lct_fini = lu_global_key_fini
 };
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index 68a4d6a..a69a630 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -631,6 +631,9 @@ int lustre_put_lsi(struct super_block *sb)
 	CDEBUG(D_MOUNT, "put %p %d\n", sb, atomic_read(&lsi->lsi_mounts));
 	if (atomic_dec_and_test(&lsi->lsi_mounts)) {
 		if (IS_SERVER(lsi) && lsi->lsi_osd_exp) {
+			lu_device_put(&lsi->lsi_dt_dev->dd_lu_dev);
+			lsi->lsi_osd_exp->exp_obd->obd_lvfs_ctxt.dt = NULL;
+			lsi->lsi_dt_dev = NULL;
 			obd_disconnect(lsi->lsi_osd_exp);
 			/* wait till OSD is gone */
 			obd_zombie_barrier();
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (10 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 11/40] staging/lustre/llog: MGC to use OSD API for backup logs Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:22   ` Cheng Shao
  2013-11-14 16:13 ` [PATCH 13/40] staging/lustre/autoconf: remove vectored fops tests Peng Tao
                   ` (28 subsequent siblings)
  40 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Patrick Farrell, Cheng Shao, Peng Tao, Andreas Dilger

From: Patrick Farrell <paf@cray.com>

This happend with SLES11SP2 Lustre client, which in turn acts as an
NFS server, exporting a subtree of an Lustre fs through NFS.

We detected that whenever we are writing to a new file using, fx,
'echo blah > newfile', it will return ENOENT error. We found
out that this was caused by the anonymous dentry. In SLESS11SP2,
anonymous dentries are assigned '/' as the name, instead of an
empty string. When MDT handles the intent_open call, it will look
up the obj by the name if it is not an empty string, and thus
couldn't find it.

As MDS_OPEN_BY_FID is always set on this request, we never need
to send the name in this request.  The fid is already available
and should be used in case the file has been renamed.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
Lustre-change: http://review.whamcloud.com/6920
Signed-off-by: Cheng Shao <cheng_shao@xyratex.com>
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alexey Shvetsov <alexxy@gentoo.org>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/file.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index f97c600..efc614b 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -368,8 +368,6 @@ static int ll_intent_file_open(struct file *file, void *lmm,
 {
 	struct ll_sb_info *sbi = ll_i2sbi(file->f_dentry->d_inode);
 	struct dentry *parent = file->f_dentry->d_parent;
-	const char *name = file->f_dentry->d_name.name;
-	const int len = file->f_dentry->d_name.len;
 	struct md_op_data *op_data;
 	struct ptlrpc_request *req;
 	__u32 opc = LUSTRE_OPC_ANY;
@@ -394,8 +392,9 @@ static int ll_intent_file_open(struct file *file, void *lmm,
 	}
 
 	op_data  = ll_prep_md_op_data(NULL, parent->d_inode,
-				      file->f_dentry->d_inode, name, len,
+				      file->f_dentry->d_inode, NULL, 0,
 				      O_RDWR, opc, NULL);
+
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 13/40] staging/lustre/autoconf: remove vectored fops tests
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (11 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 14/40] staging/lustre/autoconf: remove LIBCFS_HAVE_IS_COMPAT_TASK test Peng Tao
                   ` (27 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, James Simmons, Jeff Mahoney, Peng Tao, Andreas Dilger

From: James Simmons <uja.ornl@gmail.com>

file_operations.readv/writev have been removed since v2.6.19
We can remove the test and the dead code.

Lustre-change: http://review.whamcloud.com/5343
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2800
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/file.c |   17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index efc614b..174ba4e 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3000,17 +3000,12 @@ int ll_inode_permission(struct inode *inode, int mask)
 	return rc;
 }
 
-#define READ_METHOD aio_read
-#define READ_FUNCTION ll_file_aio_read
-#define WRITE_METHOD aio_write
-#define WRITE_FUNCTION ll_file_aio_write
-
 /* -o localflock - only provides locally consistent flock locks */
 struct file_operations ll_file_operations = {
 	.read	   = ll_file_read,
-	.READ_METHOD    = READ_FUNCTION,
+	.aio_read = ll_file_aio_read,
 	.write	  = ll_file_write,
-	.WRITE_METHOD   = WRITE_FUNCTION,
+	.aio_write = ll_file_aio_write,
 	.unlocked_ioctl = ll_file_ioctl,
 	.open	   = ll_file_open,
 	.release	= ll_file_release,
@@ -3023,9 +3018,9 @@ struct file_operations ll_file_operations = {
 
 struct file_operations ll_file_operations_flock = {
 	.read	   = ll_file_read,
-	.READ_METHOD    = READ_FUNCTION,
+	.aio_read    = ll_file_aio_read,
 	.write	  = ll_file_write,
-	.WRITE_METHOD   = WRITE_FUNCTION,
+	.aio_write   = ll_file_aio_write,
 	.unlocked_ioctl = ll_file_ioctl,
 	.open	   = ll_file_open,
 	.release	= ll_file_release,
@@ -3041,9 +3036,9 @@ struct file_operations ll_file_operations_flock = {
 /* These are for -o noflock - to return ENOSYS on flock calls */
 struct file_operations ll_file_operations_noflock = {
 	.read	   = ll_file_read,
-	.READ_METHOD    = READ_FUNCTION,
+	.aio_read    = ll_file_aio_read,
 	.write	  = ll_file_write,
-	.WRITE_METHOD   = WRITE_FUNCTION,
+	.aio_write   = ll_file_aio_write,
 	.unlocked_ioctl = ll_file_ioctl,
 	.open	   = ll_file_open,
 	.release	= ll_file_release,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 14/40] staging/lustre/autoconf: remove LIBCFS_HAVE_IS_COMPAT_TASK test
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (12 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 13/40] staging/lustre/autoconf: remove vectored fops tests Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 15/40] staging/lustre/llog: fix return value of llog_alloc_handle Peng Tao
                   ` (26 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, James Simmons, Jeff Mahoney, Peng Tao, Andreas Dilger

From: James Simmons <uja.ornl@gmail.com>

is_compat_task has been defined on all arches since v2.6.29.
We can remove the test and dead code.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2800
Lustre-change: http://review.whamcloud.com/5393
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/include/linux/libcfs/curproc.h  |    1 -
 .../lustre/lustre/libcfs/linux/linux-curproc.c     |   13 -------------
 .../staging/lustre/lustre/llite/llite_internal.h   |    3 ++-
 3 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/curproc.h b/drivers/staging/lustre/include/linux/libcfs/curproc.h
index de8e35b..507d16b 100644
--- a/drivers/staging/lustre/include/linux/libcfs/curproc.h
+++ b/drivers/staging/lustre/include/linux/libcfs/curproc.h
@@ -61,7 +61,6 @@ int    cfs_curproc_groups_nr(void);
  */
 
 /* check if task is running in compat mode.*/
-int current_is_32bit(void);
 #define current_pid()		(current->pid)
 #define current_comm()		(current->comm)
 int cfs_get_environ(const char *key, char *value, int *val_len);
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
index 0bf8e5d..a2ef64c 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
@@ -140,18 +140,6 @@ int cfs_capable(cfs_cap_t cap)
 	return capable(cfs_cap_unpack(cap));
 }
 
-/* Check if task is running in 32-bit API mode, for the purpose of
- * userspace binary interfaces.  On 32-bit Linux this is (unfortunately)
- * always true, even if the application is using LARGEFILE64 and 64-bit
- * APIs, because Linux provides no way for the filesystem to know if it
- * is called via 32-bit or 64-bit APIs.  Other clients may vary.  On
- * 64-bit systems, this will only be true if the binary is calling a
- * 32-bit system call. */
-int current_is_32bit(void)
-{
-	return is_compat_task();
-}
-
 static int cfs_access_process_vm(struct task_struct *tsk, unsigned long addr,
 				 void *buf, int len, int write)
 {
@@ -311,7 +299,6 @@ EXPORT_SYMBOL(cfs_cap_raised);
 EXPORT_SYMBOL(cfs_curproc_cap_pack);
 EXPORT_SYMBOL(cfs_curproc_cap_unpack);
 EXPORT_SYMBOL(cfs_capable);
-EXPORT_SYMBOL(current_is_32bit);
 
 /*
  * Local variables:
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 2debb24..e800f8d 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -46,6 +46,7 @@
 #include <lclient.h>
 #include <lustre_mdc.h>
 #include <linux/lustre_intent.h>
+#include <linux/compat.h>
 
 #ifndef FMODE_EXEC
 #define FMODE_EXEC 0
@@ -649,7 +650,7 @@ static inline int ll_need_32bit_api(struct ll_sb_info *sbi)
 #if BITS_PER_LONG == 32
 	return 1;
 #else
-	return unlikely(current_is_32bit() || (sbi->ll_flags & LL_SBI_32BIT_API));
+	return unlikely(is_compat_task() || (sbi->ll_flags & LL_SBI_32BIT_API));
 #endif
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 15/40] staging/lustre/llog: fix return value of llog_alloc_handle
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (13 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 14/40] staging/lustre/autoconf: remove LIBCFS_HAVE_IS_COMPAT_TASK test Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 16/40] staging/lustre/lov: convert magic to host-endian in lov_dump_lmm() Peng Tao
                   ` (25 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Li Xi, Peng Tao, Andreas Dilger

From: Li Xi <pkuelelixi@gmail.com>

llog_open() calls llog_alloc_handle() taking NULL as the error return
value. But llog_alloc_handle() returns ERR_PTR(-ENOMEM) instead when
error.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3470
Lustre-change: http://review.whamcloud.com/6644
Signed-off-by: Li Xi <pkuelelixi@gmail.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/obdclass/llog.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
index e390a42..67513be 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog.c
@@ -62,7 +62,7 @@ struct llog_handle *llog_alloc_handle(void)
 
 	OBD_ALLOC_PTR(loghandle);
 	if (loghandle == NULL)
-		return ERR_PTR(-ENOMEM);
+		return NULL;
 
 	init_rwsem(&loghandle->lgh_lock);
 	spin_lock_init(&loghandle->lgh_hdr_lock);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 16/40] staging/lustre/lov: convert magic to host-endian in lov_dump_lmm()
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (14 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 15/40] staging/lustre/llog: fix return value of llog_alloc_handle Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 17/40] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation Peng Tao
                   ` (24 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

In lov_dump_lmm(), convert the lmm_magic from little-endian to
host-endian byte order before the switch statement, as the other
lov_dump_xxx() and lov_verify_xxx() functions already do.  Remove the
unused macro LMM_ASSERT().

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3297
Lustre-change: http://review.whamcloud.com/6290
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/lov/lov_pack.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index ec6f6e0..27ed27e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -105,24 +105,22 @@ void lov_dump_lmm(int level, void *lmm)
 {
 	int magic;
 
-	magic = ((struct lov_mds_md_v1 *)(lmm))->lmm_magic;
+	magic = le32_to_cpu(((struct lov_mds_md *)lmm)->lmm_magic);
 	switch (magic) {
 	case LOV_MAGIC_V1:
-		return lov_dump_lmm_v1(level, (struct lov_mds_md_v1 *)(lmm));
+		lov_dump_lmm_v1(level, (struct lov_mds_md_v1 *)lmm);
+		break;
 	case LOV_MAGIC_V3:
-		return lov_dump_lmm_v3(level, (struct lov_mds_md_v3 *)(lmm));
+		lov_dump_lmm_v3(level, (struct lov_mds_md_v3 *)lmm);
+		break;
 	default:
-		CERROR("Cannot recognize lmm_magic %x", magic);
+		CDEBUG(level, "unrecognized lmm_magic %x, assuming %x\n",
+		       magic, LOV_MAGIC_V1);
+		lov_dump_lmm_common(level, lmm);
+		break;
 	}
-	return;
 }
 
-#define LMM_ASSERT(test)						\
-do {								    \
-	if (!(test)) lov_dump_lmm(D_ERROR, lmm);			\
-	LASSERT(test); /* so we know what assertion failed */	   \
-} while (0)
-
 /* Pack LOV object metadata for disk storage.  It is packed in LE byte
  * order and is opaque to the networking layer.
  *
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 17/40] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (15 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 16/40] staging/lustre/lov: convert magic to host-endian in lov_dump_lmm() Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 18/40] staging/lustre/mdc: prevent fall through in mdc_iocontrol() Peng Tao
                   ` (23 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Andriy Skulysh, Peng Tao, Andreas Dilger

From: Andriy Skulysh <Andriy_Skulysh@xyratex.com>

During race exp_flock_hash can be created 2 times.
It is created & assigned without any lock.

Move hash initialization from ldlm_flock_blocking_link()
to ldlm_init_export()

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2835
Lustre-change: http://review.whamcloud.com/5471
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexander Boyko <Alexander_Boyko@xyratex.com>
Reviewed-by: Vitaly Fertman <Vitaly_Fertman@xyratex.com>
Tested-by: Kyrylo Shatskyy <kyrylo_shatskyy@xyratex.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c |   28 +++++++----------------
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c |    8 +++++++
 2 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
index 39fcdac..182773b 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
@@ -95,20 +95,12 @@ ldlm_flocks_overlap(struct ldlm_lock *lock, struct ldlm_lock *new)
 		lock->l_policy_data.l_flock.start));
 }
 
-static inline int ldlm_flock_blocking_link(struct ldlm_lock *req,
-					   struct ldlm_lock *lock)
+static inline void ldlm_flock_blocking_link(struct ldlm_lock *req,
+					    struct ldlm_lock *lock)
 {
-	int rc = 0;
-
 	/* For server only */
 	if (req->l_export == NULL)
-		return 0;
-
-	if (unlikely(req->l_export->exp_flock_hash == NULL)) {
-		rc = ldlm_init_flock_export(req->l_export);
-		if (rc)
-			goto error;
-	}
+		return;
 
 	LASSERT(hlist_unhashed(&req->l_exp_flock_hash));
 
@@ -121,8 +113,6 @@ static inline int ldlm_flock_blocking_link(struct ldlm_lock *req,
 	cfs_hash_add(req->l_export->exp_flock_hash,
 		     &req->l_policy_data.l_flock.owner,
 		     &req->l_exp_flock_hash);
-error:
-	return rc;
 }
 
 static inline void ldlm_flock_blocking_unlink(struct ldlm_lock *req)
@@ -250,7 +240,6 @@ ldlm_process_flock_lock(struct ldlm_lock *req, __u64 *flags, int first_enq,
 	int overlaps = 0;
 	int splitted = 0;
 	const struct ldlm_callback_suite null_cbs = { NULL };
-	int rc;
 
 	CDEBUG(D_DLMTRACE, "flags %#llx owner "LPU64" pid %u mode %u start "
 	       LPU64" end "LPU64"\n", *flags,
@@ -328,12 +317,8 @@ reprocess:
 
 			/* add lock to blocking list before deadlock
 			 * check to prevent race */
-			rc = ldlm_flock_blocking_link(req, lock);
-			if (rc) {
-				ldlm_flock_destroy(req, mode, *flags);
-				*err = rc;
-				return LDLM_ITER_STOP;
-			}
+			ldlm_flock_blocking_link(req, lock);
+
 			if (ldlm_flock_deadlock(req, lock)) {
 				ldlm_flock_blocking_unlink(req);
 				ldlm_flock_destroy(req, mode, *flags);
@@ -816,6 +801,9 @@ static cfs_hash_ops_t ldlm_export_flock_ops = {
 
 int ldlm_init_flock_export(struct obd_export *exp)
 {
+	if( strcmp(exp->exp_obd->obd_type->typ_name, LUSTRE_MDT_NAME) != 0)
+		return 0;
+
 	exp->exp_flock_hash =
 		cfs_hash_create(obd_uuid2str(&exp->exp_client_uuid),
 				HASH_EXP_LOCK_CUR_BITS,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
index 40c58b7..ae08483 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
@@ -964,6 +964,7 @@ static cfs_hash_ops_t ldlm_export_lock_ops = {
 
 int ldlm_init_export(struct obd_export *exp)
 {
+	int rc;
 	exp->exp_lock_hash =
 		cfs_hash_create(obd_uuid2str(&exp->exp_client_uuid),
 				HASH_EXP_LOCK_CUR_BITS,
@@ -977,7 +978,14 @@ int ldlm_init_export(struct obd_export *exp)
 	if (!exp->exp_lock_hash)
 		return -ENOMEM;
 
+	rc = ldlm_init_flock_export(exp);
+	if (rc)
+		GOTO(err, rc);
+
 	return 0;
+err:
+	ldlm_destroy_export(exp);
+	return rc;
 }
 EXPORT_SYMBOL(ldlm_init_export);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 18/40] staging/lustre/mdc: prevent fall through in mdc_iocontrol()
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (16 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 17/40] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 19/40] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64 Peng Tao
                   ` (22 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

In mdc_iocontrol() add a goto to the end of the LL_IOC_HSM_STATE_SET
case, preventing fall through into the next case. In the same
function, replace the return statement in OBD_IOC_QUOTACTL with a
goto, so that a reference to the module is not leaked.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3576
Lustre-change: http://review.whamcloud.com/6962
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c |   27 +++++++++++------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index ed3a7a0..a09a5c3 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1743,6 +1743,7 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		GOTO(out, rc);
 	case LL_IOC_HSM_STATE_SET:
 		rc = mdc_ioc_hsm_state_set(exp, karg);
+		GOTO(out, rc);
 	case LL_IOC_HSM_ACTION:
 		rc = mdc_ioc_hsm_current_action(exp, karg);
 		GOTO(out, rc);
@@ -1814,8 +1815,8 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		struct obd_quotactl *oqctl;
 
 		OBD_ALLOC_PTR(oqctl);
-		if (!oqctl)
-			return -ENOMEM;
+		if (oqctl == NULL)
+			GOTO(out, rc = -ENOMEM);
 
 		QCTL_COPY(oqctl, qctl);
 		rc = obd_quotactl(exp, oqctl);
@@ -1824,23 +1825,21 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 			qctl->qc_valid = QC_MDTIDX;
 			qctl->obd_uuid = obd->u.cli.cl_target_uuid;
 		}
+
 		OBD_FREE_PTR(oqctl);
-		break;
+		GOTO(out, rc);
 	}
-	case LL_IOC_GET_CONNECT_FLAGS: {
-		if (copy_to_user(uarg,
-				     exp_connect_flags_ptr(exp),
-				     sizeof(__u64)))
+	case LL_IOC_GET_CONNECT_FLAGS:
+		if (copy_to_user(uarg, exp_connect_flags_ptr(exp),
+				 sizeof(*exp_connect_flags_ptr(exp))))
 			GOTO(out, rc = -EFAULT);
-		else
-			GOTO(out, rc = 0);
-	}
-	case LL_IOC_LOV_SWAP_LAYOUTS: {
+
+		GOTO(out, rc = 0);
+	case LL_IOC_LOV_SWAP_LAYOUTS:
 		rc = mdc_ioc_swap_layouts(exp, karg);
-		break;
-	}
+		GOTO(out, rc);
 	default:
-		CERROR("mdc_ioctl(): unrecognised ioctl %#x\n", cmd);
+		CERROR("unrecognised ioctl: cmd = %#x\n", cmd);
 		GOTO(out, rc = -ENOTTY);
 	}
 out:
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 19/40] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (17 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 18/40] staging/lustre/mdc: prevent fall through in mdc_iocontrol() Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 20/40] staging/lustre/mdt: HSM coordinator client interface Peng Tao
                   ` (21 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

Remove the lo_depth member from struct lu_object.  This field is never
set and only read in lu_object_print().  Remove the lo_flags member.
This field was only used in lu_object_alloc() and can be replaced with
an on-stack mask to keep trace of which layers have been allocated.

Lustre-change: http://review.whamcloud.com/5890
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3059
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/lu_object.h  |   19 ----------------
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   24 +++++++++++++-------
 2 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index d5b8225..6773bca 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -398,17 +398,6 @@ static inline int lu_device_is_md(const struct lu_device *d)
 }
 
 /**
- * Flags for the object layers.
- */
-enum lu_object_flags {
-	/**
-	 * this flags is set if lu_object_operations::loo_object_init() has
-	 * been called for this layer. Used by lu_object_alloc().
-	 */
-	LU_OBJECT_ALLOCATED = (1 << 0)
-};
-
-/**
  * Common object attributes.
  */
 struct lu_attr {
@@ -486,14 +475,6 @@ struct lu_object {
 	 */
 	struct list_head			 lo_linkage;
 	/**
-	 * Depth. Top level layer depth is 0.
-	 */
-	int				lo_depth;
-	/**
-	 * Flags from enum lu_object_flags.
-	 */
-	__u32					lo_flags;
-	/**
 	 * Link to the device, for debugging.
 	 */
 	struct lu_ref_link                 lo_dev_ref;
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 346a7b2..9019fd1 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -200,6 +200,8 @@ static struct lu_object *lu_object_alloc(const struct lu_env *env,
 	struct lu_object *scan;
 	struct lu_object *top;
 	struct list_head *layers;
+	unsigned int init_mask = 0;
+	unsigned int init_flag;
 	int clean;
 	int result;
 
@@ -218,15 +220,17 @@ static struct lu_object *lu_object_alloc(const struct lu_env *env,
 	 */
 	top->lo_header->loh_fid = *f;
 	layers = &top->lo_header->loh_layers;
+
 	do {
 		/*
 		 * Call ->loo_object_init() repeatedly, until no more new
 		 * object slices are created.
 		 */
 		clean = 1;
+		init_flag = 1;
 		list_for_each_entry(scan, layers, lo_linkage) {
-			if (scan->lo_flags & LU_OBJECT_ALLOCATED)
-				continue;
+			if (init_mask & init_flag)
+				goto next;
 			clean = 0;
 			scan->lo_header = top->lo_header;
 			result = scan->lo_ops->loo_object_init(env, scan, conf);
@@ -234,7 +238,9 @@ static struct lu_object *lu_object_alloc(const struct lu_env *env,
 				lu_object_free(env, top);
 				return ERR_PTR(result);
 			}
-			scan->lo_flags |= LU_OBJECT_ALLOCATED;
+			init_mask |= init_flag;
+next:
+			init_flag <<= 1;
 		}
 	} while (!clean);
 
@@ -487,23 +493,25 @@ void lu_object_print(const struct lu_env *env, void *cookie,
 {
 	static const char ruler[] = "........................................";
 	struct lu_object_header *top;
-	int depth;
+	int depth = 4;
 
 	top = o->lo_header;
 	lu_object_header_print(env, cookie, printer, top);
-	(*printer)(env, cookie, "{ \n");
-	list_for_each_entry(o, &top->loh_layers, lo_linkage) {
-		depth = o->lo_depth + 4;
+	(*printer)(env, cookie, "{\n");
 
+	list_for_each_entry(o, &top->loh_layers, lo_linkage) {
 		/*
 		 * print `.' \a depth times followed by type name and address
 		 */
 		(*printer)(env, cookie, "%*.*s%s@%p", depth, depth, ruler,
 			   o->lo_dev->ld_type->ldt_name, o);
+
 		if (o->lo_ops->loo_object_print != NULL)
-			o->lo_ops->loo_object_print(env, cookie, printer, o);
+			(*o->lo_ops->loo_object_print)(env, cookie, printer, o);
+
 		(*printer)(env, cookie, "\n");
 	}
+
 	(*printer)(env, cookie, "} header@%p\n", top);
 }
 EXPORT_SYMBOL(lu_object_print);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 20/40] staging/lustre/mdt: HSM coordinator client interface
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (18 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 19/40] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64 Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 21/40] staging/lustre/mdt: HSM coordinator agent interface Peng Tao
                   ` (20 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, JC Lafoucriere, Peng Tao, Andreas Dilger

From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>

This patch implements the HSM coordinator methods
used by client to add/remove/list HSM actions on
FID.

Lustre-change: http://review.whamcloud.com/6532
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3341
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |    7 +++----
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    4 +---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 9891ace..19fa813 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -1121,11 +1121,10 @@ static inline int hal_size(struct hsm_action_list *hal)
 
 	sz = sizeof(*hal) + cfs_size_round(strlen(hal->hal_fsname));
 	hai = hai_zero(hal);
-	for (i = 0 ; i < hal->hal_count ; i++) {
+	for (i = 0 ; i < hal->hal_count ; i++, hai = hai_next(hai))
 		sz += cfs_size_round(hai->hai_len);
-		hai = hai_next(hai);
-	}
-	return(sz);
+
+	return sz;
 }
 
 /* Copytool progress reporting */
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index a09a5c3..8373778 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1919,10 +1919,8 @@ static void lustre_swab_hal(struct hsm_action_list *h)
 	__swab32s(&h->hal_archive_id);
 	__swab64s(&h->hal_flags);
 	hai = hai_zero(h);
-	for (i = 0; i < h->hal_count; i++) {
+	for (i = 0; i < h->hal_count; i++, hai = hai_next(hai))
 		lustre_swab_hai(hai);
-		hai = hai_next(hai);
-	}
 }
 
 static void lustre_swab_kuch(struct kuc_hdr *l)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 21/40] staging/lustre/mdt: HSM coordinator agent interface
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (19 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 20/40] staging/lustre/mdt: HSM coordinator client interface Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 22/40] staging/lustre/scrub: OI scrub on OST Peng Tao
                   ` (19 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, JC Lafoucriere, Peng Tao, Andreas Dilger

From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>

To move data with external storage, HSM coordinator
uses a Copy Tool running on a client named agent.
This patch implements the interface for these agents.

Lustre-change: http://review.whamcloud.com/6534
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3342
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 19fa813..b6090b5 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -428,8 +428,8 @@ struct obd_uuid {
 	char uuid[UUID_MAX];
 };
 
-static inline int obd_uuid_equals(const struct obd_uuid *u1,
-				  const struct obd_uuid *u2)
+static inline bool obd_uuid_equals(const struct obd_uuid *u1,
+				   const struct obd_uuid *u2)
 {
 	return strcmp((char *)u1->uuid, (char *)u2->uuid) == 0;
 }
@@ -446,7 +446,7 @@ static inline void obd_str2uuid(struct obd_uuid *uuid, const char *tmp)
 }
 
 /* For printf's only, make sure uuid is terminated */
-static inline char *obd_uuid2str(struct obd_uuid *uuid)
+static inline char *obd_uuid2str(const struct obd_uuid *uuid)
 {
 	if (uuid->uuid[sizeof(*uuid) - 1] != '\0') {
 		/* Obviously not safe, but for printfs, no real harm done...
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 22/40] staging/lustre/scrub: OI scrub on OST
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (20 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 21/40] staging/lustre/mdt: HSM coordinator agent interface Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 23/40] staging/lustre/scrub: control OI scrub on OST from user space Peng Tao
                   ` (18 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Fan Yong, Peng Tao, Andreas Dilger

From: Fan Yong <fan.yong@intel.com>

OI scrub should has the ability to handle kinds of OI, including
both the OI files on MDT and the /O directory on OST.

We trust the FID in LMA for both MDT objects and OST objects. So
if some /O sub-item does not match related LMA, then the /O will
be updated, instead of the LMA.

To guarantee that the OI scrub can run without MDT0 involved for
FLDB, the OST object needs to store some flag in its LMA to tell
OI scrub that it is for an OST object, no need to query the MDT0.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3335
Lustre-change: http://review.whamcloud.com/6669
Signed-off-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   18 +++++++++++-------
 .../staging/lustre/lustre/include/obd_support.h    |    1 +
 drivers/staging/lustre/lustre/obdclass/md_attrs.c  |    4 ++--
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |    4 ++++
 4 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 4d8d8c3..05e583f 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -321,8 +321,11 @@ static inline int range_compare_loc(const struct lu_seq_range *r1,
  * xattr.
  */
 enum lma_compat {
-	LMAC_HSM = 0x00000001,
-	LMAC_SOM = 0x00000002,
+	LMAC_HSM	= 0x00000001,
+	LMAC_SOM	= 0x00000002,
+	LMAC_NOT_IN_OI	= 0x00000004, /* the object does NOT need OI mapping */
+	LMAC_FID_ON_OST = 0x00000008, /* For OST-object, its OI mapping is
+				       * under /O/<seq>/d<x>. */
 };
 
 /**
@@ -331,16 +334,17 @@ enum lma_compat {
  * This information is stored in lustre_mdt_attrs::lma_incompat.
  */
 enum lma_incompat {
-	LMAI_RELEASED = 0x0000001, /* file is released */
-	LMAI_AGENT = 0x00000002, /* agent inode */
-	LMAI_REMOTE_PARENT = 0x00000004, /* the parent of the object
-					    is on the remote MDT */
+	LMAI_RELEASED		= 0x00000001, /* file is released */
+	LMAI_AGENT		= 0x00000002, /* agent inode */
+	LMAI_REMOTE_PARENT	= 0x00000004, /* the parent of the object
+						 is on the remote MDT */
 };
 #define LMA_INCOMPAT_SUPP	(LMAI_AGENT | LMAI_REMOTE_PARENT)
 
 extern void lustre_lma_swab(struct lustre_mdt_attrs *lma);
 extern void lustre_lma_init(struct lustre_mdt_attrs *lma,
-			    const struct lu_fid *fid, __u32 incompat);
+			    const struct lu_fid *fid,
+			    __u32 compat, __u32 incompat);
 /**
  * SOM on-disk attributes stored in a separate xattr.
  */
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 825a0c9..1748985 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -256,6 +256,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_OSD_SCRUB_FATAL			0x192
 #define OBD_FAIL_OSD_FID_MAPPING			0x193
 #define OBD_FAIL_OSD_LMA_INCOMPAT			0x194
+#define OBD_FAIL_OSD_COMPAT_INVALID_ENTRY		0x195
 
 #define OBD_FAIL_OST		     0x200
 #define OBD_FAIL_OST_CONNECT_NET	 0x201
diff --git a/drivers/staging/lustre/lustre/obdclass/md_attrs.c b/drivers/staging/lustre/lustre/obdclass/md_attrs.c
index f718782..91eb30b 100644
--- a/drivers/staging/lustre/lustre/obdclass/md_attrs.c
+++ b/drivers/staging/lustre/lustre/obdclass/md_attrs.c
@@ -39,9 +39,9 @@
  * \param incompat - features that MDS must understand to access object
  */
 void lustre_lma_init(struct lustre_mdt_attrs *lma, const struct lu_fid *fid,
-		     __u32 incompat)
+		     __u32 compat, __u32 incompat)
 {
-	lma->lma_compat   = 0;
+	lma->lma_compat   = compat;
 	lma->lma_incompat = incompat;
 	lma->lma_self_fid = *fid;
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 912f194..ca2ef59 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -432,6 +432,10 @@ void lustre_assert_wire_constants(void)
 		(unsigned)LMAC_HSM);
 	LASSERTF(LMAC_SOM == 0x00000002UL, "found 0x%.8xUL\n",
 		(unsigned)LMAC_SOM);
+	LASSERTF(LMAC_NOT_IN_OI == 0x00000004UL, "found 0x%.8xUL\n",
+		(unsigned)LMAC_NOT_IN_OI);
+	LASSERTF(LMAC_FID_ON_OST == 0x00000008UL, "found 0x%.8xUL\n",
+		(unsigned)LMAC_FID_ON_OST);
 	LASSERTF(OBJ_CREATE == 1, "found %lld\n",
 		 (long long)OBJ_CREATE);
 	LASSERTF(OBJ_DESTROY == 2, "found %lld\n",
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 23/40] staging/lustre/scrub: control OI scrub on OST from user space
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (21 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 22/40] staging/lustre/scrub: OI scrub on OST Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 24/40] staging/lustre/llite: don't check for O_CREAT in it_create_mode Peng Tao
                   ` (17 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Fan Yong, Peng Tao, Andreas Dilger

From: Fan Yong <fan.yong@intel.com>

Not all the OI inconsistency can be detected automatically, such as /O
entry lost case. So the OI scrub on OST should can be triggered by the
administrator manually.

Lustre-change: http://review.whamcloud.com/6698
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3335
Signed-off-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/lustre/include/obd_support.h    |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 1748985..4d1f62c 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -257,6 +257,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_OSD_FID_MAPPING			0x193
 #define OBD_FAIL_OSD_LMA_INCOMPAT			0x194
 #define OBD_FAIL_OSD_COMPAT_INVALID_ENTRY		0x195
+#define OBD_FAIL_OSD_COMPAT_NO_ENTRY			0x196
 
 #define OBD_FAIL_OST		     0x200
 #define OBD_FAIL_OST_CONNECT_NET	 0x201
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 24/40] staging/lustre/llite: don't check for O_CREAT in it_create_mode
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (22 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 23/40] staging/lustre/scrub: control OI scrub on OST from user space Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 25/40] staging/lustre/build: clean up unused variables and dead code Peng Tao
                   ` (16 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

ll_lookup_it() checks for O_CREAT in struct lookup_intent's
it_create_mode member which is nonsensical, as it_create_mode is used
for file mode bits (S_IFREG, S_IRUSR, ...). Fix this by just checking
for IT_CREATE in it_op and do the same in llu_lookup_it(). This will
not affect the behavior of either function, since if O_CREATE (0100)
is actually set in o_create_mode then IT_CREATE must have been set in
it_op. In ll_atomic_open() check for O_CREAT in the open_flags
parameter rather than testing mode.

Lustre-change: http://review.whamcloud.com/6786
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3517
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/namei.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 8377468..e47e2b1 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -523,8 +523,7 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 	icbd.icbd_childp = &dentry;
 	icbd.icbd_parent = parent;
 
-	if (it->it_op & IT_CREAT ||
-	    (it->it_op & IT_OPEN && it->it_create_mode & O_CREAT))
+	if (it->it_op & IT_CREAT)
 		opc = LUSTRE_OPC_CREATE;
 	else
 		opc = LUSTRE_OPC_ANY;
@@ -623,7 +622,7 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 		return -ENOMEM;
 
 	it->it_op = IT_OPEN;
-	if (mode) {
+	if (open_flags & O_CREAT) {
 		it->it_op |= IT_CREAT;
 		lookup_flags |= LOOKUP_CREATE;
 	}
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 25/40] staging/lustre/build: clean up unused variables and dead code
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (23 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 24/40] staging/lustre/llite: don't check for O_CREAT in it_create_mode Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 26/40] staging/lustre/build: fix compilation issue with is_compat_task Peng Tao
                   ` (15 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Dmitry Eremin, Ned Bass, Peng Tao, Andreas Dilger

From: Dmitry Eremin <dmitry.eremin@intel.com>

Clean up the code from really unused variables and rearrange other
to remove SET_BUT_UNUSED and UNUSED macro usage if possible.

Clean up or comment on some dead code.

Removed never implemented suspend timeouts functionality from pinger.c
which was commented out since 2007 and going to be replaced by adaptive
timeouts. Also removed all references to this functionality from
ldlm_lockd.c, ldlm_request.c and import.c which actually nevers executes
or do nothing.

 pinger.c commit d2d56f38da01001c92a09afc6b52b5acbd9bc13c
 Author: tappro <tappro>
 Date:   Mon Jul 30 21:08:59 2007 +0000
    - make HEAD from b_post_cmd3

Lustre-change: http://review.whamcloud.com/6139
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3204
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/include/linux/libcfs/libcfs.h   |    2 -
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 -
 drivers/staging/lustre/lnet/selftest/selftest.h    |    3 -
 drivers/staging/lustre/lnet/selftest/timer.c       |    6 +-
 .../lustre/lustre/include/linux/lustre_compat25.h  |    4 +-
 drivers/staging/lustre/lustre/include/lustre_ha.h  |    3 -
 drivers/staging/lustre/lustre/include/lustre_net.h |    2 -
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    3 -
 drivers/staging/lustre/lustre/libcfs/workitem.c    |   55 ++++++++---------
 drivers/staging/lustre/lustre/llite/dcache.c       |   36 ++++-------
 drivers/staging/lustre/lustre/llite/namei.c        |    6 +-
 .../lustre/lustre/obdclass/lprocfs_status.c        |    4 --
 drivers/staging/lustre/lustre/ptlrpc/client.c      |   10 ++-
 drivers/staging/lustre/lustre/ptlrpc/import.c      |    2 -
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |   10 ++-
 drivers/staging/lustre/lustre/ptlrpc/pinger.c      |   65 --------------------
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c     |   44 ++++++-------
 17 files changed, 75 insertions(+), 182 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
index 687dbab..4a6c7da 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
@@ -181,8 +181,6 @@ static inline void *__container_of(void *ptr, unsigned long shift)
 #define container_of0(ptr, type, member) \
 	((type *)__container_of((void *)(ptr), offsetof(type, member)))
 
-#define SET_BUT_UNUSED(a) do { } while(sizeof(a) - sizeof(a))
-
 #define _LIBCFS_H
 
 #endif /* _LIBCFS_H */
diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index 7659a26..5ae59d2 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -124,7 +124,6 @@ srpc_bulk_t *
 srpc_alloc_bulk(int cpt, unsigned bulk_npg, unsigned bulk_len, int sink)
 {
 	srpc_bulk_t  *bk;
-	struct page  **pages;
 	int	      i;
 
 	LASSERT(bulk_npg > 0 && bulk_npg <= LNET_MAX_IOV);
@@ -140,7 +139,6 @@ srpc_alloc_bulk(int cpt, unsigned bulk_npg, unsigned bulk_len, int sink)
 	bk->bk_sink   = sink;
 	bk->bk_len    = bulk_len;
 	bk->bk_niov   = bulk_npg;
-	UNUSED(pages);
 
 	for (i = 0; i < bulk_npg; i++) {
 		struct page *pg;
diff --git a/drivers/staging/lustre/lnet/selftest/selftest.h b/drivers/staging/lustre/lnet/selftest/selftest.h
index 8053b05..cd53926 100644
--- a/drivers/staging/lustre/lnet/selftest/selftest.h
+++ b/drivers/staging/lustre/lnet/selftest/selftest.h
@@ -572,9 +572,6 @@ swi_state2str (int state)
 #undef STATE2STR
 }
 
-#define UNUSED(x)       ( (void)(x) )
-
-
 #define selftest_wait_events()	cfs_pause(cfs_time_seconds(1) / 10)
 
 
diff --git a/drivers/staging/lustre/lnet/selftest/timer.c b/drivers/staging/lustre/lnet/selftest/timer.c
index 82fd363..5dc8e3d 100644
--- a/drivers/staging/lustre/lnet/selftest/timer.c
+++ b/drivers/staging/lustre/lnet/selftest/timer.c
@@ -172,9 +172,6 @@ int
 stt_timer_main(void *arg)
 {
 	int rc = 0;
-	UNUSED(arg);
-
-	SET_BUT_UNUSED(rc);
 
 	cfs_block_allsigs();
 
@@ -184,12 +181,13 @@ stt_timer_main(void *arg)
 		rc = wait_event_timeout(stt_data.stt_waitq,
 					stt_data.stt_shuttingdown,
 					cfs_time_seconds(STTIMER_SLOTTIME));
+		rc = 0; /* Discard jiffies remaining before timeout. */
 	}
 
 	spin_lock(&stt_data.stt_lock);
 	stt_data.stt_nthreads--;
 	spin_unlock(&stt_data.stt_lock);
-	return 0;
+	return rc;
 }
 
 int
diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index 359c6c1..2e1b7f1 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -171,8 +171,8 @@ static inline int ll_quota_off(struct super_block *sb, int off, int remount)
 #define ll_d_hlist_empty(list) hlist_empty(list)
 #define ll_d_hlist_entry(ptr, type, name) hlist_entry(ptr.first, type, name)
 #define ll_d_hlist_for_each(tmp, i_dentry) hlist_for_each(tmp, i_dentry)
-#define ll_d_hlist_for_each_entry(dentry, p, i_dentry, alias) \
-	p = NULL; hlist_for_each_entry(dentry, i_dentry, alias)
+#define ll_d_hlist_for_each_entry(dentry, i_dentry, alias) \
+	hlist_for_each_entry(dentry, i_dentry, alias)
 
 
 #define bio_hw_segments(q, bio) 0
diff --git a/drivers/staging/lustre/lustre/include/lustre_ha.h b/drivers/staging/lustre/lustre/include/lustre_ha.h
index 105f6d6..f3ae02b 100644
--- a/drivers/staging/lustre/lustre/include/lustre_ha.h
+++ b/drivers/staging/lustre/lustre/include/lustre_ha.h
@@ -58,9 +58,6 @@ void ptlrpc_activate_import(struct obd_import *imp);
 void ptlrpc_deactivate_import(struct obd_import *imp);
 void ptlrpc_invalidate_import(struct obd_import *imp);
 void ptlrpc_fail_import(struct obd_import *imp, __u32 conn_cnt);
-int ptlrpc_check_suspend(void);
-void ptlrpc_activate_timeouts(struct obd_import *imp);
-void ptlrpc_deactivate_timeouts(struct obd_import *imp);
 
 /** @} ha */
 
diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 91f28e3..6c5479b 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -3403,10 +3403,8 @@ int ptlrpc_del_timeout_client(struct list_head *obd_list,
 			      enum timeout_event event);
 struct ptlrpc_request * ptlrpc_prep_ping(struct obd_import *imp);
 int ptlrpc_obd_ping(struct obd_device *obd);
-cfs_time_t ptlrpc_suspend_wakeup_time(void);
 void ping_evictor_start(void);
 void ping_evictor_stop(void);
-int ptlrpc_check_and_wait_suspend(struct ptlrpc_request *req);
 void ptlrpc_pinger_ir_up(void);
 void ptlrpc_pinger_ir_down(void);
 /** @} */
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 1ddcca3..68ace4a 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -97,9 +97,6 @@ int ldlm_expired_completion_wait(void *data)
 	if (lock->l_conn_export == NULL) {
 		static cfs_time_t next_dump = 0, last_dump = 0;
 
-		if (ptlrpc_check_suspend())
-			return 0;
-
 		LCONSOLE_WARN("lock timed out (enqueued at "CFS_TIME_T", "
 			      CFS_DURATION_T"s ago)\n",
 			      lock->l_last_activity,
diff --git a/drivers/staging/lustre/lustre/libcfs/workitem.c b/drivers/staging/lustre/lustre/libcfs/workitem.c
index 1a55c81..11227dd 100644
--- a/drivers/staging/lustre/lustre/libcfs/workitem.c
+++ b/drivers/staging/lustre/lustre/libcfs/workitem.c
@@ -306,8 +306,6 @@ cfs_wi_scheduler (void *arg)
 void
 cfs_wi_sched_destroy(struct cfs_wi_sched *sched)
 {
-	int	i;
-
 	LASSERT(cfs_wi_data.wi_init);
 	LASSERT(!cfs_wi_data.wi_stopping);
 
@@ -324,18 +322,22 @@ cfs_wi_sched_destroy(struct cfs_wi_sched *sched)
 
 	spin_unlock(&cfs_wi_data.wi_glock);
 
-	i = 2;
 	wake_up_all(&sched->ws_waitq);
 
 	spin_lock(&cfs_wi_data.wi_glock);
-	while (sched->ws_nthreads > 0) {
-		CDEBUG(IS_PO2(++i) ? D_WARNING : D_NET,
-		       "waiting for %d threads of WI sched[%s] to terminate\n",
-		       sched->ws_nthreads, sched->ws_name);
+	{
+		int i = 2;
 
-		spin_unlock(&cfs_wi_data.wi_glock);
-		cfs_pause(cfs_time_seconds(1) / 20);
-		spin_lock(&cfs_wi_data.wi_glock);
+		while (sched->ws_nthreads > 0) {
+			CDEBUG(IS_PO2(++i) ? D_WARNING : D_NET,
+			       "waiting for %d threads of WI sched[%s] to "
+			       "terminate\n", sched->ws_nthreads,
+			       sched->ws_name);
+
+			spin_unlock(&cfs_wi_data.wi_glock);
+			cfs_pause(cfs_time_seconds(1) / 20);
+			spin_lock(&cfs_wi_data.wi_glock);
+		}
 	}
 
 	list_del(&sched->ws_list);
@@ -352,7 +354,6 @@ cfs_wi_sched_create(char *name, struct cfs_cpt_table *cptab,
 		    int cpt, int nthrs, struct cfs_wi_sched **sched_pp)
 {
 	struct cfs_wi_sched	*sched;
-	int			rc;
 
 	LASSERT(cfs_wi_data.wi_init);
 	LASSERT(!cfs_wi_data.wi_stopping);
@@ -373,9 +374,8 @@ cfs_wi_sched_create(char *name, struct cfs_cpt_table *cptab,
 	INIT_LIST_HEAD(&sched->ws_rerunq);
 	INIT_LIST_HEAD(&sched->ws_list);
 
-	rc = 0;
-	while (nthrs > 0)  {
-		char	name[16];
+	for (; nthrs > 0; nthrs--)  {
+		char		name[16];
 		struct task_struct *task;
 
 		spin_lock(&cfs_wi_data.wi_glock);
@@ -402,21 +402,18 @@ cfs_wi_sched_create(char *name, struct cfs_cpt_table *cptab,
 			nthrs--;
 			continue;
 		}
-		rc = PTR_ERR(task);
-
-		CERROR("Failed to create thread for WI scheduler %s: %d\n",
-		       name, rc);
-
-		spin_lock(&cfs_wi_data.wi_glock);
-
-		/* make up for cfs_wi_sched_destroy */
-		list_add(&sched->ws_list, &cfs_wi_data.wi_scheds);
-		sched->ws_starting--;
-
-		spin_unlock(&cfs_wi_data.wi_glock);
-
-		cfs_wi_sched_destroy(sched);
-		return rc;
+		if (IS_ERR(task)) {
+			int rc = PTR_ERR(task);
+			CERROR("Failed to create thread for "
+				"WI scheduler %s: %d\n", name, rc);
+			spin_lock(&cfs_wi_data.wi_glock);
+			/* make up for cfs_wi_sched_destroy */
+			list_add(&sched->ws_list, &cfs_wi_data.wi_scheds);
+			sched->ws_starting--;
+			spin_unlock(&cfs_wi_data.wi_glock);
+			cfs_wi_sched_destroy(sched);
+			return rc;
+		}
 	}
 	spin_lock(&cfs_wi_data.wi_glock);
 	list_add(&sched->ws_list, &cfs_wi_data.wi_scheds);
diff --git a/drivers/staging/lustre/lustre/llite/dcache.c b/drivers/staging/lustre/lustre/llite/dcache.c
index e7629be..cc7befe 100644
--- a/drivers/staging/lustre/lustre/llite/dcache.c
+++ b/drivers/staging/lustre/lustre/llite/dcache.c
@@ -270,7 +270,6 @@ void ll_intent_release(struct lookup_intent *it)
 void ll_invalidate_aliases(struct inode *inode)
 {
 	struct dentry *dentry;
-	struct ll_d_hlist_node *p;
 
 	LASSERT(inode != NULL);
 
@@ -278,7 +277,7 @@ void ll_invalidate_aliases(struct inode *inode)
 	       inode->i_ino, inode->i_generation, inode);
 
 	ll_lock_dcache(inode);
-	ll_d_hlist_for_each_entry(dentry, p, &inode->i_dentry, d_alias) {
+	ll_d_hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
 		CDEBUG(D_DENTRY, "dentry in drop %.*s (%p) parent %p "
 		       "inode %p flags %d\n", dentry->d_name.len,
 		       dentry->d_name.name, dentry, dentry->d_parent,
@@ -404,7 +403,6 @@ int ll_revalidate_it(struct dentry *de, int lookup_flags,
 		struct inode *inode = de->d_inode;
 		struct ll_inode_info *lli = ll_i2info(inode);
 		struct obd_client_handle **och_p;
-		__u64 *och_usecount;
 		__u64 ibits;
 
 		/*
@@ -417,38 +415,30 @@ int ll_revalidate_it(struct dentry *de, int lookup_flags,
 		 * change, LOOKUP lock is revoked.
 		 */
 
-
-		if (it->it_flags & FMODE_WRITE) {
+		if (it->it_flags & FMODE_WRITE)
 			och_p = &lli->lli_mds_write_och;
-			och_usecount = &lli->lli_open_fd_write_count;
-		} else if (it->it_flags & FMODE_EXEC) {
+		else if (it->it_flags & FMODE_EXEC)
 			och_p = &lli->lli_mds_exec_och;
-			och_usecount = &lli->lli_open_fd_exec_count;
-		} else {
+		else
 			och_p = &lli->lli_mds_read_och;
-			och_usecount = &lli->lli_open_fd_read_count;
-		}
 		/* Check for the proper lock. */
 		ibits = MDS_INODELOCK_LOOKUP;
 		if (!ll_have_md_lock(inode, &ibits, LCK_MINMODE))
 			goto do_lock;
 		mutex_lock(&lli->lli_och_mutex);
 		if (*och_p) { /* Everything is open already, do nothing */
-			/*(*och_usecount)++;  Do not let them steal our open
-			  handle from under us */
-			SET_BUT_UNUSED(och_usecount);
-			/* XXX The code above was my original idea, but in case
-			   we have the handle, but we cannot use it due to later
-			   checks (e.g. O_CREAT|O_EXCL flags set), nobody
-			   would decrement counter increased here. So we just
-			   hope the lock won't be invalidated in between. But
-			   if it would be, we'll reopen the open request to
-			   MDS later during file open path */
+			/* Originally it was idea to do not let them steal our
+			 * open handle from under us by (*och_usecount)++ here.
+			 * But in case we have the handle, but we cannot use it
+			 * due to later checks (e.g. O_CREAT|O_EXCL flags set),
+			 * nobody would decrement counter increased here. So we
+			 * just hope the lock won't be invalidated in between.
+			 * But if it would be, we'll reopen the open request to
+			 * MDS later during file open path. */
 			mutex_unlock(&lli->lli_och_mutex);
 			return 1;
-		} else {
-			mutex_unlock(&lli->lli_och_mutex);
 		}
+		mutex_unlock(&lli->lli_och_mutex);
 	}
 
 	if (it->it_op == IT_GETATTR) {
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index e47e2b1..0000530 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -172,10 +172,9 @@ struct inode *ll_iget(struct super_block *sb, ino_t hash,
 static void ll_invalidate_negative_children(struct inode *dir)
 {
 	struct dentry *dentry, *tmp_subdir;
-	struct ll_d_hlist_node *p;
 
 	ll_lock_dcache(dir);
-	ll_d_hlist_for_each_entry(dentry, p, &dir->i_dentry, d_alias) {
+	ll_d_hlist_for_each_entry(dentry, &dir->i_dentry, d_alias) {
 		spin_lock(&dentry->d_lock);
 		if (!list_empty(&dentry->d_subdirs)) {
 			struct dentry *child;
@@ -352,7 +351,6 @@ void ll_i2gids(__u32 *suppgids, struct inode *i1, struct inode *i2)
 static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
 {
 	struct dentry *alias, *discon_alias, *invalid_alias;
-	struct ll_d_hlist_node *p;
 
 	if (ll_d_hlist_empty(&inode->i_dentry))
 		return NULL;
@@ -360,7 +358,7 @@ static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
 	discon_alias = invalid_alias = NULL;
 
 	ll_lock_dcache(inode);
-	ll_d_hlist_for_each_entry(alias, p, &inode->i_dentry, d_alias) {
+	ll_d_hlist_for_each_entry(alias, &inode->i_dentry, d_alias) {
 		LASSERT(alias != dentry);
 
 		spin_lock(&alias->d_lock);
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 02d76f8..9008861 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -420,7 +420,6 @@ void lprocfs_stats_collect(struct lprocfs_stats *stats, int idx,
 {
 	unsigned int			num_entry;
 	struct lprocfs_counter		*percpu_cntr;
-	struct lprocfs_counter_header	*cntr_header;
 	int				i;
 	unsigned long			flags = 0;
 
@@ -439,7 +438,6 @@ void lprocfs_stats_collect(struct lprocfs_stats *stats, int idx,
 	for (i = 0; i < num_entry; i++) {
 		if (stats->ls_percpu[i] == NULL)
 			continue;
-		cntr_header = &stats->ls_cnt_header[idx];
 		percpu_cntr = lprocfs_stats_counter_get(stats, i, idx);
 
 		cnt->lc_count += percpu_cntr->lc_count;
@@ -999,7 +997,6 @@ EXPORT_SYMBOL(lprocfs_free_stats);
 void lprocfs_clear_stats(struct lprocfs_stats *stats)
 {
 	struct lprocfs_counter		*percpu_cntr;
-	struct lprocfs_counter_header	*header;
 	int				i;
 	int				j;
 	unsigned int			num_entry;
@@ -1011,7 +1008,6 @@ void lprocfs_clear_stats(struct lprocfs_stats *stats)
 		if (stats->ls_percpu[i] == NULL)
 			continue;
 		for (j = 0; j < stats->ls_num; j++) {
-			header = &stats->ls_cnt_header[j];
 			percpu_cntr = lprocfs_stats_counter_get(stats, i, j);
 			percpu_cntr->lc_count		= 0;
 			percpu_cntr->lc_min		= LC_MIN_INIT;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index c2ab0c8..faa982a 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -2288,7 +2288,6 @@ EXPORT_SYMBOL(ptlrpc_req_xid);
 int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async)
 {
 	int		rc;
-	wait_queue_head_t       *wq;
 	struct l_wait_info lwi;
 
 	/*
@@ -2333,12 +2332,11 @@ int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async)
 	 * a chance to run reply_in_callback(), and to make sure we've
 	 * unlinked before returning a req to the pool.
 	 */
-	if (request->rq_set != NULL)
-		wq = &request->rq_set->set_waitq;
-	else
-		wq = &request->rq_reply_waitq;
-
 	for (;;) {
+		/* The wq argument is ignored by user-space wait_event macros */
+		wait_queue_head_t *wq = (request->rq_set != NULL) ?
+				  &request->rq_set->set_waitq :
+				  &request->rq_reply_waitq;
 		/* Network access will complete in finite time but the HUGE
 		 * timeout lets us CWARN for visibility of sluggish NALs */
 		lwi = LWI_TIMEOUT_INTERVAL(cfs_time_seconds(LONG_UNLINK),
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index 7b96a0e..7c133d3 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -170,7 +170,6 @@ int ptlrpc_set_import_discon(struct obd_import *imp, __u32 conn_cnt)
 			       target_len, target_start,
 			       libcfs_nid2str(imp->imp_connection->c_peer.nid));
 		}
-		ptlrpc_deactivate_timeouts(imp);
 		IMPORT_SET_STATE_NOLOCK(imp, LUSTRE_IMP_DISCON);
 		spin_unlock(&imp->imp_lock);
 
@@ -383,7 +382,6 @@ void ptlrpc_activate_import(struct obd_import *imp)
 
 	spin_lock(&imp->imp_lock);
 	imp->imp_invalid = 0;
-	ptlrpc_activate_timeouts(imp);
 	spin_unlock(&imp->imp_lock);
 	obd_import_event(obd, imp, IMP_EVENT_ACTIVE);
 }
diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index a0e0097..a68ed47 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -242,7 +242,6 @@ EXPORT_SYMBOL(ptlrpc_register_bulk);
 int ptlrpc_unregister_bulk(struct ptlrpc_request *req, int async)
 {
 	struct ptlrpc_bulk_desc *desc = req->rq_bulk;
-	wait_queue_head_t	     *wq;
 	struct l_wait_info       lwi;
 	int		      rc;
 
@@ -274,12 +273,11 @@ int ptlrpc_unregister_bulk(struct ptlrpc_request *req, int async)
 	if (async)
 		return 0;
 
-	if (req->rq_set != NULL)
-		wq = &req->rq_set->set_waitq;
-	else
-		wq = &req->rq_reply_waitq;
-
 	for (;;) {
+		/* The wq argument is ignored by user-space wait_event macros */
+		wait_queue_head_t *wq = (req->rq_set != NULL) ?
+				  &req->rq_set->set_waitq :
+				  &req->rq_reply_waitq;
 		/* Network access will complete in finite time but the HUGE
 		 * timeout lets us CWARN for visibility of sluggish NALs */
 		lwi = LWI_TIMEOUT_INTERVAL(cfs_time_seconds(LONG_UNLINK),
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pinger.c b/drivers/staging/lustre/lustre/ptlrpc/pinger.c
index 5dec771..d2f2805 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pinger.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pinger.c
@@ -140,9 +140,6 @@ static inline int ptlrpc_next_reconnect(struct obd_import *imp)
 		return cfs_time_shift(obd_timeout);
 }
 
-static atomic_t suspend_timeouts = ATOMIC_INIT(0);
-static cfs_time_t suspend_wakeup_time = 0;
-
 cfs_duration_t pinger_check_timeout(cfs_time_t time)
 {
 	struct timeout_item *item;
@@ -162,67 +159,6 @@ cfs_duration_t pinger_check_timeout(cfs_time_t time)
 					 cfs_time_current());
 }
 
-static wait_queue_head_t suspend_timeouts_waitq;
-
-cfs_time_t ptlrpc_suspend_wakeup_time(void)
-{
-	return suspend_wakeup_time;
-}
-
-void ptlrpc_deactivate_timeouts(struct obd_import *imp)
-{
-	/*XXX: disabled for now, will be replaced by adaptive timeouts */
-#if 0
-	if (imp->imp_no_timeout)
-		return;
-	imp->imp_no_timeout = 1;
-	atomic_inc(&suspend_timeouts);
-	CDEBUG(D_HA|D_WARNING, "deactivate timeouts %u\n",
-	       atomic_read(&suspend_timeouts));
-#endif
-}
-
-void ptlrpc_activate_timeouts(struct obd_import *imp)
-{
-	/*XXX: disabled for now, will be replaced by adaptive timeouts */
-#if 0
-	if (!imp->imp_no_timeout)
-		return;
-	imp->imp_no_timeout = 0;
-	LASSERT(atomic_read(&suspend_timeouts) > 0);
-	if (atomic_dec_and_test(&suspend_timeouts)) {
-		suspend_wakeup_time = cfs_time_current();
-		wake_up(&suspend_timeouts_waitq);
-	}
-	CDEBUG(D_HA|D_WARNING, "activate timeouts %u\n",
-	       atomic_read(&suspend_timeouts));
-#endif
-}
-
-int ptlrpc_check_suspend(void)
-{
-	if (atomic_read(&suspend_timeouts))
-		return 1;
-	return 0;
-}
-
-int ptlrpc_check_and_wait_suspend(struct ptlrpc_request *req)
-{
-	struct l_wait_info lwi;
-
-	if (atomic_read(&suspend_timeouts)) {
-		DEBUG_REQ(D_NET, req, "-- suspend %d regular timeout",
-			  atomic_read(&suspend_timeouts));
-		lwi = LWI_INTR(NULL, NULL);
-		l_wait_event(suspend_timeouts_waitq,
-			     atomic_read(&suspend_timeouts) == 0, &lwi);
-		DEBUG_REQ(D_NET, req, "-- recharge regular timeout");
-		return 1;
-	}
-	return 0;
-}
-
-
 static bool ir_up;
 
 void ptlrpc_pinger_ir_up(void)
@@ -377,7 +313,6 @@ int ptlrpc_start_pinger(void)
 		return -EALREADY;
 
 	init_waitqueue_head(&pinger_thread.t_ctl_waitq);
-	init_waitqueue_head(&suspend_timeouts_waitq);
 
 	strcpy(pinger_thread.t_name, "ll_ping");
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index 89c9be96..337eefd 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -600,7 +600,6 @@ static int ptlrpcd_bind(int index, int max)
 int ptlrpcd_start(int index, int max, const char *name, struct ptlrpcd_ctl *pc)
 {
 	int rc;
-	int env = 0;
 
 	/*
 	 * Do not allow start second thread for one pc.
@@ -619,6 +618,7 @@ int ptlrpcd_start(int index, int max, const char *name, struct ptlrpcd_ctl *pc)
 	pc->pc_set = ptlrpc_prep_set();
 	if (pc->pc_set == NULL)
 		GOTO(out, rc = -ENOMEM);
+
 	/*
 	 * So far only "client" ptlrpcd uses an environment. In the future,
 	 * ptlrpcd thread (or a thread-set) has to be given an argument,
@@ -626,40 +626,40 @@ int ptlrpcd_start(int index, int max, const char *name, struct ptlrpcd_ctl *pc)
 	 */
 	rc = lu_context_init(&pc->pc_env.le_ctx, LCT_CL_THREAD|LCT_REMEMBER);
 	if (rc != 0)
-		GOTO(out, rc);
+		GOTO(out_set, rc);
 
-	env = 1;
 	{
 		struct task_struct *task;
-
 		if (index >= 0) {
 			rc = ptlrpcd_bind(index, max);
 			if (rc < 0)
-				GOTO(out, rc);
+				GOTO(out_env, rc);
 		}
 
-		task = kthread_run(ptlrpcd, pc, "%s", pc->pc_name);
+		task = kthread_run(ptlrpcd, pc, pc->pc_name);
 		if (IS_ERR(task))
-			GOTO(out, rc = PTR_ERR(task));
+			GOTO(out_env, rc = PTR_ERR(task));
 
-		rc = 0;
 		wait_for_completion(&pc->pc_starting);
 	}
-out:
-	if (rc) {
-		if (pc->pc_set != NULL) {
-			struct ptlrpc_request_set *set = pc->pc_set;
-
-			spin_lock(&pc->pc_lock);
-			pc->pc_set = NULL;
-			spin_unlock(&pc->pc_lock);
-			ptlrpc_set_destroy(set);
-		}
-		if (env != 0)
-			lu_context_fini(&pc->pc_env.le_ctx);
-		clear_bit(LIOD_BIND, &pc->pc_flags);
-		clear_bit(LIOD_START, &pc->pc_flags);
+	return 0;
+
+out_env:
+	lu_context_fini(&pc->pc_env.le_ctx);
+
+out_set:
+	if (pc->pc_set != NULL) {
+		struct ptlrpc_request_set *set = pc->pc_set;
+
+		spin_lock(&pc->pc_lock);
+		pc->pc_set = NULL;
+		spin_unlock(&pc->pc_lock);
+		ptlrpc_set_destroy(set);
 	}
+	clear_bit(LIOD_BIND, &pc->pc_flags);
+
+out:
+	clear_bit(LIOD_START, &pc->pc_flags);
 	return rc;
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 26/40] staging/lustre/build: fix compilation issue with is_compat_task
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (24 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 25/40] staging/lustre/build: clean up unused variables and dead code Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 27/40] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer Peng Tao
                   ` (14 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Dmitry Eremin, Peng Tao, Andreas Dilger

From: Dmitry Eremin <dmitry.eremin@intel.com>

After removing LIBCFS_HAVE_IS_COMPAT_TASK test we have a
compilation issue with kernels configured without CONFIG_COMPAT.

Lustre-change: http://review.whamcloud.com/7118
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2800
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/lustre/llite/llite_internal.h   |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index e800f8d..da56f77 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -650,7 +650,12 @@ static inline int ll_need_32bit_api(struct ll_sb_info *sbi)
 #if BITS_PER_LONG == 32
 	return 1;
 #else
-	return unlikely(is_compat_task() || (sbi->ll_flags & LL_SBI_32BIT_API));
+	return unlikely(
+#ifdef CONFIG_COMPAT
+		is_compat_task() ||
+#endif
+		(sbi->ll_flags & LL_SBI_32BIT_API)
+	);
 #endif
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 27/40] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (25 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 26/40] staging/lustre/build: fix compilation issue with is_compat_task Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 28/40] staging/lustre/hsm: Add hsm_release feature Peng Tao
                   ` (13 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Amir Shehata, Peng Tao, Andreas Dilger

From: Amir Shehata <amir.shehata@intel.com>

When a system runs out of memory and the function
ptlrpc_register_bulk() is called from ptl_send_rpc() the call to
LNetMEAttach() fails due to failure to allocate memory.  This forces
the code into an error path, which most probably previously went
untested.  The error path:
if (rc != 0) {
        CERROR("%s: LNetMEAttach failed x"LPU64"/%d: rc = %dn",
                desc->bd_export->exp_obd->obd_name, xid,
                posted_md, rc);
        break;
}
This print assumes that desc->bd_export is not NULL.  However, it is.
In fact it is expected to be NULL.  desc->bd_import is the correct
structure to access in this case.

Lustre-change: http://review.whamcloud.com/7121
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3585
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index a68ed47..7ceb114 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -179,7 +179,7 @@ int ptlrpc_register_bulk(struct ptlrpc_request *req)
 				  LNET_UNLINK, LNET_INS_AFTER, &me_h);
 		if (rc != 0) {
 			CERROR("%s: LNetMEAttach failed x"LPU64"/%d: rc = %d\n",
-			       desc->bd_export->exp_obd->obd_name, xid,
+			       desc->bd_import->imp_obd->obd_name, xid,
 			       posted_md, rc);
 			break;
 		}
@@ -189,7 +189,7 @@ int ptlrpc_register_bulk(struct ptlrpc_request *req)
 				  &desc->bd_mds[posted_md]);
 		if (rc != 0) {
 			CERROR("%s: LNetMDAttach failed x"LPU64"/%d: rc = %d\n",
-			       desc->bd_export->exp_obd->obd_name, xid,
+			       desc->bd_import->imp_obd->obd_name, xid,
 			       posted_md, rc);
 			rc2 = LNetMEUnlink(me_h);
 			LASSERT(rc2 == 0);
@@ -219,7 +219,7 @@ int ptlrpc_register_bulk(struct ptlrpc_request *req)
 	/* Holler if peer manages to touch buffers before he knows the xid */
 	if (desc->bd_md_count != total_md)
 		CWARN("%s: Peer %s touched %d buffers while I registered\n",
-		      desc->bd_export->exp_obd->obd_name, libcfs_id2str(peer),
+		      desc->bd_import->imp_obd->obd_name, libcfs_id2str(peer),
 		      total_md - desc->bd_md_count);
 	spin_unlock(&desc->bd_lock);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 28/40] staging/lustre/hsm: Add hsm_release feature.
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (26 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 27/40] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 29/40] staging/lustre/llite: extended attribute cache Peng Tao
                   ` (12 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Jinshan Xiong, Aurelien Degremont, Peng Tao,
	Andreas Dilger

From: Jinshan Xiong <jinshan.xiong@intel.com>

HSM Release is one of the key feature of HSM. To perform HSM
release, clients need to acquire the file lease exclusivelt and
flush dirty cache from clients. A special close REQ will be sent
to the MDT to release the lease and get rid of OST objects.

Lustre-change: http://review.whamcloud.com/7028
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1333
Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   14 +++-
 .../lustre/lustre/include/lustre/lustre_user.h     |   11 ++-
 .../lustre/lustre/include/lustre_req_layout.h      |    2 +
 drivers/staging/lustre/lustre/include/obd.h        |    6 +-
 .../staging/lustre/lustre/lclient/lcommon_misc.c   |    4 +-
 drivers/staging/lustre/lustre/llite/dir.c          |   24 +++++-
 drivers/staging/lustre/lustre/llite/file.c         |   87 +++++++++++++++++---
 .../staging/lustre/lustre/llite/llite_internal.h   |    3 +-
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    2 +-
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |   16 ++++
 drivers/staging/lustre/lustre/lov/lov_object.c     |   35 +++++---
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   26 +++++-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   23 +++++-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   19 +++++
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    7 ++
 15 files changed, 242 insertions(+), 37 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 05e583f..97d633e 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1750,6 +1750,7 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 #define OBD_MD_FLRMTRGETFACL (0x0008000000000000ULL) /* lfs rgetfacl case */
 
 #define OBD_MD_FLDATAVERSION (0x0010000000000000ULL) /* iversion sum */
+#define OBD_MD_FLRELEASED    (0x0020000000000000ULL) /* file released */
 
 #define OBD_MD_FLGETATTR (OBD_MD_FLID    | OBD_MD_FLATIME | OBD_MD_FLMTIME | \
 			  OBD_MD_FLCTIME | OBD_MD_FLSIZE  | OBD_MD_FLBLKSZ | \
@@ -2386,6 +2387,7 @@ extern void lustre_swab_mdt_rec_setattr (struct mdt_rec_setattr *sa);
 					      * delegation, succeed if it's not
 					      * being opened with conflict mode.
 					      */
+#define MDS_OPEN_RELEASE   02000000000000ULL /* Open the file for HSM release */
 
 /* permission for create non-directory file */
 #define MAY_CREATE      (1 << 7)
@@ -2404,7 +2406,7 @@ extern void lustre_swab_mdt_rec_setattr (struct mdt_rec_setattr *sa);
 /* lfs rgetfacl permission check */
 #define MAY_RGETFACL    (1 << 14)
 
-enum {
+enum mds_op_bias {
 	MDS_CHECK_SPLIT		= 1 << 0,
 	MDS_CROSS_REF		= 1 << 1,
 	MDS_VTX_BYPASS		= 1 << 2,
@@ -2417,6 +2419,7 @@ enum {
 	MDS_DATA_MODIFIED	= 1 << 9,
 	MDS_CREATE_VOLATILE	= 1 << 10,
 	MDS_OWNEROVERRIDE	= 1 << 11,
+	MDS_HSM_RELEASE		= 1 << 12,
 };
 
 /* instance of mdt_reint_rec */
@@ -3740,5 +3743,14 @@ struct mdc_swap_layouts {
 
 void lustre_swab_swap_layouts(struct mdc_swap_layouts *msl);
 
+struct close_data {
+	struct lustre_handle	cd_handle;
+	struct lu_fid		cd_fid;
+	__u64			cd_data_version;
+	__u64			cd_reserved[8];
+};
+
+void lustre_swab_close_data(struct close_data *data);
+
 #endif
 /** @} lustreidl */
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index b6090b5..ec1bcf5 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -623,10 +623,13 @@ struct if_quotactl {
 };
 
 /* swap layout flags */
-#define	SWAP_LAYOUTS_CHECK_DV1		(1 << 0)
-#define	SWAP_LAYOUTS_CHECK_DV2		(1 << 1)
-#define	SWAP_LAYOUTS_KEEP_MTIME		(1 << 2)
-#define	SWAP_LAYOUTS_KEEP_ATIME		(1 << 3)
+#define SWAP_LAYOUTS_CHECK_DV1		(1 << 0)
+#define SWAP_LAYOUTS_CHECK_DV2		(1 << 1)
+#define SWAP_LAYOUTS_KEEP_MTIME		(1 << 2)
+#define SWAP_LAYOUTS_KEEP_ATIME		(1 << 3)
+
+/* Swap XATTR_NAME_HSM as well, only on the MDT so far */
+#define SWAP_LAYOUTS_MDS_HSM		(1 << 31)
 struct lustre_swap_layouts {
 	__u64	sl_flags;
 	__u32	sl_fd;
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index 2832569..a75f4c6 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -164,6 +164,7 @@ extern struct req_format RQF_UPDATE_OBJ;
  */
 extern struct req_format RQF_MDS_GETATTR_NAME;
 extern struct req_format RQF_MDS_CLOSE;
+extern struct req_format RQF_MDS_RELEASE_CLOSE;
 extern struct req_format RQF_MDS_PIN;
 extern struct req_format RQF_MDS_UNPIN;
 extern struct req_format RQF_MDS_CONNECT;
@@ -262,6 +263,7 @@ extern struct req_msg_field RMF_GETINFO_VAL;
 extern struct req_msg_field RMF_GETINFO_VALLEN;
 extern struct req_msg_field RMF_GETINFO_KEY;
 extern struct req_msg_field RMF_IDX_INFO;
+extern struct req_msg_field RMF_CLOSE_DATA;
 
 /*
  * connection handle received in MDS_CONNECT request.
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f0c1773..3247d1d 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -1070,7 +1070,7 @@ struct md_op_data {
 	struct obd_capa	*op_capa2;
 
 	/* Various operation flags. */
-	__u32		   op_bias;
+	enum mds_op_bias        op_bias;
 
 	/* Operation type */
 	__u32		   op_opc;
@@ -1084,6 +1084,10 @@ struct md_op_data {
 	/* used to transfer info between the stacks of MD client
 	 * see enum op_cli_flags */
 	__u32			op_cli_flags;
+
+	/* File object data version for HSM release, on client */
+	__u64			op_data_version;
+	struct lustre_handle	op_lease_handle;
 };
 
 enum op_cli_flags {
diff --git a/drivers/staging/lustre/lustre/lclient/lcommon_misc.c b/drivers/staging/lustre/lustre/lclient/lcommon_misc.c
index 2b4dbee..e04c2d3 100644
--- a/drivers/staging/lustre/lustre/lclient/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/lclient/lcommon_misc.c
@@ -140,7 +140,9 @@ int cl_get_grouplock(struct cl_object *obj, unsigned long gid, int nonblock,
 
 	rc = cl_io_init(env, io, CIT_MISC, io->ci_obj);
 	if (rc) {
-		LASSERT(rc < 0);
+		/* Does not make sense to take GL for released layout */
+		if (rc > 0)
+			rc = -ENOTSUPP;
 		cl_env_put(env, &refcheck);
 		return rc;
 	}
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 1f07903..22d0acc9 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1809,8 +1809,28 @@ out_rmdir:
 			return -EFAULT;
 		}
 
-		rc = obd_iocontrol(cmd, ll_i2mdexp(inode), totalsize,
-				   hur, NULL);
+		if (hur->hur_request.hr_action == HUA_RELEASE) {
+			const struct lu_fid *fid;
+			struct inode *f;
+			int i;
+
+			for (i = 0; i < hur->hur_request.hr_itemcount; i++) {
+				fid = &hur->hur_user_item[i].hui_fid;
+				f = search_inode_for_lustre(inode->i_sb, fid);
+				if (IS_ERR(f)) {
+					rc = PTR_ERR(f);
+					break;
+				}
+
+				rc = ll_hsm_release(f);
+				iput(f);
+				if (rc != 0)
+					break;
+			}
+		} else {
+			rc = obd_iocontrol(cmd, ll_i2mdexp(inode), totalsize,
+					   hur, NULL);
+		}
 
 		OBD_FREE_LARGE(hur, totalsize);
 
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 174ba4e..981d3fe 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -115,7 +115,8 @@ out:
 
 static int ll_close_inode_openhandle(struct obd_export *md_exp,
 				     struct inode *inode,
-				     struct obd_client_handle *och)
+				     struct obd_client_handle *och,
+				     const __u64 *data_version)
 {
 	struct obd_export *exp = ll_i2mdexp(inode);
 	struct md_op_data *op_data;
@@ -139,6 +140,13 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		GOTO(out, rc = -ENOMEM); // XXX We leak openhandle and request here.
 
 	ll_prepare_close(inode, op_data, och);
+	if (data_version != NULL) {
+		/* Pass in data_version implies release. */
+		op_data->op_bias |= MDS_HSM_RELEASE;
+		op_data->op_data_version = *data_version;
+		op_data->op_lease_handle = och->och_lease_handle;
+		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
+	}
 	epoch_close = (op_data->op_flags & MF_EPOCH_CLOSE);
 	rc = md_close(md_exp, op_data, och->och_mod, &req);
 	if (rc == -EAGAIN) {
@@ -167,14 +175,20 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		spin_unlock(&lli->lli_lock);
 	}
 
-	ll_finish_md_op_data(op_data);
-
 	if (rc == 0) {
 		rc = ll_objects_destroy(req, inode);
 		if (rc)
 			CERROR("inode %lu ll_objects destroy: rc = %d\n",
 			       inode->i_ino, rc);
 	}
+	if (rc == 0 && op_data->op_bias & MDS_HSM_RELEASE) {
+		struct mdt_body *body;
+		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
+		if (!(body->valid & OBD_MD_FLRELEASED))
+			rc = -EBUSY;
+	}
+
+	ll_finish_md_op_data(op_data);
 
 out:
 	if (exp_connect_som(exp) && !epoch_close &&
@@ -224,7 +238,7 @@ int ll_md_real_close(struct inode *inode, int flags)
 	if (och) { /* There might be a race and somebody have freed this och
 		      already */
 		rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-					       inode, och);
+					       inode, och, NULL);
 	}
 
 	return rc;
@@ -254,7 +268,7 @@ int ll_md_close(struct obd_export *md_exp, struct inode *inode,
 	}
 
 	if (fd->fd_och != NULL) {
-		rc = ll_close_inode_openhandle(md_exp, inode, fd->fd_och);
+		rc = ll_close_inode_openhandle(md_exp, inode, fd->fd_och, NULL);
 		fd->fd_och = NULL;
 		GOTO(out, rc);
 	}
@@ -718,7 +732,7 @@ static int ll_md_blocking_lease_ast(struct ldlm_lock *lock,
  * Acquire a lease and open the file.
  */
 struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
-					fmode_t fmode)
+					fmode_t fmode, __u64 open_flags)
 {
 	struct lookup_intent it = { .it_op = IT_OPEN };
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
@@ -786,7 +800,8 @@ struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
 	/* To tell the MDT this openhandle is from the same owner */
 	op_data->op_handle = old_handle;
 
-	it.it_flags = fmode | MDS_OPEN_LOCK | MDS_OPEN_BY_FID | MDS_OPEN_LEASE;
+	it.it_flags = fmode | open_flags;
+	it.it_flags |= MDS_OPEN_LOCK | MDS_OPEN_BY_FID | MDS_OPEN_LEASE;
 	rc = md_intent_lock(sbi->ll_md_exp, op_data, NULL, 0, &it, 0, &req,
 				ll_md_blocking_lease_ast,
 	/* LDLM_FL_NO_LRU: To not put the lease lock into LRU list, otherwise
@@ -832,7 +847,7 @@ struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
 	return och;
 
 out_close:
-	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, inode, och);
+	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, inode, och, NULL);
 	if (rc2)
 		CERROR("Close openhandle returned %d\n", rc2);
 
@@ -877,7 +892,8 @@ int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 	if (lease_broken != NULL)
 		*lease_broken = cancelled;
 
-	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och);
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och,
+				       NULL);
 	return rc;
 }
 EXPORT_SYMBOL(ll_lease_close);
@@ -1686,8 +1702,8 @@ int ll_release_openhandle(struct dentry *dentry, struct lookup_intent *it)
 	ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
 
 	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-				       inode, och);
- out:
+				       inode, och, NULL);
+out:
 	/* this one is in place of ll_file_open */
 	if (it_disposition(it, DISP_ENQ_OPEN_REF)) {
 		ptlrpc_req_finished(it->d.lustre.it_data);
@@ -1892,6 +1908,53 @@ out:
 	return rc;
 }
 
+/*
+ * Trigger a HSM release request for the provided inode.
+ */
+int ll_hsm_release(struct inode *inode)
+{
+	struct cl_env_nest nest;
+	struct lu_env *env;
+	struct obd_client_handle *och = NULL;
+	__u64 data_version = 0;
+	int rc;
+
+
+	CDEBUG(D_INODE, "%s: Releasing file "DFID".\n",
+	       ll_get_fsname(inode->i_sb, NULL, 0),
+	       PFID(&ll_i2info(inode)->lli_fid));
+
+	och = ll_lease_open(inode, NULL, FMODE_WRITE, MDS_OPEN_RELEASE);
+	if (IS_ERR(och))
+		GOTO(out, rc = PTR_ERR(och));
+
+	/* Grab latest data_version and [am]time values */
+	rc = ll_data_version(inode, &data_version, 1);
+	if (rc != 0)
+		GOTO(out, rc);
+
+	env = cl_env_nested_get(&nest);
+	if (IS_ERR(env))
+		GOTO(out, rc = PTR_ERR(env));
+
+	ll_merge_lvb(env, inode);
+	cl_env_nested_put(&nest, env);
+
+	/* Release the file.
+	 * NB: lease lock handle is released in mdc_hsm_release_pack() because
+	 * we still need it to pack l_remote_handle to MDT. */
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och,
+				       &data_version);
+	och = NULL;
+
+
+out:
+	if (och != NULL && !IS_ERR(och)) /* close the file */
+		ll_lease_close(och, inode, NULL);
+
+	return rc;
+}
+
 struct ll_swap_stack {
 	struct iattr		 ia1, ia2;
 	__u64			 dv1, dv2;
@@ -2318,7 +2381,7 @@ long ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		CDEBUG(D_INODE, "Set lease with mode %d\n", mode);
 
 		/* apply for lease */
-		och = ll_lease_open(inode, file, mode);
+		och = ll_lease_open(inode, file, mode, 0);
 		if (IS_ERR(och))
 			return PTR_ERR(och);
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index da56f77..06e83dd 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -787,9 +787,10 @@ int ll_get_grouplock(struct inode *inode, struct file *file, unsigned long arg);
 int ll_put_grouplock(struct inode *inode, struct file *file, unsigned long arg);
 int ll_fid2path(struct inode *inode, void *arg);
 int ll_data_version(struct inode *inode, __u64 *data_version, int extent_lock);
+int ll_hsm_release(struct inode *inode);
 
 struct obd_client_handle *ll_lease_open(struct inode *inode, struct file *file,
-					fmode_t mode);
+					fmode_t mode, __u64 flags);
 int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 		   bool *lease_broken);
 
diff --git a/drivers/staging/lustre/lustre/llite/vvp_object.c b/drivers/staging/lustre/lustre/llite/vvp_object.c
index 33173fc..25973de 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_object.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_object.c
@@ -138,7 +138,7 @@ int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 			lli->lli_layout_gen,
 			conf->u.coc_md->lsm->lsm_layout_gen);
 
-		lli->lli_has_smd = true;
+		lli->lli_has_smd = lsm_has_objects(conf->u.coc_md->lsm);
 		lli->lli_layout_gen = conf->u.coc_md->lsm->lsm_layout_gen;
 	} else {
 		CDEBUG(D_VFSTRACE, "layout lock destroyed: %u.\n",
diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index 4276124..3965d5e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -168,6 +168,22 @@ enum lov_layout_type {
 	LLT_NR
 };
 
+static inline char *llt2str(enum lov_layout_type llt)
+{
+	switch (llt) {
+	case LLT_EMPTY:
+		return "EMPTY";
+	case LLT_RAID0:
+		return "RAID0";
+	case LLT_RELEASED:
+		return "RELEASED";
+	case LLT_NR:
+		LBUG();
+	}
+	LBUG();
+	return "";
+}
+
 /**
  * lov-specific file state.
  *
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index cf2fa8a..df8b5b5 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -393,13 +393,13 @@ static int lov_print_empty(const struct lu_env *env, void *cookie,
 static int lov_print_raid0(const struct lu_env *env, void *cookie,
 			   lu_printer_t p, const struct lu_object *o)
 {
-	struct lov_object       *lov = lu2lov(o);
-	struct lov_layout_raid0 *r0  = lov_r0(lov);
-	struct lov_stripe_md    *lsm = lov->lo_lsm;
-	int i;
+	struct lov_object	*lov = lu2lov(o);
+	struct lov_layout_raid0	*r0  = lov_r0(lov);
+	struct lov_stripe_md	*lsm = lov->lo_lsm;
+	int			 i;
 
-	(*p)(env, cookie, "stripes: %d, %svalid, lsm{%p 0x%08X %d %u %u}: \n",
-		r0->lo_nr, lov->lo_layout_invalid ? "in" : "", lsm,
+	(*p)(env, cookie, "stripes: %d, %s, lsm{%p 0x%08X %d %u %u}:\n",
+		r0->lo_nr, lov->lo_layout_invalid ? "invalid" : "valid", lsm,
 		lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
 		lsm->lsm_stripe_count, lsm->lsm_layout_gen);
 	for (i = 0; i < r0->lo_nr; ++i) {
@@ -408,8 +408,9 @@ static int lov_print_raid0(const struct lu_env *env, void *cookie,
 		if (r0->lo_sub[i] != NULL) {
 			sub = lovsub2lu(r0->lo_sub[i]);
 			lu_object_print(env, cookie, p, sub);
-		} else
+		} else {
 			(*p)(env, cookie, "sub %d absent\n", i);
+		}
 	}
 	return 0;
 }
@@ -417,7 +418,14 @@ static int lov_print_raid0(const struct lu_env *env, void *cookie,
 static int lov_print_released(const struct lu_env *env, void *cookie,
 				lu_printer_t p, const struct lu_object *o)
 {
-	(*p)(env, cookie, "released\n");
+	struct lov_object	*lov = lu2lov(o);
+	struct lov_stripe_md	*lsm = lov->lo_lsm;
+
+	(*p)(env, cookie,
+		"released: %s, lsm{%p 0x%08X %d %u %u}:\n",
+		lov->lo_layout_invalid ? "invalid" : "valid", lsm,
+		lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
+		lsm->lsm_stripe_count, lsm->lsm_layout_gen);
 	return 0;
 }
 
@@ -662,6 +670,10 @@ static int lov_layout_change(const struct lu_env *unused,
 		return PTR_ERR(env);
 	}
 
+	CDEBUG(D_INODE, DFID" from %s to %s\n",
+	       PFID(lu_object_fid(lov2lu(lov))),
+	       llt2str(lov->lo_type), llt2str(llt));
+
 	old_ops = &lov_dispatch[lov->lo_type];
 	new_ops = &lov_dispatch[llt];
 
@@ -750,8 +762,9 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 	if (conf->u.coc_md != NULL)
 		lsm = conf->u.coc_md->lsm;
 	if ((lsm == NULL && lov->lo_lsm == NULL) ||
-	    (lsm != NULL && lov->lo_lsm != NULL &&
-	     lov->lo_lsm->lsm_layout_gen == lsm->lsm_layout_gen)) {
+	    ((lsm != NULL && lov->lo_lsm != NULL) &&
+	     (lov->lo_lsm->lsm_layout_gen == lsm->lsm_layout_gen) &&
+	     (lov->lo_lsm->lsm_pattern == lsm->lsm_pattern))) {
 		/* same version of layout */
 		lov->lo_layout_invalid = false;
 		GOTO(out, result = 0);
@@ -767,6 +780,8 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 
 out:
 	lov_conf_unlock(lov);
+	CDEBUG(D_INODE, DFID" lo_layout_invalid=%d\n",
+	       PFID(lu_object_fid(lov2lu(lov))), lov->lo_layout_invalid);
 	return result;
 }
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index 3e77020..91f6876 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -179,7 +179,8 @@ static __u64 mds_pack_open_flags(__u64 flags, __u32 mode)
 	__u64 cr_flags = (flags & (FMODE_READ | FMODE_WRITE |
 				   MDS_OPEN_HAS_EA | MDS_OPEN_HAS_OBJS |
 				   MDS_OPEN_OWNEROVERRIDE | MDS_OPEN_LOCK |
-				   MDS_OPEN_BY_FID | MDS_OPEN_LEASE));
+				   MDS_OPEN_BY_FID | MDS_OPEN_LEASE |
+				   MDS_OPEN_RELEASE));
 	if (flags & O_CREAT)
 		cr_flags |= MDS_OPEN_CREAT;
 	if (flags & O_EXCL)
@@ -490,6 +491,28 @@ void mdc_getattr_pack(struct ptlrpc_request *req, __u64 valid, int flags,
 	}
 }
 
+static void mdc_hsm_release_pack(struct ptlrpc_request *req,
+				 struct md_op_data *op_data)
+{
+	if (op_data->op_bias & MDS_HSM_RELEASE) {
+		struct close_data *data;
+		struct ldlm_lock *lock;
+
+		data = req_capsule_client_get(&req->rq_pill, &RMF_CLOSE_DATA);
+		LASSERT(data != NULL);
+
+		lock = ldlm_handle2lock(&op_data->op_lease_handle);
+		if (lock != NULL) {
+			data->cd_handle = lock->l_remote_handle;
+			ldlm_lock_put(lock);
+		}
+		ldlm_cli_cancel(&op_data->op_lease_handle, LCF_LOCAL);
+
+		data->cd_data_version = op_data->op_data_version;
+		data->cd_fid = op_data->op_fid2;
+	}
+}
+
 void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 {
 	struct mdt_ioepoch *epoch;
@@ -501,6 +524,7 @@ void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 	mdc_setattr_pack_rec(rec, op_data);
 	mdc_pack_capa(req, &RMF_CAPA1, op_data->op_capa1);
 	mdc_ioepoch_pack(epoch, op_data);
+	mdc_hsm_release_pack(req, op_data);
 }
 
 static int mdc_req_avail(struct client_obd *cli, struct mdc_cache_waiter *mcw)
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 8373778..0b43251 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -800,10 +800,27 @@ int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 {
 	struct obd_device     *obd = class_exp2obd(exp);
 	struct ptlrpc_request *req;
-	int		    rc;
+	struct req_format     *req_fmt;
+	int                    rc;
+	int		       saved_rc = 0;
+
+
+	req_fmt = &RQF_MDS_CLOSE;
+	if (op_data->op_bias & MDS_HSM_RELEASE) {
+		req_fmt = &RQF_MDS_RELEASE_CLOSE;
+
+		/* allocate a FID for volatile file */
+		rc = mdc_fid_alloc(exp, &op_data->op_fid2, op_data);
+		if (rc < 0) {
+			CERROR("%s: "DFID" failed to allocate FID: %d\n",
+			       obd->obd_name, PFID(&op_data->op_fid1), rc);
+			/* save the errcode and proceed to close */
+			saved_rc = rc;
+		}
+	}
 
 	*request = NULL;
-	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_MDS_CLOSE);
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
 	if (req == NULL)
 		return -ENOMEM;
 
@@ -893,7 +910,7 @@ int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 	}
 	*request = req;
 	mdc_close_handle_reply(req, op_data, rc);
-	return rc;
+	return rc < 0 ? rc : saved_rc;
 }
 
 int mdc_done_writing(struct obd_export *exp, struct md_op_data *op_data,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index e7524f9..1db01a4 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -145,6 +145,14 @@ static const struct req_msg_field *mdt_close_client[] = {
 	&RMF_CAPA1
 };
 
+static const struct req_msg_field *mdt_release_close_client[] = {
+	&RMF_PTLRPC_BODY,
+	&RMF_MDT_EPOCH,
+	&RMF_REC_REINT,
+	&RMF_CAPA1,
+	&RMF_CLOSE_DATA
+};
+
 static const struct req_msg_field *obd_statfs_server[] = {
 	&RMF_PTLRPC_BODY,
 	&RMF_OBD_STATFS
@@ -666,6 +674,7 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_GETXATTR,
 	&RQF_MDS_SYNC,
 	&RQF_MDS_CLOSE,
+	&RQF_MDS_RELEASE_CLOSE,
 	&RQF_MDS_PIN,
 	&RQF_MDS_UNPIN,
 	&RQF_MDS_READPAGE,
@@ -885,6 +894,11 @@ struct req_msg_field RMF_PTLRPC_BODY =
 		    sizeof(struct ptlrpc_body), lustre_swab_ptlrpc_body, NULL);
 EXPORT_SYMBOL(RMF_PTLRPC_BODY);
 
+struct req_msg_field RMF_CLOSE_DATA =
+	DEFINE_MSGF("data_version", 0,
+		    sizeof(struct close_data), lustre_swab_close_data, NULL);
+EXPORT_SYMBOL(RMF_CLOSE_DATA);
+
 struct req_msg_field RMF_OBD_STATFS =
 	DEFINE_MSGF("obd_statfs", 0,
 		    sizeof(struct obd_statfs), lustre_swab_obd_statfs, NULL);
@@ -1412,6 +1426,11 @@ struct req_format RQF_MDS_CLOSE =
 			mdt_close_client, mds_last_unlink_server);
 EXPORT_SYMBOL(RQF_MDS_CLOSE);
 
+struct req_format RQF_MDS_RELEASE_CLOSE =
+	DEFINE_REQ_FMT0("MDS_CLOSE",
+			mdt_release_close_client, mds_last_unlink_server);
+EXPORT_SYMBOL(RQF_MDS_RELEASE_CLOSE);
+
 struct req_format RQF_MDS_PIN =
 	DEFINE_REQ_FMT0("MDS_PIN",
 			mdt_body_capa, mdt_body_only);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 8ec08d9..cc978d7 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -2565,3 +2565,10 @@ void lustre_swab_swap_layouts(struct mdc_swap_layouts *msl)
 	__swab64s(&msl->msl_flags);
 }
 EXPORT_SYMBOL(lustre_swab_swap_layouts);
+
+void lustre_swab_close_data(struct close_data *cd)
+{
+	lustre_swab_lu_fid(&cd->cd_fid);
+	__swab64s(&cd->cd_data_version);
+}
+EXPORT_SYMBOL(lustre_swab_close_data);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 29/40] staging/lustre/llite: extended attribute cache
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (27 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 28/40] staging/lustre/hsm: Add hsm_release feature Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 30/40] staging/lustre/xattr: separate ACL and XATTR caches Peng Tao
                   ` (11 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Andrew Perepechko, Peng Tao, Andreas Dilger

From: Andrew Perepechko <andrew_perepechko@xyratex.com>

This patch implements an extended attribute cache for
a Lustre client. It is organized as a write-through
cache: reads are performed from cache, updates are sent
synchronously to the MDS. An additional inode bit
MDS_INODELOCK_XATTR is added to protect the cache.

Lustre-change: http://review.whamcloud.com/5537
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2869
Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/linux/lustre_lite.h      |    1 +
 .../lustre/lustre/include/lustre/lustre_idl.h      |   12 +-
 .../lustre/lustre/include/lustre_req_layout.h      |    3 +
 drivers/staging/lustre/lustre/include/md_object.h  |    4 +-
 drivers/staging/lustre/lustre/include/obd.h        |    5 +
 .../staging/lustre/lustre/include/obd_support.h    |    1 +
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    2 +
 drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |   13 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   36 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   22 +-
 drivers/staging/lustre/lustre/llite/lproc_llite.c  |   36 ++
 drivers/staging/lustre/lustre/llite/namei.c        |    4 +
 drivers/staging/lustre/lustre/llite/super25.c      |    4 +
 drivers/staging/lustre/lustre/llite/xattr.c        |  102 ++--
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  640 ++++++++++++++++++++
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    1 +
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |   63 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   34 ++
 19 files changed, 923 insertions(+), 62 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/llite/xattr_cache.c

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_lite.h b/drivers/staging/lustre/lustre/include/linux/lustre_lite.h
index 9e5df8d..df93912 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_lite.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_lite.h
@@ -88,6 +88,7 @@ enum {
 	 LPROC_LL_ALLOC_INODE,
 	 LPROC_LL_SETXATTR,
 	 LPROC_LL_GETXATTR,
+	 LPROC_LL_GETXATTR_HITS,
 	 LPROC_LL_LISTXATTR,
 	 LPROC_LL_REMOVEXATTR,
 	 LPROC_LL_INODE_PERM,
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 97d633e..54e1599 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1357,7 +1357,7 @@ extern void lustre_swab_ptlrpc_body(struct ptlrpc_body *pb);
 				OBD_CONNECT_EINPROGRESS | \
 				OBD_CONNECT_LIGHTWEIGHT | OBD_CONNECT_UMASK | \
 				OBD_CONNECT_LVB_TYPE | OBD_CONNECT_LAYOUTLOCK |\
-				OBD_CONNECT_PINGLESS)
+				OBD_CONNECT_PINGLESS | OBD_CONNECT_MAX_EASIZE)
 #define OST_CONNECT_SUPPORTED  (OBD_CONNECT_SRVLOCK | OBD_CONNECT_GRANT | \
 				OBD_CONNECT_REQPORTAL | OBD_CONNECT_VERSION | \
 				OBD_CONNECT_TRUNCLOCK | OBD_CONNECT_INDEX | \
@@ -1741,7 +1741,9 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 #define OBD_MD_FLCKSPLIT     (0x0000080000000000ULL) /* Check split on server */
 #define OBD_MD_FLCROSSREF    (0x0000100000000000ULL) /* Cross-ref case */
 #define OBD_MD_FLGETATTRLOCK (0x0000200000000000ULL) /* Get IOEpoch attributes
-						      * under lock */
+						      * under lock; for xattr
+						      * requests means the
+						      * client holds the lock */
 #define OBD_MD_FLOBJCOUNT    (0x0000400000000000ULL) /* for multiple destroy */
 
 #define OBD_MD_FLRMTLSETFACL (0x0001000000000000ULL) /* lfs lsetfacl case */
@@ -1758,6 +1760,9 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 			  OBD_MD_FLGID   | OBD_MD_FLFLAGS | OBD_MD_FLNLINK | \
 			  OBD_MD_FLGENER | OBD_MD_FLRDEV  | OBD_MD_FLGROUP)
 
+#define OBD_MD_FLXATTRLOCKED OBD_MD_FLGETATTRLOCK
+#define OBD_MD_FLXATTRALL (OBD_MD_FLXATTR | OBD_MD_FLXATTRLS)
+
 /* don't forget obdo_fid which is way down at the bottom so it can
  * come after the definition of llog_cookie */
 
@@ -2130,8 +2135,9 @@ extern void lustre_swab_generic_32s (__u32 *val);
 #define MDS_INODELOCK_OPEN   0x000004       /* For opened files */
 #define MDS_INODELOCK_LAYOUT 0x000008       /* for layout */
 #define MDS_INODELOCK_PERM   0x000010       /* for permission */
+#define MDS_INODELOCK_XATTR  0x000020       /* extended attributes */
 
-#define MDS_INODELOCK_MAXSHIFT 4
+#define MDS_INODELOCK_MAXSHIFT 5
 /* This FULL lock is useful to take on unlink sort of operations */
 #define MDS_INODELOCK_FULL ((1<<(MDS_INODELOCK_MAXSHIFT+1))-1)
 
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index a75f4c6..a83db61 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -230,6 +230,7 @@ extern struct req_format RQF_LDLM_INTENT_GETATTR;
 extern struct req_format RQF_LDLM_INTENT_OPEN;
 extern struct req_format RQF_LDLM_INTENT_CREATE;
 extern struct req_format RQF_LDLM_INTENT_UNLINK;
+extern struct req_format RQF_LDLM_INTENT_GETXATTR;
 extern struct req_format RQF_LDLM_INTENT_QUOTA;
 extern struct req_format RQF_LDLM_CANCEL;
 extern struct req_format RQF_LDLM_CALLBACK;
@@ -279,6 +280,8 @@ extern struct req_msg_field RMF_LAYOUT_INTENT;
 extern struct req_msg_field RMF_MDT_MD;
 extern struct req_msg_field RMF_REC_REINT;
 extern struct req_msg_field RMF_EADATA;
+extern struct req_msg_field RMF_EAVALS;
+extern struct req_msg_field RMF_EAVALS_LENS;
 extern struct req_msg_field RMF_ACL;
 extern struct req_msg_field RMF_LOGCOOKIES;
 extern struct req_msg_field RMF_CAPA1;
diff --git a/drivers/staging/lustre/lustre/include/md_object.h b/drivers/staging/lustre/lustre/include/md_object.h
index daf93af..7b45b47 100644
--- a/drivers/staging/lustre/lustre/include/md_object.h
+++ b/drivers/staging/lustre/lustre/include/md_object.h
@@ -352,8 +352,8 @@ struct md_device_operations {
 	int (*mdo_root_get)(const struct lu_env *env, struct md_device *m,
 			    struct lu_fid *f);
 
-	int (*mdo_maxsize_get)(const struct lu_env *env, struct md_device *m,
-			       int *md_size, int *cookie_size);
+	int (*mdo_maxeasize_get)(const struct lu_env *env, struct md_device *m,
+				int *easize);
 
 	int (*mdo_statfs)(const struct lu_env *env, struct md_device *m,
 			  struct obd_statfs *sfs);
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 3247d1d..c3470ce 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -1022,6 +1022,7 @@ struct lu_context;
 #define IT_LAYOUT   (1 << 10)
 #define IT_QUOTA_DQACQ (1 << 11)
 #define IT_QUOTA_CONN  (1 << 12)
+#define IT_SETXATTR (1 << 13)
 
 static inline int it_to_lock_mode(struct lookup_intent *it)
 {
@@ -1031,6 +1032,10 @@ static inline int it_to_lock_mode(struct lookup_intent *it)
 	else if (it->it_op & (IT_READDIR | IT_GETATTR | IT_OPEN | IT_LOOKUP |
 			      IT_LAYOUT))
 		return LCK_CR;
+	else if (it->it_op &  IT_GETXATTR)
+		return LCK_PR;
+	else if (it->it_op &  IT_SETXATTR)
+		return LCK_PW;
 
 	LASSERTF(0, "Invalid it_op: %d\n", it->it_op);
 	return -EINVAL;
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 4d1f62c..36bdeab 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -466,6 +466,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
 #define OBD_FAIL_LOCK_STATE_WAIT_INTR	       0x1402
 #define OBD_FAIL_LOV_INIT			    0x1403
 #define OBD_FAIL_GLIMPSE_DELAY			    0x1404
+#define OBD_FAIL_LLITE_XATTR_ENOMEM		    0x1405
 
 #define OBD_FAIL_FID_INDIR	0x1501
 #define OBD_FAIL_FID_INLMA	0x1502
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index c4fa9f9..f72b887 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -145,6 +145,8 @@ char *ldlm_it2str(int it)
 		return "getxattr";
 	case IT_LAYOUT:
 		return "layout";
+	case IT_SETXATTR:
+		return "setxattr";
 	default:
 		CERROR("Unknown intent %d\n", it);
 		return "UNKNOWN";
diff --git a/drivers/staging/lustre/lustre/llite/Makefile b/drivers/staging/lustre/lustre/llite/Makefile
index f493e07..bb34b8b 100644
--- a/drivers/staging/lustre/lustre/llite/Makefile
+++ b/drivers/staging/lustre/lustre/llite/Makefile
@@ -2,7 +2,7 @@ obj-$(CONFIG_LUSTRE_FS) += lustre.o
 obj-$(CONFIG_LUSTRE_LLITE_LLOOP) += llite_lloop.o
 lustre-y := dcache.o dir.o file.o llite_close.o llite_lib.o llite_nfs.o \
 	    rw.o lproc_llite.o namei.o symlink.o llite_mmap.o \
-	    xattr.o remote_perm.o llite_rmtacl.o llite_capa.o \
+	    xattr.o xattr_cache.o remote_perm.o llite_rmtacl.o llite_capa.o \
 	    rw26.o super25.o statahead.o \
 	    ../lclient/glimpse.o ../lclient/lcommon_cl.o ../lclient/lcommon_misc.o \
 	    vvp_dev.o vvp_page.o vvp_lock.o vvp_io.o vvp_object.o
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 981d3fe..94a46b5 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2782,7 +2782,8 @@ int ll_have_md_lock(struct inode *inode, __u64 *bits,  ldlm_mode_t l_req_mode)
 }
 
 ldlm_mode_t ll_take_md_lock(struct inode *inode, __u64 bits,
-			    struct lustre_handle *lockh, __u64 flags)
+			    struct lustre_handle *lockh, __u64 flags,
+			    ldlm_mode_t mode)
 {
 	ldlm_policy_data_t policy = { .l_inodebits = {bits}};
 	struct lu_fid *fid;
@@ -2792,8 +2793,8 @@ ldlm_mode_t ll_take_md_lock(struct inode *inode, __u64 bits,
 	CDEBUG(D_INFO, "trying to match res "DFID"\n", PFID(fid));
 
 	rc = md_lock_match(ll_i2mdexp(inode), LDLM_FL_BLOCK_GRANTED|flags,
-			   fid, LDLM_IBITS, &policy,
-			   LCK_CR|LCK_CW|LCK_PR|LCK_PW, lockh);
+			   fid, LDLM_IBITS, &policy, mode, lockh);
+
 	return rc;
 }
 
@@ -3470,7 +3471,8 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 
 	/* mostly layout lock is caching on the local side, so try to match
 	 * it before grabbing layout lock mutex. */
-	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0);
+	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
+			       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
 	if (mode != 0) { /* hit cached lock */
 		rc = ll_layout_lock_set(&lockh, mode, inode, gen, false);
 		if (rc == 0)
@@ -3485,7 +3487,8 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 
 again:
 	/* try again. Maybe somebody else has done this. */
-	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0);
+	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
+			       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
 	if (mode != 0) { /* hit cached lock */
 		rc = ll_layout_lock_set(&lockh, mode, inode, gen, true);
 		if (rc == -EAGAIN)
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 06e83dd..7ded2e4 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -127,6 +127,8 @@ enum lli_flags {
 	LLIF_DATA_MODIFIED      = (1 << 6),
 	/* File is being restored */
 	LLIF_FILE_RESTORING	= (1 << 7),
+	/* Xattr cache is attached to the file */
+	LLIF_XATTR_CACHE	= (1 << 8),
 };
 
 struct ll_inode_info {
@@ -279,8 +281,27 @@ struct ll_inode_info {
 	struct mutex			lli_layout_mutex;
 	/* valid only inside LAYOUT ibits lock, protected by lli_layout_mutex */
 	__u32				lli_layout_gen;
+
+	struct rw_semaphore		lli_xattrs_list_rwsem;
+	struct mutex			lli_xattrs_enq_lock;
+	struct list_head		lli_xattrs; /* ll_xattr_entry->xe_list */
 };
 
+int ll_xattr_cache_destroy(struct inode *inode);
+
+int ll_xattr_cache_get(struct inode *inode,
+			const char *name,
+			char *buffer,
+			size_t size,
+			__u64 valid);
+
+int ll_xattr_cache_update(struct inode *inode,
+			const char *name,
+			const char *newval,
+			size_t size,
+			__u64 valid,
+			int flags);
+
 /*
  * Locking to guarantee consistency of non-atomic updates to long long i_size,
  * consistency between file size and KMS.
@@ -402,6 +423,7 @@ enum stats_track_type {
 #define LL_SBI_VERBOSE	0x10000 /* verbose mount/umount */
 #define LL_SBI_LAYOUT_LOCK    0x20000 /* layout lock support */
 #define LL_SBI_USER_FID2PATH  0x40000 /* allow fid2path by unprivileged users */
+#define LL_SBI_XATTR_CACHE    0x80000 /* support for xattr cache */
 
 #define LL_SBI_FLAGS {	\
 	"nolck",	\
@@ -409,6 +431,7 @@ enum stats_track_type {
 	"flock",	\
 	"xattr",	\
 	"acl",		\
+	"???",		\
 	"rmt_client",	\
 	"mds_capa",	\
 	"oss_capa",	\
@@ -421,7 +444,9 @@ enum stats_track_type {
 	"agl",		\
 	"verbose",	\
 	"layout",	\
-	"user_fid2path" }
+	"user_fid2path",\
+	"xattr",	\
+}
 
 /* default value for ll_sb_info->contention_time */
 #define SBI_DEFAULT_CONTENTION_SECONDS     60
@@ -461,7 +486,8 @@ struct ll_sb_info {
 	struct lu_fid	     ll_root_fid; /* root object fid */
 
 	int		       ll_flags;
-	int			  ll_umounting:1;
+	unsigned int		  ll_umounting:1,
+				  ll_xattr_cache_enabled:1;
 	struct list_head		ll_conn_chain; /* per-conn chain of SBs */
 	struct lustre_client_ocd  ll_lco;
 
@@ -732,7 +758,8 @@ extern int ll_inode_revalidate_it(struct dentry *, struct lookup_intent *,
 extern int ll_have_md_lock(struct inode *inode, __u64 *bits,
 			   ldlm_mode_t l_req_mode);
 extern ldlm_mode_t ll_take_md_lock(struct inode *inode, __u64 bits,
-				   struct lustre_handle *lockh, __u64 flags);
+				   struct lustre_handle *lockh, __u64 flags,
+				   ldlm_mode_t mode);
 int __ll_inode_revalidate_it(struct dentry *, struct lookup_intent *,
 			     __u64 bits);
 int ll_revalidate_nd(struct dentry *dentry, unsigned int flags);
@@ -1598,4 +1625,7 @@ int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf);
 int ll_layout_refresh(struct inode *inode, __u32 *gen);
 int ll_layout_restore(struct inode *inode);
 
+int ll_xattr_init(void);
+void ll_xattr_fini(void);
+
 #endif /* LLITE_INTERNAL_H */
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 478fc44..876083d 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -209,7 +209,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				  OBD_CONNECT_FULL20   | OBD_CONNECT_64BITHASH|
 				  OBD_CONNECT_EINPROGRESS |
 				  OBD_CONNECT_JOBSTATS | OBD_CONNECT_LVB_TYPE |
-				  OBD_CONNECT_LAYOUTLOCK | OBD_CONNECT_PINGLESS;
+				  OBD_CONNECT_LAYOUTLOCK | OBD_CONNECT_PINGLESS |
+				  OBD_CONNECT_MAX_EASIZE;
 
 	if (sbi->ll_flags & LL_SBI_SOM_PREVIEW)
 		data->ocd_connect_flags |= OBD_CONNECT_SOM;
@@ -383,6 +384,16 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 		sbi->ll_flags |= LL_SBI_LAYOUT_LOCK;
 	}
 
+	if (data->ocd_ibits_known & MDS_INODELOCK_XATTR) {
+		if (!(data->ocd_connect_flags & OBD_CONNECT_MAX_EASIZE)) {
+			LCONSOLE_INFO("%s: disabling xattr cache due to "
+				      "unknown maximum xattr size.\n", dt);
+		} else {
+			sbi->ll_flags |= LL_SBI_XATTR_CACHE;
+			sbi->ll_xattr_cache_enabled = 1;
+		}
+	}
+
 	obd = class_name2obd(dt);
 	if (!obd) {
 		CERROR("DT %s: not setup or attached\n", dt);
@@ -922,6 +933,9 @@ void ll_lli_init(struct ll_inode_info *lli)
 	lli->lli_layout_gen = LL_LAYOUT_GEN_NONE;
 	lli->lli_clob = NULL;
 
+	init_rwsem(&lli->lli_xattrs_list_rwsem);
+	mutex_init(&lli->lli_xattrs_enq_lock);
+
 	LASSERT(lli->lli_vfs_inode.i_mode != 0);
 	if (S_ISDIR(lli->lli_vfs_inode.i_mode)) {
 		mutex_init(&lli->lli_readdir_mutex);
@@ -1194,6 +1208,8 @@ void ll_clear_inode(struct inode *inode)
 		lli->lli_symlink_name = NULL;
 	}
 
+	ll_xattr_cache_destroy(inode);
+
 	if (sbi->ll_flags & LL_SBI_RMT_CLIENT) {
 		LASSERT(lli->lli_posix_acl == NULL);
 		if (lli->lli_remote_perms) {
@@ -1753,7 +1769,9 @@ void ll_update_inode(struct inode *inode, struct lustre_md *md)
 			 * lock on the client and set LLIF_MDS_SIZE_LOCK holding
 			 * it. */
 			mode = ll_take_md_lock(inode, MDS_INODELOCK_UPDATE,
-					       &lockh, LDLM_FL_CBPENDING);
+					       &lockh, LDLM_FL_CBPENDING,
+					       LCK_CR | LCK_CW |
+					       LCK_PR | LCK_PW);
 			if (mode) {
 				if (lli->lli_flags & (LLIF_DONE_WRITING |
 						      LLIF_EPOCH_PENDING |
diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c b/drivers/staging/lustre/lustre/llite/lproc_llite.c
index 4bf09c4..18a5998c 100644
--- a/drivers/staging/lustre/lustre/llite/lproc_llite.c
+++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c
@@ -723,6 +723,40 @@ static int ll_sbi_flags_seq_show(struct seq_file *m, void *v)
 }
 LPROC_SEQ_FOPS_RO(ll_sbi_flags);
 
+static int ll_xattr_cache_seq_show(struct seq_file *m, void *v)
+{
+	struct super_block *sb = m->private;
+	struct ll_sb_info *sbi = ll_s2sbi(sb);
+	int rc;
+
+	rc = seq_printf(m, "%u\n", sbi->ll_xattr_cache_enabled);
+
+	return rc;
+}
+
+static ssize_t ll_xattr_cache_seq_write(struct file *file, const char *buffer,
+					size_t count, loff_t *off)
+{
+	struct super_block *sb = ((struct seq_file *)file->private_data)->private;
+	struct ll_sb_info *sbi = ll_s2sbi(sb);
+	int val, rc;
+
+	rc = lprocfs_write_helper(buffer, count, &val);
+	if (rc)
+		return rc;
+
+	if (val != 0 && val != 1)
+		return -ERANGE;
+
+	if (val == 1 && !(sbi->ll_flags & LL_SBI_XATTR_CACHE))
+		return -ENOTSUPP;
+
+	sbi->ll_xattr_cache_enabled = val;
+
+	return count;
+}
+LPROC_SEQ_FOPS(ll_xattr_cache);
+
 static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
 	{ "uuid",	  &ll_sb_uuid_fops,	  0, 0 },
 	//{ "mntpt_path",   ll_rd_path,	     0, 0 },
@@ -751,6 +785,7 @@ static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
 	{ "lazystatfs",       &ll_lazystatfs_fops, 0 },
 	{ "max_easize",       &ll_maxea_size_fops, 0, 0 },
 	{ "sbi_flags",	      &ll_sbi_flags_fops, 0, 0 },
+	{ "xattr_cache",      &ll_xattr_cache_fops, 0, 0 },
 	{ 0 }
 };
 
@@ -802,6 +837,7 @@ struct llite_file_opcode {
 	{ LPROC_LL_ALLOC_INODE,    LPROCFS_TYPE_REGS, "alloc_inode" },
 	{ LPROC_LL_SETXATTR,       LPROCFS_TYPE_REGS, "setxattr" },
 	{ LPROC_LL_GETXATTR,       LPROCFS_TYPE_REGS, "getxattr" },
+	{ LPROC_LL_GETXATTR_HITS,  LPROCFS_TYPE_REGS, "getxattr_hits" },
 	{ LPROC_LL_LISTXATTR,      LPROCFS_TYPE_REGS, "listxattr" },
 	{ LPROC_LL_REMOVEXATTR,    LPROCFS_TYPE_REGS, "removexattr" },
 	{ LPROC_LL_INODE_PERM,     LPROCFS_TYPE_REGS, "inode_permission" },
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 0000530..dfcfbb8 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -222,6 +222,10 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 			break;
 
 		LASSERT(lock->l_flags & LDLM_FL_CANCELING);
+
+		if (bits & MDS_INODELOCK_XATTR)
+			ll_xattr_cache_destroy(inode);
+
 		/* For OPEN locks we differentiate between lock modes
 		 * LCK_CR, LCK_CW, LCK_PR - bug 22891 */
 		if (bits & (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE |
diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
index 0beaf4e..e21e1c7 100644
--- a/drivers/staging/lustre/lustre/llite/super25.c
+++ b/drivers/staging/lustre/lustre/llite/super25.c
@@ -187,11 +187,15 @@ static int __init init_lustre_lite(void)
 	if (rc == 0)
 		rc = vvp_global_init();
 
+	if (rc == 0)
+		rc = ll_xattr_init();
+
 	return rc;
 }
 
 static void __exit exit_lustre_lite(void)
 {
+	ll_xattr_fini();
 	vvp_global_fini();
 	del_timer(&ll_capa_timer);
 	ll_capa_thread_stop();
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index bcf86ba..1345b67 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -109,7 +109,7 @@ int ll_setxattr_common(struct inode *inode, const char *name,
 		       int flags, __u64 valid)
 {
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
-	struct ptlrpc_request *req;
+	struct ptlrpc_request *req = NULL;
 	int xattr_type, rc;
 	struct obd_capa *oc;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -183,11 +183,17 @@ int ll_setxattr_common(struct inode *inode, const char *name,
 		valid |= rce_ops2valid(rce->rce_ops);
 	}
 #endif
-	oc = ll_mdscapa_get(inode);
-	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
-			 valid, name, pv, size, 0, flags, ll_i2suppgid(inode),
-			 &req);
-	capa_put(oc);
+	if (sbi->ll_xattr_cache_enabled &&
+	    (rce == NULL || rce->rce_ops == RMT_LSETFACL)) {
+		rc = ll_xattr_cache_update(inode, name, pv, size, valid, flags);
+	} else {
+		oc = ll_mdscapa_get(inode);
+		rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
+				valid, name, pv, size, 0, flags,
+				ll_i2suppgid(inode), &req);
+		capa_put(oc);
+	}
+
 #ifdef CONFIG_FS_POSIX_ACL
 	if (new_value != NULL)
 		lustre_posix_acl_xattr_free(new_value, size);
@@ -352,48 +358,54 @@ int ll_getxattr_common(struct inode *inode, const char *name,
 #endif
 
 do_getxattr:
-	oc = ll_mdscapa_get(inode);
-	rc = md_getxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
-			 valid | (rce ? rce_ops2valid(rce->rce_ops) : 0),
-			 name, NULL, 0, size, 0, &req);
-	capa_put(oc);
-	if (rc) {
-		if (rc == -EOPNOTSUPP && xattr_type == XATTR_USER_T) {
-			LCONSOLE_INFO("Disabling user_xattr feature because "
-				      "it is not supported on the server\n");
-			sbi->ll_flags &= ~LL_SBI_USER_XATTR;
-		}
-		return rc;
-	}
+	if (sbi->ll_xattr_cache_enabled && (rce == NULL ||
+					    rce->rce_ops == RMT_LGETFACL ||
+					    rce->rce_ops == RMT_LSETFACL)) {
+		rc = ll_xattr_cache_get(inode, name, buffer, size, valid);
+		if (rc < 0)
+			GOTO(out_xattr, rc);
+	} else {
+		oc = ll_mdscapa_get(inode);
+		rc = md_getxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
+				valid | (rce ? rce_ops2valid(rce->rce_ops) : 0),
+				name, NULL, 0, size, 0, &req);
+		capa_put(oc);
 
-	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-	LASSERT(body);
+		if (rc < 0)
+			GOTO(out_xattr, rc);
 
-	/* only detect the xattr size */
-	if (size == 0)
-		GOTO(out, rc = body->eadatasize);
+		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
+		LASSERT(body);
 
-	if (size < body->eadatasize) {
-		CERROR("server bug: replied size %u > %u\n",
-		       body->eadatasize, (int)size);
-		GOTO(out, rc = -ERANGE);
-	}
+		/* only detect the xattr size */
+		if (size == 0)
+			GOTO(out, rc = body->eadatasize);
 
-	if (body->eadatasize == 0)
-		GOTO(out, rc = -ENODATA);
+		if (size < body->eadatasize) {
+			CERROR("server bug: replied size %u > %u\n",
+				body->eadatasize, (int)size);
+			GOTO(out, rc = -ERANGE);
+		}
+
+		if (body->eadatasize == 0)
+			GOTO(out, rc = -ENODATA);
 
-	/* do not need swab xattr data */
-	xdata = req_capsule_server_sized_get(&req->rq_pill, &RMF_EADATA,
-					     body->eadatasize);
-	if (!xdata)
-		GOTO(out, rc = -EFAULT);
+		/* do not need swab xattr data */
+		xdata = req_capsule_server_sized_get(&req->rq_pill, &RMF_EADATA,
+							body->eadatasize);
+		if (!xdata)
+			GOTO(out, rc = -EFAULT);
+
+		memcpy(buffer, xdata, body->eadatasize);
+		rc = body->eadatasize;
+	}
 
 #ifdef CONFIG_FS_POSIX_ACL
-	if (body->eadatasize >= 0 && rce && rce->rce_ops == RMT_LSETFACL) {
+	if (rce && rce->rce_ops == RMT_LSETFACL) {
 		ext_acl_xattr_header *acl;
 
-		acl = lustre_posix_acl_xattr_2ext((posix_acl_xattr_header *)xdata,
-						  body->eadatasize);
+		acl = lustre_posix_acl_xattr_2ext((posix_acl_xattr_header *)buffer,
+						rc);
 		if (IS_ERR(acl))
 			GOTO(out, rc = PTR_ERR(acl));
 
@@ -406,12 +418,12 @@ do_getxattr:
 	}
 #endif
 
-	if (body->eadatasize == 0) {
-		rc = -ENODATA;
-	} else {
-		LASSERT(buffer);
-		memcpy(buffer, xdata, body->eadatasize);
-		rc = body->eadatasize;
+out_xattr:
+	if (rc == -EOPNOTSUPP && xattr_type == XATTR_USER_T) {
+		LCONSOLE_INFO("%s: disabling user_xattr feature because "
+				"it is not supported on the server: rc = %d\n",
+				ll_get_fsname(inode->i_sb, NULL, 0), rc);
+		sbi->ll_flags &= ~LL_SBI_USER_XATTR;
 	}
 out:
 	ptlrpc_req_finished(req);
diff --git a/drivers/staging/lustre/lustre/llite/xattr_cache.c b/drivers/staging/lustre/lustre/llite/xattr_cache.c
new file mode 100644
index 0000000..fd303cb
--- /dev/null
+++ b/drivers/staging/lustre/lustre/llite/xattr_cache.c
@@ -0,0 +1,640 @@
+/* GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see http://www.gnu.org/licenses
+ *
+ * Please  visit http://www.xyratex.com/contact if you need additional
+ * information or have any questions.
+ *
+ * GPL HEADER END
+ */
+
+/*
+ * Copyright 2012 Xyratex Technology Limited
+ *
+ * Author: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
+ *
+ */
+
+#define DEBUG_SUBSYSTEM S_LLITE
+
+#include <linux/fs.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <obd_support.h>
+#include <lustre_lite.h>
+#include <lustre_dlm.h>
+#include <lustre_ver.h>
+#include "llite_internal.h"
+
+/* If we ever have hundreds of extended attributes, we might want to consider
+ * using a hash or a tree structure instead of list for faster lookups.
+ */
+struct ll_xattr_entry {
+	struct list_head	xe_list;    /* protected with
+					     * lli_xattrs_list_rwsem */
+	char			*xe_name;   /* xattr name, \0-terminated */
+	char			*xe_value;  /* xattr value */
+	unsigned		xe_namelen; /* strlen(xe_name) + 1 */
+	unsigned		xe_vallen;  /* xattr value length */
+};
+
+static struct kmem_cache *xattr_kmem;
+static struct lu_kmem_descr xattr_caches[] = {
+	{
+		.ckd_cache = &xattr_kmem,
+		.ckd_name  = "xattr_kmem",
+		.ckd_size  = sizeof(struct ll_xattr_entry)
+	},
+	{
+		.ckd_cache = NULL
+	}
+};
+
+int ll_xattr_init(void)
+{
+	return lu_kmem_init(xattr_caches);
+}
+
+void ll_xattr_fini(void)
+{
+	lu_kmem_fini(xattr_caches);
+}
+
+/**
+ * Initializes xattr cache for an inode.
+ *
+ * This initializes the xattr list and marks cache presence.
+ */
+static void ll_xattr_cache_init(struct ll_inode_info *lli)
+{
+
+
+	LASSERT(lli != NULL);
+
+	INIT_LIST_HEAD(&lli->lli_xattrs);
+	lli->lli_flags |= LLIF_XATTR_CACHE;
+}
+
+/**
+ *  This looks for a specific extended attribute.
+ *
+ *  Find in @cache and return @xattr_name attribute in @xattr,
+ *  for the NULL @xattr_name return the first cached @xattr.
+ *
+ *  \retval 0        success
+ *  \retval -ENODATA if not found
+ */
+static int ll_xattr_cache_find(struct list_head *cache,
+			       const char *xattr_name,
+			       struct ll_xattr_entry **xattr)
+{
+	struct ll_xattr_entry *entry;
+
+
+
+	list_for_each_entry(entry, cache, xe_list) {
+		/* xattr_name == NULL means look for any entry */
+		if (xattr_name == NULL ||
+		    strcmp(xattr_name, entry->xe_name) == 0) {
+			*xattr = entry;
+			CDEBUG(D_CACHE, "find: [%s]=%.*s\n",
+			       entry->xe_name, entry->xe_vallen,
+			       entry->xe_value);
+			return 0;
+		}
+	}
+
+	return -ENODATA;
+}
+
+/**
+ * This adds or updates an xattr.
+ *
+ * Add @xattr_name attr with @xattr_val value and @xattr_val_len length,
+ * if the attribute already exists, then update its value.
+ *
+ * \retval 0       success
+ * \retval -ENOMEM if no memory could be allocated for the cached attr
+ */
+static int ll_xattr_cache_add(struct list_head *cache,
+			      const char *xattr_name,
+			      const char *xattr_val,
+			      unsigned xattr_val_len)
+{
+	struct ll_xattr_entry *xattr;
+
+
+
+	if (ll_xattr_cache_find(cache, xattr_name, &xattr) == 0) {
+		/* Found a cached EA, update it */
+
+		if (xattr_val_len != xattr->xe_vallen) {
+			char *val;
+			OBD_ALLOC(val, xattr_val_len);
+			if (val == NULL) {
+				CDEBUG(D_CACHE, "failed to allocate %u bytes "
+						"for xattr %s update\n",
+						xattr_val_len,
+						xattr_name);
+				return -ENOMEM;
+			}
+			OBD_FREE(xattr->xe_value, xattr->xe_vallen);
+			xattr->xe_value = val;
+			xattr->xe_vallen = xattr_val_len;
+		}
+		memcpy(xattr->xe_value, xattr_val, xattr_val_len);
+
+		CDEBUG(D_CACHE, "update: [%s]=%.*s\n", xattr_name,
+			xattr_val_len, xattr_val);
+
+		return 0;
+	}
+
+	OBD_SLAB_ALLOC_PTR_GFP(xattr, xattr_kmem, __GFP_IO);
+	if (xattr == NULL) {
+		CDEBUG(D_CACHE, "failed to allocate xattr\n");
+		return -ENOMEM;
+	}
+
+	xattr->xe_namelen = strlen(xattr_name) + 1;
+
+	OBD_ALLOC(xattr->xe_name, xattr->xe_namelen);
+	if (!xattr->xe_name) {
+		CDEBUG(D_CACHE, "failed to alloc xattr name %u\n",
+		       xattr->xe_namelen);
+		goto err_name;
+	}
+	OBD_ALLOC(xattr->xe_value, xattr_val_len);
+	if (!xattr->xe_value) {
+		CDEBUG(D_CACHE, "failed to alloc xattr value %d\n",
+		       xattr_val_len);
+		goto err_value;
+	}
+
+	memcpy(xattr->xe_name, xattr_name, xattr->xe_namelen);
+	memcpy(xattr->xe_value, xattr_val, xattr_val_len);
+	xattr->xe_vallen = xattr_val_len;
+	list_add(&xattr->xe_list, cache);
+
+	CDEBUG(D_CACHE, "set: [%s]=%.*s\n", xattr_name,
+		xattr_val_len, xattr_val);
+
+	return 0;
+err_value:
+	OBD_FREE(xattr->xe_name, xattr->xe_namelen);
+err_name:
+	OBD_SLAB_FREE_PTR(xattr, xattr_kmem);
+
+	return -ENOMEM;
+}
+
+/**
+ * This removes an extended attribute from cache.
+ *
+ * Remove @xattr_name attribute from @cache.
+ *
+ * \retval 0        success
+ * \retval -ENODATA if @xattr_name is not cached
+ */
+static int ll_xattr_cache_del(struct list_head *cache,
+			      const char *xattr_name)
+{
+	struct ll_xattr_entry *xattr;
+
+
+
+	CDEBUG(D_CACHE, "del xattr: %s\n", xattr_name);
+
+	if (ll_xattr_cache_find(cache, xattr_name, &xattr) == 0) {
+		list_del(&xattr->xe_list);
+		OBD_FREE(xattr->xe_name, xattr->xe_namelen);
+		OBD_FREE(xattr->xe_value, xattr->xe_vallen);
+		OBD_SLAB_FREE_PTR(xattr, xattr_kmem);
+
+		return 0;
+	}
+
+	return -ENODATA;
+}
+
+/**
+ * This iterates cached extended attributes.
+ *
+ * Walk over cached attributes in @cache and
+ * fill in @xld_buffer or only calculate buffer
+ * size if @xld_buffer is NULL.
+ *
+ * \retval >= 0     buffer list size
+ * \retval -ENODATA if the list cannot fit @xld_size buffer
+ */
+static int ll_xattr_cache_list(struct list_head *cache,
+			       char *xld_buffer,
+			       int xld_size)
+{
+	struct ll_xattr_entry *xattr, *tmp;
+	int xld_tail = 0;
+
+
+
+	list_for_each_entry_safe(xattr, tmp, cache, xe_list) {
+		CDEBUG(D_CACHE, "list: buffer=%p[%d] name=%s\n",
+			xld_buffer, xld_tail, xattr->xe_name);
+
+		if (xld_buffer) {
+			xld_size -= xattr->xe_namelen;
+			if (xld_size < 0)
+				break;
+			memcpy(&xld_buffer[xld_tail],
+			       xattr->xe_name, xattr->xe_namelen);
+		}
+		xld_tail += xattr->xe_namelen;
+	}
+
+	if (xld_size < 0)
+		return -ERANGE;
+
+	return xld_tail;
+}
+
+/**
+ * Check if the xattr cache is initialized (filled).
+ *
+ * \retval 0 @cache is not initialized
+ * \retval 1 @cache is initialized
+ */
+int ll_xattr_cache_valid(struct ll_inode_info *lli)
+{
+	return !!(lli->lli_flags & LLIF_XATTR_CACHE);
+}
+
+/**
+ * This finalizes the xattr cache.
+ *
+ * Free all xattr memory. @lli is the inode info pointer.
+ *
+ * \retval 0 no error occured
+ */
+static int ll_xattr_cache_destroy_locked(struct ll_inode_info *lli)
+{
+
+
+	if (!ll_xattr_cache_valid(lli))
+		return 0;
+
+	while (ll_xattr_cache_del(&lli->lli_xattrs, NULL) == 0)
+		/* empty loop */ ;
+	lli->lli_flags &= ~LLIF_XATTR_CACHE;
+
+	return 0;
+}
+
+int ll_xattr_cache_destroy(struct inode *inode)
+{
+	struct ll_inode_info *lli = ll_i2info(inode);
+	int rc;
+
+
+
+	down_write(&lli->lli_xattrs_list_rwsem);
+	rc = ll_xattr_cache_destroy_locked(lli);
+	up_write(&lli->lli_xattrs_list_rwsem);
+
+	return rc;
+}
+
+/**
+ * Match or enqueue a PR or PW LDLM lock.
+ *
+ * Find or request an LDLM lock with xattr data.
+ * Since LDLM does not provide API for atomic match_or_enqueue,
+ * the function handles it with a separate enq lock.
+ * If successful, the function exits with the list lock held.
+ *
+ * \retval 0       no error occured
+ * \retval -ENOMEM not enough memory
+ */
+static int ll_xattr_find_get_lock(struct inode *inode,
+				  struct lookup_intent *oit,
+				  struct ptlrpc_request **req)
+{
+	ldlm_mode_t mode;
+	struct lustre_handle lockh = { 0 };
+	struct md_op_data *op_data;
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct ldlm_enqueue_info einfo = { .ei_type = LDLM_IBITS,
+					   .ei_mode = it_to_lock_mode(oit),
+					   .ei_cb_bl = ll_md_blocking_ast,
+					   .ei_cb_cp = ldlm_completion_ast };
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct obd_export *exp = sbi->ll_md_exp;
+	int rc;
+
+
+
+	mutex_lock(&lli->lli_xattrs_enq_lock);
+	/* Try matching first. */
+	mode = ll_take_md_lock(inode, MDS_INODELOCK_XATTR, &lockh, 0,
+			       oit->it_op == IT_SETXATTR ? LCK_PW :
+							   (LCK_PR | LCK_PW));
+	if (mode != 0) {
+		/* fake oit in mdc_revalidate_lock() manner */
+		oit->d.lustre.it_lock_handle = lockh.cookie;
+		oit->d.lustre.it_lock_mode = mode;
+		goto out;
+	}
+
+	/* Enqueue if the lock isn't cached locally. */
+	op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0,
+				     LUSTRE_OPC_ANY, NULL);
+	if (IS_ERR(op_data)) {
+		mutex_unlock(&lli->lli_xattrs_enq_lock);
+		return PTR_ERR(op_data);
+	}
+
+	op_data->op_valid = OBD_MD_FLXATTR | OBD_MD_FLXATTRLS |
+			    OBD_MD_FLXATTRLOCKED;
+#ifdef CONFIG_FS_POSIX_ACL
+	/* If working with ACLs, we would like to cache local ACLs */
+	if (sbi->ll_flags & LL_SBI_RMT_CLIENT)
+		op_data->op_valid |= OBD_MD_FLRMTLGETFACL;
+#endif
+
+	rc = md_enqueue(exp, &einfo, oit, op_data, &lockh, NULL, 0, NULL, 0);
+	ll_finish_md_op_data(op_data);
+
+	if (rc < 0) {
+		CDEBUG(D_CACHE, "md_intent_lock failed with %d for fid "DFID"\n",
+		       rc, PFID(ll_inode2fid(inode)));
+		mutex_unlock(&lli->lli_xattrs_enq_lock);
+		return rc;
+	}
+
+	*req = (struct ptlrpc_request *)oit->d.lustre.it_data;
+out:
+	down_write(&lli->lli_xattrs_list_rwsem);
+	mutex_unlock(&lli->lli_xattrs_enq_lock);
+
+	return 0;
+}
+
+/**
+ * Refill the xattr cache.
+ *
+ * Fetch and cache the whole of xattrs for @inode, acquiring
+ * a read or a write xattr lock depending on operation in @oit.
+ * Intent is dropped on exit unless the operation is setxattr.
+ *
+ * \retval 0       no error occured
+ * \retval -EPROTO network protocol error
+ * \retval -ENOMEM not enough memory for the cache
+ */
+static int ll_xattr_cache_refill(struct inode *inode, struct lookup_intent *oit)
+{
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct ptlrpc_request *req = NULL;
+	const char *xdata, *xval, *xtail, *xvtail;
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct mdt_body *body;
+	__u32 *xsizes;
+	int rc = 0, i;
+
+
+
+	rc = ll_xattr_find_get_lock(inode, oit, &req);
+	if (rc)
+		GOTO(out_no_unlock, rc);
+
+	/* Do we have the data at this point? */
+	if (ll_xattr_cache_valid(lli)) {
+		ll_stats_ops_tally(sbi, LPROC_LL_GETXATTR_HITS, 1);
+		GOTO(out_maybe_drop, rc = 0);
+	}
+
+	/* Matched but no cache? Cancelled on error by a parallel refill. */
+	if (unlikely(req == NULL)) {
+		CDEBUG(D_CACHE, "cancelled by a parallel getxattr\n");
+		GOTO(out_maybe_drop, rc = -EIO);
+	}
+
+	if (oit->d.lustre.it_status < 0) {
+		CDEBUG(D_CACHE, "getxattr intent returned %d for fid "DFID"\n",
+		       oit->d.lustre.it_status, PFID(ll_inode2fid(inode)));
+		GOTO(out_destroy, rc = oit->d.lustre.it_status);
+	}
+
+	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
+	if (body == NULL) {
+		CERROR("no MDT BODY in the refill xattr reply\n");
+		GOTO(out_destroy, rc = -EPROTO);
+	}
+	/* do not need swab xattr data */
+	xdata = req_capsule_server_sized_get(&req->rq_pill, &RMF_EADATA,
+						body->eadatasize);
+	xval = req_capsule_server_sized_get(&req->rq_pill, &RMF_EAVALS,
+						body->aclsize);
+	xsizes = req_capsule_server_sized_get(&req->rq_pill, &RMF_EAVALS_LENS,
+					      body->max_mdsize * sizeof(__u32));
+	if (xdata == NULL || xval == NULL || xsizes == NULL) {
+		CERROR("wrong setxattr reply\n");
+		GOTO(out_destroy, rc = -EPROTO);
+	}
+
+	xtail = xdata + body->eadatasize;
+	xvtail = xval + body->aclsize;
+
+	CDEBUG(D_CACHE, "caching: xdata=%p xtail=%p\n", xdata, xtail);
+
+	ll_xattr_cache_init(lli);
+
+	for (i = 0; i < body->max_mdsize; i++) {
+		CDEBUG(D_CACHE, "caching [%s]=%.*s\n", xdata, *xsizes, xval);
+		/* Perform consistency checks: attr names and vals in pill */
+		if (memchr(xdata, 0, xtail - xdata) == NULL) {
+			CERROR("xattr protocol violation (names are broken)\n");
+			rc = -EPROTO;
+		} else if (xval + *xsizes > xvtail) {
+			CERROR("xattr protocol violation (vals are broken)\n");
+			rc = -EPROTO;
+		} else if (OBD_FAIL_CHECK(OBD_FAIL_LLITE_XATTR_ENOMEM)) {
+			rc = -ENOMEM;
+		} else {
+			rc = ll_xattr_cache_add(&lli->lli_xattrs, xdata, xval,
+						*xsizes);
+		}
+		if (rc < 0) {
+			ll_xattr_cache_destroy_locked(lli);
+			GOTO(out_destroy, rc);
+		}
+		xdata += strlen(xdata) + 1;
+		xval  += *xsizes;
+		xsizes++;
+	}
+
+	if (xdata != xtail || xval != xvtail)
+		CERROR("a hole in xattr data\n");
+
+	ll_set_lock_data(sbi->ll_md_exp, inode, oit, NULL);
+
+	GOTO(out_maybe_drop, rc);
+out_maybe_drop:
+	/* drop lock on error or getxattr */
+	if (rc != 0 || oit->it_op != IT_SETXATTR)
+		ll_intent_drop_lock(oit);
+
+	if (rc != 0)
+		up_write(&lli->lli_xattrs_list_rwsem);
+out_no_unlock:
+	ptlrpc_req_finished(req);
+
+	return rc;
+
+out_destroy:
+	up_write(&lli->lli_xattrs_list_rwsem);
+
+	ldlm_lock_decref_and_cancel((struct lustre_handle *)
+					&oit->d.lustre.it_lock_handle,
+					oit->d.lustre.it_lock_mode);
+
+	goto out_no_unlock;
+}
+
+/**
+ * Get an xattr value or list xattrs using the write-through cache.
+ *
+ * Get the xattr value (@valid has OBD_MD_FLXATTR set) of @name or
+ * list xattr names (@valid has OBD_MD_FLXATTRLS set) for @inode.
+ * The resulting value/list is stored in @buffer if the former
+ * is not larger than @size.
+ *
+ * \retval 0        no error occured
+ * \retval -EPROTO  network protocol error
+ * \retval -ENOMEM  not enough memory for the cache
+ * \retval -ERANGE  the buffer is not large enough
+ * \retval -ENODATA no such attr or the list is empty
+ */
+int ll_xattr_cache_get(struct inode *inode,
+			const char *name,
+			char *buffer,
+			size_t size,
+			__u64 valid)
+{
+	struct lookup_intent oit = { .it_op = IT_GETXATTR };
+	struct ll_inode_info *lli = ll_i2info(inode);
+	int rc = 0;
+
+
+
+	LASSERT(!!(valid & OBD_MD_FLXATTR) ^ !!(valid & OBD_MD_FLXATTRLS));
+
+	down_read(&lli->lli_xattrs_list_rwsem);
+	if (!ll_xattr_cache_valid(lli)) {
+		up_read(&lli->lli_xattrs_list_rwsem);
+		rc = ll_xattr_cache_refill(inode, &oit);
+		if (rc)
+			return rc;
+		downgrade_write(&lli->lli_xattrs_list_rwsem);
+	} else {
+		ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_GETXATTR_HITS, 1);
+	}
+
+	if (valid & OBD_MD_FLXATTR) {
+		struct ll_xattr_entry *xattr;
+
+		rc = ll_xattr_cache_find(&lli->lli_xattrs, name, &xattr);
+		if (rc == 0) {
+			rc = xattr->xe_vallen;
+			/* zero size means we are only requested size in rc */
+			if (size != 0) {
+				if (size >= xattr->xe_vallen)
+					memcpy(buffer, xattr->xe_value,
+						xattr->xe_vallen);
+				else
+					rc = -ERANGE;
+			}
+		}
+	} else if (valid & OBD_MD_FLXATTRLS) {
+		rc = ll_xattr_cache_list(&lli->lli_xattrs,
+					 size ? buffer : NULL, size);
+	}
+
+	GOTO(out, rc);
+out:
+	up_read(&lli->lli_xattrs_list_rwsem);
+
+	return rc;
+}
+
+
+/**
+ * Set/update an xattr value or remove xattr using the write-through cache.
+ *
+ * Set/update the xattr value (if @valid has OBD_MD_FLXATTR) of @name to @newval
+ * or
+ * remove the xattr @name (@valid has OBD_MD_FLXATTRRM set) from @inode.
+ * @flags is either XATTR_CREATE or XATTR_REPLACE as defined by setxattr(2)
+ *
+ * \retval 0        no error occured
+ * \retval -EPROTO  network protocol error
+ * \retval -ENOMEM  not enough memory for the cache
+ * \retval -ERANGE  the buffer is not large enough
+ * \retval -ENODATA no such attr (in the removal case)
+ */
+int ll_xattr_cache_update(struct inode *inode,
+			const char *name,
+			const char *newval,
+			size_t size,
+			__u64 valid,
+			int flags)
+{
+	struct lookup_intent oit = { .it_op = IT_SETXATTR };
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct ptlrpc_request *req = NULL;
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct obd_capa *oc;
+	int rc;
+
+
+
+	LASSERT(!!(valid & OBD_MD_FLXATTR) ^ !!(valid & OBD_MD_FLXATTRRM));
+
+	rc = ll_xattr_cache_refill(inode, &oit);
+	if (rc)
+		return rc;
+
+	oc = ll_mdscapa_get(inode);
+	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
+			valid | OBD_MD_FLXATTRLOCKED, name, newval,
+			size, 0, flags, ll_i2suppgid(inode), &req);
+	capa_put(oc);
+
+	if (rc) {
+		ll_intent_drop_lock(&oit);
+		GOTO(out, rc);
+	}
+
+	if (valid & OBD_MD_FLXATTR)
+		rc = ll_xattr_cache_add(&lli->lli_xattrs, name, newval, size);
+	else if (valid & OBD_MD_FLXATTRRM)
+		rc = ll_xattr_cache_del(&lli->lli_xattrs, name);
+
+	ll_intent_drop_lock(&oit);
+	GOTO(out, rc);
+out:
+	up_write(&lli->lli_xattrs_list_rwsem);
+	ptlrpc_req_finished(req);
+
+	return rc;
+}
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_internal.h b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
index b995af6..5069829 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_internal.h
+++ b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
@@ -72,6 +72,7 @@ void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 		   __u32 mode, __u64 rdev, __u64 flags, const void *data,
 		   int datalen);
 void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
+void mdc_getxattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
 void mdc_link_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
 void mdc_rename_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 		     const char *old, int oldlen, const char *new, int newlen);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 8bc9007..f285cbe 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -360,6 +360,62 @@ static struct ptlrpc_request *mdc_intent_open_pack(struct obd_export *exp,
 	return req;
 }
 
+static struct ptlrpc_request *
+mdc_intent_getxattr_pack(struct obd_export *exp,
+			 struct lookup_intent *it,
+			 struct md_op_data *op_data)
+{
+	struct ptlrpc_request	*req;
+	struct ldlm_intent	*lit;
+	int			rc, count = 0, maxdata;
+	LIST_HEAD(cancels);
+
+
+
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp),
+					&RQF_LDLM_INTENT_GETXATTR);
+	if (req == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	mdc_set_capa_size(req, &RMF_CAPA1, op_data->op_capa1);
+
+	if (it->it_op == IT_SETXATTR)
+		/* If we want to upgrade to LCK_PW, let's cancel LCK_PR
+		 * locks now. This avoids unnecessary ASTs. */
+		count = mdc_resource_get_unused(exp, &op_data->op_fid1,
+						&cancels, LCK_PW,
+						MDS_INODELOCK_XATTR);
+
+	rc = ldlm_prep_enqueue_req(exp, req, &cancels, count);
+	if (rc) {
+		ptlrpc_request_free(req);
+		return ERR_PTR(rc);
+	}
+
+	/* pack the intent */
+	lit = req_capsule_client_get(&req->rq_pill, &RMF_LDLM_INTENT);
+	lit->opc = IT_GETXATTR;
+
+	maxdata = class_exp2cliimp(exp)->imp_connect_data.ocd_max_easize;
+
+	/* pack the intended request */
+	mdc_pack_body(req, &op_data->op_fid1, op_data->op_capa1,
+			op_data->op_valid, maxdata, -1, 0);
+
+	req_capsule_set_size(&req->rq_pill, &RMF_EADATA,
+				RCL_SERVER, maxdata);
+
+	req_capsule_set_size(&req->rq_pill, &RMF_EAVALS,
+				RCL_SERVER, maxdata);
+
+	req_capsule_set_size(&req->rq_pill, &RMF_EAVALS_LENS,
+				RCL_SERVER, maxdata);
+
+	ptlrpc_request_set_replen(req);
+
+	return req;
+}
+
 static struct ptlrpc_request *mdc_intent_unlink_pack(struct obd_export *exp,
 						     struct lookup_intent *it,
 						     struct md_op_data *op_data)
@@ -735,6 +791,8 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 			    { .l_inodebits = { MDS_INODELOCK_UPDATE } };
 	static const ldlm_policy_data_t layout_policy =
 			    { .l_inodebits = { MDS_INODELOCK_LAYOUT } };
+	static const ldlm_policy_data_t getxattr_policy = {
+			      .l_inodebits = { MDS_INODELOCK_XATTR } };
 	ldlm_policy_data_t const *policy = &lookup_policy;
 	int		    generation, resends = 0;
 	struct ldlm_reply     *lockrep;
@@ -751,6 +809,8 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 			policy = &update_policy;
 		else if (it->it_op & IT_LAYOUT)
 			policy = &layout_policy;
+		else if (it->it_op & (IT_GETXATTR | IT_SETXATTR))
+			policy = &getxattr_policy;
 	}
 
 	LASSERT(reqp == NULL);
@@ -781,9 +841,10 @@ resend:
 	} else if (it->it_op & IT_LAYOUT) {
 		if (!imp_connect_lvb_type(class_exp2cliimp(exp)))
 			return -EOPNOTSUPP;
-
 		req = mdc_intent_layout_pack(exp, it, op_data);
 		lvb_type = LVB_T_LAYOUT;
+	} else if (it->it_op & (IT_GETXATTR | IT_SETXATTR)) {
+		req = mdc_intent_getxattr_pack(exp, it, op_data);
 	} else {
 		LBUG();
 		return -EINVAL;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 1db01a4..d7c427f 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -462,6 +462,25 @@ static const struct req_msg_field *ldlm_intent_unlink_client[] = {
 	&RMF_NAME
 };
 
+static const struct req_msg_field *ldlm_intent_getxattr_client[] = {
+	&RMF_PTLRPC_BODY,
+	&RMF_DLM_REQ,
+	&RMF_LDLM_INTENT,
+	&RMF_MDT_BODY,
+	&RMF_CAPA1,
+};
+
+static const struct req_msg_field *ldlm_intent_getxattr_server[] = {
+	&RMF_PTLRPC_BODY,
+	&RMF_DLM_REP,
+	&RMF_MDT_BODY,
+	&RMF_MDT_MD,
+	&RMF_ACL, /* for req_capsule_extend/mdt_intent_policy */
+	&RMF_EADATA,
+	&RMF_EAVALS,
+	&RMF_EAVALS_LENS
+};
+
 static const struct req_msg_field *mds_getxattr_client[] = {
 	&RMF_PTLRPC_BODY,
 	&RMF_MDT_BODY,
@@ -739,6 +758,7 @@ static struct req_format *req_formats[] = {
 	&RQF_LDLM_INTENT_OPEN,
 	&RQF_LDLM_INTENT_CREATE,
 	&RQF_LDLM_INTENT_UNLINK,
+	&RQF_LDLM_INTENT_GETXATTR,
 	&RQF_LDLM_INTENT_QUOTA,
 	&RQF_QUOTA_DQACQ,
 	&RQF_LOG_CANCEL,
@@ -1013,6 +1033,9 @@ struct req_msg_field RMF_EADATA = DEFINE_MSGF("eadata", 0, -1,
 						    NULL, NULL);
 EXPORT_SYMBOL(RMF_EADATA);
 
+struct req_msg_field RMF_EAVALS = DEFINE_MSGF("eavals", 0, -1, NULL, NULL);
+EXPORT_SYMBOL(RMF_EAVALS);
+
 struct req_msg_field RMF_ACL =
 	DEFINE_MSGF("acl", RMF_F_NO_SIZE_CHECK,
 		    LUSTRE_POSIX_ACL_MAX_SIZE, NULL, NULL);
@@ -1064,6 +1087,11 @@ struct req_msg_field RMF_RCS =
 		    lustre_swab_generic_32s, dump_rcs);
 EXPORT_SYMBOL(RMF_RCS);
 
+struct req_msg_field RMF_EAVALS_LENS =
+	DEFINE_MSGF("eavals_lens", RMF_F_STRUCT_ARRAY, sizeof(__u32),
+		lustre_swab_generic_32s, NULL);
+EXPORT_SYMBOL(RMF_EAVALS_LENS);
+
 struct req_msg_field RMF_OBD_ID =
 	DEFINE_MSGF("obd_id", 0,
 		    sizeof(obd_id), lustre_swab_ost_last_id, NULL);
@@ -1421,6 +1449,12 @@ struct req_format RQF_LDLM_INTENT_UNLINK =
 			ldlm_intent_unlink_client, ldlm_intent_server);
 EXPORT_SYMBOL(RQF_LDLM_INTENT_UNLINK);
 
+struct req_format RQF_LDLM_INTENT_GETXATTR =
+	DEFINE_REQ_FMT0("LDLM_INTENT_GETXATTR",
+			ldlm_intent_getxattr_client,
+			ldlm_intent_getxattr_server);
+EXPORT_SYMBOL(RQF_LDLM_INTENT_GETXATTR);
+
 struct req_format RQF_MDS_CLOSE =
 	DEFINE_REQ_FMT0("MDS_CLOSE",
 			mdt_close_client, mds_last_unlink_server);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 30/40] staging/lustre/xattr: separate ACL and XATTR caches
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (28 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 29/40] staging/lustre/llite: extended attribute cache Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 31/40] staging/lustre/lnet: Add LNet Router Priority parameter Peng Tao
                   ` (10 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Nathaniel Clark, Andrew Perepechko, Peng Tao,
	Andreas Dilger

From: Nathaniel Clark <nathaniel.l.clark@intel.com>

This patch separates ACL and XATTR caches, so that
when updating an ACL only LOOKUP lock is needed and
when updating another XATTR only XATTR lock is needed.

This patch also reverts XATTR cache support for setxattr
because client performing REINT under even PR lock
can deadlock if an active server operation (like unlink)
attempts to cancel all locks, and setxattr has to wait
for it (MDC max-in-flight is 1).

This patch also drops mot_xattr_sem because there is
no use case for FLXATTRLS with locking.

This patch also takes into account that osd_xattr_list
may be changing lu_buf, so it does not assume that
the buf is unchanged after the call.

This patch disables the r/o cache if the data is
unreasonably large (larger than maximum single EA
size).

Lustre-change: http://review.whamcloud.com/7208
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3669
Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    1 -
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    2 -
 .../staging/lustre/lustre/llite/llite_internal.h   |    7 --
 drivers/staging/lustre/lustre/llite/xattr.c        |   39 ++++---
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  119 ++++----------------
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    2 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |    9 +-
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |    2 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   30 ++++-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    3 +-
 10 files changed, 74 insertions(+), 140 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 54e1599..bc4eaf2 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1760,7 +1760,6 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 			  OBD_MD_FLGID   | OBD_MD_FLFLAGS | OBD_MD_FLNLINK | \
 			  OBD_MD_FLGENER | OBD_MD_FLRDEV  | OBD_MD_FLGROUP)
 
-#define OBD_MD_FLXATTRLOCKED OBD_MD_FLGETATTRLOCK
 #define OBD_MD_FLXATTRALL (OBD_MD_FLXATTR | OBD_MD_FLXATTRLS)
 
 /* don't forget obdo_fid which is way down at the bottom so it can
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index f72b887..c4fa9f9 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -145,8 +145,6 @@ char *ldlm_it2str(int it)
 		return "getxattr";
 	case IT_LAYOUT:
 		return "layout";
-	case IT_SETXATTR:
-		return "setxattr";
 	default:
 		CERROR("Unknown intent %d\n", it);
 		return "UNKNOWN";
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 7ded2e4..7f0197c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -295,13 +295,6 @@ int ll_xattr_cache_get(struct inode *inode,
 			size_t size,
 			__u64 valid);
 
-int ll_xattr_cache_update(struct inode *inode,
-			const char *name,
-			const char *newval,
-			size_t size,
-			__u64 valid,
-			int flags);
-
 /*
  * Locking to guarantee consistency of non-atomic updates to long long i_size,
  * consistency between file size and KMS.
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index 1345b67..8c96854 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -183,17 +183,11 @@ int ll_setxattr_common(struct inode *inode, const char *name,
 		valid |= rce_ops2valid(rce->rce_ops);
 	}
 #endif
-	if (sbi->ll_xattr_cache_enabled &&
-	    (rce == NULL || rce->rce_ops == RMT_LSETFACL)) {
-		rc = ll_xattr_cache_update(inode, name, pv, size, valid, flags);
-	} else {
-		oc = ll_mdscapa_get(inode);
-		rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
-				valid, name, pv, size, 0, flags,
-				ll_i2suppgid(inode), &req);
-		capa_put(oc);
-	}
-
+	oc = ll_mdscapa_get(inode);
+	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
+			valid, name, pv, size, 0, flags,
+			ll_i2suppgid(inode), &req);
+	capa_put(oc);
 #ifdef CONFIG_FS_POSIX_ACL
 	if (new_value != NULL)
 		lustre_posix_acl_xattr_free(new_value, size);
@@ -292,6 +286,7 @@ int ll_getxattr_common(struct inode *inode, const char *name,
 	void *xdata;
 	struct obd_capa *oc;
 	struct rmtacl_ctl_entry *rce = NULL;
+	struct ll_inode_info *lli = ll_i2info(inode);
 
 	CDEBUG(D_VFSTRACE, "VFS Op:inode=%lu/%u(%p)\n",
 	       inode->i_ino, inode->i_generation, inode);
@@ -339,7 +334,7 @@ int ll_getxattr_common(struct inode *inode, const char *name,
 	 */
 	if (xattr_type == XATTR_ACL_ACCESS_T &&
 	    !(sbi->ll_flags & LL_SBI_RMT_CLIENT)) {
-		struct ll_inode_info *lli = ll_i2info(inode);
+
 		struct posix_acl *acl;
 
 		spin_lock(&lli->lli_lock);
@@ -358,13 +353,27 @@ int ll_getxattr_common(struct inode *inode, const char *name,
 #endif
 
 do_getxattr:
-	if (sbi->ll_xattr_cache_enabled && (rce == NULL ||
-					    rce->rce_ops == RMT_LGETFACL ||
-					    rce->rce_ops == RMT_LSETFACL)) {
+	if (sbi->ll_xattr_cache_enabled && xattr_type != XATTR_ACL_ACCESS_T) {
 		rc = ll_xattr_cache_get(inode, name, buffer, size, valid);
+		if (rc == -EAGAIN)
+			goto getxattr_nocache;
 		if (rc < 0)
 			GOTO(out_xattr, rc);
+
+		/* Add "system.posix_acl_access" to the list */
+		if (lli->lli_posix_acl != NULL && valid & OBD_MD_FLXATTRLS) {
+			if (size == 0) {
+				rc += sizeof(XATTR_NAME_ACL_ACCESS);
+			} else if (size - rc >= sizeof(XATTR_NAME_ACL_ACCESS)) {
+				memcpy(buffer + rc, XATTR_NAME_ACL_ACCESS,
+				       sizeof(XATTR_NAME_ACL_ACCESS));
+				rc += sizeof(XATTR_NAME_ACL_ACCESS);
+			} else {
+				GOTO(out_xattr, rc = -ERANGE);
+			}
+		}
 	} else {
+getxattr_nocache:
 		oc = ll_mdscapa_get(inode);
 		rc = md_getxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
 				valid | (rce ? rce_ops2valid(rce->rce_ops) : 0),
diff --git a/drivers/staging/lustre/lustre/llite/xattr_cache.c b/drivers/staging/lustre/lustre/llite/xattr_cache.c
index fd303cb..f9f2855 100644
--- a/drivers/staging/lustre/lustre/llite/xattr_cache.c
+++ b/drivers/staging/lustre/lustre/llite/xattr_cache.c
@@ -121,13 +121,13 @@ static int ll_xattr_cache_find(struct list_head *cache,
 }
 
 /**
- * This adds or updates an xattr.
+ * This adds an xattr.
  *
  * Add @xattr_name attr with @xattr_val value and @xattr_val_len length,
- * if the attribute already exists, then update its value.
  *
  * \retval 0       success
  * \retval -ENOMEM if no memory could be allocated for the cached attr
+ * \retval -EPROTO if duplicate xattr is being added
  */
 static int ll_xattr_cache_add(struct list_head *cache,
 			      const char *xattr_name,
@@ -139,28 +139,8 @@ static int ll_xattr_cache_add(struct list_head *cache,
 
 
 	if (ll_xattr_cache_find(cache, xattr_name, &xattr) == 0) {
-		/* Found a cached EA, update it */
-
-		if (xattr_val_len != xattr->xe_vallen) {
-			char *val;
-			OBD_ALLOC(val, xattr_val_len);
-			if (val == NULL) {
-				CDEBUG(D_CACHE, "failed to allocate %u bytes "
-						"for xattr %s update\n",
-						xattr_val_len,
-						xattr_name);
-				return -ENOMEM;
-			}
-			OBD_FREE(xattr->xe_value, xattr->xe_vallen);
-			xattr->xe_value = val;
-			xattr->xe_vallen = xattr_val_len;
-		}
-		memcpy(xattr->xe_value, xattr_val, xattr_val_len);
-
-		CDEBUG(D_CACHE, "update: [%s]=%.*s\n", xattr_name,
-			xattr_val_len, xattr_val);
-
-		return 0;
+		CDEBUG(D_CACHE, "duplicate xattr: [%s]\n", xattr_name);
+		return -EPROTO;
 	}
 
 	OBD_SLAB_ALLOC_PTR_GFP(xattr, xattr_kmem, __GFP_IO);
@@ -316,7 +296,7 @@ int ll_xattr_cache_destroy(struct inode *inode)
 }
 
 /**
- * Match or enqueue a PR or PW LDLM lock.
+ * Match or enqueue a PR lock.
  *
  * Find or request an LDLM lock with xattr data.
  * Since LDLM does not provide API for atomic match_or_enqueue,
@@ -346,9 +326,7 @@ static int ll_xattr_find_get_lock(struct inode *inode,
 
 	mutex_lock(&lli->lli_xattrs_enq_lock);
 	/* Try matching first. */
-	mode = ll_take_md_lock(inode, MDS_INODELOCK_XATTR, &lockh, 0,
-			       oit->it_op == IT_SETXATTR ? LCK_PW :
-							   (LCK_PR | LCK_PW));
+	mode = ll_take_md_lock(inode, MDS_INODELOCK_XATTR, &lockh, 0, LCK_PR);
 	if (mode != 0) {
 		/* fake oit in mdc_revalidate_lock() manner */
 		oit->d.lustre.it_lock_handle = lockh.cookie;
@@ -364,13 +342,7 @@ static int ll_xattr_find_get_lock(struct inode *inode,
 		return PTR_ERR(op_data);
 	}
 
-	op_data->op_valid = OBD_MD_FLXATTR | OBD_MD_FLXATTRLS |
-			    OBD_MD_FLXATTRLOCKED;
-#ifdef CONFIG_FS_POSIX_ACL
-	/* If working with ACLs, we would like to cache local ACLs */
-	if (sbi->ll_flags & LL_SBI_RMT_CLIENT)
-		op_data->op_valid |= OBD_MD_FLRMTLGETFACL;
-#endif
+	op_data->op_valid = OBD_MD_FLXATTR | OBD_MD_FLXATTRLS;
 
 	rc = md_enqueue(exp, &einfo, oit, op_data, &lockh, NULL, 0, NULL, 0);
 	ll_finish_md_op_data(op_data);
@@ -432,7 +404,11 @@ static int ll_xattr_cache_refill(struct inode *inode, struct lookup_intent *oit)
 	if (oit->d.lustre.it_status < 0) {
 		CDEBUG(D_CACHE, "getxattr intent returned %d for fid "DFID"\n",
 		       oit->d.lustre.it_status, PFID(ll_inode2fid(inode)));
-		GOTO(out_destroy, rc = oit->d.lustre.it_status);
+		rc = oit->d.lustre.it_status;
+		/* xattr data is so large that we don't want to cache it */
+		if (rc == -ERANGE)
+			rc = -EAGAIN;
+		GOTO(out_destroy, rc);
 	}
 
 	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
@@ -470,6 +446,11 @@ static int ll_xattr_cache_refill(struct inode *inode, struct lookup_intent *oit)
 			rc = -EPROTO;
 		} else if (OBD_FAIL_CHECK(OBD_FAIL_LLITE_XATTR_ENOMEM)) {
 			rc = -ENOMEM;
+		} else if (!strcmp(xdata, XATTR_NAME_ACL_ACCESS)) {
+			/* Filter out ACL ACCESS since it's cached separately */
+			CDEBUG(D_CACHE, "not caching %s\n",
+			       XATTR_NAME_ACL_ACCESS);
+			rc = 0;
 		} else {
 			rc = ll_xattr_cache_add(&lli->lli_xattrs, xdata, xval,
 						*xsizes);
@@ -490,9 +471,8 @@ static int ll_xattr_cache_refill(struct inode *inode, struct lookup_intent *oit)
 
 	GOTO(out_maybe_drop, rc);
 out_maybe_drop:
-	/* drop lock on error or getxattr */
-	if (rc != 0 || oit->it_op != IT_SETXATTR)
-		ll_intent_drop_lock(oit);
+
+	ll_intent_drop_lock(oit);
 
 	if (rc != 0)
 		up_write(&lli->lli_xattrs_list_rwsem);
@@ -577,64 +557,3 @@ out:
 	return rc;
 }
 
-
-/**
- * Set/update an xattr value or remove xattr using the write-through cache.
- *
- * Set/update the xattr value (if @valid has OBD_MD_FLXATTR) of @name to @newval
- * or
- * remove the xattr @name (@valid has OBD_MD_FLXATTRRM set) from @inode.
- * @flags is either XATTR_CREATE or XATTR_REPLACE as defined by setxattr(2)
- *
- * \retval 0        no error occured
- * \retval -EPROTO  network protocol error
- * \retval -ENOMEM  not enough memory for the cache
- * \retval -ERANGE  the buffer is not large enough
- * \retval -ENODATA no such attr (in the removal case)
- */
-int ll_xattr_cache_update(struct inode *inode,
-			const char *name,
-			const char *newval,
-			size_t size,
-			__u64 valid,
-			int flags)
-{
-	struct lookup_intent oit = { .it_op = IT_SETXATTR };
-	struct ll_sb_info *sbi = ll_i2sbi(inode);
-	struct ptlrpc_request *req = NULL;
-	struct ll_inode_info *lli = ll_i2info(inode);
-	struct obd_capa *oc;
-	int rc;
-
-
-
-	LASSERT(!!(valid & OBD_MD_FLXATTR) ^ !!(valid & OBD_MD_FLXATTRRM));
-
-	rc = ll_xattr_cache_refill(inode, &oit);
-	if (rc)
-		return rc;
-
-	oc = ll_mdscapa_get(inode);
-	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), oc,
-			valid | OBD_MD_FLXATTRLOCKED, name, newval,
-			size, 0, flags, ll_i2suppgid(inode), &req);
-	capa_put(oc);
-
-	if (rc) {
-		ll_intent_drop_lock(&oit);
-		GOTO(out, rc);
-	}
-
-	if (valid & OBD_MD_FLXATTR)
-		rc = ll_xattr_cache_add(&lli->lli_xattrs, name, newval, size);
-	else if (valid & OBD_MD_FLXATTRRM)
-		rc = ll_xattr_cache_del(&lli->lli_xattrs, name);
-
-	ll_intent_drop_lock(&oit);
-	GOTO(out, rc);
-out:
-	up_write(&lli->lli_xattrs_list_rwsem);
-	ptlrpc_req_finished(req);
-
-	return rc;
-}
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_internal.h b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
index 5069829..fc21777 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_internal.h
+++ b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
@@ -101,7 +101,7 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 		struct lustre_handle *lockh, void *lmm, int lmmsize,
 		struct ptlrpc_request **req, __u64 extra_lock_flags);
 
-int mdc_resource_get_unused(struct obd_export *exp, struct lu_fid *fid,
+int mdc_resource_get_unused(struct obd_export *exp, const struct lu_fid *fid,
 			    struct list_head *cancels, ldlm_mode_t mode,
 			    __u64 bits);
 /* mdc/mdc_request.c */
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index f285cbe..b5a3f26 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -379,13 +379,6 @@ mdc_intent_getxattr_pack(struct obd_export *exp,
 
 	mdc_set_capa_size(req, &RMF_CAPA1, op_data->op_capa1);
 
-	if (it->it_op == IT_SETXATTR)
-		/* If we want to upgrade to LCK_PW, let's cancel LCK_PR
-		 * locks now. This avoids unnecessary ASTs. */
-		count = mdc_resource_get_unused(exp, &op_data->op_fid1,
-						&cancels, LCK_PW,
-						MDS_INODELOCK_XATTR);
-
 	rc = ldlm_prep_enqueue_req(exp, req, &cancels, count);
 	if (rc) {
 		ptlrpc_request_free(req);
@@ -843,7 +836,7 @@ resend:
 			return -EOPNOTSUPP;
 		req = mdc_intent_layout_pack(exp, it, op_data);
 		lvb_type = LVB_T_LAYOUT;
-	} else if (it->it_op & (IT_GETXATTR | IT_SETXATTR)) {
+	} else if (it->it_op & IT_GETXATTR) {
 		req = mdc_intent_getxattr_pack(exp, it, op_data);
 	} else {
 		LBUG();
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_reint.c b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
index 9f3a345..1aea154 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_reint.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
@@ -66,7 +66,7 @@ static int mdc_reint(struct ptlrpc_request *request,
 /* Find and cancel locally locks matched by inode @bits & @mode in the resource
  * found by @fid. Found locks are added into @cancel list. Returns the amount of
  * locks added to @cancels list. */
-int mdc_resource_get_unused(struct obd_export *exp, struct lu_fid *fid,
+int mdc_resource_get_unused(struct obd_export *exp, const struct lu_fid *fid,
 			    struct list_head *cancels, ldlm_mode_t mode,
 			    __u64 bits)
 {
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 0b43251..c6e4fb2 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -355,10 +355,32 @@ static int mdc_xattr_common(struct obd_export *exp,const struct req_format *fmt,
 				     input_size);
 	}
 
-	rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, opcode);
-	if (rc) {
-		ptlrpc_request_free(req);
-		return rc;
+	/* Flush local XATTR locks to get rid of a possible cancel RPC */
+	if (opcode == MDS_REINT && fid_is_sane(fid) &&
+	    exp->exp_connect_data.ocd_ibits_known & MDS_INODELOCK_XATTR) {
+		LIST_HEAD(cancels);
+		int count;
+
+		/* Without that packing would fail */
+		if (input_size == 0)
+			req_capsule_set_size(&req->rq_pill, &RMF_EADATA,
+					     RCL_CLIENT, 0);
+
+		count = mdc_resource_get_unused(exp, fid,
+						&cancels, LCK_EX,
+						MDS_INODELOCK_XATTR);
+
+		rc = mdc_prep_elc_req(exp, req, MDS_REINT, &cancels, count);
+		if (rc) {
+			ptlrpc_request_free(req);
+			return rc;
+		}
+	} else {
+		rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, opcode);
+		if (rc) {
+			ptlrpc_request_free(req);
+			return rc;
+		}
 	}
 
 	if (opcode == MDS_REINT) {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index d7c427f..3aa5539 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -295,7 +295,8 @@ static const struct req_msg_field *mds_reint_setxattr_client[] = {
 	&RMF_REC_REINT,
 	&RMF_CAPA1,
 	&RMF_NAME,
-	&RMF_EADATA
+	&RMF_EADATA,
+	&RMF_DLM_REQ
 };
 
 static const struct req_msg_field *mdt_swap_layouts[] = {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 31/40] staging/lustre/lnet: Add LNet Router Priority parameter
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (29 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 30/40] staging/lustre/xattr: separate ACL and XATTR caches Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 32/40] staging/lustre/api: HSM import uses new released pattern Peng Tao
                   ` (9 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Doug Oucharek, Peng Tao, Andreas Dilger

From: Doug Oucharek <doug.s.oucharek@intel.com>

This change adds a priority parameter to the route module settings.
This paramter can be >= 0. Like hops, the lower the prioirty number
the higher the priority.  So lower numbered priorities will be
selected over higher numbers.

Lustre-change: http://review.whamcloud.com/5663
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2934
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |    1 +
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    5 ++-
 .../staging/lustre/include/linux/lnet/lib-types.h  |    2 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    5 ++-
 drivers/staging/lustre/lnet/lnet/config.c          |   39 +++++++++++++++++++-
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    6 +++
 drivers/staging/lustre/lnet/lnet/router.c          |   19 ++++++----
 drivers/staging/lustre/lnet/lnet/router_proc.c     |   16 ++++----
 8 files changed, 72 insertions(+), 21 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
index 5be3679..efdc816 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
@@ -69,6 +69,7 @@ struct libcfs_ioctl_data {
 	char ioc_bulk[0];
 };
 
+#define ioc_priority ioc_u32[0]
 
 struct libcfs_ioctl_hdr {
 	__u32 ioc_len;
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index bf30104..3ac2bb5 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -650,12 +650,13 @@ extern lnet_ni_t *lnet_net2ni(__u32 net);
 
 int lnet_notify(lnet_ni_t *ni, lnet_nid_t peer, int alive, cfs_time_t when);
 void lnet_notify_locked(lnet_peer_t *lp, int notifylnd, int alive, cfs_time_t when);
-int lnet_add_route(__u32 net, unsigned int hops, lnet_nid_t gateway_nid);
+int lnet_add_route(__u32 net, unsigned int hops, lnet_nid_t gateway_nid,
+		   unsigned int priority);
 int lnet_check_routes(void);
 int lnet_del_route(__u32 net, lnet_nid_t gw_nid);
 void lnet_destroy_routes(void);
 int lnet_get_route(int idx, __u32 *net, __u32 *hops,
-		   lnet_nid_t *gateway, __u32 *alive);
+		   lnet_nid_t *gateway, __u32 *alive, __u32 *priority);
 void lnet_proc_init(void);
 void lnet_proc_fini(void);
 int  lnet_rtrpools_alloc(int im_a_router);
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index e579e7e..dd8edcf 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -478,7 +478,6 @@ typedef struct lnet_peer {
 	lnet_rc_data_t		*lp_rcd;	/* router checker state */
 } lnet_peer_t;
 
-
 /* peer hash size */
 #define LNET_PEER_HASH_BITS     9
 #define LNET_PEER_HASH_SIZE     (1 << LNET_PEER_HASH_BITS)
@@ -504,6 +503,7 @@ typedef struct {
 	int			lr_seq;		/* sequence for round-robin */
 	unsigned int		lr_downis;	/* number of down NIs */
 	unsigned int		lr_hops;	/* how far I am */
+	unsigned int            lr_priority;    /* route priority */
 } lnet_route_t;
 
 #define LNET_REMOTE_NETS_HASH_DEFAULT	(1U << 7)
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 160a429..b3aed00 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1436,7 +1436,7 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_ADD_ROUTE:
 		rc = lnet_add_route(data->ioc_net, data->ioc_count,
-				    data->ioc_nid);
+				    data->ioc_nid, data->ioc_priority);
 		return (rc != 0) ? rc : lnet_check_routes();
 
 	case IOC_LIBCFS_DEL_ROUTE:
@@ -1445,7 +1445,8 @@ LNetCtl(unsigned int cmd, void *arg)
 	case IOC_LIBCFS_GET_ROUTE:
 		return lnet_get_route(data->ioc_count,
 				      &data->ioc_net, &data->ioc_count,
-				      &data->ioc_nid, &data->ioc_flags);
+				      &data->ioc_nid, &data->ioc_flags,
+				      &data->ioc_priority);
 	case IOC_LIBCFS_NOTIFY_ROUTER:
 		return lnet_notify(NULL, data->ioc_nid, data->ioc_flags,
 				   cfs_time_current() -
diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index de323f7..6a07b0a 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -603,6 +603,37 @@ lnet_parse_hops(char *str, unsigned int *hops)
 		*hops > 0 && *hops < 256);
 }
 
+#define LNET_PRIORITY_SEPARATOR (':')
+
+int
+lnet_parse_priority(char *str, unsigned int *priority, char **token)
+{
+	int   nob;
+	char *sep;
+	int   len;
+
+	sep = strchr(str, LNET_PRIORITY_SEPARATOR);
+	if (sep == NULL) {
+		*priority = 0;
+		return 0;
+	}
+	len = strlen(sep + 1);
+
+	if ((sscanf((sep+1), "%u%n", priority, &nob) < 1) || (len != nob)) {
+		/* Update the caller's token pointer so it treats the found
+		   priority as the token to report in the error message. */
+		*token += sep - str + 1;
+		return -1;
+	}
+
+	CDEBUG(D_NET, "gateway %s, priority %d, nob %d\n", str, *priority, nob);
+
+	/*
+	 * Change priority separator to \0 to be able to parse NID
+	 */
+	*sep = '\0';
+	return 0;
+}
 
 int
 lnet_parse_route(char *str, int *im_a_router)
@@ -624,6 +655,7 @@ lnet_parse_route(char *str, int *im_a_router)
 	int	       myrc = -1;
 	unsigned int      hops;
 	int	       got_hops = 0;
+	unsigned int	  priority = 0;
 
 	INIT_LIST_HEAD(&gateways);
 	INIT_LIST_HEAD(&nets);
@@ -691,6 +723,11 @@ lnet_parse_route(char *str, int *im_a_router)
 				    LNET_NETTYP(net) == LOLND)
 					goto token_error;
 			} else {
+				rc = lnet_parse_priority(ltb->ltb_text,
+							 &priority, &token);
+				if (rc < 0)
+					goto token_error;
+
 				nid = libcfs_str2nid(ltb->ltb_text);
 				if (nid == LNET_NID_ANY ||
 				    LNET_NETTYP(LNET_NIDNET(nid)) == LOLND)
@@ -720,7 +757,7 @@ lnet_parse_route(char *str, int *im_a_router)
 				continue;
 			}
 
-			rc = lnet_add_route(net, hops, nid);
+			rc = lnet_add_route(net, hops, nid, priority);
 			if (rc != 0) {
 				CERROR("Can't create route to %s via %s\n",
 				       libcfs_net2str(net),
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index b6f8ad3..a5f25a2 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1074,6 +1074,12 @@ lnet_compare_routes(lnet_route_t *r1, lnet_route_t *r2)
 	lnet_peer_t *p1 = r1->lr_gateway;
 	lnet_peer_t *p2 = r2->lr_gateway;
 
+	if (r1->lr_priority < r2->lr_priority)
+		return 1;
+
+	if (r1->lr_priority > r2->lr_priority)
+		return -1;
+
 	if (r1->lr_hops < r2->lr_hops)
 		return 1;
 
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index a326ce0..c13bcf1 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -301,7 +301,8 @@ lnet_add_route_to_rnet (lnet_remotenet_t *rnet, lnet_route_t *route)
 }
 
 int
-lnet_add_route (__u32 net, unsigned int hops, lnet_nid_t gateway)
+lnet_add_route(__u32 net, unsigned int hops, lnet_nid_t gateway,
+	       unsigned int priority)
 {
 	struct list_head	  *e;
 	lnet_remotenet_t    *rnet;
@@ -311,8 +312,8 @@ lnet_add_route (__u32 net, unsigned int hops, lnet_nid_t gateway)
 	int		  add_route;
 	int		  rc;
 
-	CDEBUG(D_NET, "Add route: net %s hops %u gw %s\n",
-	       libcfs_net2str(net), hops, libcfs_nid2str(gateway));
+	CDEBUG(D_NET, "Add route: net %s hops %u priority %u gw %s\n",
+	       libcfs_net2str(net), hops, priority, libcfs_nid2str(gateway));
 
 	if (gateway == LNET_NID_ANY ||
 	    LNET_NETTYP(LNET_NIDNET(gateway)) == LOLND ||
@@ -342,6 +343,7 @@ lnet_add_route (__u32 net, unsigned int hops, lnet_nid_t gateway)
 	rnet->lrn_net = net;
 	route->lr_hops = hops;
 	route->lr_net = net;
+	route->lr_priority = priority;
 
 	lnet_net_lock(LNET_LOCK_EX);
 
@@ -552,7 +554,7 @@ lnet_destroy_routes (void)
 
 int
 lnet_get_route(int idx, __u32 *net, __u32 *hops,
-	       lnet_nid_t *gateway, __u32 *alive)
+	       lnet_nid_t *gateway, __u32 *alive, __u32 *priority)
 {
 	struct list_head		*e1;
 	struct list_head		*e2;
@@ -574,10 +576,11 @@ lnet_get_route(int idx, __u32 *net, __u32 *hops,
 						       lr_list);
 
 				if (idx-- == 0) {
-					*net     = rnet->lrn_net;
-					*hops    = route->lr_hops;
-					*gateway = route->lr_gateway->lp_nid;
-					*alive   = route->lr_gateway->lp_alive;
+					*net	  = rnet->lrn_net;
+					*hops	  = route->lr_hops;
+					*priority = route->lr_priority;
+					*gateway  = route->lr_gateway->lp_nid;
+					*alive	  = route->lr_gateway->lp_alive;
 					lnet_net_unlock(cpt);
 					return 0;
 				}
diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
index 5e47de3..28b16ca 100644
--- a/drivers/staging/lustre/lnet/lnet/router_proc.c
+++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
@@ -174,8 +174,8 @@ int LL_PROC_PROTO(proc_lnet_routes)
 			      the_lnet.ln_routing ? "enabled" : "disabled");
 		LASSERT(tmpstr + tmpsiz - s > 0);
 
-		s += snprintf(s, tmpstr + tmpsiz - s, "%-8s %4s %7s %s\n",
-			      "net", "hops", "state", "router");
+		s += snprintf(s, tmpstr + tmpsiz - s, "%-8s %4s %8s %7s %s\n",
+			      "net", "hops", "priority", "state", "router");
 		LASSERT(tmpstr + tmpsiz - s > 0);
 
 		lnet_net_lock(0);
@@ -229,14 +229,16 @@ int LL_PROC_PROTO(proc_lnet_routes)
 		}
 
 		if (route != NULL) {
-			__u32	net   = rnet->lrn_net;
-			unsigned int hops  = route->lr_hops;
-			lnet_nid_t   nid   = route->lr_gateway->lp_nid;
-			int	  alive = route->lr_gateway->lp_alive;
+			__u32        net	= rnet->lrn_net;
+			unsigned int hops	= route->lr_hops;
+			unsigned int priority	= route->lr_priority;
+			lnet_nid_t   nid	= route->lr_gateway->lp_nid;
+			int          alive	= route->lr_gateway->lp_alive;
 
 			s += snprintf(s, tmpstr + tmpsiz - s,
-				      "%-8s %4u %7s %s\n",
+				      "%-8s %4u %8u %7s %s\n",
 				      libcfs_net2str(net), hops,
+				      priority,
 				      alive ? "up" : "down",
 				      libcfs_nid2str(nid));
 			LASSERT(tmpstr + tmpsiz - s > 0);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 32/40] staging/lustre/api: HSM import uses new released pattern
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (30 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 31/40] staging/lustre/lnet: Add LNet Router Priority parameter Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 33/40] staging/lustre/target: move OUT to the unified target code Peng Tao
                   ` (8 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, JC Lafoucriere, Peng Tao, Andreas Dilger

From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>

Import creates a released file using new RAID pattern flag
Import used a new ioctl() to implement the import in the
client kernel.
Add a -L|--layout option to lfs getstripe and lfs find

Lustre-change: http://review.whamcloud.com/6536
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3363
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
[Fix up kuid_t/guid_t conversion in llite/file.c -- Peng Tao]
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |   17 +-
 .../lustre/lustre/include/lustre/lustreapi.h       |  172 +++++++++++---------
 drivers/staging/lustre/lustre/llite/file.c         |  118 +++++++++++---
 .../staging/lustre/lustre/llite/llite_internal.h   |    2 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   21 ++-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   40 +++++
 6 files changed, 261 insertions(+), 109 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index ec1bcf5..49ab5be 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -244,9 +244,9 @@ struct ost_id {
 #define LL_IOC_LMV_SETSTRIPE		_IOWR('f', 240, struct lmv_user_md)
 #define LL_IOC_LMV_GETSTRIPE		_IOWR('f', 241, struct lmv_user_md)
 #define LL_IOC_REMOVE_ENTRY		_IOWR('f', 242, __u64)
-
 #define LL_IOC_SET_LEASE		_IOWR('f', 243, long)
 #define LL_IOC_GET_LEASE		_IO('f', 244)
+#define LL_IOC_HSM_IMPORT		_IOWR('f', 245, struct hsm_user_import)
 
 #define LL_STATFS_LMV		1
 #define LL_STATFS_LOV		2
@@ -1130,6 +1130,21 @@ static inline int hal_size(struct hsm_action_list *hal)
 	return sz;
 }
 
+/* HSM file import
+ * describe the attributes to be set on imported file
+ */
+struct hsm_user_import {
+	__u64		hui_size;
+	__u64		hui_atime;
+	__u64		hui_mtime;
+	__u32		hui_atime_ns;
+	__u32		hui_mtime_ns;
+	__u32		hui_uid;
+	__u32		hui_gid;
+	__u32		hui_mode;
+	__u32		hui_archive_id;
+};
+
 /* Copytool progress reporting */
 #define HP_FLAG_COMPLETED 0x01
 #define HP_FLAG_RETRY     0x02
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustreapi.h b/drivers/staging/lustre/lustre/include/lustre/lustreapi.h
index 63da665..1748138 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustreapi.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustreapi.h
@@ -90,97 +90,110 @@ extern int llapi_file_get_stripe(const char *path, struct lov_user_md *lum);
 #define HAVE_LLAPI_FILE_LOOKUP
 extern int llapi_file_lookup(int dirfd, const char *name);
 
-#define VERBOSE_COUNT      0x1
-#define VERBOSE_SIZE       0x2
-#define VERBOSE_OFFSET     0x4
-#define VERBOSE_POOL       0x8
-#define VERBOSE_DETAIL     0x10
-#define VERBOSE_OBJID      0x20
-#define VERBOSE_GENERATION 0x40
-#define VERBOSE_MDTINDEX   0x80
-#define VERBOSE_ALL	(VERBOSE_COUNT | VERBOSE_SIZE | VERBOSE_OFFSET | \
-			    VERBOSE_POOL | VERBOSE_OBJID | VERBOSE_GENERATION)
+#define VERBOSE_COUNT		0x1
+#define VERBOSE_SIZE		0x2
+#define VERBOSE_OFFSET		0x4
+#define VERBOSE_POOL		0x8
+#define VERBOSE_DETAIL		0x10
+#define VERBOSE_OBJID		0x20
+#define VERBOSE_GENERATION	0x40
+#define VERBOSE_MDTINDEX	0x80
+#define VERBOSE_LAYOUT		0x100
+#define VERBOSE_ALL		(VERBOSE_COUNT | VERBOSE_SIZE | \
+				 VERBOSE_OFFSET | VERBOSE_POOL | \
+				 VERBOSE_OBJID | VERBOSE_GENERATION |\
+				 VERBOSE_LAYOUT)
 
 struct find_param {
-	unsigned int maxdepth;
-	time_t  atime;
-	time_t  mtime;
-	time_t  ctime;
-	int     asign;  /* cannot be bitfields due to using pointers to */
-	int     csign;  /* access them during argument parsing. */
-	int     msign;
-	int     type;
-	int	     size_sign:2,	/* these need to be signed values */
-			stripesize_sign:2,
-			stripecount_sign:2;
-	unsigned long long size;
-	unsigned long long size_units;
-	uid_t uid;
-	gid_t gid;
-
-	unsigned long   zeroend:1,
-			recursive:1,
-			exclude_pattern:1,
-			exclude_type:1,
-			exclude_obd:1,
-			exclude_mdt:1,
-			exclude_gid:1,
-			exclude_uid:1,
-			check_gid:1,	    /* group ID */
-			check_uid:1,	    /* user ID */
-			check_pool:1,	   /* LOV pool name */
-			check_size:1,	   /* file size */
-			exclude_pool:1,
-			exclude_size:1,
-			exclude_atime:1,
-			exclude_mtime:1,
-			exclude_ctime:1,
-			get_lmv:1,	      /* get MDT list from LMV */
-			raw:1,		  /* do not fill in defaults */
-			check_stripesize:1,     /* LOV stripe size */
-			exclude_stripesize:1,
-			check_stripecount:1,    /* LOV stripe count */
-			exclude_stripecount:1;
-
-	int     verbose;
-	int     quiet;
+	unsigned int		 maxdepth;
+	time_t			 atime;
+	time_t			 mtime;
+	time_t			 ctime;
+	/* cannot be bitfields due to using pointers to */
+	int			 asign;
+	/* access them during argument parsing. */
+	int			 csign;
+	int			 msign;
+	int			 type;
+	/* these need to be signed values */
+	int			 size_sign:2,
+				 stripesize_sign:2,
+				 stripecount_sign:2;
+	unsigned long long	 size;
+	unsigned long long	 size_units;
+	uid_t			 uid;
+	gid_t			 gid;
+
+	unsigned long		 zeroend:1,
+				 recursive:1,
+				 exclude_pattern:1,
+				 exclude_type:1,
+				 exclude_obd:1,
+				 exclude_mdt:1,
+				 exclude_gid:1,
+				 exclude_uid:1,
+				 check_gid:1,		/* group ID */
+				 check_uid:1,		/* user ID */
+				 check_pool:1,		/* LOV pool name */
+				 check_size:1,		/* file size */
+				 exclude_pool:1,
+				 exclude_size:1,
+				 exclude_atime:1,
+				 exclude_mtime:1,
+				 exclude_ctime:1,
+				 get_lmv:1,	/* get MDT list from LMV */
+				 raw:1,		/* do not fill in defaults */
+				 check_stripesize:1,	/* LOV stripe size */
+				 exclude_stripesize:1,
+				 check_stripecount:1,	/* LOV stripe count */
+				 exclude_stripecount:1,
+				 check_layout:1,
+				 exclude_layout:1;
+
+	int			 verbose;
+	int			 quiet;
 
 	/* regular expression */
-	char   *pattern;
+	char			*pattern;
 
-	char   *print_fmt;
+	char			*print_fmt;
 
-	struct  obd_uuid       *obduuid;
-	int		     num_obds;
-	int		     num_alloc_obds;
-	int		     obdindex;
-	int		    *obdindexes;
+	struct  obd_uuid	*obduuid;
+	int			 num_obds;
+	int			 num_alloc_obds;
+	int			 obdindex;
+	int			*obdindexes;
 
-	struct  obd_uuid       *mdtuuid;
-	int		     num_mdts;
-	int		     num_alloc_mdts;
-	int		     mdtindex;
-	int		    *mdtindexes;
-	int		     file_mdtindex;
+	struct  obd_uuid	*mdtuuid;
+	int			 num_mdts;
+	int			 num_alloc_mdts;
+	int			 mdtindex;
+	int			*mdtindexes;
+	int			 file_mdtindex;
 
-	int	lumlen;
-	struct  lov_user_mds_data *lmd;
+	int			 lumlen;
+	struct  lov_user_mds_data	*lmd;
 
-	char poolname[LOV_MAXPOOLNAME + 1];
+	char			poolname[LOV_MAXPOOLNAME + 1];
 
-	int			fp_lmv_count;
+	int			 fp_lmv_count;
 	struct lmv_user_md	*fp_lmv_md;
 
-	unsigned long long stripesize;
-	unsigned long long stripesize_units;
-	unsigned long long stripecount;
+	unsigned long long	 stripesize;
+	unsigned long long	 stripesize_units;
+	unsigned long long	 stripecount;
+	__u32			 layout;
 
 	/* In-process parameters. */
-	unsigned long   got_uuids:1,
-			obds_printed:1,
-			have_fileinfo:1;	/* file attrs and LOV xattr */
-	unsigned int    depth;
-	dev_t	   st_dev;
+	unsigned long		 got_uuids:1,
+				 obds_printed:1,
+				 have_fileinfo:1; /* file attrs and LOV xattr */
+	unsigned int		 depth;
+	dev_t			 st_dev;
+	__u64			 padding1;
+	__u64			 padding2;
+	__u64			 padding3;
+	__u64			 padding4;
 };
 
 extern int llapi_ostlist(char *path, struct find_param *param);
@@ -241,7 +254,10 @@ extern int llapi_fd2fid(const int fd, lustre_fid *fid);
 
 extern int llapi_get_version(char *buffer, int buffer_size, char **version);
 extern int llapi_get_data_version(int fd, __u64 *data_version, __u64 flags);
+extern int llapi_hsm_state_get_fd(int fd, struct hsm_user_state *hus);
 extern int llapi_hsm_state_get(const char *path, struct hsm_user_state *hus);
+extern int llapi_hsm_state_set_fd(int fd, __u64 setmask, __u64 clearmask,
+				  __u32 archive_id);
 extern int llapi_hsm_state_set(const char *path, __u64 setmask, __u64 clearmask,
 			       __u32 archive_id);
 
@@ -294,7 +310,7 @@ extern int llapi_hsm_copy_start(char *mnt, struct hsm_copy *copy,
 extern int llapi_hsm_copy_end(char *mnt, struct hsm_copy *copy,
 			      const struct hsm_progress *hp);
 extern int llapi_hsm_progress(char *mnt, struct hsm_progress *hp);
-extern int llapi_hsm_import(const char *dst, int archive, struct stat *st,
+extern int llapi_hsm_import(const char *dst, int archive, const struct stat *st,
 			    unsigned long long stripe_size, int stripe_offset,
 			    int stripe_count, int stripe_pattern,
 			    char *pool_name, lustre_fid *newfid);
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 94a46b5..02b297b 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2116,6 +2116,86 @@ free:
 	return rc;
 }
 
+static int ll_hsm_state_set(struct inode *inode, struct hsm_state_set *hss)
+{
+	struct md_op_data	*op_data;
+	int			 rc;
+
+	/* Non-root users are forbidden to set or clear flags which are
+	 * NOT defined in HSM_USER_MASK. */
+	if (((hss->hss_setmask | hss->hss_clearmask) & ~HSM_USER_MASK) &&
+	    !cfs_capable(CFS_CAP_SYS_ADMIN))
+		return -EPERM;
+
+	op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0,
+				     LUSTRE_OPC_ANY, hss);
+	if (IS_ERR(op_data))
+		return PTR_ERR(op_data);
+
+	rc = obd_iocontrol(LL_IOC_HSM_STATE_SET, ll_i2mdexp(inode),
+			   sizeof(*op_data), op_data, NULL);
+
+	ll_finish_md_op_data(op_data);
+
+	return rc;
+}
+
+static int ll_hsm_import(struct inode *inode, struct file *file,
+			 struct hsm_user_import *hui)
+{
+	struct hsm_state_set	*hss = NULL;
+	struct iattr		*attr = NULL;
+	int			 rc;
+
+
+	if (!S_ISREG(inode->i_mode))
+		return -EINVAL;
+
+	/* set HSM flags */
+	OBD_ALLOC_PTR(hss);
+	if (hss == NULL)
+		GOTO(out, rc = -ENOMEM);
+
+	hss->hss_valid = HSS_SETMASK | HSS_ARCHIVE_ID;
+	hss->hss_archive_id = hui->hui_archive_id;
+	hss->hss_setmask = HS_ARCHIVED | HS_EXISTS | HS_RELEASED;
+	rc = ll_hsm_state_set(inode, hss);
+	if (rc != 0)
+		GOTO(out, rc);
+
+	OBD_ALLOC_PTR(attr);
+	if (attr == NULL)
+		GOTO(out, rc = -ENOMEM);
+
+	attr->ia_mode = hui->hui_mode & (S_IRWXU | S_IRWXG | S_IRWXO);
+	attr->ia_mode |= S_IFREG;
+	attr->ia_uid = make_kuid(&init_user_ns, hui->hui_uid);
+	attr->ia_gid = make_kgid(&init_user_ns, hui->hui_gid);
+	attr->ia_size = hui->hui_size;
+	attr->ia_mtime.tv_sec = hui->hui_mtime;
+	attr->ia_mtime.tv_nsec = hui->hui_mtime_ns;
+	attr->ia_atime.tv_sec = hui->hui_atime;
+	attr->ia_atime.tv_nsec = hui->hui_atime_ns;
+
+	attr->ia_valid = ATTR_SIZE | ATTR_MODE | ATTR_FORCE |
+			 ATTR_UID | ATTR_GID |
+			 ATTR_MTIME | ATTR_MTIME_SET |
+			 ATTR_ATIME | ATTR_ATIME_SET;
+
+	rc = ll_setattr_raw(file->f_dentry, attr, true);
+	if (rc == -ENODATA)
+		rc = 0;
+
+out:
+	if (hss != NULL)
+		OBD_FREE_PTR(hss);
+
+	if (attr != NULL)
+		OBD_FREE_PTR(attr);
+
+	return rc;
+}
+
 long ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct inode		*inode = file->f_dentry->d_inode;
@@ -2277,37 +2357,19 @@ long ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		return rc;
 	}
 	case LL_IOC_HSM_STATE_SET: {
-		struct md_op_data	*op_data;
 		struct hsm_state_set	*hss;
 		int			 rc;
 
 		OBD_ALLOC_PTR(hss);
 		if (hss == NULL)
 			return -ENOMEM;
+
 		if (copy_from_user(hss, (char *)arg, sizeof(*hss))) {
 			OBD_FREE_PTR(hss);
 			return -EFAULT;
 		}
 
-		/* Non-root users are forbidden to set or clear flags which are
-		 * NOT defined in HSM_USER_MASK. */
-		if (((hss->hss_setmask | hss->hss_clearmask) & ~HSM_USER_MASK)
-		    && !cfs_capable(CFS_CAP_SYS_ADMIN)) {
-			OBD_FREE_PTR(hss);
-			return -EPERM;
-		}
-
-		op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0,
-					     LUSTRE_OPC_ANY, hss);
-		if (IS_ERR(op_data)) {
-			OBD_FREE_PTR(hss);
-			return PTR_ERR(op_data);
-		}
-
-		rc = obd_iocontrol(cmd, ll_i2mdexp(inode), sizeof(*op_data),
-				   op_data, NULL);
-
-		ll_finish_md_op_data(op_data);
+		rc = ll_hsm_state_set(inode, hss);
 
 		OBD_FREE_PTR(hss);
 		return rc;
@@ -2419,7 +2481,23 @@ long ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 			}
 		}
 		mutex_unlock(&lli->lli_och_mutex);
+		return rc;
+	}
+	case LL_IOC_HSM_IMPORT: {
+		struct hsm_user_import *hui;
+
+		OBD_ALLOC_PTR(hui);
+		if (hui == NULL)
+			return -ENOMEM;
+
+		if (copy_from_user(hui, (void *)arg, sizeof(*hui))) {
+			OBD_FREE_PTR(hui);
+			return -EFAULT;
+		}
+
+		rc = ll_hsm_import(inode, file, hui);
 
+		OBD_FREE_PTR(hui);
 		return rc;
 	}
 	default: {
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 7f0197c..5028c3b 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -839,7 +839,7 @@ void ll_kill_super(struct super_block *sb);
 struct inode *ll_inode_from_resource_lock(struct ldlm_lock *lock);
 struct inode *ll_inode_from_lock(struct ldlm_lock *lock);
 void ll_clear_inode(struct inode *inode);
-int ll_setattr_raw(struct dentry *dentry, struct iattr *attr);
+int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import);
 int ll_setattr(struct dentry *de, struct iattr *attr);
 int ll_statfs(struct dentry *de, struct kstatfs *sfs);
 int ll_statfs_internal(struct super_block *sb, struct obd_statfs *osfs,
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 876083d..3b7e022 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1362,8 +1362,10 @@ static int ll_setattr_ost(struct inode *inode, struct iattr *attr)
  * to the OST with the punch RPC, otherwise we do an explicit setattr RPC.
  * I don't believe it is possible to get e.g. ATTR_MTIME_SET and ATTR_SIZE
  * at the same time.
+ *
+ * In case of HSMimport, we only set attr on MDS.
  */
-int ll_setattr_raw(struct dentry *dentry, struct iattr *attr)
+int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 {
 	struct inode *inode = dentry->d_inode;
 	struct ll_inode_info *lli = ll_i2info(inode);
@@ -1373,9 +1375,10 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr)
 	int rc = 0, rc1 = 0;
 
 	CDEBUG(D_VFSTRACE, "%s: setattr inode %p/fid:"DFID" from %llu to %llu, "
-		"valid %x\n", ll_get_fsname(inode->i_sb, NULL, 0), inode,
+		"valid %x, hsm_import %d\n",
+		ll_get_fsname(inode->i_sb, NULL, 0), inode,
 		PFID(&lli->lli_fid), i_size_read(inode), attr->ia_size,
-		attr->ia_valid);
+		attr->ia_valid, hsm_import);
 
 	if (attr->ia_valid & ATTR_SIZE) {
 		/* Check new size against VFS/VM file size limit and rlimit */
@@ -1468,20 +1471,20 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr)
 		ccc_inode_lsm_put(inode, lsm);
 	}
 
-	/* clear size attr for released file
+	/* if not in HSM import mode, clear size attr for released file
 	 * we clear the attribute send to MDT in op_data, not the original
 	 * received from caller in attr which is used later to
 	 * decide return code */
-	if (file_is_released && (attr->ia_valid & ATTR_SIZE))
+	if (file_is_released && (attr->ia_valid & ATTR_SIZE) && !hsm_import)
 		op_data->op_attr.ia_valid &= ~ATTR_SIZE;
 
 	rc = ll_md_setattr(dentry, op_data, &mod);
 	if (rc)
 		GOTO(out, rc);
 
-	/* truncate failed, others succeed */
+	/* truncate failed (only when non HSM import), others succeed */
 	if (file_is_released) {
-		if (attr->ia_valid & ATTR_SIZE)
+		if ((attr->ia_valid & ATTR_SIZE) && !hsm_import)
 			GOTO(out, rc = -ENODATA);
 		else
 			GOTO(out, rc = 0);
@@ -1521,7 +1524,7 @@ out:
 	if (!S_ISDIR(inode->i_mode)) {
 		up_write(&lli->lli_trunc_sem);
 		mutex_lock(&inode->i_mutex);
-		if (attr->ia_valid & ATTR_SIZE)
+		if ((attr->ia_valid & ATTR_SIZE) && !hsm_import)
 			inode_dio_wait(inode);
 	}
 
@@ -1556,7 +1559,7 @@ int ll_setattr(struct dentry *de, struct iattr *attr)
 	    !(attr->ia_valid & ATTR_KILL_SGID))
 		attr->ia_valid |= ATTR_KILL_SGID;
 
-	return ll_setattr_raw(de, attr);
+	return ll_setattr_raw(de, attr, false);
 }
 
 int ll_statfs_internal(struct super_block *sb, struct obd_statfs *osfs,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index ca2ef59..6c31947 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -4422,6 +4422,46 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct hsm_user_request *)0)->hur_user_item) == 0, "found %lld\n",
 		 (long long)(int)sizeof(((struct hsm_user_request *)0)->hur_user_item));
 
+	/* Checks for struct hsm_user_import */
+	LASSERTF((int)sizeof(struct hsm_user_import) == 48, "found %lld\n",
+		 (long long)(int)sizeof(struct hsm_user_import));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_size) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_size));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_size) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_size));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_uid) == 32, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_uid));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_uid) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_uid));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_gid) == 36, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_gid));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_gid) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_gid));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_mode) == 40, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_mode));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_mode) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_mode));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_atime) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_atime));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_atime) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_atime));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_atime_ns) == 24, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_atime_ns));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_atime_ns) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_atime_ns));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_mtime) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_mtime));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_mtime) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_mtime));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_mtime_ns) == 28, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_mtime_ns));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_mtime_ns) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_mtime_ns));
+	LASSERTF((int)offsetof(struct hsm_user_import, hui_archive_id) == 44, "found %lld\n",
+		 (long long)(int)offsetof(struct hsm_user_import, hui_archive_id));
+	LASSERTF((int)sizeof(((struct hsm_user_import *)0)->hui_archive_id) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct hsm_user_import *)0)->hui_archive_id));
+
 	/* Checks for struct update_buf */
 	LASSERTF((int)sizeof(struct update_buf) == 8, "found %lld\n",
 		 (long long)(int)sizeof(struct update_buf));
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 33/40] staging/lustre/target: move OUT to the unified target code
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (31 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 32/40] staging/lustre/api: HSM import uses new released pattern Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 34/40] staging/lustre/seq: unified SEQ handler Peng Tao
                   ` (7 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Mikhail Pershin, Peng Tao, Andreas Dilger

From: Mikhail Pershin <mike.pershin@intel.com>

- Move OUT handler to the unified target code, so it can be
  used by both MDS and OST.
- Make OFD able to handle OUT requests.
- Rename MDS_MDS_PORTAL to the OUT_PORTAL and MDS_OUT_... defines
  just OUT_... since they are independent from MDS now.

Lustre-change: http://review.whamcloud.com/6763
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3467
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
[pick client side of change. target is mostly server side code -- PengTao]
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    3 +-
 drivers/staging/lustre/lustre/include/lustre_net.h |    8 ++---
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   32 +-------------------
 .../staging/lustre/lustre/obdecho/echo_client.c    |    2 +-
 4 files changed, 7 insertions(+), 38 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index bc4eaf2..5505cc5 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -131,8 +131,7 @@
 //#define PTLBD_BULK_PORTAL	      21
 #define MDS_SETATTR_PORTAL	     22
 #define MDS_READPAGE_PORTAL	    23
-#define MDS_MDS_PORTAL		 24
-
+#define OUT_PORTAL			24
 #define MGC_REPLY_PORTAL	       25
 #define MGS_REQUEST_PORTAL	     26
 #define MGS_REPLY_PORTAL	       27
diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 6c5479b..80227e9 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -370,8 +370,8 @@
  * include linkea (4K maxim), together with other updates, we set it to 9K:
  * lustre_msg + ptlrpc_body + UPDATE_BUF_SIZE (8K)
  */
-#define MDS_OUT_MAXREQSIZE	(9 * 1024)
-#define MDS_OUT_MAXREPSIZE	MDS_MAXREPSIZE
+#define OUT_MAXREQSIZE	(9 * 1024)
+#define OUT_MAXREPSIZE	MDS_MAXREPSIZE
 
 /** MDS_BUFSIZE = max_reqsize (w/o LOV EA) + max sptlrpc payload size */
 #define MDS_BUFSIZE		max(MDS_MAXREQSIZE + SPTLRPC_MAX_PAYLOAD, \
@@ -397,11 +397,11 @@
 				    160 * 1024)
 
 /**
- * MDS_OUT_BUFSIZE = max_out_reqsize + max sptlrpc payload (~1K) which is
+ * OUT_BUFSIZE = max_out_reqsize + max sptlrpc payload (~1K) which is
  * about 10K, for the same reason as MDS_REG_BUFSIZE, we also give some
  * extra bytes to each request buffer to improve buffer utilization rate.
   */
-#define MDS_OUT_BUFSIZE		max(MDS_OUT_MAXREQSIZE + SPTLRPC_MAX_PAYLOAD, \
+#define OUT_BUFSIZE		max(OUT_MAXREQSIZE + SPTLRPC_MAX_PAYLOAD, \
 				    24 * 1024)
 
 /** FLD_MAXREQSIZE == lustre_msg + __u32 padding + ptlrpc_body + opc */
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index c6e4fb2..e418b58 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2098,15 +2098,6 @@ int mdc_set_info_async(const struct lu_env *env,
 		sptlrpc_import_flush_my_ctx(imp);
 		return 0;
 	}
-	if (KEY_IS(KEY_MDS_CONN)) {
-		/* mds-mds import */
-		spin_lock(&imp->imp_lock);
-		imp->imp_server_timeout = 1;
-		spin_unlock(&imp->imp_lock);
-		imp->imp_client->cli_request_portal = MDS_MDS_PORTAL;
-		CDEBUG(D_OTHER, "%s: timeout / 2\n", exp->exp_obd->obd_name);
-		return 0;
-	}
 	if (KEY_IS(KEY_CHANGELOG_CLEAR)) {
 		rc = do_set_info_async(imp, MDS_SET_INFO, LUSTRE_MDS_VERSION,
 				       keylen, key, vallen, val, set);
@@ -2616,27 +2607,6 @@ static int mdc_renew_capa(struct obd_export *exp, struct obd_capa *oc,
 	return 0;
 }
 
-static int mdc_connect(const struct lu_env *env,
-		       struct obd_export **exp,
-		       struct obd_device *obd, struct obd_uuid *cluuid,
-		       struct obd_connect_data *data,
-		       void *localdata)
-{
-	struct obd_import *imp = obd->u.cli.cl_import;
-
-	/* mds-mds import features */
-	if (data && (data->ocd_connect_flags & OBD_CONNECT_MDS_MDS)) {
-		spin_lock(&imp->imp_lock);
-		imp->imp_server_timeout = 1;
-		spin_unlock(&imp->imp_lock);
-		imp->imp_client->cli_request_portal = MDS_MDS_PORTAL;
-		CDEBUG(D_OTHER, "%s: Set 'mds' portal and timeout\n",
-		       obd->obd_name);
-	}
-
-	return client_connect_import(env, exp, obd, cluuid, data, NULL);
-}
-
 struct obd_ops mdc_obd_ops = {
 	.o_owner	    = THIS_MODULE,
 	.o_setup	    = mdc_setup,
@@ -2644,7 +2614,7 @@ struct obd_ops mdc_obd_ops = {
 	.o_cleanup	  = mdc_cleanup,
 	.o_add_conn	 = client_import_add_conn,
 	.o_del_conn	 = client_import_del_conn,
-	.o_connect	  = mdc_connect,
+	.o_connect          = client_connect_import,
 	.o_disconnect       = client_disconnect_export,
 	.o_iocontrol	= mdc_iocontrol,
 	.o_set_info_async   = mdc_set_info_async,
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index 1fb0ac4..d7a3395 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -2758,7 +2758,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	if (env == NULL)
 		return -ENOMEM;
 
-	rc = lu_env_init(env, LCT_DT_THREAD);
+	rc = lu_env_init(env, LCT_DT_THREAD | LCT_MD_THREAD);
 	if (rc)
 		GOTO(out, rc = -ENOMEM);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 34/40] staging/lustre/seq: unified SEQ handler
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (32 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 33/40] staging/lustre/target: move OUT to the unified target code Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 35/40] staging/lustre/llite: remove ll_d_root_ops Peng Tao
                   ` (6 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Mikhail Pershin, Peng Tao, Andreas Dilger

From: Mikhail Pershin <mike.pershin@intel.com>

move SEQ handler from MDT to the FID module so it can
be used from any target.

Lustre-change: http://review.whamcloud.com/6765
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3467
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/lustre_fid.h |    6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_fid.h b/drivers/staging/lustre/lustre/include/lustre_fid.h
index ff11953..84a897e 100644
--- a/drivers/staging/lustre/lustre/include/lustre_fid.h
+++ b/drivers/staging/lustre/lustre/include/lustre_fid.h
@@ -431,12 +431,6 @@ struct lu_server_seq {
 	struct seq_server_site  *lss_site;
 };
 
-struct com_thread_info;
-int seq_query(struct com_thread_info *info);
-
-struct ptlrpc_request;
-int seq_handle(struct ptlrpc_request *req);
-
 /* Server methods */
 
 int seq_server_init(struct lu_server_seq *seq,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 35/40] staging/lustre/llite: remove ll_d_root_ops
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (33 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 34/40] staging/lustre/seq: unified SEQ handler Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 36/40] staging/lustre/llite: pass correct pointer to obd_iocontrol() Peng Tao
                   ` (5 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Lai Siyao, Lai Siyao, Peng Tao, Andreas Dilger

From: Lai Siyao <laisiyao@whamcloud.com>

Mnt root dentry will never be revalidated, but its d_op->d_compare
will be called for its children, to simplify code, we use the same
ll_d_ops as normal dentries.
But its attribute may be invalid before access, this won't cause
any issue because it always exists, and the only operation depends
on its attribute is .permission, which has revalidated it in lustre
code.

So it's okay to remove ll_d_root_ops, and remove unnecessary checks
in lookup/revalidate/statahead.

This patch also includes several small fixes:
* don't d_add() for create only files, because the dentry is fake,
  and will be released right after use.
* alloc ll_dentry_data may fail, add check for this.

Lustre-change: http://review.whamcloud.com/6797
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3486
Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alexey Shvetsov <alexxy@gentoo.org>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/dcache.c       |   41 +++++---------------
 .../staging/lustre/lustre/llite/llite_internal.h   |    5 ++-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   11 +-----
 drivers/staging/lustre/lustre/llite/llite_nfs.c    |    6 +--
 drivers/staging/lustre/lustre/llite/namei.c        |   33 +++++++---------
 drivers/staging/lustre/lustre/llite/statahead.c    |    7 ++--
 6 files changed, 35 insertions(+), 68 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dcache.c b/drivers/staging/lustre/lustre/llite/dcache.c
index cc7befe..c5565be 100644
--- a/drivers/staging/lustre/lustre/llite/dcache.c
+++ b/drivers/staging/lustre/lustre/llite/dcache.c
@@ -37,6 +37,7 @@
 #include <linux/fs.h>
 #include <linux/sched.h>
 #include <linux/quotaops.h>
+#include <linux/kernel.h>
 
 #define DEBUG_SUBSYSTEM S_LLITE
 
@@ -176,7 +177,7 @@ static int ll_ddelete(const struct dentry *de)
 	return 0;
 }
 
-static int ll_set_dd(struct dentry *de)
+int ll_d_init(struct dentry *de)
 {
 	LASSERT(de != NULL);
 
@@ -190,40 +191,22 @@ static int ll_set_dd(struct dentry *de)
 		OBD_ALLOC_PTR(lld);
 		if (likely(lld != NULL)) {
 			spin_lock(&de->d_lock);
-			if (likely(de->d_fsdata == NULL))
+			if (likely(de->d_fsdata == NULL)) {
 				de->d_fsdata = lld;
-			else
+				__d_lustre_invalidate(de);
+			} else {
 				OBD_FREE_PTR(lld);
+			}
 			spin_unlock(&de->d_lock);
 		} else {
 			return -ENOMEM;
 		}
 	}
+	LASSERT(de->d_op == &ll_d_ops);
 
 	return 0;
 }
 
-int ll_dops_init(struct dentry *de, int block, int init_sa)
-{
-	struct ll_dentry_data *lld = ll_d2d(de);
-	int rc = 0;
-
-	if (lld == NULL && block != 0) {
-		rc = ll_set_dd(de);
-		if (rc)
-			return rc;
-
-		lld = ll_d2d(de);
-	}
-
-	if (lld != NULL && init_sa != 0)
-		lld->lld_sa_generation = 0;
-
-	/* kernel >= 2.6.38 d_op is set in d_alloc() */
-	LASSERT(de->d_op == &ll_d_ops);
-	return rc;
-}
-
 void ll_intent_drop_lock(struct lookup_intent *it)
 {
 	if (it->it_op && it->d.lustre.it_lock_mode) {
@@ -358,6 +341,8 @@ int ll_revalidate_it(struct dentry *de, int lookup_flags,
 	CDEBUG(D_VFSTRACE, "VFS Op:name=%s,intent=%s\n", de->d_name.name,
 	       LL_IT2STR(it));
 
+	LASSERT(de != de->d_sb->s_root);
+
 	if (de->d_inode == NULL) {
 		__u64 ibits;
 
@@ -382,14 +367,6 @@ int ll_revalidate_it(struct dentry *de, int lookup_flags,
 	if (d_mountpoint(de))
 		GOTO(out_sa, rc = 1);
 
-	/* need to get attributes in case root got changed from other client */
-	if (de == de->d_sb->s_root) {
-		rc = __ll_inode_revalidate_it(de, it, MDS_INODELOCK_LOOKUP);
-		if (rc == 0)
-			rc = 1;
-		GOTO(out_sa, rc);
-	}
-
 	exp = ll_i2mdexp(de->d_inode);
 
 	OBD_FAIL_TIMEOUT(OBD_FAIL_MDC_REVALIDATE_PAUSE, 5);
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 5028c3b..37e33ab 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -816,7 +816,7 @@ int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 
 /* llite/dcache.c */
 
-int ll_dops_init(struct dentry *de, int block, int init_sa);
+int ll_d_init(struct dentry *de);
 extern struct dentry_operations ll_d_ops;
 void ll_intent_drop_lock(struct lookup_intent *);
 void ll_intent_release(struct lookup_intent *);
@@ -1308,7 +1308,8 @@ ll_statahead_mark(struct inode *dir, struct dentry *dentry)
 	if (lli->lli_opendir_pid != current_pid())
 		return;
 
-	if (sai != NULL && ldd != NULL)
+	LASSERT(ldd != NULL);
+	if (sai != NULL)
 		ldd->lld_sa_generation = sai->sai_generation;
 }
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 3b7e022..d15f99c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -154,11 +154,6 @@ void ll_free_sbi(struct super_block *sb)
 	}
 }
 
-static struct dentry_operations ll_d_root_ops = {
-	.d_compare = ll_dcompare,
-	.d_revalidate = ll_revalidate_nd,
-};
-
 static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				    struct vfsmount *mnt)
 {
@@ -577,10 +572,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 		GOTO(out_root, err = -ENOMEM);
 	}
 
-	/* kernel >= 2.6.38 store dentry operations in sb->s_d_op. */
-	d_set_d_op(sb->s_root, &ll_d_root_ops);
-	sb->s_d_op = &ll_d_ops;
-
 	sbi->ll_sdev_orig = sb->s_dev;
 
 	/* We set sb->s_dev equal on all lustre clients in order to support
@@ -1011,6 +1002,8 @@ int ll_fill_super(struct super_block *sb, struct vfsmount *mnt)
 		GOTO(out_free, err);
 
 	sb->s_bdi = &lsi->lsi_bdi;
+	/* kernel >= 2.6.38 store dentry operations in sb->s_d_op. */
+	sb->s_d_op = &ll_d_ops;
 
 	/* Generate a string unique to this super, in case some joker tries
 	   to mount the same fs at two mount points.
diff --git a/drivers/staging/lustre/lustre/llite/llite_nfs.c b/drivers/staging/lustre/lustre/llite/llite_nfs.c
index 1767c74..3580069 100644
--- a/drivers/staging/lustre/lustre/llite/llite_nfs.c
+++ b/drivers/staging/lustre/lustre/llite/llite_nfs.c
@@ -167,10 +167,10 @@ ll_iget_for_nfs(struct super_block *sb, struct lu_fid *fid, struct lu_fid *paren
 	}
 
 	result = d_obtain_alias(inode);
-	if (IS_ERR(result))
+	if (IS_ERR(result)) {
+		iput(inode);
 		return result;
-
-	ll_dops_init(result, 1, 0);
+	}
 
 	return result;
 }
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index dfcfbb8..c760059 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -398,11 +398,16 @@ static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
 struct dentry *ll_splice_alias(struct inode *inode, struct dentry *de)
 {
 	struct dentry *new;
+	int rc;
 
 	if (inode) {
 		new = ll_find_alias(inode, de);
 		if (new) {
-			ll_dops_init(new, 1, 1);
+			rc = ll_d_init(new);
+			if (rc < 0) {
+				dput(new);
+				return ERR_PTR(rc);
+			}
 			d_move(new, de);
 			iput(inode);
 			CDEBUG(D_DENTRY,
@@ -411,8 +416,9 @@ struct dentry *ll_splice_alias(struct inode *inode, struct dentry *de)
 			return new;
 		}
 	}
-	ll_dops_init(de, 1, 1);
-	__d_lustre_invalidate(de);
+	rc = ll_d_init(de);
+	if (rc < 0)
+		return ERR_PTR(rc);
 	d_add(de, inode);
 	CDEBUG(D_DENTRY, "Add dentry %p inode %p refc %d flags %#x\n",
 	       de, de->d_inode, d_count(de), de->d_flags);
@@ -453,8 +459,11 @@ int ll_lookup_it_finish(struct ptlrpc_request *request,
 	/* Only hash *de if it is unhashed (new dentry).
 	 * Atoimc_open may passin hashed dentries for open.
 	 */
-	if (d_unhashed(*de))
+	if (d_unhashed(*de)) {
 		*de = ll_splice_alias(inode, *de);
+		if (IS_ERR(*de))
+			return PTR_ERR(*de);
+	}
 
 	if (!it_disposition(it, DISP_LOOKUP_NEG)) {
 		/* we have lookup look - unhide dentry */
@@ -503,16 +512,6 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 
 	ll_frob_intent(&it, &lookup_it);
 
-	/* As do_lookup is called before follow_mount, root dentry may be left
-	 * not valid, revalidate it here. */
-	if (parent->i_sb->s_root && (parent->i_sb->s_root->d_inode == parent) &&
-	    (it->it_op & (IT_OPEN | IT_CREAT))) {
-		rc = ll_inode_revalidate_it(parent->i_sb->s_root, it,
-					    MDS_INODELOCK_LOOKUP);
-		if (rc)
-			return ERR_PTR(rc);
-	}
-
 	if (it->it_op == IT_GETATTR) {
 		rc = ll_statahead_enter(parent, &dentry, 0);
 		if (rc == 1) {
@@ -582,12 +581,8 @@ static struct dentry *ll_lookup_nd(struct inode *parent, struct dentry *dentry,
 	       parent->i_generation, parent, flags);
 
 	/* Optimize away (CREATE && !OPEN). Let .create handle the race. */
-	if ((flags & LOOKUP_CREATE ) && !(flags & LOOKUP_OPEN)) {
-		ll_dops_init(dentry, 1, 1);
-		__d_lustre_invalidate(dentry);
-		d_add(dentry, NULL);
+	if ((flags & LOOKUP_CREATE) && !(flags & LOOKUP_OPEN))
 		return NULL;
-	}
 
 	if (flags & (LOOKUP_PARENT|LOOKUP_OPEN|LOOKUP_CREATE))
 		itp = NULL;
diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c
index f6b5f4b9..183b415 100644
--- a/drivers/staging/lustre/lustre/llite/statahead.c
+++ b/drivers/staging/lustre/lustre/llite/statahead.c
@@ -877,9 +877,6 @@ static int do_sa_revalidate(struct inode *dir, struct ll_sa_entry *entry,
 	if (d_mountpoint(dentry))
 		return 1;
 
-	if (unlikely(dentry == dentry->d_sb->s_root))
-		return 1;
-
 	entry->se_inode = igrab(inode);
 	rc = md_revalidate_lock(ll_i2mdexp(dir), &it, ll_inode2fid(inode),NULL);
 	if (rc == 1) {
@@ -1590,6 +1587,10 @@ int do_statahead_enter(struct inode *dir, struct dentry **dentryp,
 				if ((*dentryp)->d_inode == NULL) {
 					*dentryp = ll_splice_alias(inode,
 								   *dentryp);
+					if (IS_ERR(*dentryp)) {
+						ll_sai_unplug(sai, entry);
+						return PTR_ERR(*dentryp);
+					}
 				} else if ((*dentryp)->d_inode != inode) {
 					/* revalidate, but inode is recreated */
 					CDEBUG(D_READA,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 36/40] staging/lustre/llite: pass correct pointer to obd_iocontrol()
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (34 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 35/40] staging/lustre/llite: remove ll_d_root_ops Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 37/40] staging/lustre/idl: remove LASSERT/CLASSERT from lustre_idl.h Peng Tao
                   ` (4 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, John L. Hammond, Peng Tao, Andreas Dilger

From: "John L. Hammond" <john.hammond@intel.com>

In copy_and_ioctl() use the kernel space copy as the karg to
obd_iocontrol().

Lustre-change: http://review.whamcloud.com/6274
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3283
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/llite/dir.c |   23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 22d0acc9..1b217c8 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1048,20 +1048,25 @@ progress:
 }
 
 
-static int copy_and_ioctl(int cmd, struct obd_export *exp, void *data, int len)
+static int copy_and_ioctl(int cmd, struct obd_export *exp,
+			  const void __user *data, size_t size)
 {
-	void *ptr;
+	void *copy;
 	int rc;
 
-	OBD_ALLOC(ptr, len);
-	if (ptr == NULL)
+	OBD_ALLOC(copy, size);
+	if (copy == NULL)
 		return -ENOMEM;
-	if (copy_from_user(ptr, data, len)) {
-		OBD_FREE(ptr, len);
-		return -EFAULT;
+
+	if (copy_from_user(copy, data, size)) {
+		rc = -EFAULT;
+		goto out;
 	}
-	rc = obd_iocontrol(cmd, exp, len, data, NULL);
-	OBD_FREE(ptr, len);
+
+	rc = obd_iocontrol(cmd, exp, size, copy, NULL);
+out:
+	OBD_FREE(copy, size);
+
 	return rc;
 }
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 37/40] staging/lustre/idl: remove LASSERT/CLASSERT from lustre_idl.h
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (35 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 36/40] staging/lustre/llite: pass correct pointer to obd_iocontrol() Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 38/40] staging/lustre/mgs: set_param -P option that sets value permanently Peng Tao
                   ` (3 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Andreas Dilger, John L. Hammond, Peng Tao, Andreas Dilger

From: Andreas Dilger <adilger@whamcloud.com>

Remove the usage of LASSERT() and CLASSERT() from lustre_idl.h, so
that it is usable from userspace programs if needed.  These have
crept in over the years, but are not intended to be there.

The CLASSERT() checks for fid swabbing were largely redundant, and
have been moved to lustre/fid/fid_handler.c.

The fid_is_sane() macro has been modified to remove the check that
f_ver == 0, since this will eventually not be true, and the client
or server should not crash if it sees such a FID during usage.

There are still a few LASSERTs that need to be removed when FID-on-OST
is landed, but I don't want to remove them before that code lands.

There are also uses of CERROR() that could be removed at that time.

Lustre-change: http://review.whamcloud.com/5682
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1606
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   43 ++++----------------
 1 file changed, 8 insertions(+), 35 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 5505cc5..9d3bc29 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -91,8 +91,8 @@
 #ifndef _LUSTRE_IDL_H_
 #define _LUSTRE_IDL_H_
 
-#if !defined(LASSERT) && !defined(LPU64)
-#include <linux/libcfs/libcfs.h> /* for LASSERT, LPUX64, etc */
+#if !defined(LPU64)
+#include <linux/libcfs/libcfs.h> /* for LPUX64, etc */
 #endif
 
 /* Defn's shared with user-space. */
@@ -231,7 +231,6 @@ static inline unsigned fld_range_is_any(const struct lu_seq_range *range)
 static inline void fld_range_set_type(struct lu_seq_range *range,
 				      unsigned flags)
 {
-	LASSERT(!(flags & ~LU_SEQ_RANGE_MASK));
 	range->lsr_flags |= flags;
 }
 
@@ -615,7 +614,6 @@ static inline obd_id fid_idif_id(obd_seq seq, __u32 oid, __u32 ver)
 /* extract ost index from IDIF FID */
 static inline __u32 fid_idif_ost_idx(const struct lu_fid *fid)
 {
-	LASSERT(fid_is_idif(fid));
 	return (fid_seq(fid) >> 16) & 0xffff;
 }
 
@@ -833,11 +831,6 @@ static inline void lu_igif_build(struct lu_fid *fid, __u32 ino, __u32 gen)
  */
 static inline void fid_cpu_to_le(struct lu_fid *dst, const struct lu_fid *src)
 {
-	/* check that all fields are converted */
-	CLASSERT(sizeof(*src) ==
-		 sizeof(fid_seq(src)) +
-		 sizeof(fid_oid(src)) +
-		 sizeof(fid_ver(src)));
 	dst->f_seq = cpu_to_le64(fid_seq(src));
 	dst->f_oid = cpu_to_le32(fid_oid(src));
 	dst->f_ver = cpu_to_le32(fid_ver(src));
@@ -845,11 +838,6 @@ static inline void fid_cpu_to_le(struct lu_fid *dst, const struct lu_fid *src)
 
 static inline void fid_le_to_cpu(struct lu_fid *dst, const struct lu_fid *src)
 {
-	/* check that all fields are converted */
-	CLASSERT(sizeof(*src) ==
-		 sizeof(fid_seq(src)) +
-		 sizeof(fid_oid(src)) +
-		 sizeof(fid_ver(src)));
 	dst->f_seq = le64_to_cpu(fid_seq(src));
 	dst->f_oid = le32_to_cpu(fid_oid(src));
 	dst->f_ver = le32_to_cpu(fid_ver(src));
@@ -857,11 +845,6 @@ static inline void fid_le_to_cpu(struct lu_fid *dst, const struct lu_fid *src)
 
 static inline void fid_cpu_to_be(struct lu_fid *dst, const struct lu_fid *src)
 {
-	/* check that all fields are converted */
-	CLASSERT(sizeof(*src) ==
-		 sizeof(fid_seq(src)) +
-		 sizeof(fid_oid(src)) +
-		 sizeof(fid_ver(src)));
 	dst->f_seq = cpu_to_be64(fid_seq(src));
 	dst->f_oid = cpu_to_be32(fid_oid(src));
 	dst->f_ver = cpu_to_be32(fid_ver(src));
@@ -869,11 +852,6 @@ static inline void fid_cpu_to_be(struct lu_fid *dst, const struct lu_fid *src)
 
 static inline void fid_be_to_cpu(struct lu_fid *dst, const struct lu_fid *src)
 {
-	/* check that all fields are converted */
-	CLASSERT(sizeof(*src) ==
-		 sizeof(fid_seq(src)) +
-		 sizeof(fid_oid(src)) +
-		 sizeof(fid_ver(src)));
 	dst->f_seq = be64_to_cpu(fid_seq(src));
 	dst->f_oid = be32_to_cpu(fid_oid(src));
 	dst->f_ver = be32_to_cpu(fid_ver(src));
@@ -897,11 +875,6 @@ extern void lustre_swab_lu_seq_range(struct lu_seq_range *range);
 
 static inline int lu_fid_eq(const struct lu_fid *f0, const struct lu_fid *f1)
 {
-	/* Check that there is no alignment padding. */
-	CLASSERT(sizeof(*f0) ==
-		 sizeof(f0->f_seq) +
-		 sizeof(f0->f_oid) +
-		 sizeof(f0->f_ver));
 	return memcmp(f0, f1, sizeof(*f0)) == 0;
 }
 
@@ -3315,9 +3288,10 @@ struct obdo {
 #define o_grant_used o_data_version
 
 static inline void lustre_set_wire_obdo(struct obd_connect_data *ocd,
-					struct obdo *wobdo, struct obdo *lobdo)
+					struct obdo *wobdo,
+					const struct obdo *lobdo)
 {
-	memcpy(wobdo, lobdo, sizeof(*lobdo));
+	*wobdo = *lobdo;
 	wobdo->o_flags &= ~OBD_FL_LOCAL_MASK;
 	if (ocd == NULL)
 		return;
@@ -3332,16 +3306,15 @@ static inline void lustre_set_wire_obdo(struct obd_connect_data *ocd,
 }
 
 static inline void lustre_get_wire_obdo(struct obd_connect_data *ocd,
-					struct obdo *lobdo, struct obdo *wobdo)
+					struct obdo *lobdo,
+					const struct obdo *wobdo)
 {
 	obd_flag local_flags = 0;
 
 	if (lobdo->o_valid & OBD_MD_FLFLAGS)
 		 local_flags = lobdo->o_flags & OBD_FL_LOCAL_MASK;
 
-	LASSERT(!(wobdo->o_flags & OBD_FL_LOCAL_MASK));
-
-	memcpy(lobdo, wobdo, sizeof(*lobdo));
+	*lobdo = *wobdo;
 	if (local_flags != 0) {
 		lobdo->o_valid |= OBD_MD_FLFLAGS;
 		lobdo->o_flags &= ~OBD_FL_LOCAL_MASK;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 38/40] staging/lustre/mgs: set_param -P option that sets value permanently
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (36 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 37/40] staging/lustre/idl: remove LASSERT/CLASSERT from lustre_idl.h Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 39/40] staging/lustre/utils: HSM Posix CopyTool Peng Tao
                   ` (2 subsequent siblings)
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Artem Blagodarenko, Peng Tao, Andreas Dilger

From: Artem Blagodarenko <artem_blagodarenko@xyratex.com>

set_param and conf_param have different syntaxes. Also conf_param
has unimplemented paths and no wildcarding support.

This patch adds set_param -P option, that replaces the whole
conf_param "direct" proc access with a simple upcall-type mechanism
from the MGC. Option conf_param is saved now for compatibility.

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3155
Lustre-change: http://review.whamcloud.com/6025
Signed-off-by: Artem Blagodarenko <artem_blagodarenko@xyratex.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lustre/include/lustre_cfg.h |    2 +
 .../staging/lustre/lustre/include/lustre_disk.h    |    2 +
 drivers/staging/lustre/lustre/include/obd_class.h  |    9 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    5 +-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |   89 +++++++++++++++++---
 .../staging/lustre/lustre/obdclass/obd_config.c    |   50 +++++++++--
 6 files changed, 134 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_cfg.h b/drivers/staging/lustre/lustre/include/lustre_cfg.h
index e14a5f6..3680668 100644
--- a/drivers/staging/lustre/lustre/include/lustre_cfg.h
+++ b/drivers/staging/lustre/lustre/include/lustre_cfg.h
@@ -88,6 +88,8 @@ enum lcfg_command_type {
 	LCFG_SET_LDLM_TIMEOUT   = 0x00ce030, /**< set ldlm_timeout */
 	LCFG_PRE_CLEANUP	= 0x00cf031, /**< call type-specific pre
 					      * cleanup cleanup */
+	LCFG_SET_PARAM		= 0x00ce032, /**< use set_param syntax to set
+					      *a proc parameters */
 };
 
 struct lustre_cfg_bufs {
diff --git a/drivers/staging/lustre/lustre/include/lustre_disk.h b/drivers/staging/lustre/lustre/include/lustre_disk.h
index 9228b16..e714fde 100644
--- a/drivers/staging/lustre/lustre/include/lustre_disk.h
+++ b/drivers/staging/lustre/lustre/include/lustre_disk.h
@@ -98,6 +98,8 @@
 #define LDD_F_IR_CAPABLE    0x2000
 /** the MGS refused to register the target. */
 #define LDD_F_ERROR	 0x4000
+/** process at lctl conf_param */
+#define LDD_F_PARAM2		0x8000
 
 /* opc for target register */
 #define LDD_F_OPC_REG   0x10000000
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 983718f..1c2ba19 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -175,9 +175,13 @@ enum {
 	CONFIG_T_CONFIG  = 0,
 	CONFIG_T_SPTLRPC = 1,
 	CONFIG_T_RECOVER = 2,
-	CONFIG_T_MAX     = 3
+	CONFIG_T_PARAMS  = 3,
+	CONFIG_T_MAX     = 4
 };
 
+#define PARAMS_FILENAME	"params"
+#define LCTL_UPCALL	"lctl"
+
 /* list of active configuration logs  */
 struct config_llog_data {
 	struct ldlm_res_id	  cld_resid;
@@ -185,7 +189,8 @@ struct config_llog_data {
 	struct list_head		  cld_list_chain;
 	atomic_t		cld_refcount;
 	struct config_llog_data    *cld_sptlrpc;/* depended sptlrpc log */
-	struct config_llog_data    *cld_recover;    /* imperative recover log */
+	struct config_llog_data	   *cld_params;	/* common parameters log */
+	struct config_llog_data    *cld_recover;/* imperative recover log */
 	struct obd_export	  *cld_mgcexp;
 	struct mutex		    cld_lock;
 	int			 cld_type;
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index d15f99c..93c67e5 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1058,7 +1058,7 @@ out_free:
 
 void ll_put_super(struct super_block *sb)
 {
-	struct config_llog_instance cfg;
+	struct config_llog_instance cfg, params_cfg;
 	struct obd_device *obd;
 	struct lustre_sb_info *lsi = s2lsi(sb);
 	struct ll_sb_info *sbi = ll_s2sbi(sb);
@@ -1072,6 +1072,9 @@ void ll_put_super(struct super_block *sb)
 	cfg.cfg_instance = sb;
 	lustre_end_log(sb, profilenm, &cfg);
 
+	params_cfg.cfg_instance = sb;
+	lustre_end_log(sb, PARAMS_FILENAME, &params_cfg);
+
 	if (sbi->ll_md_exp) {
 		obd = class_exp2obd(sbi->ll_md_exp);
 		if (obd)
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 9325f91..f4ecd29 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -56,7 +56,7 @@ static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
 {
 	__u64 resname = 0;
 
-	if (len > 8) {
+	if (len > sizeof(resname)) {
 		CERROR("name too long: %s\n", name);
 		return -EINVAL;
 	}
@@ -76,6 +76,7 @@ static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
 		resname = 0;
 		break;
 	case CONFIG_T_RECOVER:
+	case CONFIG_T_PARAMS:
 		resname = type;
 		break;
 	default:
@@ -101,10 +102,13 @@ int mgc_logname2resid(char *logname, struct ldlm_res_id *res_id, int type)
 	int len;
 
 	/* logname consists of "fsname-nodetype".
-	 * e.g. "lustre-MDT0001", "SUN-000-client" */
+	 * e.g. "lustre-MDT0001", "SUN-000-client"
+	 * there is an exception: llog "params" */
 	name_end = strrchr(logname, '-');
-	LASSERT(name_end);
-	len = name_end - logname;
+	if (!name_end)
+		len = strlen(logname);
+	else
+		len = name_end - logname;
 	return mgc_name2resid(logname, len, res_id, type);
 }
 
@@ -140,6 +144,8 @@ static void config_log_put(struct config_llog_data *cld)
 			config_log_put(cld->cld_recover);
 		if (cld->cld_sptlrpc)
 			config_log_put(cld->cld_sptlrpc);
+		if (cld->cld_params)
+			config_log_put(cld->cld_params);
 		if (cld_is_sptlrpc(cld))
 			sptlrpc_conf_log_stop(cld->cld_logname);
 
@@ -271,6 +277,19 @@ static struct config_llog_data *config_recover_log_add(struct obd_device *obd,
 	return cld;
 }
 
+static struct config_llog_data *config_params_log_add(struct obd_device *obd,
+	struct config_llog_instance *cfg, struct super_block *sb)
+{
+	struct config_llog_instance	lcfg = *cfg;
+	struct config_llog_data		*cld;
+
+	lcfg.cfg_instance = sb;
+
+	cld = do_config_log_add(obd, PARAMS_FILENAME, CONFIG_T_PARAMS,
+				&lcfg, sb);
+
+	return cld;
+}
 
 /** Add this log to the list of active logs watched by an MGC.
  * Active means we're watching for updates.
@@ -284,8 +303,10 @@ static int config_log_add(struct obd_device *obd, char *logname,
 	struct lustre_sb_info *lsi = s2lsi(sb);
 	struct config_llog_data *cld;
 	struct config_llog_data *sptlrpc_cld;
-	char		     seclogname[32];
-	char		    *ptr;
+	struct config_llog_data *params_cld;
+	char			seclogname[32];
+	char			*ptr;
+	int			rc;
 
 	CDEBUG(D_MGC, "adding config log %s:%p\n", logname, cfg->cfg_instance);
 
@@ -308,32 +329,49 @@ static int config_log_add(struct obd_device *obd, char *logname,
 						CONFIG_T_SPTLRPC, NULL, NULL);
 		if (IS_ERR(sptlrpc_cld)) {
 			CERROR("can't create sptlrpc log: %s\n", seclogname);
-			return PTR_ERR(sptlrpc_cld);
+			GOTO(out_err, rc = PTR_ERR(sptlrpc_cld));
 		}
 	}
+	params_cld = config_params_log_add(obd, cfg, sb);
+	if (IS_ERR(params_cld)) {
+		rc = PTR_ERR(params_cld);
+		CERROR("%s: can't create params log: rc = %d\n",
+		       obd->obd_name, rc);
+		GOTO(out_err1, rc);
+	}
 
 	cld = do_config_log_add(obd, logname, CONFIG_T_CONFIG, cfg, sb);
 	if (IS_ERR(cld)) {
 		CERROR("can't create log: %s\n", logname);
-		config_log_put(sptlrpc_cld);
-		return PTR_ERR(cld);
+		GOTO(out_err2, rc = PTR_ERR(cld));
 	}
 
 	cld->cld_sptlrpc = sptlrpc_cld;
+	cld->cld_params = params_cld;
 
 	LASSERT(lsi->lsi_lmd);
 	if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR)) {
 		struct config_llog_data *recover_cld;
 		*strrchr(seclogname, '-') = 0;
 		recover_cld = config_recover_log_add(obd, seclogname, cfg, sb);
-		if (IS_ERR(recover_cld)) {
-			config_log_put(cld);
-			return PTR_ERR(recover_cld);
-		}
+		if (IS_ERR(recover_cld))
+			GOTO(out_err3, rc = PTR_ERR(recover_cld));
 		cld->cld_recover = recover_cld;
 	}
 
 	return 0;
+
+out_err3:
+	config_log_put(cld);
+
+out_err2:
+	config_log_put(params_cld);
+
+out_err1:
+	config_log_put(sptlrpc_cld);
+
+out_err:
+	return rc;
 }
 
 DEFINE_MUTEX(llog_process_lock);
@@ -344,6 +382,7 @@ static int config_log_end(char *logname, struct config_llog_instance *cfg)
 {
 	struct config_llog_data *cld;
 	struct config_llog_data *cld_sptlrpc = NULL;
+	struct config_llog_data *cld_params = NULL;
 	struct config_llog_data *cld_recover = NULL;
 	int rc = 0;
 
@@ -382,11 +421,20 @@ static int config_log_end(char *logname, struct config_llog_instance *cfg)
 	spin_lock(&config_list_lock);
 	cld_sptlrpc = cld->cld_sptlrpc;
 	cld->cld_sptlrpc = NULL;
+	cld_params = cld->cld_params;
+	cld->cld_params = NULL;
 	spin_unlock(&config_list_lock);
 
 	if (cld_sptlrpc)
 		config_log_put(cld_sptlrpc);
 
+	if (cld_params) {
+		mutex_lock(&cld_params->cld_lock);
+		cld_params->cld_stopping = 1;
+		mutex_unlock(&cld_params->cld_lock);
+		config_log_put(cld_params);
+	}
+
 	/* drop the ref from the find */
 	config_log_put(cld);
 	/* drop the start ref */
@@ -1657,7 +1705,7 @@ static int mgc_process_cfg_log(struct obd_device *mgc,
 				LCONSOLE_ERROR_MSG(0x13a, "Failed to get MGS "
 						   "log %s and no local copy."
 						   "\n", cld->cld_logname);
-				GOTO(out_pop, rc = -ENOTCONN);
+				GOTO(out_pop, rc = -ENOENT);
 			}
 			CDEBUG(D_MGC, "Failed to get MGS log %s, using local "
 			       "copy for now, will try to update later.\n",
@@ -1856,6 +1904,19 @@ static int mgc_process_config(struct obd_device *obd, obd_count len, void *buf)
 			if (rc)
 				CERROR("Cannot process recover llog %d\n", rc);
 		}
+
+		if (rc == 0 && cld->cld_params != NULL) {
+			rc = mgc_process_log(obd, cld->cld_params);
+			if (rc == -ENOENT) {
+				CDEBUG(D_MGC, "There is no params"
+					      "config file yet\n");
+				rc = 0;
+			}
+			/* params log is optional */
+			if (rc)
+				CERROR("%s: can't process params llog: rc = %d\n",
+				       obd->obd_name, rc);
+		}
 		config_log_put(cld);
 
 		break;
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c b/drivers/staging/lustre/lustre/obdclass/obd_config.c
index 362ae54..abb519a 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
@@ -1027,6 +1027,45 @@ struct lustre_cfg *lustre_cfg_rename(struct lustre_cfg *cfg,
 }
 EXPORT_SYMBOL(lustre_cfg_rename);
 
+static int process_param2_config(struct lustre_cfg *lcfg)
+{
+	char *param = lustre_cfg_string(lcfg, 1);
+	char *upcall = lustre_cfg_string(lcfg, 2);
+	char *argv[] = {
+		[0] = "/usr/sbin/lctl",
+		[1] = "set_param",
+		[2] = param,
+		[3] = NULL
+	};
+	struct timeval	start;
+	struct timeval	end;
+	int		rc;
+
+
+	/* Add upcall processing here. Now only lctl is supported */
+	if (strcmp(upcall, LCTL_UPCALL) != 0) {
+		CERROR("Unsupported upcall %s\n", upcall);
+		return -EINVAL;
+	}
+
+	do_gettimeofday(&start);
+	rc = USERMODEHELPER(argv[0], argv, NULL);
+	do_gettimeofday(&end);
+
+	if (rc < 0) {
+		CERROR("lctl: error invoking upcall %s %s %s: rc = %d; "
+		       "time %ldus\n", argv[0], argv[1], argv[2], rc,
+		       cfs_timeval_sub(&end, &start, NULL));
+	} else {
+		CDEBUG(D_HA, "lctl: invoked upcall %s %s %s, time %ldus\n",
+		       argv[0], argv[1], argv[2],
+		       cfs_timeval_sub(&end, &start, NULL));
+		       rc = 0;
+	}
+
+	return rc;
+}
+
 void lustre_register_quota_process_config(int (*qpc)(struct lustre_cfg *lcfg))
 {
 	quota_process_config = qpc;
@@ -1142,11 +1181,14 @@ int class_process_config(struct lustre_cfg *lcfg)
 			err = (*quota_process_config)(lcfg);
 			GOTO(out, err);
 		}
-		/* Fall through */
+
 		break;
 	}
+	case LCFG_SET_PARAM: {
+		err = process_param2_config(lcfg);
+		GOTO(out, 0);
+	}
 	}
-
 	/* Commands that require a device */
 	obd = class_name2obd(lustre_cfg_string(lcfg, 0));
 	if (obd == NULL) {
@@ -1183,24 +1225,20 @@ int class_process_config(struct lustre_cfg *lcfg)
 	case LCFG_POOL_NEW: {
 		err = obd_pool_new(obd, lustre_cfg_string(lcfg, 2));
 		GOTO(out, err = 0);
-		break;
 	}
 	case LCFG_POOL_ADD: {
 		err = obd_pool_add(obd, lustre_cfg_string(lcfg, 2),
 				   lustre_cfg_string(lcfg, 3));
 		GOTO(out, err = 0);
-		break;
 	}
 	case LCFG_POOL_REM: {
 		err = obd_pool_rem(obd, lustre_cfg_string(lcfg, 2),
 				   lustre_cfg_string(lcfg, 3));
 		GOTO(out, err = 0);
-		break;
 	}
 	case LCFG_POOL_DEL: {
 		err = obd_pool_del(obd, lustre_cfg_string(lcfg, 2));
 		GOTO(out, err = 0);
-		break;
 	}
 	default: {
 		err = obd_process_config(obd, sizeof(*lcfg), lcfg);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 39/40] staging/lustre/utils: HSM Posix CopyTool
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (37 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 38/40] staging/lustre/mgs: set_param -P option that sets value permanently Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-14 16:13 ` [PATCH 40/40] staging/lustre/ptlrpc: flock deadlock detection does not work Peng Tao
  2013-11-15  3:59 ` [PATCH 00/40] staging/lustre: patch bomb 1 Greg Kroah-Hartman
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, JC Lafoucriere, Henri Doreau, Peng Tao, Andreas Dilger

From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>

POSIX HSM CopyTool utils named lhsmtool_posix.
This user space command is the 'glue" between Lustre-HSM
and a POSIX filesytem used as a backend.
The main functionalities implemented are:
daemon mode:
- archive: read in lustre write with POSIX backend
- restore: read in POSIX backend write to Lustre
- remove: remove an entry from backend
cmd line mode:
- import: create in lustre a released file from a backend file
- rebind: change the FID associated to a file in the backend
- maxseq: get the larger sequence of FID found in the backend

The 2 last options are used for disaster recovery mode

This tools is also used for all the non regression tests made in
sanity-hsm.sh

Lustre-change: http://review.whamcloud.com/4737
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2062
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |    9 ++---
 .../lustre/lustre/include/lustre/lustreapi.h       |   36 +++++++++++++-------
 2 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 49ab5be..c63a1ae 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -311,6 +311,9 @@ struct ost_id {
 #define LOV_ALL_STRIPES       0xffff /* only valid for directories */
 #define LOV_V1_INSANE_STRIPE_COUNT 65532 /* maximum stripe count bz13933 */
 
+#define XATTR_LUSTRE_PREFIX	"lustre."
+#define XATTR_LUSTRE_LOV	XATTR_LUSTRE_PREFIX"lov"
+
 #define lov_user_ost_data lov_user_ost_data_v1
 struct lov_user_ost_data_v1 {     /* per-stripe data structure */
 	struct ost_id l_ost_oi;	  /* OST object ID */
@@ -1158,12 +1161,6 @@ struct hsm_progress {
 	__u32			padding;
 };
 
-/**
- * Use by copytool during any hsm request they handled.
- * This structure is initialized by llapi_hsm_copy_start()
- * which is an helper over the ioctl() interface
- * Store Lustre, internal use only, data.
- */
 struct hsm_copy {
 	__u64			hc_data_version;
 	__u16			hc_flags;
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustreapi.h b/drivers/staging/lustre/lustre/include/lustre/lustreapi.h
index 1748138..0bb3c9f 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustreapi.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustreapi.h
@@ -219,8 +219,8 @@ extern int llapi_lmv_get_uuids(int fd, struct obd_uuid *uuidp, int *mdt_count);
 extern int llapi_is_lustre_mnttype(const char *type);
 extern int llapi_search_ost(char *fsname, char *poolname, char *ostname);
 extern int llapi_get_obd_count(char *mnt, int *count, int is_mdt);
-extern int parse_size(char *optarg, unsigned long long *size,
-		      unsigned long long *size_units, int bytes_spec);
+extern int llapi_parse_size(const char *optarg, unsigned long long *size,
+			    unsigned long long *size_units, int bytes_spec);
 extern int llapi_search_mounts(const char *pathname, int index,
 			       char *mntdir, char *fsname);
 extern int llapi_search_fsname(const char *pathname, char *fsname);
@@ -298,18 +298,27 @@ extern int llapi_changelog_clear(const char *mdtname, const char *idstr,
  * priv is private state, managed internally by these functions
  */
 struct hsm_copytool_private;
-extern int llapi_hsm_copytool_start(struct hsm_copytool_private **priv,
-				    char *fsname, int flags,
-				    int archive_count, int *archives);
-extern int llapi_hsm_copytool_fini(struct hsm_copytool_private **priv);
+struct hsm_copyaction_private;
+
+extern int llapi_hsm_copytool_register(struct hsm_copytool_private **priv,
+				       const char *mnt, int flags,
+				       int archive_count, int *archives);
+extern int llapi_hsm_copytool_unregister(struct hsm_copytool_private **priv);
 extern int llapi_hsm_copytool_recv(struct hsm_copytool_private *priv,
 				   struct hsm_action_list **hal, int *msgsize);
-extern int llapi_hsm_copytool_free(struct hsm_action_list **hal);
-extern int llapi_hsm_copy_start(char *mnt, struct hsm_copy *copy,
-				const struct hsm_action_item *hai);
-extern int llapi_hsm_copy_end(char *mnt, struct hsm_copy *copy,
-			      const struct hsm_progress *hp);
-extern int llapi_hsm_progress(char *mnt, struct hsm_progress *hp);
+extern void llapi_hsm_action_list_free(struct hsm_action_list **hal);
+extern int llapi_hsm_action_begin(struct hsm_copyaction_private **hcp,
+				  const struct hsm_copytool_private *ct_priv,
+				  const struct hsm_action_item *hai,
+				  bool is_error);
+extern int llapi_hsm_action_end(struct hsm_copyaction_private **hcp,
+				const struct hsm_extent *he, int flags,
+				int errval);
+extern int llapi_hsm_action_progress(struct hsm_copyaction_private *hcp,
+				     const struct hsm_extent *he, int hp_flags);
+extern int llapi_hsm_action_get_dfid(const struct hsm_copyaction_private *hcp,
+				     lustre_fid *fid);
+extern int llapi_hsm_action_get_fd(const struct hsm_copyaction_private *hcp);
 extern int llapi_hsm_import(const char *dst, int archive, const struct stat *st,
 			    unsigned long long stripe_size, int stripe_offset,
 			    int stripe_count, int stripe_pattern,
@@ -318,7 +327,8 @@ extern int llapi_hsm_import(const char *dst, int archive, const struct stat *st,
 /* HSM user interface */
 extern struct hsm_user_request *llapi_hsm_user_request_alloc(int itemcount,
 							     int data_len);
-extern int llapi_hsm_request(char *mnt, struct hsm_user_request *request);
+extern int llapi_hsm_request(const char *path,
+			     const struct hsm_user_request *request);
 extern int llapi_hsm_current_action(const char *path,
 				    struct hsm_current_action *hca);
 /** @} llapi */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 40/40] staging/lustre/ptlrpc: flock deadlock detection does not work
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (38 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 39/40] staging/lustre/utils: HSM Posix CopyTool Peng Tao
@ 2013-11-14 16:13 ` Peng Tao
  2013-11-15  3:59 ` [PATCH 00/40] staging/lustre: patch bomb 1 Greg Kroah-Hartman
  40 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Andriy Skulysh, Peng Tao, Andreas Dilger

From: Andriy Skulysh <Andriy_Skulysh@xyratex.com>

Flock deadlocks are checked on the first attempt to grant
the flock only. If we try again to grant it after its
blocking lock is cancelled, we don't check for deadlocks
which also may exist.

Perform deadlock detection during reprocess

Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1715
Lustre-change: http://review.whamcloud.com/3553
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    5 ++-
 .../lustre/lustre/include/lustre_dlm_flags.h       |   14 +++++--
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   41 ++++++++++++++++++--
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    3 +-
 .../lustre/lustre/obdclass/lprocfs_status.c        |    1 +
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |    2 +
 6 files changed, 57 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 9d3bc29..7bdb24f 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1292,6 +1292,8 @@ extern void lustre_swab_ptlrpc_body(struct ptlrpc_body *pb);
 #define OBD_CONNECT_LIGHTWEIGHT 0x1000000000000ULL/* lightweight connection */
 #define OBD_CONNECT_SHORTIO     0x2000000000000ULL/* short io */
 #define OBD_CONNECT_PINGLESS	0x4000000000000ULL/* pings not required */
+#define OBD_CONNECT_FLOCK_DEAD	0x8000000000000ULL/* improved flock deadlock detection */
+
 /* XXX README XXX:
  * Please DO NOT add flag values here before first ensuring that this same
  * flag value is not in use on some other branch.  Please clear any such
@@ -1329,7 +1331,8 @@ extern void lustre_swab_ptlrpc_body(struct ptlrpc_body *pb);
 				OBD_CONNECT_EINPROGRESS | \
 				OBD_CONNECT_LIGHTWEIGHT | OBD_CONNECT_UMASK | \
 				OBD_CONNECT_LVB_TYPE | OBD_CONNECT_LAYOUTLOCK |\
-				OBD_CONNECT_PINGLESS | OBD_CONNECT_MAX_EASIZE)
+				OBD_CONNECT_PINGLESS | OBD_CONNECT_MAX_EASIZE |\
+				OBD_CONNECT_FLOCK_DEAD)
 #define OST_CONNECT_SUPPORTED  (OBD_CONNECT_SRVLOCK | OBD_CONNECT_GRANT | \
 				OBD_CONNECT_REQPORTAL | OBD_CONNECT_VERSION | \
 				OBD_CONNECT_TRUNCLOCK | OBD_CONNECT_INDEX | \
diff --git a/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h b/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h
index a632217..855e18f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h
+++ b/drivers/staging/lustre/lustre/include/lustre_dlm_flags.h
@@ -35,10 +35,10 @@
 #ifndef LDLM_ALL_FLAGS_MASK
 
 /** l_flags bits marked as "all_flags" bits */
-#define LDLM_FL_ALL_FLAGS_MASK          0x00FFFFFFC08F132FULL
+#define LDLM_FL_ALL_FLAGS_MASK          0x00FFFFFFC08F932FULL
 
 /** l_flags bits marked as "ast" bits */
-#define LDLM_FL_AST_MASK                0x0000000080000000ULL
+#define LDLM_FL_AST_MASK                0x0000000080008000ULL
 
 /** l_flags bits marked as "blocked" bits */
 #define LDLM_FL_BLOCKED_MASK            0x000000000000000EULL
@@ -56,7 +56,7 @@
 #define LDLM_FL_LOCAL_ONLY_MASK         0x00FFFFFF00000000ULL
 
 /** l_flags bits marked as "on_wire" bits */
-#define LDLM_FL_ON_WIRE_MASK            0x00000000C08F132FULL
+#define LDLM_FL_ON_WIRE_MASK            0x00000000C08F932FULL
 
 /** extent, mode, or resource changed */
 #define LDLM_FL_LOCK_CHANGED            0x0000000000000001ULL // bit   0
@@ -114,6 +114,12 @@
 #define ldlm_set_has_intent(_l)         LDLM_SET_FLAG((  _l), 1ULL << 12)
 #define ldlm_clear_has_intent(_l)       LDLM_CLEAR_FLAG((_l), 1ULL << 12)
 
+/** flock deadlock detected */
+#define LDLM_FL_FLOCK_DEADLOCK          0x0000000000008000ULL // bit  15
+#define ldlm_is_flock_deadlock(_l)      LDLM_TEST_FLAG(( _l), 1ULL << 15)
+#define ldlm_set_flock_deadlock(_l)     LDLM_SET_FLAG((  _l), 1ULL << 15)
+#define ldlm_clear_flock_deadlock(_l)   LDLM_CLEAR_FLAG((_l), 1ULL << 15)
+
 /** discard (no writeback) on cancel */
 #define LDLM_FL_DISCARD_DATA            0x0000000000010000ULL // bit  16
 #define ldlm_is_discard_data(_l)        LDLM_TEST_FLAG(( _l), 1ULL << 16)
@@ -390,6 +396,7 @@ static int hf_lustre_ldlm_fl_ast_sent            = -1;
 static int hf_lustre_ldlm_fl_replay              = -1;
 static int hf_lustre_ldlm_fl_intent_only         = -1;
 static int hf_lustre_ldlm_fl_has_intent          = -1;
+static int hf_lustre_ldlm_fl_flock_deadlock      = -1;
 static int hf_lustre_ldlm_fl_discard_data        = -1;
 static int hf_lustre_ldlm_fl_no_timeout          = -1;
 static int hf_lustre_ldlm_fl_block_nowait        = -1;
@@ -431,6 +438,7 @@ const value_string lustre_ldlm_flags_vals[] = {
   {LDLM_FL_REPLAY,              "LDLM_FL_REPLAY"},
   {LDLM_FL_INTENT_ONLY,         "LDLM_FL_INTENT_ONLY"},
   {LDLM_FL_HAS_INTENT,          "LDLM_FL_HAS_INTENT"},
+  {LDLM_FL_FLOCK_DEADLOCK,      "LDLM_FL_FLOCK_DEADLOCK"},
   {LDLM_FL_DISCARD_DATA,        "LDLM_FL_DISCARD_DATA"},
   {LDLM_FL_NO_TIMEOUT,          "LDLM_FL_NO_TIMEOUT"},
   {LDLM_FL_BLOCK_NOWAIT,        "LDLM_FL_BLOCK_NOWAIT"},
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
index 182773b..6324eff 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
@@ -205,6 +205,26 @@ ldlm_flock_deadlock(struct ldlm_lock *req, struct ldlm_lock *bl_lock)
 	return 0;
 }
 
+static void ldlm_flock_cancel_on_deadlock(struct ldlm_lock *lock,
+					  struct list_head *work_list)
+{
+	CDEBUG(D_INFO, "reprocess deadlock req=%p\n", lock);
+
+	if ((exp_connect_flags(lock->l_export) &
+				OBD_CONNECT_FLOCK_DEAD) == 0) {
+		CERROR("deadlock found, but client doesn't "
+				"support flock canceliation\n");
+	} else {
+		LASSERT(lock->l_completion_ast);
+		LASSERT((lock->l_flags & LDLM_FL_AST_SENT) == 0);
+		lock->l_flags |= LDLM_FL_AST_SENT | LDLM_FL_CANCEL_ON_BLOCK |
+			LDLM_FL_FLOCK_DEADLOCK;
+		ldlm_flock_blocking_unlink(lock);
+		ldlm_resource_unlink_lock(lock);
+		ldlm_add_ast_work_item(lock, NULL, work_list);
+	}
+}
+
 /**
  * Process a granting attempt for flock lock.
  * Must be called under ns lock held.
@@ -272,6 +292,7 @@ reprocess:
 			}
 		}
 	} else {
+		int reprocess_failed = 0;
 		lockmode_verify(mode);
 
 		/* This loop determines if there are existing locks
@@ -293,8 +314,15 @@ reprocess:
 			if (!ldlm_flocks_overlap(lock, req))
 				continue;
 
-			if (!first_enq)
-				return LDLM_ITER_CONTINUE;
+			if (!first_enq) {
+				reprocess_failed = 1;
+				if (ldlm_flock_deadlock(req, lock)) {
+					ldlm_flock_cancel_on_deadlock(req,
+							work_list);
+					return LDLM_ITER_CONTINUE;
+				}
+				continue;
+			}
 
 			if (*flags & LDLM_FL_BLOCK_NOWAIT) {
 				ldlm_flock_destroy(req, mode, *flags);
@@ -330,6 +358,8 @@ reprocess:
 			*flags |= LDLM_FL_BLOCK_GRANTED;
 			return LDLM_ITER_STOP;
 		}
+		if (reprocess_failed)
+			return LDLM_ITER_CONTINUE;
 	}
 
 	if (*flags & LDLM_FL_TEST_LOCK) {
@@ -646,7 +676,10 @@ granted:
 	/* ldlm_lock_enqueue() has already placed lock on the granted list. */
 	list_del_init(&lock->l_res_link);
 
-	if (flags & LDLM_FL_TEST_LOCK) {
+	if (lock->l_flags & LDLM_FL_FLOCK_DEADLOCK) {
+		LDLM_DEBUG(lock, "client-side enqueue deadlock received");
+		rc = -EDEADLK;
+	} else if (flags & LDLM_FL_TEST_LOCK) {
 		/* fcntl(F_GETLK) request */
 		/* The old mode was saved in getlk->fl_type so that if the mode
 		 * in the lock changes we can decref the appropriate refcount.*/
@@ -675,7 +708,7 @@ granted:
 		ldlm_process_flock_lock(lock, &noreproc, 1, &err, NULL);
 	}
 	unlock_res_and_lock(lock);
-	return 0;
+	return rc;
 }
 EXPORT_SYMBOL(ldlm_flock_completion_ast);
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 93c67e5..18e3db2 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -205,7 +205,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				  OBD_CONNECT_EINPROGRESS |
 				  OBD_CONNECT_JOBSTATS | OBD_CONNECT_LVB_TYPE |
 				  OBD_CONNECT_LAYOUTLOCK | OBD_CONNECT_PINGLESS |
-				  OBD_CONNECT_MAX_EASIZE;
+				  OBD_CONNECT_MAX_EASIZE |
+				  OBD_CONNECT_FLOCK_DEAD;
 
 	if (sbi->ll_flags & LL_SBI_SOM_PREVIEW)
 		data->ocd_connect_flags |= OBD_CONNECT_SOM;
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 9008861..3329055 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -531,6 +531,7 @@ static const char *obd_connect_names[] = {
 	"lightweight_conn",
 	"short_io",
 	"pingless",
+	"flock_deadlock",
 	"unknown",
 	NULL
 };
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 6c31947..f7af706 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -1153,6 +1153,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_CONNECT_SHORTIO);
 	LASSERTF(OBD_CONNECT_PINGLESS == 0x4000000000000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT_PINGLESS);
+	LASSERTF(OBD_CONNECT_FLOCK_DEAD == 0x8000000000000ULL, "found 0x%.16llxULL\n",
+		 OBD_CONNECT_FLOCK_DEAD);
 	LASSERTF(OBD_CKSUM_CRC32 == 0x00000001UL, "found 0x%.8xUL\n",
 		(unsigned)OBD_CKSUM_CRC32);
 	LASSERTF(OBD_CKSUM_ADLER == 0x00000002UL, "found 0x%.8xUL\n",
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT
  2013-11-14 16:13 ` [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
@ 2013-11-14 16:22   ` Cheng Shao
  2013-11-14 16:28     ` Peng Tao
  0 siblings, 1 reply; 67+ messages in thread
From: Cheng Shao @ 2013-11-14 16:22 UTC (permalink / raw)
  To: Peng Tao
  Cc: Greg Kroah-Hartman, linux-kernel, Patrick Farrell, Peng Tao,
	Andreas Dilger

Hi Peng,

This patch was eventually reverted as it was causing interoperability issues with 2.1 Lustre server. We are working on the new patch, so please don't include this one.

Thanks,
- Cheng

> On Nov 14, 2013, at 8:13 AM, Peng Tao <bergwolf@gmail.com> wrote:
> 
> From: Patrick Farrell <paf@cray.com>
> 
> This happend with SLES11SP2 Lustre client, which in turn acts as an
> NFS server, exporting a subtree of an Lustre fs through NFS.
> 
> We detected that whenever we are writing to a new file using, fx,
> 'echo blah > newfile', it will return ENOENT error. We found
> out that this was caused by the anonymous dentry. In SLESS11SP2,
> anonymous dentries are assigned '/' as the name, instead of an
> empty string. When MDT handles the intent_open call, it will look
> up the obj by the name if it is not an empty string, and thus
> couldn't find it.
> 
> As MDS_OPEN_BY_FID is always set on this request, we never need
> to send the name in this request.  The fid is already available
> and should be used in case the file has been renamed.
> 
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
> Lustre-change: http://review.whamcloud.com/6920
> Signed-off-by: Cheng Shao <cheng_shao@xyratex.com>
> Signed-off-by: Patrick Farrell <paf@cray.com>
> Reviewed-by: Bob Glossman <bob.glossman@intel.com>
> Reviewed-by: Alexey Shvetsov <alexxy@gentoo.org>
> Reviewed-by: Lai Siyao <lai.siyao@intel.com>
> Reviewed-by: James Simmons <uja.ornl@gmail.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: Peng Tao <bergwolf@gmail.com>
> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
> ---
> drivers/staging/lustre/lustre/llite/file.c |    5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
> index f97c600..efc614b 100644
> --- a/drivers/staging/lustre/lustre/llite/file.c
> +++ b/drivers/staging/lustre/lustre/llite/file.c
> @@ -368,8 +368,6 @@ static int ll_intent_file_open(struct file *file, void *lmm,
> {
>    struct ll_sb_info *sbi = ll_i2sbi(file->f_dentry->d_inode);
>    struct dentry *parent = file->f_dentry->d_parent;
> -    const char *name = file->f_dentry->d_name.name;
> -    const int len = file->f_dentry->d_name.len;
>    struct md_op_data *op_data;
>    struct ptlrpc_request *req;
>    __u32 opc = LUSTRE_OPC_ANY;
> @@ -394,8 +392,9 @@ static int ll_intent_file_open(struct file *file, void *lmm,
>    }
> 
>    op_data  = ll_prep_md_op_data(NULL, parent->d_inode,
> -                      file->f_dentry->d_inode, name, len,
> +                      file->f_dentry->d_inode, NULL, 0,
>                      O_RDWR, opc, NULL);
> +
>    if (IS_ERR(op_data))
>        return PTR_ERR(op_data);
> 
> -- 
> 1.7.9.5
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT
  2013-11-14 16:22   ` Cheng Shao
@ 2013-11-14 16:28     ` Peng Tao
  2013-11-15  4:01       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-14 16:28 UTC (permalink / raw)
  To: Cheng Shao
  Cc: Greg Kroah-Hartman, linux-kernel, Patrick Farrell, Andreas Dilger

On Fri, Nov 15, 2013 at 12:22 AM, Cheng Shao <cheng_shao@xyratex.com> wrote:
> Hi Peng,
>
> This patch was eventually reverted as it was causing interoperability issues with 2.1 Lustre server. We are working on the new patch, so please don't include this one.
>
Thanks for the notification. I didn't see the reverting patch in next
200 patches in Lustre tree so it must have happened recently.

Greg, please drop this one. I verified that the rest of the patchset
can be applied cleanly.

Thanks,
Tao

> Thanks,
> - Cheng
>
>> On Nov 14, 2013, at 8:13 AM, Peng Tao <bergwolf@gmail.com> wrote:
>>
>> From: Patrick Farrell <paf@cray.com>
>>
>> This happend with SLES11SP2 Lustre client, which in turn acts as an
>> NFS server, exporting a subtree of an Lustre fs through NFS.
>>
>> We detected that whenever we are writing to a new file using, fx,
>> 'echo blah > newfile', it will return ENOENT error. We found
>> out that this was caused by the anonymous dentry. In SLESS11SP2,
>> anonymous dentries are assigned '/' as the name, instead of an
>> empty string. When MDT handles the intent_open call, it will look
>> up the obj by the name if it is not an empty string, and thus
>> couldn't find it.
>>
>> As MDS_OPEN_BY_FID is always set on this request, we never need
>> to send the name in this request.  The fid is already available
>> and should be used in case the file has been renamed.
>>
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3544
>> Lustre-change: http://review.whamcloud.com/6920
>> Signed-off-by: Cheng Shao <cheng_shao@xyratex.com>
>> Signed-off-by: Patrick Farrell <paf@cray.com>
>> Reviewed-by: Bob Glossman <bob.glossman@intel.com>
>> Reviewed-by: Alexey Shvetsov <alexxy@gentoo.org>
>> Reviewed-by: Lai Siyao <lai.siyao@intel.com>
>> Reviewed-by: James Simmons <uja.ornl@gmail.com>
>> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
>> Signed-off-by: Peng Tao <bergwolf@gmail.com>
>> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
>> ---
>> drivers/staging/lustre/lustre/llite/file.c |    5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
>> index f97c600..efc614b 100644
>> --- a/drivers/staging/lustre/lustre/llite/file.c
>> +++ b/drivers/staging/lustre/lustre/llite/file.c
>> @@ -368,8 +368,6 @@ static int ll_intent_file_open(struct file *file, void *lmm,
>> {
>>    struct ll_sb_info *sbi = ll_i2sbi(file->f_dentry->d_inode);
>>    struct dentry *parent = file->f_dentry->d_parent;
>> -    const char *name = file->f_dentry->d_name.name;
>> -    const int len = file->f_dentry->d_name.len;
>>    struct md_op_data *op_data;
>>    struct ptlrpc_request *req;
>>    __u32 opc = LUSTRE_OPC_ANY;
>> @@ -394,8 +392,9 @@ static int ll_intent_file_open(struct file *file, void *lmm,
>>    }
>>
>>    op_data  = ll_prep_md_op_data(NULL, parent->d_inode,
>> -                      file->f_dentry->d_inode, name, len,
>> +                      file->f_dentry->d_inode, NULL, 0,
>>                      O_RDWR, opc, NULL);
>> +
>>    if (IS_ERR(op_data))
>>        return PTR_ERR(op_data);
>>
>> --
>> 1.7.9.5
>>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 00/40] staging/lustre: patch bomb 1
  2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
                   ` (39 preceding siblings ...)
  2013-11-14 16:13 ` [PATCH 40/40] staging/lustre/ptlrpc: flock deadlock detection does not work Peng Tao
@ 2013-11-15  3:59 ` Greg Kroah-Hartman
  2013-11-15  9:51   ` Peng Tao
  40 siblings, 1 reply; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-15  3:59 UTC (permalink / raw)
  To: Peng Tao; +Cc: linux-kernel, Andreas Dilger

On Fri, Nov 15, 2013 at 12:13:02AM +0800, Peng Tao wrote:
> [sadly git-send-email died sending the patchset... so resending]
> 
> Hi Greg,
> 
> Here are 40 patches. Three cleanup patches and 37 patches ported
> from Lustre master to commit
> LU-2940 llite: Fix for oops in vvp_pgcache_show()
> 
> There are two more patchsets that I will send out following up.
> 
> Thanks,
> Tao
> 
> Cc: Andreas Dilger <andreas.dilger@intel.com>
> 
> Amir Shehata (2):
>   staging/lustre/lnet: Fix assert on empty group in selftest module
>   staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer
> 
> Andreas Dilger (2):
>   staging/lustre/ldlm: fix resource/fid check, use DLDLMRES
>   staging/lustre/idl: remove LASSERT/CLASSERT from lustre_idl.h
> 
> Andrew Perepechko (1):
>   staging/lustre/llite: extended attribute cache
> 
> Andriy Skulysh (2):
>   staging/lustre/ptlrpc: Fix race during exp_flock_hash creation
>   staging/lustre/ptlrpc: flock deadlock detection does not work
> 
> Artem Blagodarenko (1):
>   staging/lustre/mgs: set_param -P option that sets value permanently
> 
> Dmitry Eremin (2):
>   staging/lustre/build: clean up unused variables and dead code
>   staging/lustre/build: fix compilation issue with is_compat_task
> 
> Doug Oucharek (1):
>   staging/lustre/lnet: Add LNet Router Priority parameter
> 
> Fan Yong (2):
>   staging/lustre/scrub: OI scrub on OST
>   staging/lustre/scrub: control OI scrub on OST from user space
> 
> JC Lafoucriere (5):
>   staging/lustre/llite: Access to released file trigs a restore
>   staging/lustre/mdt: HSM coordinator client interface
>   staging/lustre/mdt: HSM coordinator agent interface
>   staging/lustre/api: HSM import uses new released pattern
>   staging/lustre/utils: HSM Posix CopyTool
> 
> James Simmons (2):
>   staging/lustre/autoconf: remove vectored fops tests
>   staging/lustre/autoconf: remove LIBCFS_HAVE_IS_COMPAT_TASK test
> 
> Jinshan Xiong (2):
>   staging/lustre/hsm: Implementation of exclusive open
>   staging/lustre/hsm: Add hsm_release feature.
> 
> John L. Hammond (7):
>   staging/lustre: validate open handle cookies
>   staging/lustre/llite: use correct FID in ll_och_fill()
>   staging/lustre/lov: convert magic to host-endian in lov_dump_lmm()
>   staging/lustre/mdc: prevent fall through in mdc_iocontrol()
>   staging/lustre/lu: shrink lu_object by 8 bytes on x86_64
>   staging/lustre/llite: don't check for O_CREAT in it_create_mode
>   staging/lustre/llite: pass correct pointer to obd_iocontrol()
> 
> Lai Siyao (1):
>   staging/lustre/llite: remove ll_d_root_ops
> 
> Li Xi (1):
>   staging/lustre/llog: fix return value of llog_alloc_handle
> 
> Mikhail Pershin (4):
>   staging/lustre/server: use unified request handler for MGS
>   staging/lustre/llog: MGC to use OSD API for backup logs
>   staging/lustre/target: move OUT to the unified target code
>   staging/lustre/seq: unified SEQ handler
> 
> Nathaniel Clark (1):
>   staging/lustre/xattr: separate ACL and XATTR caches
> 
> Patrick Farrell (1):
>   staging/lustre/nfs: writing to new files will return ENOENT
> 
> Peng Tao (3):
>   staging/lustre/llite: restore ll_fiemap
>   staging/lustre: remove lu_target.h
>   staging/lustre: remove llog_server.c
> 
>  .../staging/lustre/include/linux/libcfs/curproc.h  |    1 -
>  .../staging/lustre/include/linux/libcfs/libcfs.h   |    2 -
>  .../lustre/include/linux/libcfs/libcfs_ioctl.h     |    1 +
>  .../staging/lustre/include/linux/lnet/lib-lnet.h   |    5 +-
>  .../staging/lustre/include/linux/lnet/lib-types.h  |    2 +-
>  drivers/staging/lustre/lnet/lnet/api-ni.c          |    5 +-
>  drivers/staging/lustre/lnet/lnet/config.c          |   39 +-
>  drivers/staging/lustre/lnet/lnet/lib-move.c        |    6 +
>  drivers/staging/lustre/lnet/lnet/router.c          |   19 +-
>  drivers/staging/lustre/lnet/lnet/router_proc.c     |   16 +-
>  drivers/staging/lustre/lnet/selftest/conctl.c      |   56 +-
>  drivers/staging/lustre/lnet/selftest/conrpc.c      |    2 +-
>  drivers/staging/lustre/lnet/selftest/console.c     |  105 ++--
>  drivers/staging/lustre/lnet/selftest/console.h     |    6 +-
>  drivers/staging/lustre/lnet/selftest/rpc.c         |    2 -
>  drivers/staging/lustre/lnet/selftest/selftest.h    |    3 -
>  drivers/staging/lustre/lnet/selftest/timer.c       |    6 +-
>  drivers/staging/lustre/lustre/include/cl_object.h  |    6 +-
>  drivers/staging/lustre/lustre/include/dt_object.h  |    2 +-
>  .../lustre/lustre/include/linux/lustre_compat25.h  |    4 +-
>  .../lustre/lustre/include/linux/lustre_intent.h    |    2 +-
>  .../lustre/lustre/include/linux/lustre_lite.h      |    1 +
>  drivers/staging/lustre/lustre/include/lu_object.h  |   19 -
>  drivers/staging/lustre/lustre/include/lu_target.h  |   91 ---
>  .../lustre/lustre/include/lustre/lustre_idl.h      |  111 ++--
>  .../lustre/lustre/include/lustre/lustre_user.h     |   63 +-
>  .../lustre/lustre/include/lustre/lustreapi.h       |  208 ++++---
>  drivers/staging/lustre/lustre/include/lustre_cfg.h |    2 +
>  .../staging/lustre/lustre/include/lustre_disk.h    |    2 +
>  .../lustre/lustre/include/lustre_dlm_flags.h       |   24 +-
>  drivers/staging/lustre/lustre/include/lustre_fid.h |    6 -
>  drivers/staging/lustre/lustre/include/lustre_ha.h  |    3 -
>  .../staging/lustre/lustre/include/lustre_handles.h |    5 +-
>  drivers/staging/lustre/lustre/include/lustre_lib.h |   11 +-
>  drivers/staging/lustre/lustre/include/lustre_log.h |   13 +-
>  drivers/staging/lustre/lustre/include/lustre_net.h |   19 +-
>  .../lustre/lustre/include/lustre_req_layout.h      |    7 +
>  drivers/staging/lustre/lustre/include/md_object.h  |    4 +-
>  drivers/staging/lustre/lustre/include/obd.h        |   15 +-
>  drivers/staging/lustre/lustre/include/obd_class.h  |    9 +-
>  .../staging/lustre/lustre/include/obd_support.h    |   10 +
>  drivers/staging/lustre/lustre/lclient/lcommon_cl.c |    6 +
>  .../staging/lustre/lustre/lclient/lcommon_misc.c   |    4 +-
>  drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   69 ++-
>  drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    7 +-
>  drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |   47 +-
>  drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |   39 +-
>  drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |   19 +-
>  .../lustre/lustre/libcfs/linux/linux-curproc.c     |   13 -
>  drivers/staging/lustre/lustre/libcfs/workitem.c    |   55 +-
>  drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
>  drivers/staging/lustre/lustre/llite/dcache.c       |   77 +--
>  drivers/staging/lustre/lustre/llite/dir.c          |   47 +-
>  drivers/staging/lustre/lustre/llite/file.c         |  622 ++++++++++++++++++--
>  .../staging/lustre/lustre/llite/llite_internal.h   |   59 +-
>  drivers/staging/lustre/lustre/llite/llite_lib.c    |   91 ++-
>  drivers/staging/lustre/lustre/llite/llite_nfs.c    |    6 +-
>  drivers/staging/lustre/lustre/llite/lproc_llite.c  |   36 ++
>  drivers/staging/lustre/lustre/llite/namei.c        |   53 +-
>  drivers/staging/lustre/lustre/llite/statahead.c    |    7 +-
>  drivers/staging/lustre/lustre/llite/super25.c      |    4 +
>  drivers/staging/lustre/lustre/llite/vvp_io.c       |   54 +-
>  drivers/staging/lustre/lustre/llite/vvp_object.c   |    2 +-
>  drivers/staging/lustre/lustre/llite/xattr.c        |  105 ++--
>  drivers/staging/lustre/lustre/llite/xattr_cache.c  |  559 ++++++++++++++++++
>  .../staging/lustre/lustre/lov/lov_cl_internal.h    |   16 +
>  drivers/staging/lustre/lustre/lov/lov_internal.h   |    4 +-
>  drivers/staging/lustre/lustre/lov/lov_io.c         |   15 +-
>  drivers/staging/lustre/lustre/lov/lov_object.c     |   35 +-
>  drivers/staging/lustre/lustre/lov/lov_pack.c       |   20 +-
>  drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    5 +-
>  drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   31 +-
>  drivers/staging/lustre/lustre/mdc/mdc_locks.c      |  104 +++-
>  drivers/staging/lustre/lustre/mdc/mdc_reint.c      |    2 +-
>  drivers/staging/lustre/lustre/mdc/mdc_request.c    |  116 ++--
>  drivers/staging/lustre/lustre/mgc/libmgc.c         |    3 -
>  drivers/staging/lustre/lustre/mgc/mgc_request.c    |  492 ++++++++++------
>  drivers/staging/lustre/lustre/obdclass/genops.c    |    2 +-
>  drivers/staging/lustre/lustre/obdclass/llog.c      |  214 ++++---
>  .../staging/lustre/lustre/obdclass/local_storage.c |    9 +-
>  .../staging/lustre/lustre/obdclass/local_storage.h |    3 +
>  .../lustre/lustre/obdclass/lprocfs_status.c        |    5 +-
>  drivers/staging/lustre/lustre/obdclass/lu_object.c |   26 +-
>  .../lustre/lustre/obdclass/lustre_handles.c        |    4 +-
>  drivers/staging/lustre/lustre/obdclass/md_attrs.c  |    4 +-
>  .../staging/lustre/lustre/obdclass/obd_config.c    |   50 +-
>  drivers/staging/lustre/lustre/obdclass/obd_mount.c |    3 +
>  .../staging/lustre/lustre/obdecho/echo_client.c    |    2 +-
>  drivers/staging/lustre/lustre/ptlrpc/Makefile      |    2 +-
>  drivers/staging/lustre/lustre/ptlrpc/client.c      |   10 +-
>  drivers/staging/lustre/lustre/ptlrpc/import.c      |    2 -
>  drivers/staging/lustre/lustre/ptlrpc/layout.c      |   63 +-
>  drivers/staging/lustre/lustre/ptlrpc/llog_server.c |  450 --------------
>  .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c    |    4 +-
>  drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |   16 +-
>  .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   59 +-
>  drivers/staging/lustre/lustre/ptlrpc/pinger.c      |   65 --
>  drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c     |   44 +-
>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   60 +-
>  99 files changed, 3043 insertions(+), 1793 deletions(-)
>  delete mode 100644 drivers/staging/lustre/lustre/include/lu_target.h
>  create mode 100644 drivers/staging/lustre/lustre/llite/xattr_cache.c
>  delete mode 100644 drivers/staging/lustre/lustre/ptlrpc/llog_server.c

Some of the stuff here looks a lot like "new features / functionality"
to me, insead of just cleanups and fixes.

Ideally, you all would only work on cleanups and getting this code out
of staging, and not add new things while it's sitting here.  But I know
that's going to be hard as you all are still working with an external
tree.

That external tree is going to be tough to keep up over time, I
_strongly_ suggest you drop it as soon as possible and only use the
upstream tree.  Otherwise this whole thing really isn't going to work
out at all.

I'll go dig through these and see what they look like, thanks...

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT
  2013-11-14 16:28     ` Peng Tao
@ 2013-11-15  4:01       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-15  4:01 UTC (permalink / raw)
  To: Peng Tao; +Cc: Cheng Shao, linux-kernel, Patrick Farrell, Andreas Dilger

On Fri, Nov 15, 2013 at 12:28:20AM +0800, Peng Tao wrote:
> On Fri, Nov 15, 2013 at 12:22 AM, Cheng Shao <cheng_shao@xyratex.com> wrote:
> > Hi Peng,
> >
> > This patch was eventually reverted as it was causing interoperability issues with 2.1 Lustre server. We are working on the new patch, so please don't include this one.
> >
> Thanks for the notification. I didn't see the reverting patch in next
> 200 patches in Lustre tree so it must have happened recently.
> 
> Greg, please drop this one. I verified that the rest of the patchset
> can be applied cleanly.

Now dropped, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-14 16:13 ` [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore Peng Tao
@ 2013-11-15  4:09   ` Greg Kroah-Hartman
  2013-11-15  9:55     ` Peng Tao
  2013-11-16 10:36     ` Dilger, Andreas
  0 siblings, 2 replies; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-15  4:09 UTC (permalink / raw)
  To: Peng Tao; +Cc: linux-kernel, JC Lafoucriere, Andreas Dilger

On Fri, Nov 15, 2013 at 12:13:06AM +0800, Peng Tao wrote:
> From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
> 
> When a client accesses data in a released file,
> or truncate it, client must trig a restore request.
> During this restore, the client must not glimpse and
> must use size from MDT. To bring the "restore is running"
> information on the client we add a new t_state bit field
> to mdt_info which will be used to carry transient file state.
> To memorise this information in the inode we add a new flag
> LLIF_FILE_RESTORING.

This patch also does other things not mentioned here (coding style
cleanups), which isn't allowed in a single patch (only do one thing per
patch, and never not document what you are doing...)

It also adds checkpatch warnings, which I will not accept in patches at
all here.  People are spending a lot of time cleaning up the coding
style issues, please NEVER add new ones, that just causes more work to
be needed to be done, and for people to have to go back and reclean
files they have already cleaned up.

So, sorry, I have to stop here at this series.  I've applied the first 3
to the opw-next branch of staging.git so they can live somewhere until
3.13-rc1 is out.

I know you spent a lot of time making these 120 patches to send me, but
that too is crazy.  You shouldn't wait that long to get feedback and
send patches to me at all.  Please send them in smaller series, with
less time between patch submissions.

So, care to just send me 10 patches or so now, which I can review and
accept if good, and we can sync up and continue from there?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-14 16:13 ` [PATCH 05/40] staging/lustre: validate open handle cookies Peng Tao
@ 2013-11-15  4:13   ` Greg Kroah-Hartman
  2013-11-15 10:22     ` Peng Tao
  2013-11-16 11:20     ` Dilger, Andreas
  0 siblings, 2 replies; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-15  4:13 UTC (permalink / raw)
  To: Peng Tao; +Cc: linux-kernel, John L. Hammond, Andreas Dilger

On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
> From: "John L. Hammond" <john.hammond@intel.com>
> 
> Add a const void *h_owner member to struct portals_handle. Add a const
> void *owner parameter to class_handle2object() which must be matched
> by the h_owner member of the handle in addition to the cookie.

Ick ick ick.

NEVER use a void pointer if you can help it, and for a "handle", never.
This isn't other operating systems, sorry.  We know what types our
pointers to structures are, use them, so that the compiler can catch our
problems, and don't try to cheat by using void *.

> Adjust
> the callers of class_handle2object() accordingly, using NULL as the
> argument to the owner parameter, except in the case of
> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
> which we use as the owner when searching for a MFD. When allocating a
> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
> store it in h_owner. This allows the MDT to validate that the client
> has not sent the wrong open handle cookie, or sent the right cookie to
> the wrong MDT.

This changelog entry doesn't even match up with the code below.  ALl
callers of class_handle2object are passing NULL here, which makes this
patch pretty pointless, right?

And that's a _very_ generic global symbol name, please don't do that, it
needs to be "lustre_*" at the front to even expect it to be acceptable.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 07/40] staging/lustre/hsm: Implementation of exclusive open
  2013-11-14 16:13 ` [PATCH 07/40] staging/lustre/hsm: Implementation of exclusive open Peng Tao
@ 2013-11-15  4:17   ` Greg Kroah-Hartman
  2013-11-15 10:26     ` Peng Tao
  0 siblings, 1 reply; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-15  4:17 UTC (permalink / raw)
  To: Peng Tao; +Cc: linux-kernel, Jinshan Xiong, John L. Hammond, Andreas Dilger

On Fri, Nov 15, 2013 at 12:13:09AM +0800, Peng Tao wrote:
> From: Jinshan Xiong <jinshan.xiong@intel.com>
> 
> In this patch, a framework of lease is implemented. However,
> only exclusive lease is supported right now.

Which is a problem, why?

Why do you want to add this new functionality?  What is it good for?
Who will use it?

Oh, and the patch doesn't apply to my tree, AND it contains a bunch of
foolish coding style errors, so even if I did understand what this was
for, and why we wanted it, I couldn't.

Please be more careful, I'm really stopping here now on reviewing these
lustre patches.  Please clean them up, resync, and try again, I've now
purged them all from my mboxes, sorry.

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 00/40] staging/lustre: patch bomb 1
  2013-11-15  3:59 ` [PATCH 00/40] staging/lustre: patch bomb 1 Greg Kroah-Hartman
@ 2013-11-15  9:51   ` Peng Tao
  2013-11-15 20:54     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-15  9:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Linux Kernel Mailing List, Andreas Dilger

On Fri, Nov 15, 2013 at 11:59 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> Some of the stuff here looks a lot like "new features / functionality"
> to me, insead of just cleanups and fixes.
>
> Ideally, you all would only work on cleanups and getting this code out
> of staging, and not add new things while it's sitting here.  But I know
> that's going to be hard as you all are still working with an external
> tree.
>
> That external tree is going to be tough to keep up over time, I
> _strongly_ suggest you drop it as soon as possible and only use the
> upstream tree.  Otherwise this whole thing really isn't going to work
> out at all.
>
There are many users of the external tree so we cannot just abandon
it, especially that the upstream client is not shipped in any
distribution yet. Thanks for your understanding.


> I'll go dig through these and see what they look like, thanks...
>
Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-15  4:09   ` Greg Kroah-Hartman
@ 2013-11-15  9:55     ` Peng Tao
  2013-11-16 10:36     ` Dilger, Andreas
  1 sibling, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-15  9:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux Kernel Mailing List, JC Lafoucriere, Andreas Dilger

On Fri, Nov 15, 2013 at 12:09 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Fri, Nov 15, 2013 at 12:13:06AM +0800, Peng Tao wrote:
>> From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
>>
>> When a client accesses data in a released file,
>> or truncate it, client must trig a restore request.
>> During this restore, the client must not glimpse and
>> must use size from MDT. To bring the "restore is running"
>> information on the client we add a new t_state bit field
>> to mdt_info which will be used to carry transient file state.
>> To memorise this information in the inode we add a new flag
>> LLIF_FILE_RESTORING.
>
> This patch also does other things not mentioned here (coding style
> cleanups), which isn't allowed in a single patch (only do one thing per
> patch, and never not document what you are doing...)
>
I ported the original patch with as least modification as possible.
Guess I need to split this one (and the alike) so that it follows
upstream patch rules.

> It also adds checkpatch warnings, which I will not accept in patches at
> all here.  People are spending a lot of time cleaning up the coding
> style issues, please NEVER add new ones, that just causes more work to
> be needed to be done, and for people to have to go back and reclean
> files they have already cleaned up.
>
sorry... I will be more careful about this.

> So, sorry, I have to stop here at this series.  I've applied the first 3
> to the opw-next branch of staging.git so they can live somewhere until
> 3.13-rc1 is out.
>
Thanks!

> I know you spent a lot of time making these 120 patches to send me, but
> that too is crazy.  You shouldn't wait that long to get feedback and
> send patches to me at all.  Please send them in smaller series, with
> less time between patch submissions.
>
> So, care to just send me 10 patches or so now, which I can review and
> accept if good, and we can sync up and continue from there?
>
OK. I will split the patchsets into smaller ones. Thanks for reviewing!

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-15  4:13   ` Greg Kroah-Hartman
@ 2013-11-15 10:22     ` Peng Tao
  2013-11-15 20:57       ` Greg Kroah-Hartman
  2013-11-16 11:20     ` Dilger, Andreas
  1 sibling, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-15 10:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux Kernel Mailing List, John L. Hammond, Andreas Dilger

On Fri, Nov 15, 2013 at 12:13 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
>> From: "John L. Hammond" <john.hammond@intel.com>
>>
>> Add a const void *h_owner member to struct portals_handle. Add a const
>> void *owner parameter to class_handle2object() which must be matched
>> by the h_owner member of the handle in addition to the cookie.
>
> Ick ick ick.
>
> NEVER use a void pointer if you can help it, and for a "handle", never.
> This isn't other operating systems, sorry.  We know what types our
> pointers to structures are, use them, so that the compiler can catch our
> problems, and don't try to cheat by using void *.
>
Here the void * is not used to pass any structures. It just intends to
save any pointer that is viewed as owner of the handle. Does it make
sense?

Maybe we can just make it unsigned long and type cast the owner
pointer to unsigned long as well. But I guess that is even more
hacky...

>> Adjust
>> the callers of class_handle2object() accordingly, using NULL as the
>> argument to the owner parameter, except in the case of
>> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
>> which we use as the owner when searching for a MFD. When allocating a
>> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
>> store it in h_owner. This allows the MDT to validate that the client
>> has not sent the wrong open handle cookie, or sent the right cookie to
>> the wrong MDT.
>
> This changelog entry doesn't even match up with the code below.  ALl
> callers of class_handle2object are passing NULL here, which makes this
> patch pretty pointless, right?
>
The extra pointer argument is in fact used by server code. The
function is shared by client and server and we want to keep it sync in
the two places.

> And that's a _very_ generic global symbol name, please don't do that, it
> needs to be "lustre_*" at the front to even expect it to be acceptable.
>
OK. I'll rename it.

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 07/40] staging/lustre/hsm: Implementation of exclusive open
  2013-11-15  4:17   ` Greg Kroah-Hartman
@ 2013-11-15 10:26     ` Peng Tao
  0 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-15 10:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux Kernel Mailing List, Jinshan Xiong, John L. Hammond,
	Andreas Dilger

On Fri, Nov 15, 2013 at 12:17 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Fri, Nov 15, 2013 at 12:13:09AM +0800, Peng Tao wrote:
>> From: Jinshan Xiong <jinshan.xiong@intel.com>
>>
>> In this patch, a framework of lease is implemented. However,
>> only exclusive lease is supported right now.
>
> Which is a problem, why?
>
> Why do you want to add this new functionality?  What is it good for?
> Who will use it?
>
Only brief info for the patch in included in the commit message. The
long story is saved in the external link
https://jira.hpdd.intel.com/browse/LU-2919. I guess I need to pull
more bits to the commit message to make it more understandable.

> Oh, and the patch doesn't apply to my tree, AND it contains a bunch of
> foolish coding style errors, so even if I did understand what this was
> for, and why we wanted it, I couldn't.
>
I'll fix up. sorry.

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 00/40] staging/lustre: patch bomb 1
  2013-11-15  9:51   ` Peng Tao
@ 2013-11-15 20:54     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-15 20:54 UTC (permalink / raw)
  To: Peng Tao; +Cc: Linux Kernel Mailing List, Andreas Dilger

On Fri, Nov 15, 2013 at 05:51:14PM +0800, Peng Tao wrote:
> On Fri, Nov 15, 2013 at 11:59 AM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > Some of the stuff here looks a lot like "new features / functionality"
> > to me, insead of just cleanups and fixes.
> >
> > Ideally, you all would only work on cleanups and getting this code out
> > of staging, and not add new things while it's sitting here.  But I know
> > that's going to be hard as you all are still working with an external
> > tree.
> >
> > That external tree is going to be tough to keep up over time, I
> > _strongly_ suggest you drop it as soon as possible and only use the
> > upstream tree.  Otherwise this whole thing really isn't going to work
> > out at all.
> >
> There are many users of the external tree so we cannot just abandon
> it, especially that the upstream client is not shipped in any
> distribution yet. Thanks for your understanding.

What is keeping people using that tree?  Support for older/distro
kernels?

Is it the fact that the server code isn't in the kernel?  Should that be
added now too so that we can get a proper view of what can and can not
be changed?  Some of your patches are showing that things are shared by
the two chunks of code, so does that mean if I delete things in the
client code that don't look to be used by anything, you will have
problems because the server now breaks?

I think it's time to just merge the server and deal with the whole thing
all at once, otherwise this dependancy on your external tree, and
external code to the kernel, is going to doom the ability to ever get
this code cleaned up and merged properly.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-15 10:22     ` Peng Tao
@ 2013-11-15 20:57       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-15 20:57 UTC (permalink / raw)
  To: Peng Tao; +Cc: Linux Kernel Mailing List, John L. Hammond, Andreas Dilger

On Fri, Nov 15, 2013 at 06:22:57PM +0800, Peng Tao wrote:
> On Fri, Nov 15, 2013 at 12:13 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
> >> From: "John L. Hammond" <john.hammond@intel.com>
> >>
> >> Add a const void *h_owner member to struct portals_handle. Add a const
> >> void *owner parameter to class_handle2object() which must be matched
> >> by the h_owner member of the handle in addition to the cookie.
> >
> > Ick ick ick.
> >
> > NEVER use a void pointer if you can help it, and for a "handle", never.
> > This isn't other operating systems, sorry.  We know what types our
> > pointers to structures are, use them, so that the compiler can catch our
> > problems, and don't try to cheat by using void *.
> >
> Here the void * is not used to pass any structures. It just intends to
> save any pointer that is viewed as owner of the handle. Does it make
> sense?

Yes, but you shouldn't use a void * for it.  Use the _real_ data
structure it is pointing to.

And "handles" suck, please don't use them, it's not a Linux model at
all, for good reasons (i.e. we have the ability to know what is being
passed around.)

> Maybe we can just make it unsigned long and type cast the owner
> pointer to unsigned long as well. But I guess that is even more
> hacky...

Yes, that's not the correct solution either :)

> >> Adjust
> >> the callers of class_handle2object() accordingly, using NULL as the
> >> argument to the owner parameter, except in the case of
> >> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
> >> which we use as the owner when searching for a MFD. When allocating a
> >> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
> >> store it in h_owner. This allows the MDT to validate that the client
> >> has not sent the wrong open handle cookie, or sent the right cookie to
> >> the wrong MDT.
> >
> > This changelog entry doesn't even match up with the code below.  ALl
> > callers of class_handle2object are passing NULL here, which makes this
> > patch pretty pointless, right?
> >
> The extra pointer argument is in fact used by server code. The
> function is shared by client and server and we want to keep it sync in
> the two places.

Ok, this is proof we need the server code in the kernel, because as it
is, I can't take this patch because it doesn't do anything at all, and
the changelog lies.

Dealing with an external dependancy on kernel code does not work at all,
sorry.  Let's just merge both into staging and deal with the cleanup
properly.  Otherwise this is going to be impossible to do, and we should
just delete the client code as I sure can't work on it as is now.

sorry,

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-15  4:09   ` Greg Kroah-Hartman
  2013-11-15  9:55     ` Peng Tao
@ 2013-11-16 10:36     ` Dilger, Andreas
  2013-11-16 19:59       ` Greg Kroah-Hartman
  1 sibling, 1 reply; 67+ messages in thread
From: Dilger, Andreas @ 2013-11-16 10:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Peng Tao; +Cc: linux-kernel

On 2013/11/14 9:09 PM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
wrote:
>On Fri, Nov 15, 2013 at 12:13:06AM +0800, Peng Tao wrote:
>> From: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
>> 
>> When a client accesses data in a released file,
>> or truncate it, client must trig a restore request.
>> During this restore, the client must not glimpse and
>> must use size from MDT. To bring the "restore is running"
>> information on the client we add a new t_state bit field
>> to mdt_info which will be used to carry transient file state.
>> To memorise this information in the inode we add a new flag
>> LLIF_FILE_RESTORING.
>
>This patch also does other things not mentioned here (coding style
>cleanups), which isn't allowed in a single patch (only do one thing per
>patch, and never not document what you are doing...)
>
>It also adds checkpatch warnings, which I will not accept in patches at
>all here.  People are spending a lot of time cleaning up the coding
>style issues, please NEVER add new ones, that just causes more work to
>be needed to be done, and for people to have to go back and reclean
>files they have already cleaned up.

I'm fine to clean up the coding style issues in these patches.

>So, sorry, I have to stop here at this series.  I've applied the first 3
>to the opw-next branch of staging.git so they can live somewhere until
>3.13-rc1 is out.
>
>I know you spent a lot of time making these 120 patches to send me, but
>that too is crazy.  You shouldn't wait that long to get feedback and
>send patches to me at all.  Please send them in smaller series, with
>less time between patch submissions.

The reason that there are so many patches in a burst is that we are also
developing new features and fixing bugs in parallel with the kernel, but
they also need to be tested a lot before they are released.  It's not any
different from kernel patches testing on -next or -mm for a few months
before they are pushed to the mainline kernel in a batch.

The out-of-tree development can't really stop for a year while the kernel
client code is cleaned up.  If the feature patches don't land into the
kernel client for a year (or however long it takes to do all the cleanup),
then it will become outdated and the whole reason for adding the client
into the kernel is lost.

>> There are many users of the external tree so we cannot just abandon
>> it, especially that the upstream client is not shipped in any
>> distribution yet. Thanks for your understanding.
>
>What is keeping people using that tree?  Support for older/distro
>kernels?
>

Support for distro kernels is a big part of it.  Most HPC sites use vendor
kernels of various vintages, and we need to keep the code working for those
sites.  We also need to continue developing features needed by ever-larger
clusters, fixing bugs, etc.  Eventually, when Lustre is in the kernel
proper,
it will also be available in future distro kernels and we can eventually
stop supporting the out-of-tree code.  I expect that to be 3+ years away.

>Is it the fact that the server code isn't in the kernel?

Not really.  Lustre servers are on separate servers with vendor kernels
(ancient by your standards), and there isn't really any demand for
using newer kernels on those nodes.  Most importantly, the out-of-tree
branch is where all of the new feature development is being done.  That
also means if feature patches don't land into the kernel it just makes
the kernel version less attractive for users.

>  Should that be added now too so that we can get a proper view of what
> can and can not be changed?  Some of your patches are showing that things
> are shared by the two chunks of code, so does that mean if I delete
>things in the client code that don't look to be used by anything, you
> will have problems because the server now breaks?

Adding the server code to staging would mean another 150kLOC in staging.
We also haven't cleaned that code up even nearly as much as the client
code, so it isn't really even ready to go into staging either.  I don't
think you or anyone else would be happy to see that code yet, so I don't
think there is a real advantage to do so.

Deleting unused code in the kernel client isn't fatal, since we'll
still have the out-of-tree version, but we're trying to keep the code in
sync as much as possible so that it is easier to port patches in both
directions.  If code is deleted it means more that needs to be added
back later when the server eventually does get merged, and more effort
to resolve patch conflicts.

>I think it's time to just merge the server and deal with the whole thing
>all at once, otherwise this dependancy on your external tree, and
>external code to the kernel, is going to doom the ability to ever get
>this code cleaned up and merged properly.

Regardless of whether the Lustre server is added to the kernel or not,
we need to maintain support for the older kernels, so there will need
to be an external tree for years to come until Lustre is available in
vendor kernels.  I'm sure we can't merge in the kernel-version-portable
code into the kernel, so there isn't really any choice but to try and
keep both trees in sync.

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division



^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-15  4:13   ` Greg Kroah-Hartman
  2013-11-15 10:22     ` Peng Tao
@ 2013-11-16 11:20     ` Dilger, Andreas
  2013-11-16 19:50       ` Greg Kroah-Hartman
  1 sibling, 1 reply; 67+ messages in thread
From: Dilger, Andreas @ 2013-11-16 11:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Peng Tao; +Cc: linux-kernel, Hammond, John

On 2013/11/14 9:13 PM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
wrote:

>On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
>> From: "John L. Hammond" <john.hammond@intel.com>
>> 
>> Add a const void *h_owner member to struct portals_handle. Add a const
>> void *owner parameter to class_handle2object() which must be matched
>> by the h_owner member of the handle in addition to the cookie.
>
>Ick ick ick.
>
>NEVER use a void pointer if you can help it, and for a "handle", never.
>This isn't other operating systems, sorry.  We know what types our
>pointers to structures are, use them, so that the compiler can catch our
>problems, and don't try to cheat by using void *.

The portals_handle is used as a generic type for objects referenced over
the network, like a file handle.  The "owner" parameter is just used as
an extra check that the cookie passed from the client is actually a
valid value for the code in which it is being used (e.g. metadata or
data object).  It isn't actually dereferenced by anything, it is just
like a magic value.

>> Adjust
>> the callers of class_handle2object() accordingly, using NULL as the
>> argument to the owner parameter, except in the case of
>> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
>> which we use as the owner when searching for a MFD. When allocating a
>> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
>> store it in h_owner. This allows the MDT to validate that the client
>> has not sent the wrong open handle cookie, or sent the right cookie to
>> the wrong MDT.
>
>This changelog entry doesn't even match up with the code below.  ALl
>callers of class_handle2object are passing NULL here, which makes this
>patch pretty pointless, right?

As Tao wrote, this is the patch summary that matches what was committed
in our own tree and in this case mostly describes the changes made on the
server.  Keeping the same commits and comments in both trees makes it
easier to keep the code in sync.

>And that's a _very_ generic global symbol name, please don't do that, it
>needs to be "lustre_*" at the front to even expect it to be acceptable.

There's already something else called a "lustre_handle", so what about
obdclass_handle2object(), since this is in the obdclass module?

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division



^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-16 11:20     ` Dilger, Andreas
@ 2013-11-16 19:50       ` Greg Kroah-Hartman
  2013-11-18  2:36         ` Peng Tao
  0 siblings, 1 reply; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-16 19:50 UTC (permalink / raw)
  To: Dilger, Andreas; +Cc: Peng Tao, linux-kernel, Hammond, John

On Sat, Nov 16, 2013 at 11:20:37AM +0000, Dilger, Andreas wrote:
> On 2013/11/14 9:13 PM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
> wrote:
> 
> >On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
> >> From: "John L. Hammond" <john.hammond@intel.com>
> >> 
> >> Add a const void *h_owner member to struct portals_handle. Add a const
> >> void *owner parameter to class_handle2object() which must be matched
> >> by the h_owner member of the handle in addition to the cookie.
> >
> >Ick ick ick.
> >
> >NEVER use a void pointer if you can help it, and for a "handle", never.
> >This isn't other operating systems, sorry.  We know what types our
> >pointers to structures are, use them, so that the compiler can catch our
> >problems, and don't try to cheat by using void *.
> 
> The portals_handle is used as a generic type for objects referenced over
> the network, like a file handle.  The "owner" parameter is just used as
> an extra check that the cookie passed from the client is actually a
> valid value for the code in which it is being used (e.g. metadata or
> data object).  It isn't actually dereferenced by anything, it is just
> like a magic value.

Then make it an explicit type, not a void *.

> >> Adjust
> >> the callers of class_handle2object() accordingly, using NULL as the
> >> argument to the owner parameter, except in the case of
> >> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
> >> which we use as the owner when searching for a MFD. When allocating a
> >> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
> >> store it in h_owner. This allows the MDT to validate that the client
> >> has not sent the wrong open handle cookie, or sent the right cookie to
> >> the wrong MDT.
> >
> >This changelog entry doesn't even match up with the code below.  ALl
> >callers of class_handle2object are passing NULL here, which makes this
> >patch pretty pointless, right?
> 
> As Tao wrote, this is the patch summary that matches what was committed
> in our own tree and in this case mostly describes the changes made on the
> server.  Keeping the same commits and comments in both trees makes it
> easier to keep the code in sync.

Ok, but as it is, this patch does nothing to the client code, so how can
I accept it?  A function that is only ever called with NULL as an option
is ripe for cleanup in my eyes.

> >And that's a _very_ generic global symbol name, please don't do that, it
> >needs to be "lustre_*" at the front to even expect it to be acceptable.
> 
> There's already something else called a "lustre_handle", so what about
> obdclass_handle2object(), since this is in the obdclass module?

I don't care, but make it unique to the lustre subsystem somehow.
Haveing it start with "class_" looks like it belongs to the driver core
which isn't ok.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-16 10:36     ` Dilger, Andreas
@ 2013-11-16 19:59       ` Greg Kroah-Hartman
  2013-11-18  3:07         ` Peng Tao
  0 siblings, 1 reply; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-16 19:59 UTC (permalink / raw)
  To: Dilger, Andreas; +Cc: Peng Tao, linux-kernel

On Sat, Nov 16, 2013 at 10:36:18AM +0000, Dilger, Andreas wrote:
> >So, sorry, I have to stop here at this series.  I've applied the first 3
> >to the opw-next branch of staging.git so they can live somewhere until
> >3.13-rc1 is out.
> >
> >I know you spent a lot of time making these 120 patches to send me, but
> >that too is crazy.  You shouldn't wait that long to get feedback and
> >send patches to me at all.  Please send them in smaller series, with
> >less time between patch submissions.
> 
> The reason that there are so many patches in a burst is that we are also
> developing new features and fixing bugs in parallel with the kernel, but
> they also need to be tested a lot before they are released.  It's not any
> different from kernel patches testing on -next or -mm for a few months
> before they are pushed to the mainline kernel in a batch.

It's very different in that you are expecting me to suddenly accept this
huge burst of patches all at once, and I can't review them properly.  As
this patch series shows, there was a problem in the 4rd patch in a 100+
series, which means that I couldn't take them all.  So you wasted a
bunch of work preparing those other 96 patches.

Send them to me in short chunks, that way you don't waste time, and you
don't take so long between patch acceptance.

> The out-of-tree development can't really stop for a year while the kernel
> client code is cleaned up.  If the feature patches don't land into the
> kernel client for a year (or however long it takes to do all the cleanup),
> then it will become outdated and the whole reason for adding the client
> into the kernel is lost.

But you understand my hesitation at taking any new features when there
really has not been any attempt by your team to do much cleanup and
fixes of the code at all, right?  It feels like I have done more cleanup
personally than anyone on your teams, is this not true?

So, why would I believe that you all are really going to start doing the
major cleanup work on this code that is so obviously needed?  Why would
I take new features that you are spending your time on instead?

> >> There are many users of the external tree so we cannot just abandon
> >> it, especially that the upstream client is not shipped in any
> >> distribution yet. Thanks for your understanding.
> >
> >What is keeping people using that tree?  Support for older/distro
> >kernels?
> >
> 
> Support for distro kernels is a big part of it.  Most HPC sites use vendor
> kernels of various vintages, and we need to keep the code working for those
> sites.  We also need to continue developing features needed by ever-larger
> clusters, fixing bugs, etc.  Eventually, when Lustre is in the kernel
> proper,
> it will also be available in future distro kernels and we can eventually
> stop supporting the out-of-tree code.  I expect that to be 3+ years away.

So, originally you said it would take about a year to get this out of
staging.  It's been 6 months already with no real progress made with the
exception of having interns do cleanup and me doing some basic wrapper
function removals.  What's the plan for the next 6 months?  Based on
these 100+ patches, I don't see any major changes that are happening to
get this cleaned up properly (or did I miss those patches in this huge
patchbomb?)

> >Is it the fact that the server code isn't in the kernel?
> 
> Not really.  Lustre servers are on separate servers with vendor kernels
> (ancient by your standards), and there isn't really any demand for
> using newer kernels on those nodes.  Most importantly, the out-of-tree
> branch is where all of the new feature development is being done.  That
> also means if feature patches don't land into the kernel it just makes
> the kernel version less attractive for users.
> 
> >  Should that be added now too so that we can get a proper view of what
> > can and can not be changed?  Some of your patches are showing that things
> > are shared by the two chunks of code, so does that mean if I delete
> >things in the client code that don't look to be used by anything, you
> > will have problems because the server now breaks?
> 
> Adding the server code to staging would mean another 150kLOC in staging.
> We also haven't cleaned that code up even nearly as much as the client
> code, so it isn't really even ready to go into staging either.  I don't
> think you or anyone else would be happy to see that code yet, so I don't
> think there is a real advantage to do so.
> 
> Deleting unused code in the kernel client isn't fatal, since we'll
> still have the out-of-tree version, but we're trying to keep the code in
> sync as much as possible so that it is easier to port patches in both
> directions.  If code is deleted it means more that needs to be added
> back later when the server eventually does get merged, and more effort
> to resolve patch conflicts.

But look at the function that caused this original question to be asked.
You want an api change that I would never accept as it doesn't do
anything to the client code at all.  How do I "know" this is acceptable
and needed if I can't see the server side?  How would I "know" not to
take a cleanup patch that reverted this change if I can't see the server
side at the same time?

150k of code is not a big deal to me, I can easily take it.  If it means
that we have a real chance at getting this all fixed up properly, I say
to do it, otherwise I really don't think this project will ever succeed,
sorry.

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-16 19:50       ` Greg Kroah-Hartman
@ 2013-11-18  2:36         ` Peng Tao
  2013-11-18  4:35           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-18  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Dilger, Andreas, linux-kernel, Hammond, John

On Sun, Nov 17, 2013 at 3:50 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Sat, Nov 16, 2013 at 11:20:37AM +0000, Dilger, Andreas wrote:
>> On 2013/11/14 9:13 PM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
>> wrote:
>>
>> >On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
>> >> From: "John L. Hammond" <john.hammond@intel.com>
>> >>
>> >> Add a const void *h_owner member to struct portals_handle. Add a const
>> >> void *owner parameter to class_handle2object() which must be matched
>> >> by the h_owner member of the handle in addition to the cookie.
>> >
>> >Ick ick ick.
>> >
>> >NEVER use a void pointer if you can help it, and for a "handle", never.
>> >This isn't other operating systems, sorry.  We know what types our
>> >pointers to structures are, use them, so that the compiler can catch our
>> >problems, and don't try to cheat by using void *.
>>
>> The portals_handle is used as a generic type for objects referenced over
>> the network, like a file handle.  The "owner" parameter is just used as
>> an extra check that the cookie passed from the client is actually a
>> valid value for the code in which it is being used (e.g. metadata or
>> data object).  It isn't actually dereferenced by anything, it is just
>> like a magic value.
>
> Then make it an explicit type, not a void *.
>
>> >> Adjust
>> >> the callers of class_handle2object() accordingly, using NULL as the
>> >> argument to the owner parameter, except in the case of
>> >> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
>> >> which we use as the owner when searching for a MFD. When allocating a
>> >> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
>> >> store it in h_owner. This allows the MDT to validate that the client
>> >> has not sent the wrong open handle cookie, or sent the right cookie to
>> >> the wrong MDT.
>> >
>> >This changelog entry doesn't even match up with the code below.  ALl
>> >callers of class_handle2object are passing NULL here, which makes this
>> >patch pretty pointless, right?
>>
>> As Tao wrote, this is the patch summary that matches what was committed
>> in our own tree and in this case mostly describes the changes made on the
>> server.  Keeping the same commits and comments in both trees makes it
>> easier to keep the code in sync.
>
> Ok, but as it is, this patch does nothing to the client code, so how can
> I accept it?  A function that is only ever called with NULL as an option
> is ripe for cleanup in my eyes.
>
How about adding a comment above the function to note that this extra
argument is used by server code and please don't remove it?

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-16 19:59       ` Greg Kroah-Hartman
@ 2013-11-18  3:07         ` Peng Tao
  2013-11-18  4:39           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-18  3:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Dilger, Andreas, linux-kernel

On Sun, Nov 17, 2013 at 3:59 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Sat, Nov 16, 2013 at 10:36:18AM +0000, Dilger, Andreas wrote:
>> >So, sorry, I have to stop here at this series.  I've applied the first 3
>> >to the opw-next branch of staging.git so they can live somewhere until
>> >3.13-rc1 is out.
>> >
>> >I know you spent a lot of time making these 120 patches to send me, but
>> >that too is crazy.  You shouldn't wait that long to get feedback and
>> >send patches to me at all.  Please send them in smaller series, with
>> >less time between patch submissions.
>>
>> The reason that there are so many patches in a burst is that we are also
>> developing new features and fixing bugs in parallel with the kernel, but
>> they also need to be tested a lot before they are released.  It's not any
>> different from kernel patches testing on -next or -mm for a few months
>> before they are pushed to the mainline kernel in a batch.
>
> It's very different in that you are expecting me to suddenly accept this
> huge burst of patches all at once, and I can't review them properly.  As
> this patch series shows, there was a problem in the 4rd patch in a 100+
> series, which means that I couldn't take them all.  So you wasted a
> bunch of work preparing those other 96 patches.
>
> Send them to me in short chunks, that way you don't waste time, and you
> don't take so long between patch acceptance.
>
working on it. sorry for the noise.

>> The out-of-tree development can't really stop for a year while the kernel
>> client code is cleaned up.  If the feature patches don't land into the
>> kernel client for a year (or however long it takes to do all the cleanup),
>> then it will become outdated and the whole reason for adding the client
>> into the kernel is lost.
>
> But you understand my hesitation at taking any new features when there
> really has not been any attempt by your team to do much cleanup and
> fixes of the code at all, right?  It feels like I have done more cleanup
> personally than anyone on your teams, is this not true?
>
> So, why would I believe that you all are really going to start doing the
> major cleanup work on this code that is so obviously needed?  Why would
> I take new features that you are spending your time on instead?
>
My apologize. I was distracted by other projects for some time and I
am trying to make it up. I thought you agreed that I can first close
the patch gap between in-kernel client and external tree, and then add
cleanup patches, no?

>> >> There are many users of the external tree so we cannot just abandon
>> >> it, especially that the upstream client is not shipped in any
>> >> distribution yet. Thanks for your understanding.
>> >
>> >What is keeping people using that tree?  Support for older/distro
>> >kernels?
>> >
>>
>> Support for distro kernels is a big part of it.  Most HPC sites use vendor
>> kernels of various vintages, and we need to keep the code working for those
>> sites.  We also need to continue developing features needed by ever-larger
>> clusters, fixing bugs, etc.  Eventually, when Lustre is in the kernel
>> proper,
>> it will also be available in future distro kernels and we can eventually
>> stop supporting the out-of-tree code.  I expect that to be 3+ years away.
>
> So, originally you said it would take about a year to get this out of
> staging.  It's been 6 months already with no real progress made with the
> exception of having interns do cleanup and me doing some basic wrapper
> function removals.  What's the plan for the next 6 months?  Based on
> these 100+ patches, I don't see any major changes that are happening to
> get this cleaned up properly (or did I miss those patches in this huge
> patchbomb?)
>
sorry for sending you these patch bombs. I wanted to first sync with
external tree because otherwise the gap will be increasingly large and
eventually we end up with two different code base.

>> >Is it the fact that the server code isn't in the kernel?
>>
>> Not really.  Lustre servers are on separate servers with vendor kernels
>> (ancient by your standards), and there isn't really any demand for
>> using newer kernels on those nodes.  Most importantly, the out-of-tree
>> branch is where all of the new feature development is being done.  That
>> also means if feature patches don't land into the kernel it just makes
>> the kernel version less attractive for users.
>>
>> >  Should that be added now too so that we can get a proper view of what
>> > can and can not be changed?  Some of your patches are showing that things
>> > are shared by the two chunks of code, so does that mean if I delete
>> >things in the client code that don't look to be used by anything, you
>> > will have problems because the server now breaks?
>>
>> Adding the server code to staging would mean another 150kLOC in staging.
>> We also haven't cleaned that code up even nearly as much as the client
>> code, so it isn't really even ready to go into staging either.  I don't
>> think you or anyone else would be happy to see that code yet, so I don't
>> think there is a real advantage to do so.
>>
>> Deleting unused code in the kernel client isn't fatal, since we'll
>> still have the out-of-tree version, but we're trying to keep the code in
>> sync as much as possible so that it is easier to port patches in both
>> directions.  If code is deleted it means more that needs to be added
>> back later when the server eventually does get merged, and more effort
>> to resolve patch conflicts.
>
> But look at the function that caused this original question to be asked.
> You want an api change that I would never accept as it doesn't do
> anything to the client code at all.  How do I "know" this is acceptable
> and needed if I can't see the server side?  How would I "know" not to
> take a cleanup patch that reverted this change if I can't see the server
> side at the same time?
>
How about CCing Andreas and me for patches changing
drivers/staging/lustre? If such reverting patch appears, we can send
NACKs, right? Andreas reviews most of patches landed in both in kernel
client and in the external tree. He for sure knows if a cleanup patch
drops something that server needs. Also I'll try to add comments on
things that are used only by server but client also needs. Does it
sound working?

> 150k of code is not a big deal to me, I can easily take it.  If it means
> that we have a real chance at getting this all fixed up properly, I say
> to do it, otherwise I really don't think this project will ever succeed,
> sorry.
>
The issue with server code is that it is not even updated to support
latest kernel. Also work is still going on to get rid of extra patches
to generic kernel code. And there is the issue of ldiskfs that is
mostly the same code as ext4 but with modifications. I am not even
sure if Ted is willing to take these extra lustre patches. I mean, if
they can be merged, they should have been sent upstream and merged
long ago. Is it correct, Andreas?

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-18  2:36         ` Peng Tao
@ 2013-11-18  4:35           ` Greg Kroah-Hartman
  2013-11-18  6:18             ` Peng Tao
  0 siblings, 1 reply; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-18  4:35 UTC (permalink / raw)
  To: Peng Tao; +Cc: Dilger, Andreas, linux-kernel, Hammond, John

On Mon, Nov 18, 2013 at 10:36:26AM +0800, Peng Tao wrote:
> On Sun, Nov 17, 2013 at 3:50 AM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > On Sat, Nov 16, 2013 at 11:20:37AM +0000, Dilger, Andreas wrote:
> >> On 2013/11/14 9:13 PM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
> >> wrote:
> >>
> >> >On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
> >> >> From: "John L. Hammond" <john.hammond@intel.com>
> >> >>
> >> >> Add a const void *h_owner member to struct portals_handle. Add a const
> >> >> void *owner parameter to class_handle2object() which must be matched
> >> >> by the h_owner member of the handle in addition to the cookie.
> >> >
> >> >Ick ick ick.
> >> >
> >> >NEVER use a void pointer if you can help it, and for a "handle", never.
> >> >This isn't other operating systems, sorry.  We know what types our
> >> >pointers to structures are, use them, so that the compiler can catch our
> >> >problems, and don't try to cheat by using void *.
> >>
> >> The portals_handle is used as a generic type for objects referenced over
> >> the network, like a file handle.  The "owner" parameter is just used as
> >> an extra check that the cookie passed from the client is actually a
> >> valid value for the code in which it is being used (e.g. metadata or
> >> data object).  It isn't actually dereferenced by anything, it is just
> >> like a magic value.
> >
> > Then make it an explicit type, not a void *.
> >
> >> >> Adjust
> >> >> the callers of class_handle2object() accordingly, using NULL as the
> >> >> argument to the owner parameter, except in the case of
> >> >> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
> >> >> which we use as the owner when searching for a MFD. When allocating a
> >> >> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
> >> >> store it in h_owner. This allows the MDT to validate that the client
> >> >> has not sent the wrong open handle cookie, or sent the right cookie to
> >> >> the wrong MDT.
> >> >
> >> >This changelog entry doesn't even match up with the code below.  ALl
> >> >callers of class_handle2object are passing NULL here, which makes this
> >> >patch pretty pointless, right?
> >>
> >> As Tao wrote, this is the patch summary that matches what was committed
> >> in our own tree and in this case mostly describes the changes made on the
> >> server.  Keeping the same commits and comments in both trees makes it
> >> easier to keep the code in sync.
> >
> > Ok, but as it is, this patch does nothing to the client code, so how can
> > I accept it?  A function that is only ever called with NULL as an option
> > is ripe for cleanup in my eyes.
> >
> How about adding a comment above the function to note that this extra
> argument is used by server code and please don't remove it?

How about adding the server code to the kernel to keep problems like
this (which will continue to crop up, it's not just this one function,
right?) from happening in the future?

In-kernel code does not depend on out-of-kernel code, it's that simple,
and has been a rule for kernel code for forever.  Either deal with the
fact that you will have to keep the apis and functions working for your
out-of-tree code, or put all the code into the kernel.  Don't force
in-kernel code to deal with out-of-tree code as there is NO way that
anyone other than the very few of you, can deal with that at all.

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-18  3:07         ` Peng Tao
@ 2013-11-18  4:39           ` Greg Kroah-Hartman
  2013-11-18  6:07             ` Peng Tao
  0 siblings, 1 reply; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-18  4:39 UTC (permalink / raw)
  To: Peng Tao; +Cc: Dilger, Andreas, linux-kernel

On Mon, Nov 18, 2013 at 11:07:12AM +0800, Peng Tao wrote:
> On Sun, Nov 17, 2013 at 3:59 AM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > On Sat, Nov 16, 2013 at 10:36:18AM +0000, Dilger, Andreas wrote:
> >> >So, sorry, I have to stop here at this series.  I've applied the first 3
> >> >to the opw-next branch of staging.git so they can live somewhere until
> >> >3.13-rc1 is out.
> >> >
> >> >I know you spent a lot of time making these 120 patches to send me, but
> >> >that too is crazy.  You shouldn't wait that long to get feedback and
> >> >send patches to me at all.  Please send them in smaller series, with
> >> >less time between patch submissions.
> >>
> >> The reason that there are so many patches in a burst is that we are also
> >> developing new features and fixing bugs in parallel with the kernel, but
> >> they also need to be tested a lot before they are released.  It's not any
> >> different from kernel patches testing on -next or -mm for a few months
> >> before they are pushed to the mainline kernel in a batch.
> >
> > It's very different in that you are expecting me to suddenly accept this
> > huge burst of patches all at once, and I can't review them properly.  As
> > this patch series shows, there was a problem in the 4rd patch in a 100+
> > series, which means that I couldn't take them all.  So you wasted a
> > bunch of work preparing those other 96 patches.
> >
> > Send them to me in short chunks, that way you don't waste time, and you
> > don't take so long between patch acceptance.
> >
> working on it. sorry for the noise.
> 
> >> The out-of-tree development can't really stop for a year while the kernel
> >> client code is cleaned up.  If the feature patches don't land into the
> >> kernel client for a year (or however long it takes to do all the cleanup),
> >> then it will become outdated and the whole reason for adding the client
> >> into the kernel is lost.
> >
> > But you understand my hesitation at taking any new features when there
> > really has not been any attempt by your team to do much cleanup and
> > fixes of the code at all, right?  It feels like I have done more cleanup
> > personally than anyone on your teams, is this not true?
> >
> > So, why would I believe that you all are really going to start doing the
> > major cleanup work on this code that is so obviously needed?  Why would
> > I take new features that you are spending your time on instead?
> >
> My apologize. I was distracted by other projects for some time and I
> am trying to make it up. I thought you agreed that I can first close
> the patch gap between in-kernel client and external tree, and then add
> cleanup patches, no?

What exactly is "the gap"?  Are we talking 200 patches?  10?  100?  I
have no idea what you are referring to, but realize I can't review 100
patches at a single time easily.

And what happens when I accep these "gap" patches?  Will development in
this external tree suddenly stop?  It sure doesn't seem like this is
happening as I thought it would have already since the code has been in
the kernel for over 6 months now.

> >> >> There are many users of the external tree so we cannot just abandon
> >> >> it, especially that the upstream client is not shipped in any
> >> >> distribution yet. Thanks for your understanding.
> >> >
> >> >What is keeping people using that tree?  Support for older/distro
> >> >kernels?
> >> >
> >>
> >> Support for distro kernels is a big part of it.  Most HPC sites use vendor
> >> kernels of various vintages, and we need to keep the code working for those
> >> sites.  We also need to continue developing features needed by ever-larger
> >> clusters, fixing bugs, etc.  Eventually, when Lustre is in the kernel
> >> proper,
> >> it will also be available in future distro kernels and we can eventually
> >> stop supporting the out-of-tree code.  I expect that to be 3+ years away.
> >
> > So, originally you said it would take about a year to get this out of
> > staging.  It's been 6 months already with no real progress made with the
> > exception of having interns do cleanup and me doing some basic wrapper
> > function removals.  What's the plan for the next 6 months?  Based on
> > these 100+ patches, I don't see any major changes that are happening to
> > get this cleaned up properly (or did I miss those patches in this huge
> > patchbomb?)
> >
> sorry for sending you these patch bombs. I wanted to first sync with
> external tree because otherwise the gap will be increasingly large and
> eventually we end up with two different code base.

You already have 2 different code bases :(

> >> >Is it the fact that the server code isn't in the kernel?
> >>
> >> Not really.  Lustre servers are on separate servers with vendor kernels
> >> (ancient by your standards), and there isn't really any demand for
> >> using newer kernels on those nodes.  Most importantly, the out-of-tree
> >> branch is where all of the new feature development is being done.  That
> >> also means if feature patches don't land into the kernel it just makes
> >> the kernel version less attractive for users.
> >>
> >> >  Should that be added now too so that we can get a proper view of what
> >> > can and can not be changed?  Some of your patches are showing that things
> >> > are shared by the two chunks of code, so does that mean if I delete
> >> >things in the client code that don't look to be used by anything, you
> >> > will have problems because the server now breaks?
> >>
> >> Adding the server code to staging would mean another 150kLOC in staging.
> >> We also haven't cleaned that code up even nearly as much as the client
> >> code, so it isn't really even ready to go into staging either.  I don't
> >> think you or anyone else would be happy to see that code yet, so I don't
> >> think there is a real advantage to do so.
> >>
> >> Deleting unused code in the kernel client isn't fatal, since we'll
> >> still have the out-of-tree version, but we're trying to keep the code in
> >> sync as much as possible so that it is easier to port patches in both
> >> directions.  If code is deleted it means more that needs to be added
> >> back later when the server eventually does get merged, and more effort
> >> to resolve patch conflicts.
> >
> > But look at the function that caused this original question to be asked.
> > You want an api change that I would never accept as it doesn't do
> > anything to the client code at all.  How do I "know" this is acceptable
> > and needed if I can't see the server side?  How would I "know" not to
> > take a cleanup patch that reverted this change if I can't see the server
> > side at the same time?
> >
> How about CCing Andreas and me for patches changing
> drivers/staging/lustre?

I have been trying to do that, but not all developers follow that.
Also, not all developers in the overall kernel do that either, for good
reasons (i.e. api changes.)

> If such reverting patch appears, we can send NACKs, right?

Not always, no.  Being a maintainer doesn't mean that you have to have
final approval on every change to your codebase, this is a long-time
thing you all should know quite well.

> Andreas reviews most of patches landed in both in kernel client and in
> the external tree. He for sure knows if a cleanup patch drops
> something that server needs. Also I'll try to add comments on things
> that are used only by server but client also needs. Does it sound
> working?

Does the current way of working seem to work for you?

> > 150k of code is not a big deal to me, I can easily take it.  If it means
> > that we have a real chance at getting this all fixed up properly, I say
> > to do it, otherwise I really don't think this project will ever succeed,
> > sorry.
> >
> The issue with server code is that it is not even updated to support
> latest kernel.

Then how are you even testing this stuff?

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-18  4:39           ` Greg Kroah-Hartman
@ 2013-11-18  6:07             ` Peng Tao
  2013-11-18 13:52               ` Greg Kroah-Hartman
  0 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-18  6:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Dilger, Andreas, linux-kernel, faibish, sorin

[CC Sorin since he might be interested in such discussion.]

On Mon, Nov 18, 2013 at 12:39 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Mon, Nov 18, 2013 at 11:07:12AM +0800, Peng Tao wrote:
>> On Sun, Nov 17, 2013 at 3:59 AM, Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>> > On Sat, Nov 16, 2013 at 10:36:18AM +0000, Dilger, Andreas wrote:
>> >> >So, sorry, I have to stop here at this series.  I've applied the first 3
>> >> >to the opw-next branch of staging.git so they can live somewhere until
>> >> >3.13-rc1 is out.
>> >> >
>> >> >I know you spent a lot of time making these 120 patches to send me, but
>> >> >that too is crazy.  You shouldn't wait that long to get feedback and
>> >> >send patches to me at all.  Please send them in smaller series, with
>> >> >less time between patch submissions.
>> >>
>> >> The reason that there are so many patches in a burst is that we are also
>> >> developing new features and fixing bugs in parallel with the kernel, but
>> >> they also need to be tested a lot before they are released.  It's not any
>> >> different from kernel patches testing on -next or -mm for a few months
>> >> before they are pushed to the mainline kernel in a batch.
>> >
>> > It's very different in that you are expecting me to suddenly accept this
>> > huge burst of patches all at once, and I can't review them properly.  As
>> > this patch series shows, there was a problem in the 4rd patch in a 100+
>> > series, which means that I couldn't take them all.  So you wasted a
>> > bunch of work preparing those other 96 patches.
>> >
>> > Send them to me in short chunks, that way you don't waste time, and you
>> > don't take so long between patch acceptance.
>> >
>> working on it. sorry for the noise.
>>
>> >> The out-of-tree development can't really stop for a year while the kernel
>> >> client code is cleaned up.  If the feature patches don't land into the
>> >> kernel client for a year (or however long it takes to do all the cleanup),
>> >> then it will become outdated and the whole reason for adding the client
>> >> into the kernel is lost.
>> >
>> > But you understand my hesitation at taking any new features when there
>> > really has not been any attempt by your team to do much cleanup and
>> > fixes of the code at all, right?  It feels like I have done more cleanup
>> > personally than anyone on your teams, is this not true?
>> >
>> > So, why would I believe that you all are really going to start doing the
>> > major cleanup work on this code that is so obviously needed?  Why would
>> > I take new features that you are spending your time on instead?
>> >
>> My apologize. I was distracted by other projects for some time and I
>> am trying to make it up. I thought you agreed that I can first close
>> the patch gap between in-kernel client and external tree, and then add
>> cleanup patches, no?
>
> What exactly is "the gap"?  Are we talking 200 patches?  10?  100?  I
> have no idea what you are referring to, but realize I can't review 100
> patches at a single time easily.
>
Right now, I think it is about 150 patches. I will split them into
smaller chunks (10 patches or in each patchset) and send them
incrementally.

> And what happens when I accep these "gap" patches?  Will development in
> this external tree suddenly stop?  It sure doesn't seem like this is
> happening as I thought it would have already since the code has been in
> the kernel for over 6 months now.
>
No, it won't stop. The biggest feature coming in this time is HSM. In
future, I think there will be less features and more bug fixes.
Porting patches from external tree isn't just to make the two tree in
sync. External tree contains many bug fix patches that in kernel
client wants as well. If we stop porting, as time goes by, the in tree
client will be less featured, and more buggy than external tree. When
we get it out of staging like that, it will be very hard to persuade
people to use it.

>> >> >> There are many users of the external tree so we cannot just abandon
>> >> >> it, especially that the upstream client is not shipped in any
>> >> >> distribution yet. Thanks for your understanding.
>> >> >
>> >> >What is keeping people using that tree?  Support for older/distro
>> >> >kernels?
>> >> >
>> >>
>> >> Support for distro kernels is a big part of it.  Most HPC sites use vendor
>> >> kernels of various vintages, and we need to keep the code working for those
>> >> sites.  We also need to continue developing features needed by ever-larger
>> >> clusters, fixing bugs, etc.  Eventually, when Lustre is in the kernel
>> >> proper,
>> >> it will also be available in future distro kernels and we can eventually
>> >> stop supporting the out-of-tree code.  I expect that to be 3+ years away.
>> >
>> > So, originally you said it would take about a year to get this out of
>> > staging.  It's been 6 months already with no real progress made with the
>> > exception of having interns do cleanup and me doing some basic wrapper
>> > function removals.  What's the plan for the next 6 months?  Based on
>> > these 100+ patches, I don't see any major changes that are happening to
>> > get this cleaned up properly (or did I miss those patches in this huge
>> > patchbomb?)
>> >
>> sorry for sending you these patch bombs. I wanted to first sync with
>> external tree because otherwise the gap will be increasingly large and
>> eventually we end up with two different code base.
>
> You already have 2 different code bases :(
we still want to make the difference as small as possible. So it won't
cost too much effort to port patches between them.

>
>> >> >Is it the fact that the server code isn't in the kernel?
>> >>
>> >> Not really.  Lustre servers are on separate servers with vendor kernels
>> >> (ancient by your standards), and there isn't really any demand for
>> >> using newer kernels on those nodes.  Most importantly, the out-of-tree
>> >> branch is where all of the new feature development is being done.  That
>> >> also means if feature patches don't land into the kernel it just makes
>> >> the kernel version less attractive for users.
>> >>
>> >> >  Should that be added now too so that we can get a proper view of what
>> >> > can and can not be changed?  Some of your patches are showing that things
>> >> > are shared by the two chunks of code, so does that mean if I delete
>> >> >things in the client code that don't look to be used by anything, you
>> >> > will have problems because the server now breaks?
>> >>
>> >> Adding the server code to staging would mean another 150kLOC in staging.
>> >> We also haven't cleaned that code up even nearly as much as the client
>> >> code, so it isn't really even ready to go into staging either.  I don't
>> >> think you or anyone else would be happy to see that code yet, so I don't
>> >> think there is a real advantage to do so.
>> >>
>> >> Deleting unused code in the kernel client isn't fatal, since we'll
>> >> still have the out-of-tree version, but we're trying to keep the code in
>> >> sync as much as possible so that it is easier to port patches in both
>> >> directions.  If code is deleted it means more that needs to be added
>> >> back later when the server eventually does get merged, and more effort
>> >> to resolve patch conflicts.
>> >
>> > But look at the function that caused this original question to be asked.
>> > You want an api change that I would never accept as it doesn't do
>> > anything to the client code at all.  How do I "know" this is acceptable
>> > and needed if I can't see the server side?  How would I "know" not to
>> > take a cleanup patch that reverted this change if I can't see the server
>> > side at the same time?
>> >
>> How about CCing Andreas and me for patches changing
>> drivers/staging/lustre?
>
> I have been trying to do that, but not all developers follow that.
> Also, not all developers in the overall kernel do that either, for good
> reasons (i.e. api changes.)
>
>> If such reverting patch appears, we can send NACKs, right?
>
> Not always, no.  Being a maintainer doesn't mean that you have to have
> final approval on every change to your codebase, this is a long-time
> thing you all should know quite well.
I see that we don't have final approval. But at least we can send
review comments and raise objections, right? That's my understanding
of how kernel development on public mailing list works, every one gets
the chance to voice his/her opinions. :)

>
>> Andreas reviews most of patches landed in both in kernel client and in
>> the external tree. He for sure knows if a cleanup patch drops
>> something that server needs. Also I'll try to add comments on things
>> that are used only by server but client also needs. Does it sound
>> working?
>
> Does the current way of working seem to work for you?
>
I think it is working but needs improvement. All my patches were sent
to Andreas for inspection before sending out public. We might need to
speed it up. Andreas, what do you think of that I just send patches
public and you do the inspection on mailing list? That way Greg and
others can review and comment at the same time.

After closing the current patch gap, I can work on cleaning up and
most cleanups are quite obvious.

>> > 150k of code is not a big deal to me, I can easily take it.  If it means
>> > that we have a real chance at getting this all fixed up properly, I say
>> > to do it, otherwise I really don't think this project will ever succeed,
>> > sorry.
>> >
>> The issue with server code is that it is not even updated to support
>> latest kernel.
>
> Then how are you even testing this stuff?
We can set up Lustre server with supported vendor kernels. For details
if you are interested, please see
https://wiki.hpdd.intel.com/display/PUB/Putting+together+a+Lustre+filesystem

It is quite common that Lustre client and server use different kernel
versions. Since we only have client code in kernel, we can always test
latest kernel client against a running Lustre server regardless of the
server kernel version.

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-18  4:35           ` Greg Kroah-Hartman
@ 2013-11-18  6:18             ` Peng Tao
  2013-11-18  8:57               ` Dilger, Andreas
  0 siblings, 1 reply; 67+ messages in thread
From: Peng Tao @ 2013-11-18  6:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Dilger, Andreas, linux-kernel, Hammond, John

On Mon, Nov 18, 2013 at 12:35 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Mon, Nov 18, 2013 at 10:36:26AM +0800, Peng Tao wrote:
>> On Sun, Nov 17, 2013 at 3:50 AM, Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>> > On Sat, Nov 16, 2013 at 11:20:37AM +0000, Dilger, Andreas wrote:
>> >> On 2013/11/14 9:13 PM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
>> >> wrote:
>> >>
>> >> >On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
>> >> >> From: "John L. Hammond" <john.hammond@intel.com>
>> >> >>
>> >> >> Add a const void *h_owner member to struct portals_handle. Add a const
>> >> >> void *owner parameter to class_handle2object() which must be matched
>> >> >> by the h_owner member of the handle in addition to the cookie.
>> >> >
>> >> >Ick ick ick.
>> >> >
>> >> >NEVER use a void pointer if you can help it, and for a "handle", never.
>> >> >This isn't other operating systems, sorry.  We know what types our
>> >> >pointers to structures are, use them, so that the compiler can catch our
>> >> >problems, and don't try to cheat by using void *.
>> >>
>> >> The portals_handle is used as a generic type for objects referenced over
>> >> the network, like a file handle.  The "owner" parameter is just used as
>> >> an extra check that the cookie passed from the client is actually a
>> >> valid value for the code in which it is being used (e.g. metadata or
>> >> data object).  It isn't actually dereferenced by anything, it is just
>> >> like a magic value.
>> >
>> > Then make it an explicit type, not a void *.
>> >
>> >> >> Adjust
>> >> >> the callers of class_handle2object() accordingly, using NULL as the
>> >> >> argument to the owner parameter, except in the case of
>> >> >> mdt_handle2mfd() where we add an explicit mdt_export_data parameter
>> >> >> which we use as the owner when searching for a MFD. When allocating a
>> >> >> new MFD, pass a pointer to the mdt_export_data into mdt_mfd_new() and
>> >> >> store it in h_owner. This allows the MDT to validate that the client
>> >> >> has not sent the wrong open handle cookie, or sent the right cookie to
>> >> >> the wrong MDT.
>> >> >
>> >> >This changelog entry doesn't even match up with the code below.  ALl
>> >> >callers of class_handle2object are passing NULL here, which makes this
>> >> >patch pretty pointless, right?
>> >>
>> >> As Tao wrote, this is the patch summary that matches what was committed
>> >> in our own tree and in this case mostly describes the changes made on the
>> >> server.  Keeping the same commits and comments in both trees makes it
>> >> easier to keep the code in sync.
>> >
>> > Ok, but as it is, this patch does nothing to the client code, so how can
>> > I accept it?  A function that is only ever called with NULL as an option
>> > is ripe for cleanup in my eyes.
>> >
>> How about adding a comment above the function to note that this extra
>> argument is used by server code and please don't remove it?
>
> How about adding the server code to the kernel to keep problems like
> this (which will continue to crop up, it's not just this one function,
> right?) from happening in the future?
>
As explained in the other thread, the server code is not even ready
for landing in upstream kernel. And it won't be for quite some time.

> In-kernel code does not depend on out-of-kernel code, it's that simple,
> and has been a rule for kernel code for forever.  Either deal with the
> fact that you will have to keep the apis and functions working for your
> out-of-tree code, or put all the code into the kernel.  Don't force
> in-kernel code to deal with out-of-tree code as there is NO way that
> anyone other than the very few of you, can deal with that at all.
>
Fair enough. Andreas, how about we handling this kind of difference in
external tree and letting in-tree client be clean of it? We already
have HAVE_SERVER_SUPPORT macro in external tree. It is just a matter
of adding more references.

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 05/40] staging/lustre: validate open handle cookies
  2013-11-18  6:18             ` Peng Tao
@ 2013-11-18  8:57               ` Dilger, Andreas
  0 siblings, 0 replies; 67+ messages in thread
From: Dilger, Andreas @ 2013-11-18  8:57 UTC (permalink / raw)
  To: Peng Tao, Greg Kroah-Hartman; +Cc: linux-kernel, Hammond, John

On 2013/11/17 11:18 PM, "Peng Tao" <bergwolf@gmail.com> wrote:

>On Mon, Nov 18, 2013 at 12:35 PM, Greg Kroah-Hartman
><gregkh@linuxfoundation.org> wrote:
>> On Mon, Nov 18, 2013 at 10:36:26AM +0800, Peng Tao wrote:
>>> On Sun, Nov 17, 2013 at 3:50 AM, Greg Kroah-Hartman
>>> <gregkh@linuxfoundation.org> wrote:
>>> > On Sat, Nov 16, 2013 at 11:20:37AM +0000, Dilger, Andreas wrote:
>>> >> On 2013/11/14 9:13 PM, "Greg Kroah-Hartman"
>>><gregkh@linuxfoundation.org>
>>> >> wrote:
>>> >>
>>> >> >On Fri, Nov 15, 2013 at 12:13:07AM +0800, Peng Tao wrote:
>>> >> >> From: "John L. Hammond" <john.hammond@intel.com>
>>> >> >>
>>> >> >> Add a const void *h_owner member to struct portals_handle. Add a
>>>const
>>> >> >> void *owner parameter to class_handle2object() which must be
>>>matched
>>> >> >> by the h_owner member of the handle in addition to the cookie.
>>> >> >
>>> >> >Ick ick ick.
>>> >> >
>>> >> >NEVER use a void pointer if you can help it, and for a "handle",
>>>never.
>>> >> >This isn't other operating systems, sorry.  We know what types our
>>> >> >pointers to structures are, use them, so that the compiler can
>>>catch our
>>> >> >problems, and don't try to cheat by using void *.
>>> >>
>>> >> The portals_handle is used as a generic type for objects referenced
>>>over
>>> >> the network, like a file handle.  The "owner" parameter is just
>>>used as
>>> >> an extra check that the cookie passed from the client is actually a
>>> >> valid value for the code in which it is being used (e.g. metadata or
>>> >> data object).  It isn't actually dereferenced by anything, it is
>>>just
>>> >> like a magic value.
>>> >
>>> > Then make it an explicit type, not a void *.
>>> >
>>> >> >> Adjust
>>> >> >> the callers of class_handle2object() accordingly, using NULL as
>>>the
>>> >> >> argument to the owner parameter, except in the case of
>>> >> >> mdt_handle2mfd() where we add an explicit mdt_export_data
>>>parameter
>>> >> >> which we use as the owner when searching for a MFD. When
>>>allocating a
>>> >> >> new MFD, pass a pointer to the mdt_export_data into
>>>mdt_mfd_new() and
>>> >> >> store it in h_owner. This allows the MDT to validate that the
>>>client
>>> >> >> has not sent the wrong open handle cookie, or sent the right
>>>cookie to
>>> >> >> the wrong MDT.
>>> >> >
>>> >> >This changelog entry doesn't even match up with the code below.
>>>ALl
>>> >> >callers of class_handle2object are passing NULL here, which makes
>>>this
>>> >> >patch pretty pointless, right?
>>> >>
>>> >> As Tao wrote, this is the patch summary that matches what was
>>>committed
>>> >> in our own tree and in this case mostly describes the changes made
>>>on the
>>> >> server.  Keeping the same commits and comments in both trees makes
>>>it
>>> >> easier to keep the code in sync.
>>> >
>>> > Ok, but as it is, this patch does nothing to the client code, so how
>>>can
>>> > I accept it?  A function that is only ever called with NULL as an
>>>option
>>> > is ripe for cleanup in my eyes.
>>> >
>>> How about adding a comment above the function to note that this extra
>>> argument is used by server code and please don't remove it?
>>
>> How about adding the server code to the kernel to keep problems like
>> this (which will continue to crop up, it's not just this one function,
>> right?) from happening in the future?
>>
>As explained in the other thread, the server code is not even ready
>for landing in upstream kernel. And it won't be for quite some time.
>
>> In-kernel code does not depend on out-of-kernel code, it's that simple,
>> and has been a rule for kernel code for forever.  Either deal with the
>> fact that you will have to keep the apis and functions working for your
>> out-of-tree code, or put all the code into the kernel.  Don't force
>> in-kernel code to deal with out-of-tree code as there is NO way that
>> anyone other than the very few of you, can deal with that at all.
>>
>Fair enough. Andreas, how about we handling this kind of difference in
>external tree and letting in-tree client be clean of it? We already
>have HAVE_SERVER_SUPPORT macro in external tree. It is just a matter
>of adding more references.

Fine.  This patch is not critical, and the API isn't used very widely in
the code, so it shouldn't cause too much grief.

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division



^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-18  6:07             ` Peng Tao
@ 2013-11-18 13:52               ` Greg Kroah-Hartman
  2013-11-19 13:29                 ` Peng Tao
  0 siblings, 1 reply; 67+ messages in thread
From: Greg Kroah-Hartman @ 2013-11-18 13:52 UTC (permalink / raw)
  To: Peng Tao; +Cc: Dilger, Andreas, linux-kernel, faibish, sorin

On Mon, Nov 18, 2013 at 02:07:02PM +0800, Peng Tao wrote:
> >> > So, why would I believe that you all are really going to start doing the
> >> > major cleanup work on this code that is so obviously needed?  Why would
> >> > I take new features that you are spending your time on instead?
> >> >
> >> My apologize. I was distracted by other projects for some time and I
> >> am trying to make it up. I thought you agreed that I can first close
> >> the patch gap between in-kernel client and external tree, and then add
> >> cleanup patches, no?
> >
> > What exactly is "the gap"?  Are we talking 200 patches?  10?  100?  I
> > have no idea what you are referring to, but realize I can't review 100
> > patches at a single time easily.
> >
> Right now, I think it is about 150 patches. I will split them into
> smaller chunks (10 patches or in each patchset) and send them
> incrementally.

Good.

> > And what happens when I accep these "gap" patches?  Will development in
> > this external tree suddenly stop?  It sure doesn't seem like this is
> > happening as I thought it would have already since the code has been in
> > the kernel for over 6 months now.
> >
> No, it won't stop. The biggest feature coming in this time is HSM. In
> future, I think there will be less features and more bug fixes.
> Porting patches from external tree isn't just to make the two tree in
> sync. External tree contains many bug fix patches that in kernel
> client wants as well. If we stop porting, as time goes by, the in tree
> client will be less featured, and more buggy than external tree. When
> we get it out of staging like that, it will be very hard to persuade
> people to use it.

So, who is going to take the time to do the cleanup and work involved in
getting this into mergable state, if developers are off adding new
features to your out-of-tree kernel code?

I _really_ hate taking new features for staging drivers, as that shows
that the developers are using staging as just a dumping ground to get
stuff merged, without taking the time to get it cleaned up properly.

If this continues, and no major work is done to clean things up within
the next 6 months (i.e. the same thing as the past 6 months.) this code
will be deleted from the tree.  You have been warned.

> > You already have 2 different code bases :(
> we still want to make the difference as small as possible. So it won't
> cost too much effort to port patches between them.

That is your problem, not ours, sorry.

Best of luck,

greg k-h

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore
  2013-11-18 13:52               ` Greg Kroah-Hartman
@ 2013-11-19 13:29                 ` Peng Tao
  0 siblings, 0 replies; 67+ messages in thread
From: Peng Tao @ 2013-11-19 13:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Dilger, Andreas, linux-kernel, faibish, sorin

On Mon, Nov 18, 2013 at 9:52 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Mon, Nov 18, 2013 at 02:07:02PM +0800, Peng Tao wrote:
>> >> > So, why would I believe that you all are really going to start doing the
>> >> > major cleanup work on this code that is so obviously needed?  Why would
>> >> > I take new features that you are spending your time on instead?
>> >> >
>> >> My apologize. I was distracted by other projects for some time and I
>> >> am trying to make it up. I thought you agreed that I can first close
>> >> the patch gap between in-kernel client and external tree, and then add
>> >> cleanup patches, no?
>> >
>> > What exactly is "the gap"?  Are we talking 200 patches?  10?  100?  I
>> > have no idea what you are referring to, but realize I can't review 100
>> > patches at a single time easily.
>> >
>> Right now, I think it is about 150 patches. I will split them into
>> smaller chunks (10 patches or in each patchset) and send them
>> incrementally.
>
> Good.
>
>> > And what happens when I accep these "gap" patches?  Will development in
>> > this external tree suddenly stop?  It sure doesn't seem like this is
>> > happening as I thought it would have already since the code has been in
>> > the kernel for over 6 months now.
>> >
>> No, it won't stop. The biggest feature coming in this time is HSM. In
>> future, I think there will be less features and more bug fixes.
>> Porting patches from external tree isn't just to make the two tree in
>> sync. External tree contains many bug fix patches that in kernel
>> client wants as well. If we stop porting, as time goes by, the in tree
>> client will be less featured, and more buggy than external tree. When
>> we get it out of staging like that, it will be very hard to persuade
>> people to use it.
>
> So, who is going to take the time to do the cleanup and work involved in
> getting this into mergable state, if developers are off adding new
> features to your out-of-tree kernel code?
>
I am to work on it.

> I _really_ hate taking new features for staging drivers, as that shows
> that the developers are using staging as just a dumping ground to get
> stuff merged, without taking the time to get it cleaned up properly.
>
> If this continues, and no major work is done to clean things up within
> the next 6 months (i.e. the same thing as the past 6 months.) this code
> will be deleted from the tree.  You have been warned.
>
Got it.

Thanks,
Tao

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2013-11-19 13:30 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-14 16:13 [PATCH 00/40] staging/lustre: patch bomb 1 Peng Tao
2013-11-14 16:13 ` [PATCH 01/40] staging/lustre/llite: restore ll_fiemap Peng Tao
2013-11-14 16:13 ` [PATCH 02/40] staging/lustre: remove lu_target.h Peng Tao
2013-11-14 16:13 ` [PATCH 03/40] staging/lustre: remove llog_server.c Peng Tao
2013-11-14 16:13 ` [PATCH 04/40] staging/lustre/llite: Access to released file trigs a restore Peng Tao
2013-11-15  4:09   ` Greg Kroah-Hartman
2013-11-15  9:55     ` Peng Tao
2013-11-16 10:36     ` Dilger, Andreas
2013-11-16 19:59       ` Greg Kroah-Hartman
2013-11-18  3:07         ` Peng Tao
2013-11-18  4:39           ` Greg Kroah-Hartman
2013-11-18  6:07             ` Peng Tao
2013-11-18 13:52               ` Greg Kroah-Hartman
2013-11-19 13:29                 ` Peng Tao
2013-11-14 16:13 ` [PATCH 05/40] staging/lustre: validate open handle cookies Peng Tao
2013-11-15  4:13   ` Greg Kroah-Hartman
2013-11-15 10:22     ` Peng Tao
2013-11-15 20:57       ` Greg Kroah-Hartman
2013-11-16 11:20     ` Dilger, Andreas
2013-11-16 19:50       ` Greg Kroah-Hartman
2013-11-18  2:36         ` Peng Tao
2013-11-18  4:35           ` Greg Kroah-Hartman
2013-11-18  6:18             ` Peng Tao
2013-11-18  8:57               ` Dilger, Andreas
2013-11-14 16:13 ` [PATCH 06/40] staging/lustre/llite: use correct FID in ll_och_fill() Peng Tao
2013-11-14 16:13 ` [PATCH 07/40] staging/lustre/hsm: Implementation of exclusive open Peng Tao
2013-11-15  4:17   ` Greg Kroah-Hartman
2013-11-15 10:26     ` Peng Tao
2013-11-14 16:13 ` [PATCH 08/40] staging/lustre/lnet: Fix assert on empty group in selftest module Peng Tao
2013-11-14 16:13 ` [PATCH 09/40] staging/lustre/ldlm: fix resource/fid check, use DLDLMRES Peng Tao
2013-11-14 16:13 ` [PATCH 10/40] staging/lustre/server: use unified request handler for MGS Peng Tao
2013-11-14 16:13 ` [PATCH 11/40] staging/lustre/llog: MGC to use OSD API for backup logs Peng Tao
2013-11-14 16:13 ` [PATCH 12/40] staging/lustre/nfs: writing to new files will return ENOENT Peng Tao
2013-11-14 16:22   ` Cheng Shao
2013-11-14 16:28     ` Peng Tao
2013-11-15  4:01       ` Greg Kroah-Hartman
2013-11-14 16:13 ` [PATCH 13/40] staging/lustre/autoconf: remove vectored fops tests Peng Tao
2013-11-14 16:13 ` [PATCH 14/40] staging/lustre/autoconf: remove LIBCFS_HAVE_IS_COMPAT_TASK test Peng Tao
2013-11-14 16:13 ` [PATCH 15/40] staging/lustre/llog: fix return value of llog_alloc_handle Peng Tao
2013-11-14 16:13 ` [PATCH 16/40] staging/lustre/lov: convert magic to host-endian in lov_dump_lmm() Peng Tao
2013-11-14 16:13 ` [PATCH 17/40] staging/lustre/ptlrpc: Fix race during exp_flock_hash creation Peng Tao
2013-11-14 16:13 ` [PATCH 18/40] staging/lustre/mdc: prevent fall through in mdc_iocontrol() Peng Tao
2013-11-14 16:13 ` [PATCH 19/40] staging/lustre/lu: shrink lu_object by 8 bytes on x86_64 Peng Tao
2013-11-14 16:13 ` [PATCH 20/40] staging/lustre/mdt: HSM coordinator client interface Peng Tao
2013-11-14 16:13 ` [PATCH 21/40] staging/lustre/mdt: HSM coordinator agent interface Peng Tao
2013-11-14 16:13 ` [PATCH 22/40] staging/lustre/scrub: OI scrub on OST Peng Tao
2013-11-14 16:13 ` [PATCH 23/40] staging/lustre/scrub: control OI scrub on OST from user space Peng Tao
2013-11-14 16:13 ` [PATCH 24/40] staging/lustre/llite: don't check for O_CREAT in it_create_mode Peng Tao
2013-11-14 16:13 ` [PATCH 25/40] staging/lustre/build: clean up unused variables and dead code Peng Tao
2013-11-14 16:13 ` [PATCH 26/40] staging/lustre/build: fix compilation issue with is_compat_task Peng Tao
2013-11-14 16:13 ` [PATCH 27/40] staging/lustre/ptlrpc: Fix a crash when dereferencing NULL pointer Peng Tao
2013-11-14 16:13 ` [PATCH 28/40] staging/lustre/hsm: Add hsm_release feature Peng Tao
2013-11-14 16:13 ` [PATCH 29/40] staging/lustre/llite: extended attribute cache Peng Tao
2013-11-14 16:13 ` [PATCH 30/40] staging/lustre/xattr: separate ACL and XATTR caches Peng Tao
2013-11-14 16:13 ` [PATCH 31/40] staging/lustre/lnet: Add LNet Router Priority parameter Peng Tao
2013-11-14 16:13 ` [PATCH 32/40] staging/lustre/api: HSM import uses new released pattern Peng Tao
2013-11-14 16:13 ` [PATCH 33/40] staging/lustre/target: move OUT to the unified target code Peng Tao
2013-11-14 16:13 ` [PATCH 34/40] staging/lustre/seq: unified SEQ handler Peng Tao
2013-11-14 16:13 ` [PATCH 35/40] staging/lustre/llite: remove ll_d_root_ops Peng Tao
2013-11-14 16:13 ` [PATCH 36/40] staging/lustre/llite: pass correct pointer to obd_iocontrol() Peng Tao
2013-11-14 16:13 ` [PATCH 37/40] staging/lustre/idl: remove LASSERT/CLASSERT from lustre_idl.h Peng Tao
2013-11-14 16:13 ` [PATCH 38/40] staging/lustre/mgs: set_param -P option that sets value permanently Peng Tao
2013-11-14 16:13 ` [PATCH 39/40] staging/lustre/utils: HSM Posix CopyTool Peng Tao
2013-11-14 16:13 ` [PATCH 40/40] staging/lustre/ptlrpc: flock deadlock detection does not work Peng Tao
2013-11-15  3:59 ` [PATCH 00/40] staging/lustre: patch bomb 1 Greg Kroah-Hartman
2013-11-15  9:51   ` Peng Tao
2013-11-15 20:54     ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).