linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/60] staging: lustre: batches of fixes for lustre client
@ 2017-01-29  0:04 James Simmons
  2017-01-29  0:04 ` [PATCH 01/60] staging: lustre: llite: Remove access of stripe in ll_setattr_raw James Simmons
                   ` (60 more replies)
  0 siblings, 61 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

A batch of fixes that were missing from the upstream Lustre client.

Alex Zhuravlev (1):
  staging: lustre: obdclass: do not call lu_site_purge() for single object exceed

Alexander Boyko (1):
  staging: lustre: ptlrpc: skip lock if export failed

Andreas Dilger (3):
  staging: lustre: mdc: quiet console message for known -EINTR
  staging: lustre: obdclass: add more info to sysfs version string
  staging: lustre: llite: handle inactive OSTs better in statfs

Andriy Skulysh (1):
  staging: lustre: ldlm: ASSERTION(flock->blocking_export!=0) failed

Ann Koehler (1):
  staging: lustre: obd: RCU stalls in lu_cache_shrink_count()

Ben Evans (1):
  staging: lustre: lustre: Remove old commented out code

Bobi Jam (3):
  staging: lustre: clio: add cl_page LRU shrinker
  staging: lustre: lov: ld_target could be NULL
  staging: lustre: llite: specify READA debug mask for ras_update

Bruno Faccini (1):
  staging: lustre: obdclass: health_check to report unhealthy upon LBUG

Dmitry Eremin (6):
  staging: lustre: llite: Setting xattr are properly checked with and without ACLs
  staging: lustre: libcfs: avoid stomping on module param cpu_pattern
  staging: lustre: libcfs: default CPT matches NUMA topology
  staging: lustre: libcfs: fix error messages
  staging: lustre: ptlrpc: remove unused pc->pc_env
  staging: lustre: ptlrpc: update MODULE_PARAM_DESC in ptlrpcd.c

Fan Yong (4):
  staging: lustre: fid: fix race in fid allocation
  staging: lustre: mgc: handle config_llog_data::cld_refcount properly
  staging: lustre: ptlrpc: comment for FLD_QUERY RPC reply swab
  staging: lustre: linkea: linkEA size limitation

Giuseppe Di Natale (1):
  staging: lustre: lmv: Correctly generate target_obd

James Simmons (7):
  staging: lustre: header: remove assert from interval_set()
  staging: libcfs: remove integer types abstraction from libcfs
  staging: lustre: socklnd: remove socklnd_init_msg
  staging: lustre: obd: move s3 in lmd_parse to inner loop
  staging: lustre: osc: avoid 64 divide in osc_cache_too_much
  staging: lustre: ptlrpc : remove userland usage from ptlrpc
  staging: lustre: libcfs: fix minimum size check for libcfs ioctl

Jeremy Filizetti (1):
  staging: lustre: ldlm: Restore connect flags on failure

Jinshan Xiong (4):
  staging: lustre: llite: Remove access of stripe in ll_setattr_raw
  staging: lustre: clio: revise readahead to support 16MB IO
  staging: lustre: llite: don't ignore layout for group lock request
  staging: lustre: osc: limits the number of chunks in write RPC

John L. Hammond (5):
  staging: lustre: llite: remove obsolete comment for ll_unlink()
  staging: lustre: ptlrpc: correct use of list_add_tail()
  staging: lustre: lmv: remove unused placement parameter
  staging: lustre: obd: remove OBD_NOTIFY_CREATE
  staging: lustre: mdc: avoid returning freed request

Lai Siyao (2):
  staging: lustre: statahead: drop support for remote entry
  staging: lustre: llite: normal user can't set FS default stripe

Liang Zhen (1):
  staging: lustre: ksocklnd: ignore timedout TX on closing connection

Nathaniel Clark (1):
  staging: lustre: lov: Ensure correct operation for large object sizes

Niu Yawei (4):
  staging: lustre: ptlrpc: set proper mbits for EINPROGRESS resend
  staging: lustre: clio: sync write should update mtime
  staging: ptlrpc: leaked rs on difficult reply
  staging: lustre: ptlrpc: update replay cursor when close during replay

Oleg Drokin (1):
  staging: lustre: llite: Trust creates in revalidate too.

Patrick Farrell (1):
  staging: lustre: mdc: Make IT_OPEN take lookup bits lock

Rahul Deshmukh (1):
  staging: lustre: llite: Adding timed wait in ll_umount_begin

Steve Guminski (3):
  staging: lustre: osc: osc_match_base prototype differs from declaration
  staging: lustre: libcfs: Change positional struct initializers to C99
  staging: lustre: fid: Change positional struct initializers to C99

Ulka Vaze (1):
  staging: lustre: lmv: Error not handled for lmv_find_target

Vladimir Saveliev (1):
  staging: lustre: ptlrpc: allow blocking asts to be delayed

Yang Sheng (1):
  staging: lustre: llite: don't invoke direct_IO for the EOF case

frank zago (1):
  staging: lustre: hsm: stack overrun in hai_dump_data_field

wang di (2):
  staging: lustre: llite: check request != NULL in ll_migrate
  staging: lustre: lmv: remove nlink check in lmv_revalidate_slaves

 .../lustre/include/linux/libcfs/libcfs_crypto.h    |  60 +++++--
 .../lustre/include/linux/libcfs/linux/libcfs.h     |   4 -
 .../staging/lustre/include/linux/lnet/socklnd.h    |   9 -
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c |   2 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd.c    |   2 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd_cb.c |  29 +--
 drivers/staging/lustre/lnet/libcfs/debug.c         |   2 +-
 .../staging/lustre/lnet/libcfs/linux/linux-cpu.c   |  17 +-
 .../lustre/lnet/libcfs/linux/linux-module.c        |   2 +-
 drivers/staging/lustre/lnet/libcfs/workitem.c      |   2 +-
 drivers/staging/lustre/lnet/lnet/acceptor.c        |   4 +-
 drivers/staging/lustre/lnet/selftest/module.c      |   3 +-
 drivers/staging/lustre/lustre/fid/fid_lib.c        |   7 +-
 drivers/staging/lustre/lustre/fid/fid_request.c    |  55 +++---
 drivers/staging/lustre/lustre/fid/lproc_fid.c      |  12 +-
 drivers/staging/lustre/lustre/include/cl_object.h  |  10 +-
 .../staging/lustre/lustre/include/interval_tree.h  |  12 +-
 drivers/staging/lustre/lustre/include/lu_object.h  |  14 +-
 .../lustre/lustre/include/lustre/lustre_idl.h      |   5 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |  18 +-
 .../staging/lustre/lustre/include/lustre_linkea.h  |  15 +-
 drivers/staging/lustre/lustre/include/lustre_net.h |   4 -
 .../lustre/lustre/include/lustre_req_layout.h      |  10 +-
 drivers/staging/lustre/lustre/include/obd.h        |  18 +-
 drivers/staging/lustre/lustre/include/obd_class.h  |   5 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_extent.c   |   6 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   1 -
 .../staging/lustre/lustre/ldlm/ldlm_inodebits.c    |   1 -
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |  13 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |   7 +
 drivers/staging/lustre/lustre/llite/dcache.c       |  13 +-
 drivers/staging/lustre/lustre/llite/dir.c          |  14 +-
 drivers/staging/lustre/lustre/llite/file.c         |  83 +++++----
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |   9 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |   2 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |  16 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    | 124 ++++++-------
 drivers/staging/lustre/lustre/llite/namei.c        |   5 -
 drivers/staging/lustre/lustre/llite/range_lock.c   |  10 +-
 drivers/staging/lustre/lustre/llite/range_lock.h   |   2 +-
 drivers/staging/lustre/lustre/llite/rw.c           | 199 ++++++++++-----------
 drivers/staging/lustre/lustre/llite/rw26.c         |   4 +
 drivers/staging/lustre/lustre/llite/statahead.c    |  94 ++++------
 drivers/staging/lustre/lustre/llite/vvp_io.c       |  17 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |   9 +
 drivers/staging/lustre/lustre/lmv/lmv_intent.c     |  16 +-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |  65 ++++---
 drivers/staging/lustre/lustre/lmv/lproc_lmv.c      |  85 ++-------
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  22 +--
 drivers/staging/lustre/lustre/lov/lov_io.c         |   7 +-
 drivers/staging/lustre/lustre/lov/lov_lock.c       |   5 +
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   2 -
 drivers/staging/lustre/lustre/lov/lov_object.c     |  33 +++-
 drivers/staging/lustre/lustre/lov/lov_request.c    |   6 +-
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |   3 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |  18 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   9 +-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    | 183 ++++++++++---------
 drivers/staging/lustre/lustre/obdclass/linkea.c    |  70 ++++++--
 .../lustre/lustre/obdclass/linux/linux-module.c    |   8 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 106 +++++------
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |   3 +-
 drivers/staging/lustre/lustre/osc/osc_cache.c      | 125 +++++++++----
 drivers/staging/lustre/lustre/osc/osc_internal.h   |  15 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |  15 +-
 drivers/staging/lustre/lustre/osc/osc_page.c       |  98 +++++++++-
 drivers/staging/lustre/lustre/osc/osc_request.c    |  21 +++
 drivers/staging/lustre/lustre/ptlrpc/client.c      |  24 ++-
 drivers/staging/lustre/lustre/ptlrpc/events.c      |   3 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |  26 ++-
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |   5 +-
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c     |  18 +-
 drivers/staging/lustre/lustre/ptlrpc/recover.c     |  24 +--
 drivers/staging/lustre/lustre/ptlrpc/sec_gc.c      |   2 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c     |  21 +--
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |  16 +-
 76 files changed, 1136 insertions(+), 868 deletions(-)

-- 
1.8.3.1

* [PATCH 01/60] staging: lustre: llite: Remove access of stripe in ll_setattr_raw
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 02/60] staging: lustre: statahead: drop support for remote entry James Simmons
                   ` (59 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jinshan Xiong, James Simmons

From: Jinshan Xiong <jinshan.xiong@intel.com>

ll_setattr_raw() needs to know whether a file is released when the
file is being truncated. It used to get this information by accessing
lov_stripe_md, which turns out to be unnecessary. This patch removes
the access to lov_stripe_md and handles the released-file case in
lov_io_init_released() instead.
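
As a reading aid, the decision this patch moves into the I/O path can be
modelled in plain userspace C. This is only a sketch under assumptions:
struct sketch_io and both sketch_* functions are hypothetical stand-ins,
not Lustre code, and they mirror only the CIT_SETATTR branch of
lov_io_init_released() and the check added to vvp_io_setattr_fini() in
the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of cl_io state used here. */
struct sketch_io {
	bool is_trunc;        /* models cl_io_is_trunc(io) */
	bool restore_needed;  /* models io->ci_restore_needed */
};

/*
 * Models the CIT_SETATTR branch for a released file after this patch:
 * a truncate requests a restore and fails with -ENODATA, while any
 * other setattr returns 1 and requests nothing.
 */
static int sketch_setattr_on_released(struct sketch_io *io)
{
	assert(io != NULL);

	if (io->is_trunc) {
		io->restore_needed = true;
		return -ENODATA;
	}
	return 1;
}

/*
 * Models the check added to vvp_io_setattr_fini(): the data-modified
 * flag is raised only if a restore was pending before fini ran and is
 * no longer pending afterwards.
 */
static bool sketch_mark_data_modified(bool needed_before, bool needed_after)
{
	return needed_before && !needed_after;
}
```

With this model, a truncate on a released file yields -ENODATA with the
restore flag set, and a later transition of that flag from set to clear
is what triggers the HS_DIRTY update in ll_setattr_raw().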

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/13514
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   6 --
 drivers/staging/lustre/lustre/llite/file.c         |   2 +-
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |   9 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   1 +
 drivers/staging/lustre/lustre/llite/llite_lib.c    | 109 ++++++++++-----------
 drivers/staging/lustre/lustre/llite/vvp_io.c       |  10 +-
 drivers/staging/lustre/lustre/lov/lov_io.c         |   7 +-
 drivers/staging/lustre/lustre/lov/lov_object.c     |   3 -
 8 files changed, 68 insertions(+), 79 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index dc68561..a1b8301 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -284,12 +284,6 @@ struct cl_layout {
 	size_t		cl_size;
 	/** Layout generation. */
 	u32		cl_layout_gen;
-	/**
-	 * True if this is a released file.
-	 * Temporarily added for released file truncate in ll_setattr_raw().
-	 * It will be removed later. -Jinshan
-	 */
-	bool		cl_is_released;
 };
 
 /**
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index a171188..0ee02f1 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1821,7 +1821,7 @@ static int ll_swap_layouts(struct file *file1, struct file *file2,
 	return rc;
 }
 
-static int ll_hsm_state_set(struct inode *inode, struct hsm_state_set *hss)
+int ll_hsm_state_set(struct inode *inode, struct hsm_state_set *hss)
 {
 	struct md_op_data	*op_data;
 	int			 rc;
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index dd1cfd8..f1036f4 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -94,6 +94,7 @@ int cl_setattr_ost(struct cl_object *obj, const struct iattr *attr,
 
 	io = vvp_env_thread_io(env);
 	io->ci_obj = obj;
+	io->ci_verify_layout = 1;
 
 	io->u.ci_setattr.sa_attr.lvb_atime = LTIME_S(attr->ia_atime);
 	io->u.ci_setattr.sa_attr.lvb_mtime = LTIME_S(attr->ia_mtime);
@@ -120,13 +121,7 @@ int cl_setattr_ost(struct cl_object *obj, const struct iattr *attr,
 	cl_io_fini(env, io);
 	if (unlikely(io->ci_need_restart))
 		goto again;
-	/* HSM import case: file is released, cannot be restored
-	 * no need to fail except if restore registration failed
-	 * with -ENODATA
-	 */
-	if (result == -ENODATA && io->ci_restore_needed &&
-	    io->ci_result != -ENODATA)
-		result = 0;
+
 	cl_env_put(env, &refcheck);
 	return result;
 }
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 065a9a7..2c72177 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -766,6 +766,7 @@ int ll_dir_getstripe(struct inode *inode, void **lmmp, int *lmm_size,
 int ll_fid2path(struct inode *inode, void __user *arg);
 int ll_data_version(struct inode *inode, __u64 *data_version, int flags);
 int ll_hsm_release(struct inode *inode);
+int ll_hsm_state_set(struct inode *inode, struct hsm_state_set *hss);
 
 /* llite/dcache.c */
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 9cb4909..769b307 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1402,7 +1402,11 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data)
 	 * cache is not cleared yet.
 	 */
 	op_data->op_attr.ia_valid &= ~(TIMES_SET_FLAGS | ATTR_SIZE);
+	if (S_ISREG(inode->i_mode))
+		inode_lock(inode);
 	rc = simple_setattr(dentry, &op_data->op_attr);
+	if (S_ISREG(inode->i_mode))
+		inode_unlock(inode);
 	op_data->op_attr.ia_valid = ia_valid;
 
 	rc = ll_update_inode(inode, &md);
@@ -1431,7 +1435,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	struct inode *inode = d_inode(dentry);
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct md_op_data *op_data = NULL;
-	bool file_is_released = false;
 	int rc = 0;
 
 	CDEBUG(D_VFSTRACE, "%s: setattr inode "DFID"(%p) from %llu to %llu, valid %x, hsm_import %d\n",
@@ -1486,76 +1489,35 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		       LTIME_S(attr->ia_mtime), LTIME_S(attr->ia_ctime),
 		       (s64)ktime_get_real_seconds());
 
-	/* We always do an MDS RPC, even if we're only changing the size;
-	 * only the MDS knows whether truncate() should fail with -ETXTBUSY
-	 */
-
-	op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
-	if (!op_data)
-		return -ENOMEM;
-
-	if (!S_ISDIR(inode->i_mode))
+	if (S_ISREG(inode->i_mode))
 		inode_unlock(inode);
 
-	/* truncate on a released file must failed with -ENODATA,
-	 * so size must not be set on MDS for released file
-	 * but other attributes must be set
+	/*
+	 * We always do an MDS RPC, even if we're only changing the size;
+	 * only the MDS knows whether truncate() should fail with -ETXTBUSY
 	 */
-	if (S_ISREG(inode->i_mode)) {
-		struct cl_layout cl = {
-			.cl_is_released = false,
-		};
-		struct lu_env *env;
-		int refcheck;
-		__u32 gen;
+	op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
+	if (!op_data) {
+		rc = -ENOMEM;
+		goto out;
+	}
 
-		rc = ll_layout_refresh(inode, &gen);
-		if (rc < 0)
-			goto out;
+	op_data->op_attr = *attr;
 
+	if (!hsm_import && attr->ia_valid & ATTR_SIZE) {
 		/*
-		 * XXX: the only place we need to know the layout type,
-		 * this will be removed by a later patch. -Jinshan
+		 * If we are changing file size, file content is
+		 * modified, flag it.
 		 */
-		env = cl_env_get(&refcheck);
-		if (IS_ERR(env)) {
-			rc = PTR_ERR(env);
-			goto out;
-		}
-
-		rc = cl_object_layout_get(env, lli->lli_clob, &cl);
-		cl_env_put(env, &refcheck);
-		if (rc < 0)
-			goto out;
-
-		file_is_released = cl.cl_is_released;
-
-		if (!hsm_import && attr->ia_valid & ATTR_SIZE) {
-			if (file_is_released) {
-				rc = ll_layout_restore(inode, 0, attr->ia_size);
-				if (rc < 0)
-					goto out;
-
-				file_is_released = false;
-				ll_layout_refresh(inode, &gen);
-			}
-
-			/*
-			 * If we are changing file size, file content is
-			 * modified, flag it.
-			 */
-			attr->ia_valid |= MDS_OPEN_OWNEROVERRIDE;
-			op_data->op_bias |= MDS_DATA_MODIFIED;
-		}
+		attr->ia_valid |= MDS_OPEN_OWNEROVERRIDE;
+		op_data->op_bias |= MDS_DATA_MODIFIED;
 	}
 
-	memcpy(&op_data->op_attr, attr, sizeof(*attr));
-
 	rc = ll_md_setattr(dentry, op_data);
 	if (rc)
 		goto out;
 
-	if (!S_ISREG(inode->i_mode) || file_is_released) {
+	if (!S_ISREG(inode->i_mode) || hsm_import) {
 		rc = 0;
 		goto out;
 	}
@@ -1572,11 +1534,40 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		 */
 		rc = cl_setattr_ost(ll_i2info(inode)->lli_clob, attr, 0);
 	}
+
+	/*
+	 * If the file was restored, it needs to set dirty flag.
+	 *
+	 * We've already sent MDS_DATA_MODIFIED flag in
+	 * ll_md_setattr() for truncate. However, the MDT refuses to
+	 * set the HS_DIRTY flag on released files, so we have to set
+	 * it again if the file has been restored. Please check how
+	 * LLIF_DATA_MODIFIED is set in vvp_io_setattr_fini().
+	 *
+	 * Please notice that if the file is not released, the previous
+	 * MDS_DATA_MODIFIED has taken effect and usually
+	 * LLIF_DATA_MODIFIED is not set(see vvp_io_setattr_fini()).
+	 * This way we can save an RPC for common open + trunc
+	 * operation.
+	 */
+	if (test_and_clear_bit(LLIF_DATA_MODIFIED, &lli->lli_flags)) {
+		struct hsm_state_set hss = {
+			.hss_valid = HSS_SETMASK,
+			.hss_setmask = HS_DIRTY,
+		};
+		int rc2;
+
+		rc2 = ll_hsm_state_set(inode, &hss);
+		if (rc2 < 0)
+			CERROR(DFID "HSM set dirty failed: rc2 = %d\n",
+			       PFID(ll_inode2fid(inode)), rc2);
+	}
+
 out:
 	if (op_data)
 		ll_finish_md_op_data(op_data);
 
-	if (!S_ISDIR(inode->i_mode)) {
+	if (S_ISREG(inode->i_mode)) {
 		inode_lock(inode);
 		if ((attr->ia_valid & ATTR_SIZE) && !hsm_import)
 			inode_dio_wait(inode);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 697cbfb..19f85fc 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -288,7 +288,7 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 	       io->ci_ignore_layout, io->ci_verify_layout,
 	       vio->vui_layout_gen, io->ci_restore_needed);
 
-	if (io->ci_restore_needed == 1) {
+	if (io->ci_restore_needed) {
 		int	rc;
 
 		/* file was detected release, we need to restore it
@@ -657,7 +657,15 @@ static void vvp_io_setattr_end(const struct lu_env *env,
 static void vvp_io_setattr_fini(const struct lu_env *env,
 				const struct cl_io_slice *ios)
 {
+	bool restore_needed = ios->cis_io->ci_restore_needed;
+	struct inode *inode = vvp_object_inode(ios->cis_obj);
+
 	vvp_io_fini(env, ios);
+
+	if (restore_needed && !ios->cis_io->ci_restore_needed) {
+		/* restore finished, set data modified flag for HSM */
+		set_bit(LLIF_DATA_MODIFIED, &(ll_i2info(inode))->lli_flags);
+	}
 }
 
 static int vvp_io_read_start(const struct lu_env *env,
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 002326c..e0f0756 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -1056,9 +1056,12 @@ int lov_io_init_released(const struct lu_env *env, struct cl_object *obj,
 		 * - in setattr, for truncate
 		 */
 		/* the truncate is for size > 0 so triggers a restore */
-		if (cl_io_is_trunc(io))
+		if (cl_io_is_trunc(io)) {
 			io->ci_restore_needed = 1;
-		result = -ENODATA;
+			result = -ENODATA;
+		} else {
+			result = 1;
+		}
 		break;
 	case CIT_READ:
 	case CIT_WRITE:
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 76d4256..46ec46e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -1453,14 +1453,11 @@ static int lov_object_layout_get(const struct lu_env *env,
 	if (!lsm) {
 		cl->cl_size = 0;
 		cl->cl_layout_gen = CL_LAYOUT_GEN_EMPTY;
-		cl->cl_is_released = false;
-
 		return 0;
 	}
 
 	cl->cl_size = lov_mds_md_size(lsm->lsm_stripe_count, lsm->lsm_magic);
 	cl->cl_layout_gen = lsm->lsm_layout_gen;
-	cl->cl_is_released = lsm_is_released(lsm);
 
 	rc = lov_lsm_pack(lsm, buf->lb_buf, buf->lb_len);
 	lov_lsm_put(lsm);
-- 
1.8.3.1

* [PATCH 02/60] staging: lustre: statahead: drop support for remote entry
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
  2017-01-29  0:04 ` [PATCH 01/60] staging: lustre: llite: Remove access of stripe in ll_setattr_raw James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 03/60] staging: lustre: clio: add cl_page LRU shrinker James Simmons
                   ` (58 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Lai Siyao,
	James Simmons

From: Lai Siyao <lai.siyao@intel.com>

This patch drops support for remote entry statahead, because it
needs two async RPCs to fetch both the LOOKUP lock from the parent
MDT and the UPDATE lock from the child MDT, which is complicated.
Remote entries are rare, and not supporting them in statahead does
not cause any issue.

* pack child fid in statahead request.
* lmv_intent_getattr_async() will compare parent and child MDT,
  if child is remote, return -ENOTSUPP.
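
The check described in the two points above can be sketched in
userspace C as follows. The names struct sketch_tgt,
sketch_getattr_async_check() and SKETCH_ENOTSUPP are hypothetical and
exist only for this illustration; the real function also looks up both
targets with lmv_locate_mds() and forwards the request, which is not
modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* ENOTSUPP is a kernel-internal errno (524), absent from userspace. */
#define SKETCH_ENOTSUPP 524

/* Hypothetical stand-in for struct lmv_tgt_desc. */
struct sketch_tgt {
	int mdt_index;
};

/*
 * Models the order of checks in lmv_intent_getattr_async() after this
 * patch: reject an invalid child fid first, then refuse statahead when
 * parent and child resolve to different targets.
 */
static int sketch_getattr_async_check(bool child_fid_sane,
				      const struct sketch_tgt *ptgt,
				      const struct sketch_tgt *ctgt)
{
	assert(ptgt != NULL && ctgt != NULL);

	if (!child_fid_sane)
		return -EINVAL;

	/* Same comparison as the patch: target identity, not contents. */
	if (ptgt != ctgt)
		return -SKETCH_ENOTSUPP;

	return 0;
}
```

A return of 0 stands for the case where the request would be passed on
to the parent target's export.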

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6578
Reviewed-on: http://review.whamcloud.com/15767
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h       |  4 +-
 drivers/staging/lustre/lustre/include/obd_class.h |  5 +-
 drivers/staging/lustre/lustre/llite/statahead.c   | 94 +++++++++--------------
 drivers/staging/lustre/lustre/lmv/lmv_obd.c       | 30 ++++++--
 drivers/staging/lustre/lustre/mdc/mdc_internal.h  |  3 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c     | 16 +---
 6 files changed, 68 insertions(+), 84 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 6f0f5dd..7f0fc44 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -761,6 +761,7 @@ struct md_enqueue_info {
 	struct lookup_intent    mi_it;
 	struct lustre_handle    mi_lockh;
 	struct inode	   *mi_dir;
+	struct ldlm_enqueue_info	mi_einfo;
 	int (*mi_cb)(struct ptlrpc_request *req,
 		     struct md_enqueue_info *minfo, int rc);
 	void			*mi_cbdata;
@@ -978,8 +979,7 @@ struct md_ops {
 				struct lu_fid *fid);
 
 	int (*intent_getattr_async)(struct obd_export *,
-				    struct md_enqueue_info *,
-				    struct ldlm_enqueue_info *);
+				    struct md_enqueue_info *);
 
 	int (*revalidate_lock)(struct obd_export *, struct lookup_intent *,
 			       struct lu_fid *, __u64 *bits);
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 7ec2520..083a6ff 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1444,14 +1444,13 @@ static inline int md_init_ea_size(struct obd_export *exp, u32 easize,
 }
 
 static inline int md_intent_getattr_async(struct obd_export *exp,
-					  struct md_enqueue_info *minfo,
-					  struct ldlm_enqueue_info *einfo)
+					  struct md_enqueue_info *minfo)
 {
 	int rc;
 
 	EXP_CHECK_MD_OP(exp, intent_getattr_async);
 	EXP_MD_COUNTER_INCREMENT(exp, intent_getattr_async);
-	rc = MDP(exp->exp_obd, intent_getattr_async)(exp, minfo, einfo);
+	rc = MDP(exp->exp_obd, intent_getattr_async)(exp, minfo);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c
index f1ee17f..fb7c315 100644
--- a/drivers/staging/lustre/lustre/llite/statahead.c
+++ b/drivers/staging/lustre/lustre/llite/statahead.c
@@ -79,6 +79,8 @@ struct sa_entry {
 	struct inode	   *se_inode;
 	/* entry name */
 	struct qstr	     se_qstr;
+	/* entry fid */
+	struct lu_fid		se_fid;
 };
 
 static unsigned int sai_generation;
@@ -169,7 +171,7 @@ static inline int is_omitted_entry(struct ll_statahead_info *sai, __u64 index)
 /* allocate sa_entry and hash it to allow scanner process to find it */
 static struct sa_entry *
 sa_alloc(struct dentry *parent, struct ll_statahead_info *sai, __u64 index,
-	 const char *name, int len)
+	 const char *name, int len, const struct lu_fid *fid)
 {
 	struct ll_inode_info *lli;
 	struct sa_entry   *entry;
@@ -194,6 +196,7 @@ static inline int is_omitted_entry(struct ll_statahead_info *sai, __u64 index)
 	entry->se_qstr.hash = full_name_hash(parent, name, len);
 	entry->se_qstr.len = len;
 	entry->se_qstr.name = dname;
+	entry->se_fid = *fid;
 
 	lli = ll_i2info(sai->sai_dentry->d_inode);
 	spin_lock(&lli->lli_sa_lock);
@@ -566,24 +569,8 @@ static void sa_instantiate(struct ll_statahead_info *sai,
 	}
 
 	child = entry->se_inode;
-	if (!child) {
-		/*
-		 * lookup.
-		 */
-		LASSERT(fid_is_zero(&minfo->mi_data.op_fid2));
-
-		/* XXX: No fid in reply, this is probably cross-ref case.
-		 * SA can't handle it yet.
-		 */
-		if (body->mbo_valid & OBD_MD_MDS) {
-			rc = -EAGAIN;
-			goto out;
-		}
-	} else {
-		/*
-		 * revalidate.
-		 */
-		/* unlinked and re-created with the same name */
+	if (child) {
+		/* revalidate; unlinked and re-created with the same name */
 		if (unlikely(!lu_fid_eq(&minfo->mi_data.op_fid2, &body->mbo_fid1))) {
 			entry->se_inode = NULL;
 			iput(child);
@@ -720,50 +707,42 @@ static int ll_statahead_interpret(struct ptlrpc_request *req,
 }
 
 /* finish async stat RPC arguments */
-static void sa_fini_data(struct md_enqueue_info *minfo,
-			 struct ldlm_enqueue_info *einfo)
+static void sa_fini_data(struct md_enqueue_info *minfo)
 {
-	LASSERT(minfo && einfo);
 	iput(minfo->mi_dir);
 	kfree(minfo);
-	kfree(einfo);
 }
 
 /**
  * prepare arguments for async stat RPC.
  */
-static int sa_prep_data(struct inode *dir, struct inode *child,
-			struct sa_entry *entry, struct md_enqueue_info **pmi,
-			struct ldlm_enqueue_info **pei)
+static struct md_enqueue_info *
+sa_prep_data(struct inode *dir, struct inode *child, struct sa_entry *entry)
 {
-	const struct qstr      *qstr = &entry->se_qstr;
 	struct md_enqueue_info   *minfo;
 	struct ldlm_enqueue_info *einfo;
 	struct md_op_data	*op_data;
 
-	einfo = kzalloc(sizeof(*einfo), GFP_NOFS);
-	if (!einfo)
-		return -ENOMEM;
-
 	minfo = kzalloc(sizeof(*minfo), GFP_NOFS);
-	if (!minfo) {
-		kfree(einfo);
-		return -ENOMEM;
-	}
+	if (!minfo)
+		return ERR_PTR(-ENOMEM);
 
-	op_data = ll_prep_md_op_data(&minfo->mi_data, dir, child, qstr->name,
-				     qstr->len, 0, LUSTRE_OPC_ANY, NULL);
+	op_data = ll_prep_md_op_data(&minfo->mi_data, dir, child, NULL, 0, 0,
+				     LUSTRE_OPC_ANY, NULL);
 	if (IS_ERR(op_data)) {
-		kfree(einfo);
 		kfree(minfo);
-		return PTR_ERR(op_data);
+		return (struct md_enqueue_info *)op_data;
 	}
 
+	if (!child)
+		op_data->op_fid2 = entry->se_fid;
+
 	minfo->mi_it.it_op = IT_GETATTR;
 	minfo->mi_dir = igrab(dir);
 	minfo->mi_cb = ll_statahead_interpret;
 	minfo->mi_cbdata = entry;
 
+	einfo = &minfo->mi_einfo;
 	einfo->ei_type   = LDLM_IBITS;
 	einfo->ei_mode   = it_to_lock_mode(&minfo->mi_it);
 	einfo->ei_cb_bl  = ll_md_blocking_ast;
@@ -771,26 +750,22 @@ static int sa_prep_data(struct inode *dir, struct inode *child,
 	einfo->ei_cb_gl  = NULL;
 	einfo->ei_cbdata = NULL;
 
-	*pmi = minfo;
-	*pei = einfo;
-
-	return 0;
+	return minfo;
 }
 
 /* async stat for file not found in dcache */
 static int sa_lookup(struct inode *dir, struct sa_entry *entry)
 {
 	struct md_enqueue_info   *minfo;
-	struct ldlm_enqueue_info *einfo;
 	int		       rc;
 
-	rc = sa_prep_data(dir, NULL, entry, &minfo, &einfo);
-	if (rc)
-		return rc;
+	minfo = sa_prep_data(dir, NULL, entry);
+	if (IS_ERR(minfo))
+		return PTR_ERR(minfo);
 
-	rc = md_intent_getattr_async(ll_i2mdexp(dir), minfo, einfo);
+	rc = md_intent_getattr_async(ll_i2mdexp(dir), minfo);
 	if (rc)
-		sa_fini_data(minfo, einfo);
+		sa_fini_data(minfo);
 
 	return rc;
 }
@@ -809,7 +784,6 @@ static int sa_revalidate(struct inode *dir, struct sa_entry *entry,
 	struct lookup_intent      it = { .it_op = IT_GETATTR,
 					 .it_lock_handle = 0 };
 	struct md_enqueue_info   *minfo;
-	struct ldlm_enqueue_info *einfo;
 	int rc;
 
 	if (unlikely(!inode))
@@ -827,25 +801,26 @@ static int sa_revalidate(struct inode *dir, struct sa_entry *entry,
 		return 1;
 	}
 
-	rc = sa_prep_data(dir, inode, entry, &minfo, &einfo);
-	if (rc) {
+	minfo = sa_prep_data(dir, inode, entry);
+	if (IS_ERR(minfo)) {
 		entry->se_inode = NULL;
 		iput(inode);
-		return rc;
+		return PTR_ERR(minfo);
 	}
 
-	rc = md_intent_getattr_async(ll_i2mdexp(dir), minfo, einfo);
+	rc = md_intent_getattr_async(ll_i2mdexp(dir), minfo);
 	if (rc) {
 		entry->se_inode = NULL;
 		iput(inode);
-		sa_fini_data(minfo, einfo);
+		sa_fini_data(minfo);
 	}
 
 	return rc;
 }
 
 /* async stat for file with @name */
-static void sa_statahead(struct dentry *parent, const char *name, int len)
+static void sa_statahead(struct dentry *parent, const char *name, int len,
+			 const struct lu_fid *fid)
 {
 	struct inode	     *dir    = d_inode(parent);
 	struct ll_inode_info     *lli    = ll_i2info(dir);
@@ -854,7 +829,7 @@ static void sa_statahead(struct dentry *parent, const char *name, int len)
 	struct sa_entry *entry;
 	int		       rc;
 
-	entry = sa_alloc(parent, sai, sai->sai_index, name, len);
+	entry = sa_alloc(parent, sai, sai->sai_index, name, len, fid);
 	if (IS_ERR(entry))
 		return;
 
@@ -1043,6 +1018,7 @@ static int ll_statahead_thread(void *arg)
 		for (ent = lu_dirent_start(dp);
 		     ent && thread_is_running(sa_thread) && !sa_low_hit(sai);
 		     ent = lu_dirent_next(ent)) {
+			struct lu_fid fid;
 			__u64 hash;
 			int namelen;
 			char *name;
@@ -1088,6 +1064,8 @@ static int ll_statahead_thread(void *arg)
 			if (unlikely(++first == 1))
 				continue;
 
+			fid_le_to_cpu(&fid, &ent->lde_fid);
+
 			/* wait for spare statahead window */
 			do {
 				l_wait_event(sa_thread->t_ctl_waitq,
@@ -1117,7 +1095,7 @@ static int ll_statahead_thread(void *arg)
 			} while (sa_sent_full(sai) &&
 				 thread_is_running(sa_thread));
 
-			sa_statahead(parent, name, namelen);
+			sa_statahead(parent, name, namelen, &fid);
 		}
 
 		pos = le64_to_cpu(dp->ldp_hash_end);
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 76a0306..6a3b83f 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -3012,24 +3012,40 @@ static int lmv_clear_open_replay_data(struct obd_export *exp,
 }
 
 static int lmv_intent_getattr_async(struct obd_export *exp,
-				    struct md_enqueue_info *minfo,
-				    struct ldlm_enqueue_info *einfo)
+				    struct md_enqueue_info *minfo)
 {
 	struct md_op_data       *op_data = &minfo->mi_data;
 	struct obd_device       *obd = exp->exp_obd;
 	struct lmv_obd	  *lmv = &obd->u.lmv;
-	struct lmv_tgt_desc     *tgt = NULL;
+	struct lmv_tgt_desc *ptgt = NULL;
+	struct lmv_tgt_desc *ctgt = NULL;
 	int		      rc;
 
+	if (!fid_is_sane(&op_data->op_fid2))
+		return -EINVAL;
+
 	rc = lmv_check_connect(obd);
 	if (rc)
 		return rc;
 
-	tgt = lmv_locate_mds(lmv, op_data, &op_data->op_fid1);
-	if (IS_ERR(tgt))
-		return PTR_ERR(tgt);
+	ptgt = lmv_locate_mds(lmv, op_data, &op_data->op_fid1);
+	if (IS_ERR(ptgt))
+		return PTR_ERR(ptgt);
+
+	ctgt = lmv_locate_mds(lmv, op_data, &op_data->op_fid2);
+	if (IS_ERR(ctgt))
+		return PTR_ERR(ctgt);
+
+	/*
+	 * if child is on remote MDT, we need 2 async RPCs to fetch both LOOKUP
+	 * lock on parent, and UPDATE lock on child MDT, which makes all
+	 * complicated. Considering remote dir is rare case, and not supporting
+	 * it in statahead won't cause any issue, drop its support for now.
+	 */
+	if (ptgt != ctgt)
+		return -ENOTSUPP;
 
-	return md_intent_getattr_async(tgt->ltd_exp, minfo, einfo);
+	return md_intent_getattr_async(ptgt->ltd_exp, minfo);
 }
 
 static int lmv_revalidate_lock(struct obd_export *exp, struct lookup_intent *it,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_internal.h b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
index 881c6a0..fecedc88 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_internal.h
+++ b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
@@ -116,8 +116,7 @@ int mdc_revalidate_lock(struct obd_export *exp, struct lookup_intent *it,
 			struct lu_fid *fid, __u64 *bits);
 
 int mdc_intent_getattr_async(struct obd_export *exp,
-			     struct md_enqueue_info *minfo,
-			     struct ldlm_enqueue_info *einfo);
+			     struct md_enqueue_info *minfo);
 
 enum ldlm_mode mdc_lock_match(struct obd_export *exp, __u64 flags,
 			      const struct lu_fid *fid, enum ldlm_type type,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 54ebb99..156add7 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -49,7 +49,6 @@
 struct mdc_getattr_args {
 	struct obd_export	   *ga_exp;
 	struct md_enqueue_info      *ga_minfo;
-	struct ldlm_enqueue_info    *ga_einfo;
 };
 
 int it_open_error(int phase, struct lookup_intent *it)
@@ -1111,7 +1110,7 @@ static int mdc_intent_getattr_async_interpret(const struct lu_env *env,
 	struct mdc_getattr_args  *ga = args;
 	struct obd_export	*exp = ga->ga_exp;
 	struct md_enqueue_info   *minfo = ga->ga_minfo;
-	struct ldlm_enqueue_info *einfo = ga->ga_einfo;
+	struct ldlm_enqueue_info *einfo = &minfo->mi_einfo;
 	struct lookup_intent     *it;
 	struct lustre_handle     *lockh;
 	struct obd_device	*obddev;
@@ -1147,14 +1146,12 @@ static int mdc_intent_getattr_async_interpret(const struct lu_env *env,
 	rc = mdc_finish_intent_lock(exp, req, &minfo->mi_data, it, lockh);
 
 out:
-	kfree(einfo);
 	minfo->mi_cb(req, minfo, rc);
 	return 0;
 }
 
 int mdc_intent_getattr_async(struct obd_export *exp,
-			     struct md_enqueue_info *minfo,
-			     struct ldlm_enqueue_info *einfo)
+			     struct md_enqueue_info *minfo)
 {
 	struct md_op_data       *op_data = &minfo->mi_data;
 	struct lookup_intent    *it = &minfo->mi_it;
@@ -1162,10 +1159,6 @@ int mdc_intent_getattr_async(struct obd_export *exp,
 	struct mdc_getattr_args *ga;
 	struct obd_device       *obddev = class_exp2obd(exp);
 	struct ldlm_res_id       res_id;
-	/*XXX: Both MDS_INODELOCK_LOOKUP and MDS_INODELOCK_UPDATE are needed
-	 *     for statahead currently. Consider CMD in future, such two bits
-	 *     maybe managed by different MDS, should be adjusted then.
-	 */
 	union ldlm_policy_data policy = {
 		.l_inodebits = { MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE }
 	};
@@ -1188,8 +1181,8 @@ int mdc_intent_getattr_async(struct obd_export *exp,
 		return rc;
 	}
 
-	rc = ldlm_cli_enqueue(exp, &req, einfo, &res_id, &policy, &flags, NULL,
-			      0, LVB_T_NONE, &minfo->mi_lockh, 1);
+	rc = ldlm_cli_enqueue(exp, &req, &minfo->mi_einfo, &res_id, &policy,
+			      &flags, NULL, 0, LVB_T_NONE, &minfo->mi_lockh, 1);
 	if (rc < 0) {
 		obd_put_request_slot(&obddev->u.cli);
 		ptlrpc_req_finished(req);
@@ -1200,7 +1193,6 @@ int mdc_intent_getattr_async(struct obd_export *exp,
 	ga = ptlrpc_req_async_args(req);
 	ga->ga_exp = exp;
 	ga->ga_minfo = minfo;
-	ga->ga_einfo = einfo;
 
 	req->rq_interpret_reply = mdc_intent_getattr_async_interpret;
 	ptlrpcd_add_req(req);
-- 
1.8.3.1
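
Note on the error-pointer convention used above: sa_prep_data() now returns
either a valid md_enqueue_info or an encoded errno, and the enqueue info is
embedded in that structure (mi_einfo), so a single free releases both. Below
is a minimal userspace sketch of that pattern. The err_ptr()/is_err()/ptr_err()
helpers and the demo_* names are illustrative stand-ins, not the kernel or
Lustre definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline int is_err(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical request structure with the enqueue info embedded in it. */
struct demo_einfo {
	int mode;
};

struct demo_minfo {
	struct demo_einfo einfo;
	int index;
};

/* Returns a valid pointer on success, or an encoded -errno on failure. */
static struct demo_minfo *demo_prep_data(int index)
{
	struct demo_minfo *minfo;

	if (index < 0)
		return err_ptr(-EINVAL);

	minfo = calloc(1, sizeof(*minfo));
	if (!minfo)
		return err_ptr(-ENOMEM);

	minfo->index = index;
	return minfo;
}

/* A single free releases both structures because einfo is embedded. */
static void demo_fini_data(struct demo_minfo *minfo)
{
	assert(!is_err(minfo));
	free(minfo);
}
```

A caller checks is_err() on the result and propagates ptr_err() on failure,
which removes the need for separate output parameters and a separate return
code.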

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 03/60] staging: lustre: clio: add cl_page LRU shrinker
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
  2017-01-29  0:04 ` [PATCH 01/60] staging: lustre: llite: Remove access of stripe in ll_setattr_raw James Simmons
  2017-01-29  0:04 ` [PATCH 02/60] staging: lustre: statahead: drop support for remote entry James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 04/60] staging: lustre: mdc: quiet console message for known -EINTR James Simmons
                   ` (57 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

Register a cache shrinker to reclaim memory from the cl_page LRU list.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6842
Reviewed-on: http://review.whamcloud.com/15630
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
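Illustration only, not part of the patch: the scan callback asks each OSC for
at most cl_max_pages_per_rpc pages per visit and stops once sc->nr_to_scan
pages have been reclaimed. A simplified userspace sketch of that clamping
follows. The demo_* names are hypothetical, and a plain array walk stands in
for the locked, rotating list used in osc_cache_shrink_scan().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one OSC client. */
struct demo_client {
	unsigned long lru_pages;		/* pages currently cached */
	unsigned long max_pages_per_rpc;	/* cap per visit */
};

/* Reclaim up to 'want' pages from one client; returns pages reclaimed. */
static unsigned long demo_lru_shrink(struct demo_client *cli,
				     unsigned long want)
{
	unsigned long got = want < cli->lru_pages ? want : cli->lru_pages;

	cli->lru_pages -= got;
	return got;
}

/*
 * Visit each client once, asking each for at most max_pages_per_rpc pages,
 * and stop as soon as nr_to_scan pages have been reclaimed.
 */
static unsigned long demo_shrink_scan(struct demo_client *clients, size_t n,
				      unsigned long nr_to_scan)
{
	unsigned long shrank = 0;
	size_t i;

	assert(clients != NULL || n == 0);

	for (i = 0; i < n && shrank < nr_to_scan; i++) {
		unsigned long want = nr_to_scan - shrank;

		if (want > clients[i].max_pages_per_rpc)
			want = clients[i].max_pages_per_rpc;
		shrank += demo_lru_shrink(&clients[i], want);
	}
	return shrank;
}
```

Capping the per-client request spreads reclaim across clients instead of
draining the first one on the list.
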
 drivers/staging/lustre/lustre/include/obd.h      |  2 +
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c    |  1 +
 drivers/staging/lustre/lustre/osc/osc_internal.h |  9 +++
 drivers/staging/lustre/lustre/osc/osc_page.c     | 87 ++++++++++++++++++++++++
 drivers/staging/lustre/lustre/osc/osc_request.c  | 21 ++++++
 5 files changed, 120 insertions(+)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 7f0fc44..6d3bd05 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -287,6 +287,8 @@ struct client_obd {
 	 * the transaction has NOT yet committed.
 	 */
 	atomic_long_t		 cl_unstable_count;
+	/** Link to osc_shrinker_list */
+	struct list_head	 cl_shrink_list;
 
 	/* number of in flight destroy rpcs is limited to max_rpcs_in_flight */
 	atomic_t	     cl_destroy_in_flight;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 9be0142..675e25b 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -336,6 +336,7 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	INIT_LIST_HEAD(&cli->cl_lru_list);
 	spin_lock_init(&cli->cl_lru_list_lock);
 	atomic_long_set(&cli->cl_unstable_count, 0);
+	INIT_LIST_HEAD(&cli->cl_shrink_list);
 
 	init_waitqueue_head(&cli->cl_destroy_waitq);
 	atomic_set(&cli->cl_destroy_in_flight, 0);
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index ff7c9ec..43a43e4 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -222,4 +222,13 @@ struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 
 int osc_object_invalidate(const struct lu_env *env, struct osc_object *osc);
 
+/** osc shrink list to link all osc client obd */
+extern struct list_head osc_shrink_list;
+/** spin lock to protect osc_shrink_list */
+extern spinlock_t osc_shrink_lock;
+unsigned long osc_cache_shrink_count(struct shrinker *sk,
+				     struct shrink_control *sc);
+unsigned long osc_cache_shrink_scan(struct shrinker *sk,
+				    struct shrink_control *sc);
+
 #endif /* OSC_INTERNAL_H */
diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c b/drivers/staging/lustre/lustre/osc/osc_page.c
index e356e4a..0461408 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -943,4 +943,91 @@ bool osc_over_unstable_soft_limit(struct client_obd *cli)
 				    cli->cl_max_rpcs_in_flight;
 }
 
+/**
+ * Return how many LRU pages in the cache of all OSC devices
+ *
+ * Return:	return # of cached LRU pages times reclamation tendency
+ *		SHRINK_STOP if it cannot do any scanning in this time
+ */
+unsigned long osc_cache_shrink_count(struct shrinker *sk,
+				     struct shrink_control *sc)
+{
+	struct client_obd *cli;
+	unsigned long cached = 0;
+
+	spin_lock(&osc_shrink_lock);
+	list_for_each_entry(cli, &osc_shrink_list, cl_shrink_list)
+		cached += atomic_long_read(&cli->cl_lru_in_list);
+	spin_unlock(&osc_shrink_lock);
+
+	return (cached  * sysctl_vfs_cache_pressure) / 100;
+}
+
+/**
+ * Scan and try to reclaim sc->nr_to_scan cached LRU pages
+ *
+ * Return:	number of cached LRU pages reclaimed
+ *		SHRINK_STOP if it cannot do any scanning in this time
+ *
+ * Linux kernel will loop calling this shrinker scan routine with
+ * sc->nr_to_scan = SHRINK_BATCH(128 for now) until kernel got enough memory.
+ *
+ * If sc->nr_to_scan is 0, the VM is querying the cache size, we don't need
+ * to scan and try to reclaim LRU pages, just return 0 and
+ * osc_cache_shrink_count() will report the LRU page number.
+ */
+unsigned long osc_cache_shrink_scan(struct shrinker *sk,
+				    struct shrink_control *sc)
+{
+	struct client_obd *stop_anchor = NULL;
+	struct client_obd *cli;
+	struct lu_env *env;
+	long shrank = 0;
+	int refcheck;
+	int rc;
+
+	if (!sc->nr_to_scan)
+		return 0;
+
+	if (!(sc->gfp_mask & __GFP_FS))
+		return SHRINK_STOP;
+
+	env = cl_env_get(&refcheck);
+	if (IS_ERR(env))
+		return SHRINK_STOP;
+
+	spin_lock(&osc_shrink_lock);
+	while (!list_empty(&osc_shrink_list)) {
+		cli = list_entry(osc_shrink_list.next, struct client_obd,
+				 cl_shrink_list);
+
+		if (!stop_anchor)
+			stop_anchor = cli;
+		else if (cli == stop_anchor)
+			break;
+
+		list_move_tail(&cli->cl_shrink_list, &osc_shrink_list);
+		spin_unlock(&osc_shrink_lock);
+
+		/* shrink no more than max_pages_per_rpc for an OSC */
+		rc = osc_lru_shrink(env, cli, (sc->nr_to_scan - shrank) >
+				    cli->cl_max_pages_per_rpc ?
+				    cli->cl_max_pages_per_rpc :
+				    sc->nr_to_scan - shrank, true);
+		if (rc > 0)
+			shrank += rc;
+
+		if (shrank >= sc->nr_to_scan)
+			goto out;
+
+		spin_lock(&osc_shrink_lock);
+	}
+	spin_unlock(&osc_shrink_lock);
+
+out:
+	cl_env_put(env, &refcheck);
+
+	return shrank;
+}
+
 /** @} osc */
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 3efae75..c2c0385 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2675,6 +2675,11 @@ int osc_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 
 	INIT_LIST_HEAD(&cli->cl_grant_shrink_list);
 	ns_register_cancel(obd->obd_namespace, osc_cancel_weight);
+
+	spin_lock(&osc_shrink_lock);
+	list_add_tail(&cli->cl_shrink_list, &osc_shrink_list);
+	spin_unlock(&osc_shrink_lock);
+
 	return rc;
 
 out_ptlrpcd_work:
@@ -2728,6 +2733,10 @@ static int osc_cleanup(struct obd_device *obd)
 	struct client_obd *cli = &obd->u.cli;
 	int rc;
 
+	spin_lock(&osc_shrink_lock);
+	list_del(&cli->cl_shrink_list);
+	spin_unlock(&osc_shrink_lock);
+
 	/* lru cleanup */
 	if (cli->cl_cache) {
 		LASSERT(atomic_read(&cli->cl_cache->ccc_users) > 0);
@@ -2795,6 +2804,15 @@ static int osc_process_config(struct obd_device *obd, u32 len, void *buf)
 	.quotactl       = osc_quotactl,
 };
 
+struct list_head osc_shrink_list = LIST_HEAD_INIT(osc_shrink_list);
+DEFINE_SPINLOCK(osc_shrink_lock);
+
+static struct shrinker osc_cache_shrinker = {
+	.count_objects	= osc_cache_shrink_count,
+	.scan_objects	= osc_cache_shrink_scan,
+	.seeks		= DEFAULT_SEEKS,
+};
+
 static int __init osc_init(void)
 {
 	struct lprocfs_static_vars lvars = { NULL };
@@ -2819,6 +2837,8 @@ static int __init osc_init(void)
 	if (rc)
 		goto out_kmem;
 
+	register_shrinker(&osc_cache_shrinker);
+
 	/* This is obviously too much memory, only prevent overflow here */
 	if (osc_reqpool_mem_max >= 1 << 12 || osc_reqpool_mem_max == 0) {
 		rc = -EINVAL;
@@ -2857,6 +2877,7 @@ static int __init osc_init(void)
 
 static void /*__exit*/ osc_exit(void)
 {
+	unregister_shrinker(&osc_cache_shrinker);
 	class_unregister_type(LUSTRE_OSC_NAME);
 	lu_kmem_fini(osc_caches);
 	ptlrpc_free_rq_pool(osc_rq_pool);
-- 
1.8.3.1


* [PATCH 04/60] staging: lustre: mdc: quiet console message for known -EINTR
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (2 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 03/60] staging: lustre: clio: add cl_page LRU shrinker James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 05/60] staging: lustre: llite: check request != NULL in ll_migrate James Simmons
                   ` (56 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

From: Andreas Dilger <andreas.dilger@intel.com>

If a user process is interrupted while waiting for MDS recovery during
close, the file is still closed but an error message is printed on the
console. Quiet the console message for -EINTR, since this is expected
behaviour.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6627
Reviewed-on: http://review.whamcloud.com/14911
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
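Illustration only, not part of the patch: the logging decision reduces to a
small predicate over the return code. The helper name below is hypothetical;
the patch itself open-codes the test before CERROR().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: should a failed close be reported on the console? */
static bool demo_close_should_log(int rc)
{
	/* Kernel-style return codes are 0 or a negative errno. */
	assert(rc <= 0);

	/* -EINTR means the waiting process was interrupted, which is expected. */
	return rc != 0 && rc != -EINTR;
}
```

The return code is still passed back to the caller unchanged; only the
console message is suppressed.
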
 drivers/staging/lustre/lustre/llite/file.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 0ee02f1..a1e51a5 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -122,26 +122,25 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 				     enum mds_op_bias bias,
 				     void *data)
 {
-	struct obd_export *exp = ll_i2mdexp(inode);
+	const struct ll_inode_info *lli = ll_i2info(inode);
 	struct md_op_data *op_data;
 	struct ptlrpc_request *req = NULL;
-	struct obd_device *obd = class_exp2obd(exp);
 	int rc;
 
-	if (!obd) {
-		/*
-		 * XXX: in case of LMV, is this correct to access
-		 * ->exp_handle?
-		 */
-		CERROR("Invalid MDC connection handle %#llx\n",
-		       ll_i2mdexp(inode)->exp_handle.h_cookie);
+	if (!class_exp2obd(md_exp)) {
+		CERROR("%s: invalid MDC connection handle closing " DFID "\n",
+		       ll_get_fsname(inode->i_sb, NULL, 0),
+		       PFID(&lli->lli_fid));
 		rc = 0;
 		goto out;
 	}
 
 	op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
+	/*
+	 * We leak openhandle and request here on error, but not much to be
+	 * done in OOM case since app won't retry close on error either.
+	 */
 	if (!op_data) {
-		/* XXX We leak openhandle and request here. */
 		rc = -ENOMEM;
 		goto out;
 	}
@@ -170,10 +169,9 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	}
 
 	rc = md_close(md_exp, op_data, och->och_mod, &req);
-	if (rc) {
-		CERROR("%s: inode "DFID" mdc close failed: rc = %d\n",
-		       ll_i2mdexp(inode)->exp_obd->obd_name,
-		       PFID(ll_inode2fid(inode)), rc);
+	if (rc && rc != -EINTR) {
+		CERROR("%s: inode " DFID " mdc close failed: rc = %d\n",
+		       md_exp->exp_obd->obd_name, PFID(&lli->lli_fid), rc);
 	}
 
 	if (op_data->op_bias & (MDS_HSM_RELEASE | MDS_CLOSE_LAYOUT_SWAP) &&
@@ -192,8 +190,7 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	och->och_fh.cookie = DEAD_HANDLE_MAGIC;
 	kfree(och);
 
-	if (req) /* This is close request */
-		ptlrpc_req_finished(req);
+	ptlrpc_req_finished(req);
 	return rc;
 }
 
-- 
1.8.3.1


* [PATCH 05/60] staging: lustre: llite: check request != NULL in ll_migrate
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (3 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 04/60] staging: lustre: mdc: quiet console message for known -EINTR James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-30 11:34   ` Dan Carpenter
  2017-01-29  0:04 ` [PATCH 06/60] staging: lustre: clio: revise readahead to support 16MB IO James Simmons
                   ` (55 subsequent siblings)
  60 siblings, 1 reply; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

Check whether the request is NULL before retrieving the reply body
from it.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7396
Reviewed-on: http://review.whamcloud.com/17079
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/file.c | 41 +++++++++++++++++-------------
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index a1e51a5..b681e15 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2656,28 +2656,33 @@ int ll_migrate(struct inode *parent, struct file *file, int mdtidx,
 	if (!rc)
 		ll_update_times(request, parent);
 
-	body = req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY);
-	if (!body) {
-		rc = -EPROTO;
-		goto out_free;
-	}
+	if (request) {
+		body = req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY);
+		if (!body) {
+			rc = -EPROTO;
+			goto out_free;
+		}
 
-	/*
-	 * If the server does release layout lock, then we cleanup
-	 * the client och here, otherwise release it in out_free:
-	 */
-	if (och && body->mbo_valid & OBD_MD_CLOSE_INTENT_EXECED) {
-		obd_mod_put(och->och_mod);
-		md_clear_open_replay_data(ll_i2sbi(parent)->ll_md_exp, och);
-		och->och_fh.cookie = DEAD_HANDLE_MAGIC;
-		kfree(och);
-		och = NULL;
-	}
+		/*
+		 * If the server does release layout lock, then we cleanup
+		 * the client och here, otherwise release it in out_free:
+		 */
+		if (och && body->mbo_valid & OBD_MD_CLOSE_INTENT_EXECED) {
+			obd_mod_put(och->och_mod);
+			md_clear_open_replay_data(ll_i2sbi(parent)->ll_md_exp,
+						  och);
+			och->och_fh.cookie = DEAD_HANDLE_MAGIC;
+			kfree(och);
+			och = NULL;
+		}
 
-	ptlrpc_req_finished(request);
+		ptlrpc_req_finished(request);
+	}
 	/* Try again if the file layout has changed. */
-	if (rc == -EAGAIN && S_ISREG(child_inode->i_mode))
+	if (rc == -EAGAIN && S_ISREG(child_inode->i_mode)) {
+		request = NULL;
 		goto again;
+	}
 out_free:
 	if (child_inode) {
 		if (och) /* close the file */
-- 
1.8.3.1


* [PATCH 06/60] staging: lustre: clio: revise readahead to support 16MB IO
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (4 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 05/60] staging: lustre: llite: check request != NULL in ll_migrate James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 07/60] staging: lustre: ptlrpc: set proper mbits for EINPROGRESS resend James Simmons
                   ` (54 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jinshan Xiong, Gu Zheng, James Simmons

From: Jinshan Xiong <jinshan.xiong@intel.com>

Readahead currently doesn't handle 16MB RPCs correctly because it
assumes a default RPC size instead of querying the actual size. Adjust
the readahead policy to issue readahead RPCs sized to the underlying
RPC size.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7990
Reviewed-on: http://review.whamcloud.com/19368
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
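Illustration only, not part of the patch: the window end is rounded down to
an RPC boundary unless that would cut off the current read or the window
already ends at EOF. A userspace sketch of that arithmetic follows. The
demo_* names are hypothetical, and the extra clamp against cra_end from
ll_read_ahead_pages() is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Round index down to a multiple of rpc_size; optionally report the rest. */
static unsigned long demo_align(unsigned long rpc_size, unsigned long index,
				unsigned long *remainder)
{
	unsigned long rem;

	assert(rpc_size > 0);
	rem = index % rpc_size;
	if (remainder)
		*remainder = rem;
	return index - rem;
}

/*
 * Trim an inclusive window end so the window stops on an RPC boundary,
 * but never below end_min, the last page the current read needs.
 */
static unsigned long demo_trim_end(unsigned long rpc_size, unsigned long end,
				   unsigned long end_min, int eof)
{
	unsigned long aligned = demo_align(rpc_size, end + 1, NULL);

	if (aligned > 0 && !eof)
		end = aligned - 1;
	if (end < end_min)
		end = end_min;
	return end;
}
```

With 4KB pages and a 1MB RPC (256 pages), a window ending at page 1000 is
trimmed to page 767, while a window shorter than one RPC is left alone.
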
 drivers/staging/lustre/lustre/include/cl_object.h  |   4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |  10 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |  14 +-
 drivers/staging/lustre/lustre/llite/rw.c           | 195 ++++++++++-----------
 drivers/staging/lustre/lustre/osc/osc_io.c         |   3 +-
 5 files changed, 114 insertions(+), 112 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index a1b8301..813e71d 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1452,8 +1452,10 @@ struct cl_read_ahead {
 	 * cra_end is included.
 	 */
 	pgoff_t cra_end;
+	/* optimal RPC size for this read, by pages */
+	unsigned long cra_rpc_size;
 	/*
-	 * Release routine. If readahead holds resources underneath, this
+	 * Release callback. If readahead holds resources underneath, this
 	 * function should be called to release it.
 	 */
 	void (*cra_release)(const struct lu_env *env, void *cbdata);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 675e25b..95b8c76 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -351,13 +351,11 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	cli->cl_supp_cksum_types = OBD_CKSUM_CRC32;
 	atomic_set(&cli->cl_resends, OSC_DEFAULT_RESENDS);
 
-	/* This value may be reduced at connect time in
-	 * ptlrpc_connect_interpret() . We initialize it to only
-	 * 1MB until we know what the performance looks like.
-	 * In the future this should likely be increased. LU-1431
+	/*
+	 * Set it to possible maximum size. It may be reduced by ocd_brw_size
+	 * from OFD after connecting.
 	 */
-	cli->cl_max_pages_per_rpc = min_t(int, PTLRPC_MAX_BRW_PAGES,
-					  LNET_MTU >> PAGE_SHIFT);
+	cli->cl_max_pages_per_rpc = PTLRPC_MAX_BRW_PAGES;
 
 	/*
 	 * set cl_chunkbits default value to PAGE_CACHE_SHIFT,
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 2c72177..501957c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -281,10 +281,8 @@ static inline struct ll_inode_info *ll_i2info(struct inode *inode)
 	return container_of(inode, struct ll_inode_info, lli_vfs_inode);
 }
 
-/* default to about 40meg of readahead on a given system.  That much tied
- * up in 512k readahead requests serviced at 40ms each is about 1GB/s.
- */
-#define SBI_DEFAULT_READAHEAD_MAX (40UL << (20 - PAGE_SHIFT))
+/* default to about 64M of readahead on a given system. */
+#define SBI_DEFAULT_READAHEAD_MAX	(64UL << (20 - PAGE_SHIFT))
 
 /* default to read-ahead full files smaller than 2MB on the second read */
 #define SBI_DEFAULT_READAHEAD_WHOLE_MAX (2UL << (20 - PAGE_SHIFT))
@@ -321,6 +319,9 @@ struct ll_ra_info {
 struct ra_io_arg {
 	unsigned long ria_start;  /* start offset of read-ahead*/
 	unsigned long ria_end;    /* end offset of read-ahead*/
+	unsigned long ria_reserved; /* reserved pages for read-ahead */
+	unsigned long ria_end_min;  /* minimum end to cover current read */
+	bool ria_eof;		    /* reach end of file */
 	/* If stride read pattern is detected, ria_stoff means where
 	 * stride read is started. Note: for normal read-ahead, the
 	 * value here is meaningless, and also it will not be accessed
@@ -551,6 +552,11 @@ struct ll_readahead_state {
 	 */
 	unsigned long   ras_window_start, ras_window_len;
 	/*
+	 * Optimal RPC size. It decides how many pages will be sent
+	 * for each read-ahead.
+	 */
+	unsigned long	ras_rpc_size;
+	/*
 	 * Where next read-ahead should start at. This lies within read-ahead
 	 * window. Read-ahead window is read in pieces rather than at once
 	 * because: 1. lustre limits total number of pages under read-ahead by
diff --git a/drivers/staging/lustre/lustre/llite/rw.c b/drivers/staging/lustre/lustre/llite/rw.c
index f10e092..18d3ccb 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++ b/drivers/staging/lustre/lustre/llite/rw.c
@@ -92,25 +92,6 @@ static unsigned long ll_ra_count_get(struct ll_sb_info *sbi,
 		goto out;
 	}
 
-	/* If the non-strided (ria_pages == 0) readahead window
-	 * (ria_start + ret) has grown across an RPC boundary, then trim
-	 * readahead size by the amount beyond the RPC so it ends on an
-	 * RPC boundary. If the readahead window is already ending on
-	 * an RPC boundary (beyond_rpc == 0), or smaller than a full
-	 * RPC (beyond_rpc < ret) the readahead size is unchanged.
-	 * The (beyond_rpc != 0) check is skipped since the conditional
-	 * branch is more expensive than subtracting zero from the result.
-	 *
-	 * Strided read is left unaligned to avoid small fragments beyond
-	 * the RPC boundary from needing an extra read RPC.
-	 */
-	if (ria->ria_pages == 0) {
-		long beyond_rpc = (ria->ria_start + ret) % PTLRPC_MAX_BRW_PAGES;
-
-		if (/* beyond_rpc != 0 && */ beyond_rpc < ret)
-			ret -= beyond_rpc;
-	}
-
 	if (atomic_add_return(ret, &ra->ra_cur_pages) > ra->ra_max_pages) {
 		atomic_sub(ret, &ra->ra_cur_pages);
 		ret = 0;
@@ -147,11 +128,12 @@ void ll_ra_stats_inc(struct inode *inode, enum ra_stat which)
 
 #define RAS_CDEBUG(ras) \
 	CDEBUG(D_READA,						      \
-	       "lrp %lu cr %lu cp %lu ws %lu wl %lu nra %lu r %lu ri %lu"    \
-	       "csr %lu sf %lu sp %lu sl %lu\n",			    \
+	       "lrp %lu cr %lu cp %lu ws %lu wl %lu nra %lu rpc %lu "	     \
+	       "r %lu ri %lu csr %lu sf %lu sp %lu sl %lu\n",		     \
 	       ras->ras_last_readpage, ras->ras_consecutive_requests,	\
 	       ras->ras_consecutive_pages, ras->ras_window_start,	    \
 	       ras->ras_window_len, ras->ras_next_readahead,		 \
+	       ras->ras_rpc_size,					     \
 	       ras->ras_requests, ras->ras_request_index,		    \
 	       ras->ras_consecutive_stride_requests, ras->ras_stride_offset, \
 	       ras->ras_stride_pages, ras->ras_stride_length)
@@ -261,20 +243,6 @@ static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io,
 	ria->ria_start, ria->ria_end, ria->ria_stoff, ria->ria_length,\
 	ria->ria_pages)
 
-/* Limit this to the blocksize instead of PTLRPC_BRW_MAX_SIZE, since we don't
- * know what the actual RPC size is.  If this needs to change, it makes more
- * sense to tune the i_blkbits value for the file based on the OSTs it is
- * striped over, rather than having a constant value for all files here.
- */
-
-/* RAS_INCREASE_STEP should be (1UL << (inode->i_blkbits - PAGE_SHIFT)).
- * Temporarily set RAS_INCREASE_STEP to 1MB. After 4MB RPC is enabled
- * by default, this should be adjusted corresponding with max_read_ahead_mb
- * and max_read_ahead_per_file_mb otherwise the readahead budget can be used
- * up quickly which will affect read performance significantly. See LU-2816
- */
-#define RAS_INCREASE_STEP(inode) (ONE_MB_BRW_SIZE >> PAGE_SHIFT)
-
 static inline int stride_io_mode(struct ll_readahead_state *ras)
 {
 	return ras->ras_consecutive_stride_requests > 1;
@@ -345,6 +313,17 @@ static int ria_page_count(struct ra_io_arg *ria)
 			       length);
 }
 
+static unsigned long ras_align(struct ll_readahead_state *ras,
+			       unsigned long index,
+			       unsigned long *remainder)
+{
+	unsigned long rem = index % ras->ras_rpc_size;
+
+	if (remainder)
+		*remainder = rem;
+	return index - rem;
+}
+
 /*Check whether the index is in the defined ra-window */
 static int ras_inside_ra_window(unsigned long idx, struct ra_io_arg *ria)
 {
@@ -358,42 +337,63 @@ static int ras_inside_ra_window(unsigned long idx, struct ra_io_arg *ria)
 		ria->ria_length < ria->ria_pages);
 }
 
-static int ll_read_ahead_pages(const struct lu_env *env,
-			       struct cl_io *io, struct cl_page_list *queue,
-			       struct ra_io_arg *ria,
-			       unsigned long *reserved_pages,
-			       pgoff_t *ra_end)
+static unsigned long
+ll_read_ahead_pages(const struct lu_env *env, struct cl_io *io,
+		    struct cl_page_list *queue, struct ll_readahead_state *ras,
+		    struct ra_io_arg *ria)
 {
 	struct cl_read_ahead ra = { 0 };
-	int rc, count = 0;
+	unsigned long ra_end = 0;
 	bool stride_ria;
 	pgoff_t page_idx;
+	int rc;
 
 	LASSERT(ria);
 	RIA_DEBUG(ria);
 
 	stride_ria = ria->ria_length > ria->ria_pages && ria->ria_pages > 0;
 	for (page_idx = ria->ria_start;
-	     page_idx <= ria->ria_end && *reserved_pages > 0; page_idx++) {
+	     page_idx <= ria->ria_end && ria->ria_reserved > 0; page_idx++) {
 		if (ras_inside_ra_window(page_idx, ria)) {
 			if (!ra.cra_end || ra.cra_end < page_idx) {
+				unsigned long end;
+
 				cl_read_ahead_release(env, &ra);
 
 				rc = cl_io_read_ahead(env, io, page_idx, &ra);
 				if (rc < 0)
 					break;
 
+				CDEBUG(D_READA, "idx: %lu, ra: %lu, rpc: %lu\n",
+				       page_idx, ra.cra_end, ra.cra_rpc_size);
 				LASSERTF(ra.cra_end >= page_idx,
 					 "object: %p, indcies %lu / %lu\n",
 					 io->ci_obj, ra.cra_end, page_idx);
+				/*
+				 * update read ahead RPC size.
+				 * NB: it's racy but doesn't matter
+				 */
+				if (ras->ras_rpc_size > ra.cra_rpc_size &&
+				    ra.cra_rpc_size > 0)
+					ras->ras_rpc_size = ra.cra_rpc_size;
+				/* trim it to align with optimal RPC size */
+				end = ras_align(ras, ria->ria_end + 1, NULL);
+				if (end > 0 && !ria->ria_eof)
+					ria->ria_end = end - 1;
+				if (ria->ria_end < ria->ria_end_min)
+					ria->ria_end = ria->ria_end_min;
+				if (ria->ria_end > ra.cra_end)
+					ria->ria_end = ra.cra_end;
 			}
 
-			/* If the page is inside the read-ahead window*/
+			/* If the page is inside the read-ahead window */
 			rc = ll_read_ahead_page(env, io, queue, page_idx);
-			if (!rc) {
-				(*reserved_pages)--;
-				count++;
-			}
+			if (rc < 0)
+				break;
+
+			ra_end = page_idx;
+			if (!rc)
+				ria->ria_reserved--;
 		} else if (stride_ria) {
 			/* If it is not in the read-ahead window, and it is
 			 * read-ahead mode, then check whether it should skip
@@ -420,8 +420,7 @@ static int ll_read_ahead_pages(const struct lu_env *env,
 	}
 	cl_read_ahead_release(env, &ra);
 
-	*ra_end = page_idx;
-	return count;
+	return ra_end;
 }
 
 static int ll_readahead(const struct lu_env *env, struct cl_io *io,
@@ -431,7 +430,7 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io,
 	struct vvp_io *vio = vvp_env_io(env);
 	struct ll_thread_info *lti = ll_env_info(env);
 	struct cl_attr *attr = vvp_env_thread_attr(env);
-	unsigned long len, mlen = 0, reserved;
+	unsigned long len, mlen = 0;
 	pgoff_t ra_end, start = 0, end = 0;
 	struct inode *inode;
 	struct ra_io_arg *ria = &lti->lti_ria;
@@ -478,29 +477,15 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io,
 	    end < vio->vui_ra_start + vio->vui_ra_count - 1)
 		end = vio->vui_ra_start + vio->vui_ra_count - 1;
 
-	if (end != 0) {
-		unsigned long rpc_boundary;
-		/*
-		 * Align RA window to an optimal boundary.
-		 *
-		 * XXX This would be better to align to cl_max_pages_per_rpc
-		 * instead of PTLRPC_MAX_BRW_PAGES, because the RPC size may
-		 * be aligned to the RAID stripe size in the future and that
-		 * is more important than the RPC size.
-		 */
-		/* Note: we only trim the RPC, instead of extending the RPC
-		 * to the boundary, so to avoid reading too much pages during
-		 * random reading.
-		 */
-		rpc_boundary = (end + 1) & (~(PTLRPC_MAX_BRW_PAGES - 1));
-		if (rpc_boundary > 0)
-			rpc_boundary--;
-
-		if (rpc_boundary  > start)
-			end = rpc_boundary;
+	if (end) {
+		unsigned long end_index;
 
 		/* Truncate RA window to end of file */
-		end = min(end, (unsigned long)((kms - 1) >> PAGE_SHIFT));
+		end_index = (unsigned long)((kms - 1) >> PAGE_SHIFT);
+		if (end_index <= end) {
+			end = end_index;
+			ria->ria_eof = true;
+		}
 
 		ras->ras_next_readahead = max(end, end + 1);
 		RAS_CDEBUG(ras);
@@ -535,28 +520,31 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io,
 	/* at least to extend the readahead window to cover current read */
 	if (!hit && vio->vui_ra_valid &&
 	    vio->vui_ra_start + vio->vui_ra_count > ria->ria_start) {
+		unsigned long remainder;
+
 		/* to the end of current read window. */
 		mlen = vio->vui_ra_start + vio->vui_ra_count - ria->ria_start;
 		/* trim to RPC boundary */
-		start = ria->ria_start & (PTLRPC_MAX_BRW_PAGES - 1);
-		mlen = min(mlen, PTLRPC_MAX_BRW_PAGES - start);
+		ras_align(ras, ria->ria_start, &remainder);
+		mlen = min(mlen, ras->ras_rpc_size - remainder);
+		ria->ria_end_min = ria->ria_start + mlen;
 	}
 
-	reserved = ll_ra_count_get(ll_i2sbi(inode), ria, len, mlen);
-	if (reserved < len)
+	ria->ria_reserved = ll_ra_count_get(ll_i2sbi(inode), ria, len, mlen);
+	if (ria->ria_reserved < len)
 		ll_ra_stats_inc(inode, RA_STAT_MAX_IN_FLIGHT);
 
 	CDEBUG(D_READA, "reserved pages %lu/%lu/%lu, ra_cur %d, ra_max %lu\n",
-	       reserved, len, mlen,
+	       ria->ria_reserved, len, mlen,
 	       atomic_read(&ll_i2sbi(inode)->ll_ra_info.ra_cur_pages),
 	       ll_i2sbi(inode)->ll_ra_info.ra_max_pages);
 
-	ret = ll_read_ahead_pages(env, io, queue, ria, &reserved, &ra_end);
+	ra_end = ll_read_ahead_pages(env, io, queue, ras, ria);
 
-	if (reserved != 0)
-		ll_ra_count_put(ll_i2sbi(inode), reserved);
+	if (ria->ria_reserved)
+		ll_ra_count_put(ll_i2sbi(inode), ria->ria_reserved);
 
-	if (ra_end == end + 1 && ra_end == (kms >> PAGE_SHIFT))
+	if (ra_end == end && ra_end == (kms >> PAGE_SHIFT))
 		ll_ra_stats_inc(inode, RA_STAT_EOF);
 
 	/* if we didn't get to the end of the region we reserved from
@@ -568,13 +556,13 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io,
 	CDEBUG(D_READA, "ra_end = %lu end = %lu stride end = %lu pages = %d\n",
 	       ra_end, end, ria->ria_end, ret);
 
-	if (ra_end != end + 1) {
+	if (ra_end > 0 && ra_end != end) {
 		ll_ra_stats_inc(inode, RA_STAT_FAILED_REACH_END);
 		spin_lock(&ras->ras_lock);
-		if (ra_end < ras->ras_next_readahead &&
+		if (ra_end <= ras->ras_next_readahead &&
 		    index_in_window(ra_end, ras->ras_window_start, 0,
 				    ras->ras_window_len)) {
-			ras->ras_next_readahead = ra_end;
+			ras->ras_next_readahead = ra_end + 1;
 			RAS_CDEBUG(ras);
 		}
 		spin_unlock(&ras->ras_lock);
@@ -586,7 +574,7 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io,
 static void ras_set_start(struct inode *inode, struct ll_readahead_state *ras,
 			  unsigned long index)
 {
-	ras->ras_window_start = index & (~(RAS_INCREASE_STEP(inode) - 1));
+	ras->ras_window_start = ras_align(ras, index, NULL);
 }
 
 /* called with the ras_lock held or from places where it doesn't matter */
@@ -615,6 +603,7 @@ static void ras_stride_reset(struct ll_readahead_state *ras)
 void ll_readahead_init(struct inode *inode, struct ll_readahead_state *ras)
 {
 	spin_lock_init(&ras->ras_lock);
+	ras->ras_rpc_size = PTLRPC_MAX_BRW_PAGES;
 	ras_reset(inode, ras, 0);
 	ras->ras_requests = 0;
 }
@@ -719,12 +708,15 @@ static void ras_increase_window(struct inode *inode,
 	 * but current clio architecture does not support retrieve such
 	 * information from lower layer. FIXME later
 	 */
-	if (stride_io_mode(ras))
-		ras_stride_increase_window(ras, ra, RAS_INCREASE_STEP(inode));
-	else
-		ras->ras_window_len = min(ras->ras_window_len +
-					  RAS_INCREASE_STEP(inode),
-					  ra->ra_max_pages_per_file);
+	if (stride_io_mode(ras)) {
+		ras_stride_increase_window(ras, ra, ras->ras_rpc_size);
+	} else {
+		unsigned long wlen;
+
+		wlen = min(ras->ras_window_len + ras->ras_rpc_size,
+			   ra->ra_max_pages_per_file);
+		ras->ras_window_len = ras_align(ras, wlen, NULL);
+	}
 }
 
 static void ras_update(struct ll_sb_info *sbi, struct inode *inode,
@@ -852,6 +844,8 @@ static void ras_update(struct ll_sb_info *sbi, struct inode *inode,
 		 * instead of ras_window_start, which is RPC aligned
 		 */
 		ras->ras_next_readahead = max(index, ras->ras_next_readahead);
+		ras->ras_window_start = max(ras->ras_stride_offset,
+					    ras->ras_window_start);
 	} else {
 		if (ras->ras_next_readahead < ras->ras_window_start)
 			ras->ras_next_readahead = ras->ras_window_start;
@@ -881,7 +875,7 @@ static void ras_update(struct ll_sb_info *sbi, struct inode *inode,
 		 */
 		ras->ras_next_readahead = max(index, ras->ras_next_readahead);
 		ras->ras_stride_offset = index;
-		ras->ras_window_len = RAS_INCREASE_STEP(inode);
+		ras->ras_window_start = max(index, ras->ras_window_start);
 	}
 
 	/* The initial ras_window_len is set to the request size.  To avoid
@@ -1098,38 +1092,39 @@ static int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
 	struct cl_2queue *queue  = &io->ci_queue;
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 	struct vvp_page *vpg;
+	bool uptodate;
 	int rc = 0;
 
 	vpg = cl2vvp_page(cl_object_page_slice(page->cp_obj, page));
+	uptodate = vpg->vpg_defer_uptodate;
+
 	if (sbi->ll_ra_info.ra_max_pages_per_file > 0 &&
 	    sbi->ll_ra_info.ra_max_pages > 0) {
 		struct vvp_io *vio = vvp_env_io(env);
 		enum ras_update_flags flags = 0;
 
-		if (vpg->vpg_defer_uptodate)
+		if (uptodate)
 			flags |= LL_RAS_HIT;
 		if (!vio->vui_ra_valid)
 			flags |= LL_RAS_MMAP;
 		ras_update(sbi, inode, ras, vvp_index(vpg), flags);
 	}
 
-	if (vpg->vpg_defer_uptodate) {
+	cl_2queue_init(queue);
+	if (uptodate) {
 		vpg->vpg_ra_used = 1;
 		cl_page_export(env, page, 1);
+		cl_page_disown(env, io, page);
+	} else {
+		cl_page_list_add(&queue->c2_qin, page);
 	}
 
-	cl_2queue_init(queue);
-	/*
-	 * Add page into the queue even when it is marked uptodate above.
-	 * this will unlock it automatically as part of cl_page_list_disown().
-	 */
-	cl_page_list_add(&queue->c2_qin, page);
 	if (sbi->ll_ra_info.ra_max_pages_per_file > 0 &&
 	    sbi->ll_ra_info.ra_max_pages > 0) {
 		int rc2;
 
 		rc2 = ll_readahead(env, io, &queue->c2_qin, ras,
-				   vpg->vpg_defer_uptodate);
+				   uptodate);
 		CDEBUG(D_READA, DFID "%d pages read ahead at %lu\n",
 		       PFID(ll_inode2fid(inode)), rc2, vvp_index(vpg));
 	}
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 9402dfc..7e5cd3a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -99,6 +99,7 @@ static int osc_io_read_ahead(const struct lu_env *env,
 			ldlm_lock_decref(&lockh, dlmlock->l_req_mode);
 		}
 
+		ra->cra_rpc_size = osc_cli(osc)->cl_max_pages_per_rpc;
 		ra->cra_end = cl_index(osc2cl(osc),
 				       dlmlock->l_policy_data.l_extent.end);
 		ra->cra_release = osc_read_ahead_release;
@@ -138,7 +139,7 @@ static int osc_io_submit(const struct lu_env *env,
 
 	LASSERT(qin->pl_nr > 0);
 
-	CDEBUG(D_CACHE, "%d %d\n", qin->pl_nr, crt);
+	CDEBUG(D_CACHE | D_READA, "%d %d\n", qin->pl_nr, crt);
 
 	osc = cl2osc(ios->cis_obj);
 	cli = osc_cli(osc);
-- 
1.8.3.1

* [PATCH 07/60] staging: lustre: ptlrpc: set proper mbits for EINPROGRESS resend
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (5 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 06/60] staging: lustre: clio: revise readahead to support 16MB IO James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 08/60] staging: lustre: ldlm: Restore connect flags on failure James Simmons
                   ` (53 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

Set mbits for EINPROGRESS resend in ptl_send_rpc().

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8193
Reviewed-on: http://review.whamcloud.com/20377
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/client.c | 7 +++++--
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c | 5 +++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 8047413..3c18ab6 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -3123,8 +3123,11 @@ void ptlrpc_set_bulk_mbits(struct ptlrpc_request *req)
 			req->rq_mbits = ptlrpc_next_xid();
 		} else {
 			/* old version transfers rq_xid to peer as matchbits */
-			req->rq_mbits = ptlrpc_next_xid();
-			req->rq_xid = req->rq_mbits;
+			spin_lock(&req->rq_import->imp_lock);
+			list_del_init(&req->rq_unreplied_list);
+			ptlrpc_assign_next_xid_nolock(req);
+			req->rq_mbits = req->rq_xid;
+			spin_unlock(&req->rq_import->imp_lock);
 		}
 
 		CDEBUG(D_HA, "resend bulk old x%llu new x%llu\n",
diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index da1209e..b870184 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -522,13 +522,14 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 		 */
 		spin_lock(&imp->imp_lock);
 		ptlrpc_assign_next_xid_nolock(request);
-		request->rq_mbits = request->rq_xid;
 		min_xid = ptlrpc_known_replied_xid(imp);
 		spin_unlock(&imp->imp_lock);
 
 		lustre_msg_set_last_xid(request->rq_reqmsg, min_xid);
 		DEBUG_REQ(D_RPCTRACE, request, "Allocating new xid for resend on EINPROGRESS");
-	} else if (request->rq_bulk) {
+	}
+
+	if (request->rq_bulk) {
 		ptlrpc_set_bulk_mbits(request);
 		lustre_msg_set_mbits(request->rq_reqmsg, request->rq_mbits);
 	}
-- 
1.8.3.1

* [PATCH 08/60] staging: lustre: ldlm: Restore connect flags on failure
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (6 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 07/60] staging: lustre: ptlrpc: set proper mbits for EINPROGRESS resend James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 09/60] staging: lustre: lmv: Correctly generate target_obd James Simmons
                   ` (52 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jeremy Filizetti, James Simmons

From: Jeremy Filizetti <jeremy.filizetti@gmail.com>

Restore connect flags on failure of ptlrpc_connect_import()
to prevent an LBUG due to flags mismatch.

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7185
Reviewed-on: http://review.whamcloud.com/16950
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 95b8c76..3663c5c 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -523,6 +523,8 @@ int client_connect_import(const struct lu_env *env,
 
 	rc = ptlrpc_connect_import(imp);
 	if (rc != 0) {
+		if (data && is_mdc)
+			data->ocd_connect_flags &= ~OBD_CONNECT_MULTIMODRPCS;
 		LASSERT(imp->imp_state == LUSTRE_IMP_DISCON);
 		goto out_ldlm;
 	}
-- 
1.8.3.1

* [PATCH 09/60] staging: lustre: lmv: Correctly generate target_obd
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (7 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 08/60] staging: lustre: ldlm: Restore connect flags on failure James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string James Simmons
                   ` (51 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Giuseppe Di Natale, James Simmons

From: Giuseppe Di Natale <dinatale2@llnl.gov>

The target_obd debugfs file was not being generated correctly
in cases where nonconsecutive MDT indices were used when
generating a filesystem.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8100
Reviewed-on: http://review.whamcloud.com/20336
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lmv/lproc_lmv.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)
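
Note (illustration only, not part of the patch): the iteration change
below boils down to "skip empty slots in a sparse table". A minimal
userspace sketch of that idea follows. The names struct tgt and
next_populated() are hypothetical stand-ins, not Lustre symbols, and
the table layout is assumed to be a plain array of pointers in which
unused MDT indices are NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a target descriptor; only identity matters. */
struct tgt {
	int idx;
};

/*
 * Return the first populated slot at or after *pos, advancing *pos past
 * any empty (NULL) slots. Returns NULL once *pos reaches size, which is
 * what tells a seq_file-style iterator to stop.
 */
static struct tgt *next_populated(struct tgt **tgts, size_t size, size_t *pos)
{
	assert(tgts && pos);

	while (*pos < size) {
		if (tgts[*pos])
			return tgts[*pos];
		++*pos;
	}
	return NULL;
}
```

A "start" callback would call this helper with the position it was
given, and a "next" callback would increment the position first and
then call it. Bounding the walk by the table size rather than by the
number of configured targets is what lets entries after a gap be
reached.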

diff --git a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
index 20bbdfc..14fbc9c 100644
--- a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
+++ b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
@@ -147,7 +147,13 @@ static void *lmv_tgt_seq_start(struct seq_file *p, loff_t *pos)
 	struct obd_device       *dev = p->private;
 	struct lmv_obd	  *lmv = &dev->u.lmv;
 
-	return (*pos >= lmv->desc.ld_tgt_count) ? NULL : lmv->tgts[*pos];
+	while (*pos < lmv->tgts_size) {
+		if (lmv->tgts[*pos])
+			return lmv->tgts[*pos];
+		++*pos;
+	}
+
+	return  NULL;
 }
 
 static void lmv_tgt_seq_stop(struct seq_file *p, void *v)
@@ -159,8 +165,15 @@ static void *lmv_tgt_seq_next(struct seq_file *p, void *v, loff_t *pos)
 {
 	struct obd_device       *dev = p->private;
 	struct lmv_obd	  *lmv = &dev->u.lmv;
+
 	++*pos;
-	return (*pos >= lmv->desc.ld_tgt_count) ? NULL : lmv->tgts[*pos];
+	while (*pos < lmv->tgts_size) {
+		if (lmv->tgts[*pos])
+			return lmv->tgts[*pos];
+		++*pos;
+	}
+
+	return  NULL;
 }
 
 static int lmv_tgt_seq_show(struct seq_file *p, void *v)
-- 
1.8.3.1

* [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (8 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 09/60] staging: lustre: lmv: Correctly generate target_obd James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-02-03 10:33   ` Greg Kroah-Hartman
  2017-01-29  0:04 ` [PATCH 11/60] staging: lustre: obd: RCU stalls in lu_cache_shrink_count() James Simmons
                   ` (50 subsequent siblings)
  60 siblings, 1 reply; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

From: Andreas Dilger <andreas.dilger@intel.com>

Update the sysfs "version" file to print "lustre: " with
the version number.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5969
Reviewed-on: http://review.whamcloud.com/16721
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/obdclass/linux/linux-module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
index 9f5e829..22e6d1f 100644
--- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
+++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
@@ -208,7 +208,7 @@ struct miscdevice obd_psdev = {
 static ssize_t version_show(struct kobject *kobj, struct attribute *attr,
 			    char *buf)
 {
-	return sprintf(buf, "%s\n", LUSTRE_VERSION_STRING);
+	return sprintf(buf, "lustre: %s\n", LUSTRE_VERSION_STRING);
 }
 
 static ssize_t pinger_show(struct kobject *kobj, struct attribute *attr,
-- 
1.8.3.1

* [PATCH 11/60] staging: lustre: obd: RCU stalls in lu_cache_shrink_count()
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (9 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 12/60] staging: lustre: lmv: Error not handled for lmv_find_target James Simmons
                   ` (49 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Ann Koehler,
	James Simmons

From: Ann Koehler <amk@cray.com>

The algorithm for counting freeable objects in the lu_cache shrinker
does not scale with the number of cpus. The LU_SS_LRU_LEN counter
for each cpu is read and summed at shrink time while holding the
lu_sites_guard mutex. With a large number of cpus and low memory
conditions, processes bottleneck on the mutex.

This mod reduces the time spent counting by using the kernel's percpu
counter functions to maintain the length of a site's lru. The summing
occurs when a percpu value is incremented or decremented and a
threshold is exceeded. lu_cache_shrink_count() simply returns the
last such computed sum.

This mod also replaces the lu_sites_guard mutex with a rw semaphore.
The lock protects the lu_site list, which is modified when a file
system is mounted/umounted or when the lu_site is purged.
lu_cache_shrink_count simply reads data so it does not need to wait
for other readers. lu_cache_shrink_scan, which actually frees the
unused objects, is still serialized.

Signed-off-by: Ann Koehler <amk@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7997
Reviewed-on: http://review.whamcloud.com/19390
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lu_object.h  |  6 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 80 ++++++++++------------
 2 files changed, 43 insertions(+), 43 deletions(-)
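
Note (illustration only, not part of the patch): the commit message
describes a counter whose per-cpu deltas are folded into a shared sum
only when a threshold is crossed, so that readers can return the last
folded sum without walking every cpu. The single-threaded userspace
sketch below models that behaviour under stated assumptions: the names
approx_counter, approx_add(), approx_read_positive() and exact_sum(),
and the NR_SLOTS and BATCH values, are all hypothetical, and no locking
or real per-cpu storage is shown. It is not the kernel's percpu_counter
implementation.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical number of cpus */
#define BATCH    8	/* hypothetical fold threshold */

struct approx_counter {
	long global;		/* last folded sum, cheap to read */
	long local[NR_SLOTS];	/* per-slot deltas not yet folded */
};

/* Apply delta on one slot; fold into the global sum past the threshold. */
static void approx_add(struct approx_counter *c, size_t slot, long delta)
{
	long v;

	assert(c && slot < NR_SLOTS);

	v = c->local[slot] + delta;
	if (v >= BATCH || v <= -BATCH) {
		c->global += v;
		c->local[slot] = 0;
	} else {
		c->local[slot] = v;
	}
}

/* Cheap read: may lag the true value, and is clamped at zero. */
static long approx_read_positive(const struct approx_counter *c)
{
	return c->global > 0 ? c->global : 0;
}

/* Exact read: walks every slot, which is the cost the cheap read avoids. */
static long exact_sum(const struct approx_counter *c)
{
	long sum = c->global;
	size_t i;

	for (i = 0; i < NR_SLOTS; i++)
		sum += c->local[i];
	return sum;
}
```

In this model the cheap read can differ from the exact sum by up to
(BATCH - 1) per slot. That is the trade-off the commit message accepts
for a shrinker, which only needs an estimate of the number of freeable
objects.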

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index 69b2812..f442a96 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -34,6 +34,7 @@
 #define __LUSTRE_LU_OBJECT_H
 
 #include <stdarg.h>
+#include <linux/percpu_counter.h>
 #include "../../include/linux/libcfs/libcfs.h"
 #include "lustre/lustre_idl.h"
 #include "lu_ref.h"
@@ -580,7 +581,6 @@ enum {
 	LU_SS_CACHE_RACE,
 	LU_SS_CACHE_DEATH_RACE,
 	LU_SS_LRU_PURGED,
-	LU_SS_LRU_LEN,	/* # of objects in lsb_lru lists */
 	LU_SS_LAST_STAT
 };
 
@@ -635,6 +635,10 @@ struct lu_site {
 	 * XXX: a hack! fld has to find md_site via site, remove when possible
 	 */
 	struct seq_server_site	*ld_seq_site;
+	/**
+	 * Number of objects in lsb_lru_lists - used for shrinking
+	 */
+	struct percpu_counter	 ls_lru_len_counter;
 };
 
 static inline struct lu_site_bkt_data *
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 7971562..1805861 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -151,7 +151,7 @@ void lu_object_put(const struct lu_env *env, struct lu_object *o)
 		LASSERT(list_empty(&top->loh_lru));
 		list_add_tail(&top->loh_lru, &bkt->lsb_lru);
 		bkt->lsb_lru_len++;
-		lprocfs_counter_incr(site->ls_stats, LU_SS_LRU_LEN);
+		percpu_counter_inc(&site->ls_lru_len_counter);
 		CDEBUG(D_INODE, "Add %p to site lru. hash: %p, bkt: %p, lru_len: %ld\n",
 		       o, site->ls_obj_hash, bkt, bkt->lsb_lru_len);
 		cfs_hash_bd_unlock(site->ls_obj_hash, &bd, 1);
@@ -202,7 +202,7 @@ void lu_object_unhash(const struct lu_env *env, struct lu_object *o)
 			list_del_init(&top->loh_lru);
 			bkt = cfs_hash_bd_extra_get(obj_hash, &bd);
 			bkt->lsb_lru_len--;
-			lprocfs_counter_decr(site->ls_stats, LU_SS_LRU_LEN);
+			percpu_counter_dec(&site->ls_lru_len_counter);
 		}
 		cfs_hash_bd_del_locked(obj_hash, &bd, &top->loh_hash);
 		cfs_hash_bd_unlock(obj_hash, &bd, 1);
@@ -379,7 +379,7 @@ int lu_site_purge(const struct lu_env *env, struct lu_site *s, int nr)
 					       &bd2, &h->loh_hash);
 			list_move(&h->loh_lru, &dispose);
 			bkt->lsb_lru_len--;
-			lprocfs_counter_decr(s->ls_stats, LU_SS_LRU_LEN);
+			percpu_counter_dec(&s->ls_lru_len_counter);
 			if (did_sth == 0)
 				did_sth = 1;
 
@@ -578,7 +578,7 @@ static struct lu_object *htable_lookup(struct lu_site *s,
 		if (!list_empty(&h->loh_lru)) {
 			list_del_init(&h->loh_lru);
 			bkt->lsb_lru_len--;
-			lprocfs_counter_decr(s->ls_stats, LU_SS_LRU_LEN);
+			percpu_counter_dec(&s->ls_lru_len_counter);
 		}
 		return lu_object_top(h);
 	}
@@ -820,7 +820,7 @@ void lu_device_type_fini(struct lu_device_type *ldt)
  * Global list of all sites on this node
  */
 static LIST_HEAD(lu_sites);
-static DEFINE_MUTEX(lu_sites_guard);
+static DECLARE_RWSEM(lu_sites_guard);
 
 /**
  * Global environment used by site shrinker.
@@ -994,9 +994,15 @@ int lu_site_init(struct lu_site *s, struct lu_device *top)
 	unsigned long bits;
 	unsigned long i;
 	char name[16];
+	int rc;
 
 	memset(s, 0, sizeof(*s));
 	mutex_init(&s->ls_purge_mutex);
+
+	rc = percpu_counter_init(&s->ls_lru_len_counter, 0, GFP_NOFS);
+	if (rc)
+		return -ENOMEM;
+
 	snprintf(name, sizeof(name), "lu_site_%s", top->ld_type->ldt_name);
 	for (bits = lu_htable_order(top); bits >= LU_SITE_BITS_MIN; bits--) {
 		s->ls_obj_hash = cfs_hash_create(name, bits, bits,
@@ -1042,12 +1048,6 @@ int lu_site_init(struct lu_site *s, struct lu_device *top)
 			     0, "cache_death_race", "cache_death_race");
 	lprocfs_counter_init(s->ls_stats, LU_SS_LRU_PURGED,
 			     0, "lru_purged", "lru_purged");
-	/*
-	 * Unlike other counters, lru_len can be decremented so
-	 * need lc_sum instead of just lc_count
-	 */
-	lprocfs_counter_init(s->ls_stats, LU_SS_LRU_LEN,
-			     LPROCFS_CNTR_AVGMINMAX, "lru_len", "lru_len");
 
 	INIT_LIST_HEAD(&s->ls_linkage);
 	s->ls_top_dev = top;
@@ -1069,9 +1069,11 @@ int lu_site_init(struct lu_site *s, struct lu_device *top)
  */
 void lu_site_fini(struct lu_site *s)
 {
-	mutex_lock(&lu_sites_guard);
+	down_write(&lu_sites_guard);
 	list_del_init(&s->ls_linkage);
-	mutex_unlock(&lu_sites_guard);
+	up_write(&lu_sites_guard);
+
+	percpu_counter_destroy(&s->ls_lru_len_counter);
 
 	if (s->ls_obj_hash) {
 		cfs_hash_putref(s->ls_obj_hash);
@@ -1097,11 +1099,11 @@ int lu_site_init_finish(struct lu_site *s)
 {
 	int result;
 
-	mutex_lock(&lu_sites_guard);
+	down_write(&lu_sites_guard);
 	result = lu_context_refill(&lu_shrink_env.le_ctx);
 	if (result == 0)
 		list_add(&s->ls_linkage, &lu_sites);
-	mutex_unlock(&lu_sites_guard);
+	up_write(&lu_sites_guard);
 	return result;
 }
 EXPORT_SYMBOL(lu_site_init_finish);
@@ -1820,12 +1822,15 @@ static void lu_site_stats_get(struct cfs_hash *hs,
 }
 
 /*
- * lu_cache_shrink_count returns the number of cached objects that are
- * candidates to be freed by shrink_slab(). A counter, which tracks
- * the number of items in the site's lru, is maintained in the per cpu
- * stats of each site. The counter is incremented when an object is added
- * to a site's lru and decremented when one is removed. The number of
- * free-able objects is the sum of all per cpu counters for all sites.
+ * lu_cache_shrink_count() returns an approximate number of cached objects
+ * that can be freed by shrink_slab(). A counter, which tracks the
+ * number of items in the site's lru, is maintained in a percpu_counter
+ * for each site. The percpu values are incremented and decremented as
+ * objects are added or removed from the lru. The percpu values are summed
+ * and saved whenever a percpu value exceeds a threshold. Thus the saved,
+ * summed value at any given time may not accurately reflect the current
+ * lru length. But this value is sufficiently accurate for the needs of
+ * a shrinker.
  *
  * Using a per cpu counter is a compromise solution to concurrent access:
  * lu_object_put() can update the counter without locking the site and
@@ -1842,11 +1847,10 @@ static unsigned long lu_cache_shrink_count(struct shrinker *sk,
 	if (!(sc->gfp_mask & __GFP_FS))
 		return 0;
 
-	mutex_lock(&lu_sites_guard);
-	list_for_each_entry_safe(s, tmp, &lu_sites, ls_linkage) {
-		cached += ls_stats_read(s->ls_stats, LU_SS_LRU_LEN);
-	}
-	mutex_unlock(&lu_sites_guard);
+	down_read(&lu_sites_guard);
+	list_for_each_entry_safe(s, tmp, &lu_sites, ls_linkage)
+		cached += percpu_counter_read_positive(&s->ls_lru_len_counter);
+	up_read(&lu_sites_guard);
 
 	cached = (cached / 100) * sysctl_vfs_cache_pressure;
 	CDEBUG(D_INODE, "%ld objects cached, cache pressure %d\n",
@@ -1877,7 +1881,7 @@ static unsigned long lu_cache_shrink_scan(struct shrinker *sk,
 		 */
 		return SHRINK_STOP;
 
-	mutex_lock(&lu_sites_guard);
+	down_write(&lu_sites_guard);
 	list_for_each_entry_safe(s, tmp, &lu_sites, ls_linkage) {
 		freed = lu_site_purge(&lu_shrink_env, s, remain);
 		remain -= freed;
@@ -1888,7 +1892,7 @@ static unsigned long lu_cache_shrink_scan(struct shrinker *sk,
 		list_move_tail(&s->ls_linkage, &splice);
 	}
 	list_splice(&splice, lu_sites.prev);
-	mutex_unlock(&lu_sites_guard);
+	up_write(&lu_sites_guard);
 
 	return sc->nr_to_scan - remain;
 }
@@ -1925,9 +1929,9 @@ int lu_global_init(void)
 	 * conservatively. This should not be too bad, because this
 	 * environment is global.
 	 */
-	mutex_lock(&lu_sites_guard);
+	down_write(&lu_sites_guard);
 	result = lu_env_init(&lu_shrink_env, LCT_SHRINKER);
-	mutex_unlock(&lu_sites_guard);
+	up_write(&lu_sites_guard);
 	if (result != 0)
 		return result;
 
@@ -1953,9 +1957,9 @@ void lu_global_fini(void)
 	 * Tear shrinker environment down _after_ de-registering
 	 * lu_global_key, because the latter has a value in the former.
 	 */
-	mutex_lock(&lu_sites_guard);
+	down_write(&lu_sites_guard);
 	lu_env_fini(&lu_shrink_env);
-	mutex_unlock(&lu_sites_guard);
+	up_write(&lu_sites_guard);
 
 	lu_ref_global_fini();
 }
@@ -1965,13 +1969,6 @@ static __u32 ls_stats_read(struct lprocfs_stats *stats, int idx)
 	struct lprocfs_counter ret;
 
 	lprocfs_stats_collect(stats, idx, &ret);
-	if (idx == LU_SS_LRU_LEN)
-		/*
-		 * protect against counter on cpu A being decremented
-		 * before counter is incremented on cpu B; unlikely
-		 */
-		return (__u32)((ret.lc_sum > 0) ? ret.lc_sum : 0);
-
 	return (__u32)ret.lc_count;
 }
 
@@ -1986,7 +1983,7 @@ int lu_site_stats_print(const struct lu_site *s, struct seq_file *m)
 	memset(&stats, 0, sizeof(stats));
 	lu_site_stats_get(s->ls_obj_hash, &stats, 1);
 
-	seq_printf(m, "%d/%d %d/%ld %d %d %d %d %d %d %d %d\n",
+	seq_printf(m, "%d/%d %d/%ld %d %d %d %d %d %d %d\n",
 		   stats.lss_busy,
 		   stats.lss_total,
 		   stats.lss_populated,
@@ -1997,8 +1994,7 @@ int lu_site_stats_print(const struct lu_site *s, struct seq_file *m)
 		   ls_stats_read(s->ls_stats, LU_SS_CACHE_MISS),
 		   ls_stats_read(s->ls_stats, LU_SS_CACHE_RACE),
 		   ls_stats_read(s->ls_stats, LU_SS_CACHE_DEATH_RACE),
-		   ls_stats_read(s->ls_stats, LU_SS_LRU_PURGED),
-		   ls_stats_read(s->ls_stats, LU_SS_LRU_LEN));
+		   ls_stats_read(s->ls_stats, LU_SS_LRU_PURGED));
 	return 0;
 }
 EXPORT_SYMBOL(lu_site_stats_print);
-- 
1.8.3.1

* [PATCH 12/60] staging: lustre: lmv: Error not handled for lmv_find_target
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (10 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 11/60] staging: lustre: obd: RCU stalls in lu_cache_shrink_count() James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 13/60] staging: lustre: obdclass: health_check to report unhealthy upon LBUG James Simmons
                   ` (48 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Ulka Vaze,
	Aditya Pandit, James Simmons

From: Ulka Vaze <ulka.vaze@yahoo.in>

This issue was found by smatch and reported as: unchecked usage of
potential ERR_PTR result in lmv_hsm_req_count and lmv_hsm_req_build.
Add IS_ERR() checks to both functions and make lmv_hsm_req_build()
return an error code so that its return value can be checked as well.

Signed-off-by: Ulka Vaze <ulka.vaze@yahoo.in>
Signed-off-by: Aditya Pandit <panditadityashreesh@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6523
Reviewed-on: http://review.whamcloud.com/14918
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)
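
Note (illustration only, not part of the patch): the fix follows the
usual "pointer that may encode an errno" pattern, where the result of a
lookup must be tested before it is dereferenced. The userspace sketch
below re-creates that pattern with hypothetical helpers (err_ptr(),
is_err(), ptr_err(), find_target(), count_matches()); they imitate the
idea behind the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() macros but are
not the kernel definitions, and the lookup table is invented for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* assumed upper bound on encoded error values */

/* Encode a negative errno in a pointer, and decode it again. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct tgt {
	int id;
};

/* Hypothetical lookup: returns an encoded -ENOENT for unknown keys. */
static struct tgt *find_target(struct tgt *tbl, size_t n, int key)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].id == key)
			return &tbl[i];
	return err_ptr(-ENOENT);
}

/* Count keys that resolve to 'want'; propagate the first lookup error. */
static int count_matches(struct tgt *tbl, size_t n,
			 const int *keys, size_t nkeys, int want)
{
	int nr = 0;
	size_t i;

	assert(tbl && keys);

	for (i = 0; i < nkeys; i++) {
		struct tgt *t = find_target(tbl, n, keys[i]);

		if (is_err(t))		/* check before dereferencing */
			return (int)ptr_err(t);
		if (t->id == want)
			nr++;
	}
	return nr;
}
```

Because the counting function can now return a negative value, callers
have to distinguish "zero matches" from "lookup failed", which is why a
helper that used to return void needs an int return type once it
propagates errors.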

diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 6a3b83f..915415c 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -736,16 +736,18 @@ static int lmv_hsm_req_count(struct lmv_obd *lmv,
 	/* count how many requests must be sent to the given target */
 	for (i = 0; i < hur->hur_request.hr_itemcount; i++) {
 		curr_tgt = lmv_find_target(lmv, &hur->hur_user_item[i].hui_fid);
+		if (IS_ERR(curr_tgt))
+			return PTR_ERR(curr_tgt);
 		if (obd_uuid_equals(&curr_tgt->ltd_uuid, &tgt_mds->ltd_uuid))
 			nr++;
 	}
 	return nr;
 }
 
-static void lmv_hsm_req_build(struct lmv_obd *lmv,
-			      struct hsm_user_request *hur_in,
-			      const struct lmv_tgt_desc *tgt_mds,
-			      struct hsm_user_request *hur_out)
+static int lmv_hsm_req_build(struct lmv_obd *lmv,
+			     struct hsm_user_request *hur_in,
+			     const struct lmv_tgt_desc *tgt_mds,
+			     struct hsm_user_request *hur_out)
 {
 	int			i, nr_out;
 	struct lmv_tgt_desc    *curr_tgt;
@@ -756,6 +758,8 @@ static void lmv_hsm_req_build(struct lmv_obd *lmv,
 	for (i = 0; i < hur_in->hur_request.hr_itemcount; i++) {
 		curr_tgt = lmv_find_target(lmv,
 					   &hur_in->hur_user_item[i].hui_fid);
+		if (IS_ERR(curr_tgt))
+			return PTR_ERR(curr_tgt);
 		if (obd_uuid_equals(&curr_tgt->ltd_uuid, &tgt_mds->ltd_uuid)) {
 			hur_out->hur_user_item[nr_out] =
 				hur_in->hur_user_item[i];
@@ -765,6 +769,8 @@ static void lmv_hsm_req_build(struct lmv_obd *lmv,
 	hur_out->hur_request.hr_itemcount = nr_out;
 	memcpy(hur_data(hur_out), hur_data(hur_in),
 	       hur_in->hur_request.hr_data_len);
+
+	return 0;
 }
 
 static int lmv_hsm_ct_unregister(struct lmv_obd *lmv, unsigned int cmd, int len,
@@ -1041,15 +1047,17 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp,
 		} else {
 			/* split fid list to their respective MDS */
 			for (i = 0; i < count; i++) {
-				unsigned int		nr, reqlen;
-				int			rc1;
 				struct hsm_user_request *req;
+				size_t reqlen;
+				int nr, rc1;
 
 				tgt = lmv->tgts[i];
 				if (!tgt || !tgt->ltd_exp)
 					continue;
 
 				nr = lmv_hsm_req_count(lmv, hur, tgt);
+				if (nr < 0)
+					return nr;
 				if (nr == 0) /* nothing for this MDS */
 					continue;
 
@@ -1061,10 +1069,13 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp,
 				if (!req)
 					return -ENOMEM;
 
-				lmv_hsm_req_build(lmv, hur, tgt, req);
+				rc1 = lmv_hsm_req_build(lmv, hur, tgt, req);
+				if (rc1 < 0)
+					goto hsm_req_err;
 
 				rc1 = obd_iocontrol(cmd, tgt->ltd_exp, reqlen,
 						    req, uarg);
+hsm_req_err:
 				if (rc1 != 0 && rc == 0)
 					rc = rc1;
 				kvfree(req);
-- 
1.8.3.1

* [PATCH 13/60] staging: lustre: obdclass: health_check to report unhealthy upon LBUG
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (11 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 12/60] staging: lustre: lmv: Error not handled for lmv_find_target James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-30 12:03   ` Dan Carpenter
  2017-01-29  0:04 ` [PATCH 14/60] staging: lustre: lov: Ensure correct operation for large object sizes James Simmons
                   ` (47 subsequent siblings)
  60 siblings, 1 reply; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Bruno Faccini, James Simmons

From: Bruno Faccini <bruno.faccini@intel.com>

When an LBUG has occurred without panic_on_lbug being set, the
health_check sysfs file must report an unhealthy state.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7486
Reviewed-on: http://review.whamcloud.com/17981
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/obdclass/linux/linux-module.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
index 22e6d1f..ef25db6 100644
--- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
+++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
@@ -224,8 +224,10 @@ static ssize_t pinger_show(struct kobject *kobj, struct attribute *attr,
 	int i;
 	size_t len = 0;
 
-	if (libcfs_catastrophe)
-		return sprintf(buf, "LBUG\n");
+	if (libcfs_catastrophe) {
+		len = sprintf(buf, "LBUG\n");
+		healthy = false;
+	}
 
 	read_lock(&obd_dev_lock);
 	for (i = 0; i < class_devno_max(); i++) {
-- 
1.8.3.1

* [PATCH 14/60] staging: lustre: lov: Ensure correct operation for large object sizes
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (12 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 13/60] staging: lustre: obdclass: health_check to report unhealthy upon LBUG James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-31  8:53   ` Dan Carpenter
  2017-01-29  0:04 ` [PATCH 15/60] staging: lustre: hsm: stack overrun in hai_dump_data_field James Simmons
                   ` (46 subsequent siblings)
  60 siblings, 1 reply; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Nathaniel Clark, James Simmons

From: Nathaniel Clark <nathaniel.l.clark@intel.com>

If a backing filesystem (ZFS) reports that it supports very large
(LLONG_MAX) object sizes, that should be handled correctly.  This
fixes the check for uninitialized stripe_maxbytes in
lsm_unpackmd_common(), so that ZFS can return LLONG_MAX and it will be
okay. This issue is exercised by writing to or past the 2TB boundary
of a singly striped file.
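
A rough userspace sketch of the idea follows: take the minimum per-target
limit, treat 0 as "not set yet" so that LLONG_MAX stays a legal value, and
clamp the product with the stripe count. All names here (sketch_maxbytes,
SKETCH_MAX_FILESIZE, SKETCH_FALLBACK_BYTES) are made up for illustration,
and the fallback value is only a placeholder. Note one deliberate
difference: the patch multiplies first and then tests for wraparound, while
this sketch checks by division before multiplying, since signed overflow is
undefined in standard C outside the kernel's build flags.

```c
#include <limits.h>

#define SKETCH_MAX_FILESIZE	LLONG_MAX	/* stand-in for MAX_LFS_FILESIZE */
#define SKETCH_FALLBACK_BYTES	(1LL << 41)	/* placeholder default limit */

/* Hypothetical helper: file size limit from per-target object limits. */
static long long sketch_maxbytes(const long long *tgt_bytes,
				 unsigned int nr_tgts,
				 unsigned int stripe_count)
{
	long long min_bytes = 0;	/* 0 means "no target seen yet" */
	unsigned int i;

	for (i = 0; i < nr_tgts; i++) {
		if (min_bytes == 0 || tgt_bytes[i] < min_bytes)
			min_bytes = tgt_bytes[i];
	}

	if (min_bytes == 0)
		min_bytes = SKETCH_FALLBACK_BYTES;
	if (stripe_count == 0)		/* simplification for this sketch */
		stripe_count = 1;

	/* would min_bytes * stripe_count exceed the maximum? then clamp */
	if (min_bytes > SKETCH_MAX_FILESIZE / stripe_count)
		return SKETCH_MAX_FILESIZE;

	return min_bytes * stripe_count;
}
```

With a single target reporting LLONG_MAX and one stripe, the result stays
LLONG_MAX rather than being mistaken for the uninitialized case.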

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7890
Reviewed-on: http://review.whamcloud.com/19066
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_ea.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index ac0bf64..07dee87 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -150,7 +150,7 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
 			       struct lov_mds_md *lmm,
 			       struct lov_ost_data_v1 *objects)
 {
-	loff_t stripe_maxbytes = LLONG_MAX;
+	loff_t min_stripe_maxbytes = 0, lov_bytes;
 	unsigned int stripe_count;
 	struct lov_oinfo *loi;
 	unsigned int i;
@@ -168,8 +168,6 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
 	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
 
 	for (i = 0; i < stripe_count; i++) {
-		loff_t tgt_bytes;
-
 		loi = lsm->lsm_oinfo[i];
 		ostid_le_to_cpu(&objects[i].l_ost_oi, &loi->loi_oi);
 		loi->loi_ost_idx = le32_to_cpu(objects[i].l_ost_idx);
@@ -194,17 +192,21 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
 			continue;
 		}
 
-		tgt_bytes = lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx]);
-		stripe_maxbytes = min_t(loff_t, stripe_maxbytes, tgt_bytes);
+		lov_bytes = lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx]);
+		if (min_stripe_maxbytes == 0 || lov_bytes < min_stripe_maxbytes)
+			min_stripe_maxbytes = lov_bytes;
 	}
 
-	if (stripe_maxbytes == LLONG_MAX)
-		stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
+	if (min_stripe_maxbytes == 0)
+		min_stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
+
+	stripe_count = lsm->lsm_stripe_count ?: lov->desc.ld_tgt_count;
+	lov_bytes = min_stripe_maxbytes * stripe_count;
 
-	if (!lsm->lsm_stripe_count)
-		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
+	if (lov_bytes < min_stripe_maxbytes) /* handle overflow */
+		lsm->lsm_maxbytes = MAX_LFS_FILESIZE;
 	else
-		lsm->lsm_maxbytes = stripe_maxbytes * lsm->lsm_stripe_count;
+		lsm->lsm_maxbytes = lov_bytes;
 
 	return 0;
 }
-- 
1.8.3.1

* [PATCH 15/60] staging: lustre: hsm: stack overrun in hai_dump_data_field
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (13 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 14/60] staging: lustre: lov: Ensure correct operation for large object sizes James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 16/60] staging: lustre: llite: don't ignore layout for group lock request James Simmons
                   ` (45 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, frank zago,
	James Simmons

From: frank zago <fzago@cray.com>

The function hai_dump_data_field will do a stack buffer
overrun when cat'ing /sys/fs/lustre/.../hsm/actions if an action has
some data in it.

hai_dump_data_field uses snprintf. But there is no check for
truncation, and the value returned by snprintf is used as-is.  The
coordinator code calls hai_dump_data_field with 12 bytes in the
buffer. The 6th byte of data is printed incompletely to make room for
the terminating NUL. However snprintf still returns 2, so when
hai_dump_data_field writes the final NUL, it does it outside the
reserved buffer, in the 13th byte of the buffer. This stack buffer
overrun hangs my VM.

Fix by checking that there is enough room for the next 2 characters
plus the NUL terminator. Don't print half bytes. Change the format to
02X instead of .2X, which makes more sense.
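
As a standalone illustration of the fixed loop, here is a small userspace
sketch. sketch_hex_dump() is a hypothetical helper that takes a plain byte
array instead of a struct hsm_action_item; the "len > 0" guard before the
final NUL is an extra precaution in this sketch and is not part of the
patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hex-dump 'data' into 'buffer' of size 'len'.  A byte is only printed
 * when there is room for both hex digits plus the terminating NUL, so
 * the output is never truncated in the middle of a byte.
 */
static char *sketch_hex_dump(const unsigned char *data, size_t data_len,
			     char *buffer, size_t len)
{
	char *ptr = buffer;
	size_t i;

	for (i = 0; i < data_len && len > 2; i++) {
		/* writes exactly two digits and a NUL */
		snprintf(ptr, 3, "%02X", data[i]);
		ptr += 2;
		len -= 2;
	}

	if (len > 0)	/* sketch-only guard for a zero-length buffer */
		*ptr = '\0';

	return buffer;
}
```

With a 12 byte buffer this prints at most five bytes (ten digits) and puts
the NUL in the 11th byte, so nothing is written past the end.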

Signed-off-by: frank zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8171
Reviewed-on: http://review.whamcloud.com/20338
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jean-Baptiste Riaux <riaux.jb@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/include/lustre/lustre_user.h | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 3301ad6..21aec0c 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -1209,23 +1209,21 @@ struct hsm_action_item {
  * \retval buffer
  */
 static inline char *hai_dump_data_field(struct hsm_action_item *hai,
-					char *buffer, int len)
+					char *buffer, size_t len)
 {
-	int i, sz, data_len;
+	int i, data_len;
 	char *ptr;
 
 	ptr = buffer;
-	sz = len;
 	data_len = hai->hai_len - sizeof(*hai);
-	for (i = 0 ; (i < data_len) && (sz > 0) ; i++) {
-		int cnt;
-
-		cnt = snprintf(ptr, sz, "%.2X",
-			       (unsigned char)hai->hai_data[i]);
-		ptr += cnt;
-		sz -= cnt;
+	for (i = 0; (i < data_len) && (len > 2); i++) {
+		snprintf(ptr, 3, "%02X", (unsigned char)hai->hai_data[i]);
+		ptr += 2;
+		len -= 2;
 	}
+
 	*ptr = '\0';
+
 	return buffer;
 }
 
-- 
1.8.3.1

* [PATCH 16/60] staging: lustre: llite: don't ignore layout for group lock request
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (14 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 15/60] staging: lustre: hsm: stack overrun in hai_dump_data_field James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 17/60] staging: lustre: obdclass: do not call lu_site_purge() for single object exceed James Simmons
                   ` (44 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jinshan Xiong, Henri Doreau, Bobi Jam, James Simmons

From: Jinshan Xiong <jinshan.xiong@intel.com>

ignore_layout can be set for operations during which the layout won't
change, typically page operations. Ignoring layout changes in a group
lock request confuses the layout change code at the LOV layer and hits
an assertion.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2766
Reviewed-on: http://review.whamcloud.com/6828
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |  2 +-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |  7 -------
 drivers/staging/lustre/lustre/lov/lov_lock.c       |  5 +++++
 drivers/staging/lustre/lustre/lov/lov_object.c     | 23 +++++++++++++++++++++-
 drivers/staging/lustre/lustre/osc/osc_cache.c      |  1 +
 5 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/lcommon_misc.c b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
index f48660e..f0c132e 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
@@ -33,6 +33,7 @@
  * future).
  *
  */
+#define DEBUG_SUBSYSTEM S_LLITE
 #include "../include/obd_class.h"
 #include "../include/obd_support.h"
 #include "../include/obd.h"
@@ -132,7 +133,6 @@ int cl_get_grouplock(struct cl_object *obj, unsigned long gid, int nonblock,
 
 	io = vvp_env_thread_io(env);
 	io->ci_obj = obj;
-	io->ci_ignore_layout = 1;
 
 	rc = cl_io_init(env, io, CIT_MISC, io->ci_obj);
 	if (rc != 0) {
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 19f85fc..3e9cf71 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -1348,13 +1348,6 @@ int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
 			io->ci_lockreq = CILR_MANDATORY;
 	}
 
-	/* ignore layout change for generic CIT_MISC but not for glimpse.
-	 * io context for glimpse must set ci_verify_layout to true,
-	 * see cl_glimpse_size0() for details.
-	 */
-	if (io->ci_type == CIT_MISC && !io->ci_verify_layout)
-		io->ci_ignore_layout = 1;
-
 	/* Enqueue layout lock and get layout version. We need to do this
 	 * even for operations requiring to open file, such as read and write,
 	 * because it might not grant layout lock in IT_OPEN.
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index f3a0583..8502128 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -134,6 +134,11 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 	struct lov_layout_raid0 *r0     = lov_r0(loo);
 	struct lov_lock		*lovlck;
 
+	CDEBUG(D_INODE, "%p: lock/io FID " DFID "/" DFID ", lock/io clobj %p/%p\n",
+	       loo, PFID(lu_object_fid(lov2lu(loo))),
+	       PFID(lu_object_fid(&obj->co_lu)),
+	       lov2cl(loo), obj);
+
 	file_start = cl_offset(lov2cl(loo), lock->cll_descr.cld_start);
 	file_end   = cl_offset(lov2cl(loo), lock->cll_descr.cld_end + 1) - 1;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 46ec46e..9c4b5ab 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -650,12 +650,16 @@ static enum lov_layout_type lov_type(struct lov_stripe_md *lsm)
 
 static inline void lov_conf_freeze(struct lov_object *lov)
 {
+	CDEBUG(D_INODE, "To take share lov(%p) owner %p/%p\n",
+	       lov, lov->lo_owner, current);
 	if (lov->lo_owner != current)
 		down_read(&lov->lo_type_guard);
 }
 
 static inline void lov_conf_thaw(struct lov_object *lov)
 {
+	CDEBUG(D_INODE, "To release share lov(%p) owner %p/%p\n",
+	       lov, lov->lo_owner, current);
 	if (lov->lo_owner != current)
 		up_read(&lov->lo_type_guard);
 }
@@ -698,10 +702,14 @@ static void lov_conf_lock(struct lov_object *lov)
 	down_write(&lov->lo_type_guard);
 	LASSERT(!lov->lo_owner);
 	lov->lo_owner = current;
+	CDEBUG(D_INODE, "Took exclusive lov(%p) owner %p\n",
+	       lov, lov->lo_owner);
 }
 
 static void lov_conf_unlock(struct lov_object *lov)
 {
+	CDEBUG(D_INODE, "To release exclusive lov(%p) owner %p\n",
+	       lov, lov->lo_owner);
 	lov->lo_owner = NULL;
 	up_write(&lov->lo_type_guard);
 }
@@ -725,6 +733,7 @@ static int lov_layout_change(const struct lu_env *unused,
 			     struct lov_object *lov, struct lov_stripe_md *lsm,
 			     const struct cl_object_conf *conf)
 {
+	struct lov_device *lov_dev = lov_object_dev(lov);
 	enum lov_layout_type llt = lov_type(lsm);
 	union lov_layout_state *state = &lov->u;
 	const struct lov_layout_operations *old_ops;
@@ -760,14 +769,21 @@ static int lov_layout_change(const struct lu_env *unused,
 
 	LASSERT(!atomic_read(&lov->lo_active_ios));
 
+	CDEBUG(D_INODE, DFID "Apply new layout lov %p, type %d\n",
+	       PFID(lu_object_fid(lov2lu(lov))), lov, llt);
+
 	lov->lo_type = LLT_EMPTY;
 
 	/* page bufsize fixup */
 	cl_object_header(&lov->lo_cl)->coh_page_bufsize -=
 			lov_page_slice_fixup(lov, NULL);
 
-	rc = new_ops->llo_init(env, lov_object_dev(lov), lov, lsm, conf, state);
+	rc = new_ops->llo_init(env, lov_dev, lov, lsm, conf, state);
 	if (rc) {
+		struct obd_device *obd = lov2obd(lov_dev->ld_lov);
+
+		CERROR("%s: cannot apply new layout on " DFID " : rc = %d\n",
+		       obd->obd_name, PFID(lu_object_fid(lov2lu(lov))), rc);
 		new_ops->llo_delete(env, lov, state);
 		new_ops->llo_fini(env, lov, state);
 		/* this file becomes an EMPTY file. */
@@ -923,6 +939,11 @@ int lov_io_init(const struct lu_env *env, struct cl_object *obj,
 		struct cl_io *io)
 {
 	CL_IO_SLICE_CLEAN(lov_env_io(env), lis_cl);
+
+	CDEBUG(D_INODE, DFID "io %p type %d ignore/verify layout %d/%d\n",
+	       PFID(lu_object_fid(&obj->co_lu)), io, io->ci_type,
+	       io->ci_ignore_layout, io->ci_verify_layout);
+
 	return LOV_2DISPATCH_MAYLOCK(cl2lov(obj), llo_io_init,
 				     !io->ci_ignore_layout, env, obj, io);
 }
diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 5ac0e14..72dd554 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -1001,6 +1001,7 @@ static int osc_extent_truncate(struct osc_extent *ext, pgoff_t trunc_index,
 	env = cl_env_get(&refcheck);
 	io  = &osc_env_info(env)->oti_io;
 	io->ci_obj = cl_object_top(osc2cl(obj));
+	io->ci_ignore_layout = 1;
 	rc = cl_io_init(env, io, CIT_MISC, io->ci_obj);
 	if (rc < 0)
 		goto out;
-- 
1.8.3.1

* [PATCH 17/60] staging: lustre: obdclass: do not call lu_site_purge() for single object exceed
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (15 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 16/60] staging: lustre: llite: don't ignore layout for group lock request James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 18/60] staging: lustre: ptlrpc: skip lock if export failed James Simmons
                   ` (43 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Alex Zhuravlev, James Simmons

From: Alex Zhuravlev <alexey.zhuravlev@intel.com>

First of all, lu_site_purge() is an expensive procedure involving a
global mutex and per-bucket spinlocks. Also, every thread that observes
the cache limit being exceeded will call lu_site_purge() and
essentially be serialized on that mutex. Instead, let the other threads
skip the whole procedure while a purge is already running.
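
The "skip if somebody else is already purging" idea can be sketched in
userspace as below. This is only an illustration: a C11 atomic_flag stands
in for ls_purge_mutex (so the blocking path spins instead of sleeping), and
sketch_purge() with its pretend purge body is a hypothetical name, not the
real lu_site_purge_objects().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for s->ls_purge_mutex (assumption: a simple test-and-set flag). */
static atomic_flag sketch_purge_busy = ATOMIC_FLAG_INIT;

/*
 * Try to purge 'nr' objects and return how many are still left to purge.
 * With canblock == false the caller gives up immediately when another
 * purge is in progress, instead of queueing up behind it.
 */
static int sketch_purge(int nr, bool canblock)
{
	if (canblock) {
		/* blocking path: wait until the flag is free */
		while (atomic_flag_test_and_set(&sketch_purge_busy))
			;
	} else if (atomic_flag_test_and_set(&sketch_purge_busy)) {
		/* somebody else is purging: skip the whole procedure */
		return nr;
	}

	nr = 0;		/* pretend every requested object was freed */

	atomic_flag_clear(&sketch_purge_busy);
	return nr;
}
```

The allocation path would call this with canblock == false, while explicit
purge requests keep the blocking behaviour, matching how lu_site_purge()
remains a wrapper that passes true.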

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7896
Reviewed-on: http://review.whamcloud.com/19082
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lu_object.h  |  8 ++++++-
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 26 +++++++++++++++-------
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index f442a96..c7dee1d 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -712,8 +712,14 @@ static inline int lu_object_is_dying(const struct lu_object_header *h)
 
 void lu_object_put(const struct lu_env *env, struct lu_object *o);
 void lu_object_unhash(const struct lu_env *env, struct lu_object *o);
+int lu_site_purge_objects(const struct lu_env *env, struct lu_site *s, int nr,
+			  bool canblock);
 
-int lu_site_purge(const struct lu_env *env, struct lu_site *s, int nr);
+static inline int lu_site_purge(const struct lu_env *env, struct lu_site *s,
+				int nr)
+{
+	return lu_site_purge_objects(env, s, nr, true);
+}
 
 void lu_site_print(const struct lu_env *env, struct lu_site *s, void *cookie,
 		   lu_printer_t printer);
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 1805861..abcf951 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -60,7 +60,7 @@ enum {
 	LU_CACHE_PERCENT_DEFAULT = 20
 };
 
-#define LU_CACHE_NR_MAX_ADJUST		128
+#define LU_CACHE_NR_MAX_ADJUST		512
 #define LU_CACHE_NR_UNLIMITED		-1
 #define LU_CACHE_NR_DEFAULT		LU_CACHE_NR_UNLIMITED
 #define LU_CACHE_NR_LDISKFS_LIMIT	LU_CACHE_NR_UNLIMITED
@@ -329,8 +329,11 @@ static void lu_object_free(const struct lu_env *env, struct lu_object *o)
 
 /**
  * Free \a nr objects from the cold end of the site LRU list.
+ * if canblock is false, then don't block awaiting for another
+ * instance of lu_site_purge() to complete
  */
-int lu_site_purge(const struct lu_env *env, struct lu_site *s, int nr)
+int lu_site_purge_objects(const struct lu_env *env, struct lu_site *s,
+			  int nr, bool canblock)
 {
 	struct lu_object_header *h;
 	struct lu_object_header *temp;
@@ -360,7 +363,11 @@ int lu_site_purge(const struct lu_env *env, struct lu_site *s, int nr)
 	 * It doesn't make any sense to make purge threads parallel, that can
 	 * only bring troubles to us. See LU-5331.
 	 */
-	mutex_lock(&s->ls_purge_mutex);
+	if (canblock)
+		mutex_lock(&s->ls_purge_mutex);
+	else if (!mutex_trylock(&s->ls_purge_mutex))
+		goto out;
+
 	did_sth = 0;
 	cfs_hash_for_each_bucket(s->ls_obj_hash, &bd, i) {
 		if (i < start)
@@ -414,10 +421,10 @@ int lu_site_purge(const struct lu_env *env, struct lu_site *s, int nr)
 	}
 	/* race on s->ls_purge_start, but nobody cares */
 	s->ls_purge_start = i % CFS_HASH_NBKT(s->ls_obj_hash);
-
+out:
 	return nr;
 }
-EXPORT_SYMBOL(lu_site_purge);
+EXPORT_SYMBOL(lu_site_purge_objects);
 
 /*
  * Object printing.
@@ -625,9 +632,12 @@ static void lu_object_limit(const struct lu_env *env, struct lu_device *dev)
 
 	size = cfs_hash_size_get(dev->ld_site->ls_obj_hash);
 	nr = (__u64)lu_cache_nr;
-	if (size > nr)
-		lu_site_purge(env, dev->ld_site,
-			      min_t(__u64, size - nr, LU_CACHE_NR_MAX_ADJUST));
+	if (size <= nr)
+		return;
+
+	lu_site_purge_objects(env, dev->ld_site,
+			      min_t(__u64, size - nr, LU_CACHE_NR_MAX_ADJUST),
+			      false);
 }
 
 static struct lu_object *lu_object_new(const struct lu_env *env,
-- 
1.8.3.1

* [PATCH 18/60] staging: lustre: ptlrpc: skip lock if export failed
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (16 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 17/60] staging: lustre: obdclass: do not call lu_site_purge() for single object exceed James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 19/60] staging: lustre: llite: handle inactive OSTs better in statfs James Simmons
                   ` (42 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Alexander Boyko, James Simmons

From: Alexander Boyko <alexander.boyko@seagate.com>

This patch resolves an IO vs eviction race.
After eviction, the failed export stayed on the stale list while
the client still had IO in progress and reconnected during it.
The client then sent a brw RPC with the last lock cookie over the new
connection. The lock with the failed export was found and an assertion
was hit:
 (ost_handler.c:1812:ost_prolong_lock_one())
  ASSERTION( lock->l_export == opd->opd_exp ) failed:

 1. Skip the lock in ldlm_handle2lock if the lock's export has failed.
 2. Add validation of the lock for IO in hpreq_check(). The lock
    search is based on the granted interval tree. If the server doesn't
    have a valid lock, it replies to the client with ESTALE.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7702
Seagate-bug-id: MRP-2787
Reviewed-on: http://review.whamcloud.com/18120
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c |  7 +++++++
 drivers/staging/lustre/lustre/ptlrpc/service.c | 21 ++++++++-------------
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index afef5a2..5a94265 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -533,6 +533,13 @@ struct ldlm_lock *__ldlm_handle2lock(const struct lustre_handle *handle,
 	if (!lock)
 		return NULL;
 
+	if (lock->l_export && lock->l_export->exp_failed) {
+		CDEBUG(D_INFO, "lock export failed: lock %p, exp %p\n",
+		       lock, lock->l_export);
+		LDLM_LOCK_PUT(lock);
+		return NULL;
+	}
+
 	/* It's unlikely but possible that someone marked the lock as
 	 * destroyed after we did handle2object on it
 	 */
diff --git a/drivers/staging/lustre/lustre/ptlrpc/service.c b/drivers/staging/lustre/lustre/ptlrpc/service.c
index 70c7055..b8091c1 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/service.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/service.c
@@ -1264,20 +1264,15 @@ static int ptlrpc_server_hpreq_init(struct ptlrpc_service_part *svcpt,
 		 */
 		if (req->rq_ops->hpreq_check) {
 			rc = req->rq_ops->hpreq_check(req);
-			/**
-			 * XXX: Out of all current
-			 * ptlrpc_hpreq_ops::hpreq_check(), only
-			 * ldlm_cancel_hpreq_check() can return an error code;
-			 * other functions assert in similar places, which seems
-			 * odd. What also does not seem right is that handlers
-			 * for those RPCs do not assert on the same checks, but
-			 * rather handle the error cases. e.g. see
-			 * ost_rw_hpreq_check(), and ost_brw_read(),
-			 * ost_brw_write().
+			if (rc == -ESTALE) {
+				req->rq_status = rc;
+				ptlrpc_error(req);
+			}
+			/** can only return error,
+			 * 0 for normal request,
+			 *  or 1 for high priority request
 			 */
-			if (rc < 0)
-				return rc;
-			LASSERT(rc == 0 || rc == 1);
+			LASSERT(rc <= 1);
 		}
 
 		spin_lock_bh(&req->rq_export->exp_rpc_lock);
-- 
1.8.3.1

* [PATCH 19/60] staging: lustre: llite: handle inactive OSTs better in statfs
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (17 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 18/60] staging: lustre: ptlrpc: skip lock if export failed James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 20/60] staging: lustre: llite: remove obsolete comment for ll_unlink() James Simmons
                   ` (41 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

From: Andreas Dilger <andreas.dilger@intel.com>

Change the order of checks for inactive OSCs in lov_prep_statfs_set()
so that administratively disabled OSTs do not generate any output in
"lfs df" at all, to avoid needlessly cluttering the output.

Enable the lazystatfs mount option by default, so that "df" does not
hang when an OST is temporarily offline.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7759
Reviewed-on: http://review.whamcloud.com/19195
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/llite_lib.c | 1 +
 drivers/staging/lustre/lustre/lov/lov_request.c | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 769b307..0a87058 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -103,6 +103,7 @@ static struct ll_sb_info *ll_init_sbi(struct super_block *sb)
 	sbi->ll_flags |= LL_SBI_CHECKSUM;
 
 	sbi->ll_flags |= LL_SBI_LRU_RESIZE;
+	sbi->ll_flags |= LL_SBI_LAZYSTATFS;
 
 	for (i = 0; i <= LL_PROCESS_HIST_MAX; i++) {
 		spin_lock_init(&sbi->ll_rw_extents_info.pp_extents[i].
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index d43cc88..3a74791 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -344,9 +344,6 @@ int lov_prep_statfs_set(struct obd_device *obd, struct obd_info *oinfo,
 			continue;
 		}
 
-		if (!lov->lov_tgts[i]->ltd_active)
-			lov_check_and_wait_active(lov, i);
-
 		/* skip targets that have been explicitly disabled by the
 		 * administrator
 		 */
@@ -355,6 +352,9 @@ int lov_prep_statfs_set(struct obd_device *obd, struct obd_info *oinfo,
 			continue;
 		}
 
+		if (!lov->lov_tgts[i]->ltd_active)
+			lov_check_and_wait_active(lov, i);
+
 		req = kzalloc(sizeof(*req), GFP_NOFS);
 		if (!req) {
 			rc = -ENOMEM;
-- 
1.8.3.1

* [PATCH 20/60] staging: lustre: llite: remove obsolete comment for ll_unlink()
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (18 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 19/60] staging: lustre: llite: handle inactive OSTs better in statfs James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 21/60] staging: lustre: ptlrpc: correct use of list_add_tail() James Simmons
                   ` (40 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: "John L. Hammond" <john.hammond@intel.com>

Remove obsolete comments about the behavior of ll_unlink().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8003
Reviewed-on: http://review.whamcloud.com/19881
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/namei.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index f925656..fc17654 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -994,11 +994,6 @@ static int ll_create_nd(struct inode *dir, struct dentry *dentry,
 	return rc;
 }
 
-/* ll_unlink() doesn't update the inode with the new link count.
- * Instead, ll_ddelete() and ll_d_iput() will update it based upon if there
- * is any lock existing. They will recycle dentries and inodes based upon locks
- * too. b=20433
- */
 static int ll_unlink(struct inode *dir, struct dentry *dchild)
 {
 	struct ptlrpc_request *request = NULL;
-- 
1.8.3.1


* [PATCH 21/60] staging: lustre: ptlrpc: correct use of list_add_tail()
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (19 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 20/60] staging: lustre: llite: remove obsolete comment for ll_unlink() James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-31  8:54   ` Dan Carpenter
  2017-01-29  0:04 ` [PATCH 22/60] staging: lustre: fid: fix race in fid allocation James Simmons
                   ` (39 subsequent siblings)
  60 siblings, 1 reply; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: "John L. Hammond" <john.hammond@intel.com>

In sptlrpc_gc_add_sec(), swap the arguments to list_add_tail() so that
sec->ps_gc_list is appended to sec_gc_list, as was intended.
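
For readers unfamiliar with the helper: list_add_tail() takes the new
entry first and the list head second. The following is a minimal
userspace sketch, not part of the patch. struct list_head and the
helpers are re-implemented here for illustration only, the demo_*()
functions are hypothetical, and the sketch assumes every entry starts
out self-linked (as it would after INIT_LIST_HEAD()).

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Same argument order as the kernel helper: (new entry, list head). */
static void list_add_tail(struct list_head *new_entry, struct list_head *head)
{
	struct list_head *last = head->prev;

	new_entry->next = head;
	new_entry->prev = last;
	last->next = new_entry;
	head->prev = new_entry;
}

/* Count the entries reachable from @head, excluding the head itself. */
static int list_count(const struct list_head *head)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/* Hypothetical demo: queue two entries with the correct argument order. */
static int demo_correct_order(void)
{
	struct list_head gc_list, a, b;

	init_list_head(&gc_list);
	list_add_tail(&a, &gc_list);
	list_add_tail(&b, &gc_list);
	assert(gc_list.next == &a && gc_list.prev == &b);
	return list_count(&gc_list);
}

/* Hypothetical demo: same two entries, but with the arguments swapped. */
static int demo_swapped_order(void)
{
	struct list_head gc_list, a, b;

	init_list_head(&gc_list);
	init_list_head(&a);
	init_list_head(&b);
	list_add_tail(&gc_list, &a);	/* head is linked into a's ring */
	list_add_tail(&gc_list, &b);	/* head is re-linked into b's ring */
	return list_count(&gc_list);
}
```

With the correct order both entries end up on the list. With the
arguments swapped, the head is re-linked into each entry's ring in
turn, so only the most recently added entry stays reachable from the
head.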

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8270
Reviewed-on: http://review.whamcloud.com/20784
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/sec_gc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec_gc.c b/drivers/staging/lustre/lustre/ptlrpc/sec_gc.c
index 8ffd000..026bec7 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec_gc.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec_gc.c
@@ -66,7 +66,7 @@ void sptlrpc_gc_add_sec(struct ptlrpc_sec *sec)
 	sec->ps_gc_next = ktime_get_real_seconds() + sec->ps_gc_interval;
 
 	spin_lock(&sec_gc_list_lock);
-	list_add_tail(&sec_gc_list, &sec->ps_gc_list);
+	list_add_tail(&sec->ps_gc_list, &sec_gc_list);
 	spin_unlock(&sec_gc_list_lock);
 
 	CDEBUG(D_SEC, "added sec %p(%s)\n", sec, sec->ps_policy->sp_name);
-- 
1.8.3.1


* [PATCH 22/60] staging: lustre: fid: fix race in fid allocation
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (20 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 21/60] staging: lustre: ptlrpc: correct use of list_add_tail() James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-31  8:55   ` Dan Carpenter
  2017-01-29  0:04 ` [PATCH 23/60] staging: lustre: lmv: remove unused placement parameter James Simmons
                   ` (38 subsequent siblings)
  60 siblings, 1 reply; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Fan Yong,
	James Simmons

From: Fan Yong <fan.yong@intel.com>

There is a race condition when allocating FIDs/sequences in
parallel, as follows:

Thread1 releases the lcs_mutex via seq_fid_alloc_prep(). Another
FID allocation thread, thread2, can then obtain the lcs_mutex and
allocate a FID in the new sequence that thread1 has just allocated
via seq_client_alloc_seq(). After thread2 releases the lcs_mutex,
thread1 re-allocates the current FID in the new sequence without
checking whether another thread has already taken that FID while
thread1 was re-obtaining the lcs_mutex.

This race causes two objects to use the same FID, which triggers
OI conflicts and LMA verification failures.

This patch makes FID allocation and lu_client_seq modification
protected by the lcs_mutex.
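
The interleaving described above can be replayed deterministically in
a single thread. This is only a sketch under simplifying assumptions,
not the Lustre code: struct client_seq, fast_alloc(), install_seq()
and the replay_*() helpers are hypothetical stand-ins, the mutex is
represented by comments only, and the sequence number is an arbitrary
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_INIT_OID 1u	/* hypothetical stand-in for LUSTRE_FID_INIT_OID */
#define SK_NEW_SEQ  0x200000401ULL	/* arbitrary sequence number */

struct fid {
	uint64_t seq;
	uint32_t oid;
};

struct client_seq {
	struct fid cur;		/* last FID handed out */
	uint32_t width;		/* highest oid allowed in a sequence */
};

/* Fast path: bump the oid if the current sequence still has room. */
static bool fast_alloc(struct client_seq *cs, struct fid *out)
{
	if (cs->cur.seq != 0 && cs->cur.oid < cs->width) {
		cs->cur.oid++;
		*out = cs->cur;
		return true;
	}
	return false;
}

/* Switch the client to a freshly granted sequence. */
static void install_seq(struct client_seq *cs, uint64_t seqnr)
{
	cs->cur.seq = seqnr;
	cs->cur.oid = SK_INIT_OID;
}

static bool same_fid(struct fid a, struct fid b)
{
	return a.seq == b.seq && a.oid == b.oid;
}

/* Old ordering: the new sequence is installed while the lock is dropped. */
static bool replay_racy(void)
{
	struct client_seq cs = { .cur = { 0, 0 }, .width = 8 };
	struct fid t1, t2;
	bool ok;

	install_seq(&cs, SK_NEW_SEQ);	/* T1: unlocked update */
	ok = fast_alloc(&cs, &t2);	/* T2: holds the lock, allocates */
	assert(ok);
	(void)ok;
	t1 = cs.cur;			/* T1: re-locks, returns current FID */
	return same_fid(t1, t2);	/* both threads got the same FID */
}

/* New ordering: install and copy both happen with the lock held. */
static bool replay_fixed(void)
{
	struct client_seq cs = { .cur = { 0, 0 }, .width = 8 };
	struct fid t1, t2;
	bool ok;

	install_seq(&cs, SK_NEW_SEQ);	/* T1: lock re-taken first */
	t1 = cs.cur;			/* T1: copies before unlocking */
	ok = fast_alloc(&cs, &t2);	/* T2: runs only after T1 unlocks */
	assert(ok);
	(void)ok;
	return same_fid(t1, t2);	/* the two FIDs differ */
}
```

In the racy replay both threads return the same FID. Once the sequence
switch and the copy are done under the lock, thread2 can only ever see
the state after thread1 has taken its FID.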

Signed-off-by: Fan Yong <fan.yong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8319
Reviewed-on: http://review.whamcloud.com/20939
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/fid/fid_request.c | 55 ++++++++++++++++---------
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/lustre/lustre/fid/fid_request.c b/drivers/staging/lustre/lustre/fid/fid_request.c
index 999f250..62a9f7e 100644
--- a/drivers/staging/lustre/lustre/fid/fid_request.c
+++ b/drivers/staging/lustre/lustre/fid/fid_request.c
@@ -211,12 +211,35 @@ static int seq_fid_alloc_prep(struct lu_client_seq *seq,
 	return 0;
 }
 
-static void seq_fid_alloc_fini(struct lu_client_seq *seq)
+static void seq_fid_alloc_fini(struct lu_client_seq *seq, u64 seqnr,
+			       bool whole)
 {
 	LASSERT(seq->lcs_update == 1);
+
 	mutex_lock(&seq->lcs_mutex);
+	if (seqnr != 0) {
+		CDEBUG(D_INFO, "%s: New sequence [0x%16.16llx]\n",
+		       seq->lcs_name, seqnr);
+
+		seq->lcs_fid.f_seq = seqnr;
+		if (whole) {
+			/*
+			 * Since the caller require the whole seq,
+			 * so marked this seq to be used
+			 */
+			if (seq->lcs_type == LUSTRE_SEQ_METADATA)
+				seq->lcs_fid.f_oid =
+					LUSTRE_METADATA_SEQ_MAX_WIDTH;
+			else
+				seq->lcs_fid.f_oid = LUSTRE_DATA_SEQ_MAX_WIDTH;
+		} else {
+			seq->lcs_fid.f_oid = LUSTRE_FID_INIT_OID;
+		}
+		seq->lcs_fid.f_ver = 0;
+	}
+
 	--seq->lcs_update;
-	wake_up(&seq->lcs_waitq);
+	wake_up_all(&seq->lcs_waitq);
 }
 
 /* Allocate new fid on passed client @seq and save it to @fid. */
@@ -238,41 +261,33 @@ int seq_client_alloc_fid(const struct lu_env *env,
 	while (1) {
 		u64 seqnr;
 
-		if (!fid_is_zero(&seq->lcs_fid) &&
-		    fid_oid(&seq->lcs_fid) < seq->lcs_width) {
+		if (unlikely(!fid_is_zero(&seq->lcs_fid) &&
+			     fid_oid(&seq->lcs_fid) < seq->lcs_width)) {
 			/* Just bump last allocated fid and return to caller. */
-			seq->lcs_fid.f_oid += 1;
+			seq->lcs_fid.f_oid++;
 			rc = 0;
 			break;
 		}
 
+		/*
+		 * Release seq::lcs_mutex via seq_fid_alloc_prep() to avoid
+		 * deadlock during seq_client_alloc_seq().
+		 */
 		rc = seq_fid_alloc_prep(seq, &link);
 		if (rc)
 			continue;
 
 		rc = seq_client_alloc_seq(env, seq, &seqnr);
+		/* Re-take seq::lcs_mutex via seq_fid_alloc_fini(). */
+		seq_fid_alloc_fini(seq, rc ? 0 : seqnr, false);
 		if (rc) {
-			CERROR("%s: Can't allocate new sequence, rc %d\n",
+			CERROR("%s: Can't allocate new sequence, rc = %d\n",
 			       seq->lcs_name, rc);
-			seq_fid_alloc_fini(seq);
 			mutex_unlock(&seq->lcs_mutex);
 			return rc;
 		}
 
-		CDEBUG(D_INFO, "%s: Switch to sequence [0x%16.16Lx]\n",
-		       seq->lcs_name, seqnr);
-
-		seq->lcs_fid.f_oid = LUSTRE_FID_INIT_OID;
-		seq->lcs_fid.f_seq = seqnr;
-		seq->lcs_fid.f_ver = 0;
-
-		/*
-		 * Inform caller that sequence switch is performed to allow it
-		 * to setup FLD for it.
-		 */
 		rc = 1;
-
-		seq_fid_alloc_fini(seq);
 		break;
 	}
 
-- 
1.8.3.1


* [PATCH 23/60] staging: lustre: lmv: remove unused placement parameter
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (21 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 22/60] staging: lustre: fid: fix race in fid allocation James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 24/60] staging: lustre: lustre: Remove old commented out code James Simmons
                   ` (37 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: "John L. Hammond" <john.hammond@intel.com>

Remove the unused lmv.*.placement parameter along with supporting
functions and struct members.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7674
Reviewed-on: http://review.whamcloud.com/18019
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h   |  8 ----
 drivers/staging/lustre/lustre/lmv/lmv_obd.c   |  1 -
 drivers/staging/lustre/lustre/lmv/lproc_lmv.c | 68 ---------------------------
 3 files changed, 77 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 6d3bd05..5c217c0 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -403,18 +403,10 @@ struct lmv_tgt_desc {
 	unsigned long		ltd_active:1; /* target up for requests */
 };
 
-enum placement_policy {
-	PLACEMENT_CHAR_POLICY   = 0,
-	PLACEMENT_NID_POLICY    = 1,
-	PLACEMENT_INVAL_POLICY  = 2,
-	PLACEMENT_MAX_POLICY
-};
-
 struct lmv_obd {
 	int			refcount;
 	struct lu_client_fld	lmv_fld;
 	spinlock_t		lmv_lock;
-	enum placement_policy	lmv_placement;
 	struct lmv_desc		desc;
 	struct obd_uuid		cluuid;
 	struct obd_export	*exp;
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 915415c..5926461 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1284,7 +1284,6 @@ static int lmv_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	lmv->desc.ld_active_tgt_count = 0;
 	lmv->max_def_easize = 0;
 	lmv->max_easize = 0;
-	lmv->lmv_placement = PLACEMENT_CHAR_POLICY;
 
 	spin_lock_init(&lmv->lmv_lock);
 	mutex_init(&lmv->lmv_init_mutex);
diff --git a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
index 14fbc9c..ff45802 100644
--- a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
+++ b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
@@ -50,73 +50,6 @@ static ssize_t numobd_show(struct kobject *kobj, struct attribute *attr,
 }
 LUSTRE_RO_ATTR(numobd);
 
-static const char *placement_name[] = {
-	[PLACEMENT_CHAR_POLICY] = "CHAR",
-	[PLACEMENT_NID_POLICY]  = "NID",
-	[PLACEMENT_INVAL_POLICY]  = "INVAL"
-};
-
-static enum placement_policy placement_name2policy(char *name, int len)
-{
-	int		     i;
-
-	for (i = 0; i < PLACEMENT_MAX_POLICY; i++) {
-		if (!strncmp(placement_name[i], name, len))
-			return i;
-	}
-	return PLACEMENT_INVAL_POLICY;
-}
-
-static const char *placement_policy2name(enum placement_policy placement)
-{
-	LASSERT(placement < PLACEMENT_MAX_POLICY);
-	return placement_name[placement];
-}
-
-static ssize_t placement_show(struct kobject *kobj, struct attribute *attr,
-			      char *buf)
-{
-	struct obd_device *dev = container_of(kobj, struct obd_device,
-					      obd_kobj);
-	struct lmv_obd *lmv;
-
-	lmv = &dev->u.lmv;
-	return sprintf(buf, "%s\n", placement_policy2name(lmv->lmv_placement));
-}
-
-#define MAX_POLICY_STRING_SIZE 64
-
-static ssize_t placement_store(struct kobject *kobj, struct attribute *attr,
-			       const char *buffer,
-			       size_t count)
-{
-	struct obd_device *dev = container_of(kobj, struct obd_device,
-					      obd_kobj);
-	char dummy[MAX_POLICY_STRING_SIZE + 1];
-	enum placement_policy policy;
-	struct lmv_obd *lmv = &dev->u.lmv;
-
-	memcpy(dummy, buffer, MAX_POLICY_STRING_SIZE);
-
-	if (count > MAX_POLICY_STRING_SIZE)
-		count = MAX_POLICY_STRING_SIZE;
-
-	if (dummy[count - 1] == '\n')
-		count--;
-	dummy[count] = '\0';
-
-	policy = placement_name2policy(dummy, count);
-	if (policy != PLACEMENT_INVAL_POLICY) {
-		spin_lock(&lmv->lmv_lock);
-		lmv->lmv_placement = policy;
-		spin_unlock(&lmv->lmv_lock);
-	} else {
-		return -EINVAL;
-	}
-	return count;
-}
-LUSTRE_RW_ATTR(placement);
-
 static ssize_t activeobd_show(struct kobject *kobj, struct attribute *attr,
 			      char *buf)
 {
@@ -226,7 +159,6 @@ static int lmv_target_seq_open(struct inode *inode, struct file *file)
 static struct attribute *lmv_attrs[] = {
 	&lustre_attr_activeobd.attr,
 	&lustre_attr_numobd.attr,
-	&lustre_attr_placement.attr,
 	NULL,
 };
 
-- 
1.8.3.1


* [PATCH 24/60] staging: lustre: lustre: Remove old commented out code
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (22 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 23/60] staging: lustre: lmv: remove unused placement parameter James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 25/60] staging: lustre: llite: normal user can't set FS default stripe James Simmons
                   ` (36 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Ben Evans,
	James Simmons

From: Ben Evans <bevans@cray.com>

These #if 0 blocks have been in place for years. Assume
they are not used and remove them.

Signed-off-by: Ben Evans <bevans@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8058
Reviewed-on: http://review.whamcloud.com/20416
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h     | 2 --
 drivers/staging/lustre/lustre/lmv/lmv_obd.c     | 9 +--------
 drivers/staging/lustre/lustre/mdc/mdc_request.c | 8 +-------
 3 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 5c217c0..ab47078 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -483,8 +483,6 @@ enum obd_notify_event {
 	OBD_NOTIFY_ACTIVE,
 	/* Device deactivated */
 	OBD_NOTIFY_INACTIVE,
-	/* Device disconnected */
-	OBD_NOTIFY_DISCON,
 	/* Connect data for import were changed */
 	OBD_NOTIFY_OCD,
 	/* Sync request */
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 5926461..271e189 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -173,14 +173,7 @@ static int lmv_notify(struct obd_device *obd, struct obd_device *watched,
 		 */
 		obd->obd_self_export->exp_connect_data = *conn_data;
 	}
-#if 0
-	else if (ev == OBD_NOTIFY_DISCON) {
-		/*
-		 * For disconnect event, flush fld cache for failout MDS case.
-		 */
-		fld_client_flush(&lmv->lmv_fld);
-	}
-#endif
+
 	/*
 	 * Pass the notification up the chain.
 	 */
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 2cfd913..02f57d8 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2465,13 +2465,6 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
 	LASSERT(imp->imp_obd == obd);
 
 	switch (event) {
-	case IMP_EVENT_DISCON: {
-#if 0
-		/* XXX Pass event up to OBDs stack. used only for FLD now */
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_DISCON, NULL);
-#endif
-		break;
-	}
 	case IMP_EVENT_INACTIVE: {
 		struct client_obd *cli = &obd->u.cli;
 		/*
@@ -2503,6 +2496,7 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
 	case IMP_EVENT_OCD:
 		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_OCD, NULL);
 		break;
+	case IMP_EVENT_DISCON:
 	case IMP_EVENT_DEACTIVATE:
 	case IMP_EVENT_ACTIVATE:
 		break;
-- 
1.8.3.1


* [PATCH 25/60] staging: lustre: llite: normal user can't set FS default stripe
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (23 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 24/60] staging: lustre: lustre: Remove old commented out code James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 26/60] staging: lustre: llite: Trust creates in revalidate too James Simmons
                   ` (35 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Lai Siyao,
	James Simmons

From: Lai Siyao <lai.siyao@intel.com>

The current client doesn't check permission before updating the
filesystem default stripe on the MGS, which is neither secure nor
obvious.

Since we setattr on the MDS first, and only then set the default
stripe on the MGS, we can simply return the error upon setattr
failure.

The filesystem default stripe is now stored in ROOT on the MDT, so
saving it in the system config is only for compatibility with old
servers; this will be removed in the future.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8454
Reviewed-on: http://review.whamcloud.com/21612
Reviewed-on: http://review.whamcloud.com/22580
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dir.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 526fea2..13b3592 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -521,12 +521,15 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump,
 	rc = md_setattr(sbi->ll_md_exp, op_data, lump, lum_size, &req);
 	ll_finish_md_op_data(op_data);
 	ptlrpc_req_finished(req);
-	if (rc) {
-		if (rc != -EPERM && rc != -EACCES)
-			CERROR("mdc_setattr fails: rc = %d\n", rc);
-	}
+	if (rc)
+		return rc;
 
-	/* In the following we use the fact that LOV_USER_MAGIC_V1 and
+#if OBD_OCD_VERSION(2, 13, 53, 0) > LUSTRE_VERSION_CODE
+	/*
+	 * 2.9 server has stored filesystem default stripe in ROOT xattr,
+	 * and it's stored into system config for backward compatibility.
+	 *
+	 * In the following we use the fact that LOV_USER_MAGIC_V1 and
 	 * LOV_USER_MAGIC_V3 have the same initial fields so we do not
 	 * need to make the distinction between the 2 versions
 	 */
@@ -567,6 +570,7 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump,
 end:
 		kfree(param);
 	}
+#endif
 	return rc;
 }
 
-- 
1.8.3.1


* [PATCH 26/60] staging: lustre: llite: Trust creates in revalidate too.
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (24 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 25/60] staging: lustre: llite: normal user can't set FS default stripe James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 27/60] staging: lustre: mgc: handle config_llog_data::cld_refcount properly James Simmons
                   ` (34 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

From: Oleg Drokin <oleg.drokin@intel.com>

By forcing creates to always go via lookup we lose some
important caching benefits too.
Instead let's trust creates with positive cached entries.

Then we have 3 possible outcomes:
1. Negative dentry - we go via atomic_open and do the create
   by name there.
2. Positive dentry, no contention - we just go straight to
   ll_intent_file_open and open by fid.
3. Positive dentry, contention - by the time we reach the server,
   the inode is gone. We get ENOENT, which is unacceptable to return
   from create. But since we know it's a create, we substitute it
   with ESTALE and the VFS retries with LOOKUP_REVAL set; we catch
   that in revalidate and force a lookup (same path as before this
   patch).
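
The two decisions behind these outcomes can be sketched in isolation.
This is a simplified userspace illustration, not the llite code: the
SK_* flag values and the sk_*() function names are hypothetical, and
the real ll_revalidate_dentry() performs further checks that are left
out here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for this sketch; the kernel defines its own. */
#define SK_IT_CREAT	0x1u
#define SK_LOOKUP_REVAL	0x2u

/*
 * Returns 1 to trust the cached dentry, 0 to force a lookup by name.
 * Only the LOOKUP_REVAL check is modelled.
 */
static int sk_revalidate(unsigned int lookup_flags)
{
	if (lookup_flags & SK_LOOKUP_REVAL)
		return 0;
	return 1;
}

/*
 * Post-process the result of an open-by-FID request: a create must not
 * report -ENOENT, so turn it into -ESTALE to make the caller retry.
 */
static int sk_fixup_open_rc(int rc, unsigned int it_op)
{
	if (rc == -ENOENT && (it_op & SK_IT_CREAT))
		rc = -ESTALE;
	return rc;
}
```

A plain open that hits -ENOENT still reports -ENOENT; only creates are
redirected into the retry path, where the second pass arrives with
LOOKUP_REVAL set and falls back to lookup by name.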

Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8371
Reviewed-on: http://review.whamcloud.com/21168
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dcache.c | 13 +++++--------
 drivers/staging/lustre/lustre/llite/file.c   | 11 +++++++++++
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dcache.c b/drivers/staging/lustre/lustre/llite/dcache.c
index 65bf0c4..966f580 100644
--- a/drivers/staging/lustre/lustre/llite/dcache.c
+++ b/drivers/staging/lustre/lustre/llite/dcache.c
@@ -247,17 +247,14 @@ static int ll_revalidate_dentry(struct dentry *dentry,
 		return 1;
 
 	/*
-	 * if open&create is set, talk to MDS to make sure file is created if
-	 * necessary, because we can't do this in ->open() later since that's
-	 * called on an inode. return 0 here to let lookup to handle this.
+	 * VFS warns us that this is the second go around and previous
+	 * operation failed (most likely open|creat), so this time
+	 * we better talk to the server via the lookup path by name,
+	 * not by fid.
 	 */
-	if ((lookup_flags & (LOOKUP_OPEN | LOOKUP_CREATE)) ==
-	    (LOOKUP_OPEN | LOOKUP_CREATE))
+	if (lookup_flags & LOOKUP_REVAL)
 		return 0;
 
-	if (lookup_flags & (LOOKUP_PARENT | LOOKUP_OPEN | LOOKUP_CREATE))
-		return 1;
-
 	if (!dentry_may_statahead(dir, dentry))
 		return 1;
 
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index b681e15..0c83bd7 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -417,6 +417,17 @@ static int ll_intent_file_open(struct dentry *de, void *lmm, int lmmsize,
 	ptlrpc_req_finished(req);
 	ll_intent_drop_lock(itp);
 
+	/*
+	 * We did open by fid, but by the time we got to the server,
+	 * the object disappeared. If this is a create, we cannot really
+	 * tell the userspace that the file it was trying to create
+	 * does not exist. Instead let's return -ESTALE, and the VFS will
+	 * retry the create with LOOKUP_REVAL that we are going to catch
+	 * in ll_revalidate_dentry() and use lookup then.
+	 */
+	if (rc == -ENOENT && itp->it_op & IT_CREAT)
+		rc = -ESTALE;
+
 	return rc;
 }
 
-- 
1.8.3.1


* [PATCH 27/60] staging: lustre: mgc: handle config_llog_data::cld_refcount properly
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (25 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 26/60] staging: lustre: llite: Trust creates in revalidate too James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 28/60] staging: lustre: ldlm: ASSERTION(flock->blocking_export!=0) failed James Simmons
                   ` (33 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Fan Yong,
	James Simmons

From: Fan Yong <fan.yong@intel.com>

Originally, the logic for handling config_llog_data::cld_refcount
was somewhat confusing; it could leak the cld_refcount or
trigger "LASSERT(atomic_read(&cld->cld_refcount) > 0);" when putting
the reference. This patch cleans up the related logic as follows:

1) When the 'cld' is created, its reference count is set to 1.

2) No additional reference is needed when adding the 'cld' to the
   list 'config_llog_list'.

3) Increase 'cld_refcount' when setting lock data after
   mgc_process_log() has completed mgc_enqueue() successfully.

4) When mgc_requeue_thread() traverses the 'config_llog_list',
   it needs to take an additional reference on each 'cld' to keep
   it from being freed during subsequent processing. The reference
   also prevents the 'cld' from being dropped from the
   'config_llog_list', so mgc_requeue_thread() can safely locate
   the next 'cld' and then decrease the 'cld_refcount' of the
   previous one.

5) mgc_blocking_ast() will drop the reference on 'cld_refcount'
   that is taken in mgc_process_log().

6) Other callers need to call config_log_find() to find the 'cld'
   if they want to access the related config log data. That
   increases the 'cld_refcount' so the 'cld' is not freed while in
   use. The caller needs to call config_log_put() when it is done
   with the 'cld'.

7) Other confusing or redundant logic is dropped.

In addition, the patch strengthens the protection of the
'config_llog_data' flags, such as 'cld_stopping'/'cld_lostlock',
as follows.

a) Use 'config_list_lock' (spinlock) to handle possible
   parallel access to these flags between mgc_requeue_thread()
   and other config llog data visitors, such as mount/umount,
   blocking_ast, and so on.

b) Use 'config_llog_data::cld_lock' (mutex) to protect other
   parallel access to these flags among the various blockable
   operations, such as mount, umount, and blocking ast.

The 'config_llog_data::cld_lock' is also used for protecting
the sub-cld members, such as 'cld_sptlrpc'/'cld_params', and
so on.
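
Rules 1), 2) and 6) amount to a conventional get/put pattern, which
can be sketched as follows. This is a single-threaded userspace
illustration with hypothetical names (struct cld, cld_create(),
cld_find(), cld_put(), cld_demo()). It omits the locking that the real
code does with 'config_list_lock' and 'cld_lock', and it is not the
mgc implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct cld {
	char name[32];
	int refcount;
	struct cld *next;
};

static struct cld *cld_list;	/* stand-in for config_llog_list */
static int cld_freed;		/* number of objects freed so far */

static struct cld *cld_create(const char *name)
{
	struct cld *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	strncpy(c->name, name, sizeof(c->name) - 1);
	c->refcount = 1;	/* rule 1: creation holds one reference */
	c->next = cld_list;	/* rule 2: the list takes no extra reference */
	cld_list = c;
	return c;
}

static void cld_put(struct cld *c)
{
	struct cld **pp;

	assert(c->refcount > 0);
	if (--c->refcount > 0)
		return;

	/* Last reference dropped: unlink from the list and free. */
	for (pp = &cld_list; *pp; pp = &(*pp)->next) {
		if (*pp == c) {
			*pp = c->next;
			break;
		}
	}
	free(c);
	cld_freed++;
}

/* Rule 6: a successful find returns the object with a reference held. */
static struct cld *cld_find(const char *name)
{
	struct cld *c;

	for (c = cld_list; c; c = c->next) {
		if (strcmp(c->name, name) == 0) {
			c->refcount++;
			return c;
		}
	}
	return NULL;
}

/* Hypothetical walk-through; returns how many objects were freed. */
static int cld_demo(void)
{
	int before = cld_freed;
	struct cld *c = cld_create("demo-client");
	struct cld *found;

	if (!c)
		return -1;
	found = cld_find("demo-client");	/* refcount is now 2 */
	assert(found == c && c->refcount == 2);
	cld_put(found);				/* user is done: back to 1 */
	assert(cld_freed == before);
	cld_put(c);				/* drop the creation reference */
	return cld_freed - before;
}
```

Every cld_find() is paired with exactly one cld_put(), and the object
is freed only when the creation reference is dropped as well.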

Signed-off-by: Fan Yong <fan.yong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8408
Reviewed-on: http://review.whamcloud.com/21616
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/mgc/mgc_request.c | 183 ++++++++++++------------
 1 file changed, 94 insertions(+), 89 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index b9c522a..6a76605 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -142,10 +142,10 @@ static void config_log_put(struct config_llog_data *cld)
 
 		if (cld->cld_recover)
 			config_log_put(cld->cld_recover);
-		if (cld->cld_sptlrpc)
-			config_log_put(cld->cld_sptlrpc);
 		if (cld->cld_params)
 			config_log_put(cld->cld_params);
+		if (cld->cld_sptlrpc)
+			config_log_put(cld->cld_sptlrpc);
 		if (cld_is_sptlrpc(cld))
 			sptlrpc_conf_log_stop(cld->cld_logname);
 
@@ -175,13 +175,10 @@ struct config_llog_data *config_log_find(char *logname,
 		/* instance may be NULL, should check name */
 		if (strcmp(logname, cld->cld_logname) == 0) {
 			found = cld;
+			config_log_get(found);
 			break;
 		}
 	}
-	if (found) {
-		atomic_inc(&found->cld_refcount);
-		LASSERT(found->cld_stopping == 0 || cld_is_sptlrpc(found) == 0);
-	}
 	spin_unlock(&config_list_lock);
 	return found;
 }
@@ -203,6 +200,12 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 	if (!cld)
 		return ERR_PTR(-ENOMEM);
 
+	rc = mgc_logname2resid(logname, &cld->cld_resid, type);
+	if (rc) {
+		kfree(cld);
+		return ERR_PTR(rc);
+	}
+
 	strcpy(cld->cld_logname, logname);
 	if (cfg)
 		cld->cld_cfg = *cfg;
@@ -223,17 +226,10 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 		cld->cld_cfg.cfg_obdname = obd->obd_name;
 	}
 
-	rc = mgc_logname2resid(logname, &cld->cld_resid, type);
-
 	spin_lock(&config_list_lock);
 	list_add(&cld->cld_list_chain, &config_llog_list);
 	spin_unlock(&config_list_lock);
 
-	if (rc) {
-		config_log_put(cld);
-		return ERR_PTR(rc);
-	}
-
 	if (cld_is_sptlrpc(cld)) {
 		rc = mgc_process_log(obd, cld);
 		if (rc && rc != -ENOENT)
@@ -284,14 +280,15 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
  * We have one active log per "mount" - client instance or servername.
  * Each instance may be at a different point in the log.
  */
-static int config_log_add(struct obd_device *obd, char *logname,
-			  struct config_llog_instance *cfg,
-			  struct super_block *sb)
+static struct config_llog_data *
+config_log_add(struct obd_device *obd, char *logname,
+	       struct config_llog_instance *cfg, struct super_block *sb)
 {
 	struct lustre_sb_info *lsi = s2lsi(sb);
 	struct config_llog_data *cld;
 	struct config_llog_data *sptlrpc_cld;
 	struct config_llog_data *params_cld;
+	bool locked = false;
 	char			seclogname[32];
 	char			*ptr;
 	int			rc;
@@ -305,7 +302,7 @@ static int config_log_add(struct obd_device *obd, char *logname,
 	ptr = strrchr(logname, '-');
 	if (!ptr || ptr - logname > 8) {
 		CERROR("logname %s is too long\n", logname);
-		return -EINVAL;
+		return ERR_PTR(-EINVAL);
 	}
 
 	memcpy(seclogname, logname, ptr - logname);
@@ -326,14 +323,14 @@ static int config_log_add(struct obd_device *obd, char *logname,
 		rc = PTR_ERR(params_cld);
 		CERROR("%s: can't create params log: rc = %d\n",
 		       obd->obd_name, rc);
-		goto out_err1;
+		goto out_sptlrpc;
 	}
 
 	cld = do_config_log_add(obd, logname, CONFIG_T_CONFIG, cfg, sb);
 	if (IS_ERR(cld)) {
 		CERROR("can't create log: %s\n", logname);
 		rc = PTR_ERR(cld);
-		goto out_err2;
+		goto out_params;
 	}
 
 	cld->cld_sptlrpc = sptlrpc_cld;
@@ -350,33 +347,52 @@ static int config_log_add(struct obd_device *obd, char *logname,
 			CERROR("%s: sptlrpc log name not correct, %s: rc = %d\n",
 			       obd->obd_name, seclogname, -EINVAL);
 			config_log_put(cld);
-			return -EINVAL;
+			rc = -EINVAL;
+			goto out_cld;
 		}
 		recover_cld = config_recover_log_add(obd, seclogname, cfg, sb);
 		if (IS_ERR(recover_cld)) {
 			rc = PTR_ERR(recover_cld);
-			goto out_err3;
+			goto out_cld;
 		}
+
+		mutex_lock(&cld->cld_lock);
+		locked = true;
 		cld->cld_recover = recover_cld;
 	}
 
-	return 0;
+	if (!locked)
+		mutex_lock(&cld->cld_lock);
+	cld->cld_params = params_cld;
+	cld->cld_sptlrpc = sptlrpc_cld;
+	mutex_unlock(&cld->cld_lock);
+
+	return cld;
 
-out_err3:
+out_cld:
 	config_log_put(cld);
 
-out_err2:
+out_params:
 	config_log_put(params_cld);
 
-out_err1:
+out_sptlrpc:
 	config_log_put(sptlrpc_cld);
 
 out_err:
-	return rc;
+	return ERR_PTR(rc);
 }
 
 static DEFINE_MUTEX(llog_process_lock);
 
+static inline void config_mark_cld_stop(struct config_llog_data *cld)
+{
+	mutex_lock(&cld->cld_lock);
+	spin_lock(&config_list_lock);
+	cld->cld_stopping = 1;
+	spin_unlock(&config_list_lock);
+	mutex_unlock(&cld->cld_lock);
+}
+
 /** Stop watching for updates on this log.
  */
 static int config_log_end(char *logname, struct config_llog_instance *cfg)
@@ -406,36 +422,32 @@ static int config_log_end(char *logname, struct config_llog_instance *cfg)
 		return rc;
 	}
 
+	spin_lock(&config_list_lock);
 	cld->cld_stopping = 1;
+	spin_unlock(&config_list_lock);
 
 	cld_recover = cld->cld_recover;
 	cld->cld_recover = NULL;
+
+	cld_params = cld->cld_params;
+	cld->cld_params = NULL;
+	cld_sptlrpc = cld->cld_sptlrpc;
+	cld->cld_sptlrpc = NULL;
 	mutex_unlock(&cld->cld_lock);
 
 	if (cld_recover) {
-		mutex_lock(&cld_recover->cld_lock);
-		cld_recover->cld_stopping = 1;
-		mutex_unlock(&cld_recover->cld_lock);
+		config_mark_cld_stop(cld_recover);
 		config_log_put(cld_recover);
 	}
 
-	spin_lock(&config_list_lock);
-	cld_sptlrpc = cld->cld_sptlrpc;
-	cld->cld_sptlrpc = NULL;
-	cld_params = cld->cld_params;
-	cld->cld_params = NULL;
-	spin_unlock(&config_list_lock);
-
-	if (cld_sptlrpc)
-		config_log_put(cld_sptlrpc);
-
 	if (cld_params) {
-		mutex_lock(&cld_params->cld_lock);
-		cld_params->cld_stopping = 1;
-		mutex_unlock(&cld_params->cld_lock);
+		config_mark_cld_stop(cld_params);
 		config_log_put(cld_params);
 	}
 
+	if (cld_sptlrpc)
+		config_log_put(cld_sptlrpc);
+
 	/* drop the ref from the find */
 	config_log_put(cld);
 	/* drop the start ref */
@@ -531,11 +543,10 @@ static int mgc_requeue_thread(void *data)
 	/* Keep trying failed locks periodically */
 	spin_lock(&config_list_lock);
 	rq_state |= RQ_RUNNING;
-	while (1) {
+	while (!(rq_state & RQ_STOP)) {
 		struct l_wait_info lwi;
 		struct config_llog_data *cld, *cld_prev;
 		int rand = cfs_rand() & MGC_TIMEOUT_RAND_CENTISEC;
-		int stopped = !!(rq_state & RQ_STOP);
 		int to;
 
 		/* Any new or requeued lostlocks will change the state */
@@ -571,44 +582,40 @@ static int mgc_requeue_thread(void *data)
 		spin_lock(&config_list_lock);
 		rq_state &= ~RQ_PRECLEANUP;
 		list_for_each_entry(cld, &config_llog_list, cld_list_chain) {
-			if (!cld->cld_lostlock)
+			if (!cld->cld_lostlock || cld->cld_stopping)
 				continue;
 
+			/*
+			 * hold reference to avoid being freed during
+			 * subsequent processing.
+			 */
+			config_log_get(cld);
+			cld->cld_lostlock = 0;
 			spin_unlock(&config_list_lock);
 
-			LASSERT(atomic_read(&cld->cld_refcount) > 0);
-
-			/* Whether we enqueued again or not in mgc_process_log,
-			 * we're done with the ref from the old enqueue
-			 */
 			if (cld_prev)
 				config_log_put(cld_prev);
 			cld_prev = cld;
 
-			cld->cld_lostlock = 0;
-			if (likely(!stopped))
+			if (likely(!(rq_state & RQ_STOP))) {
 				do_requeue(cld);
-
-			spin_lock(&config_list_lock);
+				spin_lock(&config_list_lock);
+			} else {
+				spin_lock(&config_list_lock);
+				break;
+			}
 		}
 		spin_unlock(&config_list_lock);
 		if (cld_prev)
 			config_log_put(cld_prev);
 
-		/* break after scanning the list so that we can drop
-		 * refcount to losing lock clds
-		 */
-		if (unlikely(stopped)) {
-			spin_lock(&config_list_lock);
-			break;
-		}
-
 		/* Wait a bit to see if anyone else needs a requeue */
 		lwi = (struct l_wait_info) { 0 };
 		l_wait_event(rq_waitq, rq_state & (RQ_NOW | RQ_STOP),
 			     &lwi);
 		spin_lock(&config_list_lock);
 	}
+
 	/* spinlock and while guarantee RQ_NOW and RQ_LATER are not set */
 	rq_state &= ~RQ_RUNNING;
 	spin_unlock(&config_list_lock);
@@ -624,32 +631,24 @@ static int mgc_requeue_thread(void *data)
  */
 static void mgc_requeue_add(struct config_llog_data *cld)
 {
+	bool wakeup = false;
+
 	CDEBUG(D_INFO, "log %s: requeue (r=%d sp=%d st=%x)\n",
 	       cld->cld_logname, atomic_read(&cld->cld_refcount),
 	       cld->cld_stopping, rq_state);
 	LASSERT(atomic_read(&cld->cld_refcount) > 0);
 
 	mutex_lock(&cld->cld_lock);
-	if (cld->cld_stopping || cld->cld_lostlock) {
-		mutex_unlock(&cld->cld_lock);
-		return;
-	}
-	/* this refcount will be released in mgc_requeue_thread. */
-	config_log_get(cld);
-	cld->cld_lostlock = 1;
-	mutex_unlock(&cld->cld_lock);
-
-	/* Hold lock for rq_state */
 	spin_lock(&config_list_lock);
-	if (rq_state & RQ_STOP) {
-		spin_unlock(&config_list_lock);
-		cld->cld_lostlock = 0;
-		config_log_put(cld);
-	} else {
+	if (!(rq_state & RQ_STOP) && !cld->cld_stopping && !cld->cld_lostlock) {
+		cld->cld_lostlock = 1;
 		rq_state |= RQ_NOW;
-		spin_unlock(&config_list_lock);
-		wake_up(&rq_waitq);
+		wakeup = true;
 	}
+	spin_unlock(&config_list_lock);
+	mutex_unlock(&cld->cld_lock);
+	if (wakeup)
+		wake_up(&rq_waitq);
 }
 
 static int mgc_llog_init(const struct lu_env *env, struct obd_device *obd)
@@ -812,6 +811,8 @@ static int mgc_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 
 		/* held at mgc_process_log(). */
 		LASSERT(atomic_read(&cld->cld_refcount) > 0);
+
+		lock->l_ast_data = NULL;
 		/* Are we done with this log? */
 		if (cld->cld_stopping) {
 			CDEBUG(D_MGC, "log %s: stopping, won't requeue\n",
@@ -1661,16 +1662,18 @@ int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld)
 				goto restart;
 			} else {
 				mutex_lock(&cld->cld_lock);
+				spin_lock(&config_list_lock);
 				cld->cld_lostlock = 1;
+				spin_unlock(&config_list_lock);
 			}
 		} else {
 			/* mark cld_lostlock so that it will requeue
 			 * after MGC becomes available.
 			 */
+			spin_lock(&config_list_lock);
 			cld->cld_lostlock = 1;
+			spin_unlock(&config_list_lock);
 		}
-		/* Get extra reference, it will be put in requeue thread */
-		config_log_get(cld);
 	}
 
 	if (cld_is_recover(cld)) {
@@ -1681,7 +1684,9 @@ int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld)
 				CERROR("%s: recover log %s failed: rc = %d not fatal.\n",
 				       mgc->obd_name, cld->cld_logname, rc);
 				rc = 0;
+				spin_lock(&config_list_lock);
 				cld->cld_lostlock = 1;
+				spin_unlock(&config_list_lock);
 			}
 		}
 	} else {
@@ -1749,12 +1754,9 @@ static int mgc_process_config(struct obd_device *obd, u32 len, void *buf)
 		       cfg->cfg_last_idx);
 
 		/* We're only called through here on the initial mount */
-		rc = config_log_add(obd, logname, cfg, sb);
-		if (rc)
-			break;
-		cld = config_log_find(logname, cfg);
-		if (!cld) {
-			rc = -ENOENT;
+		cld = config_log_add(obd, logname, cfg, sb);
+		if (IS_ERR(cld)) {
+			rc = PTR_ERR(cld);
 			break;
 		}
 
@@ -1770,11 +1772,15 @@ static int mgc_process_config(struct obd_device *obd, u32 len, void *buf)
 					 imp_connect_data, IMP_RECOV)) {
 				rc = mgc_process_log(obd, cld->cld_recover);
 			} else {
-				struct config_llog_data *cir = cld->cld_recover;
+				struct config_llog_data *cir;
 
+				mutex_lock(&cld->cld_lock);
+				cir = cld->cld_recover;
 				cld->cld_recover = NULL;
+				mutex_unlock(&cld->cld_lock);
 				config_log_put(cir);
 			}
+
 			if (rc)
 				CERROR("Cannot process recover llog %d\n", rc);
 		}
@@ -1792,7 +1798,6 @@ static int mgc_process_config(struct obd_device *obd, u32 len, void *buf)
 				       "%s: can't process params llog: rc = %d\n",
 				       obd->obd_name, rc);
 		}
-		config_log_put(cld);
 
 		break;
 	}
-- 
1.8.3.1

* [PATCH 28/60] staging: lustre: ldlm: ASSERTION(flock->blocking_export!=0) failed
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (26 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 27/60] staging: lustre: mgc: handle config_llog_data::cld_refcount properly James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 29/60] staging: lustre: llite: Setting xattr are properly checked with and without ACLs James Simmons
                   ` (32 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Andriy Skulysh, Ben Evans, James Simmons

From: Andriy Skulysh <andriy.skulysh@seagate.com>

The whole policy structure was zeroed twice: once during enqueue
and a second time during resend or replay. The policy structure
should be initialized with default values only in ldlm_lock_new().
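
A minimal userspace sketch of the idea, not the ldlm code itself: the
types and the field names below are simplified, hypothetical stand-ins.
It shows why a conversion helper that no longer zeroes the destination
preserves state that was set after the one-time initialization.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the wire and local policy types. */
struct wire_flock {
	uint64_t start;
	uint64_t end;
	uint32_t pid;
};

struct local_flock {
	uint64_t start;
	uint64_t end;
	uint32_t pid;
	uint64_t blocking_export;	/* local-only state in this sketch */
};

/* One-time default initialization, the role the patch assigns to
 * ldlm_lock_new().
 */
static void policy_init(struct local_flock *lp)
{
	assert(lp);
	memset(lp, 0, sizeof(*lp));
}

/* Copy only the fields carried on the wire; leave everything else alone,
 * so a second conversion (resend or replay) does not wipe local state.
 */
static void flock_wire_to_local(const struct wire_flock *wp,
				struct local_flock *lp)
{
	lp->start = wp->start;
	lp->end = wp->end;
	lp->pid = wp->pid;
}
```

With a memset() inside the conversion helper, the second call would
reset blocking_export to zero, which is the kind of state loss the
assertion in the subject line guards against.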

Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Signed-off-by: Ben Evans <bevans@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8349
Seagate-bug-id: MRP-2536, MRP-2909
Reviewed-on: http://review.whamcloud.com/21061
Reviewed-by: Alexander Boyko <alexander.boyko@seagate.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_extent.c    | 1 -
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c     | 1 -
 drivers/staging/lustre/lustre/ldlm/ldlm_inodebits.c | 1 -
 3 files changed, 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
index 32b73ee..5616ea4 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
@@ -243,7 +243,6 @@ void ldlm_extent_unlink_lock(struct ldlm_lock *lock)
 void ldlm_extent_policy_wire_to_local(const union ldlm_wire_policy_data *wpolicy,
 				      union ldlm_policy_data *lpolicy)
 {
-	memset(lpolicy, 0, sizeof(*lpolicy));
 	lpolicy->l_extent.start = wpolicy->l_extent.start;
 	lpolicy->l_extent.end = wpolicy->l_extent.end;
 	lpolicy->l_extent.gid = wpolicy->l_extent.gid;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
index f815827..b7f28b3 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
@@ -615,7 +615,6 @@ struct ldlm_flock_wait_data {
 void ldlm_flock_policy_wire_to_local(const union ldlm_wire_policy_data *wpolicy,
 				     union ldlm_policy_data *lpolicy)
 {
-	memset(lpolicy, 0, sizeof(*lpolicy));
 	lpolicy->l_flock.start = wpolicy->l_flock.lfw_start;
 	lpolicy->l_flock.end = wpolicy->l_flock.lfw_end;
 	lpolicy->l_flock.pid = wpolicy->l_flock.lfw_pid;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_inodebits.c b/drivers/staging/lustre/lustre/ldlm/ldlm_inodebits.c
index 8e1709d..ae37c36 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_inodebits.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_inodebits.c
@@ -57,7 +57,6 @@
 void ldlm_ibits_policy_wire_to_local(const union ldlm_wire_policy_data *wpolicy,
 				     union ldlm_policy_data *lpolicy)
 {
-	memset(lpolicy, 0, sizeof(*lpolicy));
 	lpolicy->l_inodebits.bits = wpolicy->l_inodebits.bits;
 }
 
-- 
1.8.3.1

* [PATCH 29/60] staging: lustre: llite: Setting xattr are properly checked with and without ACLs
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (27 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 28/60] staging: lustre: ldlm: ASSERTION(flock->blocking_export!=0) failed James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 30/60] staging: lustre: ptlrpc: comment for FLD_QUERY RPC reply swab James Simmons
                   ` (31 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Dmitry Eremin, James Simmons

From: Dmitry Eremin <dmitry.eremin@intel.com>

Permissions for setting extended attributes are now properly checked
with and without ACLs. In the user.* namespace, only regular files and
directories can have extended attributes. For sticky directories, only
the owner and privileged users can write attributes.
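
The file-type rule for the user.* namespace can be sketched in
userspace C as follows. The helper name is hypothetical; only the
S_ISREG()/S_ISDIR() test mirrors the check described above, and the
sticky-directory rule is not covered by this sketch.

```c
#define _DEFAULT_SOURCE		/* expose S_IF* constants under strict -std */
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper: return 0 if a user.* xattr may be set on an inode
 * with this mode, or -EPERM if the inode is neither a regular file nor a
 * directory.
 */
static int user_xattr_mode_check(mode_t mode)
{
	if (!S_ISREG(mode) && !S_ISDIR(mode))
		return -EPERM;

	return 0;
}
```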

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1482
Reviewed-on: http://review.whamcloud.com/21496
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/xattr.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index 7a848eb..421cc04 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -132,6 +132,15 @@ static int xattr_type_filter(struct ll_sb_info *sbi,
 	    (!strcmp(name, "ima") || !strcmp(name, "evm")))
 		return -EOPNOTSUPP;
 
+	/*
+	 * In user.* namespace, only regular files and directories can have
+	 * extended attributes.
+	 */
+	if (handler->flags == XATTR_USER_T) {
+		if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
+			return -EPERM;
+	}
+
 	sprintf(fullname, "%s%s\n", handler->prefix, name);
 	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode),
 			 valid, fullname, pv, size, 0, flags,
-- 
1.8.3.1

* [PATCH 30/60] staging: lustre: ptlrpc: comment for FLD_QUERY RPC reply swab
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (28 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 29/60] staging: lustre: llite: Setting xattr are properly checked with and without ACLs James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:04 ` [PATCH 31/60] staging: lustre: clio: sync write should update mtime James Simmons
                   ` (30 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Fan Yong,
	James Simmons

From: Fan Yong <fan.yong@intel.com>

The 'fld_read_server' uses 'RMF_GENERIC_DATA' to hold the 'FLD_QUERY'
RPC reply that is composed of 'struct lu_seq_range_array'. But there
is no swabber function registered for 'RMF_GENERIC_DATA', so the RPC
peers need to handle the RPC reply in a fixed little-endian format.

In theory, we could define a new structure with a registered swabber
to handle the 'FLD_QUERY' RPC reply automatically. In practice, that
is not easy to do within the current 'struct req_msg_field' framework,
because the sequence range array in the RPC reply is not of fixed
length: its length depends on the 'lu_seq_range' count, which is
unknown when the RPC buffer is prepared. Generally, for such
flexible-length RPC usage, there is a field in the RPC layout that
indicates the data length. But for the 'FLD_READ' RPC, we have no way
to do that unless we add a new length field, which would break the
on-wire RPC protocol and cause interoperability trouble with old
peers.
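
As an illustration of what handling a fixed little-endian reply means
for a peer, here is a small userspace sketch. The helper name is
hypothetical; in kernel code the same job is normally done with
le64_to_cpu().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a 64-bit little-endian value from a byte
 * buffer. The result is the same on little-endian and big-endian hosts,
 * because the bytes are combined by position rather than by casting.
 */
static uint64_t get_le64(const unsigned char *p)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);

	return v;
}
```

Each 64-bit field of a range entry would be decoded this way by the
receiver, since no swabber runs on the buffer automatically.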

Signed-off-by: Fan Yong <fan.yong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6284
Reviewed-on: http://review.whamcloud.com/22309
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/layout.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 99d7c66..2052848 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -1181,6 +1181,23 @@ struct req_format RQF_FLD_QUERY =
 	DEFINE_REQ_FMT0("FLD_QUERY", fld_query_client, fld_query_server);
 EXPORT_SYMBOL(RQF_FLD_QUERY);
 
+/*
+ * The 'fld_read_server' uses 'RMF_GENERIC_DATA' to hold the 'FLD_QUERY'
+ * RPC reply that is composed of 'struct lu_seq_range_array'. But there
+ * is not registered swabber function for 'RMF_GENERIC_DATA'. So the RPC
+ * peers need to handle the RPC reply with fixed little-endian format.
+ *
+ * In theory, we can define new structure with some swabber registered to
+ * handle the 'FLD_QUERY' RPC reply result automatically. But from the
+ * implementation view, it is not easy to be done within current "struct
+ * req_msg_field" framework. Because the sequence range array in the RPC
+ * reply is not fixed length, instead, its length depends on 'lu_seq_range'
+ * count, that is unknown when prepare the RPC buffer. Generally, for such
+ * flexible length RPC usage, there will be a field in the RPC layout to
+ * indicate the data length. But for the 'FLD_READ' RPC, we have no way to
+ * do that unless we add new length filed that will broken the on-wire RPC
+ * protocol and cause interoperability trouble with old peer.
+ */
 struct req_format RQF_FLD_READ =
 	DEFINE_REQ_FMT0("FLD_READ", fld_read_client, fld_read_server);
 EXPORT_SYMBOL(RQF_FLD_READ);
-- 
1.8.3.1

* [PATCH 31/60] staging: lustre: clio: sync write should update mtime
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (29 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 30/60] staging: lustre: ptlrpc: comment for FLD_QUERY RPC reply swab James Simmons
@ 2017-01-29  0:04 ` James Simmons
  2017-01-29  0:05 ` [PATCH 32/60] staging: lustre: osc: limits the number of chunks in write RPC James Simmons
                   ` (29 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

A sync write should update m/ctime promptly; otherwise, a stale
m/ctime could be written to the OST object by the sync write RPC.
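
A simplified userspace sketch of the update, assuming a hypothetical
attribute struct and omitting the attribute locking that the real code
needs: the timestamps are refreshed together, before the write RPC
would pick them up.

```c
#include <time.h>

/* Hypothetical stand-in for the cached object attributes. */
struct sketch_attr {
	time_t mtime;
	time_t ctime;
};

/* Set both timestamps to the same "now" so the RPC carries fresh,
 * mutually consistent values. Locking is intentionally left out.
 */
static void sync_write_update_times(struct sketch_attr *attr, time_t now)
{
	attr->mtime = now;
	attr->ctime = attr->mtime;
}
```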

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7310
Reviewed-on: http://review.whamcloud.com/21063
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/osc/osc_io.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 7e5cd3a..3e61f5e 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -210,6 +210,18 @@ static int osc_io_submit(const struct lu_env *env,
 	if (queued > 0)
 		result = osc_queue_sync_pages(env, osc, &list, cmd, brw_flags);
 
+	/* Update c/mtime for sync write. LU-7310 */
+	if (qout->pl_nr > 0 && !result) {
+		struct cl_attr *attr = &osc_env_info(env)->oti_attr;
+		struct cl_object *obj = ios->cis_obj;
+
+		cl_object_attr_lock(obj);
+		attr->cat_mtime = LTIME_S(CURRENT_TIME);
+		attr->cat_ctime = attr->cat_mtime;
+		cl_object_attr_update(env, obj, attr, CAT_MTIME | CAT_CTIME);
+		cl_object_attr_unlock(obj);
+	}
+
 	CDEBUG(D_INFO, "%d/%d %d\n", qin->pl_nr, qout->pl_nr, result);
 	return qout->pl_nr > 0 ? 0 : result;
 }
-- 
1.8.3.1

* [PATCH 32/60] staging: lustre: osc: limits the number of chunks in write RPC
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (30 preceding siblings ...)
  2017-01-29  0:04 ` [PATCH 31/60] staging: lustre: clio: sync write should update mtime James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 33/60] staging: lustre: libcfs: avoid stomping on module param cpu_pattern James Simmons
                   ` (28 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jinshan Xiong, Dmitry Eremin, James Simmons

From: Jinshan Xiong <jinshan.xiong@intel.com>

OSC has to make sure that it won't issue write RPCs with too many
chunks; otherwise it will cause ZFS to create transactions much
bigger than DMU_MAX_ACCESS in size, which will end in write
failure.
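
The chunk arithmetic can be sketched in userspace as follows. This is a
simplified illustration modeled on the osc_extent_chunks() and
osc_max_write_chunks() helpers in the diff; the SKETCH_* constants and
function names are assumptions made for the example (4 KiB pages, a
16MB maximum bulk size).

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define SKETCH_MAX_BRW_SIZE	(16u * 1024 * 1024)	/* assumption: 16MB */

/* Number of chunks touched by the inclusive page-index range
 * [start, end], where a chunk is 2^chunkbits bytes.
 */
static unsigned int extent_chunks(uint64_t start, uint64_t end,
				  unsigned int chunkbits)
{
	unsigned int ppc_bits = chunkbits - SKETCH_PAGE_SHIFT;

	return (unsigned int)((end >> ppc_bits) - (start >> ppc_bits) + 1);
}

/* Upper bound on chunks per write RPC: limited by the bulk size and
 * capped at 256.
 */
static unsigned int max_write_chunks(unsigned int chunkbits)
{
	unsigned int by_size = SKETCH_MAX_BRW_SIZE >> chunkbits;

	return by_size < 256 ? by_size : 256;
}
```

With 1 MiB chunks (chunkbits = 20), pages 0..255 fall in one chunk,
while pages 255..256 straddle two, and at most 16 chunks fit in one
RPC under these assumptions.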

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8135
Reviewed-on: http://review.whamcloud.com/22369
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8632
Reviewed-on: http://review.whamcloud.com/22654
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/osc/osc_cache.c | 124 ++++++++++++++++++--------
 1 file changed, 87 insertions(+), 37 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 72dd554..0490478 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -1882,16 +1882,32 @@ static void osc_ap_completion(const struct lu_env *env, struct client_obd *cli,
 		       oap, osc, rc);
 }
 
+struct extent_rpc_data {
+	struct list_head       *erd_rpc_list;
+	unsigned int		erd_page_count;
+	unsigned int		erd_max_pages;
+	unsigned int		erd_max_chunks;
+};
+
+static inline unsigned osc_extent_chunks(const struct osc_extent *ext)
+{
+	struct client_obd *cli = osc_cli(ext->oe_obj);
+	unsigned ppc_bits = cli->cl_chunkbits - PAGE_SHIFT;
+
+	return (ext->oe_end >> ppc_bits) - (ext->oe_start >> ppc_bits) + 1;
+}
+
 /**
  * Try to add extent to one RPC. We need to think about the following things:
  * - # of pages must not be over max_pages_per_rpc
  * - extent must be compatible with previous ones
  */
 static int try_to_add_extent_for_io(struct client_obd *cli,
-				    struct osc_extent *ext, struct list_head *rpclist,
-				    unsigned int *pc, unsigned int *max_pages)
+				    struct osc_extent *ext,
+				    struct extent_rpc_data *data)
 {
 	struct osc_extent *tmp;
+	unsigned int chunk_count;
 	struct osc_async_page *oap = list_first_entry(&ext->oe_pages,
 						      struct osc_async_page,
 						      oap_pending_item);
@@ -1899,19 +1915,22 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
 	EASSERT((ext->oe_state == OES_CACHE || ext->oe_state == OES_LOCK_DONE),
 		ext);
 
-	*max_pages = max(ext->oe_mppr, *max_pages);
-	if (*pc + ext->oe_nr_pages > *max_pages)
+	chunk_count = osc_extent_chunks(ext);
+	if (chunk_count > data->erd_max_chunks)
+		return 0;
+
+	data->erd_max_pages = max(ext->oe_mppr, data->erd_max_pages);
+	if (data->erd_page_count + ext->oe_nr_pages > data->erd_max_pages)
 		return 0;
 
-	list_for_each_entry(tmp, rpclist, oe_link) {
+	list_for_each_entry(tmp, data->erd_rpc_list, oe_link) {
 		struct osc_async_page *oap2;
 
 		oap2 = list_first_entry(&tmp->oe_pages, struct osc_async_page,
 					oap_pending_item);
 		EASSERT(tmp->oe_owner == current, tmp);
 		if (oap2cl_page(oap)->cp_type != oap2cl_page(oap2)->cp_type) {
-			CDEBUG(D_CACHE, "Do not permit different type of IO"
-					" for a same RPC\n");
+			CDEBUG(D_CACHE, "Do not permit different type of IO in one RPC\n");
 			return 0;
 		}
 
@@ -1924,12 +1943,41 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
 		break;
 	}
 
-	*pc += ext->oe_nr_pages;
-	list_move_tail(&ext->oe_link, rpclist);
+	data->erd_max_chunks -= chunk_count;
+	data->erd_page_count += ext->oe_nr_pages;
+	list_move_tail(&ext->oe_link, data->erd_rpc_list);
 	ext->oe_owner = current;
 	return 1;
 }
 
+static inline unsigned osc_max_write_chunks(const struct client_obd *cli)
+{
+	/*
+	 * LU-8135:
+	 *
+	 * The maximum size of a single transaction is about 64MB in ZFS.
+	 * #define DMU_MAX_ACCESS (64 * 1024 * 1024)
+	 *
+	 * Since ZFS is a copy-on-write file system, a single dirty page in
+	 * a chunk will result in the rewrite of the whole chunk, therefore
+	 * an RPC shouldn't be allowed to contain too many chunks otherwise
+	 * it will make transaction size much bigger than 64MB, especially
+	 * with big block size for ZFS.
+	 *
+	 * This piece of code is to make sure that OSC won't send write RPCs
+	 * with too many chunks. The maximum chunk size that an RPC can cover
+	 * is set to PTLRPC_MAX_BRW_SIZE, which is defined to 16MB. Ideally
+	 * OST should tell the client what the biggest transaction size is,
+	 * but it's good enough for now.
+	 *
+	 * This limitation doesn't apply to ldiskfs, which allows as many
+	 * chunks in one RPC as we want. However, it won't have any benefits
+	 * to have too many discontiguous pages in one RPC. Therefore, it
+	 * can only have 256 chunks at most in one RPC.
+	 */
+	return min(PTLRPC_MAX_BRW_SIZE >> cli->cl_chunkbits, 256);
+}
+
 /**
  * In order to prevent multiple ptlrpcd from breaking contiguous extents,
  * get_write_extent() takes all appropriate extents in atomic.
@@ -1949,26 +1997,28 @@ static unsigned int get_write_extents(struct osc_object *obj,
 	struct client_obd *cli = osc_cli(obj);
 	struct osc_extent *ext;
 	struct osc_extent *temp;
-	unsigned int page_count = 0;
-	unsigned int max_pages = cli->cl_max_pages_per_rpc;
+	struct extent_rpc_data data = {
+		.erd_rpc_list = rpclist,
+		.erd_page_count = 0,
+		.erd_max_pages = cli->cl_max_pages_per_rpc,
+		.erd_max_chunks = osc_max_write_chunks(cli),
+	};
 
 	LASSERT(osc_object_is_locked(obj));
 	list_for_each_entry_safe(ext, temp, &obj->oo_hp_exts, oe_link) {
 		LASSERT(ext->oe_state == OES_CACHE);
-		if (!try_to_add_extent_for_io(cli, ext, rpclist, &page_count,
-					      &max_pages))
-			return page_count;
-		EASSERT(ext->oe_nr_pages <= max_pages, ext);
+		if (!try_to_add_extent_for_io(cli, ext, &data))
+			return data.erd_page_count;
+		EASSERT(ext->oe_nr_pages <= data.erd_max_pages, ext);
 	}
-	if (page_count == max_pages)
-		return page_count;
+	if (data.erd_page_count == data.erd_max_pages)
+		return data.erd_page_count;
 
 	while (!list_empty(&obj->oo_urgent_exts)) {
 		ext = list_entry(obj->oo_urgent_exts.next,
 				 struct osc_extent, oe_link);
-		if (!try_to_add_extent_for_io(cli, ext, rpclist, &page_count,
-					      &max_pages))
-			return page_count;
+		if (!try_to_add_extent_for_io(cli, ext, &data))
+			return data.erd_page_count;
 
 		if (!ext->oe_intree)
 			continue;
@@ -1979,13 +2029,12 @@ static unsigned int get_write_extents(struct osc_object *obj,
 			     ext->oe_owner))
 				continue;
 
-			if (!try_to_add_extent_for_io(cli, ext, rpclist,
-						      &page_count, &max_pages))
-				return page_count;
+			if (!try_to_add_extent_for_io(cli, ext, &data))
+				return data.erd_page_count;
 		}
 	}
-	if (page_count == max_pages)
-		return page_count;
+	if (data.erd_page_count == data.erd_max_pages)
+		return data.erd_page_count;
 
 	ext = first_extent(obj);
 	while (ext) {
@@ -1996,13 +2045,12 @@ static unsigned int get_write_extents(struct osc_object *obj,
 			continue;
 		}
 
-		if (!try_to_add_extent_for_io(cli, ext, rpclist, &page_count,
-					      &max_pages))
-			return page_count;
+		if (!try_to_add_extent_for_io(cli, ext, &data))
+			return data.erd_page_count;
 
 		ext = next_extent(ext);
 	}
-	return page_count;
+	return data.erd_page_count;
 }
 
 static int
@@ -2087,27 +2135,29 @@ static unsigned int get_write_extents(struct osc_object *obj,
 	struct osc_extent *ext;
 	struct osc_extent *next;
 	LIST_HEAD(rpclist);
-	unsigned int page_count = 0;
-	unsigned int max_pages = cli->cl_max_pages_per_rpc;
+	struct extent_rpc_data data = {
+		.erd_rpc_list = &rpclist,
+		.erd_page_count = 0,
+		.erd_max_pages = cli->cl_max_pages_per_rpc,
+		.erd_max_chunks = UINT_MAX,
+	};
 	int rc = 0;
 
 	LASSERT(osc_object_is_locked(osc));
 	list_for_each_entry_safe(ext, next, &osc->oo_reading_exts, oe_link) {
 		EASSERT(ext->oe_state == OES_LOCK_DONE, ext);
-		if (!try_to_add_extent_for_io(cli, ext, &rpclist, &page_count,
-					      &max_pages))
+		if (!try_to_add_extent_for_io(cli, ext, &data))
 			break;
 		osc_extent_state_set(ext, OES_RPC);
-		EASSERT(ext->oe_nr_pages <= max_pages, ext);
+		EASSERT(ext->oe_nr_pages <= data.erd_max_pages, ext);
 	}
-	LASSERT(page_count <= max_pages);
+	LASSERT(data.erd_page_count <= data.erd_max_pages);
 
-	osc_update_pending(osc, OBD_BRW_READ, -page_count);
+	osc_update_pending(osc, OBD_BRW_READ, -data.erd_page_count);
 
 	if (!list_empty(&rpclist)) {
 		osc_object_unlock(osc);
 
-		LASSERT(page_count > 0);
 		rc = osc_build_rpc(env, cli, &rpclist, OBD_BRW_READ);
 		LASSERT(list_empty(&rpclist));
 
-- 
1.8.3.1

* [PATCH 33/60] staging: lustre: libcfs: avoid stomping on module param cpu_pattern
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (31 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 32/60] staging: lustre: osc: limits the number of chunks in write RPC James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 34/60] staging: lustre: libcfs: default CPT matches NUMA topology James Simmons
                   ` (27 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Dmitry Eremin, Liang Zhen, James Simmons

From: Dmitry Eremin <dmitry.eremin@intel.com>

The function cfs_cpt_table_create_pattern() alters the string
passed to it. Currently we are passing in the module parameter
string cpu_pattern, which is incorrect. Instead, let's duplicate
the module parameter string and pass the copy to
cfs_cpt_table_create_pattern().
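
The same pattern in a userspace sketch: a parser that modifies its
input is given a private copy, so the caller's string stays intact.
Both function names are hypothetical, and strtok_r() merely stands in
for any parser that writes into its argument.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() and strtok_r() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical destructive parser: counts comma-separated tokens and,
 * like strtok_r(), overwrites the separators in @s with NUL bytes.
 */
static int count_tokens_destructive(char *s)
{
	char *save = NULL;
	char *tok;
	int n = 0;

	for (tok = strtok_r(s, ",", &save); tok;
	     tok = strtok_r(NULL, ",", &save))
		n++;

	return n;
}

/* Parse a duplicate and free it afterwards; the original is untouched.
 * Returns -1 if the copy cannot be allocated.
 */
static int count_tokens_safe(const char *pattern)
{
	char *dup = strdup(pattern);
	int n;

	if (!dup)
		return -1;

	n = count_tokens_destructive(dup);
	free(dup);
	return n;
}
```

The kernel change follows the same shape with kstrdup() and kfree(),
and keeps the original cpu_pattern available for the error message.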

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5050
Reviewed-on: http://review.whamcloud.com/22377
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
index 427e219..71a5b19 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
@@ -1050,7 +1050,15 @@ static int cfs_cpu_dead(unsigned int cpu)
 	ret = -EINVAL;
 
 	if (*cpu_pattern) {
-		cfs_cpt_table = cfs_cpt_table_create_pattern(cpu_pattern);
+		char *cpu_pattern_dup = kstrdup(cpu_pattern, GFP_KERNEL);
+
+		if (!cpu_pattern_dup) {
+			CERROR("Failed to duplicate cpu_pattern\n");
+			goto failed;
+		}
+
+		cfs_cpt_table = cfs_cpt_table_create_pattern(cpu_pattern_dup);
+		kfree(cpu_pattern_dup);
 		if (!cfs_cpt_table) {
 			CERROR("Failed to create cptab from pattern %s\n",
 			       cpu_pattern);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 34/60] staging: lustre: libcfs: default CPT matches NUMA topology
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (32 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 33/60] staging: lustre: libcfs: avoid stomping on module param cpu_pattern James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 35/60] staging: lustre: lov: ld_target could be NULL James Simmons
                   ` (26 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Dmitry Eremin, Liang Zhen, James Simmons

From: Dmitry Eremin <dmitry.eremin@intel.com>

Change the default value of the CPT pattern so that it matches the
NUMA topology.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5050
Reviewed-on: http://review.whamcloud.com/22377
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
index 71a5b19..62ab76e 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
@@ -59,7 +59,7 @@
  *
  * NB: If user specified cpu_pattern, cpu_npartitions will be ignored
  */
-static char	*cpu_pattern = "";
+static char	*cpu_pattern = "N";
 module_param(cpu_pattern, charp, 0444);
 MODULE_PARM_DESC(cpu_pattern, "CPU partitions pattern");
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 35/60] staging: lustre: lov: ld_target could be NULL
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (33 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 34/60] staging: lustre: libcfs: default CPT matches NUMA topology James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 36/60] staging: lustre: header: remove assert from interval_set() James Simmons
                   ` (25 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

lov_device::ld_target[ost_idx] could be NULL if the OST target is
not filled in lov_device::ld_lov::lov_tgt_desc[ost_idx] yet.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8018
Reviewed-on: http://review.whamcloud.com/21411
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_object.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 9c4b5ab..977579c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -266,6 +266,13 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 			if (result != 0)
 				goto out;
 
+			if (!dev->ld_target[ost_idx]) {
+				CERROR("%s: OST %04x is not initialized\n",
+				lov2obd(dev->ld_lov)->obd_name, ost_idx);
+				result = -EIO;
+				goto out;
+			}
+
 			subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
 			subconf->u.coc_oinfo = oinfo;
 			LASSERTF(subdev, "not init ost %d\n", ost_idx);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 36/60] staging: lustre: header: remove assert from interval_set()
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (34 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 35/60] staging: lustre: lov: ld_target could be NULL James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 37/60] staging: lustre: llite: specify READA debug mask for ras_update James Simmons
                   ` (24 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

In interval_tree.h only interval_set() uses LASSERT. This patch
removes that assertion, and interval_set() reports a real error
instead. The libcfs.h header is no longer needed by
interval_tree.h, so we can use the standard Linux kernel headers
instead.
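
The change from asserting to returning an error can be sketched in
userspace C as follows. The names extent_sketch, extent_set() and
range_init() are hypothetical and only mirror the shape of the change;
they are not the Lustre definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical reduced form of an interval node's extent. */
struct extent_sketch {
	uint64_t start;
	uint64_t end;
};

/* Reject an inverted range instead of asserting on it. */
static int extent_set(struct extent_sketch *ext, uint64_t start, uint64_t end)
{
	if (start > end)
		return -ERANGE;

	ext->start = start;
	ext->end = end;
	return 0;
}

/*
 * A caller that converts byte offsets to page indices and passes the
 * error on to its own caller, rather than ignoring it.
 */
static int range_init(struct extent_sketch *ext, uint64_t start,
		      uint64_t end, unsigned int page_shift)
{
	return extent_set(ext, start >> page_shift, end >> page_shift);
}
```

A caller that cannot recover can still assert on the return value, while
a caller such as range_init() can hand the error upward.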

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6401
Reviewed-on: https://review.whamcloud.com/22522
Reviewed-on: https://review.whamcloud.com/24323
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/interval_tree.h | 12 ++++++++----
 drivers/staging/lustre/lustre/ldlm/ldlm_extent.c      |  5 +++--
 drivers/staging/lustre/lustre/llite/range_lock.c      | 10 ++++++++--
 drivers/staging/lustre/lustre/llite/range_lock.h      |  2 +-
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/interval_tree.h b/drivers/staging/lustre/lustre/include/interval_tree.h
index 5d387d3..0d4f92e 100644
--- a/drivers/staging/lustre/lustre/include/interval_tree.h
+++ b/drivers/staging/lustre/lustre/include/interval_tree.h
@@ -36,7 +36,9 @@
 #ifndef _INTERVAL_H__
 #define _INTERVAL_H__
 
-#include "../../include/linux/libcfs/libcfs.h"	/* LASSERT. */
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/types.h>
 
 struct interval_node {
 	struct interval_node   *in_left;
@@ -73,13 +75,15 @@ static inline __u64 interval_high(struct interval_node *node)
 	return node->in_extent.end;
 }
 
-static inline void interval_set(struct interval_node *node,
-				__u64 start, __u64 end)
+static inline int interval_set(struct interval_node *node,
+			       __u64 start, __u64 end)
 {
-	LASSERT(start <= end);
+	if (start > end)
+		return -ERANGE;
 	node->in_extent.start = start;
 	node->in_extent.end = end;
 	node->in_max_high = end;
+	return 0;
 }
 
 /*
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
index 5616ea4..08f97e2 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
@@ -162,7 +162,7 @@ void ldlm_extent_add_lock(struct ldlm_resource *res,
 	struct interval_node *found, **root;
 	struct ldlm_interval *node;
 	struct ldlm_extent *extent;
-	int idx;
+	int idx, rc;
 
 	LASSERT(lock->l_granted_mode == lock->l_req_mode);
 
@@ -176,7 +176,8 @@ void ldlm_extent_add_lock(struct ldlm_resource *res,
 
 	/* node extent initialize */
 	extent = &lock->l_policy_data.l_extent;
-	interval_set(&node->li_node, extent->start, extent->end);
+	rc = interval_set(&node->li_node, extent->start, extent->end);
+	LASSERT(!rc);
 
 	root = &res->lr_itree[idx].lit_root;
 	found = interval_insert(&node->li_node, root);
diff --git a/drivers/staging/lustre/lustre/llite/range_lock.c b/drivers/staging/lustre/lustre/llite/range_lock.c
index 94c818f..14148a0 100644
--- a/drivers/staging/lustre/lustre/llite/range_lock.c
+++ b/drivers/staging/lustre/lustre/llite/range_lock.c
@@ -61,17 +61,23 @@ void range_lock_tree_init(struct range_lock_tree *tree)
  * Pre:  Caller should have allocated the range lock node.
  * Post: The range lock node is meant to cover [start, end] region
  */
-void range_lock_init(struct range_lock *lock, __u64 start, __u64 end)
+int range_lock_init(struct range_lock *lock, __u64 start, __u64 end)
 {
+	int rc;
+
 	memset(&lock->rl_node, 0, sizeof(lock->rl_node));
 	if (end != LUSTRE_EOF)
 		end >>= PAGE_SHIFT;
-	interval_set(&lock->rl_node, start >> PAGE_SHIFT, end);
+	rc = interval_set(&lock->rl_node, start >> PAGE_SHIFT, end);
+	if (rc)
+		return rc;
+
 	INIT_LIST_HEAD(&lock->rl_next_lock);
 	lock->rl_task = NULL;
 	lock->rl_lock_count = 0;
 	lock->rl_blocking_ranges = 0;
 	lock->rl_sequence = 0;
+	return rc;
 }
 
 static inline struct range_lock *next_lock(struct range_lock *lock)
diff --git a/drivers/staging/lustre/lustre/llite/range_lock.h b/drivers/staging/lustre/lustre/llite/range_lock.h
index c6d04a6..779091c 100644
--- a/drivers/staging/lustre/lustre/llite/range_lock.h
+++ b/drivers/staging/lustre/lustre/llite/range_lock.h
@@ -76,7 +76,7 @@ struct range_lock_tree {
 };
 
 void range_lock_tree_init(struct range_lock_tree *tree);
-void range_lock_init(struct range_lock *lock, __u64 start, __u64 end);
+int range_lock_init(struct range_lock *lock, __u64 start, __u64 end);
 int  range_lock(struct range_lock_tree *tree, struct range_lock *lock);
 void range_unlock(struct range_lock_tree *tree, struct range_lock *lock);
 #endif
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 37/60] staging: lustre: llite: specify READA debug mask for ras_update
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (35 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 36/60] staging: lustre: header: remove assert from interval_set() James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 38/60] staging: lustre: llite: Adding timed wait in ll_umount_begin James Simmons
                   ` (23 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

Specify the READA debug mask for ras_update() so that the debug log
only contains messages relevant for debugging.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8413
Reviewed-on: http://review.whamcloud.com/22753
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/rw.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/rw.c b/drivers/staging/lustre/lustre/llite/rw.c
index 18d3ccb..50d027e 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++ b/drivers/staging/lustre/lustre/llite/rw.c
@@ -729,6 +729,10 @@ static void ras_update(struct ll_sb_info *sbi, struct inode *inode,
 
 	spin_lock(&ras->ras_lock);
 
+	if (!hit)
+		CDEBUG(D_READA, DFID " pages at %lu miss.\n",
+		       PFID(ll_inode2fid(inode)), index);
+
 	ll_ra_stats_inc_sbi(sbi, hit ? RA_STAT_HIT : RA_STAT_MISS);
 
 	/* reset the read-ahead window in two cases.  First when the app seeks
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 38/60] staging: lustre: llite: Adding timed wait in ll_umount_begin
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (36 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 37/60] staging: lustre: llite: specify READA debug mask for ras_update James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 39/60] staging: libcfs: remove integer types abstraction from libcfs James Simmons
                   ` (22 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Rahul Deshmukh, Lokesh Nagappa Jaliminche, Jian Yu,
	James Simmons

From: Rahul Deshmukh <rahul.deshmukh@seagate.com>

There is a timing race between umount and other threads that
increment the reference count on mnt, e.g. getattr. If the
umount thread loses the race, umount fails with an EBUSY
error. To avoid this, a timed wait is added so that the umount
thread waits for the user to decrement the mnt reference
count.
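
The timed wait amounts to checking a condition at a fixed interval until
it holds or an overall timeout expires. A minimal userspace sketch,
assuming hypothetical names throughout (mount_sketch, mount_idle(),
simulated_sleep() and poll_with_timeout() are illustrations, not Lustre
or kernel APIs):

```c
#include <stdbool.h>

/* Hypothetical mount state: references still held by other users. */
struct mount_sketch {
	int extra_refs;
	unsigned int slept;	/* total seconds spent sleeping */
};

/* Condition being waited for: nobody else holds a reference. */
static bool mount_idle(void *arg)
{
	struct mount_sketch *m = arg;

	return m->extra_refs == 0;
}

/*
 * Simulated sleep: records the time and drops one reference, standing
 * in for other threads finishing their work while we wait.
 */
static void simulated_sleep(void *arg, unsigned int seconds)
{
	struct mount_sketch *m = arg;

	m->slept += seconds;
	if (m->extra_refs > 0)
		m->extra_refs--;
}

/*
 * Check cond() every 'interval' seconds for at most 'timeout' seconds.
 * Returns 0 once the condition holds, -1 on timeout. 'interval' must
 * be greater than zero.
 */
static int poll_with_timeout(bool (*cond)(void *),
			     void (*do_sleep)(void *, unsigned int),
			     void *arg, unsigned int timeout,
			     unsigned int interval)
{
	unsigned int waited = 0;

	for (;;) {
		if (cond(arg))
			return 0;
		if (waited >= timeout)
			return -1;
		do_sleep(arg, interval);
		waited += interval;
	}
}
```

With a 10 second timeout and a 1 second interval, a mount whose
references are dropped quickly is released early, while a busy mount
stops being waited on once the timeout is reached.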

Signed-off-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@seagate.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1882
Seagate-bug-id: MRP-1192
Reviewed-on: http://review.whamcloud.com/20061
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/llite_internal.h |  1 +
 drivers/staging/lustre/lustre/llite/llite_lib.c      | 12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 501957c..ecdfd0c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -506,6 +506,7 @@ struct ll_sb_info {
 						 */
 	/* root squash */
 	struct root_squash_info	  ll_squash;
+	struct path		 ll_mnt;
 
 	__kernel_fsid_t		  ll_fsid;
 	struct kobject		 ll_kobj; /* sysfs object */
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 0a87058..b229cbc 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -304,6 +304,7 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 	sb->s_magic = LL_SUPER_MAGIC;
 	sb->s_maxbytes = MAX_LFS_FILESIZE;
 	sbi->ll_namelen = osfs->os_namelen;
+	sbi->ll_mnt.mnt = current->fs->root.mnt;
 
 	if ((sbi->ll_flags & LL_SBI_USER_XATTR) &&
 	    !(data->ocd_connect_flags & OBD_CONNECT_XATTR)) {
@@ -1990,6 +1991,8 @@ void ll_umount_begin(struct super_block *sb)
 	struct ll_sb_info *sbi = ll_s2sbi(sb);
 	struct obd_device *obd;
 	struct obd_ioctl_data *ioc_data;
+	wait_queue_head_t waitq;
+	struct l_wait_info lwi;
 
 	CDEBUG(D_VFSTRACE, "VFS Op: superblock %p count %d active %d\n", sb,
 	       sb->s_count, atomic_read(&sb->s_active));
@@ -2022,9 +2025,14 @@ void ll_umount_begin(struct super_block *sb)
 	}
 
 	/* Really, we'd like to wait until there are no requests outstanding,
-	 * and then continue.  For now, we just invalidate the requests,
-	 * schedule() and sleep one second if needed, and hope.
+	 * and then continue. For now, we just periodically checking for vfs
+	 * to decrement mnt_cnt and hope to finish it within 10sec.
 	 */
+	init_waitqueue_head(&waitq);
+	lwi = LWI_TIMEOUT_INTERVAL(cfs_time_seconds(10),
+				   cfs_time_seconds(1), NULL, NULL);
+	l_wait_event(waitq, may_umount(sbi->ll_mnt.mnt), &lwi);
+
 	schedule();
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 39/60] staging: libcfs: remove integer types abstraction from libcfs
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (37 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 38/60] staging: lustre: llite: Adding timed wait in ll_umount_begin James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 40/60] staging: ptlrpc: leaked rs on difficult reply James Simmons
                   ` (21 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

Replace the ulong_ptr_t and long_ptr_t with standard
kernel types.
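
As a rough userspace illustration of the idiom involved, a small integer
can be carried through a void * thread argument by casting via an
integer type that is as wide as a pointer. The helper names idx_to_arg()
and arg_to_idx() are hypothetical, and the sketch assumes small
non-negative values.

```c
#include <stdint.h>

/* Pack a small non-negative integer into a thread argument. */
static void *idx_to_arg(int idx)
{
	return (void *)(uintptr_t)idx;
}

/* Recover the integer on the receiving side. */
static int arg_to_idx(void *arg)
{
	return (int)(uintptr_t)arg;
}
```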

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6245
Reviewed-on: http://review.whamcloud.com/20204
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h | 4 ----
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c        | 2 +-
 drivers/staging/lustre/lnet/libcfs/debug.c                 | 2 +-
 drivers/staging/lustre/lnet/lnet/acceptor.c                | 4 ++--
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
index e8695e4..fa0808d 100644
--- a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
@@ -125,10 +125,6 @@
 
 #include <linux/capability.h>
 
-/* long integer with size equal to pointer */
-typedef unsigned long ulong_ptr_t;
-typedef long long_ptr_t;
-
 #ifndef WITH_WATCHDOG
 #define WITH_WATCHDOG
 #endif
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index 2181c67..8aab001 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -2507,7 +2507,7 @@ static int ksocknal_push(lnet_ni_t *ni, lnet_process_id_t id)
 
 		snprintf(name, sizeof(name), "socknal_cd%02d", i);
 		rc = ksocknal_thread_start(ksocknal_connd,
-					   (void *)((ulong_ptr_t)i), name);
+					   (void *)((uintptr_t)i), name);
 		if (rc) {
 			spin_lock_bh(&ksocknal_data.ksnd_connd_lock);
 			ksocknal_data.ksnd_connd_starting--;
diff --git a/drivers/staging/lustre/lnet/libcfs/debug.c b/drivers/staging/lustre/lnet/libcfs/debug.c
index a38db23..3408041 100644
--- a/drivers/staging/lustre/lnet/libcfs/debug.c
+++ b/drivers/staging/lustre/lnet/libcfs/debug.c
@@ -343,7 +343,7 @@ void libcfs_debug_dumplog_internal(void *arg)
 		last_dump_time = current_time;
 		snprintf(debug_file_name, sizeof(debug_file_name) - 1,
 			 "%s.%lld.%ld", libcfs_debug_file_path_arr,
-			 (s64)current_time, (long_ptr_t)arg);
+			 (s64)current_time, (long)arg);
 		pr_alert("LustreError: dumping log to %s\n", debug_file_name);
 		cfs_tracefile_dump_all_pages(debug_file_name);
 		libcfs_run_debug_log_upcall(debug_file_name);
diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index a55c6cd..b43a994 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -330,7 +330,7 @@
 	__u32 magic;
 	__u32 peer_ip;
 	int peer_port;
-	int secure = (int)((long_ptr_t)arg);
+	int secure = (int)((long)arg);
 
 	LASSERT(!lnet_acceptor_state.pta_sock);
 
@@ -459,7 +459,7 @@
 	if (!lnet_count_acceptor_nis())  /* not required */
 		return 0;
 
-	task = kthread_run(lnet_acceptor, (void *)(ulong_ptr_t)secure,
+	task = kthread_run(lnet_acceptor, (void *)(uintptr_t)secure,
 			   "acceptor_%03ld", secure);
 	if (IS_ERR(task)) {
 		rc2 = PTR_ERR(task);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 40/60] staging: ptlrpc: leaked rs on difficult reply
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (38 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 39/60] staging: libcfs: remove integer types abstraction from libcfs James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 41/60] staging: lustre: osc: osc_match_base prototype differs from declaration James Simmons
                   ` (20 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

reply_out_callback() should call ptlrpc_schedule_difficult_reply()
to finalize the rs if it is no longer on the uncommitted list;
otherwise the rs and the export held by the rs could be leaked:

- target_send_reply() sends a difficult reply before the transaction
  is committed, and the reply is linked to scp_rep_active;

- the export gets disconnected by umount or for some other reason, and
  server_disconnect_export() is called to complete all outstanding
  replies. This calls into ptlrpc_handle_rs() to dispose of
  the rs, so the rs is removed from the uncommitted list and
  LNetMDUnlink() is called to unlink the reply buffer and generate
  an unlink event;

- reply_out_callback() is called to process the above unlink event, and
  ptlrpc_schedule_difficult_reply() is supposed to be called to
  finally dispose of the rs. However, it could be skipped because of
  the following flawed code snippet:

  if (!rs->rs_no_ack ||
      rs->rs_transno <= rs->rs_export->exp_obd->obd_last_committed)
        ptlrpc_schedule_difficult_reply(rs);

The intention of the above code is: if rs_no_ack is true (COS enabled)
and the transaction is not committed, we should rely on the commit
callback to release the rs. However, it overlooks the situation where
the rs has already been removed from the uncommitted list by a
disconnecting export.
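
The difference between the old and the new condition can be sketched as
two predicates in userspace C. The struct reply_sketch and both function
names are hypothetical reductions for illustration; the real fields live
in the ptlrpc reply state and the real check uses list_empty().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced reply state. */
struct reply_sketch {
	bool no_ack;			/* COS enabled: no ACK expected */
	uint64_t transno;		/* transaction number of the reply */
	bool on_uncommitted_list;	/* still tracked for commit callback */
};

/* Old condition: can skip a reply that nobody will ever finalize. */
static bool should_schedule_old(const struct reply_sketch *rs,
				uint64_t last_committed)
{
	return !rs->no_ack || rs->transno <= last_committed;
}

/*
 * New condition: also schedule when the reply is no longer on the
 * uncommitted list, since no commit callback will release it then.
 */
static bool should_schedule_fixed(const struct reply_sketch *rs,
				  uint64_t last_committed)
{
	return should_schedule_old(rs, last_committed) ||
	       !rs->on_uncommitted_list;
}
```

For an uncommitted reply with no_ack set that a disconnect has already
taken off the list, only the second predicate returns true.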

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7903
Reviewed-on: http://review.whamcloud.com/22696
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/events.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c b/drivers/staging/lustre/lustre/ptlrpc/events.c
index ae1650d..dc0fe9d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/events.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/events.c
@@ -420,7 +420,8 @@ void reply_out_callback(lnet_event_t *ev)
 		rs->rs_on_net = 0;
 		if (!rs->rs_no_ack ||
 		    rs->rs_transno <=
-		    rs->rs_export->exp_obd->obd_last_committed)
+		    rs->rs_export->exp_obd->obd_last_committed ||
+		    list_empty(&rs->rs_obd_list))
 			ptlrpc_schedule_difficult_reply(rs);
 
 		spin_unlock(&rs->rs_lock);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 41/60] staging: lustre: osc: osc_match_base prototype differs from declaration
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (39 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 40/60] staging: ptlrpc: leaked rs on difficult reply James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 42/60] staging: lustre: ptlrpc: allow blocking asts to be delayed James Simmons
                   ` (19 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Steve Guminski, James Simmons

From: Steve Guminski <stephenx.guminski@intel.com>

The patch updates the prototype in osc_internal.h to match the
enums used in the declaration.

The osc_match_base declaration in lustre/osc/osc_request.c uses
enums for stricter checking on the type and mode parameters:

int osc_match_base(struct obd_export *exp,
                   ...
-->                enum ldlm_type type,
                   union ldlm_policy_data *policy,
-->                enum ldlm_mode mode,
                   ...  int unref)

The prototype in lustre/osc/osc_internal.h instead used unsigned ints:

int osc_match_base(struct obd_export *exp,
                   ...
-->                __u32 type,
                   union ldlm_policy_data *policy,
-->                __u32 mode,
                   ...  int unref);

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8189
Reviewed-on: http://review.whamcloud.com/23167
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/osc/osc_internal.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 43a43e4..8abd83f 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -114,9 +114,9 @@ int osc_enqueue_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 		     struct ptlrpc_request_set *rqset, int async, int agl);
 
 int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
-		   __u32 type, union ldlm_policy_data *policy, __u32 mode,
-		   __u64 *flags, void *data, struct lustre_handle *lockh,
-		   int unref);
+		   enum ldlm_type type, union ldlm_policy_data *policy,
+		   enum ldlm_mode mode, __u64 *flags, void *data,
+		   struct lustre_handle *lockh, int unref);
 
 int osc_setattr_async(struct obd_export *exp, struct obdo *oa,
 		      obd_enqueue_update_f upcall, void *cookie,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 42/60] staging: lustre: ptlrpc: allow blocking asts to be delayed
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (40 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 41/60] staging: lustre: osc: osc_match_base prototype differs from declaration James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 43/60] staging: lustre: obd: remove OBD_NOTIFY_CREATE James Simmons
                   ` (18 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Vladimir Saveliev, James Simmons

From: Vladimir Saveliev <vladimir.saveliev@seagate.com>

ptlrpc_import_delay_req() refuses to delay blocking asts when the
import is not in LUSTRE_IMP_FULL yet. That leads to the client being
evicted on the assumption that it failed to respond.

Allow delays for blocking asts being resent.

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8351
Seagate-bug-id: MRP-3500
Reviewed-on: https://review.whamcloud.com/21065
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/client.c  | 2 +-
 drivers/staging/lustre/lustre/ptlrpc/recover.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 3c18ab6..332b360 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1160,7 +1160,7 @@ static int ptlrpc_import_delay_req(struct obd_import *imp,
 		if (atomic_read(&imp->imp_inval_count) != 0) {
 			DEBUG_REQ(D_ERROR, req, "invalidate in flight");
 			*status = -EIO;
-		} else if (imp->imp_dlm_fake || req->rq_no_delay) {
+		} else if (req->rq_no_delay) {
 			*status = -EWOULDBLOCK;
 		} else if (req->rq_allow_replay &&
 			  (imp->imp_state == LUSTRE_IMP_REPLAY ||
diff --git a/drivers/staging/lustre/lustre/ptlrpc/recover.c b/drivers/staging/lustre/lustre/ptlrpc/recover.c
index c004490..c03e113 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/recover.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/recover.c
@@ -221,6 +221,7 @@ int ptlrpc_resend(struct obd_import *imp)
 	}
 	spin_unlock(&imp->imp_lock);
 
+	OBD_FAIL_TIMEOUT(OBD_FAIL_LDLM_ENQUEUE_OLD_EXPORT, 2);
 	return 0;
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 43/60] staging: lustre: obd: remove OBD_NOTIFY_CREATE
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (41 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 42/60] staging: lustre: ptlrpc: allow blocking asts to be delayed James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 44/60] staging: lustre: libcfs: fix error messages James Simmons
                   ` (17 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: "John L. Hammond" <john.hammond@intel.com>

None of the obd_notify() handlers listen for the OBD_NOTIFY_CREATE
event, so remove it and its sole use in lov_add_target().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8403
Reviewed-on: https://review.whamcloud.com/21420
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h | 2 --
 drivers/staging/lustre/lustre/lov/lov_obd.c | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index ab47078..4ce8506 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -475,8 +475,6 @@ struct niobuf_local {
  * Events signalled through obd_notify() upcall-chain.
  */
 enum obd_notify_event {
-	/* target added */
-	OBD_NOTIFY_CREATE,
 	/* Device connect start */
 	OBD_NOTIFY_CONNECT,
 	/* Device activated */
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 63b0645..b3161fb 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -592,8 +592,6 @@ static int lov_add_target(struct obd_device *obd, struct obd_uuid *uuidp,
 	CDEBUG(D_CONFIG, "idx=%d ltd_gen=%d ld_tgt_count=%d\n",
 	       index, tgt->ltd_gen, lov->desc.ld_tgt_count);
 
-	rc = obd_notify(obd, tgt_obd, OBD_NOTIFY_CREATE, &index);
-
 	if (lov->lov_connects == 0) {
 		/* lov_connect hasn't been called yet. We'll do the
 		 * lov_connect_obd on this target when that fn first runs,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 44/60] staging: lustre: libcfs: fix error messages
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (42 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 43/60] staging: lustre: obd: remove OBD_NOTIFY_CREATE James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 45/60] staging: lustre: libcfs: Change positional struct initializers to C99 James Simmons
                   ` (16 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Dmitry Eremin, James Simmons

From: Dmitry Eremin <dmitry.eremin@intel.com>

Don't treat the inability to set CPU partition affinity as an error.
Report it as a warning instead, and improve the warning messages.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23307
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 4 ++--
 drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c   | 5 +++--
 drivers/staging/lustre/lnet/libcfs/workitem.c          | 2 +-
 drivers/staging/lustre/lnet/selftest/module.c          | 3 ++-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 14dbc53..e2f3f72 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3546,7 +3546,7 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 
 	rc = cfs_cpt_bind(lnet_cpt_table(), sched->ibs_cpt);
 	if (rc) {
-		CWARN("Failed to bind on CPT %d, please verify whether all CPUs are healthy and reload modules if necessary, otherwise your system might under risk of low performance\n",
+		CWARN("Unable to bind on CPU partition %d, please verify whether all CPUs are healthy and reload modules if necessary, otherwise your system might under risk of low performance\n",
 		      sched->ibs_cpt);
 	}
 
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index 3531e7d..df4f55e 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -1414,8 +1414,8 @@ int ksocknal_scheduler(void *arg)
 
 	rc = cfs_cpt_bind(lnet_cpt_table(), info->ksi_cpt);
 	if (rc) {
-		CERROR("Can't set CPT affinity to %d: %d\n",
-		       info->ksi_cpt, rc);
+		CWARN("Can't set CPU partition affinity to %d: %d\n",
+		      info->ksi_cpt, rc);
 	}
 
 	spin_lock_bh(&sched->kss_lock);
diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
index 62ab76e..4d35a37 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
@@ -1082,8 +1082,9 @@ static int cfs_cpu_dead(unsigned int cpu)
 	}
 	spin_unlock(&cpt_data.cpt_lock);
 
-	LCONSOLE(0, "HW CPU cores: %d, npartitions: %d\n",
-		 num_online_cpus(), cfs_cpt_number(cfs_cpt_table));
+	LCONSOLE(0, "HW nodes: %d, HW CPU cores: %d, npartitions: %d\n",
+		 num_online_nodes(), num_online_cpus(),
+		 cfs_cpt_number(cfs_cpt_table));
 	return 0;
 
  failed:
diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c
index d0512da..dbc2a9b 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -209,7 +209,7 @@ static int cfs_wi_scheduler(void *arg)
 	/* CPT affinity scheduler? */
 	if (sched->ws_cptab)
 		if (cfs_cpt_bind(sched->ws_cptab, sched->ws_cpt))
-			CWARN("Failed to bind %s on CPT %d\n",
+			CWARN("Unable to bind %s on CPU partition %d\n",
 			      sched->ws_name, sched->ws_cpt);
 
 	spin_lock(&cfs_wi_data.wi_glock);
diff --git a/drivers/staging/lustre/lnet/selftest/module.c b/drivers/staging/lustre/lnet/selftest/module.c
index 71485f9..b5d556f 100644
--- a/drivers/staging/lustre/lnet/selftest/module.c
+++ b/drivers/staging/lustre/lnet/selftest/module.c
@@ -112,7 +112,8 @@ enum {
 		rc = cfs_wi_sched_create("lst_t", lnet_cpt_table(), i,
 					 nthrs, &lst_sched_test[i]);
 		if (rc) {
-			CERROR("Failed to create CPT affinity WI scheduler %d for LST\n", i);
+			CWARN("Failed to create CPU partition affinity WI scheduler %d for LST\n",
+			      i);
 			goto error;
 		}
 	}
-- 
1.8.3.1


* [PATCH 45/60] staging: lustre: libcfs: Change positional struct initializers to C99
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (43 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 44/60] staging: lustre: libcfs: fix error messages James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 46/60] staging: lustre: mdc: Make IT_OPEN take lookup bits lock James Simmons
                   ` (15 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Steve Guminski, James Simmons

From: Steve Guminski <stephenx.guminski@intel.com>

This patch makes no functional changes. Struct initializers in the
libcfs directory that use C89 or GCC-only syntax are updated to C99
syntax.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, tolerates changes in the
struct definition, and clears any members that are not given an
explicit value.
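
As a minimal sketch of the properties listed above (not the libcfs
code itself), the following uses hypothetical demo_* names to show
that designated initializers may appear in any order and that omitted
members are zeroed:

```c
#include <assert.h>

/* Hypothetical stand-ins for the libcfs enum and struct. */
enum demo_hash_alg {
	DEMO_HASH_ALG_NULL,
	DEMO_HASH_ALG_ADLER32,
	DEMO_HASH_ALG_CRC32,
	DEMO_HASH_ALG_MAX
};

struct demo_hash_type {
	const char	*dht_name;	/* algorithm name */
	unsigned int	 dht_key;	/* initial seed */
	unsigned int	 dht_size;	/* digest size in bytes */
};

static const struct demo_hash_type demo_hash_types[] = {
	[DEMO_HASH_ALG_NULL] = {
		.dht_name	= "null",
		/* dht_key and dht_size are omitted, so they are zero */
	},
	[DEMO_HASH_ALG_ADLER32] = {
		/* member order differs from the declaration on purpose */
		.dht_size	= 4,
		.dht_name	= "adler32",
		.dht_key	= 1,
	},
	[DEMO_HASH_ALG_CRC32] = {
		.dht_name	= "crc32",
		.dht_key	= ~0U,
		.dht_size	= 4,
	},
};

/* Look up the digest size for one algorithm. */
static unsigned int demo_hash_size(enum demo_hash_alg alg)
{
	assert(alg < DEMO_HASH_ALG_MAX);
	return demo_hash_types[alg].dht_size;
}
```

With positional initializers, swapping the second and third values
would silently exchange the seed and the digest size. With designated
initializers the member names make that mistake visible.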

The following struct initializers have been updated:

libcfs/include/libcfs/libcfs_crypto.h:
        static struct cfs_crypto_hash_type hash_types[]

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6210
Reviewed-on: https://review.whamcloud.com/23332
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/linux/libcfs/libcfs_crypto.h    | 60 ++++++++++++++++++----
 1 file changed, 50 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
index 8f34c5d..3f773a4 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
@@ -53,16 +53,56 @@ enum cfs_crypto_hash_alg {
 };
 
 static struct cfs_crypto_hash_type hash_types[] = {
-	[CFS_HASH_ALG_NULL]    = { "null",     0,      0 },
-	[CFS_HASH_ALG_ADLER32] = { "adler32",  1,      4 },
-	[CFS_HASH_ALG_CRC32]   = { "crc32",   ~0,      4 },
-	[CFS_HASH_ALG_CRC32C]  = { "crc32c",  ~0,      4 },
-	[CFS_HASH_ALG_MD5]     = { "md5",      0,     16 },
-	[CFS_HASH_ALG_SHA1]    = { "sha1",     0,     20 },
-	[CFS_HASH_ALG_SHA256]  = { "sha256",   0,     32 },
-	[CFS_HASH_ALG_SHA384]  = { "sha384",   0,     48 },
-	[CFS_HASH_ALG_SHA512]  = { "sha512",   0,     64 },
-	[CFS_HASH_ALG_MAX]	= { NULL,	0,	64 },
+	[CFS_HASH_ALG_NULL] = {
+		.cht_name	= "null",
+		.cht_key	= 0,
+		.cht_size	= 0
+	},
+	[CFS_HASH_ALG_ADLER32] = {
+		.cht_name	= "adler32",
+		.cht_key	= 1,
+		.cht_size	= 4
+	},
+	[CFS_HASH_ALG_CRC32] = {
+		.cht_name	= "crc32",
+		.cht_key	= ~0,
+		.cht_size	= 4
+	},
+	[CFS_HASH_ALG_CRC32C] = {
+		.cht_name	= "crc32c",
+		.cht_key	= ~0,
+		.cht_size	= 4
+	},
+	[CFS_HASH_ALG_MD5] = {
+		.cht_name	= "md5",
+		.cht_key	= 0,
+		.cht_size	= 16
+	},
+	[CFS_HASH_ALG_SHA1] = {
+		.cht_name	= "sha1",
+		.cht_key	= 0,
+		.cht_size	= 20
+	},
+	[CFS_HASH_ALG_SHA256] = {
+		.cht_name	= "sha256",
+		.cht_key	= 0,
+		.cht_size	= 32
+	},
+	[CFS_HASH_ALG_SHA384] = {
+		.cht_name	= "sha384",
+		.cht_key	= 0,
+		.cht_size	= 48
+	},
+	[CFS_HASH_ALG_SHA512] = {
+		.cht_name	= "sha512",
+		.cht_key	= 0,
+		.cht_size	= 64
+	},
+	[CFS_HASH_ALG_MAX] = {
+		.cht_name	= NULL,
+		.cht_key	= 0,
+		.cht_size	= 64
+	},
 };
 
 /* Maximum size of hash_types[].cht_size */
-- 
1.8.3.1


* [PATCH 46/60] staging: lustre: mdc: Make IT_OPEN take lookup bits lock
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (44 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 45/60] staging: lustre: libcfs: Change positional struct initializers to C99 James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 47/60] staging: lustre: mdc: avoid returning freed request James Simmons
                   ` (14 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Patrick Farrell, James Simmons

From: Patrick Farrell <paf@cray.com>

An earlier commit accidentally changed handling of IT_OPEN,
making it take the MDS_INODELOCK_UPDATE bits lock instead of
MDS_INODELOCK_LOOKUP. This does not cause any known bugs.
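
The selection described above can be sketched roughly as follows. All
demo_* names are hypothetical, and the fall-through to the lookup
policy is an assumption drawn from this description rather than from
the full mdc_enqueue() source:

```c
#include <assert.h>

/* Hypothetical intent operation bits, one bit per operation. */
enum {
	DEMO_IT_OPEN	= 1 << 0,
	DEMO_IT_UNLINK	= 1 << 1,
	DEMO_IT_GETATTR	= 1 << 2,
	DEMO_IT_READDIR	= 1 << 3,
	DEMO_IT_LAYOUT	= 1 << 4,
};

enum demo_policy {
	DEMO_POLICY_LOOKUP,
	DEMO_POLICY_UPDATE,
	DEMO_POLICY_LAYOUT,
};

/* Pick the inode-bits lock policy for an intent operation. */
static enum demo_policy demo_pick_policy(unsigned int it_op)
{
	assert(it_op != 0);

	/* DEMO_IT_OPEN is deliberately absent from this mask */
	if (it_op & (DEMO_IT_UNLINK | DEMO_IT_GETATTR | DEMO_IT_READDIR))
		return DEMO_POLICY_UPDATE;
	else if (it_op & DEMO_IT_LAYOUT)
		return DEMO_POLICY_LAYOUT;

	/* assumed default: everything else, including open */
	return DEMO_POLICY_LOOKUP;
}
```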

Signed-off-by: Patrick Farrell <paf@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8842
Reviewed-on: https://review.whamcloud.com/23797
Fixes: 70a251f68dea ("staging: lustre: obd: decruft md_enqueue() and md_intent_lock()")
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/mdc/mdc_locks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 156add7..91a7243 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -721,7 +721,7 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 		LASSERT(!policy);
 
 		saved_flags |= LDLM_FL_HAS_INTENT;
-		if (it->it_op & (IT_OPEN | IT_UNLINK | IT_GETATTR | IT_READDIR))
+		if (it->it_op & (IT_UNLINK | IT_GETATTR | IT_READDIR))
 			policy = &update_policy;
 		else if (it->it_op & IT_LAYOUT)
 			policy = &layout_policy;
-- 
1.8.3.1


* [PATCH 47/60] staging: lustre: mdc: avoid returning freed request
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (45 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 46/60] staging: lustre: mdc: Make IT_OPEN take lookup bits lock James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 48/60] staging: lustre: ksocklnd: ignore timedout TX on closing connection James Simmons
                   ` (13 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: "John L. Hammond" <john.hammond@intel.com>

In mdc_close() if ptlrpc_request_pack() fails then set req to NULL so
that an already freed request is not returned in *request.
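
The pattern can be illustrated with a small userspace sketch. The
demo_* names are hypothetical, and the sketch assumes that the common
exit path stores req in *request, as the description above implies:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_request {
	int dr_opcode;
};

/* Hypothetical packing step that fails on demand. */
static int demo_request_pack(struct demo_request *req, int fail)
{
	assert(req);
	if (fail)
		return -ENOMEM;
	req->dr_opcode = 1;
	return 0;
}

/* Allocate and pack a request; the result is returned in *request. */
static int demo_close(int fail, struct demo_request **request)
{
	struct demo_request *req;
	int rc;

	req = calloc(1, sizeof(*req));
	if (!req) {
		rc = -ENOMEM;
		goto out;
	}

	rc = demo_request_pack(req, fail);
	if (rc) {
		free(req);
		req = NULL;	/* never hand a freed pointer to the caller */
		goto out;
	}

out:
	*request = req;
	return rc;
}
```

Without the assignment to NULL, the caller would receive a dangling
pointer together with the error code and might release it a second
time.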

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8811
Reviewed-on: https://review.whamcloud.com/23843
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 02f57d8..a12035d 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -762,6 +762,7 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 	rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_CLOSE);
 	if (rc) {
 		ptlrpc_request_free(req);
+		req = NULL;
 		goto out;
 	}
 
-- 
1.8.3.1


* [PATCH 48/60] staging: lustre: ksocklnd: ignore timedout TX on closing connection
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (46 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 47/60] staging: lustre: mdc: avoid returning freed request James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 49/60] staging: lustre: socklnd: remove socklnd_init_msg James Simmons
                   ` (12 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Liang Zhen,
	James Simmons

From: Liang Zhen <liang.zhen@intel.com>

The ksocklnd reaper thread always tries to close the connection for
the first timed-out zero-copy TX. This is wrong if that connection is
already being closed, because the reaper will see the same TX again
and again and will never find the other timed-out zero-copy TXs and
close their connections.
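
A simplified sketch of the corrected scan follows. It uses an array
in place of the kernel list and hypothetical demo_* names, and it
assumes the entries are ordered by deadline so the scan can stop at
the first TX that has not yet timed out:

```c
#include <assert.h>
#include <stddef.h>

struct demo_conn {
	int dc_closing;		/* connection is being torn down */
};

struct demo_tx {
	long		  dt_deadline;
	struct demo_conn *dt_conn;
};

/*
 * Return the oldest timed-out TX whose connection is not closing,
 * or NULL if there is none. *nstale counts all such TXs.
 */
static const struct demo_tx *
demo_find_stale(const struct demo_tx *txs, size_t ntx, long now, int *nstale)
{
	const struct demo_tx *stale = NULL;
	size_t i;

	assert(nstale);
	*nstale = 0;

	for (i = 0; i < ntx; i++) {
		if (now < txs[i].dt_deadline)
			break;		/* later entries are not due yet */

		/* ignore the TX if its connection is being closed */
		if (txs[i].dt_conn->dc_closing)
			continue;

		if (!stale)
			stale = &txs[i];
		(*nstale)++;
	}

	return stale;
}
```

The point of remembering the first eligible TX during the scan is
that the head of the list may belong to a closing connection, in
which case taking the head unconditionally would pick the wrong TX.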

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8867
Reviewed-on: https://review.whamcloud.com/23973
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index df4f55e..b7043e2 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -2456,6 +2456,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 
 	list_for_each_entry(peer, peers, ksnp_list) {
 		unsigned long deadline = 0;
+		struct ksock_tx *tx_stale;
 		int resid = 0;
 		int n = 0;
 
@@ -2503,6 +2504,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		if (list_empty(&peer->ksnp_zc_req_list))
 			continue;
 
+		tx_stale = NULL;
 		spin_lock(&peer->ksnp_lock);
 		list_for_each_entry(tx, &peer->ksnp_zc_req_list, tx_zc_list) {
 			if (!cfs_time_aftereq(cfs_time_current(),
@@ -2511,26 +2513,26 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 			/* ignore the TX if connection is being closed */
 			if (tx->tx_conn->ksnc_closing)
 				continue;
+			if (!tx_stale)
+				tx_stale = tx;
 			n++;
 		}
 
-		if (!n) {
+		if (!tx_stale) {
 			spin_unlock(&peer->ksnp_lock);
 			continue;
 		}
 
-		tx = list_entry(peer->ksnp_zc_req_list.next,
-				struct ksock_tx, tx_zc_list);
-		deadline = tx->tx_deadline;
-		resid = tx->tx_resid;
-		conn = tx->tx_conn;
+		deadline = tx_stale->tx_deadline;
+		resid = tx_stale->tx_resid;
+		conn = tx_stale->tx_conn;
 		ksocknal_conn_addref(conn);
 
 		spin_unlock(&peer->ksnp_lock);
 		read_unlock(&ksocknal_data.ksnd_global_lock);
 
 		CERROR("Total %d stale ZC_REQs for peer %s detected; the oldest(%p) timed out %ld secs ago, resid: %d, wmem: %d\n",
-		       n, libcfs_nid2str(peer->ksnp_id.nid), tx,
+		       n, libcfs_nid2str(peer->ksnp_id.nid), tx_stale,
 		       cfs_duration_sec(cfs_time_current() - deadline),
 		       resid, conn->ksnc_sock->sk->sk_wmem_queued);
 
-- 
1.8.3.1


* [PATCH 49/60] staging: lustre: socklnd: remove socklnd_init_msg
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (47 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 48/60] staging: lustre: ksocklnd: ignore timedout TX on closing connection James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 50/60] staging: lustre: ptlrpc: remove unused pc->pc_env James Simmons
                   ` (11 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

Remove the inline function socklnd_init_msg.
It's only used by the kernel code, so there is no point
keeping it in a UAPI header.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6142
Reviewed-on: https://review.whamcloud.com/18506
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/include/linux/lnet/socklnd.h    | 9 ---------
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 9 +++++++--
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/socklnd.h b/drivers/staging/lustre/include/linux/lnet/socklnd.h
index 7d24a91..acf20ce 100644
--- a/drivers/staging/lustre/include/linux/lnet/socklnd.h
+++ b/drivers/staging/lustre/include/linux/lnet/socklnd.h
@@ -80,15 +80,6 @@
 	} WIRE_ATTR ksm_u;
 } WIRE_ATTR ksock_msg_t;
 
-static inline void
-socklnd_init_msg(ksock_msg_t *msg, int type)
-{
-	msg->ksm_csum = 0;
-	msg->ksm_type = type;
-	msg->ksm_zc_cookies[0] = 0;
-	msg->ksm_zc_cookies[1] = 0;
-}
-
 #define KSOCK_MSG_NOOP	0xC0	/* ksm_u empty */
 #define KSOCK_MSG_LNET	0xC1	/* lnet msg */
 
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index b7043e2..b161c2b 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -80,7 +80,9 @@ struct ksock_tx *
 	tx->tx_niov    = 1;
 	tx->tx_nonblk  = nonblk;
 
-	socklnd_init_msg(&tx->tx_msg, KSOCK_MSG_NOOP);
+	tx->tx_msg.ksm_csum = 0;
+	tx->tx_msg.ksm_type = KSOCK_MSG_NOOP;
+	tx->tx_msg.ksm_zc_cookies[0] = 0;
 	tx->tx_msg.ksm_zc_cookies[1] = cookie;
 
 	return tx;
@@ -1004,7 +1006,10 @@ struct ksock_route *
 			tx->tx_zc_capable = 1;
 	}
 
-	socklnd_init_msg(&tx->tx_msg, KSOCK_MSG_LNET);
+	tx->tx_msg.ksm_csum = 0;
+	tx->tx_msg.ksm_type = KSOCK_MSG_LNET;
+	tx->tx_msg.ksm_zc_cookies[0] = 0;
+	tx->tx_msg.ksm_zc_cookies[1] = 0;
 
 	/* The first fragment will be set later in pro_pack */
 	rc = ksocknal_launch_packet(ni, tx, target);
-- 
1.8.3.1


* [PATCH 50/60] staging: lustre: ptlrpc: remove unused pc->pc_env
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (48 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 49/60] staging: lustre: socklnd: remove socklnd_init_msg James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 51/60] staging: lustre: ptlrpc: update MODULE_PARAM_DESC in ptlrpcd.c James Simmons
                   ` (10 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Dmitry Eremin, James Simmons

From: Dmitry Eremin <dmitry.eremin@intel.com>

Environment for request interpreters is not used any more.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8887
Reviewed-on: https://review.whamcloud.com/24061
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_net.h |  4 ----
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c     | 13 -------------
 2 files changed, 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 411eb0d..a73f168 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -1661,10 +1661,6 @@ struct ptlrpcd_ctl {
 	 */
 	char			pc_name[16];
 	/**
-	 * Environment for request interpreters to run in.
-	 */
-	struct lu_env	       pc_env;
-	/**
 	 * CPT the thread is bound on.
 	 */
 	int				pc_cpt;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index 1f55d64..84c5551 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -562,15 +562,6 @@ int ptlrpcd_start(struct ptlrpcd_ctl *pc)
 		return 0;
 	}
 
-	/*
-	 * So far only "client" ptlrpcd uses an environment. In the future,
-	 * ptlrpcd thread (or a thread-set) has to be given an argument,
-	 * describing its "scope".
-	 */
-	rc = lu_context_init(&pc->pc_env.le_ctx, LCT_CL_THREAD | LCT_REMEMBER);
-	if (rc != 0)
-		goto out;
-
 	task = kthread_run(ptlrpcd, pc, "%s", pc->pc_name);
 	if (IS_ERR(task)) {
 		rc = PTR_ERR(task);
@@ -593,9 +584,6 @@ int ptlrpcd_start(struct ptlrpcd_ctl *pc)
 		spin_unlock(&pc->pc_lock);
 		ptlrpc_set_destroy(set);
 	}
-	lu_context_fini(&pc->pc_env.le_ctx);
-
-out:
 	clear_bit(LIOD_START, &pc->pc_flags);
 	return rc;
 }
@@ -623,7 +611,6 @@ void ptlrpcd_free(struct ptlrpcd_ctl *pc)
 	}
 
 	wait_for_completion(&pc->pc_finishing);
-	lu_context_fini(&pc->pc_env.le_ctx);
 
 	spin_lock(&pc->pc_lock);
 	pc->pc_set = NULL;
-- 
1.8.3.1


* [PATCH 51/60] staging: lustre: ptlrpc: update MODULE_PARAM_DESC in ptlrpcd.c
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (49 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 50/60] staging: lustre: ptlrpc: remove unused pc->pc_env James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 52/60] staging: lustre: linkea: linkEA size limitation James Simmons
                   ` (9 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Dmitry Eremin, James Simmons

From: Dmitry Eremin <dmitry.eremin@intel.com>

Update the max_ptlrpcds module parameter description to let
users know it's obsolete. Change cpt to CPT in the description
of the ptlrpcd_per_cpt_max module parameter so it matches the
documentation.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8890
Reviewed-on: https://review.whamcloud.com/24065
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index 84c5551..59b5813 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -82,7 +82,8 @@ struct ptlrpcd {
  */
 static int max_ptlrpcds;
 module_param(max_ptlrpcds, int, 0644);
-MODULE_PARM_DESC(max_ptlrpcds, "Max ptlrpcd thread count to be started.");
+MODULE_PARM_DESC(max_ptlrpcds,
+		 "Max ptlrpcd thread count to be started (obsolete).");
 
 /*
  * ptlrpcd_bind_policy is obsolete, but retained to ensure that
@@ -102,7 +103,7 @@ struct ptlrpcd {
 static int ptlrpcd_per_cpt_max;
 module_param(ptlrpcd_per_cpt_max, int, 0644);
 MODULE_PARM_DESC(ptlrpcd_per_cpt_max,
-		 "Max ptlrpcd thread count to be started per cpt.");
+		 "Max ptlrpcd thread count to be started per CPT.");
 
 /*
  * ptlrpcd_partner_group_size: The desired number of threads in each
-- 
1.8.3.1


* [PATCH 52/60] staging: lustre: linkea: linkEA size limitation
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (50 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 51/60] staging: lustre: ptlrpc: update MODULE_PARAM_DESC in ptlrpcd.c James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 53/60] staging: lustre: ptlrpc: update replay cursor when close during replay James Simmons
                   ` (8 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Fan Yong,
	James Simmons

From: Fan Yong <fan.yong@intel.com>

Under DNE mode, if we do not restrict the linkEA size and there are
too many cross-MDT hard links to the same object, the llog will
overflow. In addition, too many entries in the linkEA will seriously
degrade linkEA performance, because entries can only be located by
scanning them sequentially.

So we need to restrict the linkEA size. Currently the limit is 4096
bytes, independent of the backend. If too many hard links cause the
linkEA to overflow, we record an overflow timestamp in the linkEA
header. This overflow timestamp serves two purposes:

1. It prevents the object from being migrated to another MDT, because
   some name entries may be missing from the linkEA, so we cannot
   update those name entries for the migration.

2. It tells the namespace LFSCK that the 'nlink' attribute may be
   more trustworthy than the linkEA, so that the namespace LFSCK is
   not misled into repairing the 'nlink' attribute based on the
   linkEA.

There will be subsequent patch(es) for namespace LFSCK to handle the
linkEA size limitation and overflow cases.
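
The size check and overflow marking can be sketched as below. This is
an illustration with hypothetical demo_* names rather than the
linkea.c code. The current time is passed in as a parameter to keep
the sketch deterministic, and the migration check is an assumption
based on point 1 above:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_LINKEA_SIZE	4096

/* Hypothetical, simplified linkEA header. */
struct demo_linkea_header {
	uint32_t dlh_reccount;		/* number of entries stored */
	uint64_t dlh_len;		/* total size in bytes */
	uint32_t dlh_overflow_time;	/* 0 means no overflow recorded */
};

/*
 * Account for one entry of reclen bytes. If the entry would exceed
 * the limit it is dropped and the overflow time is recorded instead.
 */
static int demo_linkea_add(struct demo_linkea_header *leh, uint32_t reclen,
			   uint32_t now)
{
	assert(leh);

	if (leh->dlh_len + reclen > DEMO_MAX_LINKEA_SIZE) {
		/* keep the stamp non-zero so it still marks an overflow */
		leh->dlh_overflow_time = now ? now : 1;
		return 0;
	}

	leh->dlh_len += reclen;
	leh->dlh_reccount++;
	return 0;
}

/* An overflowed linkEA is incomplete, so migration must be refused. */
static int demo_linkea_may_migrate(const struct demo_linkea_header *leh)
{
	return leh->dlh_overflow_time == 0;
}
```

Returning 0 on overflow mirrors the idea that the hard link itself
still succeeds and only the linkEA bookkeeping is truncated.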

Signed-off-by: Fan Yong <fan.yong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8569
Reviewed-on: https://review.whamcloud.com/23500
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |  5 +-
 .../staging/lustre/lustre/include/lustre_linkea.h  | 15 ++++-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  2 +-
 drivers/staging/lustre/lustre/obdclass/linkea.c    | 70 +++++++++++++++++-----
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 16 ++---
 5 files changed, 81 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index b0eb80d..fc960da 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -3217,9 +3217,8 @@ struct link_ea_header {
 	__u32 leh_magic;
 	__u32 leh_reccount;
 	__u64 leh_len;      /* total size */
-	/* future use */
-	__u32 padding1;
-	__u32 padding2;
+	__u32 leh_overflow_time;
+	__u32 leh_padding;
 };
 
 /** Hardlink data is name and parent fid.
diff --git a/drivers/staging/lustre/lustre/include/lustre_linkea.h b/drivers/staging/lustre/lustre/include/lustre_linkea.h
index 249e8bf..3ff008f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_linkea.h
+++ b/drivers/staging/lustre/lustre/include/lustre_linkea.h
@@ -26,7 +26,19 @@
  * Author: di wang <di.wang@intel.com>
  */
 
-#define DEFAULT_LINKEA_SIZE	4096
+/* There are several reasons to restrict the linkEA size:
+ *
+ * 1. Under DNE mode, if we do not restrict the linkEA size, and if there
+ *    are too many cross-MDTs hard links to the same object, then it will
+ *    casue the llog overflow.
+ *
+ * 2. Some backend has limited size for EA. For example, if without large
+ *    EA enabled, the ldiskfs will make all EAs to share one (4K) EA block.
+ *
+ * 3. Too many entries in linkEA will seriously affect linkEA performance
+ *    because we only support to locate linkEA entry consecutively.
+ */
+#define MAX_LINKEA_SIZE		4096
 
 struct linkea_data {
 	/**
@@ -43,6 +55,7 @@ struct linkea_data {
 
 int linkea_data_new(struct linkea_data *ldata, struct lu_buf *buf);
 int linkea_init(struct linkea_data *ldata);
+int linkea_init_with_rec(struct linkea_data *ldata);
 void linkea_entry_unpack(const struct link_ea_entry *lee, int *reclen,
 			 struct lu_name *lname, struct lu_fid *pfid);
 int linkea_entry_pack(struct link_ea_entry *lee, const struct lu_name *lname,
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index b229cbc..9a9cdb0 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -2553,7 +2553,7 @@ static int ll_linkea_decode(struct linkea_data *ldata, unsigned int linkno,
 	unsigned int idx;
 	int rc;
 
-	rc = linkea_init(ldata);
+	rc = linkea_init_with_rec(ldata);
 	if (rc < 0)
 		return rc;
 
diff --git a/drivers/staging/lustre/lustre/obdclass/linkea.c b/drivers/staging/lustre/lustre/obdclass/linkea.c
index 0b1d2f0..dddd0c4 100644
--- a/drivers/staging/lustre/lustre/obdclass/linkea.c
+++ b/drivers/staging/lustre/lustre/obdclass/linkea.c
@@ -39,6 +39,8 @@ int linkea_data_new(struct linkea_data *ldata, struct lu_buf *buf)
 	ldata->ld_leh->leh_magic = LINK_EA_MAGIC;
 	ldata->ld_leh->leh_len = sizeof(struct link_ea_header);
 	ldata->ld_leh->leh_reccount = 0;
+	ldata->ld_leh->leh_overflow_time = 0;
+	ldata->ld_leh->leh_padding = 0;
 	return 0;
 }
 EXPORT_SYMBOL(linkea_data_new);
@@ -53,11 +55,15 @@ int linkea_init(struct linkea_data *ldata)
 		leh->leh_magic = LINK_EA_MAGIC;
 		leh->leh_reccount = __swab32(leh->leh_reccount);
 		leh->leh_len = __swab64(leh->leh_len);
-		/* entries are swabbed by linkea_entry_unpack */
+		leh->leh_overflow_time = __swab32(leh->leh_overflow_time);
+		leh->leh_padding = __swab32(leh->leh_padding);
+		/* individual entries are swabbed by linkea_entry_unpack() */
 	}
+
 	if (leh->leh_magic != LINK_EA_MAGIC)
 		return -EINVAL;
-	if (leh->leh_reccount == 0)
+
+	if (leh->leh_reccount == 0 && leh->leh_overflow_time == 0)
 		return -ENODATA;
 
 	ldata->ld_leh = leh;
@@ -65,6 +71,18 @@ int linkea_init(struct linkea_data *ldata)
 }
 EXPORT_SYMBOL(linkea_init);
 
+int linkea_init_with_rec(struct linkea_data *ldata)
+{
+	int rc;
+
+	rc = linkea_init(ldata);
+	if (!rc && ldata->ld_leh->leh_reccount == 0)
+		rc = -ENODATA;
+
+	return rc;
+}
+EXPORT_SYMBOL(linkea_init_with_rec);
+
 /**
  * Pack a link_ea_entry.
  * All elements are stored as chars to avoid alignment issues.
@@ -94,6 +112,8 @@ int linkea_entry_pack(struct link_ea_entry *lee, const struct lu_name *lname,
 void linkea_entry_unpack(const struct link_ea_entry *lee, int *reclen,
 			 struct lu_name *lname, struct lu_fid *pfid)
 {
+	LASSERT(lee);
+
 	*reclen = (lee->lee_reclen[0] << 8) | lee->lee_reclen[1];
 	memcpy(pfid, &lee->lee_parent_fid, sizeof(*pfid));
 	fid_be_to_cpu(pfid, pfid);
@@ -110,25 +130,45 @@ void linkea_entry_unpack(const struct link_ea_entry *lee, int *reclen,
 int linkea_add_buf(struct linkea_data *ldata, const struct lu_name *lname,
 		   const struct lu_fid *pfid)
 {
-	LASSERT(ldata->ld_leh);
+	struct link_ea_header *leh = ldata->ld_leh;
+	int reclen;
+
+	LASSERT(leh);
 
 	if (!lname || !pfid)
 		return -EINVAL;
 
-	ldata->ld_reclen = lname->ln_namelen + sizeof(struct link_ea_entry);
-	if (ldata->ld_leh->leh_len + ldata->ld_reclen >
-	    ldata->ld_buf->lb_len) {
+	reclen = lname->ln_namelen + sizeof(struct link_ea_entry);
+	if (unlikely(leh->leh_len + reclen > MAX_LINKEA_SIZE)) {
+		/*
+		 * The overflow time is stored in 32 bits. This truncates
+		 * the 64-bit value returned by ktime_get_real_seconds(),
+		 * but an unsigned 32-bit seconds counter still covers
+		 * roughly 136 years, which is enough.
+		 */
+		leh->leh_overflow_time = ktime_get_real_seconds();
+		if (unlikely(leh->leh_overflow_time == 0))
+			leh->leh_overflow_time++;
+
+		CDEBUG(D_INODE, "Not enough space to hold linkea entry '" DFID ": %.*s' at %u\n",
+		       PFID(pfid), lname->ln_namelen,
+		       lname->ln_name, leh->leh_overflow_time);
+		return 0;
+	}
+
+	if (leh->leh_len + reclen > ldata->ld_buf->lb_len) {
 		if (lu_buf_check_and_grow(ldata->ld_buf,
-					  ldata->ld_leh->leh_len +
-					  ldata->ld_reclen) < 0)
+					  leh->leh_len + reclen) < 0)
 			return -ENOMEM;
+
+		ldata->ld_leh = ldata->ld_buf->lb_buf;
+		leh = ldata->ld_leh;
 	}
 
-	ldata->ld_leh = ldata->ld_buf->lb_buf;
-	ldata->ld_lee = ldata->ld_buf->lb_buf + ldata->ld_leh->leh_len;
+	ldata->ld_lee = ldata->ld_buf->lb_buf + leh->leh_len;
 	ldata->ld_reclen = linkea_entry_pack(ldata->ld_lee, lname, pfid);
-	ldata->ld_leh->leh_len += ldata->ld_reclen;
-	ldata->ld_leh->leh_reccount++;
+	leh->leh_len += ldata->ld_reclen;
+	leh->leh_reccount++;
 	CDEBUG(D_INODE, "New link_ea name '" DFID ":%.*s' is added\n",
 	       PFID(pfid), lname->ln_namelen, lname->ln_name);
 	return 0;
@@ -139,6 +179,7 @@ int linkea_add_buf(struct linkea_data *ldata, const struct lu_name *lname,
 void linkea_del_buf(struct linkea_data *ldata, const struct lu_name *lname)
 {
 	LASSERT(ldata->ld_leh && ldata->ld_lee);
+	LASSERT(ldata->ld_leh->leh_reccount > 0);
 
 	ldata->ld_leh->leh_reccount--;
 	ldata->ld_leh->leh_len -= ldata->ld_reclen;
@@ -174,8 +215,9 @@ int linkea_links_find(struct linkea_data *ldata, const struct lu_name *lname,
 
 	LASSERT(ldata->ld_leh);
 
-	/* link #0 */
-	ldata->ld_lee = (struct link_ea_entry *)(ldata->ld_leh + 1);
+	/* link #0, if leh_reccount == 0 we skip the loop and return -ENOENT */
+	if (likely(ldata->ld_leh->leh_reccount > 0))
+		ldata->ld_lee = (struct link_ea_entry *)(ldata->ld_leh + 1);
 
 	for (count = 0; count < ldata->ld_leh->leh_reccount; count++) {
 		linkea_entry_unpack(ldata->ld_lee, &ldata->ld_reclen,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index a04e36c..f166518 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -3820,14 +3820,14 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct link_ea_header, leh_len));
 	LASSERTF((int)sizeof(((struct link_ea_header *)0)->leh_len) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct link_ea_header *)0)->leh_len));
-	LASSERTF((int)offsetof(struct link_ea_header, padding1) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct link_ea_header, padding1));
-	LASSERTF((int)sizeof(((struct link_ea_header *)0)->padding1) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct link_ea_header *)0)->padding1));
-	LASSERTF((int)offsetof(struct link_ea_header, padding2) == 20, "found %lld\n",
-		 (long long)(int)offsetof(struct link_ea_header, padding2));
-	LASSERTF((int)sizeof(((struct link_ea_header *)0)->padding2) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct link_ea_header *)0)->padding2));
+	LASSERTF((int)offsetof(struct link_ea_header, leh_overflow_time) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct link_ea_header, leh_overflow_time));
+	LASSERTF((int)sizeof(((struct link_ea_header *)0)->leh_overflow_time) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct link_ea_header *)0)->leh_overflow_time));
+	LASSERTF((int)offsetof(struct link_ea_header, leh_padding) == 20, "found %lld\n",
+		 (long long)(int)offsetof(struct link_ea_header, leh_padding));
+	LASSERTF((int)sizeof(((struct link_ea_header *)0)->leh_padding) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct link_ea_header *)0)->leh_padding));
 	CLASSERT(LINK_EA_MAGIC == 0x11EAF1DFUL);
 
 	/* Checks for struct link_ea_entry */
-- 
1.8.3.1


* [PATCH 53/60] staging: lustre: ptlrpc: update replay cursor when close during replay
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (51 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 52/60] staging: lustre: linkea: linkEA size limitation James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 54/60] staging: lustre: fid: Change positional struct initializers to C99 James Simmons
                   ` (7 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

The replay cursor must be updated when a close happens during
replay; otherwise ptlrpc_replay_next() can spin in an endless loop
on an invalid replay cursor:

- the replay cursor is moved to an open request during replay;
- the application closes that open file, so the rq_replay flag of
  the open request is cleared;
- ptlrpc_replay_next() calls ptlrpc_free_committed() to free
  committed/closed requests. The open request is removed from
  the committed list, so the replay cursor now points at an
  empty list_head. The open request itself is not freed yet
  because the pending close request still holds it;
- ptlrpc_replay_next() keeps advancing the replay cursor to the
  next entry and never terminates.

This patch also removes the out-of-date comments in
ptlrpc_replay_next() and performs the whole search for the next
replay request under imp_lock, because:

1. With two separate replay lists and the replay cursor in place,
   finding the replay request no longer takes long, so the
   "lock -> unlock -> lock -> unlock" trick is unnecessary;

2. Various kinds of non-replay requests are now allowed during
   recovery, so ptlrpc_free_committed() may run in parallel and
   remove an open request while ptlrpc_replay_next() is iterating
   over the open request list.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8765
Reviewed-on: https://review.whamcloud.com/23418
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
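Note (not part of the commit message): the cursor problem described
above can be reproduced with a small userspace sketch. Everything in
it is hypothetical: struct node stands in for the kernel's struct
list_head, list_unlink() is assumed to behave like list_del_init()
(the removed node is left pointing at itself), and
remove_and_fix_cursor() only mirrors the idea of the change to
ptlrpc_free_committed(), not its actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

static void list_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and leave it self-linked, as list_del_init() is assumed to. */
static void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/*
 * Remove n from its list. If the cursor currently points at n, advance
 * the cursor first so it never ends up on a self-linked (empty) node,
 * where following ->next would loop forever.
 */
static void remove_and_fix_cursor(struct node **cursor, struct node *n)
{
	assert(cursor && n);

	if (*cursor == n)
		*cursor = n->next;
	list_unlink(n);
}
```

Without the cursor adjustment, a cursor left on the removed node would
see node->next == node after the unlink, which is the endless loop the
commit message describes.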
 drivers/staging/lustre/lustre/ptlrpc/client.c  | 15 ++++++++++-----
 drivers/staging/lustre/lustre/ptlrpc/recover.c | 23 +----------------------
 2 files changed, 11 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 332b360..8dfb40f 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -2662,11 +2662,16 @@ void ptlrpc_free_committed(struct obd_import *imp)
 	list_for_each_entry_safe(req, saved, &imp->imp_committed_list,
 				 rq_replay_list) {
 		LASSERT(req->rq_transno != 0);
-		if (req->rq_import_generation < imp->imp_generation) {
-			DEBUG_REQ(D_RPCTRACE, req, "free stale open request");
-			ptlrpc_free_request(req);
-		} else if (!req->rq_replay) {
-			DEBUG_REQ(D_RPCTRACE, req, "free closed open request");
+		if (req->rq_import_generation < imp->imp_generation ||
+		    !req->rq_replay) {
+			DEBUG_REQ(D_RPCTRACE, req, "free %s open request",
+				  req->rq_import_generation <
+				  imp->imp_generation ? "stale" : "closed");
+
+			if (imp->imp_replay_cursor == &req->rq_replay_list)
+				imp->imp_replay_cursor =
+					req->rq_replay_list.next;
+
 			ptlrpc_free_request(req);
 		}
 	}
diff --git a/drivers/staging/lustre/lustre/ptlrpc/recover.c b/drivers/staging/lustre/lustre/ptlrpc/recover.c
index c03e113..7b58545 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/recover.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/recover.c
@@ -78,28 +78,11 @@ int ptlrpc_replay_next(struct obd_import *imp, int *inflight)
 	imp->imp_last_transno_checked = 0;
 	ptlrpc_free_committed(imp);
 	last_transno = imp->imp_last_replay_transno;
-	spin_unlock(&imp->imp_lock);
 
 	CDEBUG(D_HA, "import %p from %s committed %llu last %llu\n",
 	       imp, obd2cli_tgt(imp->imp_obd),
 	       imp->imp_peer_committed_transno, last_transno);
 
-	/* Do I need to hold a lock across this iteration?  We shouldn't be
-	 * racing with any additions to the list, because we're in recovery
-	 * and are therefore not processing additional requests to add.  Calls
-	 * to ptlrpc_free_committed might commit requests, but nothing "newer"
-	 * than the one we're replaying (it can't be committed until it's
-	 * replayed, and we're doing that here).  l_f_e_safe protects against
-	 * problems with the current request being committed, in the unlikely
-	 * event of that race.  So, in conclusion, I think that it's safe to
-	 * perform this list-walk without the imp_lock held.
-	 *
-	 * But, the {mdc,osc}_replay_open callbacks both iterate
-	 * request lists, and have comments saying they assume the
-	 * imp_lock is being held by ptlrpc_replay, but it's not. it's
-	 * just a little race...
-	 */
-
 	/* Replay all the committed open requests on committed_list first */
 	if (!list_empty(&imp->imp_committed_list)) {
 		tmp = imp->imp_committed_list.prev;
@@ -107,10 +90,6 @@ int ptlrpc_replay_next(struct obd_import *imp, int *inflight)
 
 		/* The last request on committed_list hasn't been replayed */
 		if (req->rq_transno > last_transno) {
-			/* Since the imp_committed_list is immutable before
-			 * all of it's requests being replayed, it's safe to
-			 * use a cursor to accelerate the search
-			 */
 			if (!imp->imp_resend_replay ||
 			    imp->imp_replay_cursor == &imp->imp_committed_list)
 				imp->imp_replay_cursor = imp->imp_replay_cursor->next;
@@ -124,6 +103,7 @@ int ptlrpc_replay_next(struct obd_import *imp, int *inflight)
 					break;
 
 				req = NULL;
+				LASSERT(!list_empty(imp->imp_replay_cursor));
 				imp->imp_replay_cursor =
 					imp->imp_replay_cursor->next;
 			}
@@ -156,7 +136,6 @@ int ptlrpc_replay_next(struct obd_import *imp, int *inflight)
 	if (req && imp->imp_resend_replay)
 		lustre_msg_add_flags(req->rq_reqmsg, MSG_RESENT);
 
-	spin_lock(&imp->imp_lock);
 	/* The resend replay request may have been removed from the
 	 * unreplied list.
 	 */
-- 
1.8.3.1


* [PATCH 54/60] staging: lustre: fid: Change positional struct initializers to C99
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (52 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 53/60] staging: lustre: ptlrpc: update replay cursor when close during replay James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 55/60] staging: lustre: obd: move s3 in lmd_parse to inner loop James Simmons
                   ` (6 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Steve Guminski, James Simmons

From: Steve Guminski <stephenx.guminski@intel.com>

This patch makes no functional changes.  Struct initializers in the
fid directory that use C89 or GCC-only syntax are updated to C99
syntax.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, tolerates changes to the
struct definition, and clears any members that are not given an
explicit value.

The following struct initializers have been updated:

lustre/fid/fid_lib.c:
        const struct lu_seq_range LUSTRE_SEQ_SPACE_RANGE
        const struct lu_seq_range LUSTRE_SEQ_ZERO_RANGE
lustre/fid/lproc_fid.c:
        struct lprocfs_vars seq_client_debugfs_list

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6210
Reviewed-on: https://review.whamcloud.com/23789
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
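Note (not part of the commit message): a minimal sketch of the
difference between positional and designated initializers. The
struct seq_range below is hypothetical and only loosely modelled on
lu_seq_range; the member names are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical three-member range, loosely modelled on lu_seq_range. */
struct seq_range {
	unsigned long long start;
	unsigned long long end;
	unsigned int index;
};

/* Positional: values bind to members purely by declaration order. */
static const struct seq_range positional = { 1, 100 };

/* Designated: each value names its member, so order no longer matters. */
static const struct seq_range designated = {
	.end	= 100,
	.start	= 1,
};

/* Only .start is named; .end and .index are implicitly zeroed. */
static const struct seq_range zero_range = {
	.start	= 0,
};

static int ranges_equal(const struct seq_range *a, const struct seq_range *b)
{
	assert(a && b);
	return a->start == b->start && a->end == b->end &&
	       a->index == b->index;
}
```

If a member were later inserted before end in the struct definition,
the positional form would silently assign 100 to the wrong member,
while the designated form would keep its meaning.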
 drivers/staging/lustre/lustre/fid/fid_lib.c   |  7 +++----
 drivers/staging/lustre/lustre/fid/lproc_fid.c | 12 ++++++++----
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/fid/fid_lib.c b/drivers/staging/lustre/lustre/fid/fid_lib.c
index 4e49cb3..9eb4059 100644
--- a/drivers/staging/lustre/lustre/fid/fid_lib.c
+++ b/drivers/staging/lustre/lustre/fid/fid_lib.c
@@ -60,14 +60,13 @@
  * FID_SEQ_START + 2 is for .lustre directory and its objects
  */
 const struct lu_seq_range LUSTRE_SEQ_SPACE_RANGE = {
-	FID_SEQ_NORMAL,
-	(__u64)~0ULL
+	.lsr_start	= FID_SEQ_NORMAL,
+	.lsr_end	= (__u64)~0ULL,
 };
 
 /* Zero range, used for init and other purposes. */
 const struct lu_seq_range LUSTRE_SEQ_ZERO_RANGE = {
-	0,
-	0
+	.lsr_start	= 0,
 };
 
 /* Lustre Big Fs Lock fid. */
diff --git a/drivers/staging/lustre/lustre/fid/lproc_fid.c b/drivers/staging/lustre/lustre/fid/lproc_fid.c
index 97d4849..3eed838 100644
--- a/drivers/staging/lustre/lustre/fid/lproc_fid.c
+++ b/drivers/staging/lustre/lustre/fid/lproc_fid.c
@@ -203,9 +203,13 @@
 LPROC_SEQ_FOPS_RO(ldebugfs_fid_fid);
 
 struct lprocfs_vars seq_client_debugfs_list[] = {
-	{ "space", &ldebugfs_fid_space_fops },
-	{ "width", &ldebugfs_fid_width_fops },
-	{ "server", &ldebugfs_fid_server_fops },
-	{ "fid", &ldebugfs_fid_fid_fops },
+	{ .name =	"space",
+	  .fops =	&ldebugfs_fid_space_fops },
+	{ .name	=	"width",
+	  .fops =	&ldebugfs_fid_width_fops },
+	{ .name =	"server",
+	  .fops =	&ldebugfs_fid_server_fops },
+	{ .name	=	"fid",
+	  .fops =	&ldebugfs_fid_fid_fops },
 	{ NULL }
 };
-- 
1.8.3.1


* [PATCH 55/60] staging: lustre: obd: move s3 in lmd_parse to inner loop
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (53 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 54/60] staging: lustre: fid: Change positional struct initializers to C99 James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 56/60] staging: lustre: llite: don't invoke direct_IO for the EOF case James Simmons
                   ` (5 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

Building the lustre client with W=1 reports the following
warning:

obdclass/obd_mount.c: In function lmd_parse:
obdclass/obd_mount.c:880: warning: variable set but not used

The solution is to move s3 into the inner loop, the only
place where it is used.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8378
Reviewed-on: https://review.whamcloud.com/23820
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/obdclass/obd_mount.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index 2283e92..8e0d4b1 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -877,7 +877,7 @@ static int lmd_parse_mgs(struct lustre_mount_data *lmd, char **ptr)
  */
 static int lmd_parse(char *options, struct lustre_mount_data *lmd)
 {
-	char *s1, *s2, *s3, *devname = NULL;
+	char *s1, *s2, *devname = NULL;
 	struct lustre_mount_data *raw = (struct lustre_mount_data *)options;
 	int rc = 0;
 
@@ -906,6 +906,7 @@ static int lmd_parse(char *options, struct lustre_mount_data *lmd)
 	while (*s1) {
 		int clear = 0;
 		int time_min = OBD_RECOVERY_TIME_MIN;
+		char *s3;
 
 		/* Skip whitespace and extra commas */
 		while (*s1 == ' ' || *s1 == ',')
-- 
1.8.3.1


* [PATCH 56/60] staging: lustre: llite: don't invoke direct_IO for the EOF case
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (54 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 55/60] staging: lustre: obd: move s3 in lmd_parse to inner loop James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 57/60] staging: lustre: lmv: remove nlink check in lmv_revalidate_slaves James Simmons
                   ` (4 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Yang Sheng,
	James Simmons

From: Yang Sheng <yang.sheng@intel.com>

The function generic_file_read_iter() does not check for EOF
before invoking the direct_IO callback, so we have to check it
ourselves.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8969
Reviewed-on: https://review.whamcloud.com/24552
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
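Note (not part of the commit message): the check added here boils
down to a simple predicate. The helper below is a hypothetical
userspace sketch; its name and parameters are assumptions and it
does not use the kernel's iov_iter or inode APIs.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when a direct read starting at
 * offset cannot return any data because it begins at or beyond EOF.
 * Writes are never short-circuited, since they may extend the file.
 */
static bool dio_read_at_or_past_eof(bool is_read, long long offset,
				    long long file_size)
{
	assert(offset >= 0 && file_size >= 0);
	return is_read && offset >= file_size;
}
```

A caller in the position of the direct_IO callback would return 0
bytes when the predicate is true and only then proceed to the
alignment checks and the actual I/O.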
 drivers/staging/lustre/lustre/llite/rw26.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/rw26.c b/drivers/staging/lustre/lustre/llite/rw26.c
index 21e06e5..d89e795 100644
--- a/drivers/staging/lustre/lustre/llite/rw26.c
+++ b/drivers/staging/lustre/lustre/llite/rw26.c
@@ -345,6 +345,10 @@ static ssize_t ll_direct_IO_26(struct kiocb *iocb, struct iov_iter *iter)
 	ssize_t tot_bytes = 0, result = 0;
 	long size = MAX_DIO_SIZE;
 
+	/* Check EOF by ourselves */
+	if (iov_iter_rw(iter) == READ && file_offset >= i_size_read(inode))
+		return 0;
+
 	/* FIXME: io smaller than PAGE_SIZE is broken on ia64 ??? */
 	if ((file_offset & ~PAGE_MASK) || (count & ~PAGE_MASK))
 		return -EINVAL;
-- 
1.8.3.1


* [PATCH 57/60] staging: lustre: lmv: remove nlink check in lmv_revalidate_slaves
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (55 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 56/60] staging: lustre: llite: don't invoke direct_IO for the EOF case James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 58/60] staging: lustre: osc: avoid 64 divide in osc_cache_too_much James Simmons
                   ` (3 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

If an application attempts to remove millions of files in a
single directory it will fail. This failure was tracked down to
the nlink < 2 check in lmv_revalidate_slaves(): once nlink
reaches the maximum value of LDISKFS_LINK_MAX (65000), the
nlink reported back by the server is one. An nlink value of 1
is therefore valid, so let's remove the check.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6984
Reviewed-on: http://review.whamcloud.com/16490
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lmv/lmv_intent.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lmv/lmv_intent.c b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
index b1071cf..aa42066 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_intent.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
@@ -220,21 +220,7 @@ int lmv_revalidate_slaves(struct obd_export *exp,
 			/* refresh slave from server */
 			body = req_capsule_server_get(&req->rq_pill,
 						      &RMF_MDT_BODY);
-			LASSERT(body);
-
-			if (unlikely(body->mbo_nlink < 2)) {
-				/*
-				 * If this is bad stripe, most likely due
-				 * to the race between close(unlink) and
-				 * getattr, let's return -EONENT, so llite
-				 * will revalidate the dentry see
-				 * ll_inode_revalidate_fini()
-				 */
-				CDEBUG(D_INODE, "%s: nlink %d < 2 corrupt stripe %d "DFID":" DFID"\n",
-				       obd->obd_name, body->mbo_nlink, i,
-				       PFID(&lsm->lsm_md_oinfo[i].lmo_fid),
-				       PFID(&lsm->lsm_md_oinfo[0].lmo_fid));
-
+			if (!body) {
 				if (it.it_lock_mode && lockh) {
 					ldlm_lock_decref(lockh, it.it_lock_mode);
 					it.it_lock_mode = 0;
-- 
1.8.3.1


* [PATCH 58/60] staging: lustre: osc: avoid 64 divide in osc_cache_too_much
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (56 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 57/60] staging: lustre: lmv: remove nlink check in lmv_revalidate_slaves James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 59/60] staging: lustre: ptlrpc : remove userland usage from ptlrpc James Simmons
                   ` (2 subsequent siblings)
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

The use of 64 bit time introduces an expensive 64 bit
division operation. Since the time lapse being calculated
in osc_cache_too_much will never be more than seventy years
we can cast the time lapse to a long and perform a normal
32 bit division operation instead.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8835
Reviewed-on: https://review.whamcloud.com/23814
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
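Note (not part of the commit message): a userspace sketch of the
arithmetic, assuming pages and budget are plain longs. The function
name and signature are hypothetical; only the shift, the cast and the
comparison follow the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace version of the check: report "too much" when
 * the cached page count reaches budget divided by the idle time,
 * where the idle time is measured in units of 64 seconds.
 */
static int cache_too_much(long pages, long budget, int64_t now,
			  int64_t last_used)
{
	int64_t duration = now - last_used;
	long timediff;

	/* >> 6 divides by 64, i.e. roughly converts seconds to minutes. */
	timediff = (long)(duration >> 6);

	/* Both operands are longs, so this is a native-width division. */
	return timediff > 0 && pages >= budget / timediff;
}
```

With 640 idle seconds the divisor is 10, so a budget of 1000 pages
trips the check at 100 cached pages. Fewer than 64 idle seconds gives
a divisor of 0 and the division is skipped entirely.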
 drivers/staging/lustre/lustre/osc/osc_page.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c b/drivers/staging/lustre/lustre/osc/osc_page.c
index 0461408..ab9d0d7 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -370,12 +370,17 @@ static int osc_cache_too_much(struct client_obd *cli)
 			return lru_shrink_min(cli);
 	} else {
 		time64_t duration = ktime_get_real_seconds();
+		long timediff;
 
 		/* knock out pages by duration of no IO activity */
 		duration -= cli->cl_lru_last_used;
-		duration >>= 6; /* approximately 1 minute */
-		if (duration > 0 &&
-		    pages >= div64_s64((s64)budget, duration))
+		/*
+		 * The difference shouldn't be more than 70 years
+		 * so we can safely cast to a long. Round to
+		 * approximately 1 minute.
+		 */
+		timediff = (long)(duration >> 6);
+		if (timediff > 0 && pages >= budget / timediff)
 			return lru_shrink_min(cli);
 	}
 	return 0;
-- 
1.8.3.1


* [PATCH 59/60] staging: lustre: ptlrpc : remove userland usage from ptlrpc
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (57 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 58/60] staging: lustre: osc: avoid 64 divide in osc_cache_too_much James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-29  0:05 ` [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl James Simmons
  2017-02-03 10:46 ` [PATCH 00/60] staging: lustre: batches of fixes for lustre client Greg Kroah-Hartman
  60 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

The reason for __REQ_LAYOUT_USER__ was to expose a
section of code in layout.c to userland for a utility
similar to Wireshark. This was done before Wireshark
existed, but now that it does we no longer need this
type of hack. This also reduces lustre_acl.h to
strictly a kernel header.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8945
Reviewed-on: https://review.whamcloud.com/24396
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_req_layout.h | 10 ++--------
 drivers/staging/lustre/lustre/ptlrpc/layout.c             |  9 ---------
 2 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index fbcd395..cd62ccd 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -39,6 +39,8 @@
 #ifndef _LUSTRE_REQ_LAYOUT_H__
 #define _LUSTRE_REQ_LAYOUT_H__
 
+#include <linux/types.h>
+
 /** \defgroup req_layout req_layout
  *
  * @{
@@ -66,11 +68,6 @@ struct req_capsule {
 	__u32		    rc_area[RCL_NR][REQ_MAX_FIELD_NR];
 };
 
-#if !defined(__REQ_LAYOUT_USER__)
-
-/* struct ptlrpc_request, lustre_msg* */
-#include "lustre_net.h"
-
 void req_capsule_init(struct req_capsule *pill, struct ptlrpc_request *req,
 		      enum req_location location);
 void req_capsule_fini(struct req_capsule *pill);
@@ -120,9 +117,6 @@ void req_capsule_shrink(struct req_capsule *pill,
 int  req_layout_init(void);
 void req_layout_fini(void);
 
-/* __REQ_LAYOUT_USER__ */
-#endif
-
 extern struct req_format RQF_OBD_PING;
 extern struct req_format RQF_OBD_SET_INFO;
 extern struct req_format RQF_SEC_CTX;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 2052848..356d735 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -42,8 +42,6 @@
  * of the format that the request conforms to.
  */
 
-#if !defined(__REQ_LAYOUT_USER__)
-
 #define DEBUG_SUBSYSTEM S_RPC
 
 #include <linux/module.h>
@@ -57,8 +55,6 @@
 #include "../include/obd.h"
 #include "../include/obd_support.h"
 
-/* __REQ_LAYOUT_USER__ */
-#endif
 /* struct ptlrpc_request, lustre_msg* */
 #include "../include/lustre_req_layout.h"
 #include "../include/lustre_acl.h"
@@ -1558,8 +1554,6 @@ struct req_format RQF_OST_GET_INFO_FIEMAP =
 			ost_get_fiemap_server);
 EXPORT_SYMBOL(RQF_OST_GET_INFO_FIEMAP);
 
-#if !defined(__REQ_LAYOUT_USER__)
-
 /* Convenience macro */
 #define FMT_FIELD(fmt, i, j) (fmt)->rf_fields[(i)].d[(j)]
 
@@ -2238,6 +2232,3 @@ void req_capsule_shrink(struct req_capsule *pill,
 							    1);
 }
 EXPORT_SYMBOL(req_capsule_shrink);
-
-/* __REQ_LAYOUT_USER__ */
-#endif
-- 
1.8.3.1


* [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (58 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 59/60] staging: lustre: ptlrpc : remove userland usage from ptlrpc James Simmons
@ 2017-01-29  0:05 ` James Simmons
  2017-01-30 10:51   ` Dan Carpenter
  2017-02-03 10:46 ` [PATCH 00/60] staging: lustre: batches of fixes for lustre client Greg Kroah-Hartman
  60 siblings, 1 reply; 79+ messages in thread
From: James Simmons @ 2017-01-29  0:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

The check for the smallest ioctl data in libcfs_ioctl_getdata()
is incorrect. Instead of checking against struct libcfs_ioctl_data,
compare the size to struct libcfs_ioctl_hdr.

Reported-by: Doug Oucharek <doug.s.oucharek@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
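Note (not part of the commit message): a sketch of why the minimum
must be the header size. The layouts below are hypothetical and much
simpler than the real libcfs structs; the point is only that a valid
request may be as small as the common header, so comparing against
the larger data struct rejects it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layouts; the real structs have more fields. */
struct ioctl_hdr {
	uint32_t ioc_len;	/* total length claimed by user space */
	uint32_t ioc_version;
};

struct ioctl_data {
	struct ioctl_hdr ioc_hdr;
	uint64_t ioc_payload[4];
};

/* Return 0 if the claimed length can hold at least the common header. */
static int check_min_len(const struct ioctl_hdr *hdr)
{
	assert(hdr);
	if (hdr->ioc_len < sizeof(*hdr))
		return -1;	/* stands in for -EINVAL */
	return 0;
}
```

A request whose ioc_len equals sizeof(struct ioctl_hdr) passes this
check, but would have failed a comparison against
sizeof(struct ioctl_data).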
 drivers/staging/lustre/lnet/libcfs/linux/linux-module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
index 3f5d58b..bda6c16 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
@@ -134,7 +134,7 @@ int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
 		return -EINVAL;
 	}
 
-	if (hdr.ioc_len < sizeof(struct libcfs_ioctl_data)) {
+	if (hdr.ioc_len < sizeof(hdr)) {
 		CERROR("libcfs ioctl: user buffer too small for ioctl\n");
 		return -EINVAL;
 	}
-- 
1.8.3.1


* Re: [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-01-29  0:05 ` [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl James Simmons
@ 2017-01-30 10:51   ` Dan Carpenter
  2017-01-30 10:54     ` Dan Carpenter
  2017-01-31  2:25     ` James Simmons
  0 siblings, 2 replies; 79+ messages in thread
From: Dan Carpenter @ 2017-01-30 10:51 UTC (permalink / raw)
  To: James Simmons, Liang Zhen, Amir Shehata
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin,
	Linux Kernel Mailing List, Lustre Development List

It looks like what happened is there were two patches applied out of
sync.  Let's add a fixes tag and CC the original author.

Fixes: ed2f549dc0f6 ("staging: lustre: libcfs: test if userland data is to small")

This patch was probably correct when it was written but commit
1290932728e5 ("staging: lustre: Dynamic LNet Configuration (DLC) IOCTL
changes") ended up getting applied first so the size was wrong.

The lstcon_ioctl_entry() function doesn't have enough size checking.
Also I'm uncomfortable with:

	data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);

If hdr isn't the first member of the struct then the code is broken but
container_of() implies that that isn't a hard requirement.  It should
just be:

	data = (struct libcfs_ioctl_data *)hdr;
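
To illustrate the point for readers, here is a userspace sketch with
hypothetical struct layouts and a simplified container_of() that
omits the kernel's type checking. It shows that container_of() and a
plain cast agree only when the header is the first member:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(), minus type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct hdr {
	int len;
};

/* Hypothetical layouts: header as first member versus second member. */
struct data_first {
	struct hdr ioc_hdr;
	int payload;
};

struct data_second {
	int payload;
	struct hdr ioc_hdr;
};

/* Correct for any member position; a plain cast would be wrong here. */
static struct data_second *data_second_from_hdr(struct hdr *h)
{
	assert(h);
	return container_of(h, struct data_second, ioc_hdr);
}
```

For struct data_first both forms yield the same pointer; for
struct data_second only container_of() recovers the enclosing struct.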

regards,
dan carpenter


* Re: [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-01-30 10:51   ` Dan Carpenter
@ 2017-01-30 10:54     ` Dan Carpenter
  2017-01-31  0:48       ` James Simmons
  2017-01-31  2:25     ` James Simmons
  1 sibling, 1 reply; 79+ messages in thread
From: Dan Carpenter @ 2017-01-30 10:54 UTC (permalink / raw)
  To: James Simmons, Liang Zhen, Amir Shehata
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin,
	Linux Kernel Mailing List, Lustre Development List

On Mon, Jan 30, 2017 at 01:51:56PM +0300, Dan Carpenter wrote:
> The lstcon_ioctl_entry() function doesn't have enough size checking.

Actually, lstcon_ioctl_entry() would have been fine before we apply
this [patch 60/60]...  As near as I can tell, no in-kernel code is
negatively affected by the bug this patch fixes.

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 05/60] staging: lustre: llite: check request != NULL in ll_migrate
  2017-01-29  0:04 ` [PATCH 05/60] staging: lustre: llite: check request != NULL in ll_migrate James Simmons
@ 2017-01-30 11:34   ` Dan Carpenter
  2017-02-11 17:12     ` James Simmons
  0 siblings, 1 reply; 79+ messages in thread
From: Dan Carpenter @ 2017-01-30 11:34 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin, wang di,
	Linux Kernel Mailing List, Lustre Development List

On Sat, Jan 28, 2017 at 07:04:33PM -0500, James Simmons wrote:
> From: wang di <di.wang@intel.com>
> 
> Check if the request is NULL before retrieving the reply body
> from the request.
> 
> Signed-off-by: wang di <di.wang@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7396
> Reviewed-on: http://review.whamcloud.com/17079
> Reviewed-by: John L. Hammond <john.hammond@intel.com>
> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/llite/file.c | 41 +++++++++++++++++-------------
>  1 file changed, 23 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
> index a1e51a5..b681e15 100644
> --- a/drivers/staging/lustre/lustre/llite/file.c
> +++ b/drivers/staging/lustre/lustre/llite/file.c
> @@ -2656,28 +2656,33 @@ int ll_migrate(struct inode *parent, struct file *file, int mdtidx,
>  	if (!rc)
>  		ll_update_times(request, parent);
>  

I don't like how we return a non-NULL request on many error paths.  It
would be simpler to understand if mdc_rename() freed request on error.
So mdc_reint() fails but we still continue?  I don't understand that but
there are no comments about it.


> -	body = req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY);
> -	if (!body) {
> -		rc = -EPROTO;
> -		goto out_free;
> -	}
> +	if (request) {
> +		body = req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY);
> +		if (!body) {
> +			rc = -EPROTO;
> +			goto out_free;

We should call ptlrpc_req_finished(request) before returning on this
path.
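
A minimal sketch of the cleanup pattern being asked for, in userspace C. The
request type, its reference count, and req_finished() are hypothetical
stand-ins for the ptlrpc request and ptlrpc_req_finished(); the point is only
that every exit path, including the -EPROTO one, drops the reference:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a ptlrpc request. */
struct request {
	int refs;	/* references held on the request */
	int has_body;	/* non-zero if the reply carries a body */
};

/* Hypothetical stand-in for ptlrpc_req_finished(): drop one reference. */
static void req_finished(struct request *req)
{
	if (!req)
		return;
	assert(req->refs > 0);
	req->refs--;
}

static int handle_reply(struct request *req)
{
	int rc = 0;

	if (req) {
		if (!req->has_body) {
			rc = -EPROTO;
			goto out_req;
		}
		/* ... use the reply body here ... */
	}

out_req:
	/* Single exit point: the request is released on success and on error. */
	req_finished(req);
	return rc;
}
```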

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 13/60] staging: lustre: obdclass: health_check to report unhealthy upon LBUG
  2017-01-29  0:04 ` [PATCH 13/60] staging: lustre: obdclass: health_check to report unhealthy upon LBUG James Simmons
@ 2017-01-30 12:03   ` Dan Carpenter
  2017-01-31  1:00     ` James Simmons
  0 siblings, 1 reply; 79+ messages in thread
From: Dan Carpenter @ 2017-01-30 12:03 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin,
	Bruno Faccini, Linux Kernel Mailing List,
	Lustre Development List

Wat?

I'm sorry but this patch makes no sense at all.

On Sat, Jan 28, 2017 at 07:04:41PM -0500, James Simmons wrote:
> From: Bruno Faccini <bruno.faccini@intel.com>
> 
> When a LBUG has occurred, without panic_on_lbug being set,
> health_check sysfs file must return an unhealthy state.

Why?

> 
> Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7486
> Reviewed-on: http://review.whamcloud.com/17981
> Reviewed-by: Bobi Jam <bobijam@hotmail.com>
> Reviewed-by: Niu Yawei <yawei.niu@intel.com>
> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/obdclass/linux/linux-module.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> index 22e6d1f..ef25db6 100644
> --- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> +++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> @@ -224,8 +224,10 @@ static ssize_t pinger_show(struct kobject *kobj, struct attribute *attr,
>  	int i;
>  	size_t len = 0;
>  
> -	if (libcfs_catastrophe)
> -		return sprintf(buf, "LBUG\n");
> +	if (libcfs_catastrophe) {
> +		len = sprintf(buf, "LBUG\n");

This line is dead code, now.

> +		healthy = false;
> +	}
>  

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-01-30 10:54     ` Dan Carpenter
@ 2017-01-31  0:48       ` James Simmons
  0 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-31  0:48 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Liang Zhen, Amir Shehata, Greg Kroah-Hartman, devel,
	Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List


> On Mon, Jan 30, 2017 at 01:51:56PM +0300, Dan Carpenter wrote:
> > The lstcon_ioctl_entry() function doesn't have enough size checking.
> 
> Actually, the lstcon_ioctl_entry() would have been fine before we apply
> this [patch 60/60]...  As near as I can tell, no in kernel code is
> negatively affected by the bug this patch fixes.

There is one: the IOC_LIBCFS_GET_LNET_STATS ioctl was affected by this
bug. That is how it was found.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 13/60] staging: lustre: obdclass: health_check to report unhealthy upon LBUG
  2017-01-30 12:03   ` Dan Carpenter
@ 2017-01-31  1:00     ` James Simmons
  0 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-01-31  1:00 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin,
	Bruno Faccini, Linux Kernel Mailing List,
	Lustre Development List


> Wat?
> 
> I'm sorry but this patch makes no sense at all.
> 
> On Sat, Jan 28, 2017 at 07:04:41PM -0500, James Simmons wrote:
> > From: Bruno Faccini <bruno.faccini@intel.com>
> > 
> > When a LBUG has occurred, without panic_on_lbug being set,
> > health_check sysfs file must return an unhealthy state.
> 
> Why?

It's an issue of a patch being applied out of order. Originally this
patch was written before this was moved to sysfs, and the original
code didn't return right after printing "LBUG". The move to
sysfs changed this behavior. I will fix up the out-of-tree code
in this case.
 
> > 
> > Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
> > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7486
> > Reviewed-on: http://review.whamcloud.com/17981
> > Reviewed-by: Bobi Jam <bobijam@hotmail.com>
> > Reviewed-by: Niu Yawei <yawei.niu@intel.com>
> > Reviewed-by: James Simmons <uja.ornl@yahoo.com>
> > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > ---
> >  drivers/staging/lustre/lustre/obdclass/linux/linux-module.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> > index 22e6d1f..ef25db6 100644
> > --- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> > +++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> > @@ -224,8 +224,10 @@ static ssize_t pinger_show(struct kobject *kobj, struct attribute *attr,
> >  	int i;
> >  	size_t len = 0;
> >  
> > -	if (libcfs_catastrophe)
> > -		return sprintf(buf, "LBUG\n");
> > +	if (libcfs_catastrophe) {
> > +		len = sprintf(buf, "LBUG\n");
> 
> This line is dead code, now.
> 
> > +		healthy = false;
> > +	}
> >  
> 
> regards,
> dan carpenter
> 
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-01-30 10:51   ` Dan Carpenter
  2017-01-30 10:54     ` Dan Carpenter
@ 2017-01-31  2:25     ` James Simmons
  2017-01-31  8:13       ` Dan Carpenter
  2017-02-01 13:32       ` [lustre-devel] " Olaf Weber
  1 sibling, 2 replies; 79+ messages in thread
From: James Simmons @ 2017-01-31  2:25 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Liang Zhen, Amir Shehata, Greg Kroah-Hartman, devel,
	Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List, Doug Oucharek


> It looks like what happened is there were two patches applied out of
> sync.  Let's add a fixes tag and CC the original author.

So the only problem here is the commit message. I will update it then.
 
> Fixes: ed2f549dc0f6 ("staging: lustre: libcfs: test if userland data is to small")
> 
> This patch was probably correct when it was written but commit
> 1290932728e5 ("staging: lustre: Dynamic LNet Configuration (DLC) IOCTL
> changes") ended up getting applied first so the size was wrong.
> 
> The lstcon_ioctl_entry() function doesn't have enough size checking.

This sounds like a separate patch. I will open a ticket about this and
your comments below.

> Also I'm uncomfortable with:
> 
> 	data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
> 
> If hdr isn't the first member of the struct then the code is broken but
> container_of() implies that that isn't a hard requirement.  It should
> just be:
> 
> 	data = (struct libcfs_ioctl_data *)hdr;

I don't know if hdr being first is a hard requirement. Doug, Amir, do you know
if it is a requirement?

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-01-31  2:25     ` James Simmons
@ 2017-01-31  8:13       ` Dan Carpenter
  2017-02-01 13:32       ` [lustre-devel] " Olaf Weber
  1 sibling, 0 replies; 79+ messages in thread
From: Dan Carpenter @ 2017-01-31  8:13 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Liang Zhen, Greg Kroah-Hartman,
	Doug Oucharek, Linux Kernel Mailing List, Oleg Drokin,
	Amir Shehata, Lustre Development List

On Tue, Jan 31, 2017 at 02:25:22AM +0000, James Simmons wrote:
> This sounds like a separate patch. I will open a ticket about this and
> your comments below.

There are some other places that need a size check, like
LNetCtl().

It really feels like it should be a part of this patch because this
patch is introducing a security breakage and it's just fixing a normal
bug.
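
As a rough sketch of the kind of per-handler size check meant here, in
userspace C: the structs, the handler, and the simplified container_of() are
all hypothetical, not the real LNetCtl() or lstcon_ioctl_entry() code. The
handler refuses to look past the header unless userspace claimed enough
bytes for the full struct:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace version of the kernel macro, without the type check. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct ioctl_hdr {
	uint32_t ioc_len;	/* total length claimed by userspace */
	uint32_t ioc_version;
};

struct ioctl_data {
	struct ioctl_hdr ioc_hdr;
	uint32_t ioc_count;
};

/* Hypothetical handler that needs the full data struct, not just the header. */
static int handler_get_count(struct ioctl_hdr *hdr, uint32_t *count_out)
{
	struct ioctl_data *data;

	assert(hdr && count_out);

	/* The generic entry point only guaranteed sizeof(*hdr) bytes. */
	if (hdr->ioc_len < sizeof(*data))
		return -EINVAL;

	data = container_of(hdr, struct ioctl_data, ioc_hdr);
	*count_out = data->ioc_count;
	return 0;
}
```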

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 14/60] staging: lustre: lov: Ensure correct operation for large object sizes
  2017-01-29  0:04 ` [PATCH 14/60] staging: lustre: lov: Ensure correct operation for large object sizes James Simmons
@ 2017-01-31  8:53   ` Dan Carpenter
  0 siblings, 0 replies; 79+ messages in thread
From: Dan Carpenter @ 2017-01-31  8:53 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin,
	Nathaniel Clark, Linux Kernel Mailing List,
	Lustre Development List

On Sat, Jan 28, 2017 at 07:04:42PM -0500, James Simmons wrote:
> From: Nathaniel Clark <nathaniel.l.clark@intel.com>
> 
> If a backing filesystem (ZFS) returns that it supports very large
> (LLONG_MAX) object sizes, that should be correctly supported.  This
> fixes the check for uninitialized stripe_maxbytes in
> lsm_unpackmd_common(), so that ZFS can return LLONG_MAX and it will be
> okay. This issue is exercised by writing to or past the 2TB boundary
> of a singly striped file.
> 
> Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7890
> Reviewed-on: http://review.whamcloud.com/19066
> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/lov/lov_ea.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
> index ac0bf64..07dee87 100644
> --- a/drivers/staging/lustre/lustre/lov/lov_ea.c
> +++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
> @@ -150,7 +150,7 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
>  			       struct lov_mds_md *lmm,
>  			       struct lov_ost_data_v1 *objects)
>  {
> -	loff_t stripe_maxbytes = LLONG_MAX;
> +	loff_t min_stripe_maxbytes = 0, lov_bytes;


I've seen this thing in several patches and I haven't commented on it
but please don't do this.

	unsigned long foo = 0, bar;

It took me a long time to find the lov_bytes declaration hiding off the
end here.  We haven't made checkpatch.pl complain about it yet but it's
not kernel style.  One declaration per line, especially if they
aren't closely related like "int left, right;" and doubly especially if
there is an initialization involved.

>  	unsigned int stripe_count;
>  	struct lov_oinfo *loi;
>  	unsigned int i;
> @@ -168,8 +168,6 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
>  	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
>  
>  	for (i = 0; i < stripe_count; i++) {
> -		loff_t tgt_bytes;
> -
>  		loi = lsm->lsm_oinfo[i];
>  		ostid_le_to_cpu(&objects[i].l_ost_oi, &loi->loi_oi);
>  		loi->loi_ost_idx = le32_to_cpu(objects[i].l_ost_idx);
> @@ -194,17 +192,21 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
>  			continue;
>  		}
>  
> -		tgt_bytes = lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx]);
> -		stripe_maxbytes = min_t(loff_t, stripe_maxbytes, tgt_bytes);
> +		lov_bytes = lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx]);
> +		if (min_stripe_maxbytes == 0 || lov_bytes < min_stripe_maxbytes)
> +			min_stripe_maxbytes = lov_bytes;
>  	}
>  
> -	if (stripe_maxbytes == LLONG_MAX)
> -		stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
> +	if (min_stripe_maxbytes == 0)
> +		min_stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
> +
> +	stripe_count = lsm->lsm_stripe_count ?: lov->desc.ld_tgt_count;
> +	lov_bytes = min_stripe_maxbytes * stripe_count;
>  
> -	if (!lsm->lsm_stripe_count)
> -		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
> +	if (lov_bytes < min_stripe_maxbytes) /* handle overflow */

Signed overflows are undefined.  I think we use GCC options so that the
compiler does not remove these checks but I also know that I have been
wrong before about GCC options and undefined behavior.
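
One way to sidestep the question is to test for overflow before multiplying,
so no signed overflow ever happens. A minimal userspace sketch, assuming a
non-negative per-stripe size and assuming that saturating at LLONG_MAX is the
desired result (the function name is hypothetical and this is not what the
patch does):

```c
#include <assert.h>
#include <limits.h>

/*
 * Multiply a non-negative byte count by a stripe count, saturating at
 * LLONG_MAX instead of overflowing.  The division-based test runs before
 * the multiplication, so the product is only computed when it fits.
 */
static long long mul_saturate(long long per_stripe, unsigned int count)
{
	assert(per_stripe >= 0);

	if (count != 0 && per_stripe > LLONG_MAX / (long long)count)
		return LLONG_MAX;

	return per_stripe * (long long)count;
}
```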

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 21/60] staging: lustre: ptlrpc: correct use of list_add_tail()
  2017-01-29  0:04 ` [PATCH 21/60] staging: lustre: ptlrpc: correct use of list_add_tail() James Simmons
@ 2017-01-31  8:54   ` Dan Carpenter
  0 siblings, 0 replies; 79+ messages in thread
From: Dan Carpenter @ 2017-01-31  8:54 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin,
	John L. Hammond, Linux Kernel Mailing List,
	Lustre Development List

On Sat, Jan 28, 2017 at 07:04:49PM -0500, James Simmons wrote:
> From: "John L. Hammond" <john.hammond@intel.com>
> 
> In sptlrpc_gc_add_sec() swap the arguments to list_add_tail() so that
> it does what we meant it to do.
> 

Huh...  This is from before lustre was merged into staging.  What are
the user visible effects of this bug?  I would have expected it to get
caught earlier.
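
For reference, a minimal userspace sketch of what swapping the arguments does
to a circular doubly linked list. The list_head type and list_add_tail() are
re-created here to mirror the kernel helpers; this shows the generic effect
only and makes no claim about how the bug showed up in sptlrpc_gc_add_sec():

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry just before head, i.e. at the tail of head's list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev;

	assert(entry && head);
	prev = head->prev;
	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
	head->prev = entry;
}

static int list_len(const struct list_head *head)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/* Correct order: both entries end up reachable from the queue head. */
static int queue_len_correct(void)
{
	struct list_head queue, a, b;

	init_list_head(&queue);
	list_add_tail(&a, &queue);
	list_add_tail(&b, &queue);
	return list_len(&queue);
}

/* Swapped order: the head is spliced into b's list and a is orphaned. */
static int queue_len_swapped(void)
{
	struct list_head queue, a, b;

	init_list_head(&queue);
	init_list_head(&b);
	list_add_tail(&a, &queue);
	list_add_tail(&queue, &b);	/* arguments reversed */
	return list_len(&queue);
}
```

In this sketch the swapped call silently drops the earlier entry from the
queue rather than crashing, which is one reason such a mistake can go
unnoticed for a long time.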


> Signed-off-by: John L. Hammond <john.hammond@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8270

I bet the answer to my question is on this link but I'm reviewing
offline right now.  Plus that's not where the bug description belongs.

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 22/60] staging: lustre: fid: fix race in fid allocation
  2017-01-29  0:04 ` [PATCH 22/60] staging: lustre: fid: fix race in fid allocation James Simmons
@ 2017-01-31  8:55   ` Dan Carpenter
  0 siblings, 0 replies; 79+ messages in thread
From: Dan Carpenter @ 2017-01-31  8:55 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin, Fan Yong,
	Linux Kernel Mailing List, Lustre Development List

On Sat, Jan 28, 2017 at 07:04:50PM -0500, James Simmons wrote:
> -		if (!fid_is_zero(&seq->lcs_fid) &&
> -		    fid_oid(&seq->lcs_fid) < seq->lcs_width) {
> +		if (unlikely(!fid_is_zero(&seq->lcs_fid) &&
> +			     fid_oid(&seq->lcs_fid) < seq->lcs_width)) {

What does adding an unlikely have to do with the race condition?  Also
only add likely/unlikely when it makes a difference to benchmarks.
Otherwise leave it out.

>  			/* Just bump last allocated fid and return to caller. */
> -			seq->lcs_fid.f_oid += 1;
> +			seq->lcs_fid.f_oid++;

Ok...  I'm pretty sure the compiler can figure this out on its own.
Stop mixing white space changes into your bug fixes.  It just makes
reviewing more complicated.

>  			rc = 0;
>  			break;
>  		}
>  

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [lustre-devel] [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-01-31  2:25     ` James Simmons
  2017-01-31  8:13       ` Dan Carpenter
@ 2017-02-01 13:32       ` Olaf Weber
  2017-02-01 16:39         ` Greg Kroah-Hartman
  1 sibling, 1 reply; 79+ messages in thread
From: Olaf Weber @ 2017-02-01 13:32 UTC (permalink / raw)
  To: James Simmons, Dan Carpenter
  Cc: devel, Greg Kroah-Hartman, Linux Kernel Mailing List,
	Oleg Drokin, Amir Shehata, Lustre Development List

On 31-01-17 03:25, James Simmons wrote:

[...]

>> Also I'm uncomfortable with:
>>
>> 	data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
>>
>> If hdr isn't the first member of the struct then the code is broken but
>> container_of() implies that that isn't a hard requirement.  It should
>> just be:
>>
>> 	data = (struct libcfs_ioctl_data *)hdr;
>
> Don't know if hdr being first is a hard requirment. Doug, Amir do you know
> if it is an requirement?

It's a requirement.

-- 
Olaf Weber                 SGI               Phone:  +31(0)30-6696796
                            Veldzigt 2b       Fax:    +31(0)30-6696799
Sr Software Engineer       3454 PW de Meern  Vnet:   955-6796
Storage Software           The Netherlands   Email:  olaf@sgi.com

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [lustre-devel] [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl
  2017-02-01 13:32       ` [lustre-devel] " Olaf Weber
@ 2017-02-01 16:39         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-01 16:39 UTC (permalink / raw)
  To: Olaf Weber
  Cc: James Simmons, Dan Carpenter, devel, Linux Kernel Mailing List,
	Oleg Drokin, Amir Shehata, Lustre Development List

On Wed, Feb 01, 2017 at 02:32:13PM +0100, Olaf Weber wrote:
> On 31-01-17 03:25, James Simmons wrote:
> 
> [...]
> 
> > > Also I'm uncomfortable with:
> > > 
> > > 	data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
> > > 
> > > If hdr isn't the first member of the struct then the code is broken but
> > > container_of() implies that that isn't a hard requirement.  It should
> > > just be:
> > > 
> > > 	data = (struct libcfs_ioctl_data *)hdr;
> > 
> > Don't know if hdr being first is a hard requirment. Doug, Amir do you know
> > if it is an requirement?
> 
> It's a requirement.

That's horrid.  Use container_of to be "safe" here please...

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string
  2017-01-29  0:04 ` [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string James Simmons
@ 2017-02-03 10:33   ` Greg Kroah-Hartman
  2017-02-08  1:04     ` [lustre-devel] " Dilger, Andreas
  0 siblings, 1 reply; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-03 10:33 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List

On Sat, Jan 28, 2017 at 07:04:38PM -0500, James Simmons wrote:
> From: Andreas Dilger <andreas.dilger@intel.com>
> 
> Update the sysfs "version" file to print "lustre: " with
> the version number.
> 
> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5969
> Reviewed-on: http://review.whamcloud.com/16721
> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
> Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/obdclass/linux/linux-module.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> index 9f5e829..22e6d1f 100644
> --- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> +++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> @@ -208,7 +208,7 @@ struct miscdevice obd_psdev = {
>  static ssize_t version_show(struct kobject *kobj, struct attribute *attr,
>  			    char *buf)
>  {
> -	return sprintf(buf, "%s\n", LUSTRE_VERSION_STRING);
> +	return sprintf(buf, "lustre: %s\n", LUSTRE_VERSION_STRING);
>  }

Why?  You "know" this is lustre, why say it again?  Doesn't this affect
userspace tools?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 00/60] staging: lustre: batches of fixes for lustre client
  2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
                   ` (59 preceding siblings ...)
  2017-01-29  0:05 ` [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl James Simmons
@ 2017-02-03 10:46 ` Greg Kroah-Hartman
  60 siblings, 0 replies; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-03 10:46 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List

On Sat, Jan 28, 2017 at 07:04:28PM -0500, James Simmons wrote:
> Batch of missing fixes for lustre for the upstream client.

I've applied most of these, please fix up the rest, rebase, and resend.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [lustre-devel] [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string
  2017-02-03 10:33   ` Greg Kroah-Hartman
@ 2017-02-08  1:04     ` Dilger, Andreas
  2017-02-08  6:27       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 79+ messages in thread
From: Dilger, Andreas @ 2017-02-08  1:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: James Simmons, devel, Drokin, Oleg, Linux Kernel Mailing List,
	Lustre Development List


> On Feb 3, 2017, at 03:33, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> On Sat, Jan 28, 2017 at 07:04:38PM -0500, James Simmons wrote:
>> From: Andreas Dilger <andreas.dilger@intel.com>
>> 
>> Update the sysfs "version" file to print "lustre: " with
>> the version number.
>> 
>> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5969
>> Reviewed-on: http://review.whamcloud.com/16721
>> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
>> Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
>> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>> ---
>> drivers/staging/lustre/lustre/obdclass/linux/linux-module.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
>> index 9f5e829..22e6d1f 100644
>> --- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
>> +++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
>> @@ -208,7 +208,7 @@ struct miscdevice obd_psdev = {
>> static ssize_t version_show(struct kobject *kobj, struct attribute *attr,
>> 			    char *buf)
>> {
>> -	return sprintf(buf, "%s\n", LUSTRE_VERSION_STRING);
>> +	return sprintf(buf, "lustre: %s\n", LUSTRE_VERSION_STRING);
>> }
> 
> Why?  You "know" this is lustre, why say it again?  Doesn't this affect
> userspace tools?

It included "lustre: " as a prefix until commit 8b8284450569 when the code
moved from /proc to /sys, and is what the userspace tools expect.  Formerly
there were multiple strings printed in this file, each with a different prefix,
but the "lustre: " prefix was dropped in the move to sysfs.

That didn't matter until a userspace patch stopped using ioctl(IOC_GET_VERSION)
and instead got the version from the existing /proc or /sys files, so that we
can deprecate and eventually drop the IOC_GET_VERSION ioctl completely.

So this patch is returning to the previous format of the /proc file, but if
there is a big objection to this patch we can also change the userspace tools
to live with or without this prefix now that there is only a single value here.
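
A minimal sketch of the userspace alternative, accepting the version line with
or without the prefix. The function name is hypothetical and this is not code
from the Lustre tools:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the version text inside line, skipping an optional
 * "lustre: " prefix.  The returned pointer aliases the input buffer.
 */
static const char *skip_version_prefix(const char *line)
{
	static const char prefix[] = "lustre: ";
	size_t n = sizeof(prefix) - 1;

	assert(line != NULL);
	return strncmp(line, prefix, n) == 0 ? line + n : line;
}
```

A tool using such a helper would read the same version string from the old
/proc format and from a sysfs file that holds only the bare value.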

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [lustre-devel] [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string
  2017-02-08  1:04     ` [lustre-devel] " Dilger, Andreas
@ 2017-02-08  6:27       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-02-08  6:27 UTC (permalink / raw)
  To: Dilger, Andreas
  Cc: devel, Drokin, Oleg, Linux Kernel Mailing List, Lustre Development List

On Wed, Feb 08, 2017 at 01:04:52AM +0000, Dilger, Andreas wrote:
> 
> > On Feb 3, 2017, at 03:33, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > 
> > On Sat, Jan 28, 2017 at 07:04:38PM -0500, James Simmons wrote:
> >> From: Andreas Dilger <andreas.dilger@intel.com>
> >> 
> >> Update the sysfs "version" file to print "lustre: " with
> >> the version number.
> >> 
> >> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
> >> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5969
> >> Reviewed-on: http://review.whamcloud.com/16721
> >> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
> >> Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
> >> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> >> Signed-off-by: James Simmons <jsimmons@infradead.org>
> >> ---
> >> drivers/staging/lustre/lustre/obdclass/linux/linux-module.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> >> index 9f5e829..22e6d1f 100644
> >> --- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> >> +++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
> >> @@ -208,7 +208,7 @@ struct miscdevice obd_psdev = {
> >> static ssize_t version_show(struct kobject *kobj, struct attribute *attr,
> >> 			    char *buf)
> >> {
> >> -	return sprintf(buf, "%s\n", LUSTRE_VERSION_STRING);
> >> +	return sprintf(buf, "lustre: %s\n", LUSTRE_VERSION_STRING);
> >> }
> > 
> > Why?  You "know" this is lustre, why say it again?  Doesn't this affect
> > userspace tools?
> 
> It included "lustre: " as a prefix until commit 8b8284450569 when the code
> moved from /proc to /sys, and is what the userspace tools expect.  Formerly
> there were multiple strings printed in this file, each with a different prefix,
> but the "lustre: " prefix was dropped in the move to sysfs.
> 
> That didn't matter until a userspace patch to stop using ioctl(IOC_GET_VERSION)
> and instead get the version from the existing /proc or /sys files, so that we
> can deprecate and eventually drop the IOC_GET_VERSION ioctl completely.
> 
> So this patch is returning to the previous format of the /proc file, but if
> there is a big objection to this patch we can also change the userspace tools
> to live with or without this prefix now that there is only a single value here.

Think about it, it's a sysfs file, which should only have one value to
start with, and you are opening it from userspace knowing exactly where
it is (somewhere in the lustre subtree), so of course you know it is
"lustre"...

greg k-h

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 05/60] staging: lustre: llite: check request != NULL in ll_migrate
  2017-01-30 11:34   ` Dan Carpenter
@ 2017-02-11 17:12     ` James Simmons
  0 siblings, 0 replies; 79+ messages in thread
From: James Simmons @ 2017-02-11 17:12 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin, wang di,
	Linux Kernel Mailing List, Lustre Development List


> On Sat, Jan 28, 2017 at 07:04:33PM -0500, James Simmons wrote:
> > From: wang di <di.wang@intel.com>
> > 
> > Check if the request is NULL before retrieving the reply body
> > from the request.
> > 
> > Signed-off-by: wang di <di.wang@intel.com>
> > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7396
> > Reviewed-on: http://review.whamcloud.com/17079
> > Reviewed-by: John L. Hammond <john.hammond@intel.com>
> > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > ---
> >  drivers/staging/lustre/lustre/llite/file.c | 41 +++++++++++++++++-------------
> >  1 file changed, 23 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
> > index a1e51a5..b681e15 100644
> > --- a/drivers/staging/lustre/lustre/llite/file.c
> > +++ b/drivers/staging/lustre/lustre/llite/file.c
> > @@ -2656,28 +2656,33 @@ int ll_migrate(struct inode *parent, struct file *file, int mdtidx,
> >  	if (!rc)
> >  		ll_update_times(request, parent);
> >  
> 
> I don't like how we return a non-NULL request on many error paths.  It
> would be simpler to understand if mdc_rename() freed request on error.
> So mdc_reint() fails but we still continue?  I don't understand that but
> there are no comments about it.
> 
> 
> > -	body = req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY);
> > -	if (!body) {
> > -		rc = -EPROTO;
> > -		goto out_free;
> > -	}
> > +	if (request) {
> > +		body = req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY);
> > +		if (!body) {
> > +			rc = -EPROTO;
> > +			goto out_free;
> 
> We should call ptlrpc_req_finished(request) before returning on this
> path.

There are more patches coming that fix the issues you pointed out. It's
just that I have been pushing patches that are order independent first. I
will push out the patches to address other issues with ll_migrate().

^ permalink raw reply	[flat|nested] 79+ messages in thread

end of thread, other threads:[~2017-02-11 17:12 UTC | newest]

Thread overview: 79+ messages
2017-01-29  0:04 [PATCH 00/60] staging: lustre: batches of fixes for lustre client James Simmons
2017-01-29  0:04 ` [PATCH 01/60] staging: lustre: llite: Remove access of stripe in ll_setattr_raw James Simmons
2017-01-29  0:04 ` [PATCH 02/60] staging: lustre: statahead: drop support for remote entry James Simmons
2017-01-29  0:04 ` [PATCH 03/60] staging: lustre: clio: add cl_page LRU shrinker James Simmons
2017-01-29  0:04 ` [PATCH 04/60] staging: lustre: mdc: quiet console message for known -EINTR James Simmons
2017-01-29  0:04 ` [PATCH 05/60] staging: lustre: llite: check request != NULL in ll_migrate James Simmons
2017-01-30 11:34   ` Dan Carpenter
2017-02-11 17:12     ` James Simmons
2017-01-29  0:04 ` [PATCH 06/60] staging: lustre: clio: revise readahead to support 16MB IO James Simmons
2017-01-29  0:04 ` [PATCH 07/60] staging: lustre: ptlrpc: set proper mbits for EINPROGRESS resend James Simmons
2017-01-29  0:04 ` [PATCH 08/60] staging: lustre: ldlm: Restore connect flags on failure James Simmons
2017-01-29  0:04 ` [PATCH 09/60] staging: lustre: lmv: Correctly generate target_obd James Simmons
2017-01-29  0:04 ` [PATCH 10/60] staging: lustre: obdclass: add more info to sysfs version string James Simmons
2017-02-03 10:33   ` Greg Kroah-Hartman
2017-02-08  1:04     ` [lustre-devel] " Dilger, Andreas
2017-02-08  6:27       ` Greg Kroah-Hartman
2017-01-29  0:04 ` [PATCH 11/60] staging: lustre: obd: RCU stalls in lu_cache_shrink_count() James Simmons
2017-01-29  0:04 ` [PATCH 12/60] staging: lustre: lmv: Error not handled for lmv_find_target James Simmons
2017-01-29  0:04 ` [PATCH 13/60] staging: lustre: obdclass: health_check to report unhealthy upon LBUG James Simmons
2017-01-30 12:03   ` Dan Carpenter
2017-01-31  1:00     ` James Simmons
2017-01-29  0:04 ` [PATCH 14/60] staging: lustre: lov: Ensure correct operation for large object sizes James Simmons
2017-01-31  8:53   ` Dan Carpenter
2017-01-29  0:04 ` [PATCH 15/60] staging: lustre: hsm: stack overrun in hai_dump_data_field James Simmons
2017-01-29  0:04 ` [PATCH 16/60] staging: lustre: llite: don't ignore layout for group lock request James Simmons
2017-01-29  0:04 ` [PATCH 17/60] staging: lustre: obdclass: do not call lu_site_purge() for single object exceed James Simmons
2017-01-29  0:04 ` [PATCH 18/60] staging: lustre: ptlrpc: skip lock if export failed James Simmons
2017-01-29  0:04 ` [PATCH 19/60] staging: lustre: llite: handle inactive OSTs better in statfs James Simmons
2017-01-29  0:04 ` [PATCH 20/60] staging: lustre: llite: remove obsolete comment for ll_unlink() James Simmons
2017-01-29  0:04 ` [PATCH 21/60] staging: lustre: ptlrpc: correct use of list_add_tail() James Simmons
2017-01-31  8:54   ` Dan Carpenter
2017-01-29  0:04 ` [PATCH 22/60] staging: lustre: fid: fix race in fid allocation James Simmons
2017-01-31  8:55   ` Dan Carpenter
2017-01-29  0:04 ` [PATCH 23/60] staging: lustre: lmv: remove unused placement parameter James Simmons
2017-01-29  0:04 ` [PATCH 24/60] staging: lustre: lustre: Remove old commented out code James Simmons
2017-01-29  0:04 ` [PATCH 25/60] staging: lustre: llite: normal user can't set FS default stripe James Simmons
2017-01-29  0:04 ` [PATCH 26/60] staging: lustre: llite: Trust creates in revalidate too James Simmons
2017-01-29  0:04 ` [PATCH 27/60] staging: lustre: mgc: handle config_llog_data::cld_refcount properly James Simmons
2017-01-29  0:04 ` [PATCH 28/60] staging: lustre: ldlm: ASSERTION(flock->blocking_export!=0) failed James Simmons
2017-01-29  0:04 ` [PATCH 29/60] staging: lustre: llite: Setting xattr are properly checked with and without ACLs James Simmons
2017-01-29  0:04 ` [PATCH 30/60] staging: lustre: ptlrpc: comment for FLD_QUERY RPC reply swab James Simmons
2017-01-29  0:04 ` [PATCH 31/60] staging: lustre: clio: sync write should update mtime James Simmons
2017-01-29  0:05 ` [PATCH 32/60] staging: lustre: osc: limits the number of chunks in write RPC James Simmons
2017-01-29  0:05 ` [PATCH 33/60] staging: lustre: libcfs: avoid stomping on module param cpu_pattern James Simmons
2017-01-29  0:05 ` [PATCH 34/60] staging: lustre: libcfs: default CPT matches NUMA topology James Simmons
2017-01-29  0:05 ` [PATCH 35/60] staging: lustre: lov: ld_target could be NULL James Simmons
2017-01-29  0:05 ` [PATCH 36/60] staging: lustre: header: remove assert from interval_set() James Simmons
2017-01-29  0:05 ` [PATCH 37/60] staging: lustre: llite: specify READA debug mask for ras_update James Simmons
2017-01-29  0:05 ` [PATCH 38/60] staging: lustre: llite: Adding timed wait in ll_umount_begin James Simmons
2017-01-29  0:05 ` [PATCH 39/60] staging: libcfs: remove integer types abstraction from libcfs James Simmons
2017-01-29  0:05 ` [PATCH 40/60] staging: ptlrpc: leaked rs on difficult reply James Simmons
2017-01-29  0:05 ` [PATCH 41/60] staging: lustre: osc: osc_match_base prototype differs from declaration James Simmons
2017-01-29  0:05 ` [PATCH 42/60] staging: lustre: ptlrpc: allow blocking asts to be delayed James Simmons
2017-01-29  0:05 ` [PATCH 43/60] staging: lustre: obd: remove OBD_NOTIFY_CREATE James Simmons
2017-01-29  0:05 ` [PATCH 44/60] staging: lustre: libcfs: fix error messages James Simmons
2017-01-29  0:05 ` [PATCH 45/60] staging: lustre: libcfs: Change positional struct initializers to C99 James Simmons
2017-01-29  0:05 ` [PATCH 46/60] staging: lustre: mdc: Make IT_OPEN take lookup bits lock James Simmons
2017-01-29  0:05 ` [PATCH 47/60] staging: lustre: mdc: avoid returning freed request James Simmons
2017-01-29  0:05 ` [PATCH 48/60] staging: lustre: ksocklnd: ignore timedout TX on closing connection James Simmons
2017-01-29  0:05 ` [PATCH 49/60] staging: lustre: socklnd: remove socklnd_init_msg James Simmons
2017-01-29  0:05 ` [PATCH 50/60] staging: lustre: ptlrpc: remove unused pc->pc_env James Simmons
2017-01-29  0:05 ` [PATCH 51/60] staging: lustre: ptlrpc: update MODULE_PARAM_DESC in ptlrpcd.c James Simmons
2017-01-29  0:05 ` [PATCH 52/60] staging: lustre: linkea: linkEA size limitation James Simmons
2017-01-29  0:05 ` [PATCH 53/60] staging: lustre: ptlrpc: update replay cursor when close during replay James Simmons
2017-01-29  0:05 ` [PATCH 54/60] staging: lustre: fid: Change positional struct initializers to C99 James Simmons
2017-01-29  0:05 ` [PATCH 55/60] staging: lustre: obd: move s3 in lmd_parse to inner loop James Simmons
2017-01-29  0:05 ` [PATCH 56/60] staging: lustre: llite: don't invoke direct_IO for the EOF case James Simmons
2017-01-29  0:05 ` [PATCH 57/60] staging: lustre: lmv: remove nlink check in lmv_revalidate_slaves James Simmons
2017-01-29  0:05 ` [PATCH 58/60] staging: lustre: osc: avoid 64 divide in osc_cache_too_much James Simmons
2017-01-29  0:05 ` [PATCH 59/60] staging: lustre: ptlrpc : remove userland usage from ptlrpc James Simmons
2017-01-29  0:05 ` [PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl James Simmons
2017-01-30 10:51   ` Dan Carpenter
2017-01-30 10:54     ` Dan Carpenter
2017-01-31  0:48       ` James Simmons
2017-01-31  2:25     ` James Simmons
2017-01-31  8:13       ` Dan Carpenter
2017-02-01 13:32       ` [lustre-devel] " Olaf Weber
2017-02-01 16:39         ` Greg Kroah-Hartman
2017-02-03 10:46 ` [PATCH 00/60] staging: lustre: batches of fixes for lustre client Greg Kroah-Hartman
