linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59
@ 2016-10-27 22:11 James Simmons
  2016-10-27 22:11 ` [PATCH 01/29] staging: lustre: osc: remove handling cl_avail_grant less than zero James Simmons
                   ` (28 more replies)
  0 siblings, 29 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

The first two patches are fixes for problems pointed out by
Dan Carpenter and Julia LaWall. Batch of fixes from the Lustre 2.8
version. Basic support for selinux which before this work
prevented lustre from running. First patch to cleanup lustre_idl.h
which is a uapi header. All these patches are independent of
each other so they can be applied in any order.

Alex Zhuravlev (1):
  staging: lustre: ptlrpc: imp_peer_committed_transno should increase

Amir Shehata (1):
  staging: lustre: ptlrpc: Introduce iovec to bulk descriptor

Andreas Dilger (1):
  staging: lustre: obdclass: variable llog chunk size

Ben Evans (1):
  staging: lustre: headers: Create single .h for lu_seq_range

Bobi Jam (2):
  staging: lustre: llite: restart short read/write for normal IO
  staging: lustre: clio: update file attributes after sync

Gregoire Pichon (2):
  staging: lustre: ptlrpc: embed highest XID in each request
  staging: lustre: mdc: manage number of modify RPCs in flight

Henri Doreau (2):
  staging: lustre: ptlrpc: Forbid too early NRS policy tunings
  staging: lustre: llite: Inform copytool of dataversion changes

Hiroya Nozaki (1):
  staging: lustre: obdclass: race lustre_profile_list

Hongchao Zhang (1):
  staging: lustre: mdt: disable IMA support

James Simmons (2):
  staging: lustre: osc: remove handling cl_avail_grant less than zero
  staging: lustre: llite: remove IS_ERR(master_inode) check

John L. Hammond (2):
  staging: lustre: lov: remove LSM from struct lustre_md
  staging: lustre: llite: add LL_IOC_FUTIMES_3

Lai Siyao (1):
  staging: lustre: dne: setdirstripe should fail if not supported

Niu Yawei (3):
  staging: lustre: ldlm: reclaim granted locks defensively
  staging: lustre: recovery: don't skip open replay on reconnect
  staging: lustre: obdecho: don't copy lu_site

Sebastien Buisson (3):
  staging: lustre: llite: basic support of SELinux in CLIO
  staging: lustre: ptlrpc: do not sleep if encpool reached max capacity
  staging: lustre: ptlrpc: do not switch out-of-date context

wang di (6):
  staging: lustre: lmv: allow cross-MDT rename and link
  staging: lustre: llite: report back to user bad stripe count
  staging: lustre: ptlrpc: Do not resend req with allow_replay
  staging: lustre: mdc: deactive MDT permanently
  staging: lustre: ptlrpc: replay bulk request
  staging: lustre: llog: record the minimum record size

 .../lustre/include/linux/libcfs/libcfs_hash.h      |    2 +-
 drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
 drivers/staging/lustre/lnet/libcfs/hash.c          |   24 ++-
 drivers/staging/lustre/lustre/fid/fid_request.c    |   18 +-
 drivers/staging/lustre/lustre/fid/lproc_fid.c      |    2 +-
 drivers/staging/lustre/lustre/fld/fld_cache.c      |    6 +-
 drivers/staging/lustre/lustre/include/cl_object.h  |    2 +-
 .../lustre/lustre/include/lustre/lustre_idl.h      |  154 ++++------------
 .../lustre/lustre/include/lustre/lustre_user.h     |   10 +
 drivers/staging/lustre/lustre/include/lustre_fid.h |    1 +
 drivers/staging/lustre/lustre/include/lustre_log.h |    6 +
 drivers/staging/lustre/lustre/include/lustre_mdc.h |   24 +++
 drivers/staging/lustre/lustre/include/lustre_net.h |  161 +++++++++++++---
 drivers/staging/lustre/lustre/include/lustre_sec.h |    2 +
 drivers/staging/lustre/lustre/include/obd.h        |    9 +-
 drivers/staging/lustre/lustre/include/obd_class.h  |    9 +
 .../staging/lustre/lustre/include/obd_support.h    |    2 +
 drivers/staging/lustre/lustre/include/seq_range.h  |  199 ++++++++++++++++++++
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |   37 ++++
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    8 +-
 drivers/staging/lustre/lustre/llite/Makefile       |    6 +-
 drivers/staging/lustre/lustre/llite/dir.c          |   31 +++-
 drivers/staging/lustre/lustre/llite/file.c         |  115 ++++++++----
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |    2 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |    4 +
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   19 +-
 drivers/staging/lustre/lustre/llite/namei.c        |   14 +-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++-
 drivers/staging/lustre/lustre/llite/xattr.c        |   65 +++----
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |    4 +
 .../staging/lustre/lustre/llite/xattr_security.c   |   88 +++++++++
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |   78 ++++++--
 drivers/staging/lustre/lustre/lov/lov_ea.c         |    9 +-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |   15 +--
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    1 -
 drivers/staging/lustre/lustre/lov/lov_object.c     |  120 +++++++-----
 drivers/staging/lustre/lustre/lov/lov_pack.c       |  100 ++++------
 drivers/staging/lustre/lustre/mdc/lproc_mdc.c      |   60 ++++++
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |   28 ++--
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |   23 +--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   77 +++-----
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    8 +-
 drivers/staging/lustre/lustre/obdclass/genops.c    |  171 +++++++++++++++++-
 drivers/staging/lustre/lustre/obdclass/llog.c      |   49 +++--
 drivers/staging/lustre/lustre/obdclass/llog_obd.c  |    1 +
 drivers/staging/lustre/lustre/obdclass/llog_swab.c |    8 +-
 .../staging/lustre/lustre/obdclass/obd_config.c    |   57 +++++-
 .../staging/lustre/lustre/obdecho/echo_client.c    |   20 +-
 drivers/staging/lustre/lustre/osc/osc_internal.h   |    2 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |    3 +-
 drivers/staging/lustre/lustre/osc/osc_page.c       |    4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |   38 +++--
 drivers/staging/lustre/lustre/ptlrpc/client.c      |  199 ++++++++++++++++----
 drivers/staging/lustre/lustre/ptlrpc/events.c      |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/llog_client.c |   20 ++-
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |   16 +-
 drivers/staging/lustre/lustre/ptlrpc/nrs.c         |    9 +
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   15 ++
 drivers/staging/lustre/lustre/ptlrpc/pers.c        |   31 ++--
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |    9 +-
 drivers/staging/lustre/lustre/ptlrpc/recover.c     |   12 +-
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |    7 +
 drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c    |   48 ++++--
 drivers/staging/lustre/lustre/ptlrpc/sec_plain.c   |   20 +-
 65 files changed, 1649 insertions(+), 661 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/include/seq_range.h
 create mode 100644 drivers/staging/lustre/lustre/llite/xattr_security.c

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/29] staging: lustre: osc: remove handling cl_avail_grant less than zero
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 02/29] staging: lustre: llite: remove IS_ERR(master_inode) check James Simmons
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

Earlier cl_avail_grant was changed to an unsigned int. Juila
Lawall reported for the upstream client the following which
affects the Intel branch as well:

drivers/staging/lustre/lustre/osc/osc_request.c:1045:5-24: WARNING: Unsigned
expression compared with zero: cli -> cl_avail_grant < 0

Since cl_avail_grant can never be negative we can remove the
code handling the negative value case.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: LU-8697 https://jira.hpdd.intel.com/browse/LU-6303
Reviewed-on: http://review.whamcloud.com/23155
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/osc/osc_request.c |   10 ----------
 1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 038f00c..efd938a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -801,16 +801,6 @@ static void osc_init_grant(struct client_obd *cli, struct obd_connect_data *ocd)
 		cli->cl_avail_grant = ocd->ocd_grant -
 				      (cli->cl_dirty_pages << PAGE_SHIFT);
 
-	if (cli->cl_avail_grant < 0) {
-		CWARN("%s: available grant < 0: avail/ocd/dirty %ld/%u/%ld\n",
-		      cli->cl_import->imp_obd->obd_name, cli->cl_avail_grant,
-		      ocd->ocd_grant, cli->cl_dirty_pages << PAGE_SHIFT);
-		/* workaround for servers which do not have the patch from
-		 * LU-2679
-		 */
-		cli->cl_avail_grant = ocd->ocd_grant;
-	}
-
 	/* determine the appropriate chunk size used by osc_extent. */
 	cli->cl_chunkbits = max_t(int, PAGE_SHIFT, ocd->ocd_blocksize);
 	spin_unlock(&cli->cl_loi_list_lock);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 02/29] staging: lustre: llite: remove IS_ERR(master_inode) check
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
  2016-10-27 22:11 ` [PATCH 01/29] staging: lustre: osc: remove handling cl_avail_grant less than zero James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 03/29] staging: lustre: llite: restart short read/write for normal IO James Simmons
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

The kernel function ilookup5_nowait never returns
IS_ERR so we can remove the IS_ERR check in the
ll_md_blocking_ast() function.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8697
Reviewed-on: http://review.whamcloud.com/23151
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/namei.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index a69c8af..9b4095f 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -283,7 +283,7 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 				master_inode = ilookup5(inode->i_sb, hash,
 							ll_test_inode_by_fid,
 							(void *)&lli->lli_pfid);
-				if (master_inode && !IS_ERR(master_inode)) {
+				if (master_inode) {
 					ll_invalidate_negative_children(master_inode);
 					iput(master_inode);
 				}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 03/29] staging: lustre: llite: restart short read/write for normal IO
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
  2016-10-27 22:11 ` [PATCH 01/29] staging: lustre: osc: remove handling cl_avail_grant less than zero James Simmons
  2016-10-27 22:11 ` [PATCH 02/29] staging: lustre: llite: remove IS_ERR(master_inode) check James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 04/29] staging: lustre: obdclass: variable llog chunk size James Simmons
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	Jinshan Xiong, James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

If normal IO got short read/write, we'd restart the IO from where
we've accomplished until we meet EOF or error happens.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
Reviewed-on: http://review.whamcloud.com/14123
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
 .../staging/lustre/lustre/include/obd_support.h    |    2 +
 drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
 4 files changed, 45 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/fail.c b/drivers/staging/lustre/lnet/libcfs/fail.c
index e4b1a0a..3a9c8dd 100644
--- a/drivers/staging/lustre/lnet/libcfs/fail.c
+++ b/drivers/staging/lustre/lnet/libcfs/fail.c
@@ -113,6 +113,7 @@ int __cfs_fail_check_set(__u32 id, __u32 value, int set)
 		break;
 	case CFS_FAIL_LOC_RESET:
 		cfs_fail_loc = value;
+		atomic_set(&cfs_fail_count, 0);
 		break;
 	default:
 		LASSERTF(0, "called with bad set %u\n", set);
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 1233c34..7f3f8cd 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -458,6 +458,8 @@
 #define OBD_FAIL_LOV_INIT			    0x1403
 #define OBD_FAIL_GLIMPSE_DELAY			    0x1404
 #define OBD_FAIL_LLITE_XATTR_ENOMEM		    0x1405
+#define OBD_FAIL_MAKE_LOVEA_HOLE		    0x1406
+#define OBD_FAIL_LLITE_LOST_LAYOUT		    0x1407
 #define OBD_FAIL_GETATTR_DELAY			    0x1409
 
 #define OBD_FAIL_FID_INDIR	0x1501
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 0accf28..2c8df43 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -972,9 +972,11 @@ void ll_io_init(struct cl_io *io, const struct file *file, int write)
 {
 	struct ll_inode_info *lli = ll_i2info(file_inode(file));
 	struct ll_file_data  *fd  = LUSTRE_FPRIVATE(file);
+	struct vvp_io *vio = vvp_env_io(env);
 	struct range_lock range;
 	struct cl_io	 *io;
-	ssize_t	       result;
+	ssize_t result = 0;
+	int rc = 0;
 
 	CDEBUG(D_VFSTRACE, "file: %pD, type: %d ppos: %llu, count: %zu\n",
 	       file, iot, *ppos, count);
@@ -1006,16 +1008,15 @@ void ll_io_init(struct cl_io *io, const struct file *file, int write)
 			CDEBUG(D_VFSTRACE, "Range lock [%llu, %llu]\n",
 			       range.rl_node.in_extent.start,
 			       range.rl_node.in_extent.end);
-			result = range_lock(&lli->lli_write_tree,
-					    &range);
-			if (result < 0)
+			rc = range_lock(&lli->lli_write_tree, &range);
+			if (rc < 0)
 				goto out;
 
 			range_locked = true;
 		}
 		down_read(&lli->lli_trunc_sem);
 		ll_cl_add(file, env, io);
-		result = cl_io_loop(env, io);
+		rc = cl_io_loop(env, io);
 		ll_cl_remove(file, env);
 		up_read(&lli->lli_trunc_sem);
 		if (range_locked) {
@@ -1026,24 +1027,26 @@ void ll_io_init(struct cl_io *io, const struct file *file, int write)
 		}
 	} else {
 		/* cl_io_rw_init() handled IO */
-		result = io->ci_result;
+		rc = io->ci_result;
 	}
 
 	if (io->ci_nob > 0) {
 		result = io->ci_nob;
+		count -= io->ci_nob;
 		*ppos = io->u.ci_wr.wr.crw_pos;
+
+		/* prepare IO restart */
+		if (count > 0)
+			args->u.normal.via_iter = vio->vui_iter;
 	}
-	goto out;
 out:
 	cl_io_fini(env, io);
-	/* If any bit been read/written (result != 0), we just return
-	 * short read/write instead of restart io.
-	 */
-	if ((result == 0 || result == -ENODATA) && io->ci_need_restart) {
-		CDEBUG(D_VFSTRACE, "Restart %s on %pD from %lld, count:%zu\n",
+
+	if ((!rc || rc == -ENODATA) && count > 0 && io->ci_need_restart) {
+		CDEBUG(D_VFSTRACE, "%s: restart %s from %lld, count:%zu, result: %zd\n",
+		       file_dentry(file)->d_name.name,
 		       iot == CIT_READ ? "read" : "write",
-		       file, *ppos, count);
-		LASSERTF(io->ci_nob == 0, "%zd\n", io->ci_nob);
+		       *ppos, count, result);
 		goto restart;
 	}
 
@@ -1056,13 +1059,19 @@ void ll_io_init(struct cl_io *io, const struct file *file, int write)
 			ll_stats_ops_tally(ll_i2sbi(file_inode(file)),
 					   LPROC_LL_WRITE_BYTES, result);
 			fd->fd_write_failed = false;
-		} else if (result != -ERESTARTSYS) {
+		} else if (!result && !rc) {
+			rc = io->ci_result;
+			if (rc < 0)
+				fd->fd_write_failed = true;
+			else
+				fd->fd_write_failed = false;
+		} else if (rc != -ERESTARTSYS) {
 			fd->fd_write_failed = true;
 		}
 	}
 	CDEBUG(D_VFSTRACE, "iot: %d, result: %zd\n", iot, result);
 
-	return result;
+	return result > 0 ? result : rc;
 }
 
 static ssize_t ll_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 87fdab2..3d327b4 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -72,9 +72,10 @@ static bool can_populate_pages(const struct lu_env *env, struct cl_io *io,
 		/* don't need lock here to check lli_layout_gen as we have held
 		 * extent lock and GROUP lock has to hold to swap layout
 		 */
-		if (ll_layout_version_get(lli) != vio->vui_layout_gen) {
+		if (ll_layout_version_get(lli) != vio->vui_layout_gen ||
+		    OBD_FAIL_CHECK_RESET(OBD_FAIL_LLITE_LOST_LAYOUT, 0)) {
 			io->ci_need_restart = 1;
-			/* this will return application a short read/write */
+			/* this will cause a short read/write */
 			io->ci_continue = 0;
 			rc = false;
 		}
@@ -924,6 +925,20 @@ static int vvp_io_write_start(const struct lu_env *env,
 
 	CDEBUG(D_VFSTRACE, "write: [%lli, %lli)\n", pos, pos + (long long)cnt);
 
+	/*
+	 * The maximum Lustre file size is variable, based on the OST maximum
+	 * object size and number of stripes.  This needs another check in
+	 * addition to the VFS checks earlier.
+	 */
+	if (pos + cnt > ll_file_maxbytes(inode)) {
+		CDEBUG(D_INODE,
+		       "%s: file " DFID " offset %llu > maxbytes %llu\n",
+		       ll_get_fsname(inode->i_sb, NULL, 0),
+		       PFID(ll_inode2fid(inode)), pos + cnt,
+		       ll_file_maxbytes(inode));
+		return -EFBIG;
+	}
+
 	if (!vio->vui_iter) {
 		/* from a temp io in ll_cl_init(). */
 		result = 0;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 04/29] staging: lustre: obdclass: variable llog chunk size
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (2 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 03/29] staging: lustre: llite: restart short read/write for normal IO James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 05/29] staging: lustre: lmv: allow cross-MDT rename and link James Simmons
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Andreas Dilger, wang di, James Simmons

From: Andreas Dilger <andreas.dilger@intel.com>

Do not use fix LLOG_CHUNK_SIZE (8192 bytes), and
it will get the llog_chunk_size from llog_log_hdr.
Accordingly llog header will be variable too, so
we can enlarge the bitmap in the header, then
have more records in each llog file.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6602
Reviewed-on: http://review.whamcloud.com/14883
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   39 ++++++++++++----
 drivers/staging/lustre/lustre/include/lustre_log.h |    6 +++
 drivers/staging/lustre/lustre/obdclass/llog.c      |   48 +++++++++++--------
 drivers/staging/lustre/lustre/obdclass/llog_obd.c  |    1 +
 drivers/staging/lustre/lustre/obdclass/llog_swab.c |    8 ++-
 drivers/staging/lustre/lustre/ptlrpc/llog_client.c |   20 ++++++--
 6 files changed, 84 insertions(+), 38 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 5d2f845..e542ce6 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -3122,13 +3122,6 @@ struct llog_gen_rec {
 	struct llog_rec_tail	lgr_tail;
 };
 
-/* On-disk header structure of each log object, stored in little endian order */
-#define LLOG_CHUNK_SIZE	 8192
-#define LLOG_HEADER_SIZE	(96)
-#define LLOG_BITMAP_BYTES       (LLOG_CHUNK_SIZE - LLOG_HEADER_SIZE)
-
-#define LLOG_MIN_REC_SIZE       (24) /* round(llog_rec_hdr + llog_rec_tail) */
-
 /* flags for the logs */
 enum llog_flag {
 	LLOG_F_ZAP_WHEN_EMPTY	= 0x1,
@@ -3139,6 +3132,15 @@ enum llog_flag {
 	LLOG_F_EXT_MASK = LLOG_F_EXT_JOBID,
 };
 
+/* On-disk header structure of each log object, stored in little endian order */
+#define LLOG_MIN_CHUNK_SIZE	8192
+#define LLOG_HEADER_SIZE	(96)	/* sizeof (llog_log_hdr) +
+					 * sizeof(llh_tail) - sizeof(llh_bitmap)
+					 */
+#define LLOG_BITMAP_BYTES	(LLOG_MIN_CHUNK_SIZE - LLOG_HEADER_SIZE)
+#define LLOG_MIN_REC_SIZE	(24)	/* round(llog_rec_hdr + llog_rec_tail) */
+
+/* flags for the logs */
 struct llog_log_hdr {
 	struct llog_rec_hdr     llh_hdr;
 	__s64		   llh_timestamp;
@@ -3150,13 +3152,30 @@ struct llog_log_hdr {
 	/* for a catalog the first plain slot is next to it */
 	struct obd_uuid	 llh_tgtuuid;
 	__u32		   llh_reserved[LLOG_HEADER_SIZE / sizeof(__u32) - 23];
+	/* These fields must always be at the end of the llog_log_hdr.
+	 * Note: llh_bitmap size is variable because llog chunk size could be
+	 * bigger than LLOG_MIN_CHUNK_SIZE, i.e. sizeof(llog_log_hdr) > 8192
+	 * bytes, and the real size is stored in llh_hdr.lrh_len, which means
+	 * llh_tail should only be referred by LLOG_HDR_TAIL().
+	 * But this structure is also used by client/server llog interface
+	 * (see llog_client.c), it will be kept in its original way to avoid
+	 * compatibility issue.
+	 */
 	__u32		   llh_bitmap[LLOG_BITMAP_BYTES / sizeof(__u32)];
 	struct llog_rec_tail    llh_tail;
 } __packed;
 
-#define LLOG_BITMAP_SIZE(llh)  (__u32)((llh->llh_hdr.lrh_len -		\
-					llh->llh_bitmap_offset -	\
-					sizeof(llh->llh_tail)) * 8)
+#undef LLOG_HEADER_SIZE
+#undef LLOG_BITMAP_BYTES
+
+#define LLOG_HDR_BITMAP_SIZE(llh) (__u32)((llh->llh_hdr.lrh_len -	\
+					   llh->llh_bitmap_offset -	\
+					   sizeof(llh->llh_tail)) * 8)
+#define LLOG_HDR_BITMAP(llh)	(__u32 *)((char *)(llh) +		\
+					  (llh)->llh_bitmap_offset)
+#define LLOG_HDR_TAIL(llh)	((struct llog_rec_tail *)((char *)llh + \
+							 llh->llh_hdr.lrh_len - \
+							 sizeof(llh->llh_tail)))
 
 /** log cookies are used to reference a specific log file and a record
  * therein
diff --git a/drivers/staging/lustre/lustre/include/lustre_log.h b/drivers/staging/lustre/lustre/include/lustre_log.h
index 995b266..35e37eb 100644
--- a/drivers/staging/lustre/lustre/include/lustre_log.h
+++ b/drivers/staging/lustre/lustre/include/lustre_log.h
@@ -214,6 +214,7 @@ struct llog_handle {
 	spinlock_t		 lgh_hdr_lock; /* protect lgh_hdr data */
 	struct llog_logid	 lgh_id; /* id of this log */
 	struct llog_log_hdr	*lgh_hdr;
+	size_t			 lgh_hdr_size;
 	int			 lgh_last_idx;
 	int			 lgh_cur_idx; /* used during llog_process */
 	__u64			 lgh_cur_offset; /* used during llog_process */
@@ -244,6 +245,11 @@ struct llog_ctxt {
 	struct mutex		 loc_mutex; /* protect loc_imp */
 	atomic_t	     loc_refcount;
 	long		     loc_flags; /* flags, see above defines */
+	/*
+	 * llog chunk size, and llog record size can not be bigger than
+	 * loc_chunk_size
+	 */
+	__u32			loc_chunk_size;
 };
 
 #define LLOG_PROC_BREAK 0x0001
diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
index 43797f1..5c9447e 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog.c
@@ -80,8 +80,7 @@ static void llog_free_handle(struct llog_handle *loghandle)
 		LASSERT(list_empty(&loghandle->u.phd.phd_entry));
 	else if (loghandle->lgh_hdr->llh_flags & LLOG_F_IS_CAT)
 		LASSERT(list_empty(&loghandle->u.chd.chd_head));
-	LASSERT(sizeof(*loghandle->lgh_hdr) == LLOG_CHUNK_SIZE);
-	kfree(loghandle->lgh_hdr);
+	kvfree(loghandle->lgh_hdr);
 out:
 	kfree(loghandle);
 }
@@ -116,19 +115,21 @@ static int llog_read_header(const struct lu_env *env,
 	if (rc == LLOG_EEMPTY) {
 		struct llog_log_hdr *llh = handle->lgh_hdr;
 
+		/* lrh_len should be initialized in llog_init_handle */
 		handle->lgh_last_idx = 0; /* header is record with index 0 */
 		llh->llh_count = 1;	 /* for the header record */
 		llh->llh_hdr.lrh_type = LLOG_HDR_MAGIC;
-		llh->llh_hdr.lrh_len = LLOG_CHUNK_SIZE;
-		llh->llh_tail.lrt_len = LLOG_CHUNK_SIZE;
+		LASSERT(handle->lgh_ctxt->loc_chunk_size >= LLOG_MIN_CHUNK_SIZE);
+		llh->llh_hdr.lrh_len = handle->lgh_ctxt->loc_chunk_size;
 		llh->llh_hdr.lrh_index = 0;
-		llh->llh_tail.lrt_index = 0;
 		llh->llh_timestamp = ktime_get_real_seconds();
 		if (uuid)
 			memcpy(&llh->llh_tgtuuid, uuid,
 			       sizeof(llh->llh_tgtuuid));
 		llh->llh_bitmap_offset = offsetof(typeof(*llh), llh_bitmap);
-		ext2_set_bit(0, llh->llh_bitmap);
+		ext2_set_bit(0, LLOG_HDR_BITMAP(llh));
+		LLOG_HDR_TAIL(llh)->lrt_len = llh->llh_hdr.lrh_len;
+		LLOG_HDR_TAIL(llh)->lrt_index = llh->llh_hdr.lrh_index;
 		rc = 0;
 	}
 	return rc;
@@ -137,16 +138,19 @@ static int llog_read_header(const struct lu_env *env,
 int llog_init_handle(const struct lu_env *env, struct llog_handle *handle,
 		     int flags, struct obd_uuid *uuid)
 {
+	int chunk_size = handle->lgh_ctxt->loc_chunk_size;
 	enum llog_flag fmt = flags & LLOG_F_EXT_MASK;
 	struct llog_log_hdr	*llh;
 	int			 rc;
 
 	LASSERT(!handle->lgh_hdr);
 
-	llh = kzalloc(sizeof(*llh), GFP_NOFS);
+	LASSERT(chunk_size >= LLOG_MIN_CHUNK_SIZE);
+	llh = libcfs_kvzalloc(sizeof(*llh), GFP_NOFS);
 	if (!llh)
 		return -ENOMEM;
 	handle->lgh_hdr = llh;
+	handle->lgh_hdr_size = chunk_size;
 	/* first assign flags to use llog_client_ops */
 	llh->llh_flags = flags;
 	rc = llog_read_header(env, handle, uuid);
@@ -198,7 +202,7 @@ int llog_init_handle(const struct lu_env *env, struct llog_handle *handle,
 	llh->llh_flags |= fmt;
 out:
 	if (rc) {
-		kfree(llh);
+		kvfree(llh);
 		handle->lgh_hdr = NULL;
 	}
 	return rc;
@@ -212,15 +216,20 @@ static int llog_process_thread(void *arg)
 	struct llog_log_hdr		*llh = loghandle->lgh_hdr;
 	struct llog_process_cat_data	*cd  = lpi->lpi_catdata;
 	char				*buf;
-	__u64				 cur_offset = LLOG_CHUNK_SIZE;
+	__u64				 cur_offset;
 	__u64				 last_offset;
+	int chunk_size;
 	int				 rc = 0, index = 1, last_index;
 	int				 saved_index = 0;
 	int				 last_called_index = 0;
 
-	LASSERT(llh);
+	if (!llh)
+		return -EINVAL;
+
+	cur_offset = llh->llh_hdr.lrh_len;
+	chunk_size = llh->llh_hdr.lrh_len;
 
-	buf = kzalloc(LLOG_CHUNK_SIZE, GFP_NOFS);
+	buf = libcfs_kvzalloc(chunk_size, GFP_NOFS);
 	if (!buf) {
 		lpi->lpi_rc = -ENOMEM;
 		return 0;
@@ -233,7 +242,7 @@ static int llog_process_thread(void *arg)
 	if (cd && cd->lpcd_last_idx)
 		last_index = cd->lpcd_last_idx;
 	else
-		last_index = LLOG_BITMAP_BYTES * 8 - 1;
+		last_index = LLOG_HDR_BITMAP_SIZE(llh) - 1;
 
 	/* Record is not in this buffer. */
 	if (index > last_index)
@@ -244,7 +253,7 @@ static int llog_process_thread(void *arg)
 
 		/* skip records not set in bitmap */
 		while (index <= last_index &&
-		       !ext2_test_bit(index, llh->llh_bitmap))
+		       !ext2_test_bit(index, LLOG_HDR_BITMAP(llh)))
 			++index;
 
 		LASSERT(index <= last_index + 1);
@@ -255,10 +264,10 @@ static int llog_process_thread(void *arg)
 		       index, last_index);
 
 		/* get the buf with our target record; avoid old garbage */
-		memset(buf, 0, LLOG_CHUNK_SIZE);
+		memset(buf, 0, chunk_size);
 		last_offset = cur_offset;
 		rc = llog_next_block(lpi->lpi_env, loghandle, &saved_index,
-				     index, &cur_offset, buf, LLOG_CHUNK_SIZE);
+				     index, &cur_offset, buf, chunk_size);
 		if (rc)
 			goto out;
 
@@ -267,7 +276,7 @@ static int llog_process_thread(void *arg)
 		 * swabbing is done at the beginning of the loop.
 		 */
 		for (rec = (struct llog_rec_hdr *)buf;
-		     (char *)rec < buf + LLOG_CHUNK_SIZE;
+		     (char *)rec < buf + chunk_size;
 		     rec = llog_rec_hdr_next(rec)) {
 			CDEBUG(D_OTHER, "processing rec 0x%p type %#x\n",
 			       rec, rec->lrh_type);
@@ -285,8 +294,7 @@ static int llog_process_thread(void *arg)
 					goto repeat;
 				goto out; /* no more records */
 			}
-			if (rec->lrh_len == 0 ||
-			    rec->lrh_len > LLOG_CHUNK_SIZE) {
+			if (!rec->lrh_len || rec->lrh_len > chunk_size) {
 				CWARN("invalid length %d in llog record for index %d/%d\n",
 				      rec->lrh_len,
 				      rec->lrh_index, index);
@@ -303,14 +311,14 @@ static int llog_process_thread(void *arg)
 			CDEBUG(D_OTHER,
 			       "lrh_index: %d lrh_len: %d (%d remains)\n",
 			       rec->lrh_index, rec->lrh_len,
-			       (int)(buf + LLOG_CHUNK_SIZE - (char *)rec));
+			       (int)(buf + chunk_size - (char *)rec));
 
 			loghandle->lgh_cur_idx = rec->lrh_index;
 			loghandle->lgh_cur_offset = (char *)rec - (char *)buf +
 						    last_offset;
 
 			/* if set, process the callback on this record */
-			if (ext2_test_bit(index, llh->llh_bitmap)) {
+			if (ext2_test_bit(index, LLOG_HDR_BITMAP(llh))) {
 				rc = lpi->lpi_cb(lpi->lpi_env, loghandle, rec,
 						 lpi->lpi_cbdata);
 				last_called_index = index;
diff --git a/drivers/staging/lustre/lustre/obdclass/llog_obd.c b/drivers/staging/lustre/lustre/obdclass/llog_obd.c
index a4277d6..8574ad4 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog_obd.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog_obd.c
@@ -158,6 +158,7 @@ int llog_setup(const struct lu_env *env, struct obd_device *obd,
 	mutex_init(&ctxt->loc_mutex);
 	ctxt->loc_exp = class_export_get(disk_obd->obd_self_export);
 	ctxt->loc_flags = LLOG_CTXT_FLAG_UNINITIALIZED;
+	ctxt->loc_chunk_size = LLOG_MIN_CHUNK_SIZE;
 
 	rc = llog_group_set_ctxt(olg, ctxt, index);
 	if (rc) {
diff --git a/drivers/staging/lustre/lustre/obdclass/llog_swab.c b/drivers/staging/lustre/lustre/obdclass/llog_swab.c
index 8c4c1b3..7869092 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog_swab.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog_swab.c
@@ -244,7 +244,7 @@ void lustre_swab_llog_rec(struct llog_rec_hdr *rec)
 		__swab32s(&llh->llh_flags);
 		__swab32s(&llh->llh_size);
 		__swab32s(&llh->llh_cat_idx);
-		tail = &llh->llh_tail;
+		tail = LLOG_HDR_TAIL(llh);
 		break;
 	}
 	case LLOG_LOGID_MAGIC:
@@ -290,8 +290,10 @@ static void print_llog_hdr(struct llog_log_hdr *h)
 	CDEBUG(D_OTHER, "\tllh_flags: %#x\n", h->llh_flags);
 	CDEBUG(D_OTHER, "\tllh_size: %#x\n", h->llh_size);
 	CDEBUG(D_OTHER, "\tllh_cat_idx: %#x\n", h->llh_cat_idx);
-	CDEBUG(D_OTHER, "\tllh_tail.lrt_index: %#x\n", h->llh_tail.lrt_index);
-	CDEBUG(D_OTHER, "\tllh_tail.lrt_len: %#x\n", h->llh_tail.lrt_len);
+	CDEBUG(D_OTHER, "\tllh_tail.lrt_index: %#x\n",
+	       LLOG_HDR_TAIL(h)->lrt_index);
+	CDEBUG(D_OTHER, "\tllh_tail.lrt_len: %#x\n",
+	       LLOG_HDR_TAIL(h)->lrt_len);
 }
 
 void lustre_swab_llog_hdr(struct llog_log_hdr *h)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/llog_client.c b/drivers/staging/lustre/lustre/ptlrpc/llog_client.c
index 0f55c01..110d9f5 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/llog_client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/llog_client.c
@@ -287,8 +287,13 @@ static int llog_client_read_header(const struct lu_env *env,
 		goto out;
 	}
 
-	memcpy(handle->lgh_hdr, hdr, sizeof(*hdr));
-	handle->lgh_last_idx = handle->lgh_hdr->llh_tail.lrt_index;
+	if (handle->lgh_hdr_size < hdr->llh_hdr.lrh_len) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	memcpy(handle->lgh_hdr, hdr, hdr->llh_hdr.lrh_len);
+	handle->lgh_last_idx = LLOG_HDR_TAIL(handle->lgh_hdr)->lrt_index;
 
 	/* sanity checks */
 	llh_hdr = &handle->lgh_hdr->llh_hdr;
@@ -296,9 +301,14 @@ static int llog_client_read_header(const struct lu_env *env,
 		CERROR("bad log header magic: %#x (expecting %#x)\n",
 		       llh_hdr->lrh_type, LLOG_HDR_MAGIC);
 		rc = -EIO;
-	} else if (llh_hdr->lrh_len != LLOG_CHUNK_SIZE) {
-		CERROR("incorrectly sized log header: %#x (expecting %#x)\n",
-		       llh_hdr->lrh_len, LLOG_CHUNK_SIZE);
+	} else if (llh_hdr->lrh_len !=
+		   LLOG_HDR_TAIL(handle->lgh_hdr)->lrt_len ||
+		   (llh_hdr->lrh_len & (llh_hdr->lrh_len - 1)) ||
+		   llh_hdr->lrh_len < LLOG_MIN_CHUNK_SIZE ||
+		   llh_hdr->lrh_len > handle->lgh_hdr_size) {
+		CERROR("incorrectly sized log header: %#x (expecting %#x) (power of two > 8192)\n",
+		       llh_hdr->lrh_len,
+		       LLOG_HDR_TAIL(handle->lgh_hdr)->lrt_len);
 		CERROR("you may need to re-run lconf --write_conf.\n");
 		rc = -EIO;
 	}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 05/29] staging: lustre: lmv: allow cross-MDT rename and link
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (3 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 04/29] staging: lustre: obdclass: variable llog chunk size James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 06/29] staging: lustre: ptlrpc: Introduce iovec to bulk descriptor James Simmons
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

Remove checks for cross-MDT operation, so cross-MDT
rename and link will be allowed.

Remove obsolete locality parameters in MDT lock API after all of
cross-MDT operations are allowed.

This is the client side changed only so the upstream
client can support this.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3537
Reviewed-on: http://review.whamcloud.com/12282
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c |   77 ++++++++++++++++++++-------
 1 files changed, 57 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 679cd87..7a8d1c8 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1903,7 +1903,10 @@ static int lmv_rename(struct obd_export *exp, struct md_op_data *op_data,
 {
 	struct obd_device       *obd = exp->exp_obd;
 	struct lmv_obd	  *lmv = &obd->u.lmv;
+	struct obd_export *target_exp;
 	struct lmv_tgt_desc     *src_tgt;
+	struct lmv_tgt_desc *tgt_tgt;
+	struct mdt_body *body;
 	int			rc;
 
 	LASSERT(oldlen != 0);
@@ -1943,6 +1946,10 @@ static int lmv_rename(struct obd_export *exp, struct md_op_data *op_data,
 		if (rc)
 			return rc;
 		src_tgt = lmv_find_target(lmv, &op_data->op_fid3);
+		if (IS_ERR(src_tgt))
+			return PTR_ERR(src_tgt);
+
+		target_exp = src_tgt->ltd_exp;
 	} else {
 		if (op_data->op_mea1) {
 			struct lmv_stripe_md *lsm = op_data->op_mea1;
@@ -1951,29 +1958,27 @@ static int lmv_rename(struct obd_export *exp, struct md_op_data *op_data,
 							     oldlen,
 							     &op_data->op_fid1,
 							     &op_data->op_mds);
-			if (IS_ERR(src_tgt))
-				return PTR_ERR(src_tgt);
 		} else {
 			src_tgt = lmv_find_target(lmv, &op_data->op_fid1);
-			if (IS_ERR(src_tgt))
-				return PTR_ERR(src_tgt);
-
-			op_data->op_mds = src_tgt->ltd_idx;
 		}
+		if (IS_ERR(src_tgt))
+			return PTR_ERR(src_tgt);
 
 		if (op_data->op_mea2) {
 			struct lmv_stripe_md *lsm = op_data->op_mea2;
-			const struct lmv_oinfo *oinfo;
 
-			oinfo = lsm_name_to_stripe_info(lsm, new, newlen);
-			if (IS_ERR(oinfo))
-				return PTR_ERR(oinfo);
-
-			op_data->op_fid2 = oinfo->lmo_fid;
+			tgt_tgt = lmv_locate_target_for_name(lmv, lsm, new,
+							     newlen,
+							     &op_data->op_fid2,
+							     &op_data->op_mds);
+		} else {
+			tgt_tgt = lmv_find_target(lmv, &op_data->op_fid2);
 		}
+		if (IS_ERR(tgt_tgt))
+			return PTR_ERR(tgt_tgt);
+
+		target_exp = tgt_tgt->ltd_exp;
 	}
-	if (IS_ERR(src_tgt))
-		return PTR_ERR(src_tgt);
 
 	/*
 	 * LOOKUP lock on src child (fid3) should also be cancelled for
@@ -2014,20 +2019,52 @@ static int lmv_rename(struct obd_export *exp, struct md_op_data *op_data,
 			return rc;
 	}
 
+retry_rename:
 	/*
 	 * Cancel all the locks on tgt child (fid4).
 	 */
-	if (fid_is_sane(&op_data->op_fid4))
+	if (fid_is_sane(&op_data->op_fid4)) {
+		struct lmv_tgt_desc *tgt;
+
 		rc = lmv_early_cancel(exp, NULL, op_data, src_tgt->ltd_idx,
 				      LCK_EX, MDS_INODELOCK_FULL,
 				      MF_MDC_CANCEL_FID4);
+		if (rc)
+			return rc;
 
-	CDEBUG(D_INODE, DFID":m%d to "DFID"\n", PFID(&op_data->op_fid1),
-	       op_data->op_mds, PFID(&op_data->op_fid2));
+		tgt = lmv_find_target(lmv, &op_data->op_fid4);
+		if (IS_ERR(tgt))
+			return PTR_ERR(tgt);
 
-	rc = md_rename(src_tgt->ltd_exp, op_data, old, oldlen,
-		       new, newlen, request);
-	return rc;
+		/*
+		 * Since the target child might be destroyed, and it might
+		 * become orphan, and we can only check orphan on the local
+		 * MDT right now, so we send rename request to the MDT where
+		 * target child is located. If target child does not exist,
+		 * then it will send the request to the target parent
+		 */
+		target_exp = tgt->ltd_exp;
+	}
+
+	rc = md_rename(target_exp, op_data, old, oldlen, new, newlen, request);
+	if (rc && rc != -EREMOTE)
+		return rc;
+
+	body = req_capsule_server_get(&(*request)->rq_pill, &RMF_MDT_BODY);
+	if (!body)
+		return -EPROTO;
+
+	/* Not cross-ref case, just get out of here. */
+	if (likely(!(body->mbo_valid & OBD_MD_MDS)))
+		return rc;
+
+	CDEBUG(D_INODE, "%s: try rename to another MDT for " DFID "\n",
+	       exp->exp_obd->obd_name, PFID(&body->mbo_fid1));
+
+	op_data->op_fid4 = body->mbo_fid1;
+	ptlrpc_req_finished(*request);
+	*request = NULL;
+	goto retry_rename;
 }
 
 static int lmv_setattr(struct obd_export *exp, struct md_op_data *op_data,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 06/29] staging: lustre: ptlrpc: Introduce iovec to bulk descriptor
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (4 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 05/29] staging: lustre: lmv: allow cross-MDT rename and link James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 07/29] staging: lustre: lov: remove LSM from struct lustre_md James Simmons
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Amir Shehata,
	James Simmons

From: Amir Shehata <amir.shehata@intel.com>

Added a union to ptlrpc_bulk_desc for KVEC and KIOV buffers.
bd_type has been changed to be a bit mask. Bits are set in
bd_type to specify {put,get}{source,sink}{kvec,kiov}
changed all instances in the code to access the union properly
ASSUMPTION: all current code only works with KIOV and new DNE code
to be added will introduce a different code path that uses IOVEC

As part of the implementation buffer operations are added
as callbacks to be defined by users of ptlrpc.  This enables
buffer users to define how to allocate and release buffers.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5835
Reviewed-on: http://review.whamcloud.com/12525
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_net.h |  160 ++++++++++++++++---
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    8 +-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    8 +-
 drivers/staging/lustre/lustre/osc/osc_page.c       |    4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |    7 +-
 drivers/staging/lustre/lustre/ptlrpc/client.c      |  121 +++++++++++----
 drivers/staging/lustre/lustre/ptlrpc/events.c      |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |    7 +-
 drivers/staging/lustre/lustre/ptlrpc/pers.c        |   31 ++--
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |    9 +-
 drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c    |   19 ++-
 drivers/staging/lustre/lustre/ptlrpc/sec_plain.c   |   20 ++-
 12 files changed, 290 insertions(+), 108 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 7302238..67a7095 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -50,6 +50,7 @@
  * @{
  */
 
+#include <linux/uio.h>
 #include "../../include/linux/libcfs/libcfs.h"
 #include "../../include/linux/lnet/nidstr.h"
 #include "../../include/linux/lnet/api.h"
@@ -1083,10 +1084,93 @@ struct ptlrpc_bulk_page {
 	struct page     *bp_page;
 };
 
-#define BULK_GET_SOURCE   0
-#define BULK_PUT_SINK     1
-#define BULK_GET_SINK     2
-#define BULK_PUT_SOURCE   3
+enum ptlrpc_bulk_op_type {
+	PTLRPC_BULK_OP_ACTIVE	= 0x00000001,
+	PTLRPC_BULK_OP_PASSIVE	= 0x00000002,
+	PTLRPC_BULK_OP_PUT	= 0x00000004,
+	PTLRPC_BULK_OP_GET	= 0x00000008,
+	PTLRPC_BULK_BUF_KVEC	= 0x00000010,
+	PTLRPC_BULK_BUF_KIOV	= 0x00000020,
+	PTLRPC_BULK_GET_SOURCE	= PTLRPC_BULK_OP_PASSIVE | PTLRPC_BULK_OP_GET,
+	PTLRPC_BULK_PUT_SINK	= PTLRPC_BULK_OP_PASSIVE | PTLRPC_BULK_OP_PUT,
+	PTLRPC_BULK_GET_SINK	= PTLRPC_BULK_OP_ACTIVE | PTLRPC_BULK_OP_GET,
+	PTLRPC_BULK_PUT_SOURCE	= PTLRPC_BULK_OP_ACTIVE | PTLRPC_BULK_OP_PUT,
+};
+
+static inline bool ptlrpc_is_bulk_op_get(enum ptlrpc_bulk_op_type type)
+{
+	return (type & PTLRPC_BULK_OP_GET) == PTLRPC_BULK_OP_GET;
+}
+
+static inline bool ptlrpc_is_bulk_get_source(enum ptlrpc_bulk_op_type type)
+{
+	return (type & PTLRPC_BULK_GET_SOURCE) == PTLRPC_BULK_GET_SOURCE;
+}
+
+static inline bool ptlrpc_is_bulk_put_sink(enum ptlrpc_bulk_op_type type)
+{
+	return (type & PTLRPC_BULK_PUT_SINK) == PTLRPC_BULK_PUT_SINK;
+}
+
+static inline bool ptlrpc_is_bulk_get_sink(enum ptlrpc_bulk_op_type type)
+{
+	return (type & PTLRPC_BULK_GET_SINK) == PTLRPC_BULK_GET_SINK;
+}
+
+static inline bool ptlrpc_is_bulk_put_source(enum ptlrpc_bulk_op_type type)
+{
+	return (type & PTLRPC_BULK_PUT_SOURCE) == PTLRPC_BULK_PUT_SOURCE;
+}
+
+static inline bool ptlrpc_is_bulk_desc_kvec(enum ptlrpc_bulk_op_type type)
+{
+	return ((type & PTLRPC_BULK_BUF_KVEC) | (type & PTLRPC_BULK_BUF_KIOV))
+		== PTLRPC_BULK_BUF_KVEC;
+}
+
+static inline bool ptlrpc_is_bulk_desc_kiov(enum ptlrpc_bulk_op_type type)
+{
+	return ((type & PTLRPC_BULK_BUF_KVEC) | (type & PTLRPC_BULK_BUF_KIOV))
+		== PTLRPC_BULK_BUF_KIOV;
+}
+
+static inline bool ptlrpc_is_bulk_op_active(enum ptlrpc_bulk_op_type type)
+{
+	return ((type & PTLRPC_BULK_OP_ACTIVE) |
+		(type & PTLRPC_BULK_OP_PASSIVE)) == PTLRPC_BULK_OP_ACTIVE;
+}
+
+static inline bool ptlrpc_is_bulk_op_passive(enum ptlrpc_bulk_op_type type)
+{
+	return ((type & PTLRPC_BULK_OP_ACTIVE) |
+		(type & PTLRPC_BULK_OP_PASSIVE)) == PTLRPC_BULK_OP_PASSIVE;
+}
+
+struct ptlrpc_bulk_frag_ops {
+	/**
+	 * Add a page \a page to the bulk descriptor \a desc
+	 * Data to transfer in the page starts at offset \a pageoffset and
+	 * amount of data to transfer from the page is \a len
+	 */
+	void (*add_kiov_frag)(struct ptlrpc_bulk_desc *desc,
+			      struct page *page, int pageoffset, int len);
+
+	/*
+	 * Add a \a fragment to the bulk descriptor \a desc.
+	 * Data to transfer in the fragment is pointed to by \a frag
+	 * The size of the fragment is \a len
+	 */
+	int (*add_iov_frag)(struct ptlrpc_bulk_desc *desc, void *frag, int len);
+
+	/**
+	 * Uninitialize and free bulk descriptor \a desc.
+	 * Works on bulk descriptors both from server and client side.
+	 */
+	void (*release_frags)(struct ptlrpc_bulk_desc *desc);
+};
+
+extern const struct ptlrpc_bulk_frag_ops ptlrpc_bulk_kiov_pin_ops;
+extern const struct ptlrpc_bulk_frag_ops ptlrpc_bulk_kiov_nopin_ops;
 
 /**
  * Definition of bulk descriptor.
@@ -1101,14 +1185,14 @@ struct ptlrpc_bulk_page {
 struct ptlrpc_bulk_desc {
 	/** completed with failure */
 	unsigned long bd_failure:1;
-	/** {put,get}{source,sink} */
-	unsigned long bd_type:2;
 	/** client side */
 	unsigned long bd_registered:1;
 	/** For serialization with callback */
 	spinlock_t bd_lock;
 	/** Import generation when request for this bulk was sent */
 	int bd_import_generation;
+	/** {put,get}{source,sink}{kvec,kiov} */
+	enum ptlrpc_bulk_op_type bd_type;
 	/** LNet portal for this bulk */
 	__u32 bd_portal;
 	/** Server side - export this bulk created for */
@@ -1117,6 +1201,7 @@ struct ptlrpc_bulk_desc {
 	struct obd_import *bd_import;
 	/** Back pointer to the request */
 	struct ptlrpc_request *bd_req;
+	struct ptlrpc_bulk_frag_ops *bd_frag_ops;
 	wait_queue_head_t	    bd_waitq;	/* server side only WQ */
 	int		    bd_iov_count;    /* # entries in bd_iov */
 	int		    bd_max_iov;      /* allocated size of bd_iov */
@@ -1132,14 +1217,31 @@ struct ptlrpc_bulk_desc {
 	/** array of associated MDs */
 	lnet_handle_md_t	bd_mds[PTLRPC_BULK_OPS_COUNT];
 
-	/*
-	 * encrypt iov, size is either 0 or bd_iov_count.
-	 */
-	lnet_kiov_t	   *bd_enc_iov;
-
-	lnet_kiov_t	    bd_iov[0];
+	union {
+		struct {
+			/*
+			 * encrypt iov, size is either 0 or bd_iov_count.
+			 */
+			struct bio_vec *bd_enc_vec;
+			struct bio_vec *bd_vec;	/* Array of bio_vecs */
+		} bd_kiov;
+
+		struct {
+			struct kvec *bd_enc_kvec;
+			struct kvec *bd_kvec;	/* Array of kvecs */
+		} bd_kvec;
+	} bd_u;
 };
 
+#define GET_KIOV(desc)			((desc)->bd_u.bd_kiov.bd_vec)
+#define BD_GET_KIOV(desc, i)		((desc)->bd_u.bd_kiov.bd_vec[i])
+#define GET_ENC_KIOV(desc)		((desc)->bd_u.bd_kiov.bd_enc_vec)
+#define BD_GET_ENC_KIOV(desc, i)	((desc)->bd_u.bd_kiov.bd_enc_vec[i])
+#define GET_KVEC(desc)			((desc)->bd_u.bd_kvec.bd_kvec)
+#define BD_GET_KVEC(desc, i)		((desc)->bd_u.bd_kvec.bd_kvec[i])
+#define GET_ENC_KVEC(desc)		((desc)->bd_u.bd_kvec.bd_enc_kvec)
+#define BD_GET_ENC_KVEC(desc, i)	((desc)->bd_u.bd_kvec.bd_enc_kvec[i])
+
 enum {
 	SVC_STOPPED     = 1 << 0,
 	SVC_STOPPING    = 1 << 1,
@@ -1754,21 +1856,17 @@ int ptlrpc_request_bufs_pack(struct ptlrpc_request *request,
 void ptlrpc_req_finished(struct ptlrpc_request *request);
 struct ptlrpc_request *ptlrpc_request_addref(struct ptlrpc_request *req);
 struct ptlrpc_bulk_desc *ptlrpc_prep_bulk_imp(struct ptlrpc_request *req,
-					      unsigned npages, unsigned max_brw,
-					      unsigned type, unsigned portal);
-void __ptlrpc_free_bulk(struct ptlrpc_bulk_desc *bulk, int pin);
-static inline void ptlrpc_free_bulk_pin(struct ptlrpc_bulk_desc *bulk)
-{
-	__ptlrpc_free_bulk(bulk, 1);
-}
-
-static inline void ptlrpc_free_bulk_nopin(struct ptlrpc_bulk_desc *bulk)
-{
-	__ptlrpc_free_bulk(bulk, 0);
-}
-
+					      unsigned int nfrags,
+					      unsigned int max_brw,
+					      unsigned int type,
+					      unsigned int portal,
+					      const struct ptlrpc_bulk_frag_ops *ops);
+
+int ptlrpc_prep_bulk_frag(struct ptlrpc_bulk_desc *desc,
+			  void *frag, int len);
 void __ptlrpc_prep_bulk_page(struct ptlrpc_bulk_desc *desc,
-			     struct page *page, int pageoffset, int len, int);
+			     struct page *page, int pageoffset, int len,
+			     int pin);
 static inline void ptlrpc_prep_bulk_page_pin(struct ptlrpc_bulk_desc *desc,
 					     struct page *page, int pageoffset,
 					     int len)
@@ -1783,6 +1881,16 @@ static inline void ptlrpc_prep_bulk_page_nopin(struct ptlrpc_bulk_desc *desc,
 	__ptlrpc_prep_bulk_page(desc, page, pageoffset, len, 0);
 }
 
+void ptlrpc_free_bulk(struct ptlrpc_bulk_desc *bulk);
+
+static inline void ptlrpc_release_bulk_page_pin(struct ptlrpc_bulk_desc *desc)
+{
+	int i;
+
+	for (i = 0; i < desc->bd_iov_count ; i++)
+		put_page(BD_GET_KIOV(desc, i).bv_page);
+}
+
 void ptlrpc_retain_replayable_request(struct ptlrpc_request *req,
 				      struct obd_import *imp);
 __u64 ptlrpc_next_xid(void);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 5282c67..4c30ab6 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -859,8 +859,10 @@ static int mdc_getpage(struct obd_export *exp, const struct lu_fid *fid,
 	req->rq_request_portal = MDS_READPAGE_PORTAL;
 	ptlrpc_at_set_req_timeout(req);
 
-	desc = ptlrpc_prep_bulk_imp(req, npages, 1, BULK_PUT_SINK,
-				    MDS_BULK_PORTAL);
+	desc = ptlrpc_prep_bulk_imp(req, npages, 1,
+				    PTLRPC_BULK_PUT_SINK | PTLRPC_BULK_BUF_KIOV,
+				    MDS_BULK_PORTAL,
+				    &ptlrpc_bulk_kiov_pin_ops);
 	if (!desc) {
 		ptlrpc_request_free(req);
 		return -ENOMEM;
@@ -868,7 +870,7 @@ static int mdc_getpage(struct obd_export *exp, const struct lu_fid *fid,
 
 	/* NB req now owns desc and will free it when it gets freed */
 	for (i = 0; i < npages; i++)
-		ptlrpc_prep_bulk_page_pin(desc, pages[i], 0, PAGE_SIZE);
+		desc->bd_frag_ops->add_kiov_frag(desc, pages[i], 0, PAGE_SIZE);
 
 	mdc_readdir_pack(req, offset, PAGE_SIZE * npages, fid);
 
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 7b5ac44..2d6fdd0 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -1386,15 +1386,17 @@ static int mgc_process_recover_log(struct obd_device *obd,
 	body->mcb_units  = nrpages;
 
 	/* allocate bulk transfer descriptor */
-	desc = ptlrpc_prep_bulk_imp(req, nrpages, 1, BULK_PUT_SINK,
-				    MGS_BULK_PORTAL);
+	desc = ptlrpc_prep_bulk_imp(req, nrpages, 1,
+				    PTLRPC_BULK_PUT_SINK | PTLRPC_BULK_BUF_KIOV,
+				    MGS_BULK_PORTAL,
+				    &ptlrpc_bulk_kiov_pin_ops);
 	if (!desc) {
 		rc = -ENOMEM;
 		goto out;
 	}
 
 	for (i = 0; i < nrpages; i++)
-		ptlrpc_prep_bulk_page_pin(desc, pages[i], 0, PAGE_SIZE);
+		desc->bd_frag_ops->add_kiov_frag(desc, pages[i], 0, PAGE_SIZE);
 
 	ptlrpc_request_set_replen(req);
 	rc = ptlrpc_queue_wait(req);
diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c b/drivers/staging/lustre/lustre/osc/osc_page.c
index 399d36b..2fb0e53 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -776,8 +776,10 @@ static inline void unstable_page_accounting(struct ptlrpc_bulk_desc *desc,
 	int count = 0;
 	int i;
 
+	LASSERT(ptlrpc_is_bulk_desc_kiov(desc->bd_type));
+
 	for (i = 0; i < page_count; i++) {
-		pg_data_t *pgdat = page_pgdat(desc->bd_iov[i].bv_page);
+		pg_data_t *pgdat = page_pgdat(BD_GET_KIOV(desc, i).bv_page);
 
 		if (likely(pgdat == last)) {
 			++count;
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index efd938a..c570d19 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -1030,8 +1030,9 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli,
 
 	desc = ptlrpc_prep_bulk_imp(req, page_count,
 		cli->cl_import->imp_connect_data.ocd_brw_size >> LNET_MTU_BITS,
-		opc == OST_WRITE ? BULK_GET_SOURCE : BULK_PUT_SINK,
-		OST_BULK_PORTAL);
+		(opc == OST_WRITE ? PTLRPC_BULK_GET_SOURCE :
+		 PTLRPC_BULK_PUT_SINK) | PTLRPC_BULK_BUF_KIOV, OST_BULK_PORTAL,
+		 &ptlrpc_bulk_kiov_pin_ops);
 
 	if (!desc) {
 		rc = -ENOMEM;
@@ -1079,7 +1080,7 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli,
 		LASSERT((pga[0]->flag & OBD_BRW_SRVLOCK) ==
 			(pg->flag & OBD_BRW_SRVLOCK));
 
-		ptlrpc_prep_bulk_page_pin(desc, pg->pg, poff, pg->count);
+		desc->bd_frag_ops->add_kiov_frag(desc, pg->pg, poff, pg->count);
 		requested_nob += pg->count;
 
 		if (i > 0 && can_merge_pages(pg_prev, pg)) {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index fa4d3c9..e4a31eb 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -43,6 +43,18 @@
 
 #include "ptlrpc_internal.h"
 
+const struct ptlrpc_bulk_frag_ops ptlrpc_bulk_kiov_pin_ops = {
+	.add_kiov_frag	= ptlrpc_prep_bulk_page_pin,
+	.release_frags	= ptlrpc_release_bulk_page_pin,
+};
+EXPORT_SYMBOL(ptlrpc_bulk_kiov_pin_ops);
+
+const struct ptlrpc_bulk_frag_ops ptlrpc_bulk_kiov_nopin_ops = {
+	.add_kiov_frag	= ptlrpc_prep_bulk_page_nopin,
+	.release_frags	= NULL,
+};
+EXPORT_SYMBOL(ptlrpc_bulk_kiov_nopin_ops);
+
 static int ptlrpc_send_new_req(struct ptlrpc_request *req);
 static int ptlrpcd_check_work(struct ptlrpc_request *req);
 static int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async);
@@ -95,24 +107,43 @@ struct ptlrpc_connection *ptlrpc_uuid_to_connection(struct obd_uuid *uuid)
  * Allocate and initialize new bulk descriptor on the sender.
  * Returns pointer to the descriptor or NULL on error.
  */
-struct ptlrpc_bulk_desc *ptlrpc_new_bulk(unsigned npages, unsigned max_brw,
-					 unsigned type, unsigned portal)
+struct ptlrpc_bulk_desc *ptlrpc_new_bulk(unsigned int nfrags,
+					 unsigned int max_brw,
+					 enum ptlrpc_bulk_op_type type,
+					 unsigned int portal,
+					 const struct ptlrpc_bulk_frag_ops *ops)
 {
 	struct ptlrpc_bulk_desc *desc;
 	int i;
 
-	desc = kzalloc(offsetof(struct ptlrpc_bulk_desc, bd_iov[npages]),
-		       GFP_NOFS);
+	/* ensure that only one of KIOV or IOVEC is set but not both */
+	LASSERT((ptlrpc_is_bulk_desc_kiov(type) && ops->add_kiov_frag) ||
+		(ptlrpc_is_bulk_desc_kvec(type) && ops->add_iov_frag));
+
+	desc = kzalloc(sizeof(*desc), GFP_NOFS);
 	if (!desc)
 		return NULL;
 
+	if (type & PTLRPC_BULK_BUF_KIOV) {
+		GET_KIOV(desc) = kcalloc(nfrags, sizeof(*GET_KIOV(desc)),
+					 GFP_NOFS);
+		if (!GET_KIOV(desc))
+			goto out;
+	} else {
+		GET_KVEC(desc) = kcalloc(nfrags, sizeof(*GET_KVEC(desc)),
+					 GFP_NOFS);
+		if (!GET_KVEC(desc))
+			goto out;
+	}
+
 	spin_lock_init(&desc->bd_lock);
 	init_waitqueue_head(&desc->bd_waitq);
-	desc->bd_max_iov = npages;
+	desc->bd_max_iov = nfrags;
 	desc->bd_iov_count = 0;
 	desc->bd_portal = portal;
 	desc->bd_type = type;
 	desc->bd_md_count = 0;
+	desc->bd_frag_ops = (struct ptlrpc_bulk_frag_ops *)ops;
 	LASSERT(max_brw > 0);
 	desc->bd_md_max_brw = min(max_brw, PTLRPC_BULK_OPS_COUNT);
 	/*
@@ -123,24 +154,30 @@ struct ptlrpc_bulk_desc *ptlrpc_new_bulk(unsigned npages, unsigned max_brw,
 		LNetInvalidateHandle(&desc->bd_mds[i]);
 
 	return desc;
+out:
+	return NULL;
 }
 
 /**
  * Prepare bulk descriptor for specified outgoing request \a req that
- * can fit \a npages * pages. \a type is bulk type. \a portal is where
+ * can fit \a nfrags * pages. \a type is bulk type. \a portal is where
  * the bulk to be sent. Used on client-side.
  * Returns pointer to newly allocated initialized bulk descriptor or NULL on
  * error.
  */
 struct ptlrpc_bulk_desc *ptlrpc_prep_bulk_imp(struct ptlrpc_request *req,
-					      unsigned npages, unsigned max_brw,
-					      unsigned type, unsigned portal)
+					      unsigned int nfrags,
+					      unsigned int max_brw,
+					      unsigned int type,
+					      unsigned int portal,
+					      const struct ptlrpc_bulk_frag_ops *ops)
 {
 	struct obd_import *imp = req->rq_import;
 	struct ptlrpc_bulk_desc *desc;
 
-	LASSERT(type == BULK_PUT_SINK || type == BULK_GET_SOURCE);
-	desc = ptlrpc_new_bulk(npages, max_brw, type, portal);
+	LASSERT(ptlrpc_is_bulk_op_passive(type));
+
+	desc = ptlrpc_new_bulk(nfrags, max_brw, type, portal, ops);
 	if (!desc)
 		return NULL;
 
@@ -158,56 +195,82 @@ struct ptlrpc_bulk_desc *ptlrpc_prep_bulk_imp(struct ptlrpc_request *req,
 }
 EXPORT_SYMBOL(ptlrpc_prep_bulk_imp);
 
-/**
- * Add a page \a page to the bulk descriptor \a desc.
- * Data to transfer in the page starts at offset \a pageoffset and
- * amount of data to transfer from the page is \a len
- */
 void __ptlrpc_prep_bulk_page(struct ptlrpc_bulk_desc *desc,
 			     struct page *page, int pageoffset, int len, int pin)
 {
+	struct bio_vec *kiov;
+
 	LASSERT(desc->bd_iov_count < desc->bd_max_iov);
 	LASSERT(page);
 	LASSERT(pageoffset >= 0);
 	LASSERT(len > 0);
 	LASSERT(pageoffset + len <= PAGE_SIZE);
+	LASSERT(ptlrpc_is_bulk_desc_kiov(desc->bd_type));
+
+	kiov = &BD_GET_KIOV(desc, desc->bd_iov_count);
 
 	desc->bd_nob += len;
 
 	if (pin)
 		get_page(page);
 
-	ptlrpc_add_bulk_page(desc, page, pageoffset, len);
+	kiov->bv_page = page;
+	kiov->bv_offset = pageoffset;
+	kiov->bv_len = len;
+
+	desc->bd_iov_count++;
 }
 EXPORT_SYMBOL(__ptlrpc_prep_bulk_page);
 
-/**
- * Uninitialize and free bulk descriptor \a desc.
- * Works on bulk descriptors both from server and client side.
- */
-void __ptlrpc_free_bulk(struct ptlrpc_bulk_desc *desc, int unpin)
+int ptlrpc_prep_bulk_frag(struct ptlrpc_bulk_desc *desc,
+			  void *frag, int len)
 {
-	int i;
+	struct kvec *iovec;
+
+	LASSERT(desc->bd_iov_count < desc->bd_max_iov);
+	LASSERT(frag);
+	LASSERT(len > 0);
+	LASSERT(ptlrpc_is_bulk_desc_kvec(desc->bd_type));
+
+	iovec = &BD_GET_KVEC(desc, desc->bd_iov_count);
+
+	desc->bd_nob += len;
+
+	iovec->iov_base = frag;
+	iovec->iov_len = len;
 
+	desc->bd_iov_count++;
+
+	return desc->bd_nob;
+}
+EXPORT_SYMBOL(ptlrpc_prep_bulk_frag);
+
+void ptlrpc_free_bulk(struct ptlrpc_bulk_desc *desc)
+{
 	LASSERT(desc->bd_iov_count != LI_POISON); /* not freed already */
 	LASSERT(desc->bd_md_count == 0);	 /* network hands off */
 	LASSERT((desc->bd_export != NULL) ^ (desc->bd_import != NULL));
+	LASSERT(desc->bd_frag_ops);
 
-	sptlrpc_enc_pool_put_pages(desc);
+	if (ptlrpc_is_bulk_desc_kiov(desc->bd_type))
+		sptlrpc_enc_pool_put_pages(desc);
 
 	if (desc->bd_export)
 		class_export_put(desc->bd_export);
 	else
 		class_import_put(desc->bd_import);
 
-	if (unpin) {
-		for (i = 0; i < desc->bd_iov_count; i++)
-			put_page(desc->bd_iov[i].bv_page);
-	}
+	if (desc->bd_frag_ops->release_frags)
+		desc->bd_frag_ops->release_frags(desc);
+
+	if (ptlrpc_is_bulk_desc_kiov(desc->bd_type))
+		kfree(GET_KIOV(desc));
+	else
+		kfree(GET_KVEC(desc));
 
 	kfree(desc);
 }
-EXPORT_SYMBOL(__ptlrpc_free_bulk);
+EXPORT_SYMBOL(ptlrpc_free_bulk);
 
 /**
  * Set server timelimit for this req, i.e. how long are we willing to wait
@@ -2266,7 +2329,7 @@ static void __ptlrpc_free_req(struct ptlrpc_request *request, int locked)
 		request->rq_import = NULL;
 	}
 	if (request->rq_bulk)
-		ptlrpc_free_bulk_pin(request->rq_bulk);
+		ptlrpc_free_bulk(request->rq_bulk);
 
 	if (request->rq_reqbuf || request->rq_clrbuf)
 		sptlrpc_cli_free_reqbuf(request);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c b/drivers/staging/lustre/lustre/ptlrpc/events.c
index 283dfb2..49f3e63 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/events.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/events.c
@@ -182,9 +182,9 @@ void client_bulk_callback(lnet_event_t *ev)
 	struct ptlrpc_bulk_desc *desc = cbid->cbid_arg;
 	struct ptlrpc_request *req;
 
-	LASSERT((desc->bd_type == BULK_PUT_SINK &&
+	LASSERT((ptlrpc_is_bulk_put_sink(desc->bd_type) &&
 		 ev->type == LNET_EVENT_PUT) ||
-		(desc->bd_type == BULK_GET_SOURCE &&
+		(ptlrpc_is_bulk_get_source(desc->bd_type) &&
 		 ev->type == LNET_EVENT_GET) ||
 		ev->type == LNET_EVENT_UNLINK);
 	LASSERT(ev->unlinked);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index 9c93739..c2dd948 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -127,8 +127,7 @@ static int ptlrpc_register_bulk(struct ptlrpc_request *req)
 	LASSERT(desc->bd_md_max_brw <= PTLRPC_BULK_OPS_COUNT);
 	LASSERT(desc->bd_iov_count <= PTLRPC_MAX_BRW_PAGES);
 	LASSERT(desc->bd_req);
-	LASSERT(desc->bd_type == BULK_PUT_SINK ||
-		desc->bd_type == BULK_GET_SOURCE);
+	LASSERT(ptlrpc_is_bulk_op_passive(desc->bd_type));
 
 	/* cleanup the state of the bulk for it will be reused */
 	if (req->rq_resend || req->rq_send_state == LUSTRE_IMP_REPLAY)
@@ -168,7 +167,7 @@ static int ptlrpc_register_bulk(struct ptlrpc_request *req)
 
 	for (posted_md = 0; posted_md < total_md; posted_md++, xid++) {
 		md.options = PTLRPC_MD_OPTIONS |
-			     ((desc->bd_type == BULK_GET_SOURCE) ?
+			     (ptlrpc_is_bulk_op_get(desc->bd_type) ?
 			      LNET_MD_OP_GET : LNET_MD_OP_PUT);
 		ptlrpc_fill_bulk_md(&md, desc, posted_md);
 
@@ -223,7 +222,7 @@ static int ptlrpc_register_bulk(struct ptlrpc_request *req)
 
 	CDEBUG(D_NET, "Setup %u bulk %s buffers: %u pages %u bytes, xid x%#llx-%#llx, portal %u\n",
 	       desc->bd_md_count,
-	       desc->bd_type == BULK_GET_SOURCE ? "get-source" : "put-sink",
+	       ptlrpc_is_bulk_op_get(desc->bd_type) ? "get-source" : "put-sink",
 	       desc->bd_iov_count, desc->bd_nob,
 	       desc->bd_last_xid, req->rq_xid, desc->bd_portal);
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pers.c b/drivers/staging/lustre/lustre/ptlrpc/pers.c
index 5b9fb11..94e9fa8 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pers.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pers.c
@@ -43,6 +43,8 @@
 void ptlrpc_fill_bulk_md(lnet_md_t *md, struct ptlrpc_bulk_desc *desc,
 			 int mdidx)
 {
+	int offset = mdidx * LNET_MAX_IOV;
+
 	CLASSERT(PTLRPC_MAX_BRW_PAGES < LI_POISON);
 
 	LASSERT(mdidx < desc->bd_md_max_brw);
@@ -50,23 +52,20 @@ void ptlrpc_fill_bulk_md(lnet_md_t *md, struct ptlrpc_bulk_desc *desc,
 	LASSERT(!(md->options & (LNET_MD_IOVEC | LNET_MD_KIOV |
 				 LNET_MD_PHYS)));
 
-	md->options |= LNET_MD_KIOV;
 	md->length = max(0, desc->bd_iov_count - mdidx * LNET_MAX_IOV);
 	md->length = min_t(unsigned int, LNET_MAX_IOV, md->length);
-	if (desc->bd_enc_iov)
-		md->start = &desc->bd_enc_iov[mdidx * LNET_MAX_IOV];
-	else
-		md->start = &desc->bd_iov[mdidx * LNET_MAX_IOV];
-}
-
-void ptlrpc_add_bulk_page(struct ptlrpc_bulk_desc *desc, struct page *page,
-			  int pageoffset, int len)
-{
-	lnet_kiov_t *kiov = &desc->bd_iov[desc->bd_iov_count];
-
-	kiov->bv_page = page;
-	kiov->bv_offset = pageoffset;
-	kiov->bv_len = len;
 
-	desc->bd_iov_count++;
+	if (ptlrpc_is_bulk_desc_kiov(desc->bd_type)) {
+		md->options |= LNET_MD_KIOV;
+		if (GET_ENC_KIOV(desc))
+			md->start = &BD_GET_ENC_KIOV(desc, offset);
+		else
+			md->start = &BD_GET_KIOV(desc, offset);
+	} else {
+		md->options |= LNET_MD_IOVEC;
+		if (GET_ENC_KVEC(desc))
+			md->start = &BD_GET_ENC_KVEC(desc, offset);
+		else
+			md->start = &BD_GET_KVEC(desc, offset);
+	}
 }
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
index f14d193..b848c25 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
@@ -55,8 +55,11 @@
 /* client.c */
 void ptlrpc_at_adj_net_latency(struct ptlrpc_request *req,
 			       unsigned int service_time);
-struct ptlrpc_bulk_desc *ptlrpc_new_bulk(unsigned npages, unsigned max_brw,
-					 unsigned type, unsigned portal);
+struct ptlrpc_bulk_desc *ptlrpc_new_bulk(unsigned int nfrags,
+					 unsigned int max_brw,
+					 enum ptlrpc_bulk_op_type type,
+					 unsigned int portal,
+					 const struct ptlrpc_bulk_frag_ops *ops);
 int ptlrpc_request_cache_init(void);
 void ptlrpc_request_cache_fini(void);
 struct ptlrpc_request *ptlrpc_request_cache_alloc(gfp_t flags);
@@ -226,8 +229,6 @@ struct ptlrpc_nrs_policy *nrs_request_policy(struct ptlrpc_nrs_request *nrq)
 /* pers.c */
 void ptlrpc_fill_bulk_md(lnet_md_t *md, struct ptlrpc_bulk_desc *desc,
 			 int mdcnt);
-void ptlrpc_add_bulk_page(struct ptlrpc_bulk_desc *desc, struct page *page,
-			  int pageoffset, int len);
 
 /* pack_generic.c */
 struct ptlrpc_reply_state *
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c b/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
index b2cc5ea..ceb805d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
@@ -311,7 +311,9 @@ void sptlrpc_enc_pool_put_pages(struct ptlrpc_bulk_desc *desc)
 	int p_idx, g_idx;
 	int i;
 
-	if (!desc->bd_enc_iov)
+	LASSERT(ptlrpc_is_bulk_desc_kiov(desc->bd_type));
+
+	if (!GET_ENC_KIOV(desc))
 		return;
 
 	LASSERT(desc->bd_iov_count > 0);
@@ -326,12 +328,12 @@ void sptlrpc_enc_pool_put_pages(struct ptlrpc_bulk_desc *desc)
 	LASSERT(page_pools.epp_pools[p_idx]);
 
 	for (i = 0; i < desc->bd_iov_count; i++) {
-		LASSERT(desc->bd_enc_iov[i].bv_page);
+		LASSERT(BD_GET_ENC_KIOV(desc, i).bv_page);
 		LASSERT(g_idx != 0 || page_pools.epp_pools[p_idx]);
 		LASSERT(!page_pools.epp_pools[p_idx][g_idx]);
 
 		page_pools.epp_pools[p_idx][g_idx] =
-					desc->bd_enc_iov[i].bv_page;
+			BD_GET_ENC_KIOV(desc, i).bv_page;
 
 		if (++g_idx == PAGES_PER_POOL) {
 			p_idx++;
@@ -345,8 +347,8 @@ void sptlrpc_enc_pool_put_pages(struct ptlrpc_bulk_desc *desc)
 
 	spin_unlock(&page_pools.epp_lock);
 
-	kfree(desc->bd_enc_iov);
-	desc->bd_enc_iov = NULL;
+	kfree(GET_ENC_KIOV(desc));
+	GET_ENC_KIOV(desc) = NULL;
 }
 
 static inline void enc_pools_alloc(void)
@@ -520,10 +522,11 @@ int sptlrpc_get_bulk_checksum(struct ptlrpc_bulk_desc *desc, __u8 alg,
 	hashsize = cfs_crypto_hash_digestsize(cfs_hash_alg_id[alg]);
 
 	for (i = 0; i < desc->bd_iov_count; i++) {
-		cfs_crypto_hash_update_page(hdesc, desc->bd_iov[i].bv_page,
-					    desc->bd_iov[i].bv_offset &
+		cfs_crypto_hash_update_page(hdesc,
+					    BD_GET_KIOV(desc, i).bv_page,
+					    BD_GET_KIOV(desc, i).bv_offset &
 					    ~PAGE_MASK,
-					    desc->bd_iov[i].bv_len);
+					    BD_GET_KIOV(desc, i).bv_len);
 	}
 
 	if (hashsize > buflen) {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec_plain.c b/drivers/staging/lustre/lustre/ptlrpc/sec_plain.c
index cd305bc..c5e7a23 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec_plain.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec_plain.c
@@ -153,14 +153,16 @@ static void corrupt_bulk_data(struct ptlrpc_bulk_desc *desc)
 	char *ptr;
 	unsigned int off, i;
 
+	LASSERT(ptlrpc_is_bulk_desc_kiov(desc->bd_type));
+
 	for (i = 0; i < desc->bd_iov_count; i++) {
-		if (desc->bd_iov[i].bv_len == 0)
+		if (!BD_GET_KIOV(desc, i).bv_len)
 			continue;
 
-		ptr = kmap(desc->bd_iov[i].bv_page);
-		off = desc->bd_iov[i].bv_offset & ~PAGE_MASK;
+		ptr = kmap(BD_GET_KIOV(desc, i).bv_page);
+		off = BD_GET_KIOV(desc, i).bv_offset & ~PAGE_MASK;
 		ptr[off] ^= 0x1;
-		kunmap(desc->bd_iov[i].bv_page);
+		kunmap(BD_GET_KIOV(desc, i).bv_page);
 		return;
 	}
 }
@@ -352,11 +354,11 @@ int plain_cli_unwrap_bulk(struct ptlrpc_cli_ctx *ctx,
 
 	/* fix the actual data size */
 	for (i = 0, nob = 0; i < desc->bd_iov_count; i++) {
-		if (desc->bd_iov[i].bv_len + nob > desc->bd_nob_transferred) {
-			desc->bd_iov[i].bv_len =
-				desc->bd_nob_transferred - nob;
-		}
-		nob += desc->bd_iov[i].bv_len;
+		struct bio_vec bv_desc = BD_GET_KIOV(desc, i);
+
+		if (bv_desc.bv_len + nob > desc->bd_nob_transferred)
+			bv_desc.bv_len = desc->bd_nob_transferred - nob;
+		nob += bv_desc.bv_len;
 	}
 
 	rc = plain_verify_bulk_csum(desc, req->rq_flvr.u_bulk.hash.hash_alg,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 07/29] staging: lustre: lov: remove LSM from struct lustre_md
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (5 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 06/29] staging: lustre: ptlrpc: Introduce iovec to bulk descriptor James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 08/29] staging: lustre: clio: update file attributes after sync James Simmons
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

In struct lustre_md replace the struct lov_stripe_md *lsm member with
an opaque struct lu_buf layout which holds the layout metadata
returned by the server. Refactor the LOV object initialization and
layout change code to accommodate this. Simplify lov_unpackmd() and
supporting functions according to the reduced number of use cases.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13722
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h |    2 +-
 drivers/staging/lustre/lustre/include/obd.h       |    2 +-
 drivers/staging/lustre/lustre/llite/file.c        |   29 +----
 drivers/staging/lustre/lustre/llite/lcommon_cl.c  |    2 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c   |   11 +--
 drivers/staging/lustre/lustre/llite/namei.c       |    7 +-
 drivers/staging/lustre/lustre/lov/lov_ea.c        |    9 +-
 drivers/staging/lustre/lustre/lov/lov_internal.h  |   15 +--
 drivers/staging/lustre/lustre/lov/lov_obd.c       |    1 -
 drivers/staging/lustre/lustre/lov/lov_object.c    |  120 ++++++++++++---------
 drivers/staging/lustre/lustre/lov/lov_pack.c      |  100 +++++++-----------
 drivers/staging/lustre/lustre/mdc/mdc_request.c   |   19 +---
 12 files changed, 134 insertions(+), 183 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index dfef4bc..514d650 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -267,7 +267,7 @@ struct cl_object_conf {
 		/**
 		 * Object layout. This is consumed by lov.
 		 */
-		struct lustre_md *coc_md;
+		struct lu_buf	  coc_layout;
 		/**
 		 * Description of particular stripe location in the
 		 * cluster. This is consumed by osc.
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index ebb3012..c8a6e23 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -864,7 +864,7 @@ struct obd_ops {
 /* lmv structures */
 struct lustre_md {
 	struct mdt_body	 *body;
-	struct lov_stripe_md    *lsm;
+	struct lu_buf		 layout;
 	struct lmv_stripe_md    *lmv;
 #ifdef CONFIG_FS_POSIX_ACL
 	struct posix_acl	*posix_acl;
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 2c8df43..0ae78c7 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3275,7 +3275,6 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct ll_sb_info    *sbi = ll_i2sbi(inode);
 	struct ldlm_lock *lock;
-	struct lustre_md md = { NULL };
 	struct cl_object_conf conf;
 	int rc = 0;
 	bool lvb_ready;
@@ -3310,37 +3309,19 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 
 	/* for layout lock, lmm is returned in lock's lvb.
 	 * lvb_data is immutable if the lock is held so it's safe to access it
-	 * without res lock. See the description in ldlm_lock_decref_internal()
-	 * for the condition to free lvb_data of layout lock
-	 */
-	if (lock->l_lvb_data) {
-		rc = obd_unpackmd(sbi->ll_dt_exp, &md.lsm,
-				  lock->l_lvb_data, lock->l_lvb_len);
-		if (rc < 0) {
-			CERROR("%s: file " DFID " unpackmd error: %d\n",
-			       ll_get_fsname(inode->i_sb, NULL, 0),
-			       PFID(&lli->lli_fid), rc);
-			goto out;
-		}
-
-		LASSERTF(md.lsm, "lvb_data = %p, lvb_len = %u\n",
-			 lock->l_lvb_data, lock->l_lvb_len);
-		rc = 0;
-	}
-
-	/* set layout to file. Unlikely this will fail as old layout was
+	 * without res lock.
+	 *
+	 * set layout to file. Unlikely this will fail as old layout was
 	 * surely eliminated
 	 */
 	memset(&conf, 0, sizeof(conf));
 	conf.coc_opc = OBJECT_CONF_SET;
 	conf.coc_inode = inode;
 	conf.coc_lock = lock;
-	conf.u.coc_md = &md;
+	conf.u.coc_layout.lb_buf = lock->l_lvb_data;
+	conf.u.coc_layout.lb_len = lock->l_lvb_len;
 	rc = ll_layout_conf(inode, &conf);
 
-	if (md.lsm)
-		obd_free_memmd(sbi->ll_dt_exp, &md.lsm);
-
 	/* refresh layout failed, need to wait */
 	wait_layout = rc == -EBUSY;
 
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index 4087db0..56e414b 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -150,7 +150,7 @@ int cl_file_inode_init(struct inode *inode, struct lustre_md *md)
 	struct cl_object_conf conf = {
 		.coc_inode = inode,
 		.u = {
-			.coc_md    = md
+			.coc_layout = md->layout,
 		}
 	};
 	int result = 0;
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index fba1499..b896ac1 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -469,8 +469,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 	ptlrpc_req_finished(request);
 
 	if (IS_ERR(root)) {
-		if (lmd.lsm)
-			obd_free_memmd(sbi->ll_dt_exp, &lmd.lsm);
 #ifdef CONFIG_FS_POSIX_ACL
 		if (lmd.posix_acl) {
 			posix_acl_release(lmd.posix_acl);
@@ -1688,11 +1686,9 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct mdt_body *body = md->body;
-	struct lov_stripe_md *lsm = md->lsm;
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 
-	LASSERT((lsm != NULL) == ((body->mbo_valid & OBD_MD_FLEASIZE) != 0));
-	if (lsm)
+	if (body->mbo_valid & OBD_MD_FLEASIZE)
 		cl_file_inode_init(inode, md);
 
 	if (S_ISDIR(inode->i_mode)) {
@@ -2135,17 +2131,14 @@ int ll_prep_inode(struct inode **inode, struct ptlrpc_request *req,
 			conf.coc_opc = OBJECT_CONF_SET;
 			conf.coc_inode = *inode;
 			conf.coc_lock = lock;
-			conf.u.coc_md = &md;
+			conf.u.coc_layout = md.layout;
 			(void)ll_layout_conf(*inode, &conf);
 		}
 		LDLM_LOCK_PUT(lock);
 	}
 
 out:
-	if (md.lsm)
-		obd_free_memmd(sbi->ll_dt_exp, &md.lsm);
 	md_free_lustre_md(sbi->ll_md_exp, &md);
-
 cleanup:
 	if (rc != 0 && it && it->it_op & IT_OPEN)
 		ll_open_cleanup(sb ? sb : (*inode)->i_sb, req);
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 9b4095f..89fd441 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -113,12 +113,9 @@ struct inode *ll_iget(struct super_block *sb, ino_t hash,
 	if (inode->i_state & I_NEW) {
 		rc = ll_read_inode2(inode, md);
 		if (!rc && S_ISREG(inode->i_mode) &&
-		    !ll_i2info(inode)->lli_clob) {
-			CDEBUG(D_INODE, "%s: apply lsm %p to inode "DFID"\n",
-			       ll_get_fsname(sb, NULL, 0), md->lsm,
-			       PFID(ll_inode2fid(inode)));
+		    !ll_i2info(inode)->lli_clob)
 			rc = cl_file_inode_init(inode, md);
-		}
+
 		if (rc) {
 			make_bad_inode(inode);
 			unlock_new_inode(inode);
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index d7dc0aa..53db170 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -76,18 +76,19 @@ static int lsm_lmm_verify_common(struct lov_mds_md *lmm, int lmm_bytes,
 	return 0;
 }
 
-struct lov_stripe_md *lsm_alloc_plain(__u16 stripe_count, int *size)
+struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count)
 {
+	size_t oinfo_ptrs_size, lsm_size;
 	struct lov_stripe_md *lsm;
 	struct lov_oinfo     *loi;
-	int		   i, oinfo_ptrs_size;
+	unsigned int i;
 
 	LASSERT(stripe_count <= LOV_MAX_STRIPE_COUNT);
 
 	oinfo_ptrs_size = sizeof(struct lov_oinfo *) * stripe_count;
-	*size = sizeof(struct lov_stripe_md) + oinfo_ptrs_size;
+	lsm_size = sizeof(*lsm) + oinfo_ptrs_size;
 
-	lsm = libcfs_kvzalloc(*size, GFP_NOFS);
+	lsm = libcfs_kvzalloc(lsm_size, GFP_NOFS);
 	if (!lsm)
 		return NULL;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 41e7c5f..5b151bb 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -79,13 +79,6 @@ static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
 	return true;
 }
 
-static inline int lov_stripe_md_size(unsigned int stripe_count)
-{
-	struct lov_stripe_md lsm;
-
-	return sizeof(lsm) + stripe_count * sizeof(lsm.lsm_oinfo[0]);
-}
-
 struct lsm_operations {
 	void (*lsm_free)(struct lov_stripe_md *);
 	void (*lsm_stripe_by_index)(struct lov_stripe_md *, int *, loff_t *,
@@ -254,10 +247,8 @@ int lov_del_target(struct obd_device *obd, __u32 index,
 /* lov_pack.c */
 ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 		     size_t buf_size);
-int lov_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
-		 struct lov_mds_md *lmm, int lmm_bytes);
-int lov_alloc_memmd(struct lov_stripe_md **lsmp, __u16 stripe_count,
-		    int pattern, int magic);
+struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, struct lov_mds_md *lmm,
+				   size_t lmm_size);
 int lov_free_memmd(struct lov_stripe_md **lsmp);
 
 void lov_dump_lmm_v1(int level, struct lov_mds_md_v1 *lmm);
@@ -265,7 +256,7 @@ int lov_alloc_memmd(struct lov_stripe_md **lsmp, __u16 stripe_count,
 void lov_dump_lmm_common(int level, void *lmmp);
 
 /* lov_ea.c */
-struct lov_stripe_md *lsm_alloc_plain(__u16 stripe_count, int *size);
+struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count);
 void lsm_free_plain(struct lov_stripe_md *lsm);
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm);
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 621f66e..421a0e2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1406,7 +1406,6 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 	.disconnect     = lov_disconnect,
 	.statfs         = lov_statfs,
 	.statfs_async   = lov_statfs_async,
-	.unpackmd       = lov_unpackmd,
 	.iocontrol      = lov_iocontrol,
 	.get_info       = lov_get_info,
 	.set_info_async = lov_set_info_async,
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 82b99e0..711a3b9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -39,6 +39,11 @@
 
 #include "lov_cl_internal.h"
 
+static inline struct lov_device *lov_object_dev(struct lov_object *obj)
+{
+	return lu2lov_dev(obj->lo_cl.co_lu.lo_dev);
+}
+
 /** \addtogroup lov
  *  @{
  */
@@ -51,7 +56,7 @@
 
 struct lov_layout_operations {
 	int (*llo_init)(const struct lu_env *env, struct lov_device *dev,
-			struct lov_object *lov,
+			struct lov_object *lov, struct lov_stripe_md *lsm,
 			const struct cl_object_conf *conf,
 			union lov_layout_state *state);
 	int (*llo_delete)(const struct lu_env *env, struct lov_object *lov,
@@ -96,17 +101,17 @@ static void lov_install_empty(const struct lu_env *env,
 	 */
 }
 
-static int lov_init_empty(const struct lu_env *env,
-			  struct lov_device *dev, struct lov_object *lov,
+static int lov_init_empty(const struct lu_env *env, struct lov_device *dev,
+			  struct lov_object *lov, struct lov_stripe_md *lsm,
 			  const struct cl_object_conf *conf,
-			  union  lov_layout_state *state)
+			  union lov_layout_state *state)
 {
 	return 0;
 }
 
 static void lov_install_raid0(const struct lu_env *env,
 			      struct lov_object *lov,
-			      union  lov_layout_state *state)
+			      union lov_layout_state *state)
 {
 }
 
@@ -211,8 +216,8 @@ static int lov_page_slice_fixup(struct lov_object *lov,
 	return cl_object_header(stripe)->coh_page_bufsize;
 }
 
-static int lov_init_raid0(const struct lu_env *env,
-			  struct lov_device *dev, struct lov_object *lov,
+static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
+			  struct lov_object *lov, struct lov_stripe_md *lsm,
 			  const struct cl_object_conf *conf,
 			  union  lov_layout_state *state)
 {
@@ -222,7 +227,6 @@ static int lov_init_raid0(const struct lu_env *env,
 	struct cl_object	*stripe;
 	struct lov_thread_info  *lti     = lov_env_info(env);
 	struct cl_object_conf   *subconf = &lti->lti_stripe_conf;
-	struct lov_stripe_md    *lsm     = conf->u.coc_md->lsm;
 	struct lu_fid	   *ofid    = &lti->lti_fid;
 	struct lov_layout_raid0 *r0      = &state->raid0;
 
@@ -297,13 +301,11 @@ static int lov_init_raid0(const struct lu_env *env,
 	return result;
 }
 
-static int lov_init_released(const struct lu_env *env,
-			     struct lov_device *dev, struct lov_object *lov,
+static int lov_init_released(const struct lu_env *env, struct lov_device *dev,
+			     struct lov_object *lov, struct lov_stripe_md *lsm,
 			     const struct cl_object_conf *conf,
 			     union  lov_layout_state *state)
 {
-	struct lov_stripe_md *lsm = conf->u.coc_md->lsm;
-
 	LASSERT(lsm);
 	LASSERT(lsm_is_released(lsm));
 	LASSERT(!lov->lo_lsm);
@@ -720,25 +722,20 @@ static int lov_layout_wait(const struct lu_env *env, struct lov_object *lov)
 }
 
 static int lov_layout_change(const struct lu_env *unused,
-			     struct lov_object *lov,
+			     struct lov_object *lov, struct lov_stripe_md *lsm,
 			     const struct cl_object_conf *conf)
 {
-	int result;
-	enum lov_layout_type llt = LLT_EMPTY;
+	enum lov_layout_type llt = lov_type(lsm);
 	union lov_layout_state *state = &lov->u;
 	const struct lov_layout_operations *old_ops;
 	const struct lov_layout_operations *new_ops;
-
 	void *cookie;
 	struct lu_env *env;
 	int refcheck;
+	int rc;
 
 	LASSERT(0 <= lov->lo_type && lov->lo_type < ARRAY_SIZE(lov_dispatch));
 
-	if (conf->u.coc_md)
-		llt = lov_type(conf->u.coc_md->lsm);
-	LASSERT(0 <= llt && llt < ARRAY_SIZE(lov_dispatch));
-
 	cookie = cl_env_reenter();
 	env = cl_env_get(&refcheck);
 	if (IS_ERR(env)) {
@@ -746,6 +743,8 @@ static int lov_layout_change(const struct lu_env *unused,
 		return PTR_ERR(env);
 	}
 
+	LASSERT(0 <= llt && llt < ARRAY_SIZE(lov_dispatch));
+
 	CDEBUG(D_INODE, DFID" from %s to %s\n",
 	       PFID(lu_object_fid(lov2lu(lov))),
 	       llt2str(lov->lo_type), llt2str(llt));
@@ -753,38 +752,38 @@ static int lov_layout_change(const struct lu_env *unused,
 	old_ops = &lov_dispatch[lov->lo_type];
 	new_ops = &lov_dispatch[llt];
 
-	result = cl_object_prune(env, &lov->lo_cl);
-	if (result != 0)
+	rc = cl_object_prune(env, &lov->lo_cl);
+	if (rc)
 		goto out;
 
-	result = old_ops->llo_delete(env, lov, &lov->u);
-	if (result == 0) {
-		old_ops->llo_fini(env, lov, &lov->u);
+	rc = old_ops->llo_delete(env, lov, &lov->u);
+	if (rc)
+		goto out;
 
-		LASSERT(atomic_read(&lov->lo_active_ios) == 0);
+	old_ops->llo_fini(env, lov, &lov->u);
 
-		lov->lo_type = LLT_EMPTY;
-		/* page bufsize fixup */
-		cl_object_header(&lov->lo_cl)->coh_page_bufsize -=
+	LASSERT(!atomic_read(&lov->lo_active_ios));
+
+	lov->lo_type = LLT_EMPTY;
+
+	/* page bufsize fixup */
+	cl_object_header(&lov->lo_cl)->coh_page_bufsize -=
 			lov_page_slice_fixup(lov, NULL);
 
-		result = new_ops->llo_init(env,
-					lu2lov_dev(lov->lo_cl.co_lu.lo_dev),
-					lov, conf, state);
-		if (result == 0) {
-			new_ops->llo_install(env, lov, state);
-			lov->lo_type = llt;
-		} else {
-			new_ops->llo_delete(env, lov, state);
-			new_ops->llo_fini(env, lov, state);
-			/* this file becomes an EMPTY file. */
-		}
+	rc = new_ops->llo_init(env, lov_object_dev(lov), lov, lsm, conf, state);
+	if (rc) {
+		new_ops->llo_delete(env, lov, state);
+		new_ops->llo_fini(env, lov, state);
+		/* this file becomes an EMPTY file. */
+		goto out;
 	}
 
+	new_ops->llo_install(env, lov, state);
+	lov->lo_type = llt;
 out:
 	cl_env_put(env, &refcheck);
 	cl_env_reexit(cookie);
-	return result;
+	return rc;
 }
 
 /*****************************************************************************
@@ -795,26 +794,37 @@ static int lov_layout_change(const struct lu_env *unused,
 int lov_object_init(const struct lu_env *env, struct lu_object *obj,
 		    const struct lu_object_conf *conf)
 {
-	struct lov_device	    *dev   = lu2lov_dev(obj->lo_dev);
 	struct lov_object	    *lov   = lu2lov(obj);
+	struct lov_device *dev = lov_object_dev(lov);
 	const struct cl_object_conf  *cconf = lu2cl_conf(conf);
 	union  lov_layout_state      *set   = &lov->u;
 	const struct lov_layout_operations *ops;
-	int result;
+	struct lov_stripe_md *lsm = NULL;
+	int rc;
 
 	init_rwsem(&lov->lo_type_guard);
 	atomic_set(&lov->lo_active_ios, 0);
 	init_waitqueue_head(&lov->lo_waitq);
-
 	cl_object_page_init(lu2cl(obj), sizeof(struct lov_page));
 
+	if (cconf->u.coc_layout.lb_buf) {
+		lsm = lov_unpackmd(dev->ld_lov,
+				   cconf->u.coc_layout.lb_buf,
+				   cconf->u.coc_layout.lb_len);
+		if (IS_ERR(lsm))
+			return PTR_ERR(lsm);
+	}
+
 	/* no locking is necessary, as object is being created */
-	lov->lo_type = lov_type(cconf->u.coc_md->lsm);
+	lov->lo_type = lov_type(lsm);
 	ops = &lov_dispatch[lov->lo_type];
-	result = ops->llo_init(env, dev, lov, cconf, set);
-	if (result == 0)
+	rc = ops->llo_init(env, dev, lov, lsm, cconf, set);
+	if (!rc)
 		ops->llo_install(env, lov, set);
-	return result;
+
+	lov_lsm_put(lsm);
+
+	return rc;
 }
 
 static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
@@ -824,6 +834,15 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 	struct lov_object	*lov = cl2lov(obj);
 	int			 result = 0;
 
+	if (conf->coc_opc == OBJECT_CONF_SET &&
+	    conf->u.coc_layout.lb_buf) {
+		lsm = lov_unpackmd(lov_object_dev(lov)->ld_lov,
+				   conf->u.coc_layout.lb_buf,
+				   conf->u.coc_layout.lb_len);
+		if (IS_ERR(lsm))
+			return PTR_ERR(lsm);
+	}
+
 	lov_conf_lock(lov);
 	if (conf->coc_opc == OBJECT_CONF_INVALIDATE) {
 		lov->lo_layout_invalid = true;
@@ -843,8 +862,6 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 
 	LASSERT(conf->coc_opc == OBJECT_CONF_SET);
 
-	if (conf->u.coc_md)
-		lsm = conf->u.coc_md->lsm;
 	if ((!lsm && !lov->lo_lsm) ||
 	    ((lsm && lov->lo_lsm) &&
 	     (lov->lo_lsm->lsm_layout_gen == lsm->lsm_layout_gen) &&
@@ -862,11 +879,12 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 		goto out;
 	}
 
-	result = lov_layout_change(env, lov, conf);
+	result = lov_layout_change(env, lov, lsm, conf);
 	lov->lo_layout_invalid = result != 0;
 
 out:
 	lov_conf_unlock(lov);
+	lov_lsm_put(lsm);
 	CDEBUG(D_INODE, DFID" lo_layout_invalid=%d\n",
 	       PFID(lu_object_fid(lov2lu(lov))), lov->lo_layout_invalid);
 	return result;
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index ccc1fae..9ba87d2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -195,34 +195,34 @@ static int lov_verify_lmm(void *lmm, int lmm_bytes, __u16 *stripe_count)
 	return rc;
 }
 
-int lov_alloc_memmd(struct lov_stripe_md **lsmp, __u16 stripe_count,
-		    int pattern, int magic)
+struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern, u32 magic)
 {
-	int i, lsm_size;
+	struct lov_stripe_md *lsm;
+	unsigned int i;
 
-	CDEBUG(D_INFO, "alloc lsm, stripe_count %d\n", stripe_count);
+	CDEBUG(D_INFO, "alloc lsm, stripe_count %u\n", stripe_count);
 
-	*lsmp = lsm_alloc_plain(stripe_count, &lsm_size);
-	if (!*lsmp) {
-		CERROR("can't allocate lsmp stripe_count %d\n", stripe_count);
-		return -ENOMEM;
+	lsm = lsm_alloc_plain(stripe_count);
+	if (!lsm) {
+		CERROR("cannot allocate LSM stripe_count %u\n", stripe_count);
+		return ERR_PTR(-ENOMEM);
 	}
 
-	atomic_set(&(*lsmp)->lsm_refc, 1);
-	spin_lock_init(&(*lsmp)->lsm_lock);
-	(*lsmp)->lsm_magic = magic;
-	(*lsmp)->lsm_stripe_count = stripe_count;
-	(*lsmp)->lsm_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES * stripe_count;
-	(*lsmp)->lsm_pattern = pattern;
-	(*lsmp)->lsm_pool_name[0] = '\0';
-	(*lsmp)->lsm_layout_gen = 0;
+	atomic_set(&lsm->lsm_refc, 1);
+	spin_lock_init(&lsm->lsm_lock);
+	lsm->lsm_magic = magic;
+	lsm->lsm_stripe_count = stripe_count;
+	lsm->lsm_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES * stripe_count;
+	lsm->lsm_pattern = pattern;
+	lsm->lsm_pool_name[0] = '\0';
+	lsm->lsm_layout_gen = 0;
 	if (stripe_count > 0)
-		(*lsmp)->lsm_oinfo[0]->loi_ost_idx = ~0;
+		lsm->lsm_oinfo[0]->loi_ost_idx = ~0;
 
 	for (i = 0; i < stripe_count; i++)
-		loi_init((*lsmp)->lsm_oinfo[i]);
+		loi_init(lsm->lsm_oinfo[i]);
 
-	return lsm_size;
+	return lsm;
 }
 
 int lov_free_memmd(struct lov_stripe_md **lsmp)
@@ -242,56 +242,34 @@ int lov_free_memmd(struct lov_stripe_md **lsmp)
 /* Unpack LOV object metadata from disk storage.  It is packed in LE byte
  * order and is opaque to the networking layer.
  */
-int lov_unpackmd(struct obd_export *exp,  struct lov_stripe_md **lsmp,
-		 struct lov_mds_md *lmm, int lmm_bytes)
+struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, struct lov_mds_md *lmm,
+				   size_t lmm_size)
 {
-	struct obd_device *obd = class_exp2obd(exp);
-	struct lov_obd *lov = &obd->u.lov;
-	int rc = 0, lsm_size;
-	__u16 stripe_count;
-	__u32 magic;
-	__u32 pattern;
-
-	/* If passed an MDS struct use values from there, otherwise defaults */
-	if (lmm) {
-		rc = lov_verify_lmm(lmm, lmm_bytes, &stripe_count);
-		if (rc)
-			return rc;
-		magic = le32_to_cpu(lmm->lmm_magic);
-		pattern = le32_to_cpu(lmm->lmm_pattern);
-	} else {
-		magic = LOV_MAGIC;
-		stripe_count = lov_get_stripecnt(lov, magic, 0);
-		pattern = LOV_PATTERN_RAID0;
-	}
+	struct lov_stripe_md *lsm;
+	u16 stripe_count;
+	u32 pattern;
+	u32 magic;
+	int rc;
 
-	/* If we aren't passed an lsmp struct, we just want the size */
-	if (!lsmp) {
-		/* XXX LOV STACKING call into osc for sizes */
-		LBUG();
-		return lov_stripe_md_size(stripe_count);
-	}
-	/* If we are passed an allocated struct but nothing to unpack, free */
-	if (*lsmp && !lmm) {
-		lov_free_memmd(lsmp);
-		return 0;
-	}
+	rc = lov_verify_lmm(lmm, lmm_size, &stripe_count);
+	if (rc)
+		return ERR_PTR(rc);
 
-	lsm_size = lov_alloc_memmd(lsmp, stripe_count, pattern, magic);
-	if (lsm_size < 0)
-		return lsm_size;
+	magic = le32_to_cpu(lmm->lmm_magic);
+	pattern = le32_to_cpu(lmm->lmm_pattern);
 
-	/* If we are passed a pointer but nothing to unpack, we only alloc */
-	if (!lmm)
-		return lsm_size;
+	lsm = lov_lsm_alloc(stripe_count, pattern, magic);
+	if (IS_ERR(lsm))
+		return lsm;
 
-	rc = lsm_op_find(magic)->lsm_unpackmd(lov, *lsmp, lmm);
+	LASSERT(lsm_op_find(magic));
+	rc = lsm_op_find(magic)->lsm_unpackmd(lov, lsm, lmm);
 	if (rc) {
-		lov_free_memmd(lsmp);
-		return rc;
+		lov_free_memmd(&lsm);
+		return ERR_PTR(rc);
 	}
 
-	return lsm_size;
+	return lsm;
 }
 
 /* Retrieve object striping information.
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 4c30ab6..c620f5c 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -420,9 +420,6 @@ static int mdc_get_lustre_md(struct obd_export *exp,
 	md->body = req_capsule_server_get(pill, &RMF_MDT_BODY);
 
 	if (md->body->mbo_valid & OBD_MD_FLEASIZE) {
-		int lmmsize;
-		struct lov_mds_md *lmm;
-
 		if (!S_ISREG(md->body->mbo_mode)) {
 			CDEBUG(D_INFO,
 			       "OBD_MD_FLEASIZE set, should be a regular file, but is not\n");
@@ -436,17 +433,15 @@ static int mdc_get_lustre_md(struct obd_export *exp,
 			rc = -EPROTO;
 			goto out;
 		}
-		lmmsize = md->body->mbo_eadatasize;
-		lmm = req_capsule_server_sized_get(pill, &RMF_MDT_MD, lmmsize);
-		if (!lmm) {
+
+		md->layout.lb_len = md->body->mbo_eadatasize;
+		md->layout.lb_buf = req_capsule_server_sized_get(pill,
+								 &RMF_MDT_MD,
+								 md->layout.lb_len);
+		if (!md->layout.lb_buf) {
 			rc = -EPROTO;
 			goto out;
 		}
-
-		rc = obd_unpackmd(dt_exp, &md->lsm, lmm, lmmsize);
-		if (rc < 0)
-			goto out;
-
 	} else if (md->body->mbo_valid & OBD_MD_FLDIREA) {
 		int lmvsize;
 		struct lov_mds_md *lmv;
@@ -509,8 +504,6 @@ static int mdc_get_lustre_md(struct obd_export *exp,
 #ifdef CONFIG_FS_POSIX_ACL
 		posix_acl_release(md->posix_acl);
 #endif
-		if (md->lsm)
-			obd_free_memmd(dt_exp, &md->lsm);
 	}
 	return rc;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 08/29] staging: lustre: clio: update file attributes after sync
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (6 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 07/29] staging: lustre: lov: remove LSM from struct lustre_md James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 09/29] staging: lustre: dne: setdirstripe should fail if not supported James Simmons
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

This really only affects clients that are attached to lustre
server running ZFS, because zfs does not update # of blocks
until the blocks are flushed to disk.

This patch update file's blocks attribute after OST_SYNC completes.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4389
Reviewed-on: http://review.whamcloud.com/12915
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/osc/osc_internal.h |    2 +-
 drivers/staging/lustre/lustre/osc/osc_io.c       |    3 +--
 drivers/staging/lustre/lustre/osc/osc_request.c  |   21 ++++++++++++++++++++-
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index dc708ea..12d3b58 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -124,7 +124,7 @@ int osc_setattr_async(struct obd_export *exp, struct obdo *oa,
 int osc_punch_base(struct obd_export *exp, struct obdo *oa,
 		   obd_enqueue_update_f upcall, void *cookie,
 		   struct ptlrpc_request_set *rqset);
-int osc_sync_base(struct obd_export *exp, struct obdo *oa,
+int osc_sync_base(struct osc_object *exp, struct obdo *oa,
 		  obd_enqueue_update_f upcall, void *cookie,
 		  struct ptlrpc_request_set *rqset);
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 8eb4275..3b82d0a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -760,8 +760,7 @@ static int osc_fsync_ost(const struct lu_env *env, struct osc_object *obj,
 
 	init_completion(&cbargs->opc_sync);
 
-	rc = osc_sync_base(osc_export(obj), oa, osc_async_upcall, cbargs,
-			   PTLRPCD_SET);
+	rc = osc_sync_base(obj, oa, osc_async_upcall, cbargs, PTLRPCD_SET);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index c570d19..091558e 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -82,6 +82,7 @@ struct osc_setattr_args {
 };
 
 struct osc_fsync_args {
+	struct osc_object	*fa_obj;
 	struct obdo		*fa_oa;
 	obd_enqueue_update_f fa_upcall;
 	void		*fa_cookie;
@@ -365,8 +366,11 @@ static int osc_sync_interpret(const struct lu_env *env,
 			      struct ptlrpc_request *req,
 			      void *arg, int rc)
 {
+	struct cl_attr *attr = &osc_env_info(env)->oti_attr;
 	struct osc_fsync_args *fa = arg;
+	unsigned long valid = 0;
 	struct ost_body *body;
+	struct cl_object *obj;
 
 	if (rc)
 		goto out;
@@ -379,15 +383,29 @@ static int osc_sync_interpret(const struct lu_env *env,
 	}
 
 	*fa->fa_oa = body->oa;
+	obj = osc2cl(fa->fa_obj);
+
+	/* Update osc object's blocks attribute */
+	cl_object_attr_lock(obj);
+	if (body->oa.o_valid & OBD_MD_FLBLOCKS) {
+		attr->cat_blocks = body->oa.o_blocks;
+		valid |= CAT_BLOCKS;
+	}
+
+	if (valid)
+		cl_object_attr_update(env, obj, attr, valid);
+	cl_object_attr_unlock(obj);
+
 out:
 	rc = fa->fa_upcall(fa->fa_cookie, rc);
 	return rc;
 }
 
-int osc_sync_base(struct obd_export *exp, struct obdo *oa,
+int osc_sync_base(struct osc_object *obj, struct obdo *oa,
 		  obd_enqueue_update_f upcall, void *cookie,
 		  struct ptlrpc_request_set *rqset)
 {
+	struct obd_export *exp = osc_export(obj);
 	struct ptlrpc_request *req;
 	struct ost_body *body;
 	struct osc_fsync_args *fa;
@@ -414,6 +432,7 @@ int osc_sync_base(struct obd_export *exp, struct obdo *oa,
 
 	CLASSERT(sizeof(*fa) <= sizeof(req->rq_async_args));
 	fa = ptlrpc_req_async_args(req);
+	fa->fa_obj = obj;
 	fa->fa_oa = oa;
 	fa->fa_upcall = upcall;
 	fa->fa_cookie = cookie;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 09/29] staging: lustre: dne: setdirstripe should fail if not supported
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (7 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 08/29] staging: lustre: clio: update file attributes after sync James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 10/29] staging: lustre: ptlrpc: embed highest XID in each request James Simmons
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Lai Siyao,
	James Simmons

From: Lai Siyao <lai.siyao@intel.com>

If MDS doesn't support striped directory, creating striped directory
from a userland utility such as 'lfs setdirstripe' should fail.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6661
Reviewed-on: http://review.whamcloud.com/15123
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dir.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 929c32c..6e744a5 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -419,6 +419,10 @@ static int ll_dir_setdirstripe(struct inode *parent, struct lmv_user_md *lump,
 	       PFID(ll_inode2fid(parent)), parent, dirname,
 	       (int)lump->lum_stripe_offset, lump->lum_stripe_count);
 
+	if (lump->lum_stripe_count > 1 &&
+	    !(exp_connect_flags(sbi->ll_md_exp) & OBD_CONNECT_DIR_STRIPE))
+		return -EINVAL;
+
 	if (lump->lum_magic != cpu_to_le32(LMV_USER_MAGIC))
 		lustre_swab_lmv_user_md(lump);
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 10/29] staging: lustre: ptlrpc: embed highest XID in each request
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (8 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 09/29] staging: lustre: dne: setdirstripe should fail if not supported James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 11/29] staging: lustre: llite: basic support of SELinux in CLIO James Simmons
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Gregoire Pichon, Alex Zhuravlev, James Simmons

From: Gregoire Pichon <gregoire.pichon@bull.net>

Atomically assign XIDs and put request and sending list so
we can learn the lowest unreplied XID at any point.

This allows to embed in every resquests the highest XID for
which a reply has been received and does not have an unreplied
lower-numbered XID.

This will be used by the MDT target to release in-memory
reply data corresponding to XIDs of reply received by the client.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5319
Reviewed-on: http://review.whamcloud.com/14793
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_net.h |    1 +
 drivers/staging/lustre/lustre/ptlrpc/client.c      |   34 +++++++++++++++++++-
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   15 +++++++++
 3 files changed, 49 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 67a7095..a14f1a4 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -2069,6 +2069,7 @@ void lustre_msg_set_handle(struct lustre_msg *msg,
 			   struct lustre_handle *handle);
 void lustre_msg_set_type(struct lustre_msg *msg, __u32 type);
 void lustre_msg_set_opc(struct lustre_msg *msg, __u32 opc);
+void lustre_msg_set_last_xid(struct lustre_msg *msg, u64 last_xid);
 void lustre_msg_set_tag(struct lustre_msg *msg, __u16 tag);
 void lustre_msg_set_versions(struct lustre_msg *msg, __u64 *versions);
 void lustre_msg_set_transno(struct lustre_msg *msg, __u64 transno);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index e4a31eb..e4fbdd0 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -700,7 +700,6 @@ int ptlrpc_request_bufs_pack(struct ptlrpc_request *request,
 
 	ptlrpc_at_set_req_timeout(request);
 
-	request->rq_xid = ptlrpc_next_xid();
 	lustre_msg_set_opc(request->rq_reqmsg, opcode);
 
 	/* Let's setup deadline for req/reply/bulk unlink for opcode. */
@@ -1436,6 +1435,8 @@ static int after_reply(struct ptlrpc_request *req)
 static int ptlrpc_send_new_req(struct ptlrpc_request *req)
 {
 	struct obd_import *imp = req->rq_import;
+	struct list_head *tmp;
+	u64 min_xid = ~0ULL;
 	int rc;
 
 	LASSERT(req->rq_phase == RQ_PHASE_NEW);
@@ -1448,6 +1449,18 @@ static int ptlrpc_send_new_req(struct ptlrpc_request *req)
 
 	spin_lock(&imp->imp_lock);
 
+	/*
+	 * the very first time we assign XID. it's important to assign XID
+	 * and put it on the list atomically, so that the lowest assigned
+	 * XID is always known. this is vital for multislot last_rcvd
+	 */
+	if (req->rq_send_state == LUSTRE_IMP_REPLAY) {
+		LASSERT(req->rq_xid);
+	} else {
+		LASSERT(!req->rq_xid);
+		req->rq_xid = ptlrpc_next_xid();
+	}
+
 	if (!req->rq_generation_set)
 		req->rq_import_generation = imp->imp_generation;
 
@@ -1477,8 +1490,27 @@ static int ptlrpc_send_new_req(struct ptlrpc_request *req)
 	LASSERT(list_empty(&req->rq_list));
 	list_add_tail(&req->rq_list, &imp->imp_sending_list);
 	atomic_inc(&req->rq_import->imp_inflight);
+
+	/* find the lowest unreplied XID */
+	list_for_each(tmp, &imp->imp_delayed_list) {
+		struct ptlrpc_request *r;
+
+		r = list_entry(tmp, struct ptlrpc_request, rq_list);
+		if (r->rq_xid < min_xid)
+			min_xid = r->rq_xid;
+	}
+	list_for_each(tmp, &imp->imp_sending_list) {
+		struct ptlrpc_request *r;
+
+		r = list_entry(tmp, struct ptlrpc_request, rq_list);
+		if (r->rq_xid < min_xid)
+			min_xid = r->rq_xid;
+	}
 	spin_unlock(&imp->imp_lock);
 
+	if (likely(min_xid != ~0ULL))
+		lustre_msg_set_last_xid(req->rq_reqmsg, min_xid - 1);
+
 	lustre_msg_set_status(req->rq_reqmsg, current_pid());
 
 	rc = sptlrpc_req_refresh_ctx(req, -1);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 1c06b4e..4f63a80 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1255,6 +1255,21 @@ void lustre_msg_set_opc(struct lustre_msg *msg, __u32 opc)
 	}
 }
 
+void lustre_msg_set_last_xid(struct lustre_msg *msg, u64 last_xid)
+{
+	switch (msg->lm_magic) {
+	case LUSTRE_MSG_MAGIC_V2: {
+		struct ptlrpc_body *pb = lustre_msg_ptlrpc_body(msg);
+
+		LASSERTF(pb, "invalid msg %p: no ptlrpc body!\n", msg);
+		pb->pb_last_xid = last_xid;
+		return;
+	}
+	default:
+		LASSERTF(0, "incorrect message magic: %08x\n", msg->lm_magic);
+	}
+}
+
 void lustre_msg_set_tag(struct lustre_msg *msg, __u16 tag)
 {
 	switch (msg->lm_magic) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 11/29] staging: lustre: llite: basic support of SELinux in CLIO
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (9 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 10/29] staging: lustre: ptlrpc: embed highest XID in each request James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight James Simmons
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Sebastien Buisson, James Simmons

From: Sebastien Buisson <sebastien.buisson@bull.net>

Bring the ability to properly initiate security context
on SELinux-enabled client and store it on server side via
extended attribute.

Security context initialization is not atomic, but that would
require a wire protocol change to send security label in the
creation request.

Filter out security.selinux from xattr cache as it is
already cached in system slab.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5560
Reviewed-on: http://review.whamcloud.com/11648
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/Makefile       |    6 +-
 drivers/staging/lustre/lustre/llite/dir.c          |   11 +++
 .../staging/lustre/lustre/llite/llite_internal.h   |    4 +
 drivers/staging/lustre/lustre/llite/namei.c        |    5 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |   60 ++++++--------
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |    4 +
 .../staging/lustre/lustre/llite/xattr_security.c   |   88 ++++++++++++++++++++
 7 files changed, 140 insertions(+), 38 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/llite/xattr_security.c

diff --git a/drivers/staging/lustre/lustre/llite/Makefile b/drivers/staging/lustre/lustre/llite/Makefile
index 3690bee..ca9c275 100644
--- a/drivers/staging/lustre/lustre/llite/Makefile
+++ b/drivers/staging/lustre/lustre/llite/Makefile
@@ -1,7 +1,7 @@
 obj-$(CONFIG_LUSTRE_FS) += lustre.o
 lustre-y := dcache.o dir.o file.o llite_lib.o llite_nfs.o \
-	    rw.o namei.o symlink.o llite_mmap.o range_lock.o \
-	    xattr.o xattr_cache.o rw26.o super25.o statahead.o \
-	    glimpse.o lcommon_cl.o lcommon_misc.o \
+	    rw.o rw26.o namei.o symlink.o llite_mmap.o range_lock.o \
+	    xattr.o xattr_cache.o xattr_security.o \
+	    super25.o statahead.o glimpse.o lcommon_cl.o lcommon_misc.o \
 	    vvp_dev.o vvp_page.o vvp_lock.o vvp_io.o vvp_object.o vvp_req.o \
 	    lproc_llite.o
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 6e744a5..fb367ae 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -410,6 +410,8 @@ static int ll_dir_setdirstripe(struct inode *parent, struct lmv_user_md *lump,
 	struct ptlrpc_request *request = NULL;
 	struct md_op_data *op_data;
 	struct ll_sb_info *sbi = ll_i2sbi(parent);
+	struct inode *inode = NULL;
+	struct dentry dentry;
 	int err;
 
 	if (unlikely(lump->lum_magic != LMV_USER_MAGIC))
@@ -443,8 +445,17 @@ static int ll_dir_setdirstripe(struct inode *parent, struct lmv_user_md *lump,
 			from_kgid(&init_user_ns, current_fsgid()),
 			cfs_curproc_cap_pack(), 0, &request);
 	ll_finish_md_op_data(op_data);
+
+	err = ll_prep_inode(&inode, request, parent->i_sb, NULL);
 	if (err)
 		goto err_exit;
+
+	memset(&dentry, 0, sizeof(dentry));
+	dentry.d_inode = inode;
+
+	err = ll_init_security(&dentry, inode, parent);
+	iput(inode);
+
 err_exit:
 	ptlrpc_req_finished(request);
 	return err;
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 80fb862..8bd1eb8 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -261,6 +261,9 @@ static inline void ll_layout_version_set(struct ll_inode_info *lli, __u32 gen)
 int ll_xattr_cache_get(struct inode *inode, const char *name,
 		       char *buffer, size_t size, __u64 valid);
 
+int ll_init_security(struct dentry *dentry, struct inode *inode,
+		     struct inode *dir);
+
 /*
  * Locking to guarantee consistency of non-atomic updates to long long i_size,
  * consistency between file size and KMS.
@@ -998,6 +1001,7 @@ static inline loff_t ll_file_maxbytes(struct inode *inode)
 ssize_t ll_listxattr(struct dentry *dentry, char *buffer, size_t size);
 int ll_xattr_list(struct inode *inode, const char *name, int type,
 		  void *buffer, size_t size, __u64 valid);
+const struct xattr_handler *get_xattr_type(const char *name);
 
 /**
  * Common IO arguments for various VFS I/O interfaces.
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 89fd441..74d9b73 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -790,7 +790,8 @@ static int ll_create_it(struct inode *dir, struct dentry *dentry,
 		return PTR_ERR(inode);
 
 	d_instantiate(dentry, inode);
-	return 0;
+
+	return ll_init_security(dentry, inode, dir);
 }
 
 void ll_update_times(struct ptlrpc_request *request, struct inode *inode)
@@ -885,6 +886,8 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry,
 		goto err_exit;
 
 	d_instantiate(dentry, inode);
+
+	err = ll_init_security(dentry, inode, dir);
 err_exit:
 	if (request)
 		ptlrpc_req_finished(request);
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index b8b5c32..3ae1a02 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -44,48 +44,39 @@
 
 #include "llite_internal.h"
 
-static
-int get_xattr_type(const char *name)
+const struct xattr_handler *get_xattr_type(const char *name)
 {
-	if (!strcmp(name, XATTR_NAME_POSIX_ACL_ACCESS))
-		return XATTR_ACL_ACCESS_T;
+	int i = 0;
 
-	if (!strcmp(name, XATTR_NAME_POSIX_ACL_DEFAULT))
-		return XATTR_ACL_DEFAULT_T;
+	while (ll_xattr_handlers[i]) {
+		size_t len = strlen(ll_xattr_handlers[i]->prefix);
 
-	if (!strncmp(name, XATTR_USER_PREFIX,
-		     sizeof(XATTR_USER_PREFIX) - 1))
-		return XATTR_USER_T;
-
-	if (!strncmp(name, XATTR_TRUSTED_PREFIX,
-		     sizeof(XATTR_TRUSTED_PREFIX) - 1))
-		return XATTR_TRUSTED_T;
-
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1))
-		return XATTR_SECURITY_T;
-
-	if (!strncmp(name, XATTR_LUSTRE_PREFIX,
-		     sizeof(XATTR_LUSTRE_PREFIX) - 1))
-		return XATTR_LUSTRE_T;
-
-	return XATTR_OTHER_T;
+		if (!strncmp(ll_xattr_handlers[i]->prefix, name, len))
+			return ll_xattr_handlers[i];
+		i++;
+	}
+	return NULL;
 }
 
-static
-int xattr_type_filter(struct ll_sb_info *sbi, int xattr_type)
+static int xattr_type_filter(struct ll_sb_info *sbi,
+			     const struct xattr_handler *handler)
 {
-	if ((xattr_type == XATTR_ACL_ACCESS_T ||
-	     xattr_type == XATTR_ACL_DEFAULT_T) &&
+	/* No handler means XATTR_OTHER_T */
+	if (!handler)
+		return -EOPNOTSUPP;
+
+	if ((handler->flags == XATTR_ACL_ACCESS_T ||
+	     handler->flags == XATTR_ACL_DEFAULT_T) &&
 	   !(sbi->ll_flags & LL_SBI_ACL))
 		return -EOPNOTSUPP;
 
-	if (xattr_type == XATTR_USER_T && !(sbi->ll_flags & LL_SBI_USER_XATTR))
+	if (handler->flags == XATTR_USER_T &&
+	    !(sbi->ll_flags & LL_SBI_USER_XATTR))
 		return -EOPNOTSUPP;
-	if (xattr_type == XATTR_TRUSTED_T && !capable(CFS_CAP_SYS_ADMIN))
+
+	if (handler->flags == XATTR_TRUSTED_T &&
+	    !capable(CFS_CAP_SYS_ADMIN))
 		return -EPERM;
-	if (xattr_type == XATTR_OTHER_T)
-		return -EOPNOTSUPP;
 
 	return 0;
 }
@@ -111,7 +102,7 @@ int xattr_type_filter(struct ll_sb_info *sbi, int xattr_type)
 		valid = OBD_MD_FLXATTR;
 	}
 
-	rc = xattr_type_filter(sbi, handler->flags);
+	rc = xattr_type_filter(sbi, handler);
 	if (rc)
 		return rc;
 
@@ -225,7 +216,8 @@ static int ll_xattr_set(const struct xattr_handler *handler,
 	void *xdata;
 	int rc;
 
-	if (sbi->ll_xattr_cache_enabled && type != XATTR_ACL_ACCESS_T) {
+	if (sbi->ll_xattr_cache_enabled && type != XATTR_ACL_ACCESS_T &&
+	    (type != XATTR_SECURITY_T || strcmp(name, "security.selinux"))) {
 		rc = ll_xattr_cache_get(inode, name, buffer, size, valid);
 		if (rc == -EAGAIN)
 			goto getxattr_nocache;
@@ -313,7 +305,7 @@ static int ll_xattr_get_common(const struct xattr_handler *handler,
 
 	ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_GETXATTR, 1);
 
-	rc = xattr_type_filter(sbi, handler->flags);
+	rc = xattr_type_filter(sbi, handler);
 	if (rc)
 		return rc;
 
diff --git a/drivers/staging/lustre/lustre/llite/xattr_cache.c b/drivers/staging/lustre/lustre/llite/xattr_cache.c
index 50a19a4..47c1d11 100644
--- a/drivers/staging/lustre/lustre/llite/xattr_cache.c
+++ b/drivers/staging/lustre/lustre/llite/xattr_cache.c
@@ -415,6 +415,10 @@ static int ll_xattr_cache_refill(struct inode *inode, struct lookup_intent *oit)
 			CDEBUG(D_CACHE, "not caching %s\n",
 			       XATTR_NAME_ACL_ACCESS);
 			rc = 0;
+		} else if (!strcmp(xdata, "security.selinux")) {
+			/* Filter out security.selinux, it is cached in slab */
+			CDEBUG(D_CACHE, "not caching security.selinux\n");
+			rc = 0;
 		} else {
 			rc = ll_xattr_cache_add(&lli->lli_xattrs, xdata, xval,
 						*xsizes);
diff --git a/drivers/staging/lustre/lustre/llite/xattr_security.c b/drivers/staging/lustre/lustre/llite/xattr_security.c
new file mode 100644
index 0000000..d61d801
--- /dev/null
+++ b/drivers/staging/lustre/lustre/llite/xattr_security.c
@@ -0,0 +1,88 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see http://www.gnu.org/licenses
+ *
+ * GPL HEADER END
+ */
+
+/*
+ * Copyright (c) 2014 Bull SAS
+ * Author: Sebastien Buisson sebastien.buisson@bull.net
+ */
+
+/*
+ * lustre/llite/xattr_security.c
+ * Handler for storing security labels as extended attributes.
+ */
+#include <linux/security.h>
+#include <linux/xattr.h>
+#include "llite_internal.h"
+
+/**
+ * A helper function for ll_security_inode_init_security()
+ * that takes care of setting xattrs
+ *
+ * Get security context of @inode from @xattr_array,
+ * and put it in 'security.xxx' xattr of dentry
+ * stored in @fs_info.
+ *
+ * \retval 0        success
+ * \retval -ENOMEM  if no memory could be allocated for xattr name
+ * \retval < 0      failure to set xattr
+ */
+static int
+ll_initxattrs(struct inode *inode, const struct xattr *xattr_array,
+	      void *fs_info)
+{
+	const struct xattr_handler *handler;
+	struct dentry *dentry = fs_info;
+	const struct xattr *xattr;
+	int err = 0;
+
+	handler = get_xattr_type(XATTR_SECURITY_PREFIX);
+	if (!handler)
+		return -ENXIO;
+
+	for (xattr = xattr_array; xattr->name; xattr++) {
+		err = handler->set(handler, dentry, inode, xattr->name,
+				   xattr->value, xattr->value_len,
+				   XATTR_CREATE);
+		if (err < 0)
+			break;
+	}
+	return err;
+}
+
+/**
+ * Initializes security context
+ *
+ * Get security context of @inode in @dir,
+ * and put it in 'security.xxx' xattr of @dentry.
+ *
+ * \retval 0        success, or SELinux is disabled
+ * \retval -ENOMEM  if no memory could be allocated for xattr name
+ * \retval < 0      failure to get security context or set xattr
+ */
+int
+ll_init_security(struct dentry *dentry, struct inode *inode, struct inode *dir)
+{
+	if (!selinux_is_enabled())
+		return 0;
+
+	return security_inode_init_security(inode, dir, NULL,
+					    &ll_initxattrs, dentry);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (10 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 11/29] staging: lustre: llite: basic support of SELinux in CLIO James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 23:40   ` kbuild test robot
  2016-10-30 15:04   ` Greg Kroah-Hartman
  2016-10-27 22:11 ` [PATCH 13/29] staging: lustre: llite: report back to user bad stripe count James Simmons
                   ` (16 subsequent siblings)
  28 siblings, 2 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Gregoire Pichon, James Simmons

From: Gregoire Pichon <gregoire.pichon@bull.net>

This patch is the main client part of a new feature that supports
multiple modify metadata RPCs in parallel. Its goal is to improve
metadata operations performance of a single client, while maintening
the consistency of MDT reply reconstruction and MDT recovery
mechanisms.

It allows to manage the number of modify RPCs in flight within
the client obd structure and to assign a virtual index (the tag) to
each modify RPC to help server side cleaning of reply data.

The mdc component uses this feature to send multiple modify RPCs
in parallel.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5319
Reviewed-on: http://review.whamcloud.com/14374
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/fid/fid_request.c    |    4 -
 drivers/staging/lustre/lustre/include/lustre_mdc.h |   24 +++
 drivers/staging/lustre/lustre/include/obd.h        |    7 +-
 drivers/staging/lustre/lustre/include/obd_class.h  |    6 +
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |   37 +++++
 drivers/staging/lustre/lustre/mdc/lproc_mdc.c      |   23 +++
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |   13 +-
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |   23 +--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   50 ++-----
 drivers/staging/lustre/lustre/obdclass/genops.c    |  171 +++++++++++++++++++-
 10 files changed, 293 insertions(+), 65 deletions(-)

diff --git a/drivers/staging/lustre/lustre/fid/fid_request.c b/drivers/staging/lustre/lustre/fid/fid_request.c
index edd72b9..dfef3ca 100644
--- a/drivers/staging/lustre/lustre/fid/fid_request.c
+++ b/drivers/staging/lustre/lustre/fid/fid_request.c
@@ -112,11 +112,7 @@ static int seq_client_rpc(struct lu_client_seq *seq,
 
 	ptlrpc_at_set_req_timeout(req);
 
-	if (opc != SEQ_ALLOC_SUPER && seq->lcs_type == LUSTRE_SEQ_METADATA)
-		mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
 	rc = ptlrpc_queue_wait(req);
-	if (opc != SEQ_ALLOC_SUPER && seq->lcs_type == LUSTRE_SEQ_METADATA)
-		mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
 	if (rc)
 		goto out_req;
 
diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h b/drivers/staging/lustre/lustre/include/lustre_mdc.h
index 92a5c0f..198ceb0 100644
--- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
+++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
@@ -156,6 +156,30 @@ static inline void mdc_put_rpc_lock(struct mdc_rpc_lock *lck,
 	mutex_unlock(&lck->rpcl_mutex);
 }
 
+static inline void mdc_get_mod_rpc_slot(struct ptlrpc_request *req,
+					struct lookup_intent *it)
+{
+	struct client_obd *cli = &req->rq_import->imp_obd->u.cli;
+	u32 opc;
+	u16 tag;
+
+	opc = lustre_msg_get_opc(req->rq_reqmsg);
+	tag = obd_get_mod_rpc_slot(cli, opc, it);
+	lustre_msg_set_tag(req->rq_reqmsg, tag);
+}
+
+static inline void mdc_put_mod_rpc_slot(struct ptlrpc_request *req,
+					struct lookup_intent *it)
+{
+	struct client_obd *cli = &req->rq_import->imp_obd->u.cli;
+	u32 opc;
+	u16 tag;
+
+	opc = lustre_msg_get_opc(req->rq_reqmsg);
+	tag = lustre_msg_get_tag(req->rq_reqmsg);
+	obd_put_mod_rpc_slot(cli, opc, it, tag);
+}
+
 /**
  * Update the maximum possible easize.
  *
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index c8a6e23..09e3e71 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -263,14 +263,17 @@ struct client_obd {
 	wait_queue_head_t	      cl_destroy_waitq;
 
 	struct mdc_rpc_lock     *cl_rpc_lock;
-	struct mdc_rpc_lock     *cl_close_lock;
 
 	/* modify rpcs in flight
 	 * currently used for metadata only
 	 */
 	spinlock_t		 cl_mod_rpcs_lock;
 	u16			 cl_max_mod_rpcs_in_flight;
-
+	u16			 cl_mod_rpcs_in_flight;
+	u16			 cl_close_rpcs_in_flight;
+	wait_queue_head_t	 cl_mod_rpcs_waitq;
+	unsigned long		*cl_mod_tag_bitmap;
+	struct obd_histogram	 cl_mod_rpcs_hist;
 
 	/* mgc datastruct */
 	atomic_t	     cl_mgc_refcount;
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index aba96c3..476b1e4 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -101,6 +101,12 @@ struct obd_device *class_devices_in_group(struct obd_uuid *grp_uuid,
 __u32 obd_get_max_rpcs_in_flight(struct client_obd *cli);
 int obd_set_max_rpcs_in_flight(struct client_obd *cli, __u32 max);
 int obd_set_max_mod_rpcs_in_flight(struct client_obd *cli, u16 max);
+int obd_mod_rpc_stats_seq_show(struct client_obd *cli, struct seq_file *seq);
+
+u16 obd_get_mod_rpc_slot(struct client_obd *cli, u32 opc,
+			 struct lookup_intent *it);
+void obd_put_mod_rpc_slot(struct client_obd *cli, u32 opc,
+			  struct lookup_intent *it, u16 tag);
 
 struct llog_handle;
 struct llog_rec_hdr;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 4f9480e..ee4bb56 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -372,6 +372,25 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	} else {
 		cli->cl_max_rpcs_in_flight = OBD_MAX_RIF_DEFAULT;
 	}
+
+	spin_lock_init(&cli->cl_mod_rpcs_lock);
+	spin_lock_init(&cli->cl_mod_rpcs_hist.oh_lock);
+	cli->cl_max_mod_rpcs_in_flight = 0;
+	cli->cl_mod_rpcs_in_flight = 0;
+	cli->cl_close_rpcs_in_flight = 0;
+	init_waitqueue_head(&cli->cl_mod_rpcs_waitq);
+	cli->cl_mod_tag_bitmap = NULL;
+
+	if (connect_op == MDS_CONNECT) {
+		cli->cl_max_mod_rpcs_in_flight = cli->cl_max_rpcs_in_flight - 1;
+		cli->cl_mod_tag_bitmap = kcalloc(BITS_TO_LONGS(OBD_MAX_RIF_MAX),
+						 sizeof(long), GFP_NOFS);
+		if (!cli->cl_mod_tag_bitmap) {
+			rc = -ENOMEM;
+			goto err;
+		}
+	}
+
 	rc = ldlm_get_ref();
 	if (rc) {
 		CERROR("ldlm_get_ref failed: %d\n", rc);
@@ -431,12 +450,16 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 err_ldlm:
 	ldlm_put_ref();
 err:
+	kfree(cli->cl_mod_tag_bitmap);
+	cli->cl_mod_tag_bitmap = NULL;
 	return rc;
 }
 EXPORT_SYMBOL(client_obd_setup);
 
 int client_obd_cleanup(struct obd_device *obddev)
 {
+	struct client_obd *cli = &obddev->u.cli;
+
 	ldlm_namespace_free_post(obddev->obd_namespace);
 	obddev->obd_namespace = NULL;
 
@@ -444,6 +467,10 @@ int client_obd_cleanup(struct obd_device *obddev)
 	LASSERT(!obddev->u.cli.cl_import);
 
 	ldlm_put_ref();
+
+	kfree(cli->cl_mod_tag_bitmap);
+	cli->cl_mod_tag_bitmap = NULL;
+
 	return 0;
 }
 EXPORT_SYMBOL(client_obd_cleanup);
@@ -458,6 +485,7 @@ int client_connect_import(const struct lu_env *env,
 	struct obd_import       *imp    = cli->cl_import;
 	struct obd_connect_data *ocd;
 	struct lustre_handle    conn    = { 0 };
+	bool is_mdc = false;
 	int		     rc;
 
 	*exp = NULL;
@@ -484,6 +512,10 @@ int client_connect_import(const struct lu_env *env,
 	ocd = &imp->imp_connect_data;
 	if (data) {
 		*ocd = *data;
+		is_mdc = !strncmp(imp->imp_obd->obd_type->typ_name,
+				  LUSTRE_MDC_NAME, 3);
+		if (is_mdc)
+			data->ocd_connect_flags |= OBD_CONNECT_MULTIMODRPCS;
 		imp->imp_connect_flags_orig = data->ocd_connect_flags;
 	}
 
@@ -499,6 +531,11 @@ int client_connect_import(const struct lu_env *env,
 			 ocd->ocd_connect_flags, "old %#llx, new %#llx\n",
 			 data->ocd_connect_flags, ocd->ocd_connect_flags);
 		data->ocd_connect_flags = ocd->ocd_connect_flags;
+		/* clear the flag as it was not set and is not known
+		 * by upper layers
+		 */
+		if (is_mdc)
+			data->ocd_connect_flags &= ~OBD_CONNECT_MULTIMODRPCS;
 	}
 
 	ptlrpc_pinger_add_import(imp);
diff --git a/drivers/staging/lustre/lustre/mdc/lproc_mdc.c b/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
index 5fdee9e..c89003a 100644
--- a/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
+++ b/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
@@ -110,6 +110,27 @@ static ssize_t max_mod_rpcs_in_flight_store(struct kobject *kobj,
 }
 LUSTRE_RW_ATTR(max_mod_rpcs_in_flight);
 
+static int mdc_rpc_stats_seq_show(struct seq_file *seq, void *v)
+{
+	struct obd_device *dev = seq->private;
+
+	return obd_mod_rpc_stats_seq_show(&dev->u.cli, seq);
+}
+
+static ssize_t mdc_rpc_stats_seq_write(struct file *file,
+				       const char __user *buf,
+				       size_t len, loff_t *off)
+{
+	struct seq_file *seq = file->private_data;
+	struct obd_device *dev = seq->private;
+	struct client_obd *cli = &dev->u.cli;
+
+	lprocfs_oh_clear(&cli->cl_mod_rpcs_hist);
+
+	return len;
+}
+LPROC_SEQ_FOPS(mdc_rpc_stats);
+
 LPROC_SEQ_FOPS_WR_ONLY(mdc, ping);
 
 LPROC_SEQ_FOPS_RO_TYPE(mdc, connect_flags);
@@ -149,6 +170,8 @@ static ssize_t max_pages_per_rpc_show(struct kobject *kobj,
 	{ "import",		&mdc_import_fops,		NULL, 0 },
 	{ "state",		&mdc_state_fops,		NULL, 0 },
 	{ "pinger_recov",	&mdc_pinger_recov_fops,		NULL, 0 },
+	{ .name =	"rpc_stats",
+	  .fops =	&mdc_rpc_stats_fops		},
 	{ NULL }
 };
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 5b3d0ba..9bc5314 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -772,15 +772,16 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 		req->rq_sent = ktime_get_real_seconds() + resends;
 	}
 
-	/* It is important to obtain rpc_lock first (if applicable), so that
-	 * threads that are serialised with rpc_lock are not polluting our
-	 * rpcs in flight counter. We do not do flock request limiting, though
+	/* It is important to obtain modify RPC slot first (if applicable), so
+	 * that threads that are waiting for a modify RPC slot are not polluting
+	 * our rpcs in flight counter.
+	 * We do not do flock request limiting, though
 	 */
 	if (it) {
-		mdc_get_rpc_lock(obddev->u.cli.cl_rpc_lock, it);
+		mdc_get_mod_rpc_slot(req, it);
 		rc = obd_get_request_slot(&obddev->u.cli);
 		if (rc != 0) {
-			mdc_put_rpc_lock(obddev->u.cli.cl_rpc_lock, it);
+			mdc_put_mod_rpc_slot(req, it);
 			mdc_clear_replay_flag(req, 0);
 			ptlrpc_req_finished(req);
 			return rc;
@@ -807,7 +808,7 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 	}
 
 	obd_put_request_slot(&obddev->u.cli);
-	mdc_put_rpc_lock(obddev->u.cli.cl_rpc_lock, it);
+	mdc_put_mod_rpc_slot(req, it);
 
 	if (rc < 0) {
 		CDEBUG(D_INFO, "%s: ldlm_cli_enqueue failed: rc = %d\n",
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_reint.c b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
index 1847e5a..b551c57 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_reint.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
@@ -40,17 +40,15 @@
 #include "../include/lustre_fid.h"
 
 /* mdc_setattr does its own semaphore handling */
-static int mdc_reint(struct ptlrpc_request *request,
-		     struct mdc_rpc_lock *rpc_lock,
-		     int level)
+static int mdc_reint(struct ptlrpc_request *request, int level)
 {
 	int rc;
 
 	request->rq_send_state = level;
 
-	mdc_get_rpc_lock(rpc_lock, NULL);
+	mdc_get_mod_rpc_slot(request, NULL);
 	rc = ptlrpc_queue_wait(request);
-	mdc_put_rpc_lock(rpc_lock, NULL);
+	mdc_put_mod_rpc_slot(request, NULL);
 	if (rc)
 		CDEBUG(D_INFO, "error in handling %d\n", rc);
 	else if (!req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY))
@@ -103,8 +101,6 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 {
 	LIST_HEAD(cancels);
 	struct ptlrpc_request *req;
-	struct mdc_rpc_lock *rpc_lock;
-	struct obd_device *obd = exp->exp_obd;
 	int count = 0, rc;
 	__u64 bits;
 
@@ -131,8 +127,6 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 		return rc;
 	}
 
-	rpc_lock = obd->u.cli.cl_rpc_lock;
-
 	if (op_data->op_attr.ia_valid & (ATTR_MTIME | ATTR_CTIME))
 		CDEBUG(D_INODE, "setting mtime %ld, ctime %ld\n",
 		       LTIME_S(op_data->op_attr.ia_mtime),
@@ -141,7 +135,7 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 
 	ptlrpc_request_set_replen(req);
 
-	rc = mdc_reint(req, rpc_lock, LUSTRE_IMP_FULL);
+	rc = mdc_reint(req, LUSTRE_IMP_FULL);
 
 	if (rc == -ERESTARTSYS)
 		rc = 0;
@@ -220,7 +214,7 @@ int mdc_create(struct obd_export *exp, struct md_op_data *op_data,
 	}
 	level = LUSTRE_IMP_FULL;
  resend:
-	rc = mdc_reint(req, exp->exp_obd->u.cli.cl_rpc_lock, level);
+	rc = mdc_reint(req, level);
 
 	/* Resend if we were told to. */
 	if (rc == -ERESTARTSYS) {
@@ -292,7 +286,7 @@ int mdc_unlink(struct obd_export *exp, struct md_op_data *op_data,
 
 	*request = req;
 
-	rc = mdc_reint(req, obd->u.cli.cl_rpc_lock, LUSTRE_IMP_FULL);
+	rc = mdc_reint(req, LUSTRE_IMP_FULL);
 	if (rc == -ERESTARTSYS)
 		rc = 0;
 	return rc;
@@ -302,7 +296,6 @@ int mdc_link(struct obd_export *exp, struct md_op_data *op_data,
 	     struct ptlrpc_request **request)
 {
 	LIST_HEAD(cancels);
-	struct obd_device *obd = exp->exp_obd;
 	struct ptlrpc_request *req;
 	int count = 0, rc;
 
@@ -334,7 +327,7 @@ int mdc_link(struct obd_export *exp, struct md_op_data *op_data,
 	mdc_link_pack(req, op_data);
 	ptlrpc_request_set_replen(req);
 
-	rc = mdc_reint(req, obd->u.cli.cl_rpc_lock, LUSTRE_IMP_FULL);
+	rc = mdc_reint(req, LUSTRE_IMP_FULL);
 	*request = req;
 	if (rc == -ERESTARTSYS)
 		rc = 0;
@@ -398,7 +391,7 @@ int mdc_rename(struct obd_export *exp, struct md_op_data *op_data,
 			     obd->u.cli.cl_default_mds_easize);
 	ptlrpc_request_set_replen(req);
 
-	rc = mdc_reint(req, obd->u.cli.cl_rpc_lock, LUSTRE_IMP_FULL);
+	rc = mdc_reint(req, LUSTRE_IMP_FULL);
 	*request = req;
 	if (rc == -ERESTARTSYS)
 		rc = 0;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index c620f5c..e3a167a 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -327,12 +327,12 @@ static int mdc_xattr_common(struct obd_export *exp,
 
 	/* make rpc */
 	if (opcode == MDS_REINT)
-		mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+		mdc_get_mod_rpc_slot(req, NULL);
 
 	rc = ptlrpc_queue_wait(req);
 
 	if (opcode == MDS_REINT)
-		mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+		mdc_put_mod_rpc_slot(req, NULL);
 
 	if (rc)
 		ptlrpc_req_finished(req);
@@ -775,9 +775,9 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 
 	ptlrpc_request_set_replen(req);
 
-	mdc_get_rpc_lock(obd->u.cli.cl_close_lock, NULL);
+	mdc_get_mod_rpc_slot(req, NULL);
 	rc = ptlrpc_queue_wait(req);
-	mdc_put_rpc_lock(obd->u.cli.cl_close_lock, NULL);
+	mdc_put_mod_rpc_slot(req, NULL);
 
 	if (!req->rq_repmsg) {
 		CDEBUG(D_RPCTRACE, "request failed to send: %p, %d\n", req,
@@ -1490,9 +1490,9 @@ static int mdc_ioc_hsm_progress(struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	mdc_get_mod_rpc_slot(req, NULL);
 	rc = ptlrpc_queue_wait(req);
-	mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	mdc_put_mod_rpc_slot(req, NULL);
 out:
 	ptlrpc_req_finished(req);
 	return rc;
@@ -1670,9 +1670,9 @@ static int mdc_ioc_hsm_state_set(struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	mdc_get_mod_rpc_slot(req, NULL);
 	rc = ptlrpc_queue_wait(req);
-	mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	mdc_put_mod_rpc_slot(req, NULL);
 out:
 	ptlrpc_req_finished(req);
 	return rc;
@@ -1735,9 +1735,9 @@ static int mdc_ioc_hsm_request(struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	mdc_get_mod_rpc_slot(req, NULL);
 	rc = ptlrpc_queue_wait(req);
-	mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	mdc_put_mod_rpc_slot(req, NULL);
 out:
 	ptlrpc_req_finished(req);
 	return rc;
@@ -2582,29 +2582,17 @@ static void mdc_llog_finish(struct obd_device *obd)
 
 static int mdc_setup(struct obd_device *obd, struct lustre_cfg *cfg)
 {
-	struct client_obd *cli = &obd->u.cli;
 	struct lprocfs_static_vars lvars = { NULL };
 	int rc;
 
-	cli->cl_rpc_lock = kzalloc(sizeof(*cli->cl_rpc_lock), GFP_NOFS);
-	if (!cli->cl_rpc_lock)
-		return -ENOMEM;
-	mdc_init_rpc_lock(cli->cl_rpc_lock);
-
 	rc = ptlrpcd_addref();
 	if (rc < 0)
-		goto err_rpc_lock;
-
-	cli->cl_close_lock = kzalloc(sizeof(*cli->cl_close_lock), GFP_NOFS);
-	if (!cli->cl_close_lock) {
-		rc = -ENOMEM;
-		goto err_ptlrpcd_decref;
-	}
-	mdc_init_rpc_lock(cli->cl_close_lock);
+		return rc;
 
 	rc = client_obd_setup(obd, cfg);
 	if (rc)
-		goto err_close_lock;
+		goto err_ptlrpcd_decref;
+
 	lprocfs_mdc_init_vars(&lvars);
 	lprocfs_obd_setup(obd, lvars.obd_vars, lvars.sysfs_vars);
 	sptlrpc_lprocfs_cliobd_attach(obd);
@@ -2621,17 +2609,10 @@ static int mdc_setup(struct obd_device *obd, struct lustre_cfg *cfg)
 		return rc;
 	}
 
-	spin_lock_init(&cli->cl_mod_rpcs_lock);
-	cli->cl_max_mod_rpcs_in_flight = OBD_MAX_RIF_DEFAULT - 1;
-
 	return rc;
 
-err_close_lock:
-	kfree(cli->cl_close_lock);
 err_ptlrpcd_decref:
 	ptlrpcd_decref();
-err_rpc_lock:
-	kfree(cli->cl_rpc_lock);
 	return rc;
 }
 
@@ -2679,11 +2660,6 @@ static int mdc_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
 
 static int mdc_cleanup(struct obd_device *obd)
 {
-	struct client_obd *cli = &obd->u.cli;
-
-	kfree(cli->cl_rpc_lock);
-	kfree(cli->cl_close_lock);
-
 	ptlrpcd_decref();
 
 	return client_obd_cleanup(obd);
diff --git a/drivers/staging/lustre/lustre/obdclass/genops.c b/drivers/staging/lustre/lustre/obdclass/genops.c
index 62e6636..c7e7374 100644
--- a/drivers/staging/lustre/lustre/obdclass/genops.c
+++ b/drivers/staging/lustre/lustre/obdclass/genops.c
@@ -1461,6 +1461,7 @@ int obd_set_max_mod_rpcs_in_flight(struct client_obd *cli, __u16 max)
 {
 	struct obd_connect_data *ocd;
 	u16 maxmodrpcs;
+	u16 prev;
 
 	if (max > OBD_MAX_RIF_MAX || max < 1)
 		return -ERANGE;
@@ -1486,10 +1487,178 @@ int obd_set_max_mod_rpcs_in_flight(struct client_obd *cli, __u16 max)
 		return -ERANGE;
 	}
 
+	spin_lock(&cli->cl_mod_rpcs_lock);
+
+	prev = cli->cl_max_mod_rpcs_in_flight;
 	cli->cl_max_mod_rpcs_in_flight = max;
 
-	/* will have to wakeup waiters if max has been increased */
+	/* wakeup waiters if limit has been increased */
+	if (cli->cl_max_mod_rpcs_in_flight > prev)
+		wake_up(&cli->cl_mod_rpcs_waitq);
+
+	spin_unlock(&cli->cl_mod_rpcs_lock);
 
 	return 0;
 }
 EXPORT_SYMBOL(obd_set_max_mod_rpcs_in_flight);
+
+#define pct(a, b) (b ? (a * 100) / b : 0)
+
+int obd_mod_rpc_stats_seq_show(struct client_obd *cli, struct seq_file *seq)
+{
+	unsigned long mod_tot = 0, mod_cum;
+	struct timespec64 now;
+	int i;
+
+	ktime_get_real_ts64(&now);
+
+	spin_lock(&cli->cl_mod_rpcs_lock);
+
+	seq_printf(seq, "snapshot_time:         %lu.%lu (secs.usecs)\n",
+		   now.tv_sec, now.tv_nsec / 1000);
+	seq_printf(seq, "modify_RPCs_in_flight:  %hu\n",
+		   cli->cl_mod_rpcs_in_flight);
+
+	seq_puts(seq, "\n\t\t\tmodify\n");
+	seq_puts(seq, "rpcs in flight        rpcs   %% cum %%\n");
+
+	mod_tot = lprocfs_oh_sum(&cli->cl_mod_rpcs_hist);
+
+	mod_cum = 0;
+	for (i = 0; i < OBD_HIST_MAX; i++) {
+		unsigned long mod = cli->cl_mod_rpcs_hist.oh_buckets[i];
+
+		mod_cum += mod;
+		seq_printf(seq, "%d:\t\t%10lu %3lu %3lu\n",
+			   i, mod, pct(mod, mod_tot),
+			   pct(mod_cum, mod_tot));
+		if (mod_cum == mod_tot)
+			break;
+	}
+
+	spin_unlock(&cli->cl_mod_rpcs_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL(obd_mod_rpc_stats_seq_show);
+#undef pct
+
+/*
+ * The number of modify RPCs sent in parallel is limited
+ * because the server has a finite number of slots per client to
+ * store request result and ensure reply reconstruction when needed.
+ * On the client, this limit is stored in cl_max_mod_rpcs_in_flight
+ * that takes into account server limit and cl_max_rpcs_in_flight
+ * value.
+ * On the MDC client, to avoid a potential deadlock (see Bugzilla 3462),
+ * one close request is allowed above the maximum.
+ */
+static inline bool obd_mod_rpc_slot_avail_locked(struct client_obd *cli,
+						 bool close_req)
+{
+	bool avail;
+
+	/* A slot is available if
+	 * - number of modify RPCs in flight is less than the max
+	 * - it's a close RPC and no other close request is in flight
+	 */
+	avail = cli->cl_mod_rpcs_in_flight < cli->cl_max_mod_rpcs_in_flight ||
+		(close_req && !cli->cl_close_rpcs_in_flight);
+
+	return avail;
+}
+
+static inline bool obd_mod_rpc_slot_avail(struct client_obd *cli,
+					  bool close_req)
+{
+	bool avail;
+
+	spin_lock(&cli->cl_mod_rpcs_lock);
+	avail = obd_mod_rpc_slot_avail_locked(cli, close_req);
+	spin_unlock(&cli->cl_mod_rpcs_lock);
+	return avail;
+}
+
+/* Get a modify RPC slot from the obd client @cli according
+ * to the kind of operation @opc that is going to be sent
+ * and the intent @it of the operation if it applies.
+ * If the maximum number of modify RPCs in flight is reached
+ * the thread is put to sleep.
+ * Returns the tag to be set in the request message. Tag 0
+ * is reserved for non-modifying requests.
+ */
+u16 obd_get_mod_rpc_slot(struct client_obd *cli, __u32 opc,
+			 struct lookup_intent *it)
+{
+	struct l_wait_info lwi = LWI_INTR(NULL, NULL);
+	bool close_req = false;
+	u16 i, max;
+
+	/* read-only metadata RPCs don't consume a slot on MDT
+	 * for reply reconstruction
+	 */
+	if (it && (it->it_op == IT_GETATTR || it->it_op == IT_LOOKUP ||
+		   it->it_op == IT_LAYOUT || it->it_op == IT_READDIR))
+		return 0;
+
+	if (opc == MDS_CLOSE)
+		close_req = true;
+
+	do {
+		spin_lock(&cli->cl_mod_rpcs_lock);
+		max = cli->cl_max_mod_rpcs_in_flight;
+		if (obd_mod_rpc_slot_avail_locked(cli, close_req)) {
+			/* there is a slot available */
+			cli->cl_mod_rpcs_in_flight++;
+			if (close_req)
+				cli->cl_close_rpcs_in_flight++;
+			lprocfs_oh_tally(&cli->cl_mod_rpcs_hist,
+					 cli->cl_mod_rpcs_in_flight);
+			/* find a free tag */
+			i = find_first_zero_bit(cli->cl_mod_tag_bitmap,
+						max + 1);
+			LASSERT(i < OBD_MAX_RIF_MAX);
+			LASSERT(!test_and_set_bit(i, cli->cl_mod_tag_bitmap));
+			spin_unlock(&cli->cl_mod_rpcs_lock);
+			/* tag 0 is reserved for non-modify RPCs */
+			return i + 1;
+		}
+		spin_unlock(&cli->cl_mod_rpcs_lock);
+
+		CDEBUG(D_RPCTRACE, "%s: sleeping for a modify RPC slot opc %u, max %hu\n",
+		       cli->cl_import->imp_obd->obd_name, opc, max);
+
+		l_wait_event(cli->cl_mod_rpcs_waitq,
+			     obd_mod_rpc_slot_avail(cli, close_req), &lwi);
+	} while (true);
+}
+EXPORT_SYMBOL(obd_get_mod_rpc_slot);
+
+/*
+ * Put a modify RPC slot from the obd client @cli according
+ * to the kind of operation @opc that has been sent and the
+ * intent @it of the operation if it applies.
+ */
+void obd_put_mod_rpc_slot(struct client_obd *cli, u32 opc,
+			  struct lookup_intent *it, u16 tag)
+{
+	bool close_req = false;
+
+	if (it && (it->it_op == IT_GETATTR || it->it_op == IT_LOOKUP ||
+		   it->it_op == IT_LAYOUT || it->it_op == IT_READDIR))
+		return;
+
+	if (opc == MDS_CLOSE)
+		close_req = true;
+
+	spin_lock(&cli->cl_mod_rpcs_lock);
+	cli->cl_mod_rpcs_in_flight--;
+	if (close_req)
+		cli->cl_close_rpcs_in_flight--;
+	/* release the tag in the bitmap */
+	LASSERT(tag - 1 < OBD_MAX_RIF_MAX);
+	LASSERT(test_and_clear_bit(tag - 1, cli->cl_mod_tag_bitmap) != 0);
+	spin_unlock(&cli->cl_mod_rpcs_lock);
+	wake_up(&cli->cl_mod_rpcs_waitq);
+}
+EXPORT_SYMBOL(obd_put_mod_rpc_slot);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 13/29] staging: lustre: llite: report back to user bad stripe count
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (11 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 14/29] staging: lustre: ptlrpc: Do not resend req with allow_replay James Simmons
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

If the user is requesting a stripe count larger than what
is supported in the file system then report back to the
user an error as well as what is the largest possible
striping.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6602
Reviewed-on: http://review.whamcloud.com/15162
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dir.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index fb367ae..64a32d5 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1195,6 +1195,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		struct lmv_user_md *tmp = NULL;
 		union lmv_mds_md *lmm = NULL;
 		u64 valid = 0;
+		int max_stripe_count;
 		int stripe_count;
 		int mdt_index;
 		int lum_size;
@@ -1206,6 +1207,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		if (copy_from_user(&lum, ulmv, sizeof(*ulmv)))
 			return -EFAULT;
 
+		max_stripe_count = lum.lum_stripe_count;
 		/*
 		 * lum_magic will indicate which stripe the ioctl will like
 		 * to get, LMV_MAGIC_V1 is for normal LMV stripe, LMV_USER_MAGIC
@@ -1240,6 +1242,16 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		}
 
 		stripe_count = lmv_mds_md_stripe_count_get(lmm);
+		if (max_stripe_count < stripe_count) {
+			lum.lum_stripe_count = stripe_count;
+			if (copy_to_user(ulmv, &lum, sizeof(lum))) {
+				rc = -EFAULT;
+				goto finish_req;
+			}
+			rc = -E2BIG;
+			goto finish_req;
+		}
+
 		lum_size = lmv_user_md_size(stripe_count, LMV_MAGIC_V1);
 		tmp = kzalloc(lum_size, GFP_NOFS);
 		if (!tmp) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 14/29] staging: lustre: ptlrpc: Do not resend req with allow_replay
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (12 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 13/29] staging: lustre: llite: report back to user bad stripe count James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 15/29] staging: lustre: obdecho: don't copy lu_site James Simmons
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

If the request is allowed to be sent during recovery,
and it is not timeout yet, then we do not need to
resend it in the final stage of recovery.

Unnecessary resend will cause the bulk request to resend the
request with different mbit, but same xid, and on the remote
server side, it will drop such request with same xid, so it
will never get the bulk in this case.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6780
Reviewed-on: http://review.whamcloud.com/15458
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/recover.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/recover.c b/drivers/staging/lustre/lustre/ptlrpc/recover.c
index 405faf0..aa96534 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/recover.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/recover.c
@@ -194,7 +194,13 @@ int ptlrpc_resend(struct obd_import *imp)
 		LASSERTF((long)req > PAGE_SIZE && req != LP_POISON,
 			 "req %p bad\n", req);
 		LASSERTF(req->rq_type != LI_POISON, "req %p freed\n", req);
-		if (!ptlrpc_no_resend(req))
+
+		/*
+		 * If the request is allowed to be sent during replay and it
+		 * is not timeout yet, then it does not need to be resent.
+		 */
+		if (!ptlrpc_no_resend(req) &&
+		    (req->rq_timedout || !req->rq_allow_replay))
 			ptlrpc_resend_req(req);
 	}
 	spin_unlock(&imp->imp_lock);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 15/29] staging: lustre: obdecho: don't copy lu_site
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (13 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 14/29] staging: lustre: ptlrpc: Do not resend req with allow_replay James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 16/29] staging: lustre: mdc: deactive MDT permanently James Simmons
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	Olaf Faaland, James Simmons

From: Niu Yawei <yawei.niu@intel.com>

While creating an echo device, echo_device_alloc() copies the lu_site
from MD stack, such kind of copy result in uninitialized mutex and
other potential issues.

Instead of copying the lu_site, we'd use the lu_site by pointer directly.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6765
Reviewed-on: http://review.whamcloud.com/15657
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/obdecho/echo_client.c    |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index c69588c..30caf5f 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -55,7 +55,7 @@ struct echo_device {
 	struct echo_client_obd *ed_ec;
 
 	struct cl_site	  ed_site_myself;
-	struct cl_site	 *ed_site;
+	struct lu_site		*ed_site;
 	struct lu_device       *ed_next;
 };
 
@@ -527,17 +527,19 @@ static int echo_site_init(const struct lu_env *env, struct echo_device *ed)
 	}
 
 	rc = lu_site_init_finish(&site->cs_lu);
-	if (rc)
+	if (rc) {
+		cl_site_fini(site);
 		return rc;
+	}
 
-	ed->ed_site = site;
+	ed->ed_site = &site->cs_lu;
 	return 0;
 }
 
 static void echo_site_fini(const struct lu_env *env, struct echo_device *ed)
 {
 	if (ed->ed_site) {
-		cl_site_fini(ed->ed_site);
+		lu_site_fini(ed->ed_site);
 		ed->ed_site = NULL;
 	}
 }
@@ -674,7 +676,7 @@ static void echo_session_key_exit(const struct lu_context *ctx,
 			goto out_cleanup;
 		}
 
-		next->ld_site = &ed->ed_site->cs_lu;
+		next->ld_site = ed->ed_site;
 		rc = next->ld_type->ldt_ops->ldto_device_init(env, next,
 						next->ld_type->ldt_name,
 							      NULL);
@@ -741,7 +743,7 @@ static void echo_lock_release(const struct lu_env *env,
 	CDEBUG(D_INFO, "echo device:%p is going to be freed, next = %p\n",
 	       ed, next);
 
-	lu_site_purge(env, &ed->ed_site->cs_lu, -1);
+	lu_site_purge(env, ed->ed_site, -1);
 
 	/* check if there are objects still alive.
 	 * It shouldn't have any object because lu_site_purge would cleanup
@@ -754,7 +756,7 @@ static void echo_lock_release(const struct lu_env *env,
 	spin_unlock(&ec->ec_lock);
 
 	/* purge again */
-	lu_site_purge(env, &ed->ed_site->cs_lu, -1);
+	lu_site_purge(env, ed->ed_site, -1);
 
 	CDEBUG(D_INFO,
 	       "Waiting for the reference of echo object to be dropped\n");
@@ -766,7 +768,7 @@ static void echo_lock_release(const struct lu_env *env,
 		CERROR("echo_client still has objects at cleanup time, wait for 1 second\n");
 		set_current_state(TASK_UNINTERRUPTIBLE);
 		schedule_timeout(cfs_time_seconds(1));
-		lu_site_purge(env, &ed->ed_site->cs_lu, -1);
+		lu_site_purge(env, ed->ed_site, -1);
 		spin_lock(&ec->ec_lock);
 	}
 	spin_unlock(&ec->ec_lock);
@@ -780,7 +782,7 @@ static void echo_lock_release(const struct lu_env *env,
 	while (next)
 		next = next->ld_type->ldt_ops->ldto_device_free(env, next);
 
-	LASSERT(ed->ed_site == lu2cl_site(d->ld_site));
+	LASSERT(ed->ed_site == d->ld_site);
 	echo_site_fini(env, ed);
 	cl_device_fini(&ed->ed_cl);
 	kfree(ed);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 16/29] staging: lustre: mdc: deactive MDT permanently
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (14 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 15/29] staging: lustre: obdecho: don't copy lu_site James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 17/29] staging: lustre: ptlrpc: replay bulk request James Simmons
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

Add active proc entry for MDC, and mark MDC to
be inactive once the MDC is deactivated.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6586
Reviewed-on: http://review.whamcloud.com/14747
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c   |    1 +
 drivers/staging/lustre/lustre/mdc/lproc_mdc.c |   37 +++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 7a8d1c8..12e8b1e 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -62,6 +62,7 @@ static void lmv_activate_target(struct lmv_obd *lmv,
 
 	tgt->ltd_active = activate;
 	lmv->desc.ld_active_tgt_count += (activate ? 1 : -1);
+	tgt->ltd_exp->exp_obd->obd_inactive = !activate;
 }
 
 /**
diff --git a/drivers/staging/lustre/lustre/mdc/lproc_mdc.c b/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
index c89003a..9021c46 100644
--- a/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
+++ b/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
@@ -36,6 +36,42 @@
 #include "../include/lprocfs_status.h"
 #include "mdc_internal.h"
 
+static ssize_t active_show(struct kobject *kobj, struct attribute *attr,
+			   char *buf)
+{
+	struct obd_device *dev = container_of(kobj, struct obd_device,
+					      obd_kobj);
+
+	return sprintf(buf, "%u\n", !dev->u.cli.cl_import->imp_deactive);
+}
+
+static ssize_t active_store(struct kobject *kobj, struct attribute *attr,
+			    const char *buffer, size_t count)
+{
+	struct obd_device *dev = container_of(kobj, struct obd_device,
+					      obd_kobj);
+	unsigned long val;
+	int rc;
+
+	rc = kstrtoul(buffer, 10, &val);
+	if (rc)
+		return rc;
+
+	if (val < 0 || val > 1)
+		return -ERANGE;
+
+	/* opposite senses */
+	if (dev->u.cli.cl_import->imp_deactive == val) {
+		rc = ptlrpc_set_import_active(dev->u.cli.cl_import, val);
+		if (rc)
+			count = rc;
+	} else {
+		CDEBUG(D_CONFIG, "activate %lu: ignoring repeat request\n", val);
+	}
+	return count;
+}
+LUSTRE_RW_ATTR(active);
+
 static ssize_t max_rpcs_in_flight_show(struct kobject *kobj,
 				       struct attribute *attr,
 				       char *buf)
@@ -176,6 +212,7 @@ static ssize_t max_pages_per_rpc_show(struct kobject *kobj,
 };
 
 static struct attribute *mdc_attrs[] = {
+	&lustre_attr_active.attr,
 	&lustre_attr_max_rpcs_in_flight.attr,
 	&lustre_attr_max_mod_rpcs_in_flight.attr,
 	&lustre_attr_max_pages_per_rpc.attr,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 17/29] staging: lustre: ptlrpc: replay bulk request
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (15 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 16/29] staging: lustre: mdc: deactive MDT permanently James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 18/29] staging: lustre: mdt: disable IMA support James Simmons
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

Even though the server might already got the bulk
replay request, but bulk transfer timeout, let's
replay the bulk request, i.e. treat such replay as
same as no replied replay request (See
ptlrpc_replay_interpret()).

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6924
Reviewed-on: http://review.whamcloud.com/15793
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/client.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index e4fbdd0..bda925e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -2762,8 +2762,15 @@ static int ptlrpc_replay_interpret(const struct lu_env *env,
 
 	atomic_dec(&imp->imp_replay_inflight);
 
-	if (!ptlrpc_client_replied(req)) {
-		CERROR("request replay timed out, restarting recovery\n");
+	/*
+	 * Note: if it is bulk replay (MDS-MDS replay), then even if
+	 * server got the request, but bulk transfer timeout, let's
+	 * replay the bulk req again
+	 */
+	if (!ptlrpc_client_replied(req) ||
+	    (req->rq_bulk &&
+	     lustre_msg_get_status(req->rq_repmsg) == -ETIMEDOUT)) {
+		DEBUG_REQ(D_ERROR, req, "request replay timed out.\n");
 		rc = -ETIMEDOUT;
 		goto out;
 	}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 18/29] staging: lustre: mdt: disable IMA support
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (16 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 17/29] staging: lustre: ptlrpc: replay bulk request James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 19/29] staging: lustre: ldlm: reclaim granted locks defensively James Simmons
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Hongchao Zhang, James Simmons

From: Hongchao Zhang <hongchao.zhang@intel.com>

For IMA (Integrity Measurement Architecture), there are two xattr
"security.ima" and "security.evm" to protect the file to be modified
accidentally or maliciously, the two xattr are not compatible with
VBR, then disable it to workaround the problem currently and enable
it when the conditions are ready.

Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6455
Reviewed-on: http://review.whamcloud.com/14928
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/xattr.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index 3ae1a02..ea3becc 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -126,6 +126,11 @@ static int xattr_type_filter(struct ll_sb_info *sbi,
 	    strcmp(name, "selinux") == 0)
 		return -EOPNOTSUPP;
 
+	/*FIXME: enable IMA when the conditions are ready */
+	if (handler->flags == XATTR_SECURITY_T &&
+	    (!strcmp(name, "ima") || !strcmp(name, "evm")))
+		return -EOPNOTSUPP;
+
 	sprintf(fullname, "%s%s\n", handler->prefix, name);
 	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode),
 			 valid, fullname, pv, size, 0, flags,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 19/29] staging: lustre: ldlm: reclaim granted locks defensively
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (17 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 18/29] staging: lustre: mdt: disable IMA support James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 20/29] staging: lustre: recovery: don't skip open replay on reconnect James Simmons
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

It was discovered that to many ldlm locks where being
created on the server side to the point of memory
exhaustion. The work of LU-6529 introduced watermarks
to avoid this memory exhaustion. This is the client
side part of this work for the upstream client.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6529
Reviewed-on: http://review.whamcloud.com/14931
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6929
Reviewed-on: http://review.whamcloud.com/15813
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/linux/libcfs/libcfs_hash.h      |    2 +-
 drivers/staging/lustre/lnet/libcfs/hash.c          |   24 +++++++++++++++-----
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    8 ++++--
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |   15 ++++--------
 5 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_hash.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_hash.h
index 6949a18..f2b4399 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_hash.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_hash.h
@@ -705,7 +705,7 @@ typedef int (*cfs_hash_for_each_cb_t)(struct cfs_hash *hs,
 cfs_hash_for_each_safe(struct cfs_hash *hs, cfs_hash_for_each_cb_t, void *data);
 int
 cfs_hash_for_each_nolock(struct cfs_hash *hs, cfs_hash_for_each_cb_t,
-			 void *data);
+			 void *data, int start);
 int
 cfs_hash_for_each_empty(struct cfs_hash *hs, cfs_hash_for_each_cb_t,
 			void *data);
diff --git a/drivers/staging/lustre/lnet/libcfs/hash.c b/drivers/staging/lustre/lnet/libcfs/hash.c
index 23283b6..997b8a5 100644
--- a/drivers/staging/lustre/lnet/libcfs/hash.c
+++ b/drivers/staging/lustre/lnet/libcfs/hash.c
@@ -1552,7 +1552,7 @@ struct cfs_hash_cond_arg {
  */
 static int
 cfs_hash_for_each_relax(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
-			void *data)
+			void *data, int start)
 {
 	struct hlist_node *hnode;
 	struct hlist_node *tmp;
@@ -1560,18 +1560,25 @@ struct cfs_hash_cond_arg {
 	__u32 version;
 	int count = 0;
 	int stop_on_change;
-	int rc;
+	int end = -1;
+	int rc = 0;
 	int i;
 
 	stop_on_change = cfs_hash_with_rehash_key(hs) ||
 			 !cfs_hash_with_no_itemref(hs) ||
 			 !hs->hs_ops->hs_put_locked;
 	cfs_hash_lock(hs, 0);
+again:
 	LASSERT(!cfs_hash_is_rehashing(hs));
 
 	cfs_hash_for_each_bucket(hs, &bd, i) {
 		struct hlist_head *hhead;
 
+		if (i < start)
+			continue;
+		else if (end > 0 && i >= end)
+			break;
+
 		cfs_hash_bd_lock(hs, &bd, 0);
 		version = cfs_hash_bd_version_get(&bd);
 
@@ -1611,14 +1618,19 @@ struct cfs_hash_cond_arg {
 		if (rc) /* callback wants to break iteration */
 			break;
 	}
-	cfs_hash_unlock(hs, 0);
+	if (start > 0 && !rc) {
+		end = start;
+		start = 0;
+		goto again;
+	}
 
+	cfs_hash_unlock(hs, 0);
 	return count;
 }
 
 int
 cfs_hash_for_each_nolock(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
-			 void *data)
+			 void *data, int start)
 {
 	if (cfs_hash_with_no_lock(hs) ||
 	    cfs_hash_with_rehash_key(hs) ||
@@ -1630,7 +1642,7 @@ struct cfs_hash_cond_arg {
 		return -EOPNOTSUPP;
 
 	cfs_hash_for_each_enter(hs);
-	cfs_hash_for_each_relax(hs, func, data);
+	cfs_hash_for_each_relax(hs, func, data, start);
 	cfs_hash_for_each_exit(hs);
 
 	return 0;
@@ -1662,7 +1674,7 @@ struct cfs_hash_cond_arg {
 		return -EOPNOTSUPP;
 
 	cfs_hash_for_each_enter(hs);
-	while (cfs_hash_for_each_relax(hs, func, data)) {
+	while (cfs_hash_for_each_relax(hs, func, data, 0)) {
 		CDEBUG(D_INFO, "Try to empty hash: %s, loop: %u\n",
 		       hs->hs_name, i++);
 	}
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index ac1927c..43856ff 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -1729,7 +1729,7 @@ int ldlm_cli_cancel_unused(struct ldlm_namespace *ns,
 						       opaque);
 	} else {
 		cfs_hash_for_each_nolock(ns->ns_rs_hash,
-					 ldlm_cli_hash_cancel_unused, &arg);
+					 ldlm_cli_hash_cancel_unused, &arg, 0);
 		return ELDLM_OK;
 	}
 }
@@ -1802,7 +1802,7 @@ static void ldlm_namespace_foreach(struct ldlm_namespace *ns,
 	};
 
 	cfs_hash_for_each_nolock(ns->ns_rs_hash,
-				 ldlm_res_iter_helper, &helper);
+				 ldlm_res_iter_helper, &helper, 0);
 }
 
 /* non-blocking function to manipulate a lock whose cb_data is being put away.
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index 07cb955..c452400 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -855,8 +855,10 @@ int ldlm_namespace_cleanup(struct ldlm_namespace *ns, __u64 flags)
 		return ELDLM_OK;
 	}
 
-	cfs_hash_for_each_nolock(ns->ns_rs_hash, ldlm_resource_clean, &flags);
-	cfs_hash_for_each_nolock(ns->ns_rs_hash, ldlm_resource_complain, NULL);
+	cfs_hash_for_each_nolock(ns->ns_rs_hash, ldlm_resource_clean,
+				 &flags, 0);
+	cfs_hash_for_each_nolock(ns->ns_rs_hash, ldlm_resource_complain,
+				 NULL, 0);
 	return ELDLM_OK;
 }
 EXPORT_SYMBOL(ldlm_namespace_cleanup);
@@ -1352,7 +1354,7 @@ void ldlm_namespace_dump(int level, struct ldlm_namespace *ns)
 
 	cfs_hash_for_each_nolock(ns->ns_rs_hash,
 				 ldlm_res_hash_dump,
-				 (void *)(unsigned long)level);
+				 (void *)(unsigned long)level, 0);
 	spin_lock(&ns->ns_lock);
 	ns->ns_next_dump = cfs_time_shift(10);
 	spin_unlock(&ns->ns_lock);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 9bc5314..42a128f 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -760,12 +760,6 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	if (req && it && it->it_op & IT_CREAT)
-		/* ask ptlrpc not to resend on EINPROGRESS since we have our own
-		 * retry logic
-		 */
-		req->rq_no_retry_einprogress = 1;
-
 	if (resends) {
 		req->rq_generation_set = 1;
 		req->rq_import_generation = generation;
@@ -824,11 +818,12 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 	lockrep->lock_policy_res2 =
 		ptlrpc_status_ntoh(lockrep->lock_policy_res2);
 
-	/* Retry the create infinitely when we get -EINPROGRESS from
-	 * server. This is required by the new quota design.
+	/*
+	 * Retry infinitely when the server returns -EINPROGRESS for the
+	 * intent operation, when server returns -EINPROGRESS for acquiring
+	 * intent lock, we'll retry in after_reply().
 	 */
-	if (it->it_op & IT_CREAT &&
-	    (int)lockrep->lock_policy_res2 == -EINPROGRESS) {
+	if (it->it_op && (int)lockrep->lock_policy_res2 == -EINPROGRESS) {
 		mdc_clear_replay_flag(req, rc);
 		ptlrpc_req_finished(req);
 		resends++;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 20/29] staging: lustre: recovery: don't skip open replay on reconnect
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (18 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 19/29] staging: lustre: ldlm: reclaim granted locks defensively James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 21/29] staging: lustre: obdclass: race lustre_profile_list James Simmons
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

Once reconnect happened during replay, we'd continue the open
replay with the last failed replay, but not the next.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6802
Reviewed-on: http://review.whamcloud.com/15871
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/recover.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/recover.c b/drivers/staging/lustre/lustre/ptlrpc/recover.c
index aa96534..9144cd8 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/recover.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/recover.c
@@ -111,7 +111,9 @@ int ptlrpc_replay_next(struct obd_import *imp, int *inflight)
 			 * all of it's requests being replayed, it's safe to
 			 * use a cursor to accelerate the search
 			 */
-			imp->imp_replay_cursor = imp->imp_replay_cursor->next;
+			if (!imp->imp_resend_replay ||
+			    imp->imp_replay_cursor == &imp->imp_committed_list)
+				imp->imp_replay_cursor = imp->imp_replay_cursor->next;
 
 			while (imp->imp_replay_cursor !=
 			       &imp->imp_committed_list) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 21/29] staging: lustre: obdclass: race lustre_profile_list
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (19 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 20/29] staging: lustre: recovery: don't skip open replay on reconnect James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 22/29] staging: lustre: llite: add LL_IOC_FUTIMES_3 James Simmons
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Hiroya Nozaki, James Simmons

From: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com>

Running multiple mounts at the same time results in
lustre_profile_list corruption when adding a new profile.

This patch adds a new spin_lock to protect the list and
avoid the bug

Signed-off-by: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6600
Reviewed-on: http://review.whamcloud.com/14896
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd_class.h  |    3 +
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    2 +
 .../staging/lustre/lustre/obdclass/obd_config.c    |   57 +++++++++++++++++---
 3 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 476b1e4..f79133c 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -182,10 +182,13 @@ struct lustre_profile {
 	char	    *lp_profile;
 	char	    *lp_dt;
 	char	    *lp_md;
+	int			lp_refs;
+	bool			lp_list_deleted;
 };
 
 struct lustre_profile *class_get_profile(const char *prof);
 void class_del_profile(const char *prof);
+void class_put_profile(struct lustre_profile *lprof);
 void class_del_profiles(void);
 
 #if LUSTRE_TRACKS_LOCK_EXP_REFS
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index b896ac1..308da06 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -929,6 +929,8 @@ int ll_fill_super(struct super_block *sb, struct vfsmount *mnt)
 out_free:
 	kfree(md);
 	kfree(dt);
+	if (lprof)
+		class_put_profile(lprof);
 	if (err)
 		ll_put_super(sb);
 	else if (sbi->ll_flags & LL_SBI_VERBOSE)
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c b/drivers/staging/lustre/lustre/obdclass/obd_config.c
index bbed1b7..017bdac 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
@@ -585,16 +585,21 @@ static int class_del_conn(struct obd_device *obd, struct lustre_cfg *lcfg)
 }
 
 static LIST_HEAD(lustre_profile_list);
+static DEFINE_SPINLOCK(lustre_profile_list_lock);
 
 struct lustre_profile *class_get_profile(const char *prof)
 {
 	struct lustre_profile *lprof;
 
+	spin_lock(&lustre_profile_list_lock);
 	list_for_each_entry(lprof, &lustre_profile_list, lp_list) {
 		if (!strcmp(lprof->lp_profile, prof)) {
+			lprof->lp_refs++;
+			spin_unlock(&lustre_profile_list_lock);
 			return lprof;
 		}
 	}
+	spin_unlock(&lustre_profile_list_lock);
 	return NULL;
 }
 EXPORT_SYMBOL(class_get_profile);
@@ -639,7 +644,11 @@ static int class_add_profile(int proflen, char *prof, int osclen, char *osc,
 		}
 	}
 
+	spin_lock(&lustre_profile_list_lock);
+	lprof->lp_refs = 1;
+	lprof->lp_list_deleted = false;
 	list_add(&lprof->lp_list, &lustre_profile_list);
+	spin_unlock(&lustre_profile_list_lock);
 	return err;
 
 free_lp_dt:
@@ -659,27 +668,59 @@ void class_del_profile(const char *prof)
 
 	lprof = class_get_profile(prof);
 	if (lprof) {
+		spin_lock(&lustre_profile_list_lock);
+		/* because get profile increments the ref counter */
+		lprof->lp_refs--;
 		list_del(&lprof->lp_list);
-		kfree(lprof->lp_profile);
-		kfree(lprof->lp_dt);
-		kfree(lprof->lp_md);
-		kfree(lprof);
+		lprof->lp_list_deleted = true;
+		spin_unlock(&lustre_profile_list_lock);
+
+		class_put_profile(lprof);
 	}
 }
 EXPORT_SYMBOL(class_del_profile);
 
+void class_put_profile(struct lustre_profile *lprof)
+{
+	spin_lock(&lustre_profile_list_lock);
+	if (--lprof->lp_refs > 0) {
+		LASSERT(lprof->lp_refs > 0);
+		spin_unlock(&lustre_profile_list_lock);
+		return;
+	}
+	spin_unlock(&lustre_profile_list_lock);
+
+	/* confirm not a negative number */
+	LASSERT(!lprof->lp_refs);
+
+	/*
+	 * At least one class_del_profile/profiles must be called
+	 * on the target profile or lustre_profile_list will corrupt
+	 */
+	LASSERT(lprof->lp_list_deleted);
+	kfree(lprof->lp_profile);
+	kfree(lprof->lp_dt);
+	kfree(lprof->lp_md);
+	kfree(lprof);
+}
+EXPORT_SYMBOL(class_put_profile);
+
 /* COMPAT_146 */
 void class_del_profiles(void)
 {
 	struct lustre_profile *lprof, *n;
 
+	spin_lock(&lustre_profile_list_lock);
 	list_for_each_entry_safe(lprof, n, &lustre_profile_list, lp_list) {
 		list_del(&lprof->lp_list);
-		kfree(lprof->lp_profile);
-		kfree(lprof->lp_dt);
-		kfree(lprof->lp_md);
-		kfree(lprof);
+		lprof->lp_list_deleted = true;
+		spin_unlock(&lustre_profile_list_lock);
+
+		class_put_profile(lprof);
+
+		spin_lock(&lustre_profile_list_lock);
 	}
+	spin_unlock(&lustre_profile_list_lock);
 }
 EXPORT_SYMBOL(class_del_profiles);
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 22/29] staging: lustre: llite: add LL_IOC_FUTIMES_3
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (20 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 21/29] staging: lustre: obdclass: race lustre_profile_list James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-30 14:59   ` Greg Kroah-Hartman
  2016-10-27 22:11 ` [PATCH 23/29] staging: lustre: ptlrpc: do not sleep if encpool reached max capacity James Simmons
                   ` (6 subsequent siblings)
  28 siblings, 1 reply; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Add a new regular file ioctl LL_IOC_FUTIMES_3 similar to futimes() but
which allows setting of all three inode timestamps. Use this ioctl
during HSM restore to ensure that the volatile file has the same
timestamps as the file to be restored. Strengthen sanity-hsm test_24a
to check that archive, release, and restore do not change a file's
ctime. Add sanity-hsm test_24e to check that tar will succeed when it
encounters a HSM released file.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6213
Reviewed-on: http://review.whamcloud.com/13665
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |   10 ++++
 drivers/staging/lustre/lustre/llite/file.c         |   45 ++++++++++++++++++++
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    6 ++-
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 856e2f9..1d72b73 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -213,6 +213,15 @@ struct ost_id {
 #define DOSTID "%#llx:%llu"
 #define POSTID(oi) ostid_seq(oi), ostid_id(oi)
 
+struct ll_futimes_3 {
+	__u64 lfu_atime_sec;
+	__u64 lfu_atime_nsec;
+	__u64 lfu_mtime_sec;
+	__u64 lfu_mtime_nsec;
+	__u64 lfu_ctime_sec;
+	__u64 lfu_ctime_nsec;
+};
+
 /*
  * The ioctl naming rules:
  * LL_*     - works on the currently opened filehandle instead of parent dir
@@ -249,6 +258,7 @@ struct ost_id {
 #define LL_IOC_PATH2FID		 _IOR('f', 173, long)
 #define LL_IOC_GET_CONNECT_FLAGS	_IOWR('f', 174, __u64 *)
 #define LL_IOC_GET_MDTIDX	       _IOR('f', 175, int)
+#define LL_IOC_FUTIMES_3		_IOWR('f', 176, struct ll_futimes_3)
 
 /*	lustre_ioctl.h			177-210 */
 #define LL_IOC_HSM_STATE_GET		_IOR('f', 211, struct hsm_user_state)
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 0ae78c7..694d920 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1904,6 +1904,41 @@ static inline long ll_lease_type_from_fmode(fmode_t fmode)
 	       ((fmode & FMODE_WRITE) ? LL_LEASE_WRLCK : 0);
 }
 
+static int ll_file_futimes_3(struct file *file, const struct ll_futimes_3 *lfu)
+{
+	struct inode *inode = file_inode(file);
+	struct iattr ia = {
+		.ia_valid = ATTR_ATIME | ATTR_ATIME_SET |
+			    ATTR_MTIME | ATTR_MTIME_SET |
+			    ATTR_CTIME | ATTR_CTIME_SET,
+		.ia_atime = {
+			.tv_sec = lfu->lfu_atime_sec,
+			.tv_nsec = lfu->lfu_atime_nsec,
+		},
+		.ia_mtime = {
+			.tv_sec = lfu->lfu_mtime_sec,
+			.tv_nsec = lfu->lfu_mtime_nsec,
+		},
+		.ia_ctime = {
+			.tv_sec = lfu->lfu_ctime_sec,
+			.tv_nsec = lfu->lfu_ctime_nsec,
+		},
+	};
+	int rc;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!S_ISREG(inode->i_mode))
+		return -EINVAL;
+
+	inode_lock(inode);
+	rc = ll_setattr_raw(file_dentry(file), &ia, false);
+	inode_unlock(inode);
+
+	return rc;
+}
+
 static long
 ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
@@ -2138,6 +2173,16 @@ static inline long ll_lease_type_from_fmode(fmode_t fmode)
 				fmode = 0;
 
 			return ll_lease_type_from_fmode(fmode);
+		case LL_IOC_FUTIMES_3: {
+			const struct ll_futimes_3 __user *lfu_user;
+			struct ll_futimes_3 lfu;
+
+			lfu_user = (const struct ll_futimes_3 __user *)arg;
+			if (copy_from_user(&lfu, lfu_user, sizeof(lfu)))
+				return -EFAULT;
+
+			return ll_file_futimes_3(file, &lfu);
+		}
 		default:
 			return -EINVAL;
 		}
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 308da06..2954aa9 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1410,7 +1410,8 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	}
 
 	/* We mark all of the fields "set" so MDS/OST does not re-set them */
-	if (attr->ia_valid & ATTR_CTIME) {
+	if (!(attr->ia_valid & ATTR_CTIME_SET) &&
+	    (attr->ia_valid & ATTR_CTIME)) {
 		attr->ia_ctime = CURRENT_TIME;
 		attr->ia_valid |= ATTR_CTIME_SET;
 	}
@@ -1516,7 +1517,8 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 
 	if (attr->ia_valid & (ATTR_SIZE |
 			      ATTR_ATIME | ATTR_ATIME_SET |
-			      ATTR_MTIME | ATTR_MTIME_SET)) {
+			      ATTR_MTIME | ATTR_MTIME_SET |
+			      ATTR_CTIME | ATTR_CTIME_SET)) {
 		/* For truncate and utimes sending attributes to OSTs, setting
 		 * mtime/atime to the past will be performed under PW [0:EOF]
 		 * extent lock (new_size:EOF for truncate).  It may seem
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 23/29] staging: lustre: ptlrpc: do not sleep if encpool reached max capacity
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (21 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 22/29] staging: lustre: llite: add LL_IOC_FUTIMES_3 James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 24/29] staging: lustre: ptlrpc: Forbid too early NRS policy tunings James Simmons
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Sebastien Buisson, James Simmons

From: Sebastien Buisson <sebastien.buisson@bull.net>

When encryption is enabled RPCs are encrypted just before being
sent. The encryption requires allocating memory in the encoding pool.
The current implementation in sptlrpc_enc_pool_get_pages() is
deadlock-prone. Indeed, if there is no more free pages in the pool,
all ptlrpcd threads can end up waiting in a queue, so there is no
thread available to process other requests. It means client is not
able to process replies from servers that yet contain last committed
transno useful to release memory allocated by previous requests,
including enc_pool pages.

To fix this, in sptlrpc_enc_pool_get_pages(), do not make ptlrpcd
threads wait in queue if encoding pool has already reached its maximum
capacity. Instead, return -ENOMEM. If functions calling ptl_send_rpc()
get -ENOMEM, then put back request in queue by moving it back to
RQ_PHASE_NEW phase.

As an optimization, do not call ptl_send_rpc() again for requests that
already failed to allocate in the enc_pool, as long as there is not
enough memory in the enc_pool to satisfy theirs needs.

In /sys/fs/lustre/sptlrpc/encrypt_page_pools, add a new 'out of mem'
stat to track how many requests fail to allocate memory in the
enc_pool.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
Reviewed-on: http://review.whamcloud.com/15070
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_sec.h |    2 +
 drivers/staging/lustre/lustre/ptlrpc/client.c      |   25 +++++++++++++++++
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      |    9 +++++-
 drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c    |   29 +++++++++++++++++---
 4 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_sec.h b/drivers/staging/lustre/lustre/include/lustre_sec.h
index 90c1834..89658e0 100644
--- a/drivers/staging/lustre/lustre/include/lustre_sec.h
+++ b/drivers/staging/lustre/lustre/include/lustre_sec.h
@@ -1029,6 +1029,8 @@ int  sptlrpc_target_export_check(struct obd_export *exp,
 
 /* bulk security api */
 void sptlrpc_enc_pool_put_pages(struct ptlrpc_bulk_desc *desc);
+int get_free_pages_in_pool(void);
+int pool_is_at_full_capacity(void);
 
 int sptlrpc_cli_wrap_bulk(struct ptlrpc_request *req,
 			  struct ptlrpc_bulk_desc *desc);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index bda925e..cc4b129 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1440,6 +1440,13 @@ static int ptlrpc_send_new_req(struct ptlrpc_request *req)
 	int rc;
 
 	LASSERT(req->rq_phase == RQ_PHASE_NEW);
+
+	/* do not try to go further if there is not enough memory in enc_pool */
+	if (req->rq_sent && req->rq_bulk)
+		if (req->rq_bulk->bd_iov_count > get_free_pages_in_pool() &&
+		    pool_is_at_full_capacity())
+			return -ENOMEM;
+
 	if (req->rq_sent && (req->rq_sent > ktime_get_real_seconds()) &&
 	    (!req->rq_generation_set ||
 	     req->rq_import_generation == imp->imp_generation))
@@ -1533,6 +1540,16 @@ static int ptlrpc_send_new_req(struct ptlrpc_request *req)
 	       lustre_msg_get_opc(req->rq_reqmsg));
 
 	rc = ptl_send_rpc(req, 0);
+	if (rc == -ENOMEM) {
+		spin_lock(&imp->imp_lock);
+		if (!list_empty(&req->rq_list)) {
+			list_del_init(&req->rq_list);
+			atomic_dec(&req->rq_import->imp_inflight);
+		}
+		spin_unlock(&imp->imp_lock);
+		ptlrpc_rqphase_move(req, RQ_PHASE_NEW);
+		return rc;
+	}
 	if (rc) {
 		DEBUG_REQ(D_HA, req, "send failed (%d); expect timeout", rc);
 		spin_lock(&req->rq_lock);
@@ -1822,6 +1839,14 @@ int ptlrpc_check_set(const struct lu_env *env, struct ptlrpc_request_set *set)
 				}
 
 				rc = ptl_send_rpc(req, 0);
+				if (rc == -ENOMEM) {
+					spin_lock(&imp->imp_lock);
+					if (!list_empty(&req->rq_list))
+						list_del_init(&req->rq_list);
+					spin_unlock(&imp->imp_lock);
+					ptlrpc_rqphase_move(req, RQ_PHASE_NEW);
+					continue;
+				}
 				if (rc) {
 					DEBUG_REQ(D_HA, req,
 						  "send failed: rc = %d", rc);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index c2dd948..4e80ba9 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -536,8 +536,15 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 		mpflag = cfs_memory_pressure_get_and_set();
 
 	rc = sptlrpc_cli_wrap_request(request);
-	if (rc)
+	if (rc) {
+		/*
+		 * set rq_sent so that this request is treated
+		 * as a delayed send in the upper layers
+		 */
+		if (rc == -ENOMEM)
+			request->rq_sent = ktime_get_seconds();
 		goto out;
+	}
 
 	/* bulk register should be done after wrap_request() */
 	if (request->rq_bulk) {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c b/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
index ceb805d..2fe9085 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
@@ -108,6 +108,7 @@
 	unsigned long    epp_st_lowfree;	/* lowest free pages reached */
 	unsigned int     epp_st_max_wqlen;      /* highest waitqueue length */
 	unsigned long       epp_st_max_wait;       /* in jiffies */
+	unsigned long	 epp_st_outofmem;	/* # of out of mem requests */
 	/*
 	 * pointers to pools
 	 */
@@ -139,7 +140,8 @@ int sptlrpc_proc_enc_pool_seq_show(struct seq_file *m, void *v)
 		   "cache missing:	   %lu\n"
 		   "low free mark:	   %lu\n"
 		   "max waitqueue depth:     %u\n"
-		   "max wait time:	   %ld/%lu\n",
+		   "max wait time:	   %ld/%lu\n"
+		   "out of mem:		 %lu\n",
 		   totalram_pages,
 		   PAGES_PER_POOL,
 		   page_pools.epp_max_pages,
@@ -158,7 +160,8 @@ int sptlrpc_proc_enc_pool_seq_show(struct seq_file *m, void *v)
 		   page_pools.epp_st_lowfree,
 		   page_pools.epp_st_max_wqlen,
 		   page_pools.epp_st_max_wait,
-		   msecs_to_jiffies(MSEC_PER_SEC));
+		   msecs_to_jiffies(MSEC_PER_SEC),
+		   page_pools.epp_st_outofmem);
 
 	spin_unlock(&page_pools.epp_lock);
 
@@ -306,6 +309,22 @@ static inline void enc_pools_wakeup(void)
 	}
 }
 
+/*
+ * Export the number of free pages in the pool
+ */
+int get_free_pages_in_pool(void)
+{
+	return page_pools.epp_free_pages;
+}
+
+/*
+ * Let outside world know if enc_pool full capacity is reached
+ */
+int pool_is_at_full_capacity(void)
+{
+	return (page_pools.epp_total_pages == page_pools.epp_max_pages);
+}
+
 void sptlrpc_enc_pool_put_pages(struct ptlrpc_bulk_desc *desc)
 {
 	int p_idx, g_idx;
@@ -406,6 +425,7 @@ int sptlrpc_enc_pool_init(void)
 	page_pools.epp_st_lowfree = 0;
 	page_pools.epp_st_max_wqlen = 0;
 	page_pools.epp_st_max_wait = 0;
+	page_pools.epp_st_outofmem = 0;
 
 	enc_pools_alloc();
 	if (!page_pools.epp_pools)
@@ -433,13 +453,14 @@ void sptlrpc_enc_pool_fini(void)
 
 	if (page_pools.epp_st_access > 0) {
 		CDEBUG(D_SEC,
-		       "max pages %lu, grows %u, grow fails %u, shrinks %u, access %lu, missing %lu, max qlen %u, max wait %ld/%ld\n",
+		       "max pages %lu, grows %u, grow fails %u, shrinks %u, access %lu, missing %lu, max qlen %u, max wait %ld/%ld, out of mem %lu\n",
 		       page_pools.epp_st_max_pages, page_pools.epp_st_grows,
 		       page_pools.epp_st_grow_fails,
 		       page_pools.epp_st_shrinks, page_pools.epp_st_access,
 		       page_pools.epp_st_missings, page_pools.epp_st_max_wqlen,
 		       page_pools.epp_st_max_wait,
-		       msecs_to_jiffies(MSEC_PER_SEC));
+		       msecs_to_jiffies(MSEC_PER_SEC),
+		       page_pools.epp_st_outofmem);
 	}
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 24/29] staging: lustre: ptlrpc: Forbid too early NRS policy tunings
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (22 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 23/29] staging: lustre: ptlrpc: do not sleep if encpool reached max capacity James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:11 ` [PATCH 25/29] staging: lustre: llite: Inform copytool of dataversion changes James Simmons
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Henri Doreau,
	James Simmons

From: Henri Doreau <henri.doreau@cea.fr>

Wait for a NRS policy to be fully started before allowing
to apply related tunings, so that all fields are properly
initialized.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6673
Reviewed-on: http://review.whamcloud.com/15104
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/nrs.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/nrs.c b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
index d88faf6..f856632 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/nrs.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
@@ -619,6 +619,15 @@ static int nrs_policy_ctl(struct ptlrpc_nrs *nrs, char *name,
 		goto out;
 	}
 
+	/**
+	 * Wait for the policy to be fully started before attempting
+	 * to operate it.
+	 */
+	if (policy->pol_state == NRS_POL_STATE_STARTING) {
+		rc = -EAGAIN;
+		goto out;
+	}
+
 	switch (opc) {
 		/**
 		 * Unknown opcode, pass it down to the policy-specific control
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 25/29] staging: lustre: llite: Inform copytool of dataversion changes
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (23 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 24/29] staging: lustre: ptlrpc: Forbid too early NRS policy tunings James Simmons
@ 2016-10-27 22:11 ` James Simmons
  2016-10-27 22:12 ` [PATCH 26/29] staging: lustre: ptlrpc: do not switch out-of-date context James Simmons
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Henri Doreau,
	James Simmons

From: Henri Doreau <henri.doreau@cea.fr>

Notify copytool that a file could not be archived due to dataversion
change. This does not introduce any functional change but ensures
that errors are consistently reported in copytool logs.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5683
Reviewed-on: http://review.whamcloud.com/14377
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Robert Read <robert.read@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dir.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 64a32d5..ce05493 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -837,10 +837,10 @@ static int ll_ioc_copy_end(struct super_block *sb, struct hsm_copy *copy)
 			 * when the file will not be modified for some tunable
 			 * time
 			 */
-			/* we do not notify caller */
 			hpk.hpk_flags &= ~HP_FLAG_RETRY;
+			rc = -EBUSY;
 			/* hpk_errval must be >= 0 */
-			hpk.hpk_errval = EBUSY;
+			hpk.hpk_errval = -rc;
 		}
 	}
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 26/29] staging: lustre: ptlrpc: do not switch out-of-date context
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (24 preceding siblings ...)
  2016-10-27 22:11 ` [PATCH 25/29] staging: lustre: llite: Inform copytool of dataversion changes James Simmons
@ 2016-10-27 22:12 ` James Simmons
  2016-10-27 22:12 ` [PATCH 27/29] staging: lustre: headers: Create single .h for lu_seq_range James Simmons
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Sebastien Buisson, James Simmons

From: Sebastien Buisson <sebastien.buisson@bull.net>

When trying to replace a dead sec context with a new one, we must
ensure the new context is already up-to-date.
If it is not the case, just return from sptlrpc_req_replace_dead_ctx()
and come later when the new context has been updated.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6536
Reviewed-on: http://review.whamcloud.com/15708
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/sec.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
index 12f02ed..e860df7 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
@@ -515,6 +515,13 @@ static int sptlrpc_req_replace_dead_ctx(struct ptlrpc_request *req)
 
 		set_current_state(TASK_INTERRUPTIBLE);
 		schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC));
+	} else if (unlikely(!test_bit(PTLRPC_CTX_UPTODATE_BIT, &newctx->cc_flags))) {
+		/*
+		 * new ctx not up to date yet
+		 */
+		CDEBUG(D_SEC,
+		       "ctx (%p, fl %lx) doesn't switch, not up to date yet\n",
+		       newctx, newctx->cc_flags);
 	} else {
 		/*
 		 * it's possible newctx == oldctx if we're switching
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 27/29] staging: lustre: headers: Create single .h for lu_seq_range
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (25 preceding siblings ...)
  2016-10-27 22:12 ` [PATCH 26/29] staging: lustre: ptlrpc: do not switch out-of-date context James Simmons
@ 2016-10-27 22:12 ` James Simmons
  2016-10-27 22:12 ` [PATCH 28/29] staging: lustre: ptlrpc: imp_peer_committed_transno should increase James Simmons
  2016-10-27 22:12 ` [PATCH 29/29] staging: lustre: llog: record the minimum record size James Simmons
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Ben Evans,
	James Simmons

From: Ben Evans <bevans@cray.com>

Put lu_seq_range related functions into a single .h.
Include directly from files which use it, and remove
definitions from lustre_idl.h.

Signed-off-by: Ben Evans <bevans@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6401
Reviewed-on: http://review.whamcloud.com/15952
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/fid/fid_request.c    |   14 +-
 drivers/staging/lustre/lustre/fid/lproc_fid.c      |    2 +-
 drivers/staging/lustre/lustre/fld/fld_cache.c      |    6 +-
 .../lustre/lustre/include/lustre/lustre_idl.h      |  108 -----------
 drivers/staging/lustre/lustre/include/lustre_fid.h |    1 +
 drivers/staging/lustre/lustre/include/seq_range.h  |  199 ++++++++++++++++++++
 6 files changed, 211 insertions(+), 119 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/include/seq_range.h

diff --git a/drivers/staging/lustre/lustre/fid/fid_request.c b/drivers/staging/lustre/lustre/fid/fid_request.c
index dfef3ca..999f250 100644
--- a/drivers/staging/lustre/lustre/fid/fid_request.c
+++ b/drivers/staging/lustre/lustre/fid/fid_request.c
@@ -74,7 +74,7 @@ static int seq_client_rpc(struct lu_client_seq *seq,
 
 	/* Zero out input range, this is not recovery yet. */
 	in = req_capsule_client_get(&req->rq_pill, &RMF_SEQ_RANGE);
-	range_init(in);
+	lu_seq_range_init(in);
 
 	ptlrpc_request_set_replen(req);
 
@@ -119,14 +119,14 @@ static int seq_client_rpc(struct lu_client_seq *seq,
 	out = req_capsule_server_get(&req->rq_pill, &RMF_SEQ_RANGE);
 	*output = *out;
 
-	if (!range_is_sane(output)) {
+	if (!lu_seq_range_is_sane(output)) {
 		CERROR("%s: Invalid range received from server: "
 		       DRANGE "\n", seq->lcs_name, PRANGE(output));
 		rc = -EINVAL;
 		goto out_req;
 	}
 
-	if (range_is_exhausted(output)) {
+	if (lu_seq_range_is_exhausted(output)) {
 		CERROR("%s: Range received from server is exhausted: "
 		       DRANGE "]\n", seq->lcs_name, PRANGE(output));
 		rc = -EINVAL;
@@ -166,9 +166,9 @@ static int seq_client_alloc_seq(const struct lu_env *env,
 {
 	int rc;
 
-	LASSERT(range_is_sane(&seq->lcs_space));
+	LASSERT(lu_seq_range_is_sane(&seq->lcs_space));
 
-	if (range_is_exhausted(&seq->lcs_space)) {
+	if (lu_seq_range_is_exhausted(&seq->lcs_space)) {
 		rc = seq_client_alloc_meta(env, seq);
 		if (rc) {
 			CERROR("%s: Can't allocate new meta-sequence, rc %d\n",
@@ -181,7 +181,7 @@ static int seq_client_alloc_seq(const struct lu_env *env,
 		rc = 0;
 	}
 
-	LASSERT(!range_is_exhausted(&seq->lcs_space));
+	LASSERT(!lu_seq_range_is_exhausted(&seq->lcs_space));
 	*seqnr = seq->lcs_space.lsr_start;
 	seq->lcs_space.lsr_start += 1;
 
@@ -316,7 +316,7 @@ void seq_client_flush(struct lu_client_seq *seq)
 
 	seq->lcs_space.lsr_index = -1;
 
-	range_init(&seq->lcs_space);
+	lu_seq_range_init(&seq->lcs_space);
 	mutex_unlock(&seq->lcs_mutex);
 }
 EXPORT_SYMBOL(seq_client_flush);
diff --git a/drivers/staging/lustre/lustre/fid/lproc_fid.c b/drivers/staging/lustre/lustre/fid/lproc_fid.c
index 3ed32d7..97d4849 100644
--- a/drivers/staging/lustre/lustre/fid/lproc_fid.c
+++ b/drivers/staging/lustre/lustre/fid/lproc_fid.c
@@ -83,7 +83,7 @@
 		    (unsigned long long *)&tmp.lsr_end);
 	if (rc != 2)
 		return -EINVAL;
-	if (!range_is_sane(&tmp) || range_is_zero(&tmp) ||
+	if (!lu_seq_range_is_sane(&tmp) || lu_seq_range_is_zero(&tmp) ||
 	    tmp.lsr_start < range->lsr_start || tmp.lsr_end > range->lsr_end)
 		return -EINVAL;
 	*range = tmp;
diff --git a/drivers/staging/lustre/lustre/fld/fld_cache.c b/drivers/staging/lustre/lustre/fld/fld_cache.c
index 0100a93..11f6974 100644
--- a/drivers/staging/lustre/lustre/fld/fld_cache.c
+++ b/drivers/staging/lustre/lustre/fld/fld_cache.c
@@ -143,7 +143,7 @@ static void fld_fix_new_list(struct fld_cache *cache)
 		c_range = &f_curr->fce_range;
 		n_range = &f_next->fce_range;
 
-		LASSERT(range_is_sane(c_range));
+		LASSERT(lu_seq_range_is_sane(c_range));
 		if (&f_next->fce_list == head)
 			break;
 
@@ -358,7 +358,7 @@ struct fld_cache_entry
 {
 	struct fld_cache_entry *f_new;
 
-	LASSERT(range_is_sane(range));
+	LASSERT(lu_seq_range_is_sane(range));
 
 	f_new = kzalloc(sizeof(*f_new), GFP_NOFS);
 	if (!f_new)
@@ -503,7 +503,7 @@ int fld_cache_lookup(struct fld_cache *cache,
 		}
 
 		prev = flde;
-		if (range_within(&flde->fce_range, seq)) {
+		if (lu_seq_range_within(&flde->fce_range, seq)) {
 			*range = flde->fce_range;
 
 			cache->fci_stat.fst_cache++;
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index e542ce6..6896c37 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -192,113 +192,6 @@ struct lu_seq_range_array {
 
 #define LU_SEQ_RANGE_MASK	0x3
 
-static inline unsigned fld_range_type(const struct lu_seq_range *range)
-{
-	return range->lsr_flags & LU_SEQ_RANGE_MASK;
-}
-
-static inline bool fld_range_is_ost(const struct lu_seq_range *range)
-{
-	return fld_range_type(range) == LU_SEQ_RANGE_OST;
-}
-
-static inline bool fld_range_is_mdt(const struct lu_seq_range *range)
-{
-	return fld_range_type(range) == LU_SEQ_RANGE_MDT;
-}
-
-/**
- * This all range is only being used when fld client sends fld query request,
- * but it does not know whether the seq is MDT or OST, so it will send req
- * with ALL type, which means either seq type gotten from lookup can be
- * expected.
- */
-static inline unsigned fld_range_is_any(const struct lu_seq_range *range)
-{
-	return fld_range_type(range) == LU_SEQ_RANGE_ANY;
-}
-
-static inline void fld_range_set_type(struct lu_seq_range *range,
-				      unsigned flags)
-{
-	range->lsr_flags |= flags;
-}
-
-static inline void fld_range_set_mdt(struct lu_seq_range *range)
-{
-	fld_range_set_type(range, LU_SEQ_RANGE_MDT);
-}
-
-static inline void fld_range_set_ost(struct lu_seq_range *range)
-{
-	fld_range_set_type(range, LU_SEQ_RANGE_OST);
-}
-
-static inline void fld_range_set_any(struct lu_seq_range *range)
-{
-	fld_range_set_type(range, LU_SEQ_RANGE_ANY);
-}
-
-/**
- * returns  width of given range \a r
- */
-
-static inline __u64 range_space(const struct lu_seq_range *range)
-{
-	return range->lsr_end - range->lsr_start;
-}
-
-/**
- * initialize range to zero
- */
-
-static inline void range_init(struct lu_seq_range *range)
-{
-	memset(range, 0, sizeof(*range));
-}
-
-/**
- * check if given seq id \a s is within given range \a r
- */
-
-static inline bool range_within(const struct lu_seq_range *range,
-				__u64 s)
-{
-	return s >= range->lsr_start && s < range->lsr_end;
-}
-
-static inline bool range_is_sane(const struct lu_seq_range *range)
-{
-	return (range->lsr_end >= range->lsr_start);
-}
-
-static inline bool range_is_zero(const struct lu_seq_range *range)
-{
-	return (range->lsr_start == 0 && range->lsr_end == 0);
-}
-
-static inline bool range_is_exhausted(const struct lu_seq_range *range)
-
-{
-	return range_space(range) == 0;
-}
-
-/* return 0 if two range have the same location */
-static inline int range_compare_loc(const struct lu_seq_range *r1,
-				    const struct lu_seq_range *r2)
-{
-	return r1->lsr_index != r2->lsr_index ||
-	       r1->lsr_flags != r2->lsr_flags;
-}
-
-#define DRANGE "[%#16.16Lx-%#16.16Lx):%x:%s"
-
-#define PRANGE(range)		\
-	(range)->lsr_start,	\
-	(range)->lsr_end,	\
-	(range)->lsr_index,	\
-	fld_range_is_mdt(range) ? "mdt" : "ost"
-
 /** \defgroup lu_fid lu_fid
  * @{
  */
@@ -848,7 +741,6 @@ static inline bool fid_is_sane(const struct lu_fid *fid)
 }
 
 void lustre_swab_lu_fid(struct lu_fid *fid);
-void lustre_swab_lu_seq_range(struct lu_seq_range *range);
 
 static inline bool lu_fid_eq(const struct lu_fid *f0, const struct lu_fid *f1)
 {
diff --git a/drivers/staging/lustre/lustre/include/lustre_fid.h b/drivers/staging/lustre/lustre/include/lustre_fid.h
index 3167806..b5a1aad 100644
--- a/drivers/staging/lustre/lustre/include/lustre_fid.h
+++ b/drivers/staging/lustre/lustre/include/lustre_fid.h
@@ -150,6 +150,7 @@
 
 #include "../../include/linux/libcfs/libcfs.h"
 #include "lustre/lustre_idl.h"
+#include "seq_range.h"
 
 struct lu_env;
 struct lu_site;
diff --git a/drivers/staging/lustre/lustre/include/seq_range.h b/drivers/staging/lustre/lustre/include/seq_range.h
new file mode 100644
index 0000000..30c4dd6
--- /dev/null
+++ b/drivers/staging/lustre/lustre/include/seq_range.h
@@ -0,0 +1,199 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved.
+ * Use is subject to license terms.
+ *
+ * Copyright (c) 2011, 2014, Intel Corporation.
+ *
+ * Copyright 2015 Cray Inc, all rights reserved.
+ * Author: Ben Evans.
+ *
+ * Define lu_seq_range  associated functions
+ */
+
+#ifndef _SEQ_RANGE_H_
+#define _SEQ_RANGE_H_
+
+#include "lustre/lustre_idl.h"
+
+/**
+ * computes the sequence range type \a range
+ */
+
+static inline unsigned int fld_range_type(const struct lu_seq_range *range)
+{
+	return range->lsr_flags & LU_SEQ_RANGE_MASK;
+}
+
+/**
+ *  Is this sequence range an OST? \a range
+ */
+
+static inline bool fld_range_is_ost(const struct lu_seq_range *range)
+{
+	return fld_range_type(range) == LU_SEQ_RANGE_OST;
+}
+
+/**
+ *  Is this sequence range an MDT? \a range
+ */
+
+static inline bool fld_range_is_mdt(const struct lu_seq_range *range)
+{
+	return fld_range_type(range) == LU_SEQ_RANGE_MDT;
+}
+
+/**
+ * ANY range is only used when the fld client sends a fld query request,
+ * but it does not know whether the seq is an MDT or OST, so it will send the
+ * request with ANY type, which means any seq type from the lookup can be
+ * expected. /a range
+ */
+static inline unsigned int fld_range_is_any(const struct lu_seq_range *range)
+{
+	return fld_range_type(range) == LU_SEQ_RANGE_ANY;
+}
+
+/**
+ * Apply flags to range \a range \a flags
+ */
+
+static inline void fld_range_set_type(struct lu_seq_range *range,
+				      unsigned int flags)
+{
+	range->lsr_flags |= flags;
+}
+
+/**
+ * Add MDT to range type \a range
+ */
+
+static inline void fld_range_set_mdt(struct lu_seq_range *range)
+{
+	fld_range_set_type(range, LU_SEQ_RANGE_MDT);
+}
+
+/**
+ * Add OST to range type \a range
+ */
+
+static inline void fld_range_set_ost(struct lu_seq_range *range)
+{
+	fld_range_set_type(range, LU_SEQ_RANGE_OST);
+}
+
+/**
+ * Add ANY to range type \a range
+ */
+
+static inline void fld_range_set_any(struct lu_seq_range *range)
+{
+	fld_range_set_type(range, LU_SEQ_RANGE_ANY);
+}
+
+/**
+ * computes width of given sequence range \a range
+ */
+
+static inline u64 lu_seq_range_space(const struct lu_seq_range *range)
+{
+	return range->lsr_end - range->lsr_start;
+}
+
+/**
+ * initialize range to zero \a range
+ */
+
+static inline void lu_seq_range_init(struct lu_seq_range *range)
+{
+	memset(range, 0, sizeof(*range));
+}
+
+/**
+ * check if given seq id \a s is within given range \a range
+ */
+
+static inline bool lu_seq_range_within(const struct lu_seq_range *range,
+				       u64 seq)
+{
+	return seq >= range->lsr_start && seq < range->lsr_end;
+}
+
+/**
+ * Is the range sane?  Is the end after the beginning? \a range
+ */
+
+static inline bool lu_seq_range_is_sane(const struct lu_seq_range *range)
+{
+	return range->lsr_end >= range->lsr_start;
+}
+
+/**
+ * Is the range 0? \a range
+ */
+
+static inline bool lu_seq_range_is_zero(const struct lu_seq_range *range)
+{
+	return range->lsr_start == 0 && range->lsr_end == 0;
+}
+
+/**
+ * Is the range out of space? \a range
+ */
+
+static inline bool lu_seq_range_is_exhausted(const struct lu_seq_range *range)
+{
+	return lu_seq_range_space(range) == 0;
+}
+
+/**
+ * return 0 if two ranges have the same location, nonzero if they are
+ * different \a r1 \a r2
+ */
+
+static inline int lu_seq_range_compare_loc(const struct lu_seq_range *r1,
+					   const struct lu_seq_range *r2)
+{
+	return r1->lsr_index != r2->lsr_index ||
+		r1->lsr_flags != r2->lsr_flags;
+}
+
+#if !defined(__REQ_LAYOUT_USER__)
+/**
+ * byte swap range structure \a range
+ */
+
+void lustre_swab_lu_seq_range(struct lu_seq_range *range);
+#endif
+/**
+ * printf string and argument list for sequence range
+ */
+#define DRANGE "[%#16.16llx-%#16.16llx]:%x:%s"
+
+#define PRANGE(range)		\
+	(range)->lsr_start,	\
+	(range)->lsr_end,	\
+	(range)->lsr_index,	\
+	fld_range_is_mdt(range) ? "mdt" : "ost"
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 28/29] staging: lustre: ptlrpc: imp_peer_committed_transno should increase
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (26 preceding siblings ...)
  2016-10-27 22:12 ` [PATCH 27/29] staging: lustre: headers: Create single .h for lu_seq_range James Simmons
@ 2016-10-27 22:12 ` James Simmons
  2016-10-27 22:12 ` [PATCH 29/29] staging: lustre: llog: record the minimum record size James Simmons
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Alex Zhuravlev, James Simmons

From: Alex Zhuravlev <alexey.zhuravlev@intel.com>

imp_peer_committed_transno should not decrease as this can
confuse the users if imp_peer_committed_transno is used to
check commit status.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7079
Reviewed-on: http://review.whamcloud.com/16161
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/client.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index cc4b129..7cbfb4c 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1242,6 +1242,7 @@ static int after_reply(struct ptlrpc_request *req)
 	int rc;
 	struct timespec64 work_start;
 	long timediff;
+	u64 committed;
 
 	LASSERT(obd);
 	/* repbuf must be unlinked */
@@ -1400,10 +1401,9 @@ static int after_reply(struct ptlrpc_request *req)
 		}
 
 		/* Replay-enabled imports return commit-status information. */
-		if (lustre_msg_get_last_committed(req->rq_repmsg)) {
-			imp->imp_peer_committed_transno =
-				lustre_msg_get_last_committed(req->rq_repmsg);
-		}
+		committed = lustre_msg_get_last_committed(req->rq_repmsg);
+		if (likely(committed > imp->imp_peer_committed_transno))
+			imp->imp_peer_committed_transno = committed;
 
 		ptlrpc_free_committed(imp);
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 29/29] staging: lustre: llog: record the minimum record size
  2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
                   ` (27 preceding siblings ...)
  2016-10-27 22:12 ` [PATCH 28/29] staging: lustre: ptlrpc: imp_peer_committed_transno should increase James Simmons
@ 2016-10-27 22:12 ` James Simmons
  28 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-10-27 22:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

The minimum record size will be recorded in llh_size, which is
only used by fixed size record llog now, and also add another
flag LLOG_F_IS_FIXSIZE to indicate the fix size record llog.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7050
Reviewed-on: http://review.whamcloud.com/16103
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    7 +++++++
 drivers/staging/lustre/lustre/obdclass/llog.c      |    1 +
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 6896c37..db09f3b 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -3020,7 +3020,14 @@ enum llog_flag {
 	LLOG_F_IS_CAT		= 0x2,
 	LLOG_F_IS_PLAIN		= 0x4,
 	LLOG_F_EXT_JOBID        = BIT(3),
+	LLOG_F_IS_FIXSIZE	= BIT(4),
 
+	/*
+	 * Note: Flags covered by LLOG_F_EXT_MASK will be inherited from
+	 * catlog to plain log, so do not add LLOG_F_IS_FIXSIZE here,
+	 * because the catlog record is usually fixed size, but its plain
+	 * log record can be variable
+	 */
 	LLOG_F_EXT_MASK = LLOG_F_EXT_JOBID,
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
index 5c9447e..3bc1789 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog.c
@@ -193,6 +193,7 @@ int llog_init_handle(const struct lu_env *env, struct llog_handle *handle,
 		LASSERT(list_empty(&handle->u.chd.chd_head));
 		INIT_LIST_HEAD(&handle->u.chd.chd_head);
 		llh->llh_size = sizeof(struct llog_logid_rec);
+		llh->llh_flags |= LLOG_F_IS_FIXSIZE;
 	} else if (!(flags & LLOG_F_IS_PLAIN)) {
 		CERROR("%s: unknown flags: %#x (expected %#x or %#x)\n",
 		       handle->lgh_ctxt->loc_obd->obd_name,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight
  2016-10-27 22:11 ` [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight James Simmons
@ 2016-10-27 23:40   ` kbuild test robot
  2016-10-30 15:04   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 34+ messages in thread
From: kbuild test robot @ 2016-10-27 23:40 UTC (permalink / raw)
  To: James Simmons
  Cc: kbuild-all, Greg Kroah-Hartman, devel, Andreas Dilger,
	Oleg Drokin, Gregoire Pichon, Linux Kernel Mailing List,
	Lustre Development List

[-- Attachment #1: Type: text/plain, Size: 2276 bytes --]

Hi Gregoire,

[auto build test WARNING on staging/staging-testing]
[also build test WARNING on next-20161027]
[cannot apply to v4.9-rc2]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/James-Simmons/Batch-one-for-work-from-2-7-55-to-2-7-59/20161028-061814
config: openrisc-allmodconfig (attached as .config)
compiler: or32-linux-gcc (GCC) 4.5.1-or32-1.0rc1
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=openrisc 

All warnings (new ones prefixed by >>):

   drivers/staging/lustre/lustre/obdclass/genops.c: In function 'obd_mod_rpc_stats_seq_show':
>> drivers/staging/lustre/lustre/obdclass/genops.c:1518:6: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'time64_t'

vim +1518 drivers/staging/lustre/lustre/obdclass/genops.c

  1502	}
  1503	EXPORT_SYMBOL(obd_set_max_mod_rpcs_in_flight);
  1504	
  1505	#define pct(a, b) (b ? (a * 100) / b : 0)
  1506	
  1507	int obd_mod_rpc_stats_seq_show(struct client_obd *cli, struct seq_file *seq)
  1508	{
  1509		unsigned long mod_tot = 0, mod_cum;
  1510		struct timespec64 now;
  1511		int i;
  1512	
  1513		ktime_get_real_ts64(&now);
  1514	
  1515		spin_lock(&cli->cl_mod_rpcs_lock);
  1516	
  1517		seq_printf(seq, "snapshot_time:         %lu.%lu (secs.usecs)\n",
> 1518			   now.tv_sec, now.tv_nsec / 1000);
  1519		seq_printf(seq, "modify_RPCs_in_flight:  %hu\n",
  1520			   cli->cl_mod_rpcs_in_flight);
  1521	
  1522		seq_puts(seq, "\n\t\t\tmodify\n");
  1523		seq_puts(seq, "rpcs in flight        rpcs   %% cum %%\n");
  1524	
  1525		mod_tot = lprocfs_oh_sum(&cli->cl_mod_rpcs_hist);
  1526	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 39323 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 22/29] staging: lustre: llite: add LL_IOC_FUTIMES_3
  2016-10-27 22:11 ` [PATCH 22/29] staging: lustre: llite: add LL_IOC_FUTIMES_3 James Simmons
@ 2016-10-30 14:59   ` Greg Kroah-Hartman
  2016-11-07  3:47     ` James Simmons
  0 siblings, 1 reply; 34+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-30 14:59 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Oleg Drokin, John L. Hammond,
	Linux Kernel Mailing List, Lustre Development List

On Thu, Oct 27, 2016 at 06:11:56PM -0400, James Simmons wrote:
> From: John L. Hammond <john.hammond@intel.com>
> 
> Add a new regular file ioctl LL_IOC_FUTIMES_3 similar to futimes() but
> which allows setting of all three inode timestamps. Use this ioctl
> during HSM restore to ensure that the volatile file has the same
> timestamps as the file to be restored. Strengthen sanity-hsm test_24a
> to check that archive, release, and restore do not change a file's
> ctime. Add sanity-hsm test_24e to check that tar will succeed when it
> encounters a HSM released file.

This sounds odd, why is this filesystem the only one that needs a
"special" futimes?  Don't make up new syscalls by making an ioctl
please, make a new syscall if that's what you really need!

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight
  2016-10-27 22:11 ` [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight James Simmons
  2016-10-27 23:40   ` kbuild test robot
@ 2016-10-30 15:04   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 34+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-30 15:04 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Oleg Drokin, Gregoire Pichon,
	Linux Kernel Mailing List, Lustre Development List

On Thu, Oct 27, 2016 at 06:11:46PM -0400, James Simmons wrote:
> From: Gregoire Pichon <gregoire.pichon@bull.net>
> 
> This patch is the main client part of a new feature that supports
> multiple modify metadata RPCs in parallel. Its goal is to improve
> metadata operations performance of a single client, while maintening
> the consistency of MDT reply reconstruction and MDT recovery
> mechanisms.
> 
> It allows to manage the number of modify RPCs in flight within
> the client obd structure and to assign a virtual index (the tag) to
> each modify RPC to help server side cleaning of reply data.
> 
> The mdc component uses this feature to send multiple modify RPCs
> in parallel.

I'm not going to apply this one due to the kbuild warning it causes.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 22/29] staging: lustre: llite: add LL_IOC_FUTIMES_3
  2016-10-30 14:59   ` Greg Kroah-Hartman
@ 2016-11-07  3:47     ` James Simmons
  0 siblings, 0 replies; 34+ messages in thread
From: James Simmons @ 2016-11-07  3:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: devel, Andreas Dilger, Oleg Drokin, John L. Hammond,
	Linux Kernel Mailing List, Lustre Development List


> On Thu, Oct 27, 2016 at 06:11:56PM -0400, James Simmons wrote:
> > From: John L. Hammond <john.hammond@intel.com>
> > 
> > Add a new regular file ioctl LL_IOC_FUTIMES_3 similar to futimes() but
> > which allows setting of all three inode timestamps. Use this ioctl
> > during HSM restore to ensure that the volatile file has the same
> > timestamps as the file to be restored. Strengthen sanity-hsm test_24a
> > to check that archive, release, and restore do not change a file's
> > ctime. Add sanity-hsm test_24e to check that tar will succeed when it
> > encounters a HSM released file.
> 
> This sounds odd, why is this filesystem the only one that needs a
> "special" futimes?  Don't make up new syscalls by making an ioctl
> please, make a new syscall if that's what you really need!

This is a special case dealing with file being retrieve from tape backup.
The file being restored ended up with the wrong timestamps. This ioctl
was used to retrieve the correct timestamps from our meta data servers.
Another difference is that our futimes changes all the timestamps. I see
the main issue is wrapping such functionality into ioctls. We have the
have developed something, ladvise, along the lines of fadvise. We will
have to discuss doing it using syscall instead.   

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2016-11-07  3:47 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-27 22:11 [PATCH 00/29] Batch one for work from 2.7.55 to 2.7.59 James Simmons
2016-10-27 22:11 ` [PATCH 01/29] staging: lustre: osc: remove handling cl_avail_grant less than zero James Simmons
2016-10-27 22:11 ` [PATCH 02/29] staging: lustre: llite: remove IS_ERR(master_inode) check James Simmons
2016-10-27 22:11 ` [PATCH 03/29] staging: lustre: llite: restart short read/write for normal IO James Simmons
2016-10-27 22:11 ` [PATCH 04/29] staging: lustre: obdclass: variable llog chunk size James Simmons
2016-10-27 22:11 ` [PATCH 05/29] staging: lustre: lmv: allow cross-MDT rename and link James Simmons
2016-10-27 22:11 ` [PATCH 06/29] staging: lustre: ptlrpc: Introduce iovec to bulk descriptor James Simmons
2016-10-27 22:11 ` [PATCH 07/29] staging: lustre: lov: remove LSM from struct lustre_md James Simmons
2016-10-27 22:11 ` [PATCH 08/29] staging: lustre: clio: update file attributes after sync James Simmons
2016-10-27 22:11 ` [PATCH 09/29] staging: lustre: dne: setdirstripe should fail if not supported James Simmons
2016-10-27 22:11 ` [PATCH 10/29] staging: lustre: ptlrpc: embed highest XID in each request James Simmons
2016-10-27 22:11 ` [PATCH 11/29] staging: lustre: llite: basic support of SELinux in CLIO James Simmons
2016-10-27 22:11 ` [PATCH 12/29] staging: lustre: mdc: manage number of modify RPCs in flight James Simmons
2016-10-27 23:40   ` kbuild test robot
2016-10-30 15:04   ` Greg Kroah-Hartman
2016-10-27 22:11 ` [PATCH 13/29] staging: lustre: llite: report back to user bad stripe count James Simmons
2016-10-27 22:11 ` [PATCH 14/29] staging: lustre: ptlrpc: Do not resend req with allow_replay James Simmons
2016-10-27 22:11 ` [PATCH 15/29] staging: lustre: obdecho: don't copy lu_site James Simmons
2016-10-27 22:11 ` [PATCH 16/29] staging: lustre: mdc: deactive MDT permanently James Simmons
2016-10-27 22:11 ` [PATCH 17/29] staging: lustre: ptlrpc: replay bulk request James Simmons
2016-10-27 22:11 ` [PATCH 18/29] staging: lustre: mdt: disable IMA support James Simmons
2016-10-27 22:11 ` [PATCH 19/29] staging: lustre: ldlm: reclaim granted locks defensively James Simmons
2016-10-27 22:11 ` [PATCH 20/29] staging: lustre: recovery: don't skip open replay on reconnect James Simmons
2016-10-27 22:11 ` [PATCH 21/29] staging: lustre: obdclass: race lustre_profile_list James Simmons
2016-10-27 22:11 ` [PATCH 22/29] staging: lustre: llite: add LL_IOC_FUTIMES_3 James Simmons
2016-10-30 14:59   ` Greg Kroah-Hartman
2016-11-07  3:47     ` James Simmons
2016-10-27 22:11 ` [PATCH 23/29] staging: lustre: ptlrpc: do not sleep if encpool reached max capacity James Simmons
2016-10-27 22:11 ` [PATCH 24/29] staging: lustre: ptlrpc: Forbid too early NRS policy tunings James Simmons
2016-10-27 22:11 ` [PATCH 25/29] staging: lustre: llite: Inform copytool of dataversion changes James Simmons
2016-10-27 22:12 ` [PATCH 26/29] staging: lustre: ptlrpc: do not switch out-of-date context James Simmons
2016-10-27 22:12 ` [PATCH 27/29] staging: lustre: headers: Create single .h for lu_seq_range James Simmons
2016-10-27 22:12 ` [PATCH 28/29] staging: lustre: ptlrpc: imp_peer_committed_transno should increase James Simmons
2016-10-27 22:12 ` [PATCH 29/29] staging: lustre: llog: record the minimum record size James Simmons

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).