* [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client
@ 2018-12-17 16:29 James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout James Simmons
                   ` (28 more replies)
  0 siblings, 29 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

This is the initial PFL (Progressive File Layout) port to the Linux
Lustre client. I'm sending it out as an RFC to get feedback on the
port so far. Currently sanity passes, but the sanity-pfl tests fail
as shown below. I have been tracking down various bugs, but this one
remains and I haven't found out why it's failing. As far as I can
tell, lov_io_setattr_iter_init() is returning -ENODATA because
lsm_entry_inited() reports that the entry is not initialized. I'm
hoping that sending this out will get more eyes on it and help find
where this last problem is.

Lustre: DEBUG MARKER: == sanity-pfl test 0: Create full components file, no reused OSTs ==================================== 10:53:08 (1545061988)
Lustre: DEBUG MARKER: create directory /lustre/lustre/d0.sanity-pfl
Lustre: DEBUG MARKER: create comp_file
Lustre: DEBUG MARKER: instantiate components
LustreError: 19350:0:(cl_io.c:439:cl_io_iter_fini()) ASSERTION( io->ci_state == CIS_UNLOCKED ) failed:
LustreError: 19350:0:(cl_io.c:439:cl_io_iter_fini()) LBUG
Pid: 19350, comm: dd 4.20.0-rc6+ #1 SMP PREEMPT Sat Dec 15 11:22:06 EST 2018
Call Trace:
  libcfs_call_trace+0x8b/0xc0 [libcfs]
  lbug_with_loc+0x41/0x90 [libcfs]
  cl_io_iter_fini+0x10c/0x110 [obdclass]
  cl_io_loop+0x46/0x220 [obdclass]
  cl_setattr_ost+0x1ed/0x2a0 [lustre]
  ll_setattr_raw+0x797/0x980 [lustre]
  notify_change+0x1dc/0x430
  do_truncate+0x72/0xc0
  do_sys_ftruncate+0xf5/0x160
  do_syscall_64+0x68/0x38f

Bobi Jam (20):
  lustre: lov: move code for PFL work
  lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
  lustre: lov: fold lmm_verify() handling into lmm_unpackmd()
  lustre: lov: create struct lov_stripe_md_entry
  lustre: lov: add composite layout unpacking
  lustre: lov: embedded raid0 in struct lov_layout_composite
  lustre: lov: migrate lov raid0 to future PFL component handling
  lustre: lov: reduce code indentation
  lustre: lov: change lo_entries to array.
  lustre: lov: move around PFL code and cleanups
  lustre: lov: remove lsm_stripe_by_[index|offset]_plain
  lustre: lov: add looping lsm_entry_count times
  lustre: lov: create lov_comp_* wrappers
  lustre: clio: client side implementation for PFL
  lustre: pfl: dynamic layout modification with write/truncate
  lustre: pfl: calculate PFL file LOVEA correctly
  lustre: lov: keep minimum LOVEA size
  lustre: pfl: fix hang with grouplocks
  lustre: pfl: fix ost pool op->size handling
  lustre: llite: restore ll_file_getstripe in ll_lov_setstripe

Fan Yong (1):
  lustre: pfl: enhance PFID EA for PFL

Jinshan Xiong (3):
  lustre: pfl: Read should not trigger layout write intent
  lustre: lov: readahead shouldn't exceed component boundary
  lustre: lov: do not split IO for single striped file

Niu Yawei (4):
  lustre: pfl: Basic data structures for composite layout
  lustre: clio: getstripe support comp layout
  lustre: uapi: support negative flags
  lustre: llite: return v1/v3 layout for legacy app

 .../lustre/include/uapi/linux/lustre/lustre_idl.h  |  36 +-
 .../lustre/include/uapi/linux/lustre/lustre_user.h |  88 ++-
 drivers/staging/lustre/lustre/include/cl_object.h  |  12 +-
 drivers/staging/lustre/lustre/include/lustre_sec.h |   4 +-
 .../staging/lustre/lustre/include/lustre_swab.h    |   1 +
 drivers/staging/lustre/lustre/include/obd.h        |   4 -
 drivers/staging/lustre/lustre/llite/dir.c          |  38 +-
 drivers/staging/lustre/lustre/llite/file.c         | 185 +++--
 .../staging/lustre/lustre/llite/llite_internal.h   |   3 +
 drivers/staging/lustre/lustre/llite/vvp_io.c       |  44 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |  70 +-
 .../staging/lustre/lustre/lov/lov_cl_internal.h    | 191 ++---
 drivers/staging/lustre/lustre/lov/lov_ea.c         | 570 ++++++++++----
 drivers/staging/lustre/lustre/lov/lov_internal.h   | 175 +++--
 drivers/staging/lustre/lustre/lov/lov_io.c         | 651 +++++++++-------
 drivers/staging/lustre/lustre/lov/lov_lock.c       |  94 ++-
 drivers/staging/lustre/lustre/lov/lov_merge.c      |  12 +-
 drivers/staging/lustre/lustre/lov/lov_object.c     | 833 ++++++++++++---------
 drivers/staging/lustre/lustre/lov/lov_offset.c     |  65 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c       | 364 +++++----
 drivers/staging/lustre/lustre/lov/lov_page.c       |  42 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c       |  20 +-
 drivers/staging/lustre/lustre/lov/lovsub_object.c  |  23 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |  79 +-
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   5 +-
 drivers/staging/lustre/lustre/obdclass/genops.c    |  16 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |   4 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   6 +-
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |  84 ++-
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |   7 +-
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |   5 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 125 +++-
 32 files changed, 2483 insertions(+), 1373 deletions(-)

-- 
1.8.3.1


* [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 23:54   ` NeilBrown
  2018-12-17 16:29 ` [lustre-devel] [PATCH 02/28] lustre: lov: move code for PFL work James Simmons
                   ` (27 subsequent siblings)
  28 siblings, 1 reply; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

Added basic structures and magic numbers for composite layout.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24822
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/uapi/linux/lustre/lustre_idl.h  |  1 +
 .../lustre/include/uapi/linux/lustre/lustre_user.h | 54 ++++++++++++++++
 .../staging/lustre/lustre/include/lustre_swab.h    |  1 +
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    | 71 ++++++++++++++++++++++
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 67 ++++++++++++++++++++
 5 files changed, 194 insertions(+)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
index 26646f9..e47eb52 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
@@ -960,6 +960,7 @@ enum obdo_flags {
 /* reserved for specifying OSTs */
 #define LOV_MAGIC_SPECIFIC	(0x0BD50000 | LOV_MAGIC_MAGIC)
 #define LOV_MAGIC		LOV_MAGIC_V1
+#define LOV_MAGIC_COMP_V1	(0x0BD60000 | LOV_MAGIC_MAGIC)
 
 /*
  * magic for fully defined striping
diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
index 9d553ce6..3751b22 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
@@ -317,6 +317,7 @@ enum ll_lease_type {
 #define LOV_USER_MAGIC_V3	0x0BD30BD0
 /* 0x0BD40BD0 is occupied by LOV_MAGIC_MIGRATE */
 #define LOV_USER_MAGIC_SPECIFIC	0x0BD50BD0	/* for specific OSTs */
+#define LOV_USER_MAGIC_COMP_V1	0x0BD60BD0
 
 #define LMV_USER_MAGIC    0x0CD30CD0    /*default lmv magic*/
 
@@ -395,6 +396,59 @@ struct lov_user_md_v3 {	   /* LOV EA user data (host-endian) */
 	struct lov_user_ost_data_v1 lmm_objects[0]; /* per-stripe data */
 } __packed;
 
+struct lu_extent {
+	__u64	e_start;
+	__u64	e_end;
+};
+
+enum lov_comp_md_entry_flags {
+	LCME_FL_PRIMARY		= 0x00000001,   /* Not used */
+	LCME_FL_STALE		= 0x00000002,   /* Not used */
+	LCME_FL_OFFLINE		= 0x00000004,   /* Not used */
+	LCME_FL_PREFERRED	= 0x00000008,	/* Not used */
+	LCME_FL_INIT		= 0x00000010,	/* instantiated */
+};
+
+#define LCME_KNOWN_FLAGS	LCME_FL_INIT
+
+/* lcme_id can be specified as certain flags, and the first
+ * bit of lcme_id is used to indicate that the ID is representing
+ * certain LCME_FL_* but not a real ID. Which implies we can have
+ * at most 31 flags (see LCME_FL_XXX).
+ */
+enum lcme_id {
+	LCME_ID_INVAL	= 0x0,
+	LCME_ID_MAX	= 0x7FFFFFFF,
+	LCME_ID_ALL	= 0xFFFFFFFF,
+	LCME_ID_NONE	= 0x80000000
+};
+
+#define LCME_ID_MASK	LCME_ID_MAX
+
+struct lov_comp_md_entry_v1 {
+	__u32			lcme_id;	/* unique id of component */
+	__u32			lcme_flags;	/* LCME_FL_XXX */
+	struct lu_extent	lcme_extent;	/* file extent for component */
+	__u32			lcme_offset;	/* offset of component blob,
+						 * start from lov_comp_md_v1
+						 */
+	__u32			lcme_size;	/* size of component blob */
+	__u64			lcme_padding[2];
+} __packed;
+
+enum lov_comp_md_flags;
+
+struct lov_comp_md_v1 {
+	__u32	lcm_magic;	/* LOV_USER_MAGIC_COMP_V1 */
+	__u32	lcm_size;	/* overall size including this struct */
+	__u32	lcm_layout_gen;
+	__u16	lcm_flags;
+	__u16	lcm_entry_count;
+	__u64	lcm_padding1;
+	__u64	lcm_padding2;
+	struct lov_comp_md_entry_v1 lcm_entries[0];
+} __packed;
+
 static inline __u32 lov_user_md_size(__u16 stripes, __u32 lmm_magic)
 {
 	if (lmm_magic == LOV_USER_MAGIC_V1)
diff --git a/drivers/staging/lustre/lustre/include/lustre_swab.h b/drivers/staging/lustre/lustre/include/lustre_swab.h
index e09a3dc..6939ac1 100644
--- a/drivers/staging/lustre/lustre/include/lustre_swab.h
+++ b/drivers/staging/lustre/lustre/include/lustre_swab.h
@@ -83,6 +83,7 @@
 void lustre_swab_fiemap(struct fiemap *fiemap);
 void lustre_swab_lov_user_md_v1(struct lov_user_md_v1 *lum);
 void lustre_swab_lov_user_md_v3(struct lov_user_md_v3 *lum);
+void lustre_swab_lov_comp_md_v1(struct lov_comp_md_v1 *lum);
 void lustre_swab_lov_user_md_objects(struct lov_user_ost_data *lod,
 				     int stripe_count);
 void lustre_swab_lov_mds_md(struct lov_mds_md *lmm);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 951bb92..9c5be30 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1990,6 +1990,77 @@ void lustre_swab_lov_user_md_v3(struct lov_user_md_v3 *lum)
 }
 EXPORT_SYMBOL(lustre_swab_lov_user_md_v3);
 
+void lustre_swab_lov_comp_md_v1(struct lov_comp_md_v1 *lum)
+{
+	struct lov_comp_md_entry_v1 *ent;
+	bool cpu_endian;
+	u16 ent_count;
+	int i;
+
+	cpu_endian = lum->lcm_magic == LOV_USER_MAGIC_COMP_V1;
+	ent_count = lum->lcm_entry_count;
+	if (!cpu_endian)
+		__swab16s(&ent_count);
+
+	CDEBUG(D_IOCTL, "swabbing lov_user_comp_md v1\n");
+	__swab32s(&lum->lcm_magic);
+	__swab32s(&lum->lcm_size);
+	__swab32s(&lum->lcm_layout_gen);
+	__swab16s(&lum->lcm_flags);
+	__swab16s(&lum->lcm_entry_count);
+	BUILD_BUG_ON(offsetof(typeof(*lum), lcm_padding1) == 0);
+	BUILD_BUG_ON(offsetof(typeof(*lum), lcm_padding2) == 0);
+
+	for (i = 0; i < ent_count; i++) {
+		struct lov_user_md_v1 *v1;
+		u16 stripe_count;
+		u32 off, size;
+
+		ent = &lum->lcm_entries[i];
+		off = ent->lcme_offset;
+		size = ent->lcme_size;
+
+		if (!cpu_endian) {
+			__swab32s(&off);
+			__swab32s(&size);
+		}
+		__swab32s(&ent->lcme_id);
+		__swab32s(&ent->lcme_flags);
+		__swab64s(&ent->lcme_extent.e_start);
+		__swab64s(&ent->lcme_extent.e_end);
+		__swab32s(&ent->lcme_offset);
+		__swab32s(&ent->lcme_size);
+		BUILD_BUG_ON(offsetof(typeof(*ent), lcme_padding) == 0);
+
+		v1 = (struct lov_user_md_v1 *)((char *)lum + off);
+		stripe_count = v1->lmm_stripe_count;
+		if (!cpu_endian)
+			__swab16s(&stripe_count);
+
+		if (v1->lmm_magic == __swab32(LOV_USER_MAGIC_V1) ||
+		    v1->lmm_magic == LOV_USER_MAGIC_V1) {
+			lustre_swab_lov_user_md_v1(v1);
+			if (size > sizeof(*v1))
+				lustre_swab_lov_user_md_objects(v1->lmm_objects,
+								stripe_count);
+		} else if (v1->lmm_magic == __swab32(LOV_USER_MAGIC_V3) ||
+			   v1->lmm_magic == LOV_USER_MAGIC_V3 ||
+			   v1->lmm_magic == __swab32(LOV_USER_MAGIC_SPECIFIC) ||
+			   v1->lmm_magic == LOV_USER_MAGIC_SPECIFIC) {
+			struct lov_user_md_v3 *v3;
+
+			v3 = (struct lov_user_md_v3 *)v1;
+			lustre_swab_lov_user_md_v3(v3);
+			if (size > sizeof(*v3))
+				lustre_swab_lov_user_md_objects(v3->lmm_objects,
+								stripe_count);
+		} else {
+			CERROR("Invalid magic %#x\n", v1->lmm_magic);
+		}
+	}
+}
+EXPORT_SYMBOL(lustre_swab_lov_comp_md_v1);
+
 void lustre_swab_lov_mds_md(struct lov_mds_md *lmm)
 {
 	CDEBUG(D_IOCTL, "swabbing lov_mds_md\n");
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 3aaaebb..90e6b8c 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -1450,6 +1450,73 @@ void lustre_assert_wire_constants(void)
 	LASSERTF(LOV_PATTERN_CMOBD == 0x00000200UL, "found 0x%.8xUL\n",
 		 (unsigned int)LOV_PATTERN_CMOBD);
 
+	/* Checks for struct lov_comp_md_entry_v1 */
+	LASSERTF((int)sizeof(struct lov_comp_md_entry_v1) == 48, "found %lld\n",
+		 (long long)(int)sizeof(struct lov_comp_md_entry_v1));
+	LASSERTF((int)offsetof(struct lov_comp_md_entry_v1, lcme_id) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_entry_v1, lcme_id));
+	LASSERTF((int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_id) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_id));
+	LASSERTF((int)offsetof(struct lov_comp_md_entry_v1, lcme_flags) == 4, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_entry_v1, lcme_flags));
+	LASSERTF((int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_flags) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_flags));
+	LASSERTF((int)offsetof(struct lov_comp_md_entry_v1, lcme_extent) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_entry_v1, lcme_extent));
+	LASSERTF((int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_extent) == 16, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_extent));
+	LASSERTF((int)offsetof(struct lov_comp_md_entry_v1, lcme_offset) == 24, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_entry_v1, lcme_offset));
+	LASSERTF((int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_offset) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_offset));
+	LASSERTF((int)offsetof(struct lov_comp_md_entry_v1, lcme_size) == 28, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_entry_v1, lcme_size));
+	LASSERTF((int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_size) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_size));
+	LASSERTF((int)offsetof(struct lov_comp_md_entry_v1, lcme_padding) == 32, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_entry_v1, lcme_padding));
+	LASSERTF((int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_padding) == 16, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_entry_v1 *)0)->lcme_padding));
+	LASSERTF(LCME_FL_INIT == 0x00000010UL, "found 0x%.8xUL\n",
+	         (unsigned int)LCME_FL_INIT);
+
+	/* Checks for struct lov_comp_md_v1 */
+	LASSERTF((int)sizeof(struct lov_comp_md_v1) == 32, "found %lld\n",
+		 (long long)(int)sizeof(struct lov_comp_md_v1));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_magic) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_magic));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_magic) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_magic));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_size) == 4, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_size));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_size) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_size));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_layout_gen) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_layout_gen));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_layout_gen) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_layout_gen));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_flags) == 12, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_flags));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_flags) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_flags));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_entry_count) == 14, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_entry_count));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_entry_count) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_entry_count));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_padding1) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_padding1));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_padding1) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_padding1));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_padding2) == 24, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_padding2));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_padding2) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_padding2));
+	LASSERTF((int)offsetof(struct lov_comp_md_v1, lcm_entries[0]) == 32, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_comp_md_v1, lcm_entries[0]));
+	LASSERTF((int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_entries[0]) == 48, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_comp_md_v1 *)0)->lcm_entries[0]));
+	BUILD_BUG_ON(LOV_MAGIC_COMP_V1 != (0x0BD60000 | 0x0BD0));
+
 	/* Checks for struct lmv_mds_md_v1 */
 	LASSERTF((int)sizeof(struct lmv_mds_md_v1) == 56, "found %lld\n",
 		 (long long)(int)sizeof(struct lmv_mds_md_v1));
-- 
1.8.3.1


* [lustre-devel] [PATCH 02/28] lustre: lov: move code for PFL work
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-18  0:00   ` NeilBrown
  2018-12-17 16:29 ` [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling James Simmons
                   ` (26 subsequent siblings)
  28 siblings, 1 reply; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Move lov_tgt_maxbytes() and lsm_free_plain() toward the top of
lov_ea.c for upcoming PFL work.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24849
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_ea.c       | 87 ++++++++++++++----------
 drivers/staging/lustre/lustre/lov/lov_internal.h | 16 +----
 2 files changed, 51 insertions(+), 52 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index c80320a..6931ffd 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -44,6 +44,33 @@
 
 #include "lov_internal.h"
 
+/*
+ * Find minimum stripe maxbytes value.  For inactive or
+ * reconnecting targets use LUSTRE_EXT3_STRIPE_MAXBYTES.
+ */
+static loff_t lov_tgt_maxbytes(struct lov_tgt_desc *tgt)
+{
+	loff_t maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
+	struct obd_import *imp;
+
+	if (!tgt->ltd_active)
+		return maxbytes;
+
+	imp = tgt->ltd_obd->u.cli.cl_import;
+	if (!imp)
+		return maxbytes;
+
+	spin_lock(&imp->imp_lock);
+	if (imp->imp_state == LUSTRE_IMP_FULL &&
+	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES) &&
+	     imp->imp_connect_data.ocd_maxbytes > 0)
+		maxbytes = imp->imp_connect_data.ocd_maxbytes;
+
+	spin_unlock(&imp->imp_lock);
+
+	return maxbytes;
+}
+
 static int lsm_lmm_verify_common(struct lov_mds_md *lmm, int lmm_bytes,
 				 __u16 stripe_count)
 {
@@ -76,6 +103,16 @@ static int lsm_lmm_verify_common(struct lov_mds_md *lmm, int lmm_bytes,
 	return 0;
 }
 
+void lsm_free_plain(struct lov_stripe_md *lsm)
+{
+	__u16 stripe_count = lsm->lsm_stripe_count;
+	int i;
+
+	for (i = 0; i < stripe_count; i++)
+		kmem_cache_free(lov_oinfo_slab, lsm->lsm_oinfo[i]);
+	kvfree(lsm);
+}
+
 struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count)
 {
 	size_t oinfo_ptrs_size, lsm_size;
@@ -108,43 +145,6 @@ struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count)
 	return NULL;
 }
 
-void lsm_free_plain(struct lov_stripe_md *lsm)
-{
-	__u16 stripe_count = lsm->lsm_stripe_count;
-	int i;
-
-	for (i = 0; i < stripe_count; i++)
-		kmem_cache_free(lov_oinfo_slab, lsm->lsm_oinfo[i]);
-	kvfree(lsm);
-}
-
-/*
- * Find minimum stripe maxbytes value.  For inactive or
- * reconnecting targets use LUSTRE_EXT3_STRIPE_MAXBYTES.
- */
-static loff_t lov_tgt_maxbytes(struct lov_tgt_desc *tgt)
-{
-	loff_t maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
-	struct obd_import *imp;
-
-	if (!tgt->ltd_active)
-		return maxbytes;
-
-	imp = tgt->ltd_obd->u.cli.cl_import;
-	if (!imp)
-		return maxbytes;
-
-	spin_lock(&imp->imp_lock);
-	if (imp->imp_state == LUSTRE_IMP_FULL &&
-	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES) &&
-	     imp->imp_connect_data.ocd_maxbytes > 0)
-		maxbytes = imp->imp_connect_data.ocd_maxbytes;
-
-	spin_unlock(&imp->imp_lock);
-
-	return maxbytes;
-}
-
 static int lsm_unpackmd_common(struct lov_obd *lov,
 			       struct lov_stripe_md *lsm,
 			       struct lov_mds_md *lmm,
@@ -320,6 +320,19 @@ static int lsm_unpackmd_v3(struct lov_obd *lov, struct lov_stripe_md *lsm,
 	.lsm_unpackmd	   = lsm_unpackmd_v3,
 };
 
+const struct lsm_operations *lsm_op_find(int magic)
+{
+	switch (magic) {
+	case LOV_MAGIC_V1:
+		return &lsm_v1_ops;
+	case LOV_MAGIC_V3:
+		return &lsm_v3_ops;
+	default:
+		CERROR("unrecognized lsm_magic %08x\n", magic);
+		return NULL;
+	}
+}
+
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
 {
 	CDEBUG(level, "lsm %p, objid " DOSTID ", maxbytes %#llx, magic 0x%08X, stripe_size %u, stripe_count %u, refc: %d, layout_gen %u, pool [" LOV_POOLNAMEF "]\n",
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 44a997e..51f416e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -92,21 +92,7 @@ struct lsm_operations {
 			    struct lov_mds_md *lmm);
 };
 
-extern const struct lsm_operations lsm_v1_ops;
-extern const struct lsm_operations lsm_v3_ops;
-
-static inline const struct lsm_operations *lsm_op_find(int magic)
-{
-	switch (magic) {
-	case LOV_MAGIC_V1:
-		return &lsm_v1_ops;
-	case LOV_MAGIC_V3:
-		return &lsm_v3_ops;
-	default:
-		CERROR("unrecognized lsm_magic %08x\n", magic);
-		return NULL;
-	}
-}
+const struct lsm_operations *lsm_op_find(int magic);
 
 /* lov_do_div64(a, b) returns a % b, and a = a / b.
  * The 32-bit code is LOV-specific due to knowing about stripe limits in
-- 
1.8.3.1


* [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 02/28] lustre: lov: move code for PFL work James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-18  0:09   ` NeilBrown
  2018-12-17 16:29 ` [lustre-devel] [PATCH 04/28] lustre: lov: fold lmm_verify() handling into lmm_unpackmd() James Simmons
                   ` (25 subsequent siblings)
  28 siblings, 1 reply; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Several of the struct lsm_operations functions for both v1 and v3
are nearly identical. Let's merge them together.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24849
WC-bug-id: https://jira.whamcloud.com/browse/LU-9315
Reviewed-on: https://review.whamcloud.com/26503
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_ea.c       | 66 ++++++++++++------------
 drivers/staging/lustre/lustre/lov/lov_internal.h |  3 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c     | 30 ++++-------
 3 files changed, 42 insertions(+), 57 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 6931ffd..5d9e619 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -71,8 +71,8 @@ static loff_t lov_tgt_maxbytes(struct lov_tgt_desc *tgt)
 	return maxbytes;
 }
 
-static int lsm_lmm_verify_common(struct lov_mds_md *lmm, int lmm_bytes,
-				 __u16 stripe_count)
+static int lsm_lmm_verify_v1v3(struct lov_mds_md *lmm, size_t lmm_size,
+			       u16 stripe_count)
 {
 	if (stripe_count > LOV_V1_INSANE_STRIPE_COUNT) {
 		CERROR("bad stripe count %d\n", stripe_count);
@@ -103,7 +103,7 @@ static int lsm_lmm_verify_common(struct lov_mds_md *lmm, int lmm_bytes,
 	return 0;
 }
 
-void lsm_free_plain(struct lov_stripe_md *lsm)
+void lsm_free(struct lov_stripe_md *lsm)
 {
 	__u16 stripe_count = lsm->lsm_stripe_count;
 	int i;
@@ -145,10 +145,11 @@ struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count)
 	return NULL;
 }
 
-static int lsm_unpackmd_common(struct lov_obd *lov,
-			       struct lov_stripe_md *lsm,
-			       struct lov_mds_md *lmm,
-			       struct lov_ost_data_v1 *objects)
+static int lsm_unpackmd_v1v3(struct lov_obd *lov,
+			     struct lov_stripe_md *lsm,
+			     struct lov_mds_md *lmm,
+			     const char *pool_name,
+			     struct lov_ost_data_v1 *objects)
 {
 	loff_t min_stripe_maxbytes = 0;
 	unsigned int stripe_count;
@@ -168,6 +169,15 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
 
 	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
 
+	if (pool_name) {
+		size_t pool_name_len;
+
+		pool_name_len = strlcpy(lsm->lsm_pool_name, pool_name,
+					sizeof(lsm->lsm_pool_name));
+		if (pool_name_len >= sizeof(lsm->lsm_pool_name))
+			return -E2BIG;
+	}
+
 	for (i = 0; i < stripe_count; i++) {
 		loi = lsm->lsm_oinfo[i];
 		ostid_le_to_cpu(&objects[i].l_ost_oi, &loi->loi_oi);
@@ -248,21 +258,20 @@ static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
 		return -EINVAL;
 	}
 
-	return lsm_lmm_verify_common(lmm, lmm_bytes, *stripe_count);
+	return lsm_lmm_verify_v1v3(lmm, lmm_bytes, *stripe_count);
 }
 
 static int lsm_unpackmd_v1(struct lov_obd *lov, struct lov_stripe_md *lsm,
 			   struct lov_mds_md_v1 *lmm)
 {
-	return lsm_unpackmd_common(lov, lsm, lmm, lmm->lmm_objects);
+	return lsm_unpackmd_v1v3(lov, lsm, lmm, NULL, lmm->lmm_objects);
 }
 
-const struct lsm_operations lsm_v1_ops = {
-	.lsm_free	    = lsm_free_plain,
-	.lsm_stripe_by_index    = lsm_stripe_by_index_plain,
-	.lsm_stripe_by_offset   = lsm_stripe_by_offset_plain,
-	.lsm_lmm_verify	 = lsm_lmm_verify_v1,
-	.lsm_unpackmd	   = lsm_unpackmd_v1,
+static const struct lsm_operations lsm_v1_ops = {
+	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
+	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
+	.lsm_lmm_verify		= lsm_lmm_verify_v1,
+	.lsm_unpackmd		= lsm_unpackmd_v1,
 };
 
 static int lsm_lmm_verify_v3(struct lov_mds_md *lmmv1, int lmm_bytes,
@@ -289,7 +298,7 @@ static int lsm_lmm_verify_v3(struct lov_mds_md *lmmv1, int lmm_bytes,
 		return -EINVAL;
 	}
 
-	return lsm_lmm_verify_common((struct lov_mds_md_v1 *)lmm, lmm_bytes,
+	return lsm_lmm_verify_v1v3((struct lov_mds_md_v1 *)lmm, lmm_bytes,
 				     *stripe_count);
 }
 
@@ -297,27 +306,16 @@ static int lsm_unpackmd_v3(struct lov_obd *lov, struct lov_stripe_md *lsm,
 			   struct lov_mds_md *lmm)
 {
 	struct lov_mds_md_v3 *lmm_v3 = (struct lov_mds_md_v3 *)lmm;
-	size_t cplen = 0;
-	int rc;
 
-	rc = lsm_unpackmd_common(lov, lsm, lmm, lmm_v3->lmm_objects);
-	if (rc)
-		return rc;
-
-	cplen = strlcpy(lsm->lsm_pool_name, lmm_v3->lmm_pool_name,
-			sizeof(lsm->lsm_pool_name));
-	if (cplen >= sizeof(lsm->lsm_pool_name))
-		return -E2BIG;
-
-	return 0;
+	return lsm_unpackmd_v1v3(lov, lsm, lmm, lmm_v3->lmm_pool_name,
+				 lmm_v3->lmm_objects);
 }
 
-const struct lsm_operations lsm_v3_ops = {
-	.lsm_free	    = lsm_free_plain,
-	.lsm_stripe_by_index    = lsm_stripe_by_index_plain,
-	.lsm_stripe_by_offset   = lsm_stripe_by_offset_plain,
-	.lsm_lmm_verify	 = lsm_lmm_verify_v3,
-	.lsm_unpackmd	   = lsm_unpackmd_v3,
+static const struct lsm_operations lsm_v3_ops = {
+	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
+	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
+	.lsm_lmm_verify		= lsm_lmm_verify_v3,
+	.lsm_unpackmd		= lsm_unpackmd_v3,
 };
 
 const struct lsm_operations *lsm_op_find(int magic)
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 51f416e..2c416b4 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -81,7 +81,6 @@ static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
 }
 
 struct lsm_operations {
-	void (*lsm_free)(struct lov_stripe_md *);
 	void (*lsm_stripe_by_index)(struct lov_stripe_md *, int *, loff_t *,
 				    loff_t *);
 	void (*lsm_stripe_by_offset)(struct lov_stripe_md *, int *, loff_t *,
@@ -93,6 +92,7 @@ struct lsm_operations {
 };
 
 const struct lsm_operations *lsm_op_find(int magic);
+void lsm_free(struct lov_stripe_md *lsm);
 
 /* lov_do_div64(a, b) returns a % b, and a = a / b.
  * The 32-bit code is LOV-specific due to knowing about stripe limits in
@@ -224,7 +224,6 @@ struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, struct lov_mds_md *lmm,
 
 /* lov_ea.c */
 struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count);
-void lsm_free_plain(struct lov_stripe_md *lsm);
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm);
 
 /* lproc_lov.c */
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 98b114b..02936bf 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -181,22 +181,6 @@ __u16 lov_get_stripecnt(struct lov_obd *lov, __u32 magic, __u16 stripe_count)
 	return stripe_count;
 }
 
-static int lov_verify_lmm(void *lmm, int lmm_bytes, __u16 *stripe_count)
-{
-	int rc;
-
-	if (!lsm_op_find(le32_to_cpu(*(__u32 *)lmm))) {
-		CERROR("bad disk LOV MAGIC: 0x%08X; dumping LMM (size=%d):\n",
-		       le32_to_cpu(*(__u32 *)lmm), lmm_bytes);
-		CERROR("%*phN\n", lmm_bytes, lmm);
-		return -EINVAL;
-	}
-	rc = lsm_op_find(le32_to_cpu(*(__u32 *)lmm))->lsm_lmm_verify(lmm,
-								     lmm_bytes,
-								  stripe_count);
-	return rc;
-}
-
 static struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern,
 					   u32 magic)
 {
@@ -237,7 +221,7 @@ int lov_free_memmd(struct lov_stripe_md **lsmp)
 	LASSERT(atomic_read(&lsm->lsm_refc) > 0);
 	refc = atomic_dec_return(&lsm->lsm_refc);
 	if (refc == 0)
-		lsm_op_find(lsm->lsm_magic)->lsm_free(lsm);
+		lsm_free(lsm);
 
 	return refc;
 }
@@ -248,25 +232,29 @@ int lov_free_memmd(struct lov_stripe_md **lsmp)
 struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, struct lov_mds_md *lmm,
 				   size_t lmm_size)
 {
+	const struct lsm_operations *op;
 	struct lov_stripe_md *lsm;
 	u16 stripe_count;
 	u32 pattern;
 	u32 magic;
 	int rc;
 
-	rc = lov_verify_lmm(lmm, lmm_size, &stripe_count);
+	magic = le32_to_cpu(lmm->lmm_magic);
+	op = lsm_op_find(magic);
+	if (!op)
+		return ERR_PTR(-EINVAL);
+
+	rc = op->lsm_lmm_verify(lmm, lmm_size, &stripe_count);
 	if (rc)
 		return ERR_PTR(rc);
 
-	magic = le32_to_cpu(lmm->lmm_magic);
 	pattern = le32_to_cpu(lmm->lmm_pattern);
 
 	lsm = lov_lsm_alloc(stripe_count, pattern, magic);
 	if (IS_ERR(lsm))
 		return lsm;
 
-	LASSERT(lsm_op_find(magic));
-	rc = lsm_op_find(magic)->lsm_unpackmd(lov, lsm, lmm);
+	rc = op->lsm_unpackmd(lov, lsm, lmm);
 	if (rc) {
 		lov_free_memmd(&lsm);
 		return ERR_PTR(rc);
-- 
1.8.3.1


* [lustre-devel] [PATCH 04/28] lustre: lov: fold lmm_verify() handling into lmm_unpackmd()
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (2 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 05/28] lustre: lov: create struct lov_stripe_md_entry James Simmons
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

The function lov_unpackmd() calls the format specific version of
lmm_verify() and uses the returned information to allocate the
correct amount of memory for the lsm. We can fold the
lmm_verify() handling into the format specific unpackmd()
function. This also enables us to integrate the lsm allocation
into the unpackmd() function, which greatly simplifies
lov_unpackmd().
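
As a rough userspace illustration of the resulting shape (not the
Lustre code itself), the sketch below dispatches on a magic value and
lets one per-format callback verify the buffer, allocate the result
and fill it in. Every demo_* name, the magic values, the header layout
and the per-stripe sizes are hypothetical, and byte order conversion
is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical magic values, for illustration only. */
#define DEMO_MAGIC_A 0x0A0A0A0Au
#define DEMO_MAGIC_B 0x0B0B0B0Bu

/* Hypothetical packed header; host byte order is assumed. */
struct demo_disk_hdr {
	uint32_t magic;
	uint16_t stripe_count;
	uint16_t pad;
};

/* In-memory result returned by the unpack callbacks. */
struct demo_md {
	uint32_t magic;
	uint16_t stripe_count;
};

/* One callback per format: verify, allocate and unpack in one step. */
struct demo_ops {
	struct demo_md *(*unpack)(const void *buf, size_t buf_size);
};

static struct demo_md *demo_unpack_common(const void *buf, size_t buf_size,
					  size_t per_stripe)
{
	struct demo_disk_hdr hdr;
	struct demo_md *md;

	/* Verify: the buffer must hold the fixed header ... */
	if (buf_size < sizeof(hdr))
		return NULL;
	memcpy(&hdr, buf, sizeof(hdr));

	/* ... and one record per stripe after it. */
	if (buf_size < sizeof(hdr) + (size_t)hdr.stripe_count * per_stripe)
		return NULL;

	/* Allocate and fill only after verification succeeded. */
	md = calloc(1, sizeof(*md));
	if (!md)
		return NULL;
	md->magic = hdr.magic;
	md->stripe_count = hdr.stripe_count;
	return md;
}

static struct demo_md *demo_unpack_a(const void *buf, size_t buf_size)
{
	return demo_unpack_common(buf, buf_size, 8);
}

static struct demo_md *demo_unpack_b(const void *buf, size_t buf_size)
{
	return demo_unpack_common(buf, buf_size, 16);
}

static const struct demo_ops demo_a_ops = { .unpack = demo_unpack_a };
static const struct demo_ops demo_b_ops = { .unpack = demo_unpack_b };

static const struct demo_ops *demo_op_find(uint32_t magic)
{
	switch (magic) {
	case DEMO_MAGIC_A:
		return &demo_a_ops;
	case DEMO_MAGIC_B:
		return &demo_b_ops;
	default:
		return NULL;
	}
}

/* The generic entry point only reads the magic and dispatches. */
struct demo_md *demo_unpackmd(const void *buf, size_t buf_size)
{
	const struct demo_ops *op;
	uint32_t magic;

	assert(buf != NULL);
	if (buf_size < sizeof(magic))
		return NULL;
	memcpy(&magic, buf, sizeof(magic));

	op = demo_op_find(magic);
	if (!op)
		return NULL;

	return op->unpack(buf, buf_size);
}
```

A caller passes the raw buffer and its size to demo_unpackmd() and
frees the returned object with free(). The sketch returns NULL on
every failure, whereas the patch below distinguishes failures with
ERR_PTR() codes.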

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24849
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_ea.c       | 113 ++++++++++++++++++-----
 drivers/staging/lustre/lustre/lov/lov_internal.h |  11 +--
 drivers/staging/lustre/lustre/lov/lov_pack.c     |  59 +-----------
 3 files changed, 99 insertions(+), 84 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 5d9e619..3e1b6a8 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -145,6 +145,37 @@ struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count)
 	return NULL;
 }
 
+static struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern,
+					   u32 magic)
+{
+	struct lov_stripe_md *lsm;
+	unsigned int i;
+
+	CDEBUG(D_INFO, "alloc lsm, stripe_count %u\n", stripe_count);
+
+	lsm = lsm_alloc_plain(stripe_count);
+	if (!lsm) {
+		CERROR("cannot allocate LSM stripe_count %u\n", stripe_count);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	atomic_set(&lsm->lsm_refc, 1);
+	spin_lock_init(&lsm->lsm_lock);
+	lsm->lsm_magic = magic;
+	lsm->lsm_stripe_count = stripe_count;
+	lsm->lsm_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES * stripe_count;
+	lsm->lsm_pattern = pattern;
+	lsm->lsm_pool_name[0] = '\0';
+	lsm->lsm_layout_gen = 0;
+	if (stripe_count > 0)
+		lsm->lsm_oinfo[0]->loi_ost_idx = ~0;
+
+	for (i = 0; i < stripe_count; i++)
+		loi_init(lsm->lsm_oinfo[i]);
+
+	return lsm;
+}
+
 static int lsm_unpackmd_v1v3(struct lov_obd *lov,
 			     struct lov_stripe_md *lsm,
 			     struct lov_mds_md *lmm,
@@ -238,12 +269,12 @@ static int lsm_unpackmd_v1v3(struct lov_obd *lov,
 		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
 }
 
-static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
+static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, size_t lmm_bytes,
 			     __u16 *stripe_count)
 {
 	if (lmm_bytes < sizeof(*lmm)) {
-		CERROR("lov_mds_md_v1 too small: %d, need at least %d\n",
-		       lmm_bytes, (int)sizeof(*lmm));
+		CERROR("lov_mds_md_v1 too small: %zu, need at least %zu\n",
+		       lmm_bytes, sizeof(*lmm));
 		return -EINVAL;
 	}
 
@@ -252,7 +283,7 @@ static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
 		*stripe_count = 0;
 
 	if (lmm_bytes < lov_mds_md_size(*stripe_count, LOV_MAGIC_V1)) {
-		CERROR("LOV EA V1 too small: %d, need %d\n",
+		CERROR("LOV EA V1 too small: %zu, need %d\n",
 		       lmm_bytes, lov_mds_md_size(*stripe_count, LOV_MAGIC_V1));
 		lov_dump_lmm_common(D_WARNING, lmm);
 		return -EINVAL;
@@ -261,29 +292,47 @@ static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
 	return lsm_lmm_verify_v1v3(lmm, lmm_bytes, *stripe_count);
 }
 
-static int lsm_unpackmd_v1(struct lov_obd *lov, struct lov_stripe_md *lsm,
-			   struct lov_mds_md_v1 *lmm)
+static struct lov_stripe_md *
+lsm_unpackmd_v1(struct lov_obd *lov, void *buf, size_t buf_size)
 {
-	return lsm_unpackmd_v1v3(lov, lsm, lmm, NULL, lmm->lmm_objects);
+	struct lov_mds_md_v1 *lmm = buf;
+	u32 magic = le32_to_cpu(lmm->lmm_magic);
+	struct lov_stripe_md *lsm;
+	u16 stripe_count;
+	u32 pattern;
+	int rc;
+
+	rc = lsm_lmm_verify_v1(lmm, buf_size, &stripe_count);
+	if (rc)
+		return ERR_PTR(rc);
+
+	pattern = le32_to_cpu(lmm->lmm_pattern);
+
+	lsm = lov_lsm_alloc(stripe_count, pattern, magic);
+	if (IS_ERR(lsm))
+		return lsm;
+
+	rc = lsm_unpackmd_v1v3(lov, lsm, lmm, NULL, lmm->lmm_objects);
+	if (rc) {
+		lov_free_memmd(&lsm);
+		lsm = ERR_PTR(rc);
+	}
+
+	return lsm;
 }
 
 const static struct lsm_operations lsm_v1_ops = {
 	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
 	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
-	.lsm_lmm_verify		= lsm_lmm_verify_v1,
 	.lsm_unpackmd		= lsm_unpackmd_v1,
 };
 
-static int lsm_lmm_verify_v3(struct lov_mds_md *lmmv1, int lmm_bytes,
+static int lsm_lmm_verify_v3(struct lov_mds_md_v3 *lmm, size_t lmm_bytes,
 			     __u16 *stripe_count)
 {
-	struct lov_mds_md_v3 *lmm;
-
-	lmm = (struct lov_mds_md_v3 *)lmmv1;
-
 	if (lmm_bytes < sizeof(*lmm)) {
-		CERROR("lov_mds_md_v3 too small: %d, need at least %d\n",
-		       lmm_bytes, (int)sizeof(*lmm));
+		CERROR("lov_mds_md_v3 too small: %zu, need at least %zu\n",
+		       lmm_bytes, sizeof(*lmm));
 		return -EINVAL;
 	}
 
@@ -292,7 +341,7 @@ static int lsm_lmm_verify_v3(struct lov_mds_md *lmmv1, int lmm_bytes,
 		*stripe_count = 0;
 
 	if (lmm_bytes < lov_mds_md_size(*stripe_count, LOV_MAGIC_V3)) {
-		CERROR("LOV EA V3 too small: %d, need %d\n",
+		CERROR("LOV EA V3 too small: %zu, need %d\n",
 		       lmm_bytes, lov_mds_md_size(*stripe_count, LOV_MAGIC_V3));
 		lov_dump_lmm_common(D_WARNING, lmm);
 		return -EINVAL;
@@ -302,19 +351,39 @@ static int lsm_lmm_verify_v3(struct lov_mds_md *lmmv1, int lmm_bytes,
 				     *stripe_count);
 }
 
-static int lsm_unpackmd_v3(struct lov_obd *lov, struct lov_stripe_md *lsm,
-			   struct lov_mds_md *lmm)
+static struct lov_stripe_md *
+lsm_unpackmd_v3(struct lov_obd *lov, void *buf, size_t buf_size)
 {
-	struct lov_mds_md_v3 *lmm_v3 = (struct lov_mds_md_v3 *)lmm;
+	struct lov_mds_md_v3 *lmm = buf;
+	u32 magic = le32_to_cpu(lmm->lmm_magic);
+	struct lov_stripe_md *lsm;
+	u16 stripe_count;
+	u32 pattern;
+	int rc;
+
+	rc = lsm_lmm_verify_v3(lmm, buf_size, &stripe_count);
+	if (rc)
+		return ERR_PTR(rc);
+
+	pattern = le32_to_cpu(lmm->lmm_pattern);
 
-	return lsm_unpackmd_v1v3(lov, lsm, lmm, lmm_v3->lmm_pool_name,
-				 lmm_v3->lmm_objects);
+	lsm = lov_lsm_alloc(stripe_count, pattern, magic);
+	if (IS_ERR(lsm))
+		return lsm;
+
+	rc = lsm_unpackmd_v1v3(lov, lsm, (struct lov_mds_md_v1 *)lmm,
+			       lmm->lmm_pool_name, lmm->lmm_objects);
+	if (rc) {
+		lov_free_memmd(&lsm);
+		lsm = ERR_PTR(rc);
+	}
+
+	return lsm;
 }
 
 const static struct lsm_operations lsm_v3_ops = {
 	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
 	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
-	.lsm_lmm_verify		= lsm_lmm_verify_v3,
 	.lsm_unpackmd		= lsm_unpackmd_v3,
 };
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 2c416b4..ae122f6 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -85,10 +85,8 @@ struct lsm_operations {
 				    loff_t *);
 	void (*lsm_stripe_by_offset)(struct lov_stripe_md *, int *, loff_t *,
 				     loff_t *);
-	int (*lsm_lmm_verify)(struct lov_mds_md *lmm, int lmm_bytes,
-			      u16 *stripe_count);
-	int (*lsm_unpackmd)(struct lov_obd *lov, struct lov_stripe_md *lsm,
-			    struct lov_mds_md *lmm);
+	struct lov_stripe_md *(*lsm_unpackmd)(struct lov_obd *obd, void *buf,
+					      size_t buf_len);
 };
 
 const struct lsm_operations *lsm_op_find(int magic);
@@ -214,8 +212,8 @@ int lov_del_target(struct obd_device *obd, __u32 index,
 /* lov_pack.c */
 ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 		     size_t buf_size);
-struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, struct lov_mds_md *lmm,
-				   size_t lmm_size);
+struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, void *buf,
+				   size_t buf_size);
 int lov_free_memmd(struct lov_stripe_md **lsmp);
 
 void lov_dump_lmm_v1(int level, struct lov_mds_md_v1 *lmm);
@@ -223,7 +221,6 @@ struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, struct lov_mds_md *lmm,
 void lov_dump_lmm_common(int level, void *lmmp);
 
 /* lov_ea.c */
-struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count);
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm);
 
 /* lproc_lov.c */
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 02936bf..90f9f2d 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -181,37 +181,6 @@ __u16 lov_get_stripecnt(struct lov_obd *lov, __u32 magic, __u16 stripe_count)
 	return stripe_count;
 }
 
-static struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern,
-					   u32 magic)
-{
-	struct lov_stripe_md *lsm;
-	unsigned int i;
-
-	CDEBUG(D_INFO, "alloc lsm, stripe_count %u\n", stripe_count);
-
-	lsm = lsm_alloc_plain(stripe_count);
-	if (!lsm) {
-		CERROR("cannot allocate LSM stripe_count %u\n", stripe_count);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	atomic_set(&lsm->lsm_refc, 1);
-	spin_lock_init(&lsm->lsm_lock);
-	lsm->lsm_magic = magic;
-	lsm->lsm_stripe_count = stripe_count;
-	lsm->lsm_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES * stripe_count;
-	lsm->lsm_pattern = pattern;
-	lsm->lsm_pool_name[0] = '\0';
-	lsm->lsm_layout_gen = 0;
-	if (stripe_count > 0)
-		lsm->lsm_oinfo[0]->loi_ost_idx = ~0;
-
-	for (i = 0; i < stripe_count; i++)
-		loi_init(lsm->lsm_oinfo[i]);
-
-	return lsm;
-}
-
 int lov_free_memmd(struct lov_stripe_md **lsmp)
 {
 	struct lov_stripe_md *lsm = *lsmp;
@@ -229,38 +198,18 @@ int lov_free_memmd(struct lov_stripe_md **lsmp)
 /* Unpack LOV object metadata from disk storage.  It is packed in LE byte
  * order and is opaque to the networking layer.
  */
-struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, struct lov_mds_md *lmm,
-				   size_t lmm_size)
+struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, void *buf,
+				   size_t buf_size)
 {
 	const struct lsm_operations *op;
-	struct lov_stripe_md *lsm;
-	u16 stripe_count;
-	u32 pattern;
 	u32 magic;
-	int rc;
 
-	magic = le32_to_cpu(lmm->lmm_magic);
+	magic = le32_to_cpu(*(u32 *)buf);
 	op = lsm_op_find(magic);
 	if (!op)
 		return ERR_PTR(-EINVAL);
 
-	rc = op->lsm_lmm_verify(lmm, lmm_size, &stripe_count);
-	if (rc)
-		return ERR_PTR(rc);
-
-	pattern = le32_to_cpu(lmm->lmm_pattern);
-
-	lsm = lov_lsm_alloc(stripe_count, pattern, magic);
-	if (IS_ERR(lsm))
-		return lsm;
-
-	rc = op->lsm_unpackmd(lov, lsm, lmm);
-	if (rc) {
-		lov_free_memmd(&lsm);
-		return ERR_PTR(rc);
-	}
-
-	return lsm;
+	return op->lsm_unpackmd(lov, buf, buf_size);
 }
 
 /* Retrieve object striping information.
-- 
1.8.3.1


* [lustre-devel] [PATCH 05/28] lustre: lov: create struct lov_stripe_md_entry
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (3 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 04/28] lustre: lov: fold lmm_verify() handling into lmm_unpackmd() James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 06/28] lustre: lov: add composite layout unpacking James Simmons
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

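Create a new struct lov_stripe_md_entry that is shared by the
older striping formats and the new PFL handling. The per-layout
fields (pattern, stripe size, stripe count, pool name and the
lov_oinfo array) move from struct lov_stripe_md into the entry, and
a plain V1/V3 layout becomes a single entry covering the extent
[0, LUSTRE_EOF]. Rearrange the code to use this new data structure.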
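
A minimal userspace sketch of that two-level layout follows, assuming
simplified stand-in types: a top-level object holds a flexible array
of entry pointers, and each entry holds a flexible array of per-stripe
object pointers. All demo_* names are hypothetical, and calloc()/free()
stand in for the kernel allocators used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the per-stripe object information. */
struct demo_oinfo {
	unsigned int ost_idx;
};

/* One layout entry: striping parameters plus its stripe objects. */
struct demo_entry {
	unsigned int stripe_size;
	unsigned int stripe_count;
	struct demo_oinfo *oinfo[];	/* stripe_count pointers */
};

/* Top-level layout: a list of entries. */
struct demo_layout {
	unsigned int entry_count;
	struct demo_entry *entries[];	/* entry_count pointers */
};

static void demo_entry_free(struct demo_entry *e)
{
	unsigned int i;

	if (!e)
		return;
	/* Slots that were never filled are NULL; free(NULL) is a no-op. */
	for (i = 0; i < e->stripe_count; i++)
		free(e->oinfo[i]);
	free(e);
}

static struct demo_entry *demo_entry_alloc(unsigned int stripe_size,
					   unsigned int stripe_count)
{
	struct demo_entry *e;
	unsigned int i;

	/* Header plus one pointer slot per stripe, zero-filled. */
	e = calloc(1, sizeof(*e) + stripe_count * sizeof(e->oinfo[0]));
	if (!e)
		return NULL;

	e->stripe_size = stripe_size;
	e->stripe_count = stripe_count;

	for (i = 0; i < stripe_count; i++) {
		e->oinfo[i] = calloc(1, sizeof(*e->oinfo[i]));
		if (!e->oinfo[i]) {
			demo_entry_free(e);
			return NULL;
		}
	}
	return e;
}

/* A plain (non-composite) layout is a layout with exactly one entry. */
struct demo_layout *demo_layout_alloc_plain(unsigned int stripe_size,
					    unsigned int stripe_count)
{
	struct demo_layout *layout;
	struct demo_entry *e;

	e = demo_entry_alloc(stripe_size, stripe_count);
	if (!e)
		return NULL;

	layout = calloc(1, sizeof(*layout) + sizeof(layout->entries[0]));
	if (!layout) {
		demo_entry_free(e);
		return NULL;
	}

	layout->entry_count = 1;
	layout->entries[0] = e;
	return layout;
}

void demo_layout_free(struct demo_layout *layout)
{
	unsigned int i;

	if (!layout)
		return;
	for (i = 0; i < layout->entry_count; i++)
		demo_entry_free(layout->entries[i]);
	free(layout);
}

/* Stripe width of the first entry, widened before multiplying. */
unsigned long long demo_stripe_width(const struct demo_layout *layout)
{
	assert(layout && layout->entry_count > 0);
	return (unsigned long long)layout->entries[0]->stripe_size *
	       layout->entries[0]->stripe_count;
}
```

Callers that only understand plain layouts read entries[0], which is
the same pattern the diff below applies when it replaces
lsm->lsm_stripe_count with lsm->lsm_entries[0]->lsme_stripe_count.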

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24849
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h      |   4 -
 drivers/staging/lustre/lustre/lov/lov_ea.c       | 338 ++++++++++-------------
 drivers/staging/lustre/lustre/lov/lov_internal.h |  35 ++-
 drivers/staging/lustre/lustre/lov/lov_io.c       |  17 +-
 drivers/staging/lustre/lustre/lov/lov_merge.c    |   4 +-
 drivers/staging/lustre/lustre/lov/lov_object.c   |  80 +++---
 drivers/staging/lustre/lustre/lov/lov_offset.c   |   8 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c     |  32 +--
 8 files changed, 245 insertions(+), 273 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index d6a968c..15d9573 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -75,10 +75,6 @@ static inline void loi_kms_set(struct lov_oinfo *oinfo, __u64 kms)
 	oinfo->loi_kms_valid = 1;
 }
 
-static inline void loi_init(struct lov_oinfo *loi)
-{
-}
-
 struct lov_stripe_md;
 struct obd_info;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 3e1b6a8..135ca33 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -103,114 +103,106 @@ static int lsm_lmm_verify_v1v3(struct lov_mds_md *lmm, size_t lmm_size,
 	return 0;
 }
 
-void lsm_free(struct lov_stripe_md *lsm)
+static void lsme_free(struct lov_stripe_md_entry *lsme)
 {
-	__u16 stripe_count = lsm->lsm_stripe_count;
-	int i;
+	unsigned int stripe_count = lsme->lsme_stripe_count;
+	unsigned int i;
 
 	for (i = 0; i < stripe_count; i++)
-		kmem_cache_free(lov_oinfo_slab, lsm->lsm_oinfo[i]);
-	kvfree(lsm);
-}
-
-struct lov_stripe_md *lsm_alloc_plain(u16 stripe_count)
-{
-	size_t oinfo_ptrs_size, lsm_size;
-	struct lov_stripe_md *lsm;
-	struct lov_oinfo     *loi;
-	int i;
-
-	LASSERT(stripe_count <= LOV_MAX_STRIPE_COUNT);
-
-	oinfo_ptrs_size = sizeof(struct lov_oinfo *) * stripe_count;
-	lsm_size = sizeof(*lsm) + oinfo_ptrs_size;
+		kmem_cache_free(lov_oinfo_slab, lsme->lsme_oinfo[i]);
 
-	lsm = kvzalloc(lsm_size, GFP_NOFS);
-	if (!lsm)
-		return NULL;
-
-	for (i = 0; i < stripe_count; i++) {
-		loi = kmem_cache_zalloc(lov_oinfo_slab, GFP_NOFS);
-		if (!loi)
-			goto err;
-		lsm->lsm_oinfo[i] = loi;
-	}
-	lsm->lsm_stripe_count = stripe_count;
-	return lsm;
-
-err:
-	while (--i >= 0)
-		kmem_cache_free(lov_oinfo_slab, lsm->lsm_oinfo[i]);
-	kvfree(lsm);
-	return NULL;
+	kvfree(lsme);
 }
 
-static struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern,
-					   u32 magic)
+void lsm_free(struct lov_stripe_md *lsm)
 {
-	struct lov_stripe_md *lsm;
+	unsigned int entry_count = lsm->lsm_entry_count;
 	unsigned int i;
 
-	CDEBUG(D_INFO, "alloc lsm, stripe_count %u\n", stripe_count);
-
-	lsm = lsm_alloc_plain(stripe_count);
-	if (!lsm) {
-		CERROR("cannot allocate LSM stripe_count %u\n", stripe_count);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	atomic_set(&lsm->lsm_refc, 1);
-	spin_lock_init(&lsm->lsm_lock);
-	lsm->lsm_magic = magic;
-	lsm->lsm_stripe_count = stripe_count;
-	lsm->lsm_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES * stripe_count;
-	lsm->lsm_pattern = pattern;
-	lsm->lsm_pool_name[0] = '\0';
-	lsm->lsm_layout_gen = 0;
-	if (stripe_count > 0)
-		lsm->lsm_oinfo[0]->loi_ost_idx = ~0;
+	for (i = 0; i < entry_count; i++)
+		lsme_free(lsm->lsm_entries[i]);
 
-	for (i = 0; i < stripe_count; i++)
-		loi_init(lsm->lsm_oinfo[i]);
-
-	return lsm;
+	kfree(lsm);
 }
 
-static int lsm_unpackmd_v1v3(struct lov_obd *lov,
-			     struct lov_stripe_md *lsm,
-			     struct lov_mds_md *lmm,
-			     const char *pool_name,
-			     struct lov_ost_data_v1 *objects)
+/**
+ * Unpack a struct lov_mds_md into a struct lov_stripe_md_entry.
+ *
+ * The caller should set id and extent.
+ */
+static struct lov_stripe_md_entry *
+lsme_unpack(struct lov_obd *lov, struct lov_mds_md *lmm, size_t buf_size,
+	    const char *pool_name, struct lov_ost_data_v1 *objects,
+	    loff_t *maxbytes)
 {
+	struct lov_stripe_md_entry *lsme;
 	loff_t min_stripe_maxbytes = 0;
 	unsigned int stripe_count;
-	struct lov_oinfo *loi;
 	loff_t lov_bytes;
+	size_t lsme_size;
 	unsigned int i;
+	u32 pattern;
+	u32 magic;
+	int rc;
 
-	/*
-	 * This supposes lov_mds_md_v1/v3 first fields are
-	 * are the same
-	 */
-	lmm_oi_le_to_cpu(&lsm->lsm_oi, &lmm->lmm_oi);
-	lsm->lsm_stripe_size = le32_to_cpu(lmm->lmm_stripe_size);
-	lsm->lsm_pattern = le32_to_cpu(lmm->lmm_pattern);
-	lsm->lsm_layout_gen = le16_to_cpu(lmm->lmm_layout_gen);
-	lsm->lsm_pool_name[0] = '\0';
+	magic = le32_to_cpu(lmm->lmm_magic);
+	if (magic != LOV_MAGIC_V1 && magic != LOV_MAGIC_V3)
+		return ERR_PTR(-EINVAL);
+
+	pattern = le32_to_cpu(lmm->lmm_pattern);
+	if (pattern & LOV_PATTERN_F_RELEASED)
+		stripe_count = 0;
+	else
+		stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
+
+	if (buf_size < (magic == LOV_MAGIC_V1 ? sizeof(struct lov_mds_md_v1) :
+						sizeof(struct lov_mds_md_v3))) {
+		CERROR("LOV EA %s too small: %zu, need %u\n",
+		       magic == LOV_MAGIC_V1 ? "V1" : "V3", buf_size,
+		       lov_mds_md_size(stripe_count, magic == LOV_MAGIC_V1 ?
+				       LOV_MAGIC_V1 : LOV_MAGIC_V3));
+		lov_dump_lmm_common(D_WARNING, lmm);
+		return ERR_PTR(-EINVAL);
+	}
 
-	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
+	rc = lsm_lmm_verify_v1v3(lmm, buf_size, stripe_count);
+	if (rc < 0)
+		return ERR_PTR(rc);
+
+	lsme_size = offsetof(typeof(*lsme), lsme_oinfo[stripe_count]);
+	lsme = kvzalloc(lsme_size, GFP_KERNEL);
+	if (!lsme)
+		return ERR_PTR(-ENOMEM);
+
+	lsme->lsme_magic = magic;
+	lsme->lsme_pattern = pattern;
+	lsme->lsme_stripe_size = le32_to_cpu(lmm->lmm_stripe_size);
+	lsme->lsme_stripe_count = stripe_count;
+	lsme->lsme_layout_gen = le16_to_cpu(lmm->lmm_layout_gen);
 
 	if (pool_name) {
 		size_t pool_name_len;
 
-		pool_name_len = strlcpy(lsm->lsm_pool_name, pool_name,
-					sizeof(lsm->lsm_pool_name));
-		if (pool_name_len >= sizeof(lsm->lsm_pool_name))
-			return -E2BIG;
+		pool_name_len = strlcpy(lsme->lsme_pool_name, pool_name,
+					sizeof(lsme->lsme_pool_name));
+		if (pool_name_len >= sizeof(lsme->lsme_pool_name)) {
+			rc = -E2BIG;
+			goto out_lsme;
+		}
 	}
 
 	for (i = 0; i < stripe_count; i++) {
-		loi = lsm->lsm_oinfo[i];
+		struct lov_tgt_desc *ltd;
+		struct lov_oinfo *loi;
+
+		loi = kmem_cache_zalloc(lov_oinfo_slab, GFP_KERNEL);
+		if (!loi) {
+			rc = -ENOMEM;
+			goto out_lsme;
+		}
+
+		lsme->lsme_oinfo[i] = loi;
+
 		ostid_le_to_cpu(&objects[i].l_ost_oi, &loi->loi_oi);
 		loi->loi_ost_idx = le32_to_cpu(objects[i].l_ost_idx);
 		loi->loi_ost_gen = le32_to_cpu(objects[i].l_ost_gen);
@@ -223,10 +215,12 @@ static int lsm_unpackmd_v1v3(struct lov_obd *lov,
 			       (char *)lov->desc.ld_uuid.uuid,
 			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
 			lov_dump_lmm_v1(D_WARNING, lmm);
-			return -EINVAL;
+			rc = -EINVAL;
+			goto out_lsme;
 		}
 
-		if (!lov->lov_tgts[loi->loi_ost_idx]) {
+		ltd = lov->lov_tgts[loi->loi_ost_idx];
+		if (!ltd) {
 			CERROR("%s: OST index %d missing\n",
 			       (char *)lov->desc.ld_uuid.uuid,
 			       loi->loi_ost_idx);
@@ -234,7 +228,7 @@ static int lsm_unpackmd_v1v3(struct lov_obd *lov,
 			continue;
 		}
 
-		lov_bytes = lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx]);
+		lov_bytes = lov_tgt_maxbytes(ltd);
 		if (min_stripe_maxbytes == 0 || lov_bytes < min_stripe_maxbytes)
 			min_stripe_maxbytes = lov_bytes;
 	}
@@ -242,15 +236,68 @@ static int lsm_unpackmd_v1v3(struct lov_obd *lov,
 	if (min_stripe_maxbytes == 0)
 		min_stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
 
-	stripe_count = lsm->lsm_stripe_count ?: lov->desc.ld_tgt_count;
 	lov_bytes = min_stripe_maxbytes * stripe_count;
 
-	if (lov_bytes < min_stripe_maxbytes) /* handle overflow */
-		lsm->lsm_maxbytes = MAX_LFS_FILESIZE;
-	else
-		lsm->lsm_maxbytes = lov_bytes;
+	if (maxbytes) {
+		if (lov_bytes < min_stripe_maxbytes) /* handle overflow */
+			*maxbytes = MAX_LFS_FILESIZE;
+		else
+			*maxbytes = lov_bytes;
+	}
 
-	return 0;
+	return lsme;
+
+out_lsme:
+	for (i = 0; i < stripe_count; i++) {
+		struct lov_oinfo *loi = lsme->lsme_oinfo[i];
+
+		if (loi)
+			kmem_cache_free(lov_oinfo_slab, lsme->lsme_oinfo[i]);
+	}
+	kvfree(lsme);
+
+	return ERR_PTR(rc);
+}
+
+static inline struct lov_stripe_md *
+lsm_unpackmd_v1v3(struct lov_obd *lov,
+		  struct lov_mds_md *lmm, size_t buf_size,
+		  const char *pool_name,
+		  struct lov_ost_data_v1 *objects)
+{
+	struct lov_stripe_md_entry *lsme;
+	struct lov_stripe_md *lsm;
+	size_t lsm_size;
+	loff_t maxbytes;
+	u32 pattern;
+
+	pattern = le32_to_cpu(lmm->lmm_pattern);
+
+	lsme = lsme_unpack(lov, lmm, buf_size, pool_name, objects, &maxbytes);
+	if (IS_ERR(lsme))
+		return ERR_CAST(lsme);
+
+	lsme->lsme_extent.e_start = 0;
+	lsme->lsme_extent.e_end = LUSTRE_EOF;
+
+	lsm_size = offsetof(typeof(*lsm), lsm_entries[1]);
+	lsm = kzalloc(lsm_size, GFP_KERNEL);
+	if (!lsm) {
+		lsme_free(lsme);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	atomic_set(&lsm->lsm_refc, 1);
+	spin_lock_init(&lsm->lsm_lock);
+	lsm->lsm_maxbytes = maxbytes;
+	lmm_oi_le_to_cpu(&lsm->lsm_oi, &lmm->lmm_oi);
+	lsm->lsm_magic = le32_to_cpu(lmm->lmm_magic);
+	lsm->lsm_layout_gen = le16_to_cpu(lmm->lmm_layout_gen);
+	lsm->lsm_entry_count = 1;
+	lsm->lsm_is_released = pattern & LOV_PATTERN_F_RELEASED;
+	lsm->lsm_entries[0] = lsme;
+
+	return lsm;
 }
 
 static void
@@ -258,7 +305,8 @@ static int lsm_unpackmd_v1v3(struct lov_obd *lov,
 			  loff_t *lov_off, loff_t *swidth)
 {
 	if (swidth)
-		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
+		*swidth = (loff_t)lsm->lsm_entries[0]->lsme_stripe_size *
+			  lsm->lsm_entries[0]->lsme_stripe_count;
 }
 
 static void
@@ -266,59 +314,16 @@ static int lsm_unpackmd_v1v3(struct lov_obd *lov,
 			   loff_t *lov_off, loff_t *swidth)
 {
 	if (swidth)
-		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
-}
-
-static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, size_t lmm_bytes,
-			     __u16 *stripe_count)
-{
-	if (lmm_bytes < sizeof(*lmm)) {
-		CERROR("lov_mds_md_v1 too small: %zu, need at least %zu\n",
-		       lmm_bytes, sizeof(*lmm));
-		return -EINVAL;
-	}
-
-	*stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
-	if (le32_to_cpu(lmm->lmm_pattern) & LOV_PATTERN_F_RELEASED)
-		*stripe_count = 0;
-
-	if (lmm_bytes < lov_mds_md_size(*stripe_count, LOV_MAGIC_V1)) {
-		CERROR("LOV EA V1 too small: %zu, need %d\n",
-		       lmm_bytes, lov_mds_md_size(*stripe_count, LOV_MAGIC_V1));
-		lov_dump_lmm_common(D_WARNING, lmm);
-		return -EINVAL;
-	}
-
-	return lsm_lmm_verify_v1v3(lmm, lmm_bytes, *stripe_count);
+		*swidth = (loff_t)lsm->lsm_entries[0]->lsme_stripe_size *
+			  lsm->lsm_entries[0]->lsme_stripe_count;
 }
 
 static struct lov_stripe_md *
 lsm_unpackmd_v1(struct lov_obd *lov, void *buf, size_t buf_size)
 {
 	struct lov_mds_md_v1 *lmm = buf;
-	u32 magic = le32_to_cpu(lmm->lmm_magic);
-	struct lov_stripe_md *lsm;
-	u16 stripe_count;
-	u32 pattern;
-	int rc;
 
-	rc = lsm_lmm_verify_v1(lmm, buf_size, &stripe_count);
-	if (rc)
-		return ERR_PTR(rc);
-
-	pattern = le32_to_cpu(lmm->lmm_pattern);
-
-	lsm = lov_lsm_alloc(stripe_count, pattern, magic);
-	if (IS_ERR(lsm))
-		return lsm;
-
-	rc = lsm_unpackmd_v1v3(lov, lsm, lmm, NULL, lmm->lmm_objects);
-	if (rc) {
-		lov_free_memmd(&lsm);
-		lsm = ERR_PTR(rc);
-	}
-
-	return lsm;
+	return lsm_unpackmd_v1v3(lov, buf, buf_size, NULL, lmm->lmm_objects);
 }
 
 const static struct lsm_operations lsm_v1_ops = {
@@ -327,58 +332,13 @@ static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, size_t lmm_bytes,
 	.lsm_unpackmd		= lsm_unpackmd_v1,
 };
 
-static int lsm_lmm_verify_v3(struct lov_mds_md_v3 *lmm, size_t lmm_bytes,
-			     __u16 *stripe_count)
-{
-	if (lmm_bytes < sizeof(*lmm)) {
-		CERROR("lov_mds_md_v3 too small: %zu, need at least %zu\n",
-		       lmm_bytes, sizeof(*lmm));
-		return -EINVAL;
-	}
-
-	*stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
-	if (le32_to_cpu(lmm->lmm_pattern) & LOV_PATTERN_F_RELEASED)
-		*stripe_count = 0;
-
-	if (lmm_bytes < lov_mds_md_size(*stripe_count, LOV_MAGIC_V3)) {
-		CERROR("LOV EA V3 too small: %zu, need %d\n",
-		       lmm_bytes, lov_mds_md_size(*stripe_count, LOV_MAGIC_V3));
-		lov_dump_lmm_common(D_WARNING, lmm);
-		return -EINVAL;
-	}
-
-	return lsm_lmm_verify_v1v3((struct lov_mds_md_v1 *)lmm, lmm_bytes,
-				     *stripe_count);
-}
-
 static struct lov_stripe_md *
 lsm_unpackmd_v3(struct lov_obd *lov, void *buf, size_t buf_size)
 {
 	struct lov_mds_md_v3 *lmm = buf;
-	u32 magic = le32_to_cpu(lmm->lmm_magic);
-	struct lov_stripe_md *lsm;
-	u16 stripe_count;
-	u32 pattern;
-	int rc;
-
-	rc = lsm_lmm_verify_v3(lmm, buf_size, &stripe_count);
-	if (rc)
-		return ERR_PTR(rc);
-
-	pattern = le32_to_cpu(lmm->lmm_pattern);
-
-	lsm = lov_lsm_alloc(stripe_count, pattern, magic);
-	if (IS_ERR(lsm))
-		return lsm;
 
-	rc = lsm_unpackmd_v1v3(lov, lsm, (struct lov_mds_md_v1 *)lmm,
-			       lmm->lmm_pool_name, lmm->lmm_objects);
-	if (rc) {
-		lov_free_memmd(&lsm);
-		lsm = ERR_PTR(rc);
-	}
-
-	return lsm;
+	return lsm_unpackmd_v1v3(lov, buf, buf_size, lmm->lmm_pool_name,
+				 lmm->lmm_objects);
 }
 
 const static struct lsm_operations lsm_v3_ops = {
@@ -403,9 +363,9 @@ const struct lsm_operations *lsm_op_find(int magic)
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
 {
 	CDEBUG(level, "lsm %p, objid " DOSTID ", maxbytes %#llx, magic 0x%08X, stripe_size %u, stripe_count %u, refc: %d, layout_gen %u, pool [" LOV_POOLNAMEF "]\n",
-	       lsm,
-	       POSTID(&lsm->lsm_oi), lsm->lsm_maxbytes, lsm->lsm_magic,
-	       lsm->lsm_stripe_size, lsm->lsm_stripe_count,
+	       lsm, POSTID(&lsm->lsm_oi), lsm->lsm_maxbytes, lsm->lsm_magic,
+	       lsm->lsm_entries[0]->lsme_stripe_size,
+	       lsm->lsm_entries[0]->lsme_stripe_count,
 	       atomic_read(&lsm->lsm_refc), lsm->lsm_layout_gen,
-	       lsm->lsm_pool_name);
+	       lsm->lsm_entries[0]->lsme_pool_name);
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index ae122f6..f2747c9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -44,6 +44,18 @@
  */
 #define LUSTRE_EXT3_STRIPE_MAXBYTES 0x1fffffff000ULL
 
+struct lov_stripe_md_entry {
+	struct lu_extent	lsme_extent;
+	u32			lsme_id;
+	u32			lsme_magic;
+	u32			lsme_pattern;
+	u32			lsme_stripe_size;
+	u16			lsme_stripe_count;
+	u16			lsme_layout_gen;
+	char			lsme_pool_name[LOV_MAXPOOLNAME + 1];
+	struct lov_oinfo       *lsme_oinfo[];
+};
+
 struct lov_stripe_md {
 	atomic_t	lsm_refc;
 	spinlock_t	lsm_lock;
@@ -56,28 +68,15 @@ struct lov_stripe_md {
 	loff_t		lsm_maxbytes;
 	struct ost_id	lsm_oi;
 	u32		lsm_magic;
-	u32		lsm_stripe_size;
-	u32		lsm_pattern; /* RAID0, RAID1, released, ... */
-	u16		lsm_stripe_count;
-	u16		lsm_layout_gen;
-	char		lsm_pool_name[LOV_MAXPOOLNAME + 1];
-	struct lov_oinfo	*lsm_oinfo[0];
+	u32		lsm_layout_gen;
+	u32		lsm_entry_count;
+	bool		lsm_is_released;
+	struct lov_stripe_md_entry *lsm_entries[];
 };
 
-static inline bool lsm_is_released(struct lov_stripe_md *lsm)
-{
-	return !!(lsm->lsm_pattern & LOV_PATTERN_F_RELEASED);
-}
-
 static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
 {
-	if (!lsm)
-		return false;
-
-	if (lsm_is_released(lsm))
-		return false;
-
-	return true;
+	return lsm && !lsm->lsm_is_released;
 }
 
 struct lsm_operations {
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 6537ba3..2d62566 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -251,10 +251,9 @@ static int lov_io_subio_init(const struct lu_env *env, struct lov_io *lio,
 	 * Need to be optimized, we can't afford to allocate a piece of memory
 	 * when writing a page. -jay
 	 */
-	lio->lis_subs =
-		kvzalloc(lsm->lsm_stripe_count *
+	lio->lis_subs = kcalloc(lsm->lsm_entries[0]->lsme_stripe_count,
 				sizeof(lio->lis_subs[0]),
-				GFP_NOFS);
+				GFP_KERNEL);
 	if (lio->lis_subs) {
 		lio->lis_nr_subios = lio->lis_stripe_count;
 		lio->lis_single_subio_index = -1;
@@ -272,7 +271,7 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 	io->ci_result = 0;
 	lio->lis_object = obj;
 
-	lio->lis_stripe_count = obj->lo_lsm->lsm_stripe_count;
+	lio->lis_stripe_count = obj->lo_lsm->lsm_entries[0]->lsme_stripe_count;
 
 	switch (io->ci_type) {
 	case CIT_READ:
@@ -287,7 +286,7 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 			 * If there is LOV EA hole, then we may cannot locate
 			 * the current file-tail exactly.
 			 */
-			if (unlikely(obj->lo_lsm->lsm_pattern &
+			if (unlikely(obj->lo_lsm->lsm_entries[0]->lsme_pattern &
 				     LOV_PATTERN_F_HOLE))
 				return -EIO;
 
@@ -419,9 +418,9 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	struct lov_io	*lio = cl2lov_io(env, ios);
 	struct cl_io	 *io  = ios->cis_io;
 	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
-	__u64 start = io->u.ci_rw.crw_pos;
+	unsigned long ssize = lsm->lsm_entries[0]->lsme_stripe_size;
+	u64 start = io->u.ci_rw.crw_pos;
 	loff_t next;
-	unsigned long ssize = lsm->lsm_stripe_size;
 
 	LASSERT(io->ci_type == CIT_READ || io->ci_type == CIT_WRITE);
 
@@ -596,11 +595,11 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	if (ra_end != CL_PAGE_EOF)
 		ra_end = lov_stripe_pgoff(loo->lo_lsm, ra_end, stripe);
 
-	pps = loo->lo_lsm->lsm_stripe_size >> PAGE_SHIFT;
+	pps = loo->lo_lsm->lsm_entries[0]->lsme_stripe_size >> PAGE_SHIFT;
 
 	CDEBUG(D_READA, DFID " max_index = %lu, pps = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
 	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps,
-	       loo->lo_lsm->lsm_stripe_size, stripe, start);
+	       loo->lo_lsm->lsm_entries[0]->lsme_stripe_size, stripe, start);
 
 	/* never exceed the end of the stripe */
 	ra->cra_end = min_t(pgoff_t, ra_end, start + pps - start % pps - 1);
diff --git a/drivers/staging/lustre/lustre/lov/lov_merge.c b/drivers/staging/lustre/lustre/lov/lov_merge.c
index 006717c..10b8448 100644
--- a/drivers/staging/lustre/lustre/lov/lov_merge.c
+++ b/drivers/staging/lustre/lustre/lov/lov_merge.c
@@ -59,8 +59,8 @@ int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
 	CDEBUG(D_INODE, "MDT ID " DOSTID " initial value: s=%llu m=%llu a=%llu c=%llu b=%llu\n",
 	       POSTID(&lsm->lsm_oi), lvb->lvb_size, lvb->lvb_mtime,
 	       lvb->lvb_atime, lvb->lvb_ctime, lvb->lvb_blocks);
-	for (i = 0; i < lsm->lsm_stripe_count; i++) {
-		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
+	for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count; i++) {
+		struct lov_oinfo *loi = lsm->lsm_entries[0]->lsme_oinfo[i];
 		u64 lov_size, tmpsize;
 
 		if (OST_LVB_IS_ERR(loi->loi_lvb.lvb_blocks)) {
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index adc90f3..ad2901a 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -153,7 +153,7 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 	hdr    = cl_object_header(lov2cl(lov));
 	subhdr = cl_object_header(stripe);
 
-	oinfo = lov->lo_lsm->lsm_oinfo[idx];
+	oinfo = lov->lo_lsm->lsm_entries[0]->lsme_oinfo[idx];
 	CDEBUG(D_INODE, DFID "@%p[%d] -> " DFID "@%p: ostid: " DOSTID " idx: %d gen: %d\n",
 	       PFID(&subhdr->coh_lu.loh_fid), subhdr, idx,
 	       PFID(&hdr->coh_lu.loh_fid), hdr, POSTID(&oinfo->loi_oi),
@@ -239,7 +239,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	LASSERT(!lov->lo_lsm);
 	lov->lo_lsm = lsm_addref(lsm);
 	lov->lo_layout_invalid = true;
-	r0->lo_nr  = lsm->lsm_stripe_count;
+	r0->lo_nr  = lsm->lsm_entries[0]->lsme_stripe_count;
 	LASSERT(r0->lo_nr <= lov_targets_nr(dev));
 
 	r0->lo_sub = kvzalloc(r0->lo_nr * sizeof(r0->lo_sub[0]),
@@ -255,9 +255,10 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 		 */
 		for (i = 0; i < r0->lo_nr && result == 0; ++i) {
 			struct cl_device *subdev;
-			struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
-			int ost_idx = oinfo->loi_ost_idx;
+			struct lov_oinfo *oinfo;
+			int ost_idx;
 
+			oinfo = lsm->lsm_entries[0]->lsme_oinfo[i];
 			if (lov_oinfo_is_dummy(oinfo))
 				continue;
 
@@ -266,6 +267,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 			if (result != 0)
 				goto out;
 
+			ost_idx = oinfo->loi_ost_idx;
 			if (!dev->ld_target[ost_idx]) {
 				CERROR("%s: OST %04x is not initialized\n",
 				lov2obd(dev->ld_lov)->obd_name, ost_idx);
@@ -314,7 +316,7 @@ static int lov_init_released(const struct lu_env *env, struct lov_device *dev,
 			     union  lov_layout_state *state)
 {
 	LASSERT(lsm);
-	LASSERT(lsm_is_released(lsm));
+	LASSERT(lsm->lsm_is_released);
 	LASSERT(!lov->lo_lsm);
 
 	lov->lo_lsm = lsm_addref(lsm);
@@ -327,7 +329,7 @@ static struct cl_object *lov_find_subobj(const struct lu_env *env,
 					 int stripe_idx)
 {
 	struct lov_device *dev = lu2lov_dev(lov2lu(lov)->lo_dev);
-	struct lov_oinfo *oinfo = lsm->lsm_oinfo[stripe_idx];
+	struct lov_oinfo *oinfo = lsm->lsm_entries[0]->lsme_oinfo[stripe_idx];
 	struct lov_thread_info *lti = lov_env_info(env);
 	struct lu_fid *ofid = &lti->lti_fid;
 	struct cl_device *subdev;
@@ -485,7 +487,7 @@ static int lov_print_raid0(const struct lu_env *env, void *cookie,
 	(*p)(env, cookie, "stripes: %d, %s, lsm{%p 0x%08X %d %u %u}:\n",
 	     r0->lo_nr, lov->lo_layout_invalid ? "invalid" : "valid", lsm,
 	     lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
-	     lsm->lsm_stripe_count, lsm->lsm_layout_gen);
+	     lsm->lsm_entries[0]->lsme_stripe_count, lsm->lsm_layout_gen);
 	for (i = 0; i < r0->lo_nr; ++i) {
 		struct lu_object *sub;
 
@@ -509,7 +511,7 @@ static int lov_print_released(const struct lu_env *env, void *cookie,
 	     "released: %s, lsm{%p 0x%08X %d %u %u}:\n",
 	     lov->lo_layout_invalid ? "invalid" : "valid", lsm,
 	     lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
-	     lsm->lsm_stripe_count, lsm->lsm_layout_gen);
+	     lsm->lsm_entries[0]->lsme_stripe_count, lsm->lsm_layout_gen);
 	return 0;
 }
 
@@ -650,8 +652,13 @@ static enum lov_layout_type lov_type(struct lov_stripe_md *lsm)
 {
 	if (!lsm)
 		return LLT_EMPTY;
-	if (lsm_is_released(lsm))
+
+	if (lsm->lsm_magic == LOV_MAGIC_COMP_V1)
+		return LLT_EMPTY;
+
+	if (lsm->lsm_is_released)
 		return LLT_RELEASED;
+
 	return LLT_RAID0;
 }
 
@@ -882,7 +889,8 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 	if ((!lsm && !lov->lo_lsm) ||
 	    ((lsm && lov->lo_lsm) &&
 	     (lov->lo_lsm->lsm_layout_gen == lsm->lsm_layout_gen) &&
-	     (lov->lo_lsm->lsm_pattern == lsm->lsm_pattern))) {
+	     (lov->lo_lsm->lsm_entries[0]->lsme_pattern ==
+	      lsm->lsm_entries[0]->lsme_pattern))) {
 		/* same version of layout */
 		lov->lo_layout_invalid = false;
 		result = 0;
@@ -1010,19 +1018,24 @@ static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm,
 	u64 obd_end;
 	int i, j;
 
-	if (fm_end - fm_start > lsm->lsm_stripe_size * lsm->lsm_stripe_count) {
-		last_stripe = (start_stripe < 1 ? lsm->lsm_stripe_count - 1 :
+	if (fm_end - fm_start > lsm->lsm_entries[0]->lsme_stripe_size *
+				lsm->lsm_entries[0]->lsme_stripe_count) {
+		last_stripe = (start_stripe < 1 ?
+			       lsm->lsm_entries[0]->lsme_stripe_count - 1 :
 			       start_stripe - 1);
-		*stripe_count = lsm->lsm_stripe_count;
+		*stripe_count = lsm->lsm_entries[0]->lsme_stripe_count;
 	} else {
-		for (j = 0, i = start_stripe; j < lsm->lsm_stripe_count;
-		     i = (i + 1) % lsm->lsm_stripe_count, j++) {
-			if (!(lov_stripe_intersects(lsm, i, fm_start, fm_end,
-						    &obd_start, &obd_end)))
+		for (j = 0, i = start_stripe;
+		     j < lsm->lsm_entries[0]->lsme_stripe_count;
+		     i = (i + 1) % lsm->lsm_entries[0]->lsme_stripe_count,
+		     j++) {
+			if (lov_stripe_intersects(lsm, i, fm_start, fm_end,
+						  &obd_start, &obd_end) == 0)
 				break;
 		}
 		*stripe_count = j;
-		last_stripe = (start_stripe + j - 1) % lsm->lsm_stripe_count;
+		last_stripe = (start_stripe + j - 1) %
+			      lsm->lsm_entries[0]->lsme_stripe_count;
 	}
 
 	return last_stripe;
@@ -1090,8 +1103,8 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 		return 0;
 
 	/* Find out stripe_no from ost_index saved in the fe_device */
-	for (i = 0; i < lsm->lsm_stripe_count; i++) {
-		struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
+	for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count; i++) {
+		struct lov_oinfo *oinfo = lsm->lsm_entries[0]->lsme_oinfo[i];
 
 		if (lov_oinfo_is_dummy(oinfo))
 			continue;
@@ -1110,7 +1123,7 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 	 * offset to start of next device
 	 */
 	if (lov_stripe_intersects(lsm, stripe_no, fm_start, fm_end,
-				  &lun_start, &lun_end) &&
+				  &lun_start, &lun_end) != 0 &&
 	    local_end < lun_end) {
 		fm_end_offset = local_end;
 		*start_stripe = stripe_no;
@@ -1119,7 +1132,8 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 		 * calculate offset in next stripe.
 		 */
 		fm_end_offset = 0;
-		*start_stripe = (stripe_no + 1) % lsm->lsm_stripe_count;
+		*start_stripe = (stripe_no + 1) %
+				lsm->lsm_entries[0]->lsme_stripe_count;
 	}
 
 	return fm_end_offset;
@@ -1168,7 +1182,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 				   &lun_start, &obd_object_end)) == 0)
 		return 0;
 
-	if (lov_oinfo_is_dummy(lsm->lsm_oinfo[stripeno]))
+	if (lov_oinfo_is_dummy(lsm->lsm_entries[0]->lsme_oinfo[stripeno]))
 		return -EIO;
 
 	/* If this is a continuation FIEMAP call and we are on
@@ -1218,7 +1232,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 		fs->fs_fm->fm_mapped_extents = 0;
 		fs->fs_fm->fm_flags = fiemap->fm_flags;
 
-		ost_index = lsm->lsm_oinfo[stripeno]->loi_ost_idx;
+		ost_index = lsm->lsm_entries[0]->lsme_oinfo[stripeno]->loi_ost_idx;
 
 		if (ost_index < 0 || ost_index >= lov->desc.ld_tgt_count) {
 			rc = -EINVAL;
@@ -1347,13 +1361,13 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 	 * If the stripe_count > 1 and the application does not understand
 	 * DEVICE_ORDER flag, it cannot interpret the extents correctly.
 	 */
-	if (lsm->lsm_stripe_count > 1 &&
+	if (lsm->lsm_entries[0]->lsme_stripe_count > 1 &&
 	    !(fiemap->fm_flags & FIEMAP_FLAG_DEVICE_ORDER)) {
 		rc = -ENOTSUPP;
 		goto out;
 	}
 
-	if (lsm_is_released(lsm)) {
+	if (lsm->lsm_is_released) {
 		if (fiemap->fm_start < fmkey->lfik_oa.o_size) {
 			/**
 			 * released file, return a minimal FIEMAP if
@@ -1431,7 +1445,8 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 	/* Check each stripe */
 	for (cur_stripe = fs.fs_start_stripe; stripe_count > 0;
 	     --stripe_count,
-	     cur_stripe = (cur_stripe + 1) % lsm->lsm_stripe_count) {
+	     cur_stripe = (cur_stripe + 1) %
+			  lsm->lsm_entries[0]->lsme_stripe_count) {
 		rc = fiemap_for_stripe(env, obj, lsm, fiemap, buflen, fmkey,
 				       cur_stripe, &fs);
 		if (rc < 0)
@@ -1443,7 +1458,7 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 	 * Indicate that we are returning device offsets unless file just has
 	 * single stripe
 	 */
-	if (lsm->lsm_stripe_count > 1)
+	if (lsm->lsm_entries[0]->lsme_stripe_count > 1)
 		fiemap->fm_flags |= FIEMAP_FLAG_DEVICE_ORDER;
 
 	if (!fiemap->fm_extent_count)
@@ -1495,7 +1510,8 @@ static int lov_object_layout_get(const struct lu_env *env,
 		return 0;
 	}
 
-	cl->cl_size = lov_mds_md_size(lsm->lsm_stripe_count, lsm->lsm_magic);
+	cl->cl_size = lov_mds_md_size(lsm->lsm_entries[0]->lsme_stripe_count,
+				      lsm->lsm_magic);
 	cl->cl_layout_gen = lsm->lsm_layout_gen;
 
 	rc = lov_lsm_pack(lsm, buf->lb_buf, buf->lb_len);
@@ -1599,9 +1615,11 @@ int lov_read_and_clear_async_rc(struct cl_object *clob)
 			int i;
 
 			lsm = lov->lo_lsm;
-			for (i = 0; i < lsm->lsm_stripe_count; i++) {
-				struct lov_oinfo *loi = lsm->lsm_oinfo[i];
+			for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count;
+			     i++) {
+				struct lov_oinfo *loi;
 
+				loi = lsm->lsm_entries[0]->lsme_oinfo[i];
 				if (lov_oinfo_is_dummy(loi))
 					continue;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_offset.c b/drivers/staging/lustre/lustre/lov/lov_offset.c
index a5f00f6..19a44d3 100644
--- a/drivers/staging/lustre/lustre/lov/lov_offset.c
+++ b/drivers/staging/lustre/lustre/lov/lov_offset.c
@@ -40,7 +40,7 @@
 /* compute object size given "stripeno" and the ost size */
 u64 lov_stripe_size(struct lov_stripe_md *lsm, u64 ost_size, int stripeno)
 {
-	unsigned long ssize = lsm->lsm_stripe_size;
+	unsigned long ssize = lsm->lsm_entries[0]->lsme_stripe_size;
 	unsigned long stripe_size;
 	u64 swidth;
 	u64 lov_size;
@@ -125,7 +125,7 @@ pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, pgoff_t stripe_index,
 int lov_stripe_offset(struct lov_stripe_md *lsm, u64 lov_off,
 		      int stripeno, u64 *obdoff)
 {
-	unsigned long ssize  = lsm->lsm_stripe_size;
+	unsigned long ssize  = lsm->lsm_entries[0]->lsme_stripe_size;
 	u64 stripe_off, this_stripe, swidth;
 	int magic = lsm->lsm_magic;
 	int ret = 0;
@@ -180,7 +180,7 @@ int lov_stripe_offset(struct lov_stripe_md *lsm, u64 lov_off,
 u64 lov_size_to_stripe(struct lov_stripe_md *lsm, u64 file_size,
 		       int stripeno)
 {
-	unsigned long ssize  = lsm->lsm_stripe_size;
+	unsigned long ssize  = lsm->lsm_entries[0]->lsme_stripe_size;
 	u64 stripe_off, this_stripe, swidth;
 	int magic = lsm->lsm_magic;
 
@@ -254,7 +254,7 @@ int lov_stripe_intersects(struct lov_stripe_md *lsm, int stripeno,
 /* compute which stripe number "lov_off" will be written into */
 int lov_stripe_number(struct lov_stripe_md *lsm, u64 lov_off)
 {
-	unsigned long ssize  = lsm->lsm_stripe_size;
+	unsigned long ssize  = lsm->lsm_entries[0]->lsme_stripe_size;
 	u64 stripe_off, swidth;
 	int magic = lsm->lsm_magic;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 90f9f2d..3700937 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -116,7 +116,8 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 	size_t lmm_size;
 	unsigned int i;
 
-	lmm_size = lov_mds_md_size(lsm->lsm_stripe_count, lsm->lsm_magic);
+	lmm_size = lov_mds_md_size(lsm->lsm_entries[0]->lsme_stripe_count,
+				   lsm->lsm_magic);
 	if (!buf_size)
 		return lmm_size;
 
@@ -129,23 +130,24 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 	 */
 	lmmv1->lmm_magic = cpu_to_le32(lsm->lsm_magic);
 	lmm_oi_cpu_to_le(&lmmv1->lmm_oi, &lsm->lsm_oi);
-	lmmv1->lmm_stripe_size = cpu_to_le32(lsm->lsm_stripe_size);
-	lmmv1->lmm_stripe_count = cpu_to_le16(lsm->lsm_stripe_count);
-	lmmv1->lmm_pattern = cpu_to_le32(lsm->lsm_pattern);
+	lmmv1->lmm_stripe_size = cpu_to_le32(lsm->lsm_entries[0]->lsme_stripe_size);
+	lmmv1->lmm_stripe_count = cpu_to_le16(lsm->lsm_entries[0]->lsme_stripe_count);
+	lmmv1->lmm_pattern = cpu_to_le32(lsm->lsm_entries[0]->lsme_pattern);
 	lmmv1->lmm_layout_gen = cpu_to_le16(lsm->lsm_layout_gen);
 
 	if (lsm->lsm_magic == LOV_MAGIC_V3) {
-		BUILD_BUG_ON(sizeof(lsm->lsm_pool_name) !=
-			 sizeof(lmmv3->lmm_pool_name));
-		strlcpy(lmmv3->lmm_pool_name, lsm->lsm_pool_name,
+		BUILD_BUG_ON(sizeof(lsm->lsm_entries[0]->lsme_pool_name) !=
+			     sizeof(lmmv3->lmm_pool_name));
+		strlcpy(lmmv3->lmm_pool_name,
+			lsm->lsm_entries[0]->lsme_pool_name,
 			sizeof(lmmv3->lmm_pool_name));
 		lmm_objects = lmmv3->lmm_objects;
 	} else {
 		lmm_objects = lmmv1->lmm_objects;
 	}
 
-	for (i = 0; i < lsm->lsm_stripe_count; i++) {
-		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
+	for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count; i++) {
+		struct lov_oinfo *loi = lsm->lsm_entries[0]->lsme_oinfo[i];
 
 		ostid_cpu_to_le(&loi->loi_oi, &lmm_objects[i].l_ost_oi);
 		lmm_objects[i].l_ost_gen = cpu_to_le32(loi->loi_ost_gen);
@@ -240,8 +242,8 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		goto out;
 	}
 
-	if (!lsm_is_released(lsm))
-		stripe_count = lsm->lsm_stripe_count;
+	if (!lsm->lsm_is_released)
+		stripe_count = lsm->lsm_entries[0]->lsme_stripe_count;
 	else
 		stripe_count = 0;
 
@@ -260,18 +262,16 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		goto out;
 	}
 
-	if (lum.lmm_stripe_count &&
-	    (lum.lmm_stripe_count < lsm->lsm_stripe_count)) {
+	if (lum.lmm_stripe_count && lum.lmm_stripe_count < stripe_count) {
 		/* Return right size of stripe to user */
 		lum.lmm_stripe_count = stripe_count;
 		rc = copy_to_user(lump, &lum, lum_size);
 		rc = -EOVERFLOW;
 		goto out;
 	}
-	lmmk_size = lov_mds_md_size(stripe_count, lsm->lsm_magic);
-
 
-	lmmk = kvzalloc(lmmk_size, GFP_NOFS);
+	lmmk_size = lov_mds_md_size(stripe_count, lsm->lsm_magic);
+	lmmk = kvzalloc(lmmk_size, GFP_KERNEL);
 	if (!lmmk) {
 		rc = -ENOMEM;
 		goto out;
-- 
1.8.3.1

* [lustre-devel] [PATCH 06/28] lustre: lov: add composite layout unpacking
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (4 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 05/28] lustre: lov: create struct lov_stripe_md_entry James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 07/28] lustre: lov: embedded raid0 in struct lov_layout_composite James Simmons
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Update struct lov_stripe_md to accommodate composite layouts. Add
methods to unpack composite layouts.
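
As a rough illustration of the bounds checking that unpacking a composite
layout needs, here is a minimal userspace sketch. It is not the code from this
patch: struct comp_entry and comp_verify() are hypothetical names, the fields
are assumed to be already converted to host byte order, and the overflow-safe
subtraction is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, host-endian stand-in for one composite layout entry. */
struct comp_entry {
	uint32_t offset;	/* blob offset from the start of the buffer */
	uint32_t size;		/* blob size in bytes */
};

/*
 * Return 0 if every blob lies inside the first total_size bytes of a
 * buffer that holds buf_size bytes, -EINVAL otherwise.
 */
static int comp_verify(const struct comp_entry *entries, unsigned int count,
		       size_t total_size, size_t buf_size)
{
	unsigned int i;

	assert(entries || count == 0);

	/* The declared layout size must fit in the buffer we were given. */
	if (buf_size < total_size)
		return -EINVAL;

	for (i = 0; i < count; i++) {
		/* Written as a subtraction so the sum cannot wrap around. */
		if (entries[i].offset > total_size ||
		    entries[i].size > total_size - entries[i].offset)
			return -EINVAL;
	}

	return 0;
}
```

Each entry is checked against the declared layout size rather than the buffer
size, so trailing bytes in an oversized buffer are never treated as layout
data.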

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24849
WC-bug-id: https://jira.whamcloud.com/browse/LU-9315
Reviewed-on: https://review.whamcloud.com/26503
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_ea.c   | 175 ++++++++++++++++++++++++++-
 drivers/staging/lustre/lustre/lov/lov_pack.c |   3 +
 2 files changed, 175 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 135ca33..7d3d691 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -38,12 +38,21 @@
 #define DEBUG_SUBSYSTEM S_LOV
 
 #include <asm/div64.h>
+#include <linux/sort.h>
 
 #include <obd_class.h>
 #include <uapi/linux/lustre/lustre_idl.h>
+#include <uapi/linux/lustre/lustre_user.h>
 
 #include "lov_internal.h"
 
+static inline void lu_extent_le_to_cpu(struct lu_extent *dst,
+				       const struct lu_extent *src)
+{
+	dst->e_start = le64_to_cpu(src->e_start);
+	dst->e_end = le64_to_cpu(src->e_end);
+}
+
 /*
  * Find minimum stripe maxbytes value.  For inactive or
  * reconnecting targets use LUSTRE_EXT3_STRIPE_MAXBYTES.
@@ -347,17 +356,177 @@ void lsm_free(struct lov_stripe_md *lsm)
 	.lsm_unpackmd		= lsm_unpackmd_v3,
 };
 
+static int lsm_verify_comp_md_v1(struct lov_comp_md_v1 *lcm,
+				 size_t lcm_buf_size)
+{
+	unsigned int entry_count;
+	size_t lcm_size;
+	unsigned int i;
+
+	lcm_size = le32_to_cpu(lcm->lcm_size);
+	if (lcm_buf_size < lcm_size) {
+		CERROR("bad LCM buffer size %zu, expected %zu\n",
+		       lcm_buf_size, lcm_size);
+		return -EINVAL;
+	}
+
+	entry_count = le16_to_cpu(lcm->lcm_entry_count);
+	for (i = 0; i < entry_count; i++) {
+		struct lov_comp_md_entry_v1 *lcme = &lcm->lcm_entries[i];
+		size_t blob_offset;
+		size_t blob_size;
+
+		blob_offset = le32_to_cpu(lcme->lcme_offset);
+		blob_size = le32_to_cpu(lcme->lcme_size);
+
+		if (lcm_size < blob_offset || lcm_size < blob_size ||
+		    lcm_size < blob_offset + blob_size) {
+			CERROR("LCM entry %u has invalid blob: LCM size = %zu, offset = %zu, size = %zu\n",
+			       le32_to_cpu(lcme->lcme_id), lcm_size,
+			       blob_offset, blob_size);
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static struct lov_stripe_md_entry *
+lsme_unpack_comp(struct lov_obd *lov, struct lov_mds_md *lmm,
+		 size_t lmm_buf_size, loff_t *maxbytes)
+{
+	unsigned int stripe_count;
+	unsigned int magic;
+
+	stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
+	if (stripe_count == 0)
+		return ERR_PTR(-EINVAL);
+
+	magic = le32_to_cpu(lmm->lmm_magic);
+	if (magic != LOV_MAGIC_V1 && magic != LOV_MAGIC_V3)
+		return ERR_PTR(-EINVAL);
+
+	if (lmm_buf_size < lov_mds_md_size(stripe_count, magic))
+		return ERR_PTR(-EINVAL);
+
+	if (magic == LOV_MAGIC_V1) {
+		return lsme_unpack(lov, lmm, lmm_buf_size, NULL,
+				   lmm->lmm_objects, maxbytes);
+	} else {
+		struct lov_mds_md_v3 *lmm3 = (struct lov_mds_md_v3 *)lmm;
+
+		return lsme_unpack(lov, lmm, lmm_buf_size, lmm3->lmm_pool_name,
+				   lmm3->lmm_objects, maxbytes);
+	}
+}
+
+static struct lov_stripe_md *
+lsm_unpackmd_comp_md_v1(struct lov_obd *lov, void *buf, size_t buf_size)
+{
+	struct lov_comp_md_v1 *lcm = buf;
+	struct lov_stripe_md *lsm;
+	unsigned int entry_count = 0;
+	loff_t maxbytes;
+	size_t lsm_size;
+	unsigned int i;
+	int rc;
+
+	rc = lsm_verify_comp_md_v1(buf, buf_size);
+	if (rc < 0)
+		return ERR_PTR(rc);
+
+	entry_count = le16_to_cpu(lcm->lcm_entry_count);
+
+	lsm_size = offsetof(typeof(*lsm), lsm_entries[entry_count]);
+	lsm = kzalloc(lsm_size, GFP_KERNEL);
+	if (!lsm)
+		return ERR_PTR(-ENOMEM);
+
+	atomic_set(&lsm->lsm_refc, 1);
+	spin_lock_init(&lsm->lsm_lock);
+	lsm->lsm_magic = le32_to_cpu(lcm->lcm_magic);
+	lsm->lsm_layout_gen = le32_to_cpu(lcm->lcm_layout_gen);
+	lsm->lsm_entry_count = entry_count;
+	lsm->lsm_is_released = true;
+	lsm->lsm_maxbytes = LLONG_MIN;
+
+	for (i = 0; i < entry_count; i++) {
+		struct lov_comp_md_entry_v1 *lcme = &lcm->lcm_entries[i];
+		struct lov_stripe_md_entry *lsme;
+		size_t blob_offset;
+		size_t blob_size;
+		void *blob;
+
+		blob_offset = le32_to_cpu(lcme->lcme_offset);
+		blob_size = le32_to_cpu(lcme->lcme_size);
+		blob = (char *)lcm + blob_offset;
+
+		lsme = lsme_unpack_comp(lov, blob, blob_size,
+					(i == entry_count - 1) ? &maxbytes :
+					NULL);
+		if (IS_ERR(lsme)) {
+			rc = PTR_ERR(lsme);
+			goto out_lsm;
+		}
+
+		if (!(lsme->lsme_pattern & LOV_PATTERN_F_RELEASED))
+			lsm->lsm_is_released = false;
+
+		lsm->lsm_entries[i] = lsme;
+		lsme->lsme_id = le32_to_cpu(lcme->lcme_id);
+		lu_extent_le_to_cpu(&lsme->lsme_extent, &lcme->lcme_extent);
+
+		if (i == entry_count - 1) {
+			lsm->lsm_maxbytes = (loff_t)lsme->lsme_extent.e_start +
+					    maxbytes;
+			/* the last component hasn't been defined, or
+			 * lsm_maxbytes overflowed.
+			 */
+			if (lsme->lsme_extent.e_end != LUSTRE_EOF ||
+			    lsm->lsm_maxbytes <
+			    (loff_t)lsme->lsme_extent.e_start)
+				lsm->lsm_maxbytes = MAX_LFS_FILESIZE;
+		}
+	}
+
+	return lsm;
+
+out_lsm:
+	for (i = 0; i < entry_count; i++)
+		if (lsm->lsm_entries[i])
+			lsme_free(lsm->lsm_entries[i]);
+
+	kfree(lsm);
+
+	return ERR_PTR(rc);
+}
+
+const static struct lsm_operations lsm_comp_md_v1_ops = {
+	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
+	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
+	.lsm_unpackmd		= lsm_unpackmd_comp_md_v1,
+};
+
 const struct lsm_operations *lsm_op_find(int magic)
 {
+	const struct lsm_operations *lsm = NULL;
+
 	switch (magic) {
 	case LOV_MAGIC_V1:
-		return &lsm_v1_ops;
+		lsm = &lsm_v1_ops;
+		break;
 	case LOV_MAGIC_V3:
-		return &lsm_v3_ops;
+		lsm = &lsm_v3_ops;
+		break;
+	case LOV_MAGIC_COMP_V1:
+		lsm = &lsm_comp_md_v1_ops;
+		break;
 	default:
 		CERROR("unrecognized lsm_magic %08x\n", magic);
-		return NULL;
+		break;
 	}
+
+	return lsm;
 }
 
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 3700937..8b7a572 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -206,6 +206,9 @@ struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, void *buf,
 	const struct lsm_operations *op;
 	u32 magic;
 
+	if (buf_size < sizeof(magic))
+		return ERR_PTR(-EINVAL);
+
 	magic = le32_to_cpu(*(u32 *)buf);
 	op = lsm_op_find(magic);
 	if (!op)
-- 
1.8.3.1

* [lustre-devel] [PATCH 07/28] lustre: lov: embedded raid0 in struct lov_layout_composite
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (5 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 06/28] lustre: lov: add composite layout unpacking James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 08/28] lustre: lov: migrate lov raid0 to future PFL component handling James Simmons
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Create a bare-bones struct lov_layout_composite so that the
client layer can support composite layouts.

Plain layout will be stored in LOV layer as a composite layout
containing a single component.
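
The idea of treating a plain layout as a one-component composite can be
sketched in userspace as follows. All names here (sketch_entry, sketch_layout,
sketch_wrap_plain(), SKETCH_EOF) are hypothetical, and the assumption that the
single component covers the half-open extent [0, EOF) is made for this sketch
only; it is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EOF	UINT64_MAX	/* hypothetical "end of file" marker */
#define SKETCH_MAX_ENT	4		/* arbitrary cap for this sketch */

/* One component: a file extent [start, end) plus its own striping. */
struct sketch_entry {
	uint64_t start;
	uint64_t end;
	uint16_t stripe_count;
	uint32_t stripe_size;
};

struct sketch_layout {
	unsigned int entry_count;
	struct sketch_entry entries[SKETCH_MAX_ENT];
};

/* Represent a plain striped layout as a composite with one component. */
static void sketch_wrap_plain(struct sketch_layout *lo,
			      uint16_t stripe_count, uint32_t stripe_size)
{
	assert(lo);

	lo->entry_count = 1;
	lo->entries[0].start = 0;
	lo->entries[0].end = SKETCH_EOF;
	lo->entries[0].stripe_count = stripe_count;
	lo->entries[0].stripe_size = stripe_size;
}

/* Return the index of the component covering off, or -1 if none does. */
static int sketch_entry_for_offset(const struct sketch_layout *lo,
				   uint64_t off)
{
	unsigned int i;

	for (i = 0; i < lo->entry_count; i++) {
		if (off >= lo->entries[i].start && off < lo->entries[i].end)
			return (int)i;
	}

	return -1;
}
```

With this shape, code that looks up a component by file offset needs no
special case for plain files: the lookup always lands on entry 0.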

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/lov/lov_cl_internal.h    | 78 ++++++++++++----------
 drivers/staging/lustre/lustre/lov/lov_object.c     |  8 +--
 drivers/staging/lustre/lustre/lov/lovsub_object.c  |  8 +--
 3 files changed, 50 insertions(+), 44 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index e4f7621..c6ace49 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -129,6 +129,42 @@ static inline char *llt2str(enum lov_layout_type llt)
 	return "";
 }
 
+struct lov_layout_raid0 {
+	unsigned int		lo_nr;
+	/**
+	 * When this is true, lov_object::lo_attr contains
+	 * valid up to date attributes for a top-level
+	 * object. This field is reset to 0 when attributes of
+	 * any sub-object change.
+	 */
+	int			lo_attr_valid;
+	/**
+	 * Array of sub-objects. Allocated when top-object is
+	 * created (lov_init_raid0()).
+	 *
+	 * Top-object is a strict master of its sub-objects:
+	 * it is created before them, and outlives its
+	 * children (this later is necessary so that basic
+	 * functions like cl_object_top() always
+	 * work). Top-object keeps a reference on every
+	 * sub-object.
+	 *
+	 * When top-object is destroyed (lov_delete_raid0())
+	 * it releases its reference to a sub-object and waits
+	 * until the latter is finally destroyed.
+	 */
+	struct lovsub_object  **lo_sub;
+	/**
+	 * protect lo_sub
+	 */
+	spinlock_t		lo_sub_lock;
+	/**
+	 * Cached object attribute, built from sub-object
+	 * attributes.
+	 */
+	struct cl_attr		lo_attr;
+};
+
 /**
  * lov-specific file state.
  *
@@ -178,45 +214,15 @@ struct lov_object {
 	struct lov_stripe_md  *lo_lsm;
 
 	union lov_layout_state {
-		struct lov_layout_raid0 {
-			unsigned int	       lo_nr;
-			/**
-			 * When this is true, lov_object::lo_attr contains
-			 * valid up to date attributes for a top-level
-			 * object. This field is reset to 0 when attributes of
-			 * any sub-object change.
-			 */
-			int		       lo_attr_valid;
-			/**
-			 * Array of sub-objects. Allocated when top-object is
-			 * created (lov_init_raid0()).
-			 *
-			 * Top-object is a strict master of its sub-objects:
-			 * it is created before them, and outlives its
-			 * children (this later is necessary so that basic
-			 * functions like cl_object_top() always
-			 * work). Top-object keeps a reference on every
-			 * sub-object.
-			 *
-			 * When top-object is destroyed (lov_delete_raid0())
-			 * it releases its reference to a sub-object and waits
-			 * until the latter is finally destroyed.
-			 */
-			struct lovsub_object **lo_sub;
-			/**
-			 * protect lo_sub
-			 */
-			spinlock_t		lo_sub_lock;
-			/**
-			 * Cached object attribute, built from sub-object
-			 * attributes.
-			 */
-			struct cl_attr	 lo_attr;
-		} raid0;
 		struct lov_layout_state_empty {
 		} empty;
 		struct lov_layout_state_released {
 		} released;
+		struct lov_layout_composite {
+			struct lov_layout_entry {
+				struct lov_layout_raid0 lle_raid0;
+			} lo_entries;
+		} composite;
 	} u;
 	/**
 	 * Thread that acquired lov_object::lo_type_guard in an exclusive
@@ -627,7 +633,7 @@ static inline struct lov_layout_raid0 *lov_r0(struct lov_object *lov)
 	LASSERT(lov->lo_type == LLT_RAID0);
 	LASSERT(lov->lo_lsm->lsm_magic == LOV_MAGIC ||
 		lov->lo_lsm->lsm_magic == LOV_MAGIC_V3);
-	return &lov->u.raid0;
+	return &lov->u.composite.lo_entries.lle_raid0;
 }
 
 /* lov_pack.c */
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index ad2901a..15ed378 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -228,7 +228,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	struct lov_thread_info  *lti     = lov_env_info(env);
 	struct cl_object_conf   *subconf = &lti->lti_stripe_conf;
 	struct lu_fid	   *ofid    = &lti->lti_fid;
-	struct lov_layout_raid0 *r0      = &state->raid0;
+	struct lov_layout_raid0 *r0 = &state->composite.lo_entries.lle_raid0;
 
 	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
 		dump_lsm(D_ERROR, lsm);
@@ -375,7 +375,7 @@ static void lov_subobject_kill(const struct lu_env *env, struct lov_object *lov,
 	wait_queue_head_t *wq;
 	wait_queue_entry_t	  *waiter;
 
-	r0  = &lov->u.raid0;
+	r0  = &lov->u.composite.lo_entries.lle_raid0;
 	LASSERT(r0->lo_sub[idx] == los);
 
 	sub  = lovsub2cl(los);
@@ -418,7 +418,7 @@ static void lov_subobject_kill(const struct lu_env *env, struct lov_object *lov,
 static int lov_delete_raid0(const struct lu_env *env, struct lov_object *lov,
 			    union lov_layout_state *state)
 {
-	struct lov_layout_raid0 *r0 = &state->raid0;
+	struct lov_layout_raid0 *r0 = &state->composite.lo_entries.lle_raid0;
 	struct lov_stripe_md    *lsm = lov->lo_lsm;
 	int i;
 
@@ -451,7 +451,7 @@ static void lov_fini_empty(const struct lu_env *env, struct lov_object *lov,
 static void lov_fini_raid0(const struct lu_env *env, struct lov_object *lov,
 			   union lov_layout_state *state)
 {
-	struct lov_layout_raid0 *r0 = &state->raid0;
+	struct lov_layout_raid0 *r0 = &state->composite.lo_entries.lle_raid0;
 
 	if (r0->lo_sub) {
 		kvfree(r0->lo_sub);
diff --git a/drivers/staging/lustre/lustre/lov/lovsub_object.c b/drivers/staging/lustre/lustre/lov/lovsub_object.c
index 13d4520..7360c16 100644
--- a/drivers/staging/lustre/lustre/lov/lovsub_object.c
+++ b/drivers/staging/lustre/lustre/lov/lovsub_object.c
@@ -80,10 +80,10 @@ static void lovsub_object_free(const struct lu_env *env, struct lu_object *obj)
 	 */
 	if (lov) {
 		LASSERT(lov->lo_type == LLT_RAID0);
-		LASSERT(lov->u.raid0.lo_sub[los->lso_index] == los);
-		spin_lock(&lov->u.raid0.lo_sub_lock);
-		lov->u.raid0.lo_sub[los->lso_index] = NULL;
-		spin_unlock(&lov->u.raid0.lo_sub_lock);
+		LASSERT(lov->u.composite.lo_entries.lle_raid0.lo_sub[los->lso_index] == los);
+		spin_lock(&lov->u.composite.lo_entries.lle_raid0.lo_sub_lock);
+		lov->u.composite.lo_entries.lle_raid0.lo_sub[los->lso_index] = NULL;
+		spin_unlock(&lov->u.composite.lo_entries.lle_raid0.lo_sub_lock);
 	}
 
 	lu_object_fini(obj);
-- 
1.8.3.1

* [lustre-devel] [PATCH 08/28] lustre: lov: migrate lov raid0 to future PFL component handling
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (6 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 07/28] lustre: lov: embedded raid0 in struct lov_layout_composite James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 09/28] lustre: lov: reduce code indentation James Simmons
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

PFL changes striping from static to dynamic, so a single per-file
stripe count no longer describes the whole layout. Rename the fields
that represent a stripe index so that they represent a component index
instead. The raid0 stripe handling will be replaced with PFL component
handling, so make raid0 a subsystem of the PFL handling.
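
A component index has to identify both a layout entry and a stripe within
it. One way to do that is to pack the two numbers into a single integer, as
in the sketch below. The sketch_* helpers and the 16-bit split are
assumptions made for illustration; the patch refers to a lov_comp_index()
helper, whose actual encoding is not shown in this message.

```c
#include <assert.h>

#define SKETCH_STRIPE_BITS	16
#define SKETCH_STRIPE_MASK	((1U << SKETCH_STRIPE_BITS) - 1)

/* Pack a layout entry number and a stripe number into one index. */
static unsigned int sketch_comp_index(unsigned int entry, unsigned int stripe)
{
	assert(entry <= SKETCH_STRIPE_MASK);
	assert(stripe <= SKETCH_STRIPE_MASK);

	return (entry << SKETCH_STRIPE_BITS) | stripe;
}

/* Recover the layout entry number from a packed index. */
static unsigned int sketch_comp_entry(unsigned int index)
{
	return index >> SKETCH_STRIPE_BITS;
}

/* Recover the stripe number from a packed index. */
static unsigned int sketch_comp_stripe(unsigned int index)
{
	return index & SKETCH_STRIPE_MASK;
}
```

For a plain layout the entry number is always 0, so under this assumed
encoding the packed index equals the old stripe index.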

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |  29 +--
 drivers/staging/lustre/lustre/lov/lov_io.c         |  41 +++--
 drivers/staging/lustre/lustre/lov/lov_lock.c       |   8 +-
 drivers/staging/lustre/lustre/lov/lov_object.c     | 196 +++++++++++++--------
 drivers/staging/lustre/lustre/lov/lov_page.c       |  19 +-
 drivers/staging/lustre/lustre/lov/lovsub_object.c  |  13 +-
 6 files changed, 178 insertions(+), 128 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index c6ace49..c44c937 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -108,8 +108,8 @@ struct lov_device {
  */
 enum lov_layout_type {
 	LLT_EMPTY,	/** empty file without body (mknod + truncate) */
-	LLT_RAID0,	/** striped file */
 	LLT_RELEASED,	/** file with no objects (data in HSM) */
+	LLT_COMP,	/** support composite layout */
 	LLT_NR
 };
 
@@ -118,10 +118,10 @@ static inline char *llt2str(enum lov_layout_type llt)
 	switch (llt) {
 	case LLT_EMPTY:
 		return "EMPTY";
-	case LLT_RAID0:
-		return "RAID0";
 	case LLT_RELEASED:
 		return "RELEASED";
+	case LLT_COMP:
+		return "COMPOSITE";
 	case LLT_NR:
 		LBUG();
 	}
@@ -242,7 +242,7 @@ struct lov_lock_sub {
 	 */
 	unsigned int		sub_is_enqueued:1,
 				sub_initialized:1;
-	int		  sub_stripe;
+	int			sub_index;
 };
 
 /**
@@ -258,7 +258,8 @@ struct lov_lock {
 
 struct lov_page {
 	struct cl_page_slice	lps_cl;
-	unsigned int		lps_stripe; /* stripe index */
+	/** layout_entry + stripe index, composed using lov_comp_index() */
+	unsigned int		lps_index;
 };
 
 /*
@@ -309,7 +310,6 @@ struct lov_thread_info {
  * State that lov_io maintains for every sub-io.
  */
 struct lov_io_sub {
-	u16		 sub_stripe;
 	/**
 	 * environment's refcheck.
 	 *
@@ -331,6 +331,7 @@ struct lov_io_sub {
 	 * sub-io's active for the current IO iteration.
 	 */
 	struct list_head	 sub_linkage;
+	u16			sub_subio_index;
 	/**
 	 * sub-io for a stripe. Ideally sub-io's can be stopped and resumed
 	 * independently, with lov acting as a scheduler to maximize overall
@@ -425,12 +426,12 @@ int lov_io_init(const struct lu_env *env, struct cl_object *obj,
 int lovsub_lock_init(const struct lu_env *env, struct cl_object *obj,
 		     struct cl_lock *lock, const struct cl_io *io);
 
-int lov_lock_init_raid0(const struct lu_env *env, struct cl_object *obj,
-			struct cl_lock *lock, const struct cl_io *io);
+int lov_lock_init_composite(const struct lu_env *env, struct cl_object *obj,
+			    struct cl_lock *lock, const struct cl_io *io);
 int lov_lock_init_empty(const struct lu_env *env, struct cl_object *obj,
 			struct cl_lock *lock, const struct cl_io *io);
-int lov_io_init_raid0(const struct lu_env *env, struct cl_object *obj,
-		      struct cl_io *io);
+int lov_io_init_composite(const struct lu_env *env, struct cl_object *obj,
+			  struct cl_io *io);
 int lov_io_init_empty(const struct lu_env *env, struct cl_object *obj,
 		      struct cl_io *io);
 int lov_io_init_released(const struct lu_env *env, struct cl_object *obj,
@@ -445,8 +446,8 @@ int lovsub_page_init(const struct lu_env *env, struct cl_object *ob,
 		     struct cl_page *page, pgoff_t index);
 int lov_page_init_empty(const struct lu_env *env, struct cl_object *obj,
 			struct cl_page *page, pgoff_t index);
-int lov_page_init_raid0(const struct lu_env *env, struct cl_object *obj,
-			struct cl_page *page, pgoff_t index);
+int lov_page_init_composite(const struct lu_env *env, struct cl_object *obj,
+			    struct cl_page *page, pgoff_t index);
 struct lu_object *lov_object_alloc(const struct lu_env *env,
 				   const struct lu_object_header *hdr,
 				   struct lu_device *dev);
@@ -455,7 +456,6 @@ struct lu_object *lovsub_object_alloc(const struct lu_env *env,
 				      struct lu_device *dev);
 
 struct lov_stripe_md *lov_lsm_addref(struct lov_object *lov);
-int lov_page_stripe(const struct cl_page *page);
 
 #define lov_foreach_target(lov, var)		    \
 	for (var = 0; var < lov_targets_nr(lov); ++var)
@@ -630,9 +630,10 @@ static inline struct lov_thread_info *lov_env_info(const struct lu_env *env)
 
 static inline struct lov_layout_raid0 *lov_r0(struct lov_object *lov)
 {
-	LASSERT(lov->lo_type == LLT_RAID0);
+	LASSERT(lov->lo_type == LLT_COMP);
 	LASSERT(lov->lo_lsm->lsm_magic == LOV_MAGIC ||
 		lov->lo_lsm->lsm_magic == LOV_MAGIC_V3);
+
 	return &lov->u.composite.lo_entries.lle_raid0;
 }
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 2d62566..023b588 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -53,7 +53,7 @@ static void lov_io_sub_fini(const struct lu_env *env, struct lov_io *lio,
 			sub->sub_io_initialized = 0;
 			lio->lis_active_subios--;
 		}
-		if (sub->sub_stripe == lio->lis_single_subio_index)
+		if (sub->sub_subio_index == lio->lis_single_subio_index)
 			lio->lis_single_subio_index = -1;
 		else if (!sub->sub_borrowed)
 			kfree(sub->sub_io);
@@ -143,12 +143,12 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 	struct cl_io      *sub_io;
 	struct cl_object  *sub_obj;
 	struct cl_io      *io  = lio->lis_cl.cis_io;
-	int stripe = sub->sub_stripe;
+	int stripe = sub->sub_subio_index;
 	int rc;
 
 	LASSERT(!sub->sub_io);
 	LASSERT(!sub->sub_env);
-	LASSERT(sub->sub_stripe < lio->lis_stripe_count);
+	LASSERT(sub->sub_subio_index < lio->lis_stripe_count);
 
 	if (unlikely(!lov_r0(lov)->lo_sub[stripe]))
 		return -EIO;
@@ -203,15 +203,15 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 }
 
 struct lov_io_sub *lov_sub_get(const struct lu_env *env,
-			       struct lov_io *lio, int stripe)
+			       struct lov_io *lio, int index)
 {
 	int rc;
-	struct lov_io_sub *sub = &lio->lis_subs[stripe];
+	struct lov_io_sub *sub = &lio->lis_subs[index];
 
-	LASSERT(stripe < lio->lis_stripe_count);
+	LASSERT(index < lio->lis_stripe_count);
 
 	if (!sub->sub_io_initialized) {
-		sub->sub_stripe = stripe;
+		sub->sub_subio_index = index;
 		rc = lov_io_sub_init(env, lio, sub);
 	} else {
 		rc = 0;
@@ -228,14 +228,14 @@ struct lov_io_sub *lov_sub_get(const struct lu_env *env,
  *
  */
 
-int lov_page_stripe(const struct cl_page *page)
+static int lov_page_index(const struct cl_page *page)
 {
 	const struct cl_page_slice *slice;
 
 	slice = cl_page_at(page, &lov_device_type);
 	LASSERT(slice->cpl_obj);
 
-	return cl2lov_page(slice)->lps_stripe;
+	return cl2lov_page(slice)->lps_index;
 }
 
 static int lov_io_subio_init(const struct lu_env *env, struct lov_io *lio,
@@ -630,8 +630,7 @@ static int lov_io_submit(const struct lu_env *env,
 	struct lov_io_sub *sub;
 	struct cl_page_list *plist = &lov_env_info(env)->lti_plist;
 	struct cl_page *page;
-	int stripe;
-
+	int index;
 	int rc = 0;
 
 	if (lio->lis_active_subios == 1) {
@@ -657,16 +656,16 @@ static int lov_io_submit(const struct lu_env *env,
 		page = cl_page_list_first(qin);
 		cl_page_list_move(&cl2q->c2_qin, qin, page);
 
-		stripe = lov_page_stripe(page);
+		index = lov_page_index(page);
 		while (qin->pl_nr > 0) {
 			page = cl_page_list_first(qin);
-			if (stripe != lov_page_stripe(page))
+			if (index != lov_page_index(page))
 				break;
 
 			cl_page_list_move(&cl2q->c2_qin, qin, page);
 		}
 
-		sub = lov_sub_get(env, lio, stripe);
+		sub = lov_sub_get(env, lio, index);
 		if (!IS_ERR(sub)) {
 			rc = cl_io_submit_rw(sub->sub_env, sub->sub_io,
 					     crt, cl2q);
@@ -716,16 +715,16 @@ static int lov_io_commit_async(const struct lu_env *env,
 	cl_page_list_init(plist);
 	while (queue->pl_nr > 0) {
 		int stripe_to = to;
-		int stripe;
+		int index;
 
 		LASSERT(plist->pl_nr == 0);
 		page = cl_page_list_first(queue);
 		cl_page_list_move(plist, queue, page);
 
-		stripe = lov_page_stripe(page);
+		index = lov_page_index(page);
 		while (queue->pl_nr > 0) {
 			page = cl_page_list_first(queue);
-			if (stripe != lov_page_stripe(page))
+			if (index != lov_page_index(page))
 				break;
 
 			cl_page_list_move(plist, queue, page);
@@ -734,7 +733,7 @@ static int lov_io_commit_async(const struct lu_env *env,
 		if (queue->pl_nr > 0) /* still has more pages */
 			stripe_to = PAGE_SIZE;
 
-		sub = lov_sub_get(env, lio, stripe);
+		sub = lov_sub_get(env, lio, index);
 		if (!IS_ERR(sub)) {
 			rc = cl_io_commit_async(sub->sub_env, sub->sub_io,
 						plist, from, stripe_to, cb);
@@ -769,7 +768,7 @@ static int lov_io_fault_start(const struct lu_env *env,
 
 	fio = &ios->cis_io->u.ci_fault;
 	lio = cl2lov_io(env, ios);
-	sub = lov_sub_get(env, lio, lov_page_stripe(fio->ft_page));
+	sub = lov_sub_get(env, lio, lov_page_index(fio->ft_page));
 	if (IS_ERR(sub))
 		return PTR_ERR(sub);
 	sub->sub_io->u.ci_fault.ft_nob = fio->ft_nob;
@@ -941,8 +940,8 @@ static void lov_empty_impossible(const struct lu_env *env,
 	.cio_commit_async              = LOV_EMPTY_IMPOSSIBLE
 };
 
-int lov_io_init_raid0(const struct lu_env *env, struct cl_object *obj,
-		      struct cl_io *io)
+int lov_io_init_composite(const struct lu_env *env, struct cl_object *obj,
+			  struct cl_io *io)
 {
 	struct lov_io       *lio = lov_env_io(env);
 	struct lov_object   *lov = cl2lov(obj);
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index b029210..4340063 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -73,7 +73,7 @@ static struct lov_sublock_env *lov_sublock_env_get(const struct lu_env *env,
 		subenv->lse_env = env;
 		subenv->lse_io  = io;
 	} else {
-		sub = lov_sub_get(env, lio, lls->sub_stripe);
+		sub = lov_sub_get(env, lio, lls->sub_index);
 		if (!IS_ERR(sub)) {
 			subenv->lse_env = sub->sub_env;
 			subenv->lse_io  = sub->sub_io;
@@ -167,7 +167,7 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 			descr->cld_mode  = lock->cll_descr.cld_mode;
 			descr->cld_gid   = lock->cll_descr.cld_gid;
 			descr->cld_enq_flags = lock->cll_descr.cld_enq_flags;
-			lls->sub_stripe = i;
+			lls->sub_index = i;
 
 			/* initialize sub lock */
 			result = lov_sublock_init(env, lock, lls);
@@ -295,8 +295,8 @@ static int lov_lock_print(const struct lu_env *env, void *cookie,
 	.clo_print     = lov_lock_print
 };
 
-int lov_lock_init_raid0(const struct lu_env *env, struct cl_object *obj,
-			struct cl_lock *lock, const struct cl_io *io)
+int lov_lock_init_composite(const struct lu_env *env, struct cl_object *obj,
+			    struct cl_lock *lock, const struct cl_io *io)
 {
 	struct lov_lock *lck;
 	int result = 0;
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 15ed378..f5c6da1 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -110,9 +110,9 @@ static int lov_init_empty(const struct lu_env *env, struct lov_device *dev,
 	return 0;
 }
 
-static void lov_install_raid0(const struct lu_env *env,
-			      struct lov_object *lov,
-			      union lov_layout_state *state)
+static void lov_install_composite(const struct lu_env *env,
+				  struct lov_object *lov,
+				  union lov_layout_state *state)
 {
 }
 
@@ -129,7 +129,7 @@ static struct cl_object *lov_sub_find(const struct lu_env *env,
 }
 
 static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
-			struct cl_object *stripe, struct lov_layout_raid0 *r0,
+			struct cl_object *subobj, struct lov_layout_raid0 *r0,
 			int idx)
 {
 	struct cl_object_header *hdr;
@@ -145,13 +145,13 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 		 * lov_oinfo of lsm_stripe_data which will be freed due to
 		 * this failure.
 		 */
-		cl_object_kill(env, stripe);
-		cl_object_put(env, stripe);
+		cl_object_kill(env, subobj);
+		cl_object_put(env, subobj);
 		return -EIO;
 	}
 
 	hdr    = cl_object_header(lov2cl(lov));
-	subhdr = cl_object_header(stripe);
+	subhdr = cl_object_header(subobj);
 
 	oinfo = lov->lo_lsm->lsm_entries[0]->lsme_oinfo[idx];
 	CDEBUG(D_INODE, DFID "@%p[%d] -> " DFID "@%p: ostid: " DOSTID " idx: %d gen: %d\n",
@@ -166,8 +166,8 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 		subhdr->coh_parent = hdr;
 		spin_unlock(&subhdr->coh_attr_guard);
 		subhdr->coh_nesting = hdr->coh_nesting + 1;
-		lu_object_ref_add(&stripe->co_lu, "lov-parent", lov);
-		r0->lo_sub[idx] = cl2lovsub(stripe);
+		lu_object_ref_add(&subobj->co_lu, "lov-parent", lov);
+		r0->lo_sub[idx] = cl2lovsub(subobj);
 		r0->lo_sub[idx]->lso_super = lov;
 		r0->lo_sub[idx]->lso_index = idx;
 		result = 0;
@@ -184,18 +184,18 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 			/* the object's layout has already changed but isn't
 			 * refreshed
 			 */
-			lu_object_unhash(env, &stripe->co_lu);
+			lu_object_unhash(env, &subobj->co_lu);
 			result = -EAGAIN;
 		} else {
 			mask = D_ERROR;
 			result = -EIO;
 		}
 
-		LU_OBJECT_DEBUG(mask, env, &stripe->co_lu,
+		LU_OBJECT_DEBUG(mask, env, &subobj->co_lu,
 				"stripe %d is already owned.", idx);
 		LU_OBJECT_DEBUG(mask, env, old_obj, "owned.");
 		LU_OBJECT_HEADER(mask, env, lov2lu(lov), "try to own.\n");
-		cl_object_put(env, stripe);
+		cl_object_put(env, subobj);
 	}
 	return result;
 }
@@ -219,7 +219,7 @@ static int lov_page_slice_fixup(struct lov_object *lov,
 static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 			  struct lov_object *lov, struct lov_stripe_md *lsm,
 			  const struct cl_object_conf *conf,
-			  union  lov_layout_state *state)
+			  struct lov_layout_raid0 *r0)
 {
 	int result;
 	int i;
@@ -228,7 +228,6 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	struct lov_thread_info  *lti     = lov_env_info(env);
 	struct cl_object_conf   *subconf = &lti->lti_stripe_conf;
 	struct lu_fid	   *ofid    = &lti->lti_fid;
-	struct lov_layout_raid0 *r0 = &state->composite.lo_entries.lle_raid0;
 
 	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
 		dump_lsm(D_ERROR, lsm);
@@ -310,6 +309,17 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	return result;
 }
 
+static int lov_init_composite(const struct lu_env *env, struct lov_device *dev,
+			      struct lov_object *lov, struct lov_stripe_md *lsm,
+			      const struct cl_object_conf *conf,
+			      union lov_layout_state *state)
+{
+	struct lov_layout_composite *comp = &state->composite;
+	struct lov_layout_entry *le = &comp->lo_entries;
+
+	return lov_init_raid0(env, dev, lov, lsm, conf, &le->lle_raid0);
+}
+
 static int lov_init_released(const struct lu_env *env, struct lov_device *dev,
 			     struct lov_object *lov, struct lov_stripe_md *lsm,
 			     const struct cl_object_conf *conf,
@@ -337,7 +347,7 @@ static struct cl_object *lov_find_subobj(const struct lu_env *env,
 	int ost_idx;
 	int rc;
 
-	if (lov->lo_type != LLT_RAID0) {
+	if (lov->lo_type != LLT_COMP) {
 		result = NULL;
 		goto out;
 	}
@@ -367,15 +377,14 @@ static int lov_delete_empty(const struct lu_env *env, struct lov_object *lov,
 }
 
 static void lov_subobject_kill(const struct lu_env *env, struct lov_object *lov,
+			       struct lov_layout_raid0 *r0,
 			       struct lovsub_object *los, int idx)
 {
 	struct cl_object	*sub;
-	struct lov_layout_raid0 *r0;
 	struct lu_site	  *site;
 	wait_queue_head_t *wq;
 	wait_queue_entry_t	  *waiter;
 
-	r0  = &lov->u.composite.lo_entries.lle_raid0;
 	LASSERT(r0->lo_sub[idx] == los);
 
 	sub  = lovsub2cl(los);
@@ -415,17 +424,12 @@ static void lov_subobject_kill(const struct lu_env *env, struct lov_object *lov,
 	LASSERT(!r0->lo_sub[idx]);
 }
 
-static int lov_delete_raid0(const struct lu_env *env, struct lov_object *lov,
-			    union lov_layout_state *state)
+static void lov_delete_raid0(const struct lu_env *env, struct lov_object *lov,
+			     struct lov_layout_raid0 *r0)
 {
-	struct lov_layout_raid0 *r0 = &state->composite.lo_entries.lle_raid0;
-	struct lov_stripe_md    *lsm = lov->lo_lsm;
-	int i;
-
-	dump_lsm(D_INODE, lsm);
-
-	lov_layout_wait(env, lov);
 	if (r0->lo_sub) {
+		int i;
+
 		for (i = 0; i < r0->lo_nr; ++i) {
 			struct lovsub_object *los = r0->lo_sub[i];
 
@@ -435,10 +439,24 @@ static int lov_delete_raid0(const struct lu_env *env, struct lov_object *lov,
 				 * If top-level object is to be evicted from
 				 * the cache, so are its sub-objects.
 				 */
-				lov_subobject_kill(env, lov, los, i);
+				lov_subobject_kill(env, lov, r0, los, i);
 			}
 		}
 	}
+}
+
+static int lov_delete_composite(const struct lu_env *env,
+				struct lov_object *lov,
+				union lov_layout_state *state)
+{
+	struct lov_layout_composite *comp = &state->composite;
+	struct lov_layout_entry *entry = &comp->lo_entries;
+
+	dump_lsm(D_INODE, lov->lo_lsm);
+
+	lov_layout_wait(env, lov);
+	lov_delete_raid0(env, lov, &entry->lle_raid0);
+
 	return 0;
 }
 
@@ -448,15 +466,23 @@ static void lov_fini_empty(const struct lu_env *env, struct lov_object *lov,
 	LASSERT(lov->lo_type == LLT_EMPTY || lov->lo_type == LLT_RELEASED);
 }
 
-static void lov_fini_raid0(const struct lu_env *env, struct lov_object *lov,
-			   union lov_layout_state *state)
+static void lov_fini_raid0(const struct lu_env *env,
+			   struct lov_layout_raid0 *r0)
 {
-	struct lov_layout_raid0 *r0 = &state->composite.lo_entries.lle_raid0;
-
 	if (r0->lo_sub) {
 		kvfree(r0->lo_sub);
 		r0->lo_sub = NULL;
 	}
+}
+
+static void lov_fini_composite(const struct lu_env *env,
+			       struct lov_object *lov,
+			       union lov_layout_state *state)
+{
+	struct lov_layout_composite *comp = &state->composite;
+	struct lov_layout_entry *entry = &comp->lo_entries;
+
+	lov_fini_raid0(env, &entry->lle_raid0);
 
 	dump_lsm(D_INODE, lov->lo_lsm);
 	lov_free_memmd(&lov->lo_lsm);
@@ -477,17 +503,10 @@ static int lov_print_empty(const struct lu_env *env, void *cookie,
 }
 
 static int lov_print_raid0(const struct lu_env *env, void *cookie,
-			   lu_printer_t p, const struct lu_object *o)
+			   lu_printer_t p, struct lov_layout_raid0 *r0)
 {
-	struct lov_object	*lov = lu2lov(o);
-	struct lov_layout_raid0	*r0  = lov_r0(lov);
-	struct lov_stripe_md	*lsm = lov->lo_lsm;
-	int			 i;
+	int i;
 
-	(*p)(env, cookie, "stripes: %d, %s, lsm{%p 0x%08X %d %u %u}:\n",
-	     r0->lo_nr, lov->lo_layout_invalid ? "invalid" : "valid", lsm,
-	     lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
-	     lsm->lsm_entries[0]->lsme_stripe_count, lsm->lsm_layout_gen);
 	for (i = 0; i < r0->lo_nr; ++i) {
 		struct lu_object *sub;
 
@@ -501,6 +520,23 @@ static int lov_print_raid0(const struct lu_env *env, void *cookie,
 	return 0;
 }
 
+static int lov_print_composite(const struct lu_env *env, void *cookie,
+			       lu_printer_t p, const struct lu_object *o)
+{
+	struct lov_object *lov = lu2lov(o);
+	struct lov_layout_raid0	*r0 = lov_r0(lov);
+	struct lov_stripe_md *lsm = lov->lo_lsm;
+
+	(*p)(env, cookie, "stripes: %d, %s, lsm{%p 0x%08X %d %u %u}:\n",
+	     r0->lo_nr, lov->lo_layout_invalid ? "invalid" : "valid", lsm,
+	     lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
+	     lsm->lsm_entries[0]->lsme_stripe_count, lsm->lsm_layout_gen);
+
+	lov_print_raid0(env, cookie, p, r0);
+
+	return 0;
+}
+
 static int lov_print_released(const struct lu_env *env, void *cookie,
 			      lu_printer_t p, const struct lu_object *o)
 {
@@ -525,17 +561,13 @@ static int lov_print_released(const struct lu_env *env, void *cookie,
 static int lov_attr_get_empty(const struct lu_env *env, struct cl_object *obj,
 			      struct cl_attr *attr)
 {
-	attr->cat_blocks = 0;
 	return 0;
 }
 
-static int lov_attr_get_raid0(const struct lu_env *env, struct cl_object *obj,
-			      struct cl_attr *attr)
+static int lov_attr_get_raid0(const struct lu_env *env, struct lov_object *lov,
+			      struct cl_attr *attr, struct lov_layout_raid0 *r0)
 {
-	struct lov_object	*lov = cl2lov(obj);
-	struct lov_layout_raid0 *r0 = lov_r0(lov);
-	struct cl_attr		*lov_attr = &r0->lo_attr;
-	int			 result = 0;
+	int result = 0;
 
 	/* this is called w/o holding type guard mutex, so it must be inside
 	 * an on going IO otherwise lsm may be replaced.
@@ -577,22 +609,38 @@ static int lov_attr_get_raid0(const struct lu_env *env, struct cl_object *obj,
 		result = lov_merge_lvb_kms(lsm, lvb, &kms);
 		lov_stripe_unlock(lsm);
 		if (result == 0) {
-			cl_lvb2attr(lov_attr, lvb);
-			lov_attr->cat_kms = kms;
+			cl_lvb2attr(attr, lvb);
+			attr->cat_kms = kms;
 			r0->lo_attr_valid = 1;
 		}
 	}
-	if (result == 0) { /* merge results */
-		attr->cat_blocks = lov_attr->cat_blocks;
-		attr->cat_size = lov_attr->cat_size;
-		attr->cat_kms = lov_attr->cat_kms;
-		if (attr->cat_atime < lov_attr->cat_atime)
-			attr->cat_atime = lov_attr->cat_atime;
-		if (attr->cat_ctime < lov_attr->cat_ctime)
-			attr->cat_ctime = lov_attr->cat_ctime;
-		if (attr->cat_mtime < lov_attr->cat_mtime)
-			attr->cat_mtime = lov_attr->cat_mtime;
-	}
+
+	return result;
+}
+
+static int lov_attr_get_composite(const struct lu_env *env,
+				  struct cl_object *obj,
+				  struct cl_attr *attr)
+{
+	struct lov_object *lov = cl2lov(obj);
+	struct lov_layout_raid0 *r0 = lov_r0(lov);
+	struct cl_attr *lov_attr = &r0->lo_attr;
+	int result;
+
+	result = lov_attr_get_raid0(env, lov, attr, r0);
+	if (result)
+		return result;
+
+	attr->cat_blocks = lov_attr->cat_blocks;
+	attr->cat_size = lov_attr->cat_size;
+	attr->cat_kms = lov_attr->cat_kms;
+	if (attr->cat_atime < lov_attr->cat_atime)
+		attr->cat_atime = lov_attr->cat_atime;
+	if (attr->cat_ctime < lov_attr->cat_ctime)
+		attr->cat_ctime = lov_attr->cat_ctime;
+	if (attr->cat_mtime < lov_attr->cat_mtime)
+		attr->cat_mtime = lov_attr->cat_mtime;
+
 	return result;
 }
 
@@ -608,17 +656,6 @@ static int lov_attr_get_raid0(const struct lu_env *env, struct cl_object *obj,
 		.llo_io_init   = lov_io_init_empty,
 		.llo_getattr   = lov_attr_get_empty
 	},
-	[LLT_RAID0] = {
-		.llo_init      = lov_init_raid0,
-		.llo_delete    = lov_delete_raid0,
-		.llo_fini      = lov_fini_raid0,
-		.llo_install   = lov_install_raid0,
-		.llo_print     = lov_print_raid0,
-		.llo_page_init = lov_page_init_raid0,
-		.llo_lock_init = lov_lock_init_raid0,
-		.llo_io_init   = lov_io_init_raid0,
-		.llo_getattr   = lov_attr_get_raid0
-	},
 	[LLT_RELEASED] = {
 		.llo_init      = lov_init_released,
 		.llo_delete    = lov_delete_empty,
@@ -629,7 +666,18 @@ static int lov_attr_get_raid0(const struct lu_env *env, struct cl_object *obj,
 		.llo_lock_init = lov_lock_init_empty,
 		.llo_io_init   = lov_io_init_released,
 		.llo_getattr   = lov_attr_get_empty
-	}
+	},
+	[LLT_COMP] = {
+		.llo_init	= lov_init_composite,
+		.llo_delete	= lov_delete_composite,
+		.llo_fini	= lov_fini_composite,
+		.llo_install	= lov_install_composite,
+		.llo_print	= lov_print_composite,
+		.llo_page_init	= lov_page_init_composite,
+		.llo_lock_init	= lov_lock_init_composite,
+		.llo_io_init	= lov_io_init_composite,
+		.llo_getattr	= lov_attr_get_composite,
+	},
 };
 
 /**
@@ -659,7 +707,7 @@ static enum lov_layout_type lov_type(struct lov_stripe_md *lsm)
 	if (lsm->lsm_is_released)
 		return LLT_RELEASED;
 
-	return LLT_RAID0;
+	return LLT_COMP;
 }
 
 static inline void lov_conf_freeze(struct lov_object *lov)
@@ -1610,7 +1658,7 @@ int lov_read_and_clear_async_rc(struct cl_object *clob)
 
 		lov_conf_freeze(lov);
 		switch (lov->lo_type) {
-		case LLT_RAID0: {
+		case LLT_COMP: {
 			struct lov_stripe_md *lsm;
 			int i;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index f1c99a2..d94d003 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -50,22 +50,21 @@
  * Lov page operations.
  *
  */
-
-static int lov_raid0_page_print(const struct lu_env *env,
-				const struct cl_page_slice *slice,
-				void *cookie, lu_printer_t printer)
+static int lov_comp_page_print(const struct lu_env *env,
+			       const struct cl_page_slice *slice,
+			       void *cookie, lu_printer_t printer)
 {
 	struct lov_page *lp = cl2lov_page(slice);
 
 	return (*printer)(env, cookie, LUSTRE_LOV_NAME "-page@%p, raid0\n", lp);
 }
 
-static const struct cl_page_operations lov_raid0_page_ops = {
-	.cpo_print  = lov_raid0_page_print
+static const struct cl_page_operations lov_comp_page_ops = {
+	.cpo_print  = lov_comp_page_print
 };
 
-int lov_page_init_raid0(const struct lu_env *env, struct cl_object *obj,
-			struct cl_page *page, pgoff_t index)
+int lov_page_init_composite(const struct lu_env *env, struct cl_object *obj,
+			    struct cl_page *page, pgoff_t index)
 {
 	struct lov_object *loo = cl2lov(obj);
 	struct lov_layout_raid0 *r0 = lov_r0(loo);
@@ -85,8 +84,8 @@ int lov_page_init_raid0(const struct lu_env *env, struct cl_object *obj,
 	rc = lov_stripe_offset(loo->lo_lsm, offset, stripe, &suboff);
 	LASSERT(rc == 0);
 
-	lpg->lps_stripe = stripe;
-	cl_page_slice_add(page, &lpg->lps_cl, obj, index, &lov_raid0_page_ops);
+	lpg->lps_index = stripe;
+	cl_page_slice_add(page, &lpg->lps_cl, obj, index, &lov_comp_page_ops);
 
 	sub = lov_sub_get(env, lio, stripe);
 	if (IS_ERR(sub))
diff --git a/drivers/staging/lustre/lustre/lov/lovsub_object.c b/drivers/staging/lustre/lustre/lov/lovsub_object.c
index 7360c16..d3e9537 100644
--- a/drivers/staging/lustre/lustre/lov/lovsub_object.c
+++ b/drivers/staging/lustre/lustre/lov/lovsub_object.c
@@ -79,11 +79,14 @@ static void lovsub_object_free(const struct lu_env *env, struct lu_object *obj)
 	 * object handling in lu_object_find.
 	 */
 	if (lov) {
-		LASSERT(lov->lo_type == LLT_RAID0);
-		LASSERT(lov->u.composite.lo_entries.lle_raid0.lo_sub[los->lso_index] == los);
-		spin_lock(&lov->u.composite.lo_entries.lle_raid0.lo_sub_lock);
-		lov->u.composite.lo_entries.lle_raid0.lo_sub[los->lso_index] = NULL;
-		spin_unlock(&lov->u.composite.lo_entries.lle_raid0.lo_sub_lock);
+		int stripe = los->lso_index;
+		struct lov_layout_raid0 *r0 = lov_r0(lov);
+
+		LASSERT(lov->lo_type == LLT_COMP);
+		LASSERT(r0->lo_sub[stripe] == los);
+		spin_lock(&r0->lo_sub_lock);
+		r0->lo_sub[stripe] = NULL;
+		spin_unlock(&r0->lo_sub_lock);
 	}
 
 	lu_object_fini(obj);
-- 
1.8.3.1


* [lustre-devel] [PATCH 09/28] lustre: lov: reduce code indentation
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (7 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 08/28] lustre: lov: migrate lov raid0 to future PFL component handling James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 10/28] lustre: lov: change lo_entries to array James Simmons
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

In lov_init_raid0(), check whether the allocation of lo_sub failed
and return an error right away. This lets us reduce the code
indentation. Do the same in lov_attr_get_raid0() by returning early
when r0->lo_attr_valid is already set.
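
As a rough illustration of the pattern (not the Lustre code itself),
the two shapes can be compared side by side. The demo_* names are
hypothetical, and calloc() stands in for the kernel allocator.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct lov_layout_raid0. */
struct demo_raid0 {
	int nr;		/* number of sub-objects */
	void **sub;	/* sub-object array */
};

/* Nested form: the real work sits inside the success branch. */
static int demo_init_nested(struct demo_raid0 *r0, int nr)
{
	int result;

	r0->nr = nr;
	r0->sub = calloc(nr, sizeof(r0->sub[0]));
	if (r0->sub) {
		result = 0;
		/* per-stripe setup would go here, one level deeper */
	} else {
		result = -ENOMEM;
	}
	return result;
}

/* Early-return form: handle the failure first, then continue flat. */
static int demo_init_flat(struct demo_raid0 *r0, int nr)
{
	r0->nr = nr;
	r0->sub = calloc(nr, sizeof(r0->sub[0]));
	if (!r0->sub)
		return -ENOMEM;

	/* per-stripe setup would go here, at function level */
	return 0;
}
```

Both functions are meant to behave identically. Only the nesting
depth of the main path differs.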

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_io.c     |  11 +-
 drivers/staging/lustre/lustre/lov/lov_object.c | 186 ++++++++++++-------------
 2 files changed, 96 insertions(+), 101 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 023b588..6dd5639 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -948,12 +948,13 @@ int lov_io_init_composite(const struct lu_env *env, struct cl_object *obj,
 
 	INIT_LIST_HEAD(&lio->lis_active);
 	io->ci_result = lov_io_slice_init(lio, lov, io);
+	if (io->ci_result)
+		return io->ci_result;
+
+	io->ci_result = lov_io_subio_init(env, lio, io);
 	if (io->ci_result == 0) {
-		io->ci_result = lov_io_subio_init(env, lio, io);
-		if (io->ci_result == 0) {
-			cl_io_slice_add(io, &lio->lis_cl, obj, &lov_io_ops);
-			atomic_inc(&lov->lo_active_ios);
-		}
+		cl_io_slice_add(io, &lio->lis_cl, obj, &lov_io_ops);
+		atomic_inc(&lov->lo_active_ios);
 	}
 	return io->ci_result;
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index f5c6da1..1ebaa23 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -228,6 +228,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	struct lov_thread_info  *lti     = lov_env_info(env);
 	struct cl_object_conf   *subconf = &lti->lti_stripe_conf;
 	struct lu_fid	   *ofid    = &lti->lti_fid;
+	int psz;
 
 	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
 		dump_lsm(D_ERROR, lsm);
@@ -238,73 +239,76 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	LASSERT(!lov->lo_lsm);
 	lov->lo_lsm = lsm_addref(lsm);
 	lov->lo_layout_invalid = true;
+
+	spin_lock_init(&r0->lo_sub_lock);
 	r0->lo_nr  = lsm->lsm_entries[0]->lsme_stripe_count;
 	LASSERT(r0->lo_nr <= lov_targets_nr(dev));
 
 	r0->lo_sub = kvzalloc(r0->lo_nr * sizeof(r0->lo_sub[0]),
 				     GFP_NOFS);
-	if (r0->lo_sub) {
-		int psz = 0;
+	if (!r0->lo_sub)
+		return -ENOMEM;
 
-		result = 0;
-		subconf->coc_inode = conf->coc_inode;
-		spin_lock_init(&r0->lo_sub_lock);
-		/*
-		 * Create stripe cl_objects.
+	psz = 0;
+	result = 0;
+	subconf->coc_inode = conf->coc_inode;
+	/*
+	 * Create stripe cl_objects.
+	 */
+	for (i = 0; i < r0->lo_nr; ++i) {
+		struct cl_device *subdev;
+		struct lov_oinfo *oinfo;
+		int ost_idx;
+
+		oinfo = lsm->lsm_entries[0]->lsme_oinfo[i];
+		if (lov_oinfo_is_dummy(oinfo))
+			continue;
+
+		result = ostid_to_fid(ofid, &oinfo->loi_oi,
+				      oinfo->loi_ost_idx);
+		if (result != 0)
+			goto out;
+
+		ost_idx = oinfo->loi_ost_idx;
+		if (!dev->ld_target[ost_idx]) {
+			CERROR("%s: OST %04x is not initialized\n",
+			       lov2obd(dev->ld_lov)->obd_name, ost_idx);
+			result = -EIO;
+			goto out;
+		}
+
+		subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
+		subconf->u.coc_oinfo = oinfo;
+		LASSERTF(subdev, "not init ost %d\n", ost_idx);
+		/* In the function below, .hs_keycmp resolves to
+		 * lu_obj_hop_keycmp()
 		 */
-		for (i = 0; i < r0->lo_nr && result == 0; ++i) {
-			struct cl_device *subdev;
-			struct lov_oinfo *oinfo;
-			int ost_idx;
-
-			oinfo = lsm->lsm_entries[0]->lsme_oinfo[i];
-			if (lov_oinfo_is_dummy(oinfo))
-				continue;
-
-			result = ostid_to_fid(ofid, &oinfo->loi_oi,
-					      oinfo->loi_ost_idx);
-			if (result != 0)
-				goto out;
-
-			ost_idx = oinfo->loi_ost_idx;
-			if (!dev->ld_target[ost_idx]) {
-				CERROR("%s: OST %04x is not initialized\n",
-				lov2obd(dev->ld_lov)->obd_name, ost_idx);
-				result = -EIO;
-				goto out;
-			}
+		/* coverity[overrun-buffer-val] */
+		stripe = lov_sub_find(env, subdev, ofid, subconf);
+		if (IS_ERR(stripe)) {
+			result = PTR_ERR(stripe);
+			goto out;
+		}
 
-			subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
-			subconf->u.coc_oinfo = oinfo;
-			LASSERTF(subdev, "not init ost %d\n", ost_idx);
-			/* In the function below, .hs_keycmp resolves to
-			 * lu_obj_hop_keycmp()
-			 */
-			/* coverity[overrun-buffer-val] */
-			stripe = lov_sub_find(env, subdev, ofid, subconf);
-			if (!IS_ERR(stripe)) {
-				result = lov_init_sub(env, lov, stripe, r0, i);
-				if (result == -EAGAIN) { /* try again */
-					--i;
-					result = 0;
-					continue;
-				}
-			} else {
-				result = PTR_ERR(stripe);
-			}
+		result = lov_init_sub(env, lov, stripe, r0, i);
+		if (result == -EAGAIN) { /* try again */
+			--i;
+			result = 0;
+			continue;
+		}
 
-			if (result == 0) {
-				int sz = lov_page_slice_fixup(lov, stripe);
+		if (result == 0) {
+			int sz = lov_page_slice_fixup(lov, stripe);
 
-				LASSERT(ergo(psz > 0, psz == sz));
-				psz = sz;
-			}
+			LASSERT(ergo(psz > 0, psz == sz));
+			psz = sz;
 		}
-		if (result == 0)
-			cl_object_header(&lov->lo_cl)->coh_page_bufsize += psz;
-	} else {
-		result = -ENOMEM;
 	}
+	if (result == 0)
+		cl_object_header(&lov->lo_cl)->coh_page_bufsize += psz;
+	else
+		result = -ENOMEM;
+
 out:
 	return result;
 }
@@ -567,53 +571,43 @@ static int lov_attr_get_empty(const struct lu_env *env, struct cl_object *obj,
 static int lov_attr_get_raid0(const struct lu_env *env, struct lov_object *lov,
 			      struct cl_attr *attr, struct lov_layout_raid0 *r0)
 {
+	struct lov_stripe_md *lsm = lov->lo_lsm;
+	struct ost_lvb *lvb = &lov_env_info(env)->lti_lvb;
 	int result = 0;
+	u64 kms = 0;
 
-	/* this is called w/o holding type guard mutex, so it must be inside
-	 * an on going IO otherwise lsm may be replaced.
-	 * LU-2117: it turns out there exists one exception. For mmaped files,
-	 * the lock of those files may be requested in the other file's IO
-	 * context, and this function is called in ccc_lock_state(), it will
-	 * hit this assertion.
-	 * Anyway, it's still okay to call attr_get w/o type guard as layout
-	 * can't go if locks exist.
-	 */
-	/* LASSERT(atomic_read(&lsm->lsm_refc) > 1); */
+	if (r0->lo_attr_valid)
+		return 0;
 
-	if (!r0->lo_attr_valid) {
-		struct lov_stripe_md    *lsm = lov->lo_lsm;
-		struct ost_lvb	  *lvb = &lov_env_info(env)->lti_lvb;
-		__u64		    kms = 0;
+	memset(lvb, 0, sizeof(*lvb));
+	/* XXX: timestamps can be negative by sanity:test_39m,
+	 * how can it be?
+	 */
+	lvb->lvb_atime = LLONG_MIN;
+	lvb->lvb_ctime = LLONG_MIN;
+	lvb->lvb_mtime = LLONG_MIN;
 
-		memset(lvb, 0, sizeof(*lvb));
-		/* XXX: timestamps can be negative by sanity:test_39m,
-		 * how can it be?
-		 */
-		lvb->lvb_atime = LLONG_MIN;
-		lvb->lvb_ctime = LLONG_MIN;
-		lvb->lvb_mtime = LLONG_MIN;
+	/*
+	 * XXX that should be replaced with a loop over sub-objects,
+	 * doing cl_object_attr_get() on them. But for now, let's
+	 * reuse old lov code.
+	 */
 
-		/*
-		 * XXX that should be replaced with a loop over sub-objects,
-		 * doing cl_object_attr_get() on them. But for now, let's
-		 * reuse old lov code.
-		 */
+	/*
+	 * XXX take lsm spin-lock to keep lov_merge_lvb_kms()
+	 * happy. It's not needed, because new code uses
+	 * ->coh_attr_guard spin-lock to protect consistency of
+	 * sub-object attributes.
+	 */
+	lov_stripe_lock(lsm);
+	result = lov_merge_lvb_kms(lsm, lvb, &kms);
+	lov_stripe_unlock(lsm);
+	if (result)
+		return result;
 
-		/*
-		 * XXX take lsm spin-lock to keep lov_merge_lvb_kms()
-		 * happy. It's not needed, because new code uses
-		 * ->coh_attr_guard spin-lock to protect consistency of
-		 * sub-object attributes.
-		 */
-		lov_stripe_lock(lsm);
-		result = lov_merge_lvb_kms(lsm, lvb, &kms);
-		lov_stripe_unlock(lsm);
-		if (result == 0) {
-			cl_lvb2attr(attr, lvb);
-			attr->cat_kms = kms;
-			r0->lo_attr_valid = 1;
-		}
-	}
+	cl_lvb2attr(attr, lvb);
+	attr->cat_kms = kms;
+	r0->lo_attr_valid = 1;
 
 	return result;
 }
-- 
1.8.3.1

* [lustre-devel] [PATCH 10/28] lustre: lov: change lo_entries to array.
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (8 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 09/28] lustre: lov: reduce code indentation James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 11/28] lustre: lov: move around PFL code and cleanups James Simmons
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

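Old-style striping is equivalent to a composite layout with a single
component. To support PFL, which allows a file to have several
components, change lo_entries from a single embedded entry to an array
and pass an entry index to the helpers that used to assume entry 0.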
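
As a rough illustration of the data-structure change described above, here
is a minimal user-space sketch, not the kernel code. The struct and function
names (layout_composite, composite_init, composite_r0, composite_fini) are
hypothetical stand-ins for lov_layout_composite, lov_init_composite(),
lov_r0() and lov_fini_composite(); calloc()/free()/assert() stand in for
kcalloc()/kvfree()/LASSERTF(). It assumes a non-zero entry count.

```c
/* User-space sketch only; names are hypothetical analogues of the patch. */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct layout_raid0 {
	unsigned int lo_nr;		/* stripe count of this entry */
};

struct layout_entry {
	struct layout_raid0 lle_raid0;
};

struct layout_composite {
	unsigned int lo_entry_count;	/* valid entries in lo_entries */
	struct layout_entry *lo_entries;	/* was a single embedded entry */
};

/* Allocate a zeroed entry array; calloc() stands in for kcalloc(). */
static int composite_init(struct layout_composite *comp, unsigned int count)
{
	comp->lo_entries = calloc(count, sizeof(*comp->lo_entries));
	if (!comp->lo_entries) {
		comp->lo_entry_count = 0;
		return -ENOMEM;
	}
	comp->lo_entry_count = count;
	return 0;
}

/* Bounds-checked accessor, analogous to lov_r0(lov, i) in the patch. */
static struct layout_raid0 *composite_r0(struct layout_composite *comp,
					 unsigned int i)
{
	assert(i < comp->lo_entry_count);
	return &comp->lo_entries[i].lle_raid0;
}

/* Release the array and reset the count so a stale index cannot pass. */
static void composite_fini(struct layout_composite *comp)
{
	free(comp->lo_entries);
	comp->lo_entries = NULL;
	comp->lo_entry_count = 0;
}
```

With an entry count of 1, as in this patch, composite_r0(&comp, 0) behaves
like the old single-entry accessor; later patches in the series can raise
the count without touching the callers.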

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |  14 +-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |  20 +--
 drivers/staging/lustre/lustre/lov/lov_io.c         |  27 ++--
 drivers/staging/lustre/lustre/lov/lov_lock.c       |  13 +-
 drivers/staging/lustre/lustre/lov/lov_merge.c      |   6 +-
 drivers/staging/lustre/lustre/lov/lov_object.c     | 149 ++++++++++++---------
 drivers/staging/lustre/lustre/lov/lov_offset.c     |  30 +++--
 drivers/staging/lustre/lustre/lov/lov_page.c       |  11 +-
 drivers/staging/lustre/lustre/lov/lovsub_object.c  |   5 +-
 9 files changed, 158 insertions(+), 117 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index c44c937..99bd1c1 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -219,9 +219,13 @@ struct lov_object {
 		struct lov_layout_state_released {
 		} released;
 		struct lov_layout_composite {
+			/**
+			 * Current valid entry count of lo_entries.
+			 */
+			unsigned int lo_entry_count;
 			struct lov_layout_entry {
 				struct lov_layout_raid0 lle_raid0;
-			} lo_entries;
+			} *lo_entries;
 		} composite;
 	} u;
 	/**
@@ -628,13 +632,13 @@ static inline struct lov_thread_info *lov_env_info(const struct lu_env *env)
 	return info;
 }
 
-static inline struct lov_layout_raid0 *lov_r0(struct lov_object *lov)
+static inline struct lov_layout_raid0 *lov_r0(struct lov_object *lov, int i)
 {
 	LASSERT(lov->lo_type == LLT_COMP);
-	LASSERT(lov->lo_lsm->lsm_magic == LOV_MAGIC ||
-		lov->lo_lsm->lsm_magic == LOV_MAGIC_V3);
+	LASSERTF(i < lov->u.composite.lo_entry_count,
+		 "entry %d entry_count %d", i, lov->u.composite.lo_entry_count);
 
-	return &lov->u.composite.lo_entries.lle_raid0;
+	return &lov->u.composite.lo_entries[i].lle_raid0;
 }
 
 /* lov_pack.c */
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index f2747c9..4c9e324 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -169,20 +169,22 @@ struct lov_request_set {
 	(char *)((lv)->lov_tgts[index]->ltd_uuid.uuid)
 
 /* lov_merge.c */
-int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
+int lov_merge_lvb_kms(struct lov_stripe_md *lsm, int index,
 		      struct ost_lvb *lvb, __u64 *kms_place);
 
 /* lov_offset.c */
-u64 lov_stripe_size(struct lov_stripe_md *lsm, u64 ost_size, int stripeno);
-int lov_stripe_offset(struct lov_stripe_md *lsm, u64 lov_off,
-		      int stripeno, u64 *u64);
-u64 lov_size_to_stripe(struct lov_stripe_md *lsm, u64 file_size, int stripeno);
-int lov_stripe_intersects(struct lov_stripe_md *lsm, int stripeno,
+u64 lov_stripe_size(struct lov_stripe_md *lsm, int index, u64 ost_size,
+		    int stripeno);
+int lov_stripe_offset(struct lov_stripe_md *lsm, int index, u64 lov_off,
+		      int stripeno, u64 *obd_off);
+u64 lov_size_to_stripe(struct lov_stripe_md *lsm, int index, u64 file_size,
+		       int stripeno);
+int lov_stripe_intersects(struct lov_stripe_md *lsm, int index, int stripeno,
 			  u64 start, u64 end,
 			  u64 *obd_start, u64 *obd_end);
-int lov_stripe_number(struct lov_stripe_md *lsm, u64 lov_off);
-pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, pgoff_t stripe_index,
-			 int stripe);
+int lov_stripe_number(struct lov_stripe_md *lsm, int index, u64 lov_off);
+pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, int index,
+			 pgoff_t stripe_index, int stripe);
 
 /* lov_request.c */
 int lov_prep_statfs_set(struct obd_device *obd, struct obd_info *oinfo,
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 6dd5639..26d0043 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -85,7 +85,7 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 		if (cl_io_is_trunc(io)) {
 			loff_t new_size = parent->u.ci_setattr.sa_attr.lvb_size;
 
-			new_size = lov_size_to_stripe(lsm, new_size, stripe);
+			new_size = lov_size_to_stripe(lsm, 0, new_size, stripe);
 			io->u.ci_setattr.sa_attr.lvb_size = new_size;
 		}
 		break;
@@ -101,7 +101,7 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 		loff_t off = cl_offset(obj, parent->u.ci_fault.ft_index);
 
 		io->u.ci_fault = parent->u.ci_fault;
-		off = lov_size_to_stripe(lsm, off, stripe);
+		off = lov_size_to_stripe(lsm, 0, off, stripe);
 		io->u.ci_fault.ft_index = cl_index(obj, off);
 		break;
 	}
@@ -144,13 +144,14 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 	struct cl_object  *sub_obj;
 	struct cl_io      *io  = lio->lis_cl.cis_io;
 	int stripe = sub->sub_subio_index;
+	int index = 0;
 	int rc;
 
 	LASSERT(!sub->sub_io);
 	LASSERT(!sub->sub_env);
 	LASSERT(sub->sub_subio_index < lio->lis_stripe_count);
 
-	if (unlikely(!lov_r0(lov)->lo_sub[stripe]))
+	if (unlikely(!lov_r0(lov, index)->lo_sub[stripe]))
 		return -EIO;
 
 	sub->sub_io_initialized = 0;
@@ -179,7 +180,7 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 		}
 	}
 
-	sub_obj = lovsub2cl(lov_r0(lov)->lo_sub[stripe]);
+	sub_obj = lovsub2cl(lov_r0(lov, index)->lo_sub[stripe]);
 	sub_io = sub->sub_io;
 
 	sub_io->ci_obj = sub_obj;
@@ -375,14 +376,15 @@ static int lov_io_iter_init(const struct lu_env *env,
 	u64 end;
 	int stripe;
 	int rc = 0;
+	int index = 0;
 
 	endpos = lov_offset_mod(lio->lis_endpos, -1);
 	for (stripe = 0; stripe < lio->lis_stripe_count; stripe++) {
-		if (!lov_stripe_intersects(lsm, stripe, lio->lis_pos,
+		if (!lov_stripe_intersects(lsm, index, stripe, lio->lis_pos,
 					   endpos, &start, &end))
 			continue;
 
-		if (unlikely(!lov_r0(lio->lis_object)->lo_sub[stripe])) {
+		if (unlikely(!lov_r0(lio->lis_object, index)->lo_sub[stripe])) {
 			if (ios->cis_io->ci_type == CIT_READ ||
 			    ios->cis_io->ci_type == CIT_WRITE ||
 			    ios->cis_io->ci_type == CIT_FAULT)
@@ -555,15 +557,18 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	struct lov_io *lio = cl2lov_io(env, ios);
 	struct lov_object *loo = lio->lis_object;
 	struct cl_object *obj = lov2cl(loo);
-	struct lov_layout_raid0 *r0 = lov_r0(loo);
+	struct lov_layout_raid0 *r0;
 	unsigned int pps; /* pages per stripe */
 	struct lov_io_sub *sub;
 	pgoff_t ra_end;
-	loff_t suboff;
+	u64 suboff;
 	int stripe;
+	int index = 0;
 	int rc;
 
-	stripe = lov_stripe_number(loo->lo_lsm, cl_offset(obj, start));
+	stripe = lov_stripe_number(loo->lo_lsm, index, cl_offset(obj, start));
+
+	r0 = lov_r0(loo, index);
 	if (unlikely(!r0->lo_sub[stripe]))
 		return -EIO;
 
@@ -571,7 +576,7 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	if (IS_ERR(sub))
 		return PTR_ERR(sub);
 
-	lov_stripe_offset(loo->lo_lsm, cl_offset(obj, start), stripe, &suboff);
+	lov_stripe_offset(loo->lo_lsm, index, cl_offset(obj, start), stripe, &suboff);
 	rc = cl_io_read_ahead(sub->sub_env, sub->sub_io,
 			      cl_index(lovsub2cl(r0->lo_sub[stripe]), suboff),
 			      ra);
@@ -593,7 +598,7 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	/* cra_end is stripe level, convert it into file level */
 	ra_end = ra->cra_end;
 	if (ra_end != CL_PAGE_EOF)
-		ra_end = lov_stripe_pgoff(loo->lo_lsm, ra_end, stripe);
+		ra_end = lov_stripe_pgoff(loo->lo_lsm, index, ra_end, stripe);
 
 	pps = loo->lo_lsm->lsm_entries[0]->lsme_stripe_size >> PAGE_SHIFT;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index 4340063..36c9eb7 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -114,7 +114,11 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 					  const struct cl_object *obj,
 					  struct cl_lock *lock)
 {
+	struct lov_object *loo = cl2lov(obj);
+	struct lov_layout_raid0 *r0;
+	struct lov_lock	*lovlck;
 	int result = 0;
+	int index = 0;
 	int i;
 	int nr;
 	u64 start;
@@ -122,10 +126,6 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 	u64 file_start;
 	u64 file_end;
 
-	struct lov_object       *loo    = cl2lov(obj);
-	struct lov_layout_raid0 *r0     = lov_r0(loo);
-	struct lov_lock		*lovlck;
-
 	CDEBUG(D_INODE, "%p: lock/io FID " DFID "/" DFID ", lock/io clobj %p/%p\n",
 	       loo, PFID(lu_object_fid(lov2lu(loo))),
 	       PFID(lu_object_fid(&obj->co_lu)),
@@ -134,13 +134,14 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 	file_start = cl_offset(lov2cl(loo), lock->cll_descr.cld_start);
 	file_end   = cl_offset(lov2cl(loo), lock->cll_descr.cld_end + 1) - 1;
 
+	r0 = lov_r0(loo, index);
 	for (i = 0, nr = 0; i < r0->lo_nr; i++) {
 		/*
 		 * XXX for wide striping smarter algorithm is desirable,
 		 * breaking out of the loop, early.
 		 */
 		if (likely(r0->lo_sub[i]) && /* spare layout */
-		    lov_stripe_intersects(loo->lo_lsm, i,
+		    lov_stripe_intersects(loo->lo_lsm, index, i,
 					  file_start, file_end, &start, &end))
 			nr++;
 	}
@@ -153,7 +154,7 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 	lovlck->lls_nr = nr;
 	for (i = 0, nr = 0; i < r0->lo_nr; ++i) {
 		if (likely(r0->lo_sub[i]) &&
-		    lov_stripe_intersects(loo->lo_lsm, i,
+		    lov_stripe_intersects(loo->lo_lsm, index, i,
 					  file_start, file_end, &start, &end)) {
 			struct lov_lock_sub *lls = &lovlck->lls_sub[nr];
 			struct cl_lock_descr *descr;
diff --git a/drivers/staging/lustre/lustre/lov/lov_merge.c b/drivers/staging/lustre/lustre/lov/lov_merge.c
index 10b8448..020795f 100644
--- a/drivers/staging/lustre/lustre/lov/lov_merge.c
+++ b/drivers/staging/lustre/lustre/lov/lov_merge.c
@@ -41,7 +41,7 @@
  * initializes the current atime, mtime, ctime to avoid regressing a more
  * uptodate time on the local client.
  */
-int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
+int lov_merge_lvb_kms(struct lov_stripe_md *lsm, int index,
 		      struct ost_lvb *lvb, __u64 *kms_place)
 {
 	__u64 size = 0;
@@ -69,14 +69,14 @@ int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
 		}
 
 		tmpsize = loi->loi_kms;
-		lov_size = lov_stripe_size(lsm, tmpsize, i);
+		lov_size = lov_stripe_size(lsm, index, tmpsize, i);
 		if (lov_size > kms)
 			kms = lov_size;
 
 		if (loi->loi_lvb.lvb_size > tmpsize)
 			tmpsize = loi->loi_lvb.lvb_size;
 
-		lov_size = lov_stripe_size(lsm, tmpsize, i);
+		lov_size = lov_stripe_size(lsm, index, tmpsize, i);
 		if (lov_size > size)
 			size = lov_size;
 		/* merge blocks, mtime, atime */
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 1ebaa23..de5e2a2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -221,24 +221,13 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 			  const struct cl_object_conf *conf,
 			  struct lov_layout_raid0 *r0)
 {
+	struct cl_object *stripe;
+	struct lov_thread_info *lti = lov_env_info(env);
+	struct cl_object_conf *subconf = &lti->lti_stripe_conf;
+	struct lu_fid *ofid = &lti->lti_fid;
 	int result;
-	int i;
-
-	struct cl_object	*stripe;
-	struct lov_thread_info  *lti     = lov_env_info(env);
-	struct cl_object_conf   *subconf = &lti->lti_stripe_conf;
-	struct lu_fid	   *ofid    = &lti->lti_fid;
 	int psz;
-
-	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
-		dump_lsm(D_ERROR, lsm);
-		LASSERTF(0, "magic mismatch, expected %d/%d, actual %d.\n",
-			 LOV_MAGIC_V1, LOV_MAGIC_V3, lsm->lsm_magic);
-	}
-
-	LASSERT(!lov->lo_lsm);
-	lov->lo_lsm = lsm_addref(lsm);
-	lov->lo_layout_invalid = true;
+	int i;
 
 	spin_lock_init(&r0->lo_sub_lock);
 	r0->lo_nr  = lsm->lsm_entries[0]->lsme_stripe_count;
@@ -305,10 +294,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 		}
 	}
 	if (result == 0)
-		cl_object_header(&lov->lo_cl)->coh_page_bufsize += psz;
-	else
-		result = -ENOMEM;
-
+		result = psz;
 out:
 	return result;
 }
@@ -319,9 +305,37 @@ static int lov_init_composite(const struct lu_env *env, struct lov_device *dev,
 			      union lov_layout_state *state)
 {
 	struct lov_layout_composite *comp = &state->composite;
-	struct lov_layout_entry *le = &comp->lo_entries;
+	unsigned int entry_count = 1;
+	unsigned int psz = 0;
+	int result = 0;
+	int i;
 
-	return lov_init_raid0(env, dev, lov, lsm, conf, &le->lle_raid0);
+	LASSERT(!lov->lo_lsm);
+	lov->lo_lsm = lsm_addref(lsm);
+	lov->lo_layout_invalid = true;
+
+	comp->lo_entry_count = entry_count;
+
+	comp->lo_entries = kcalloc(entry_count, sizeof(*comp->lo_entries),
+				   GFP_KERNEL);
+	if (!comp->lo_entries)
+		return -ENOMEM;
+
+	for (i = 0; i < entry_count; i++) {
+		struct lov_layout_entry *le = &comp->lo_entries[i];
+
+		result = lov_init_raid0(env, dev, lov, lsm, conf,
+					&le->lle_raid0);
+		if (result < 0)
+			break;
+
+		LASSERT(ergo(psz > 0, psz == result));
+		psz = result;
+	}
+	if (psz > 0)
+		cl_object_header(&lov->lo_cl)->coh_page_bufsize += psz;
+
+	return result > 0 ? 0 : result;
 }
 
 static int lov_init_released(const struct lu_env *env, struct lov_device *dev,
@@ -454,7 +468,7 @@ static int lov_delete_composite(const struct lu_env *env,
 				union lov_layout_state *state)
 {
 	struct lov_layout_composite *comp = &state->composite;
-	struct lov_layout_entry *entry = &comp->lo_entries;
+	struct lov_layout_entry *entry = &comp->lo_entries[0];
 
 	dump_lsm(D_INODE, lov->lo_lsm);
 
@@ -484,9 +498,15 @@ static void lov_fini_composite(const struct lu_env *env,
 			       union lov_layout_state *state)
 {
 	struct lov_layout_composite *comp = &state->composite;
-	struct lov_layout_entry *entry = &comp->lo_entries;
 
-	lov_fini_raid0(env, &entry->lle_raid0);
+	if (comp->lo_entries) {
+		struct lov_layout_entry *entry = &comp->lo_entries[0];
+
+		lov_fini_raid0(env, &entry->lle_raid0);
+
+		kvfree(comp->lo_entries);
+		comp->lo_entries = NULL;
+	}
 
 	dump_lsm(D_INODE, lov->lo_lsm);
 	lov_free_memmd(&lov->lo_lsm);
@@ -528,7 +548,7 @@ static int lov_print_composite(const struct lu_env *env, void *cookie,
 			       lu_printer_t p, const struct lu_object *o)
 {
 	struct lov_object *lov = lu2lov(o);
-	struct lov_layout_raid0	*r0 = lov_r0(lov);
+	struct lov_layout_raid0	*r0 = lov_r0(lov, 0);
 	struct lov_stripe_md *lsm = lov->lo_lsm;
 
 	(*p)(env, cookie, "stripes: %d, %s, lsm{%p 0x%08X %d %u %u}:\n",
@@ -600,7 +620,7 @@ static int lov_attr_get_raid0(const struct lu_env *env, struct lov_object *lov,
 	 * sub-object attributes.
 	 */
 	lov_stripe_lock(lsm);
-	result = lov_merge_lvb_kms(lsm, lvb, &kms);
+	result = lov_merge_lvb_kms(lsm, 0, lvb, &kms);
 	lov_stripe_unlock(lsm);
 	if (result)
 		return result;
@@ -617,7 +637,7 @@ static int lov_attr_get_composite(const struct lu_env *env,
 				  struct cl_attr *attr)
 {
 	struct lov_object *lov = cl2lov(obj);
-	struct lov_layout_raid0 *r0 = lov_r0(lov);
+	struct lov_layout_raid0 *r0 = lov_r0(lov, 0);
 	struct cl_attr *lov_attr = &r0->lo_attr;
 	int result;
 
@@ -1051,33 +1071,31 @@ int lov_lock_init(const struct lu_env *env, struct cl_object *obj,
  *
  * \retval last_stripe		return the last stripe of the mapping
  */
-static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm,
+static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, int index,
 				   u64 fm_start, u64 fm_end,
 				   int start_stripe, int *stripe_count)
 {
+	struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index];
 	int last_stripe;
 	u64 obd_start;
 	u64 obd_end;
 	int i, j;
 
-	if (fm_end - fm_start > lsm->lsm_entries[0]->lsme_stripe_size *
-				lsm->lsm_entries[0]->lsme_stripe_count) {
-		last_stripe = (start_stripe < 1 ?
-			       lsm->lsm_entries[0]->lsme_stripe_count - 1 :
-			       start_stripe - 1);
-		*stripe_count = lsm->lsm_entries[0]->lsme_stripe_count;
+	if (fm_end - fm_start >
+	    lsme->lsme_stripe_size * lsme->lsme_stripe_count) {
+		last_stripe = (start_stripe < 1 ? lsme->lsme_stripe_count - 1 :
+						  start_stripe - 1);
+		*stripe_count = lsme->lsme_stripe_count;
 	} else {
-		for (j = 0, i = start_stripe;
-		     j < lsm->lsm_entries[0]->lsme_stripe_count;
-		     i = (i + 1) % lsm->lsm_entries[0]->lsme_stripe_count,
+		for (j = 0, i = start_stripe; j < lsme->lsme_stripe_count;
+		     i = (i + 1) % lsme->lsme_stripe_count,
 		     j++) {
-			if (lov_stripe_intersects(lsm, i, fm_start, fm_end,
+			if (lov_stripe_intersects(lsm, index, i, fm_start, fm_end,
 						  &obd_start, &obd_end) == 0)
 				break;
 		}
 		*stripe_count = j;
-		last_stripe = (start_stripe + j - 1) %
-			      lsm->lsm_entries[0]->lsme_stripe_count;
+		last_stripe = (start_stripe + j - 1) % lsme->lsme_stripe_count;
 	}
 
 	return last_stripe;
@@ -1132,9 +1150,10 @@ static void fiemap_prepare_and_copy_exts(struct fiemap *fiemap,
  */
 static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 				     struct lov_stripe_md *lsm,
-				     u64 fm_start, u64 fm_end,
+				     int index, u64 fm_start, u64 fm_end,
 				     int *start_stripe)
 {
+	struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index];
 	u64 local_end = fiemap->fm_extents[0].fe_logical;
 	u64 lun_start, lun_end;
 	u64 fm_end_offset;
@@ -1145,8 +1164,8 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 		return 0;
 
 	/* Find out stripe_no from ost_index saved in the fe_device */
-	for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count; i++) {
-		struct lov_oinfo *oinfo = lsm->lsm_entries[0]->lsme_oinfo[i];
+	for (i = 0; i < lsme->lsme_stripe_count; i++) {
+		struct lov_oinfo *oinfo = lsme->lsme_oinfo[i];
 
 		if (lov_oinfo_is_dummy(oinfo))
 			continue;
@@ -1164,7 +1183,7 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 	 * If we have finished mapping on previous device, shift logical
 	 * offset to start of next device
 	 */
-	if (lov_stripe_intersects(lsm, stripe_no, fm_start, fm_end,
+	if (lov_stripe_intersects(lsm, index, stripe_no, fm_start, fm_end,
 				  &lun_start, &lun_end) != 0 &&
 	    local_end < lun_end) {
 		fm_end_offset = local_end;
@@ -1174,8 +1193,7 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 		 * calculate offset in next stripe.
 		 */
 		fm_end_offset = 0;
-		*start_stripe = (stripe_no + 1) %
-				lsm->lsm_entries[0]->lsme_stripe_count;
+		*start_stripe = (stripe_no + 1) % lsme->lsme_stripe_count;
 	}
 
 	return fm_end_offset;
@@ -1197,11 +1215,11 @@ struct fiemap_state {
 };
 
 static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
-			     struct lov_stripe_md *lsm,
-			     struct fiemap *fiemap, size_t *buflen,
-			     struct ll_fiemap_info_key *fmkey, int stripeno,
-			     struct fiemap_state *fs)
+			     struct lov_stripe_md *lsm, struct fiemap *fiemap,
+			     size_t *buflen, struct ll_fiemap_info_key *fmkey,
+			     int index, int stripeno, struct fiemap_state *fs)
 {
+	struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index];
 	struct cl_object *subobj;
 	struct lov_obd *lov = lu2lov_dev(obj->co_lu.lo_dev)->ld_lov;
 	struct fiemap_extent *fm_ext = &fs->fs_fm->fm_extents[0];
@@ -1220,11 +1238,12 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 
 	fs->fs_device_done = false;
 	/* Find out range of mapping on this stripe */
-	if ((lov_stripe_intersects(lsm, stripeno, fs->fs_start, fs->fs_end,
+	if ((lov_stripe_intersects(lsm, index, stripeno,
+				   fs->fs_start, fs->fs_end,
 				   &lun_start, &obd_object_end)) == 0)
 		return 0;
 
-	if (lov_oinfo_is_dummy(lsm->lsm_entries[0]->lsme_oinfo[stripeno]))
+	if (lov_oinfo_is_dummy(lsme->lsme_oinfo[stripeno]))
 		return -EIO;
 
 	/* If this is a continuation FIEMAP call and we are on
@@ -1239,7 +1258,8 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 		/* Handle fs->fs_start + fs->fs_length overflow */
 		if (fs->fs_start + fs->fs_length < fs->fs_start)
 			fs->fs_length = ~0ULL - fs->fs_start;
-		lun_end = lov_size_to_stripe(lsm, fs->fs_start + fs->fs_length,
+		lun_end = lov_size_to_stripe(lsm, index,
+					     fs->fs_start + fs->fs_length,
 					     stripeno);
 	}
 
@@ -1274,7 +1294,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 		fs->fs_fm->fm_mapped_extents = 0;
 		fs->fs_fm->fm_flags = fiemap->fm_flags;
 
-		ost_index = lsm->lsm_entries[0]->lsme_oinfo[stripeno]->loi_ost_idx;
+		ost_index = lsme->lsme_oinfo[stripeno]->loi_ost_idx;
 
 		if (ost_index < 0 || ost_index >= lov->desc.ld_tgt_count) {
 			rc = -EINVAL;
@@ -1345,8 +1365,9 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 		 */
 		if (fm_ext[ext_count - 1].fe_flags & FIEMAP_EXTENT_LAST)
 			fm_ext[ext_count - 1].fe_flags &= ~FIEMAP_EXTENT_LAST;
-		if (lov_stripe_size(lsm, fm_ext[ext_count - 1].fe_logical +
-					 fm_ext[ext_count - 1].fe_length,
+		if (lov_stripe_size(lsm, index,
+				    fm_ext[ext_count - 1].fe_logical +
+				    fm_ext[ext_count - 1].fe_length,
 				    stripeno) >= fmkey->lfik_oa.o_size) {
 			ost_eof = true;
 			fs->fs_device_done = true;
@@ -1391,6 +1412,7 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 	struct fiemap *fm_local = NULL;
 	struct lov_stripe_md *lsm;
 	int rc = 0;
+	int entry = 0;
 	int cur_stripe;
 	int stripe_count;
 	struct fiemap_state fs = { NULL };
@@ -1450,7 +1472,7 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 		goto out;
 	}
 	/* Calculate start stripe, last stripe and length of mapping */
-	fs.fs_start_stripe = lov_stripe_number(lsm, fs.fs_start);
+	fs.fs_start_stripe = lov_stripe_number(lsm, 0, fs.fs_start);
 	fs.fs_end = (fs.fs_length == ~0ULL) ? fmkey->lfik_oa.o_size :
 					      fs.fs_start + fs.fs_length - 1;
 	/* If fs_length != ~0ULL but fs_start+fs_length-1 exceeds file size */
@@ -1459,11 +1481,12 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 		fs.fs_length = fs.fs_end - fs.fs_start;
 	}
 
-	fs.fs_last_stripe = fiemap_calc_last_stripe(lsm, fs.fs_start, fs.fs_end,
+	fs.fs_last_stripe = fiemap_calc_last_stripe(lsm, entry,
+						    fs.fs_start, fs.fs_end,
 						    fs.fs_start_stripe,
 						    &stripe_count);
-	fs.fs_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, fs.fs_start,
-						     fs.fs_end,
+	fs.fs_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, entry,
+						     fs.fs_start, fs.fs_end,
 						     &fs.fs_start_stripe);
 	if (fs.fs_end_offset == -EINVAL) {
 		rc = -EINVAL;
@@ -1489,8 +1512,8 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 	     --stripe_count,
 	     cur_stripe = (cur_stripe + 1) %
 			  lsm->lsm_entries[0]->lsme_stripe_count) {
-		rc = fiemap_for_stripe(env, obj, lsm, fiemap, buflen, fmkey,
-				       cur_stripe, &fs);
+		rc = fiemap_for_stripe(env, obj, lsm, fiemap, buflen,
+				       fmkey, 0, cur_stripe, &fs);
 		if (rc < 0)
 			goto out;
 		if (fs.fs_finish)
diff --git a/drivers/staging/lustre/lustre/lov/lov_offset.c b/drivers/staging/lustre/lustre/lov/lov_offset.c
index 19a44d3..d817aa5 100644
--- a/drivers/staging/lustre/lustre/lov/lov_offset.c
+++ b/drivers/staging/lustre/lustre/lov/lov_offset.c
@@ -38,9 +38,10 @@
 #include "lov_internal.h"
 
 /* compute object size given "stripeno" and the ost size */
-u64 lov_stripe_size(struct lov_stripe_md *lsm, u64 ost_size, int stripeno)
+u64 lov_stripe_size(struct lov_stripe_md *lsm, int index, u64 ost_size,
+		    int stripeno)
 {
-	unsigned long ssize = lsm->lsm_entries[0]->lsme_stripe_size;
+	unsigned long ssize = lsm->lsm_entries[index]->lsme_stripe_size;
 	unsigned long stripe_size;
 	u64 swidth;
 	u64 lov_size;
@@ -64,12 +65,13 @@ u64 lov_stripe_size(struct lov_stripe_md *lsm, u64 ost_size, int stripeno)
 /**
  * Compute file level page index by stripe level page offset
  */
-pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, pgoff_t stripe_index,
-			 int stripe)
+pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, int index,
+			 pgoff_t stripe_index, int stripe)
 {
 	loff_t offset;
 
-	offset = lov_stripe_size(lsm, (stripe_index << PAGE_SHIFT) + 1, stripe);
+	offset = lov_stripe_size(lsm, index, (stripe_index << PAGE_SHIFT) + 1,
+				 stripe);
 	return offset >> PAGE_SHIFT;
 }
 
@@ -122,10 +124,10 @@ pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, pgoff_t stripe_index,
  * falls in the stripe and no shifting was done; > 0 when the offset
  * was outside the stripe and was pulled back to its final byte.
  */
-int lov_stripe_offset(struct lov_stripe_md *lsm, u64 lov_off,
+int lov_stripe_offset(struct lov_stripe_md *lsm, int index, u64 lov_off,
 		      int stripeno, u64 *obdoff)
 {
-	unsigned long ssize  = lsm->lsm_entries[0]->lsme_stripe_size;
+	unsigned long ssize  = lsm->lsm_entries[index]->lsme_stripe_size;
 	u64 stripe_off, this_stripe, swidth;
 	int magic = lsm->lsm_magic;
 	int ret = 0;
@@ -177,10 +179,10 @@ int lov_stripe_offset(struct lov_stripe_md *lsm, u64 lov_off,
  * |    0    |     1     |     2     |    0    |     1     |     2     |
  * ---------------------------------------------------------------------
  */
-u64 lov_size_to_stripe(struct lov_stripe_md *lsm, u64 file_size,
+u64 lov_size_to_stripe(struct lov_stripe_md *lsm, int index, u64 file_size,
 		       int stripeno)
 {
-	unsigned long ssize  = lsm->lsm_entries[0]->lsme_stripe_size;
+	unsigned long ssize  = lsm->lsm_entries[index]->lsme_stripe_size;
 	u64 stripe_off, this_stripe, swidth;
 	int magic = lsm->lsm_magic;
 
@@ -218,13 +220,13 @@ u64 lov_size_to_stripe(struct lov_stripe_md *lsm, u64 file_size,
  * that is contained within the lov extent.  this returns true if the given
  * stripe does intersect with the lov extent.
  */
-int lov_stripe_intersects(struct lov_stripe_md *lsm, int stripeno,
+int lov_stripe_intersects(struct lov_stripe_md *lsm, int index, int stripeno,
 			  u64 start, u64 end, u64 *obd_start, u64 *obd_end)
 {
 	int start_side, end_side;
 
-	start_side = lov_stripe_offset(lsm, start, stripeno, obd_start);
-	end_side = lov_stripe_offset(lsm, end, stripeno, obd_end);
+	start_side = lov_stripe_offset(lsm, index, start, stripeno, obd_start);
+	end_side = lov_stripe_offset(lsm, index, end, stripeno, obd_end);
 
 	CDEBUG(D_INODE, "[%llu->%llu] -> [(%d) %llu->%llu (%d)]\n",
 	       start, end, start_side, *obd_start, *obd_end, end_side);
@@ -252,9 +254,9 @@ int lov_stripe_intersects(struct lov_stripe_md *lsm, int stripeno,
 }
 
 /* compute which stripe number "lov_off" will be written into */
-int lov_stripe_number(struct lov_stripe_md *lsm, u64 lov_off)
+int lov_stripe_number(struct lov_stripe_md *lsm, int index, u64 lov_off)
 {
-	unsigned long ssize  = lsm->lsm_entries[0]->lsme_stripe_size;
+	unsigned long ssize  = lsm->lsm_entries[index]->lsme_stripe_size;
 	u64 stripe_off, swidth;
 	int magic = lsm->lsm_magic;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index d94d003..ad34fc3 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -67,21 +67,24 @@ int lov_page_init_composite(const struct lu_env *env, struct cl_object *obj,
 			    struct cl_page *page, pgoff_t index)
 {
 	struct lov_object *loo = cl2lov(obj);
-	struct lov_layout_raid0 *r0 = lov_r0(loo);
 	struct lov_io     *lio = lov_env_io(env);
+	struct lov_layout_raid0 *r0;
 	struct cl_object  *subobj;
 	struct cl_object  *o;
 	struct lov_io_sub *sub;
 	struct lov_page   *lpg = cl_object_page_slice(obj, page);
-	loff_t	     offset;
+	u64 offset;
 	u64	    suboff;
 	int		stripe;
+	int entry = 0;
 	int		rc;
 
 	offset = cl_offset(obj, index);
-	stripe = lov_stripe_number(loo->lo_lsm, offset);
+
+	r0 = lov_r0(loo, entry);
+	stripe = lov_stripe_number(loo->lo_lsm, entry, offset);
 	LASSERT(stripe < r0->lo_nr);
-	rc = lov_stripe_offset(loo->lo_lsm, offset, stripe, &suboff);
+	rc = lov_stripe_offset(loo->lo_lsm, entry, offset, stripe, &suboff);
 	LASSERT(rc == 0);
 
 	lpg->lps_index = stripe;
diff --git a/drivers/staging/lustre/lustre/lov/lovsub_object.c b/drivers/staging/lustre/lustre/lov/lovsub_object.c
index d3e9537..cd7806b 100644
--- a/drivers/staging/lustre/lustre/lov/lovsub_object.c
+++ b/drivers/staging/lustre/lustre/lov/lovsub_object.c
@@ -79,8 +79,9 @@ static void lovsub_object_free(const struct lu_env *env, struct lu_object *obj)
 	 * object handling in lu_object_find.
 	 */
 	if (lov) {
+		int index = 0;
 		int stripe = los->lso_index;
-		struct lov_layout_raid0 *r0 = lov_r0(lov);
+		struct lov_layout_raid0 *r0 = lov_r0(lov, index);
 
 		LASSERT(lov->lo_type == LLT_COMP);
 		LASSERT(r0->lo_sub[stripe] == los);
@@ -107,7 +108,7 @@ static int lovsub_attr_update(const struct lu_env *env, struct cl_object *obj,
 {
 	struct lov_object *lov = cl2lovsub(obj)->lso_super;
 
-	lov_r0(lov)->lo_attr_valid = 0;
+	lov_r0(lov, 0)->lo_attr_valid = 0;
 	return 0;
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 11/28] lustre: lov: move around PFL code and cleanups
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (9 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 10/28] lustre: lov: change lo_entries to array James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 12/28] lustre: lov: remove lsm_stripe_by_[index|offset]_plain James Simmons
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

No code changes except for sub_subio_index, which changed type.
Move some code around and apply some style cleanups. This keeps
the real code changes separate from the style updates.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |  45 ++---
 drivers/staging/lustre/lustre/lov/lov_ea.c         |   3 +-
 drivers/staging/lustre/lustre/lov/lov_io.c         | 181 ++++++++++-----------
 drivers/staging/lustre/lustre/lov/lov_object.c     |  25 +--
 4 files changed, 128 insertions(+), 126 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index 99bd1c1..ce32823 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -315,12 +315,6 @@ struct lov_thread_info {
  */
 struct lov_io_sub {
 	/**
-	 * environment's refcheck.
-	 *
-	 * \see cl_env_get()
-	 */
-	u16			 sub_refcheck;
-	/**
 	 * true, iff cl_io_init() was successfully executed against
 	 * lov_io_sub::sub_io.
 	 */
@@ -334,18 +328,24 @@ struct lov_io_sub {
 	 * Linkage into a list (hanging off lov_io::lis_active) of all
 	 * sub-io's active for the current IO iteration.
 	 */
-	struct list_head	 sub_linkage;
-	u16			sub_subio_index;
+	struct list_head	sub_linkage;
+	unsigned int		sub_subio_index;
 	/**
 	 * sub-io for a stripe. Ideally sub-io's can be stopped and resumed
 	 * independently, with lov acting as a scheduler to maximize overall
 	 * throughput.
 	 */
-	struct cl_io	*sub_io;
+	struct cl_io		*sub_io;
 	/**
 	 * environment, in which sub-io executes.
 	 */
-	struct lu_env *sub_env;
+	struct lu_env		*sub_env;
+	/**
+	 * environment's refcheck.
+	 *
+	 * \see cl_env_get()
+	 */
+	u16			sub_refcheck;
 };
 
 /**
@@ -367,37 +367,38 @@ struct lov_io {
 	 *
 	 * This is used only for CIT_READ and CIT_WRITE io's.
 	 */
-	loff_t	     lis_io_endpos;
+	loff_t			lis_io_endpos;
 
 	/**
 	 * starting position within a file, for the current io loop iteration
 	 * (stripe), used by ci_io_loop().
 	 */
-	u64	    lis_pos;
+	u64			lis_pos;
 	/**
 	 * end position with in a file, for the current stripe io. This is
 	 * exclusive (i.e., next offset after last byte affected by io).
 	 */
-	u64	    lis_endpos;
-
-	int		lis_stripe_count;
-	int		lis_active_subios;
+	u64			lis_endpos;
+	int			lis_stripe_count;
+	int			lis_active_subios;
 
 	/**
 	 * the index of ls_single_subio in ls_subios array
 	 */
-	int		lis_single_subio_index;
-	struct cl_io       lis_single_subio;
+	int			lis_single_subio_index;
+	struct cl_io		lis_single_subio;
+
+	/**
+	 * List of active sub-io's. Active sub-io's are under the range
+	 * of [lis_pos, lis_endpos).
+	 */
+	struct list_head	lis_active;
 
 	/**
 	 * size of ls_subios array, actually the highest stripe #
 	 */
 	int		lis_nr_subios;
 	struct lov_io_sub *lis_subs;
-	/**
-	 * List of active sub-io's.
-	 */
-	struct list_head	 lis_active;
 };
 
 struct lov_session {
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 7d3d691..3a8d79e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -531,7 +531,8 @@ const struct lsm_operations *lsm_op_find(int magic)
 
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
 {
-	CDEBUG(level, "lsm %p, objid " DOSTID ", maxbytes %#llx, magic 0x%08X, stripe_size %u, stripe_count %u, refc: %d, layout_gen %u, pool [" LOV_POOLNAMEF "]\n",
+	CDEBUG(level,
+	       "lsm %p, objid " DOSTID ", maxbytes %#llx, magic 0x%08X, stripe_size %u, stripe_count %u, refc: %d, layout_gen %u, pool [" LOV_POOLNAMEF "]\n",
 	       lsm, POSTID(&lsm->lsm_oi), lsm->lsm_maxbytes, lsm->lsm_magic,
 	       lsm->lsm_entries[0]->lsme_stripe_size,
 	       lsm->lsm_entries[0]->lsme_stripe_count,
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 26d0043..ab97326 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -43,7 +43,6 @@
 /** \addtogroup lov
  *  @{
  */
-
 static void lov_io_sub_fini(const struct lu_env *env, struct lov_io *lio,
 			    struct lov_io_sub *sub)
 {
@@ -66,76 +65,6 @@ static void lov_io_sub_fini(const struct lu_env *env, struct lov_io *lio,
 	}
 }
 
-static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
-			       int stripe, loff_t start, loff_t end)
-{
-	struct lov_stripe_md *lsm    = lio->lis_object->lo_lsm;
-	struct cl_io	 *parent = lio->lis_cl.cis_io;
-
-	switch (io->ci_type) {
-	case CIT_SETATTR: {
-		io->u.ci_setattr.sa_attr = parent->u.ci_setattr.sa_attr;
-		io->u.ci_setattr.sa_attr_flags =
-					parent->u.ci_setattr.sa_attr_flags;
-		io->u.ci_setattr.sa_avalid = parent->u.ci_setattr.sa_avalid;
-		io->u.ci_setattr.sa_xvalid = parent->u.ci_setattr.sa_xvalid;
-		io->u.ci_setattr.sa_stripe_index = stripe;
-		io->u.ci_setattr.sa_parent_fid =
-					parent->u.ci_setattr.sa_parent_fid;
-		if (cl_io_is_trunc(io)) {
-			loff_t new_size = parent->u.ci_setattr.sa_attr.lvb_size;
-
-			new_size = lov_size_to_stripe(lsm, 0, new_size, stripe);
-			io->u.ci_setattr.sa_attr.lvb_size = new_size;
-		}
-		break;
-	}
-	case CIT_DATA_VERSION: {
-		io->u.ci_data_version.dv_data_version = 0;
-		io->u.ci_data_version.dv_flags =
-			parent->u.ci_data_version.dv_flags;
-		break;
-	}
-	case CIT_FAULT: {
-		struct cl_object *obj = parent->ci_obj;
-		loff_t off = cl_offset(obj, parent->u.ci_fault.ft_index);
-
-		io->u.ci_fault = parent->u.ci_fault;
-		off = lov_size_to_stripe(lsm, 0, off, stripe);
-		io->u.ci_fault.ft_index = cl_index(obj, off);
-		break;
-	}
-	case CIT_FSYNC: {
-		io->u.ci_fsync.fi_start = start;
-		io->u.ci_fsync.fi_end = end;
-		io->u.ci_fsync.fi_fid = parent->u.ci_fsync.fi_fid;
-		io->u.ci_fsync.fi_mode = parent->u.ci_fsync.fi_mode;
-		break;
-	}
-	case CIT_READ:
-	case CIT_WRITE: {
-		io->u.ci_wr.wr_sync = cl_io_is_sync_write(parent);
-		if (cl_io_is_append(parent)) {
-			io->u.ci_wr.wr_append = 1;
-		} else {
-			io->u.ci_rw.crw_pos = start;
-			io->u.ci_rw.crw_count = end - start;
-		}
-		break;
-	}
-	case CIT_LADVISE: {
-		io->u.ci_ladvise.li_start = start;
-		io->u.ci_ladvise.li_end = end;
-		io->u.ci_ladvise.li_fid = parent->u.ci_ladvise.li_fid;
-		io->u.ci_ladvise.li_advice = parent->u.ci_ladvise.li_advice;
-		io->u.ci_ladvise.li_flags = parent->u.ci_ladvise.li_flags;
-		break;
-	}
-	default:
-		break;
-	}
-}
-
 static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 			   struct lov_io_sub *sub)
 {
@@ -228,7 +157,6 @@ struct lov_io_sub *lov_sub_get(const struct lu_env *env,
  * Lov io operations.
  *
  */
-
 static int lov_page_index(const struct cl_page *page)
 {
 	const struct cl_page_slice *slice;
@@ -358,6 +286,76 @@ static void lov_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 		wake_up_all(&lov->lo_waitq);
 }
 
+static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
+			       int stripe, loff_t start, loff_t end)
+{
+	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
+	struct cl_io *parent = lio->lis_cl.cis_io;
+
+	switch (io->ci_type) {
+	case CIT_SETATTR: {
+		io->u.ci_setattr.sa_attr = parent->u.ci_setattr.sa_attr;
+		io->u.ci_setattr.sa_attr_flags =
+			parent->u.ci_setattr.sa_attr_flags;
+		io->u.ci_setattr.sa_avalid = parent->u.ci_setattr.sa_avalid;
+		io->u.ci_setattr.sa_xvalid = parent->u.ci_setattr.sa_xvalid;
+		io->u.ci_setattr.sa_stripe_index = stripe;
+		io->u.ci_setattr.sa_parent_fid =
+			parent->u.ci_setattr.sa_parent_fid;
+		if (cl_io_is_trunc(io)) {
+			loff_t new_size = parent->u.ci_setattr.sa_attr.lvb_size;
+
+			new_size = lov_size_to_stripe(lsm, 0, new_size, stripe);
+			io->u.ci_setattr.sa_attr.lvb_size = new_size;
+		}
+		break;
+	}
+	case CIT_DATA_VERSION: {
+		io->u.ci_data_version.dv_data_version = 0;
+		io->u.ci_data_version.dv_flags =
+			parent->u.ci_data_version.dv_flags;
+		break;
+	}
+	case CIT_FAULT: {
+		struct cl_object *obj = parent->ci_obj;
+		loff_t off = cl_offset(obj, parent->u.ci_fault.ft_index);
+
+		io->u.ci_fault = parent->u.ci_fault;
+		off = lov_size_to_stripe(lsm, 0, off, stripe);
+		io->u.ci_fault.ft_index = cl_index(obj, off);
+		break;
+	}
+	case CIT_FSYNC: {
+		io->u.ci_fsync.fi_start = start;
+		io->u.ci_fsync.fi_end = end;
+		io->u.ci_fsync.fi_fid = parent->u.ci_fsync.fi_fid;
+		io->u.ci_fsync.fi_mode = parent->u.ci_fsync.fi_mode;
+		break;
+	}
+	case CIT_READ:
+	case CIT_WRITE: {
+		io->u.ci_wr.wr_sync = cl_io_is_sync_write(parent);
+		if (cl_io_is_append(parent)) {
+			io->u.ci_wr.wr_append = 1;
+		} else {
+			io->u.ci_rw.crw_pos = start;
+			io->u.ci_rw.crw_count = end - start;
+		}
+		break;
+	}
+	case CIT_LADVISE: {
+		io->u.ci_ladvise.li_start = start;
+		io->u.ci_ladvise.li_end = end;
+		io->u.ci_ladvise.li_fid = parent->u.ci_ladvise.li_fid;
+		io->u.ci_ladvise.li_advice = parent->u.ci_ladvise.li_advice;
+		io->u.ci_ladvise.li_flags = parent->u.ci_ladvise.li_flags;
+		break;
+	}
+	default:
+		break;
+	}
+}
+
 static u64 lov_offset_mod(u64 val, int delta)
 {
 	if (val != OBD_OBJECT_EOF)
@@ -491,24 +489,6 @@ static int lov_io_end_wrapper(const struct lu_env *env, struct cl_io *io)
 	return 0;
 }
 
-static void
-lov_io_data_version_end(const struct lu_env *env, const struct cl_io_slice *ios)
-{
-	struct lov_io *lio = cl2lov_io(env, ios);
-	struct cl_io *parent = lio->lis_cl.cis_io;
-	struct lov_io_sub *sub;
-
-	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
-		lov_io_end_wrapper(sub->sub_env, sub->sub_io);
-
-		parent->u.ci_data_version.dv_data_version +=
-			sub->sub_io->u.ci_data_version.dv_data_version;
-
-		if (!parent->ci_result)
-			parent->ci_result = sub->sub_io->ci_result;
-	}
-}
-
 static int lov_io_iter_fini_wrapper(const struct lu_env *env, struct cl_io *io)
 {
 	cl_io_iter_fini(env, io);
@@ -529,6 +509,24 @@ static void lov_io_end(const struct lu_env *env, const struct cl_io_slice *ios)
 	LASSERT(rc == 0);
 }
 
+static void
+lov_io_data_version_end(const struct lu_env *env, const struct cl_io_slice *ios)
+{
+	struct lov_io *lio = cl2lov_io(env, ios);
+	struct cl_io *parent = lio->lis_cl.cis_io;
+	struct lov_io_sub *sub;
+
+	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
+		lov_io_end_wrapper(sub->sub_env, sub->sub_io);
+
+		parent->u.ci_data_version.dv_data_version +=
+			sub->sub_io->u.ci_data_version.dv_data_version;
+
+		if (!parent->ci_result)
+			parent->ci_result = sub->sub_io->ci_result;
+	}
+}
+
 static void lov_io_iter_fini(const struct lu_env *env,
 			     const struct cl_io_slice *ios)
 {
@@ -602,7 +600,8 @@ static int lov_io_read_ahead(const struct lu_env *env,
 
 	pps = loo->lo_lsm->lsm_entries[0]->lsme_stripe_size >> PAGE_SHIFT;
 
-	CDEBUG(D_READA, DFID " max_index = %lu, pps = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
+	CDEBUG(D_READA,
+	       DFID " max_index = %lu, pps = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
 	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps,
 	       loo->lo_lsm->lsm_entries[0]->lsme_stripe_size, stripe, start);
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index de5e2a2..3677fac 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -600,6 +600,7 @@ static int lov_attr_get_raid0(const struct lu_env *env, struct lov_object *lov,
 		return 0;
 
 	memset(lvb, 0, sizeof(*lvb));
+
 	/* XXX: timestamps can be negative by sanity:test_39m,
 	 * how can it be?
 	 */
@@ -1200,18 +1201,18 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 }
 
 struct fiemap_state {
-	struct fiemap	*fs_fm;
-	u64		fs_start;
-	u64		fs_length;
-	u64		fs_end;
-	u64		fs_end_offset;
-	int		fs_cur_extent;
-	int		fs_cnt_need;
-	int		fs_start_stripe;
-	int		fs_last_stripe;
-	bool		fs_device_done;
-	bool		fs_finish;
-	bool		fs_enough;
+	struct fiemap		*fs_fm;
+	u64			fs_start;
+	u64			fs_length;
+	u64			fs_end;
+	u64			fs_end_offset;
+	int			fs_cur_extent;
+	int			fs_cnt_need;
+	int			fs_start_stripe;
+	int			fs_last_stripe;
+	bool			fs_device_done;
+	bool			fs_finish;
+	bool			fs_enough;
 };
 
 static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 12/28] lustre: lov: remove lsm_stripe_by_[index|offset]_plain
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (10 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 11/28] lustre: lov: move around PFL code and cleanups James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 13/28] lustre: lov: add looping lsm_entry_count times James Simmons
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Since lsm_stripe_by_index() and lsm_stripe_by_offset() have the
same implementation for every lsm_operations instance, replace
them with a single helper function, stripe_width().

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_ea.c       | 24 ------------------------
 drivers/staging/lustre/lustre/lov/lov_internal.h |  4 ----
 drivers/staging/lustre/lustre/lov/lov_offset.c   | 23 +++++++++++++----------
 3 files changed, 13 insertions(+), 38 deletions(-)
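
For readers following along outside the kernel tree, the computation that
stripe_width() centralizes can be sketched in user space as below. This is
only an illustration under stated assumptions: the demo_* names and the
field types (32-bit stripe size, 16-bit stripe count) are hypothetical and
are not taken from the Lustre headers. The removed *_plain helpers cast to
loff_t before multiplying, so the sketch widens to 64 bits as well, and
the stripe-number step assumes the usual "offset modulo stripe width,
divided by stripe size" arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one layout component (not the kernel struct). */
struct demo_entry {
	uint32_t stripe_size;	/* bytes per stripe; type is an assumption */
	uint16_t stripe_count;	/* number of stripes; type is an assumption */
};

/* Hypothetical stand-in for the striping metadata holding all components. */
struct demo_lsm {
	unsigned int entry_count;
	const struct demo_entry *entries;
};

/* Stripe width of component "index": size * count, widened to 64 bits
 * before the multiply so that large stripe sizes cannot wrap.
 */
static uint64_t demo_stripe_width(const struct demo_lsm *lsm,
				  unsigned int index)
{
	assert(index < lsm->entry_count);

	return (uint64_t)lsm->entries[index].stripe_size *
	       lsm->entries[index].stripe_count;
}

/* Stripe number that file offset "off" falls into within one component. */
static unsigned int demo_stripe_number(const struct demo_lsm *lsm,
				       unsigned int index, uint64_t off)
{
	uint64_t swidth = demo_stripe_width(lsm, index);

	if (swidth == 0)
		return 0;

	/* position inside the current stripe row, then which stripe */
	return (unsigned int)((off % swidth) /
			      lsm->entries[index].stripe_size);
}
```

With a 1 MiB stripe size and four stripes, the width is 4 MiB and an offset
of 1 MiB lands in stripe 1. With a 2 GiB stripe size and four stripes, a
32-bit multiply would wrap, which is why the sketch widens first.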

diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 3a8d79e..f0ea895 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -309,24 +309,6 @@ void lsm_free(struct lov_stripe_md *lsm)
 	return lsm;
 }
 
-static void
-lsm_stripe_by_index_plain(struct lov_stripe_md *lsm, int *stripeno,
-			  loff_t *lov_off, loff_t *swidth)
-{
-	if (swidth)
-		*swidth = (loff_t)lsm->lsm_entries[0]->lsme_stripe_size *
-			  lsm->lsm_entries[0]->lsme_stripe_count;
-}
-
-static void
-lsm_stripe_by_offset_plain(struct lov_stripe_md *lsm, int *stripeno,
-			   loff_t *lov_off, loff_t *swidth)
-{
-	if (swidth)
-		*swidth = (loff_t)lsm->lsm_entries[0]->lsme_stripe_size *
-			  lsm->lsm_entries[0]->lsme_stripe_count;
-}
-
 static struct lov_stripe_md *
 lsm_unpackmd_v1(struct lov_obd *lov, void *buf, size_t buf_size)
 {
@@ -336,8 +318,6 @@ void lsm_free(struct lov_stripe_md *lsm)
 }
 
 const static struct lsm_operations lsm_v1_ops = {
-	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
-	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
 	.lsm_unpackmd		= lsm_unpackmd_v1,
 };
 
@@ -351,8 +331,6 @@ void lsm_free(struct lov_stripe_md *lsm)
 }
 
 const static struct lsm_operations lsm_v3_ops = {
-	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
-	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
 	.lsm_unpackmd		= lsm_unpackmd_v3,
 };
 
@@ -502,8 +480,6 @@ static int lsm_verify_comp_md_v1(struct lov_comp_md_v1 *lcm,
 }
 
 const static struct lsm_operations lsm_comp_md_v1_ops = {
-	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
-	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
 	.lsm_unpackmd		= lsm_unpackmd_comp_md_v1,
 };
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 4c9e324..ebe5890 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -80,10 +80,6 @@ static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
 }
 
 struct lsm_operations {
-	void (*lsm_stripe_by_index)(struct lov_stripe_md *, int *, loff_t *,
-				    loff_t *);
-	void (*lsm_stripe_by_offset)(struct lov_stripe_md *, int *, loff_t *,
-				     loff_t *);
 	struct lov_stripe_md *(*lsm_unpackmd)(struct lov_obd *obd, void *buf,
 					      size_t buf_len);
 };
diff --git a/drivers/staging/lustre/lustre/lov/lov_offset.c b/drivers/staging/lustre/lustre/lov/lov_offset.c
index d817aa5..513f1fd 100644
--- a/drivers/staging/lustre/lustre/lov/lov_offset.c
+++ b/drivers/staging/lustre/lustre/lov/lov_offset.c
@@ -37,6 +37,15 @@
 
 #include "lov_internal.h"
 
+static u64 stripe_width(struct lov_stripe_md *lsm, unsigned int index)
+{
+	struct lov_stripe_md_entry *entry = lsm->lsm_entries[index];
+
+	LASSERT(index < lsm->lsm_entry_count);
+
+	return entry->lsme_stripe_size * entry->lsme_stripe_count;
+}
+
 /* compute object size given "stripeno" and the ost size */
 u64 lov_stripe_size(struct lov_stripe_md *lsm, int index, u64 ost_size,
 		    int stripeno)
@@ -45,12 +54,11 @@ u64 lov_stripe_size(struct lov_stripe_md *lsm, int index, u64 ost_size,
 	unsigned long stripe_size;
 	u64 swidth;
 	u64 lov_size;
-	int magic = lsm->lsm_magic;
 
 	if (ost_size == 0)
 		return 0;
 
-	lsm_op_find(magic)->lsm_stripe_by_index(lsm, &stripeno, NULL, &swidth);
+	swidth = stripe_width(lsm, index);
 
 	/* lov_do_div64(a, b) returns a % b, and a = a / b */
 	stripe_size = lov_do_div64(ost_size, ssize);
@@ -129,7 +137,6 @@ int lov_stripe_offset(struct lov_stripe_md *lsm, int index, u64 lov_off,
 {
 	unsigned long ssize  = lsm->lsm_entries[index]->lsme_stripe_size;
 	u64 stripe_off, this_stripe, swidth;
-	int magic = lsm->lsm_magic;
 	int ret = 0;
 
 	if (lov_off == OBD_OBJECT_EOF) {
@@ -137,8 +144,7 @@ int lov_stripe_offset(struct lov_stripe_md *lsm, int index, u64 lov_off,
 		return 0;
 	}
 
-	lsm_op_find(magic)->lsm_stripe_by_index(lsm, &stripeno, &lov_off,
-						&swidth);
+	swidth = stripe_width(lsm, index);
 
 	/* lov_do_div64(a, b) returns a % b, and a = a / b */
 	stripe_off = lov_do_div64(lov_off, swidth);
@@ -184,13 +190,11 @@ u64 lov_size_to_stripe(struct lov_stripe_md *lsm, int index, u64 file_size,
 {
 	unsigned long ssize  = lsm->lsm_entries[index]->lsme_stripe_size;
 	u64 stripe_off, this_stripe, swidth;
-	int magic = lsm->lsm_magic;
 
 	if (file_size == OBD_OBJECT_EOF)
 		return OBD_OBJECT_EOF;
 
-	lsm_op_find(magic)->lsm_stripe_by_index(lsm, &stripeno, &file_size,
-						&swidth);
+	swidth = stripe_width(lsm, index);
 
 	/* lov_do_div64(a, b) returns a % b, and a = a / b */
 	stripe_off = lov_do_div64(file_size, swidth);
@@ -258,9 +262,8 @@ int lov_stripe_number(struct lov_stripe_md *lsm, int index, u64 lov_off)
 {
 	unsigned long ssize  = lsm->lsm_entries[index]->lsme_stripe_size;
 	u64 stripe_off, swidth;
-	int magic = lsm->lsm_magic;
 
-	lsm_op_find(magic)->lsm_stripe_by_offset(lsm, NULL, &lov_off, &swidth);
+	swidth = stripe_width(lsm, index);
 
 	stripe_off = lov_do_div64(lov_off, swidth);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 13/28] lustre: lov: add looping lsm_entry_count times
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (11 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 12/28] lustre: lov: remove lsm_stripe_by_[index|offset]_plain James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 14/28] lustre: lov: create lov_comp_* wrappers James Simmons
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Create lov_foreach_layout_entry() and lov_lse() to handle the case
where lsm_entry_count is greater than one. Modify various code
blocks to loop lsm_entry_count times.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |  13 +++
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  20 +++-
 drivers/staging/lustre/lustre/lov/lov_io.c         |  88 +++++++++-------
 drivers/staging/lustre/lustre/lov/lov_merge.c      |   6 +-
 drivers/staging/lustre/lustre/lov/lov_object.c     | 116 +++++++++++++--------
 5 files changed, 156 insertions(+), 87 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index ce32823..952da3a 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -235,6 +235,11 @@ struct lov_object {
 	struct task_struct	*lo_owner;
 };
 
+#define lov_foreach_layout_entry(lov, entry)				\
+	for (entry = &lov->u.composite.lo_entries[0];			\
+	     entry < &lov->u.composite.lo_entries[lov->u.composite.lo_entry_count];\
+	     entry++)
+
 /**
  * State lov_lock keeps for each sub-lock.
  */
@@ -642,6 +647,14 @@ static inline struct lov_layout_raid0 *lov_r0(struct lov_object *lov, int i)
 	return &lov->u.composite.lo_entries[i].lle_raid0;
 }
 
+static inline struct lov_stripe_md_entry *lov_lse(struct lov_object *lov, int i)
+{
+	LASSERT(lov->lo_lsm);
+	LASSERT(i < lov->lo_lsm->lsm_entry_count);
+
+	return lov->lo_lsm->lsm_entries[i];
+}
+
 /* lov_pack.c */
 int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		  struct lov_user_md __user *lump);
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index f0ea895..f89284a 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -507,11 +507,21 @@ const struct lsm_operations *lsm_op_find(int magic)
 
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
 {
+	int i;
+
 	CDEBUG(level,
-	       "lsm %p, objid " DOSTID ", maxbytes %#llx, magic 0x%08X, stripe_size %u, stripe_count %u, refc: %d, layout_gen %u, pool [" LOV_POOLNAMEF "]\n",
+	       "lsm %p, objid " DOSTID ", maxbytes %#llx, magic 0x%08X, refc: %d, entry: %u, layout_gen %u\n",
 	       lsm, POSTID(&lsm->lsm_oi), lsm->lsm_maxbytes, lsm->lsm_magic,
-	       lsm->lsm_entries[0]->lsme_stripe_size,
-	       lsm->lsm_entries[0]->lsme_stripe_count,
-	       atomic_read(&lsm->lsm_refc), lsm->lsm_layout_gen,
-	       lsm->lsm_entries[0]->lsme_pool_name);
+	       atomic_read(&lsm->lsm_refc), lsm->lsm_entry_count,
+	       lsm->lsm_layout_gen);
+
+	for (i = 0; i < lsm->lsm_entry_count; i++) {
+		struct lov_stripe_md_entry *lse = lsm->lsm_entries[i];
+
+		CDEBUG(level,
+		       ": id: %u, magic 0x%08X, stripe count %u, size %u, layout_gen %u, pool: [" LOV_POOLNAMEF "]\n",
+		       lse->lsme_id, lse->lsme_magic,
+		       lse->lsme_stripe_count, lse->lsme_stripe_size,
+		       lse->lsme_layout_gen, lse->lsme_pool_name);
+	}
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index ab97326..7fdbed9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -368,46 +368,59 @@ static int lov_io_iter_init(const struct lu_env *env,
 {
 	struct lov_io	*lio = cl2lov_io(env, ios);
 	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
+	struct lov_layout_entry *le;
 	struct lov_io_sub    *sub;
 	u64 endpos;
-	u64 start;
-	u64 end;
-	int stripe;
 	int rc = 0;
-	int index = 0;
+	int index;
 
 	endpos = lov_offset_mod(lio->lis_endpos, -1);
-	for (stripe = 0; stripe < lio->lis_stripe_count; stripe++) {
-		if (!lov_stripe_intersects(lsm, index, stripe, lio->lis_pos,
-					   endpos, &start, &end))
-			continue;
-
-		if (unlikely(!lov_r0(lio->lis_object, index)->lo_sub[stripe])) {
-			if (ios->cis_io->ci_type == CIT_READ ||
-			    ios->cis_io->ci_type == CIT_WRITE ||
-			    ios->cis_io->ci_type == CIT_FAULT)
-				return -EIO;
 
-			continue;
-		}
+	index = 0;
+	lov_foreach_layout_entry(lio->lis_object, le) {
+		struct lov_layout_raid0 *r0 = &le->lle_raid0;
+		int stripe;
+		u64 start;
+		u64 end;
+
+		index++;
+
+		for (stripe = 0; stripe < r0->lo_nr; stripe++) {
+			if (!lov_stripe_intersects(lsm, index - 1, stripe,
+						   lio->lis_pos,
+						   endpos, &start, &end))
+				continue;
+
+			if (unlikely(!r0->lo_sub[stripe])) {
+				if (ios->cis_io->ci_type == CIT_READ ||
+				    ios->cis_io->ci_type == CIT_WRITE ||
+				    ios->cis_io->ci_type == CIT_FAULT)
+					return -EIO;
+
+				continue;
+			}
+
+			end = lov_offset_mod(end, 1);
+			sub = lov_sub_get(env, lio, stripe);
+			if (IS_ERR(sub)) {
+				rc = PTR_ERR(sub);
+				break;
+			}
 
-		end = lov_offset_mod(end, 1);
-		sub = lov_sub_get(env, lio, stripe);
-		if (IS_ERR(sub)) {
-			rc = PTR_ERR(sub);
-			break;
-		}
+			lov_io_sub_inherit(sub->sub_io, lio, stripe, start, end);
+			rc = cl_io_iter_init(sub->sub_env, sub->sub_io);
+			if (rc) {
+				cl_io_iter_fini(sub->sub_env, sub->sub_io);
+				break;
+			}
 
-		lov_io_sub_inherit(sub->sub_io, lio, stripe, start, end);
-		rc = cl_io_iter_init(sub->sub_env, sub->sub_io);
-		if (rc) {
-			cl_io_iter_fini(sub->sub_env, sub->sub_io);
-			break;
-		}
-		CDEBUG(D_VFSTRACE, "shrink: %d [%llu, %llu)\n",
-		       stripe, start, end);
+			CDEBUG(D_VFSTRACE, "shrink: %d [%llu, %llu)\n",
+			       stripe, start, end);
 
-		list_add_tail(&sub->sub_linkage, &lio->lis_active);
+			list_add_tail(&sub->sub_linkage, &lio->lis_active);
+		}
+		if (rc)
+			break;
 	}
 	return rc;
 }
@@ -417,13 +430,18 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 {
 	struct lov_io	*lio = cl2lov_io(env, ios);
 	struct cl_io	 *io  = ios->cis_io;
-	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
-	unsigned long ssize = lsm->lsm_entries[0]->lsme_stripe_size;
 	u64 start = io->u.ci_rw.crw_pos;
+	struct lov_stripe_md_entry *lse;
+	unsigned long ssize;
 	loff_t next;
+	int index = 0;
 
 	LASSERT(io->ci_type == CIT_READ || io->ci_type == CIT_WRITE);
 
+	lse = lov_lse(lio->lis_object, index);
+
+	ssize = lse->lsme_stripe_size;
+
 	/* fast path for common case. */
 	if (lio->lis_nr_subios != 1 && !cl_io_is_append(io)) {
 		lov_do_div64(start, ssize);
@@ -598,12 +616,12 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	if (ra_end != CL_PAGE_EOF)
 		ra_end = lov_stripe_pgoff(loo->lo_lsm, index, ra_end, stripe);
 
-	pps = loo->lo_lsm->lsm_entries[0]->lsme_stripe_size >> PAGE_SHIFT;
+	pps = lov_lse(loo, index)->lsme_stripe_size >> PAGE_SHIFT;
 
 	CDEBUG(D_READA,
 	       DFID " max_index = %lu, pps = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
 	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps,
-	       loo->lo_lsm->lsm_entries[0]->lsme_stripe_size, stripe, start);
+	       lov_lse(loo, index)->lsme_stripe_size, stripe, start);
 
 	/* never exceed the end of the stripe */
 	ra->cra_end = min_t(pgoff_t, ra_end, start + pps - start % pps - 1);
diff --git a/drivers/staging/lustre/lustre/lov/lov_merge.c b/drivers/staging/lustre/lustre/lov/lov_merge.c
index 020795f..79edc26 100644
--- a/drivers/staging/lustre/lustre/lov/lov_merge.c
+++ b/drivers/staging/lustre/lustre/lov/lov_merge.c
@@ -44,6 +44,7 @@
 int lov_merge_lvb_kms(struct lov_stripe_md *lsm, int index,
 		      struct ost_lvb *lvb, __u64 *kms_place)
 {
+	struct lov_stripe_md_entry *lse = lsm->lsm_entries[index];
 	__u64 size = 0;
 	__u64 kms = 0;
 	__u64 blocks = 0;
@@ -59,8 +60,9 @@ int lov_merge_lvb_kms(struct lov_stripe_md *lsm, int index,
 	CDEBUG(D_INODE, "MDT ID " DOSTID " initial value: s=%llu m=%llu a=%llu c=%llu b=%llu\n",
 	       POSTID(&lsm->lsm_oi), lvb->lvb_size, lvb->lvb_mtime,
 	       lvb->lvb_atime, lvb->lvb_ctime, lvb->lvb_blocks);
-	for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count; i++) {
-		struct lov_oinfo *loi = lsm->lsm_entries[0]->lsme_oinfo[i];
+
+	for (i = 0; i < lse->lsme_stripe_count; i++) {
+		struct lov_oinfo *loi = lse->lsme_oinfo[i];
 		u64 lov_size, tmpsize;
 
 		if (OST_LVB_IS_ERR(loi->loi_lvb.lvb_blocks)) {
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 3677fac..8fd92a0 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -217,10 +217,11 @@ static int lov_page_slice_fixup(struct lov_object *lov,
 }
 
 static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
-			  struct lov_object *lov, struct lov_stripe_md *lsm,
+			  struct lov_object *lov, int index,
 			  const struct cl_object_conf *conf,
 			  struct lov_layout_raid0 *r0)
 {
+	struct lov_stripe_md_entry *lse = lov_lse(lov, index);
 	struct cl_object *stripe;
 	struct lov_thread_info *lti = lov_env_info(env);
 	struct cl_object_conf *subconf = &lti->lti_stripe_conf;
@@ -230,7 +231,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	int i;
 
 	spin_lock_init(&r0->lo_sub_lock);
-	r0->lo_nr  = lsm->lsm_entries[0]->lsme_stripe_count;
+	r0->lo_nr = lse->lsme_stripe_count;
 	LASSERT(r0->lo_nr <= lov_targets_nr(dev));
 
 	r0->lo_sub = kvzalloc(r0->lo_nr * sizeof(r0->lo_sub[0]),
@@ -245,11 +246,10 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	 * Create stripe cl_objects.
 	 */
 	for (i = 0; i < r0->lo_nr; ++i) {
+		struct lov_oinfo *oinfo = lse->lsme_oinfo[i];
 		struct cl_device *subdev;
-		struct lov_oinfo *oinfo;
 		int ost_idx;
 
-		oinfo = lsm->lsm_entries[0]->lsme_oinfo[i];
 		if (lov_oinfo_is_dummy(oinfo))
 			continue;
 
@@ -324,7 +324,7 @@ static int lov_init_composite(const struct lu_env *env, struct lov_device *dev,
 	for (i = 0; i < entry_count; i++) {
 		struct lov_layout_entry *le = &comp->lo_entries[i];
 
-		result = lov_init_raid0(env, dev, lov, lsm, conf,
+		result = lov_init_raid0(env, dev, lov, i, conf,
 					&le->lle_raid0);
 		if (result < 0)
 			break;
@@ -467,13 +467,13 @@ static int lov_delete_composite(const struct lu_env *env,
 				struct lov_object *lov,
 				union lov_layout_state *state)
 {
-	struct lov_layout_composite *comp = &state->composite;
-	struct lov_layout_entry *entry = &comp->lo_entries[0];
+	struct lov_layout_entry *entry;
 
 	dump_lsm(D_INODE, lov->lo_lsm);
 
 	lov_layout_wait(env, lov);
-	lov_delete_raid0(env, lov, &entry->lle_raid0);
+	lov_foreach_layout_entry(lov, entry)
+		lov_delete_raid0(env, lov, &entry->lle_raid0);
 
 	return 0;
 }
@@ -500,9 +500,10 @@ static void lov_fini_composite(const struct lu_env *env,
 	struct lov_layout_composite *comp = &state->composite;
 
 	if (comp->lo_entries) {
-		struct lov_layout_entry *entry = &comp->lo_entries[0];
+		struct lov_layout_entry *entry;
 
-		lov_fini_raid0(env, &entry->lle_raid0);
+		lov_foreach_layout_entry(lov, entry)
+			lov_fini_raid0(env, &entry->lle_raid0);
 
 		kvfree(comp->lo_entries);
 		comp->lo_entries = NULL;
@@ -548,15 +549,24 @@ static int lov_print_composite(const struct lu_env *env, void *cookie,
 			       lu_printer_t p, const struct lu_object *o)
 {
 	struct lov_object *lov = lu2lov(o);
-	struct lov_layout_raid0	*r0 = lov_r0(lov, 0);
 	struct lov_stripe_md *lsm = lov->lo_lsm;
+	int i;
 
-	(*p)(env, cookie, "stripes: %d, %s, lsm{%p 0x%08X %d %u %u}:\n",
-	     r0->lo_nr, lov->lo_layout_invalid ? "invalid" : "valid", lsm,
+	(*p)(env, cookie, "entries: %d, %s, lsm{%p 0x%08X %d %u}:\n",
+	     lsm->lsm_entry_count,
+	     lov->lo_layout_invalid ? "invalid" : "valid", lsm,
 	     lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
-	     lsm->lsm_entries[0]->lsme_stripe_count, lsm->lsm_layout_gen);
+	     lsm->lsm_layout_gen);
+
+	for (i = 0; i < lsm->lsm_entry_count; i++) {
+		struct lov_stripe_md_entry *lse = lsm->lsm_entries[i];
 
-	lov_print_raid0(env, cookie, p, r0);
+		(*p)(env, cookie, ": { 0x%08X, %u, %u, %u, %u }\n",
+		     lse->lsme_magic,
+		     lse->lsme_id, lse->lsme_layout_gen,
+		     lse->lsme_stripe_count, lse->lsme_stripe_size);
+		lov_print_raid0(env, cookie, p, lov_r0(lov, i));
+	}
 
 	return 0;
 }
@@ -589,10 +599,11 @@ static int lov_attr_get_empty(const struct lu_env *env, struct cl_object *obj,
 }
 
 static int lov_attr_get_raid0(const struct lu_env *env, struct lov_object *lov,
-			      struct cl_attr *attr, struct lov_layout_raid0 *r0)
+			      unsigned int index, struct lov_layout_raid0 *r0)
 {
 	struct lov_stripe_md *lsm = lov->lo_lsm;
 	struct ost_lvb *lvb = &lov_env_info(env)->lti_lvb;
+	struct cl_attr *attr = &r0->lo_attr;
 	int result = 0;
 	u64 kms = 0;
 
@@ -621,7 +632,7 @@ static int lov_attr_get_raid0(const struct lu_env *env, struct lov_object *lov,
 	 * sub-object attributes.
 	 */
 	lov_stripe_lock(lsm);
-	result = lov_merge_lvb_kms(lsm, 0, lvb, &kms);
+	result = lov_merge_lvb_kms(lsm, index, lvb, &kms);
 	lov_stripe_unlock(lsm);
 	if (result)
 		return result;
@@ -638,24 +649,33 @@ static int lov_attr_get_composite(const struct lu_env *env,
 				  struct cl_attr *attr)
 {
 	struct lov_object *lov = cl2lov(obj);
-	struct lov_layout_raid0 *r0 = lov_r0(lov, 0);
-	struct cl_attr *lov_attr = &r0->lo_attr;
-	int result;
+	struct lov_layout_entry *entry;
+	int result = 0;
+	int index = 0;
 
-	result = lov_attr_get_raid0(env, lov, attr, r0);
-	if (result)
-		return result;
+	attr->cat_blocks = 0;
+	attr->cat_size = 0;
+	lov_foreach_layout_entry(lov, entry) {
+		struct lov_layout_raid0 *r0 = &entry->lle_raid0;
+		struct cl_attr *lov_attr = &r0->lo_attr;
 
-	attr->cat_blocks = lov_attr->cat_blocks;
-	attr->cat_size = lov_attr->cat_size;
-	attr->cat_kms = lov_attr->cat_kms;
-	if (attr->cat_atime < lov_attr->cat_atime)
-		attr->cat_atime = lov_attr->cat_atime;
-	if (attr->cat_ctime < lov_attr->cat_ctime)
-		attr->cat_ctime = lov_attr->cat_ctime;
-	if (attr->cat_mtime < lov_attr->cat_mtime)
-		attr->cat_mtime = lov_attr->cat_mtime;
+		result = lov_attr_get_raid0(env, lov, index, r0);
+		if (result)
+			break;
 
+		/* merge results */
+		attr->cat_blocks += lov_attr->cat_blocks;
+		if (attr->cat_size < lov_attr->cat_size)
+			attr->cat_size = lov_attr->cat_size;
+		if (attr->cat_kms < lov_attr->cat_kms)
+			attr->cat_kms = lov_attr->cat_kms;
+		if (attr->cat_atime < lov_attr->cat_atime)
+			attr->cat_atime = lov_attr->cat_atime;
+		if (attr->cat_ctime < lov_attr->cat_ctime)
+			attr->cat_ctime = lov_attr->cat_ctime;
+		if (attr->cat_mtime < lov_attr->cat_mtime)
+			attr->cat_mtime = lov_attr->cat_mtime;
+	}
 	return result;
 }
 
@@ -1089,8 +1109,7 @@ static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, int index,
 		*stripe_count = lsme->lsme_stripe_count;
 	} else {
 		for (j = 0, i = start_stripe; j < lsme->lsme_stripe_count;
-		     i = (i + 1) % lsme->lsme_stripe_count,
-		     j++) {
+		     i = (i + 1) % lsme->lsme_stripe_count, j++) {
 			if (lov_stripe_intersects(lsm, index, i, fm_start, fm_end,
 						  &obd_start, &obd_end) == 0)
 				break;
@@ -1681,18 +1700,25 @@ int lov_read_and_clear_async_rc(struct cl_object *clob)
 			int i;
 
 			lsm = lov->lo_lsm;
-			for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count;
-			     i++) {
-				struct lov_oinfo *loi;
-
-				loi = lsm->lsm_entries[0]->lsme_oinfo[i];
-				if (lov_oinfo_is_dummy(loi))
-					continue;
-
-				if (loi->loi_ar.ar_rc && !rc)
-					rc = loi->loi_ar.ar_rc;
-				loi->loi_ar.ar_rc = 0;
+			LASSERT(lsm);
+			for (i = 0; i < lsm->lsm_entry_count; i++) {
+				struct lov_stripe_md_entry *lse;
+				int j;
+
+				lse = lsm->lsm_entries[i];
+				for (j = 0; j < lse->lsme_stripe_count; j++) {
+					struct lov_oinfo *loi;
+
+					loi = lse->lsme_oinfo[j];
+					if (lov_oinfo_is_dummy(loi))
+						continue;
+
+					if (loi->loi_ar.ar_rc && !rc)
+						rc = loi->loi_ar.ar_rc;
+					loi->loi_ar.ar_rc = 0;
+				}
 			}
+			break;
 		}
 		case LLT_RELEASED:
 		case LLT_EMPTY:
-- 
1.8.3.1

* [lustre-devel] [PATCH 14/28] lustre: lov: create lov_comp_* wrappers
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (12 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 13/28] lustre: lov: add looping lsm_entry_count times James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 15/28] lustre: clio: client side implementation for PFL James Simmons
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

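Add lov_comp_index(), lov_comp_stripe() and lov_comp_entry() wrappers
that convert between an (entry, stripe) pair and the sub-object index
used for PFL components. For now the entry is always 0 and the index
is just the stripe number.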

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
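Note for reviewers (not part of the commit): the wrappers in this patch
are placeholders, and patch 15/28 in this series switches them to a
packed encoding with the entry in the upper 16 bits and the stripe in
the lower 16 bits. The user-space sketch below shows the round trip the
callers rely on. The comp_*() names are hypothetical stand-ins for the
lov_comp_*() helpers, and assert() stands in for LASSERT().

```c
#include <assert.h>
#include <limits.h>

/*
 * Pack a component (entry) number and a stripe number into one index.
 * Hypothetical stand-in for lov_comp_index() as it looks after 15/28.
 */
static inline unsigned int comp_index(int entry, int stripe)
{
	assert(entry >= 0 && entry <= SHRT_MAX);
	assert(stripe >= 0 && stripe < USHRT_MAX);

	return ((unsigned int)entry << 16) | (unsigned int)stripe;
}

/* Recover the stripe number from the low 16 bits. */
static inline int comp_stripe(unsigned int index)
{
	return (int)(index & 0xffff);
}

/* Recover the component (entry) number from the high 16 bits. */
static inline int comp_entry(unsigned int index)
{
	return (int)(index >> 16);
}
```

With this layout, comp_stripe(comp_index(e, s)) == s and
comp_entry(comp_index(e, s)) == e for any values in range. That is why
call sites can be converted to the wrappers now without a behaviour
change while entry is still always 0.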
 drivers/staging/lustre/lustre/lov/lov_internal.h  | 15 ++++++++++++++
 drivers/staging/lustre/lustre/lov/lov_io.c        | 20 ++++++++++--------
 drivers/staging/lustre/lustre/lov/lov_lock.c      |  3 ++-
 drivers/staging/lustre/lustre/lov/lov_object.c    | 25 +++++++++++++++--------
 drivers/staging/lustre/lustre/lov/lov_page.c      |  4 ++--
 drivers/staging/lustre/lustre/lov/lovsub_object.c |  9 ++++----
 6 files changed, 52 insertions(+), 24 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index ebe5890..ef47c67 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -79,6 +79,21 @@ static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
 	return lsm && !lsm->lsm_is_released;
 }
 
+static inline unsigned int lov_comp_index(int entry, int stripe)
+{
+	return stripe;
+}
+
+static inline int lov_comp_stripe(int index)
+{
+	return index & 0xffff;
+}
+
+static inline int lov_comp_entry(int index)
+{
+	return 0;
+}
+
 struct lsm_operations {
 	struct lov_stripe_md *(*lsm_unpackmd)(struct lov_obd *obd, void *buf,
 					      size_t buf_len);
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 7fdbed9..635e5a6 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -72,8 +72,8 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 	struct cl_io      *sub_io;
 	struct cl_object  *sub_obj;
 	struct cl_io      *io  = lio->lis_cl.cis_io;
-	int stripe = sub->sub_subio_index;
-	int index = 0;
+	int index = lov_comp_entry(sub->sub_subio_index);
+	int stripe = lov_comp_stripe(sub->sub_subio_index);
 	int rc;
 
 	LASSERT(!sub->sub_io);
@@ -286,11 +286,13 @@ static void lov_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 		wake_up_all(&lov->lo_waitq);
 }
 
-static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
+static void lov_io_sub_inherit(struct lov_io_sub *sub, struct lov_io *lio,
 			       int stripe, loff_t start, loff_t end)
 {
+	struct cl_io *io = sub->sub_io;
 	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
 	struct cl_io *parent = lio->lis_cl.cis_io;
+	int index = lov_comp_entry(sub->sub_subio_index);
 
 	switch (io->ci_type) {
 	case CIT_SETATTR: {
@@ -305,7 +307,8 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 		if (cl_io_is_trunc(io)) {
 			loff_t new_size = parent->u.ci_setattr.sa_attr.lvb_size;
 
-			new_size = lov_size_to_stripe(lsm, 0, new_size, stripe);
+			new_size = lov_size_to_stripe(lsm, index, new_size,
+						      stripe);
 			io->u.ci_setattr.sa_attr.lvb_size = new_size;
 		}
 		break;
@@ -321,7 +324,7 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 		loff_t off = cl_offset(obj, parent->u.ci_fault.ft_index);
 
 		io->u.ci_fault = parent->u.ci_fault;
-		off = lov_size_to_stripe(lsm, 0, off, stripe);
+		off = lov_size_to_stripe(lsm, index, off, stripe);
 		io->u.ci_fault.ft_index = cl_index(obj, off);
 		break;
 	}
@@ -401,13 +404,14 @@ static int lov_io_iter_init(const struct lu_env *env,
 			}
 
 			end = lov_offset_mod(end, 1);
-			sub = lov_sub_get(env, lio, stripe);
+			sub = lov_sub_get(env, lio,
+					  lov_comp_index(index - 1, stripe));
 			if (IS_ERR(sub)) {
 				rc = PTR_ERR(sub);
 				break;
 			}
 
-			lov_io_sub_inherit(sub->sub_io, lio, stripe, start, end);
+			lov_io_sub_inherit(sub, lio, stripe, start, end);
 			rc = cl_io_iter_init(sub->sub_env, sub->sub_io);
 			if (rc) {
 				cl_io_iter_fini(sub->sub_env, sub->sub_io);
@@ -588,7 +592,7 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	if (unlikely(!r0->lo_sub[stripe]))
 		return -EIO;
 
-	sub = lov_sub_get(env, lio, stripe);
+	sub = lov_sub_get(env, lio, lov_comp_index(index, stripe));
 	if (IS_ERR(sub))
 		return PTR_ERR(sub);
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index 36c9eb7..cc08e96 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -168,7 +168,8 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 			descr->cld_mode  = lock->cll_descr.cld_mode;
 			descr->cld_gid   = lock->cll_descr.cld_gid;
 			descr->cld_enq_flags = lock->cll_descr.cld_enq_flags;
-			lls->sub_index = i;
+
+			lls->sub_index = lov_comp_index(index, i);
 
 			/* initialize sub lock */
 			result = lov_sublock_init(env, lock, lls);
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 8fd92a0..38258ce 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -132,6 +132,8 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 			struct cl_object *subobj, struct lov_layout_raid0 *r0,
 			int idx)
 {
+	int stripe = lov_comp_stripe(idx);
+	int entry = lov_comp_entry(idx);
 	struct cl_object_header *hdr;
 	struct cl_object_header *subhdr;
 	struct cl_object_header *parent;
@@ -154,8 +156,9 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 	subhdr = cl_object_header(subobj);
 
 	oinfo = lov->lo_lsm->lsm_entries[0]->lsme_oinfo[idx];
-	CDEBUG(D_INODE, DFID "@%p[%d] -> " DFID "@%p: ostid: " DOSTID " idx: %d gen: %d\n",
-	       PFID(&subhdr->coh_lu.loh_fid), subhdr, idx,
+	CDEBUG(D_INODE,
+	       DFID "@%p[%d:%d] -> " DFID "@%p: ostid: " DOSTID " ost idx: %d gen: %d\n",
+	       PFID(&subhdr->coh_lu.loh_fid), subhdr, entry, stripe,
 	       PFID(&hdr->coh_lu.loh_fid), hdr, POSTID(&oinfo->loi_oi),
 	       oinfo->loi_ost_idx, oinfo->loi_ost_gen);
 
@@ -167,9 +170,9 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 		spin_unlock(&subhdr->coh_attr_guard);
 		subhdr->coh_nesting = hdr->coh_nesting + 1;
 		lu_object_ref_add(&subobj->co_lu, "lov-parent", lov);
-		r0->lo_sub[idx] = cl2lovsub(subobj);
-		r0->lo_sub[idx]->lso_super = lov;
-		r0->lo_sub[idx]->lso_index = idx;
+		r0->lo_sub[stripe] = cl2lovsub(subobj);
+		r0->lo_sub[stripe]->lso_super = lov;
+		r0->lo_sub[stripe]->lso_index = idx;
 		result = 0;
 	} else {
 		struct lu_object  *old_obj;
@@ -279,7 +282,8 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 			goto out;
 		}
 
-		result = lov_init_sub(env, lov, stripe, r0, i);
+		result = lov_init_sub(env, lov, stripe, r0,
+				      lov_comp_index(index, i));
 		if (result == -EAGAIN) { /* try again */
 			--i;
 			result = 0;
@@ -354,14 +358,15 @@ static int lov_init_released(const struct lu_env *env, struct lov_device *dev,
 static struct cl_object *lov_find_subobj(const struct lu_env *env,
 					 struct lov_object *lov,
 					 struct lov_stripe_md *lsm,
-					 int stripe_idx)
+					 int index)
 {
 	struct lov_device *dev = lu2lov_dev(lov2lu(lov)->lo_dev);
-	struct lov_oinfo *oinfo = lsm->lsm_entries[0]->lsme_oinfo[stripe_idx];
 	struct lov_thread_info *lti = lov_env_info(env);
 	struct lu_fid *ofid = &lti->lti_fid;
+	int stripe = lov_comp_stripe(index);
 	struct cl_device *subdev;
 	struct cl_object *result;
+	struct lov_oinfo *oinfo;
 	int ost_idx;
 	int rc;
 
@@ -370,6 +375,7 @@ static struct cl_object *lov_find_subobj(const struct lu_env *env,
 		goto out;
 	}
 
+	oinfo = lsm->lsm_entries[0]->lsme_oinfo[stripe];
 	ost_idx = oinfo->loi_ost_idx;
 	rc = ostid_to_fid(ofid, &oinfo->loi_oi, ost_idx);
 	if (rc) {
@@ -1291,7 +1297,8 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 	len_mapped_single_call = 0;
 
 	/* find lobsub object */
-	subobj = lov_find_subobj(env, cl2lov(obj), lsm, stripeno);
+	subobj = lov_find_subobj(env, cl2lov(obj), lsm,
+				 lov_comp_index(index, stripeno));
 	if (IS_ERR(subobj))
 		return PTR_ERR(subobj);
 	/* If the output buffer is very large and the objects have many
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index ad34fc3..e227279 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -87,10 +87,10 @@ int lov_page_init_composite(const struct lu_env *env, struct cl_object *obj,
 	rc = lov_stripe_offset(loo->lo_lsm, entry, offset, stripe, &suboff);
 	LASSERT(rc == 0);
 
-	lpg->lps_index = stripe;
+	lpg->lps_index = lov_comp_index(entry, stripe);
 	cl_page_slice_add(page, &lpg->lps_cl, obj, index, &lov_comp_page_ops);
 
-	sub = lov_sub_get(env, lio, stripe);
+	sub = lov_sub_get(env, lio, lpg->lps_index);
 	if (IS_ERR(sub))
 		return PTR_ERR(sub);
 
diff --git a/drivers/staging/lustre/lustre/lov/lovsub_object.c b/drivers/staging/lustre/lustre/lov/lovsub_object.c
index cd7806b..ca7c8a0 100644
--- a/drivers/staging/lustre/lustre/lov/lovsub_object.c
+++ b/drivers/staging/lustre/lustre/lov/lovsub_object.c
@@ -79,8 +79,8 @@ static void lovsub_object_free(const struct lu_env *env, struct lu_object *obj)
 	 * object handling in lu_object_find.
 	 */
 	if (lov) {
-		int index = 0;
-		int stripe = los->lso_index;
+		int index = lov_comp_entry(los->lso_index);
+		int stripe = lov_comp_stripe(los->lso_index);
 		struct lov_layout_raid0 *r0 = lov_r0(lov, index);
 
 		LASSERT(lov->lo_type == LLT_COMP);
@@ -107,8 +107,9 @@ static int lovsub_attr_update(const struct lu_env *env, struct cl_object *obj,
 			      const struct cl_attr *attr, unsigned int valid)
 {
 	struct lov_object *lov = cl2lovsub(obj)->lso_super;
+	struct lovsub_object *los = cl2lovsub(obj);
 
-	lov_r0(lov, 0)->lo_attr_valid = 0;
+	lov_r0(lov, lov_comp_entry(los->lso_index))->lo_attr_valid = 0;
 	return 0;
 }
 
@@ -137,7 +138,7 @@ static void lovsub_req_attr_set(const struct lu_env *env, struct cl_object *obj,
 	 * There is no OBD_MD_* flag for obdo::o_stripe_idx, so set it
 	 * unconditionally. It never changes anyway.
 	 */
-	attr->cra_oa->o_stripe_idx = subobj->lso_index;
+	attr->cra_oa->o_stripe_idx = lov_comp_stripe(subobj->lso_index);
 }
 
 static const struct cl_object_operations lovsub_ops = {
-- 
1.8.3.1

* [lustre-devel] [PATCH 15/28] lustre: clio: client side implementation for PFL
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (13 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 14/28] lustre: lov: create lov_comp_* wrappers James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 16/28] lustre: clio: getstripe support comp layout James Simmons
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Make the client layer support composite layouts.

A plain layout is stored in the LOV layer as a composite layout
containing a single component.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24850
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
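Note for reviewers (not part of the commit): each component now carries
a half-open extent [e_start, e_end), and I/O is routed to the component
whose extent covers the file offset. The user-space sketch below mirrors
the logic of lu_extent_is_overlapped() and lov_lsm_entry() from this
patch. The struct and function names are hypothetical, and EXT_EOF is an
assumed stand-in for OBD_OBJECT_EOF (taken here to be the maximum 64-bit
value).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXT_EOF UINT64_MAX	/* assumption: stand-in for OBD_OBJECT_EOF */

/* Half-open byte range [start, end), like struct lu_extent. */
struct extent {
	uint64_t start;
	uint64_t end;
};

/* Two half-open ranges overlap iff each starts before the other ends. */
static bool extent_overlaps(const struct extent *a, const struct extent *b)
{
	return a->start < b->end && b->start < a->end;
}

/*
 * Return the index of the component covering @off, or -1 if none does.
 * An offset of EXT_EOF matches a component that extends to EOF.
 */
static int entry_for_offset(const struct extent *comps, int count,
			    uint64_t off)
{
	int i;

	for (i = 0; i < count; i++) {
		if ((off >= comps[i].start && off < comps[i].end) ||
		    (off == EXT_EOF && comps[i].end == EXT_EOF))
			return i;
	}

	return -1;
}
```

For a two-component layout { [0, 1 MiB), [1 MiB, EOF) }, offset
1 MiB - 1 maps to component 0 and offset 1 MiB to component 1. A plain
layout is simply the one-component case of the same lookup.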
 .../lustre/include/uapi/linux/lustre/lustre_user.h |   9 +
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |  25 +-
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  21 +-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |  10 +-
 drivers/staging/lustre/lustre/lov/lov_io.c         | 301 +++++++++++----------
 drivers/staging/lustre/lustre/lov/lov_lock.c       |  83 +++---
 drivers/staging/lustre/lustre/lov/lov_object.c     | 283 ++++++++++---------
 drivers/staging/lustre/lustre/lov/lov_offset.c     |  12 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c       |   2 +-
 drivers/staging/lustre/lustre/lov/lov_page.c       |   8 +-
 10 files changed, 436 insertions(+), 318 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
index 3751b22..67b2ae4 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
@@ -401,6 +401,15 @@ struct lu_extent {
 	__u64	e_end;
 };
 
+#define DEXT "[ %#llx , %#llx )"
+#define PEXT(ext) (ext)->e_start, (ext)->e_end
+
+static inline bool lu_extent_is_overlapped(struct lu_extent *e1,
+					    struct lu_extent *e2)
+{
+	return e1->e_start < e2->e_end && e2->e_start < e1->e_end;
+}
+
 enum lov_comp_md_entry_flags {
 	LCME_FL_PRIMARY		= 0x00000001,   /* Not used */
 	LCME_FL_STALE		= 0x00000002,   /* Not used */
diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index 952da3a..96e6636 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -224,6 +224,7 @@ struct lov_object {
 			 */
 			unsigned int lo_entry_count;
 			struct lov_layout_entry {
+				struct lu_extent lle_extent;
 				struct lov_layout_raid0 lle_raid0;
 			} *lo_entries;
 		} composite;
@@ -320,15 +321,9 @@ struct lov_thread_info {
  */
 struct lov_io_sub {
 	/**
-	 * true, iff cl_io_init() was successfully executed against
-	 * lov_io_sub::sub_io.
+	 * Linkage into a list (hanging off lov_io::lis_subios)
 	 */
-	u16			 sub_io_initialized:1,
-	/**
-	 * True, iff lov_io_sub::sub_io and lov_io_sub::sub_env weren't
-	 * allocated, but borrowed from a per-device emergency pool.
-	 */
-				 sub_borrowed:1;
+	struct list_head	sub_list;
 	/**
 	 * Linkage into a list (hanging off lov_io::lis_active) of all
 	 * sub-io's active for the current IO iteration.
@@ -340,7 +335,7 @@ struct lov_io_sub {
 	 * independently, with lov acting as a scheduler to maximize overall
 	 * throughput.
 	 */
-	struct cl_io		*sub_io;
+	struct cl_io		sub_io;
 	/**
 	 * environment, in which sub-io executes.
 	 */
@@ -351,6 +346,7 @@ struct lov_io_sub {
 	 * \see cl_env_get()
 	 */
 	u16			sub_refcheck;
+	u16			sub_reenter;
 };
 
 /**
@@ -384,14 +380,13 @@ struct lov_io {
 	 * exclusive (i.e., next offset after last byte affected by io).
 	 */
 	u64			lis_endpos;
-	int			lis_stripe_count;
-	int			lis_active_subios;
+	int			lis_nr_subios;
 
 	/**
 	 * the index of ls_single_subio in ls_subios array
 	 */
 	int			lis_single_subio_index;
-	struct cl_io		lis_single_subio;
+	struct lov_io_sub	lis_single_subio;
 
 	/**
 	 * List of active sub-io's. Active sub-io's are under the range
@@ -400,10 +395,9 @@ struct lov_io {
 	struct list_head	lis_active;
 
 	/**
-	 * size of ls_subios array, actually the highest stripe #
+	 * All sub-io's created in this lov_io.
 	 */
-	int		lis_nr_subios;
-	struct lov_io_sub *lis_subs;
+	struct list_head	lis_subios;
 };
 
 struct lov_session {
@@ -466,6 +460,7 @@ struct lu_object *lovsub_object_alloc(const struct lu_env *env,
 				      struct lu_device *dev);
 
 struct lov_stripe_md *lov_lsm_addref(struct lov_object *lov);
+int lov_lsm_entry(const struct lov_stripe_md *lsm, u64 offset);
 
 #define lov_foreach_target(lov, var)		    \
 	for (var = 0; var < lov_targets_nr(lov); ++var)
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index f89284a..124c12d 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -519,9 +519,26 @@ void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
 		struct lov_stripe_md_entry *lse = lsm->lsm_entries[i];
 
 		CDEBUG(level,
-		       ": id: %u, magic 0x%08X, stripe count %u, size %u, layout_gen %u, pool: [" LOV_POOLNAMEF "]\n",
-		       lse->lsme_id, lse->lsme_magic,
+		       DEXT ": id: %u, magic 0x%08X, stripe count %u, size %u, layout_gen %u, pool: [" LOV_POOLNAMEF "]\n",
+		       PEXT(&lse->lsme_extent), lse->lsme_id, lse->lsme_magic,
 		       lse->lsme_stripe_count, lse->lsme_stripe_size,
 		       lse->lsme_layout_gen, lse->lsme_pool_name);
 	}
 }
+
+int lov_lsm_entry(const struct lov_stripe_md *lsm, u64 offset)
+{
+	int i;
+
+	for (i = 0; i < lsm->lsm_entry_count; i++) {
+		struct lov_stripe_md_entry *lse = lsm->lsm_entries[i];
+
+		if ((offset >= lse->lsme_extent.e_start &&
+		     offset < lse->lsme_extent.e_end) ||
+		    (offset == OBD_OBJECT_EOF &&
+		     lse->lsme_extent.e_end == OBD_OBJECT_EOF))
+			return i;
+	}
+
+	return -1;
+}
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index ef47c67..29325ff 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -81,7 +81,10 @@ static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
 
 static inline unsigned int lov_comp_index(int entry, int stripe)
 {
-	return stripe;
+	LASSERT(entry >= 0 && entry <= SHRT_MAX);
+	LASSERT(stripe >= 0 && stripe < USHRT_MAX);
+
+	return entry << 16 | stripe;
 }
 
 static inline int lov_comp_stripe(int index)
@@ -91,7 +94,7 @@ static inline int lov_comp_stripe(int index)
 
 static inline int lov_comp_entry(int index)
 {
-	return 0;
+	return index >> 16;
 }
 
 struct lsm_operations {
@@ -191,8 +194,7 @@ int lov_stripe_offset(struct lov_stripe_md *lsm, int index, u64 lov_off,
 u64 lov_size_to_stripe(struct lov_stripe_md *lsm, int index, u64 file_size,
 		       int stripeno);
 int lov_stripe_intersects(struct lov_stripe_md *lsm, int index, int stripeno,
-			  u64 start, u64 end,
-			  u64 *obd_start, u64 *obd_end);
+			  struct lu_extent *ext, u64 *obd_start, u64 *obd_end);
 int lov_stripe_number(struct lov_stripe_md *lsm, int index, u64 lov_off);
 pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, int index,
 			 pgoff_t stripe_index, int stripe);
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 635e5a6..d9b2a81 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -43,24 +43,46 @@
 /** \addtogroup lov
  *  @{
  */
+
+static inline struct lov_io_sub *lov_sub_alloc(struct lov_io *lio, int index)
+{
+	struct lov_io_sub *sub;
+
+	if (lio->lis_nr_subios == 0) {
+		LASSERT(lio->lis_single_subio_index == -1);
+		sub = &lio->lis_single_subio;
+		lio->lis_single_subio_index = index;
+		memset(sub, 0, sizeof(*sub));
+	} else {
+		sub = kzalloc(sizeof(*sub), GFP_KERNEL);
+	}
+
+	if (sub) {
+		INIT_LIST_HEAD(&sub->sub_list);
+		INIT_LIST_HEAD(&sub->sub_linkage);
+		sub->sub_subio_index = index;
+	}
+
+	return sub;
+}
+
+static inline void lov_sub_free(struct lov_io *lio, struct lov_io_sub *sub)
+{
+	if (sub->sub_subio_index == lio->lis_single_subio_index) {
+		LASSERT(sub == &lio->lis_single_subio);
+		lio->lis_single_subio_index = -1;
+	} else {
+		kfree(sub);
+	}
+}
+
 static void lov_io_sub_fini(const struct lu_env *env, struct lov_io *lio,
 			    struct lov_io_sub *sub)
 {
-	if (sub->sub_io) {
-		if (sub->sub_io_initialized) {
-			cl_io_fini(sub->sub_env, sub->sub_io);
-			sub->sub_io_initialized = 0;
-			lio->lis_active_subios--;
-		}
-		if (sub->sub_subio_index == lio->lis_single_subio_index)
-			lio->lis_single_subio_index = -1;
-		else if (!sub->sub_borrowed)
-			kfree(sub->sub_io);
-		sub->sub_io = NULL;
-	}
-	if (!IS_ERR_OR_NULL(sub->sub_env)) {
-		if (!sub->sub_borrowed)
-			cl_env_put(sub->sub_env, &sub->sub_refcheck);
+	cl_io_fini(sub->sub_env, &sub->sub_io);
+
+	if (sub->sub_env && !IS_ERR(sub->sub_env)) {
+		cl_env_put(sub->sub_env, &sub->sub_refcheck);
 		sub->sub_env = NULL;
 	}
 }
@@ -74,46 +96,24 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 	struct cl_io      *io  = lio->lis_cl.cis_io;
 	int index = lov_comp_entry(sub->sub_subio_index);
 	int stripe = lov_comp_stripe(sub->sub_subio_index);
-	int rc;
+	int rc = 0;
 
-	LASSERT(!sub->sub_io);
 	LASSERT(!sub->sub_env);
-	LASSERT(sub->sub_subio_index < lio->lis_stripe_count);
 
 	if (unlikely(!lov_r0(lov, index)->lo_sub[stripe]))
 		return -EIO;
 
-	sub->sub_io_initialized = 0;
-	sub->sub_borrowed = 0;
-
 	/* obtain new environment */
 	sub->sub_env = cl_env_get(&sub->sub_refcheck);
-	if (IS_ERR(sub->sub_env)) {
+	if (IS_ERR(sub->sub_env))
 		rc = PTR_ERR(sub->sub_env);
-		goto fini_lov_io;
-	}
-
-	/*
-	 * First sub-io. Use ->lis_single_subio to
-	 * avoid dynamic allocation.
-	 */
-	if (lio->lis_active_subios == 0) {
-		sub->sub_io = &lio->lis_single_subio;
-		lio->lis_single_subio_index = stripe;
-	} else {
-		sub->sub_io = kzalloc(sizeof(*sub->sub_io),
-				      GFP_NOFS);
-		if (!sub->sub_io) {
-			rc = -ENOMEM;
-			goto fini_lov_io;
-		}
-	}
 
 	sub_obj = lovsub2cl(lov_r0(lov, index)->lo_sub[stripe]);
-	sub_io = sub->sub_io;
+	sub_io = &sub->sub_io;
 
 	sub_io->ci_obj = sub_obj;
 	sub_io->ci_result = 0;
+
 	sub_io->ci_parent = io;
 	sub_io->ci_lockreq = io->ci_lockreq;
 	sub_io->ci_type = io->ci_type;
@@ -121,31 +121,42 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 	sub_io->ci_noatime = io->ci_noatime;
 
 	rc = cl_io_sub_init(sub->sub_env, sub_io, io->ci_type, sub_obj);
-	if (rc >= 0) {
-		lio->lis_active_subios++;
-		sub->sub_io_initialized = 1;
-		rc = 0;
-	}
-fini_lov_io:
-	if (rc)
+	if (rc < 0)
 		lov_io_sub_fini(env, lio, sub);
+
 	return rc;
 }
 
 struct lov_io_sub *lov_sub_get(const struct lu_env *env,
 			       struct lov_io *lio, int index)
 {
-	int rc;
-	struct lov_io_sub *sub = &lio->lis_subs[index];
+	struct lov_io_sub *sub;
+	int rc = 0;
 
-	LASSERT(index < lio->lis_stripe_count);
+	list_for_each_entry(sub, &lio->lis_subios, sub_list) {
+		if (sub->sub_subio_index == index) {
+			rc = 1;
+			break;
+		}
+	}
+
+	if (rc == 0) {
+		sub = lov_sub_alloc(lio, index);
+		if (!sub) {
+			rc = -ENOMEM;
+			goto out;
+		}
 
-	if (!sub->sub_io_initialized) {
-		sub->sub_subio_index = index;
 		rc = lov_io_sub_init(env, lio, sub);
-	} else {
-		rc = 0;
+		if (rc < 0) {
+			lov_sub_free(lio, sub);
+			goto out;
+		}
+
+		list_add_tail(&sub->sub_list, &lio->lis_subios);
+		lio->lis_nr_subios++;
 	}
+out:
 	if (rc < 0)
 		sub = ERR_PTR(rc);
 
@@ -162,6 +173,7 @@ static int lov_page_index(const struct cl_page *page)
 	const struct cl_page_slice *slice;
 
 	slice = cl_page_at(page, &lov_device_type);
+	LASSERT(slice);
 	LASSERT(slice->cpl_obj);
 
 	return cl2lov_page(slice)->lps_index;
@@ -170,28 +182,13 @@ static int lov_page_index(const struct cl_page *page)
 static int lov_io_subio_init(const struct lu_env *env, struct lov_io *lio,
 			     struct cl_io *io)
 {
-	struct lov_stripe_md *lsm;
-	int result;
-
 	LASSERT(lio->lis_object);
-	lsm = lio->lis_object->lo_lsm;
 
-	/*
-	 * Need to be optimized, we can't afford to allocate a piece of memory
-	 * when writing a page. -jay
-	 */
-	lio->lis_subs = kcalloc(lsm->lsm_entries[0]->lsme_stripe_count,
-				sizeof(lio->lis_subs[0]),
-				GFP_KERNEL);
-	if (lio->lis_subs) {
-		lio->lis_nr_subios = lio->lis_stripe_count;
-		lio->lis_single_subio_index = -1;
-		lio->lis_active_subios = 0;
-		result = 0;
-	} else {
-		result = -ENOMEM;
-	}
-	return result;
+	INIT_LIST_HEAD(&lio->lis_subios);
+	lio->lis_single_subio_index = -1;
+	lio->lis_nr_subios = 0;
+
+	return 0;
 }
 
 static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
@@ -200,7 +197,7 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 	io->ci_result = 0;
 	lio->lis_object = obj;
 
-	lio->lis_stripe_count = obj->lo_lsm->lsm_entries[0]->lsme_stripe_count;
+	LASSERT(obj->lo_lsm);
 
 	switch (io->ci_type) {
 	case CIT_READ:
@@ -272,14 +269,21 @@ static void lov_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 {
 	struct lov_io *lio = cl2lov_io(env, ios);
 	struct lov_object *lov = cl2lov(ios->cis_obj);
-	int i;
 
-	if (lio->lis_subs) {
-		for (i = 0; i < lio->lis_nr_subios; i++)
-			lov_io_sub_fini(env, lio, &lio->lis_subs[i]);
-		kvfree(lio->lis_subs);
-		lio->lis_nr_subios = 0;
+	LASSERT(list_empty(&lio->lis_active));
+
+	while (!list_empty(&lio->lis_subios)) {
+		struct lov_io_sub *sub = list_entry(lio->lis_subios.next,
+						    struct lov_io_sub,
+						    sub_list);
+
+		list_del_init(&sub->sub_list);
+		lio->lis_nr_subios--;
+
+		lov_io_sub_fini(env, lio, sub);
+		lov_sub_free(lio, sub);
 	}
+	LASSERT(lio->lis_nr_subios == 0);
 
 	LASSERT(atomic_read(&lov->lo_active_ios) > 0);
 	if (atomic_dec_and_test(&lov->lo_active_ios))
@@ -287,12 +291,13 @@ static void lov_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 }
 
 static void lov_io_sub_inherit(struct lov_io_sub *sub, struct lov_io *lio,
-			       int stripe, loff_t start, loff_t end)
+			       loff_t start, loff_t end)
 {
-	struct cl_io *io = sub->sub_io;
+	struct cl_io *io = &sub->sub_io;
 	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
 	struct cl_io *parent = lio->lis_cl.cis_io;
 	int index = lov_comp_entry(sub->sub_subio_index);
+	int stripe = lov_comp_stripe(sub->sub_subio_index);
 
 	switch (io->ci_type) {
 	case CIT_SETATTR: {
@@ -321,7 +326,7 @@ static void lov_io_sub_inherit(struct lov_io_sub *sub, struct lov_io *lio,
 	}
 	case CIT_FAULT: {
 		struct cl_object *obj = parent->ci_obj;
-		loff_t off = cl_offset(obj, parent->u.ci_fault.ft_index);
+		u64 off = cl_offset(obj, parent->u.ci_fault.ft_index);
 
 		io->u.ci_fault = parent->u.ci_fault;
 		off = lov_size_to_stripe(lsm, index, off, stripe);
@@ -373,11 +378,12 @@ static int lov_io_iter_init(const struct lu_env *env,
 	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
 	struct lov_layout_entry *le;
 	struct lov_io_sub    *sub;
-	u64 endpos;
+	struct lu_extent ext;
 	int rc = 0;
 	int index;
 
-	endpos = lov_offset_mod(lio->lis_endpos, -1);
+	ext.e_start = lio->lis_pos;
+	ext.e_end = lio->lis_endpos;
 
 	index = 0;
 	lov_foreach_layout_entry(lio->lis_object, le) {
@@ -387,11 +393,12 @@ static int lov_io_iter_init(const struct lu_env *env,
 		u64 end;
 
 		index++;
+		if (!lu_extent_is_overlapped(&ext, &le->lle_extent))
+			continue;
 
 		for (stripe = 0; stripe < r0->lo_nr; stripe++) {
 			if (!lov_stripe_intersects(lsm, index - 1, stripe,
-						   lio->lis_pos,
-						   endpos, &start, &end))
+						   &ext, &start, &end))
 				continue;
 
 			if (unlikely(!r0->lo_sub[stripe])) {
@@ -411,10 +418,10 @@ static int lov_io_iter_init(const struct lu_env *env,
 				break;
 			}
 
-			lov_io_sub_inherit(sub, lio, stripe, start, end);
-			rc = cl_io_iter_init(sub->sub_env, sub->sub_io);
+			lov_io_sub_inherit(sub, lio, start, end);
+			rc = cl_io_iter_init(sub->sub_env, &sub->sub_io);
 			if (rc) {
-				cl_io_iter_fini(sub->sub_env, sub->sub_io);
+				cl_io_iter_fini(sub->sub_env, &sub->sub_io);
 				break;
 			}
 
@@ -437,31 +444,50 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	u64 start = io->u.ci_rw.crw_pos;
 	struct lov_stripe_md_entry *lse;
 	unsigned long ssize;
-	loff_t next;
-	int index = 0;
+	int index;
+	u64 next;
 
 	LASSERT(io->ci_type == CIT_READ || io->ci_type == CIT_WRITE);
 
+	if (cl_io_is_append(io))
+		return lov_io_iter_init(env, ios);
+
+	index = lov_lsm_entry(lio->lis_object->lo_lsm, io->u.ci_rw.crw_pos);
+	if (index < 0) { /* non-existing layout component */
+		if (io->ci_type == CIT_READ) {
+			/* TODO: it needs to detect the next component and
+			 * then set the next pos
+			 */
+			io->ci_continue = 0;
+
+			return lov_io_iter_init(env, ios);
+		}
+
+		return -ENODATA;
+	}
+
 	lse = lov_lse(lio->lis_object, index);
 
 	ssize = lse->lsme_stripe_size;
+	lov_do_div64(start, ssize);
+	next = (start + 1) * ssize;
+	if (next <= start * ssize)
+		next = ~0ull;
+
+	LASSERT(io->u.ci_rw.crw_pos >= lse->lsme_extent.e_start);
+	next = min_t(u64, next, lse->lsme_extent.e_end);
+	next = min_t(u64, next, lio->lis_io_endpos);
+
+	io->ci_continue = next < lio->lis_io_endpos;
+	io->u.ci_rw.crw_count = next - io->u.ci_rw.crw_pos;
+	lio->lis_pos = io->u.ci_rw.crw_pos;
+	lio->lis_endpos = io->u.ci_rw.crw_pos + io->u.ci_rw.crw_count;
+
+	CDEBUG(D_VFSTRACE,
+	       "stripe: %llu chunk: [%llu, %llu) %llu\n",
+	       (u64)start, lio->lis_pos, lio->lis_endpos,
+	       (u64)lio->lis_io_endpos);
 
-	/* fast path for common case. */
-	if (lio->lis_nr_subios != 1 && !cl_io_is_append(io)) {
-		lov_do_div64(start, ssize);
-		next = (start + 1) * ssize;
-		if (next <= start * ssize)
-			next = ~0ull;
-
-		io->ci_continue = next < lio->lis_io_endpos;
-		io->u.ci_rw.crw_count = min_t(loff_t, lio->lis_io_endpos,
-					      next) - io->u.ci_rw.crw_pos;
-		lio->lis_pos    = io->u.ci_rw.crw_pos;
-		lio->lis_endpos = io->u.ci_rw.crw_pos + io->u.ci_rw.crw_count;
-		CDEBUG(D_VFSTRACE, "stripe: %llu chunk: [%llu, %llu) %llu\n",
-		       (__u64)start, lio->lis_pos, lio->lis_endpos,
-		       (__u64)lio->lis_io_endpos);
-	}
 	/*
 	 * XXX The following call should be optimized: we know, that
 	 * [lio->lis_pos, lio->lis_endpos) intersects with exactly one stripe.
@@ -477,12 +503,12 @@ static int lov_io_call(const struct lu_env *env, struct lov_io *lio,
 	int rc = 0;
 
 	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
-		rc = iofunc(sub->sub_env, sub->sub_io);
+		rc = iofunc(sub->sub_env, &sub->sub_io);
 		if (rc)
 			break;
 
 		if (parent->ci_result == 0)
-			parent->ci_result = sub->sub_io->ci_result;
+			parent->ci_result = sub->sub_io.ci_result;
 	}
 	return rc;
 }
@@ -539,13 +565,13 @@ static void lov_io_end(const struct lu_env *env, const struct cl_io_slice *ios)
 	struct lov_io_sub *sub;
 
 	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
-		lov_io_end_wrapper(sub->sub_env, sub->sub_io);
+		lov_io_end_wrapper(sub->sub_env, &sub->sub_io);
 
 		parent->u.ci_data_version.dv_data_version +=
-			sub->sub_io->u.ci_data_version.dv_data_version;
+			sub->sub_io.u.ci_data_version.dv_data_version;
 
 		if (!parent->ci_result)
-			parent->ci_result = sub->sub_io->ci_result;
+			parent->ci_result = sub->sub_io.ci_result;
 	}
 }
 
@@ -581,12 +607,18 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	unsigned int pps; /* pages per stripe */
 	struct lov_io_sub *sub;
 	pgoff_t ra_end;
+	u64 offset;
 	u64 suboff;
 	int stripe;
-	int index = 0;
+	int index;
 	int rc;
 
-	stripe = lov_stripe_number(loo->lo_lsm, index, cl_offset(obj, start));
+	offset = cl_offset(obj, start);
+	index = lov_lsm_entry(loo->lo_lsm, offset);
+	if (index < 0)
+		return -ENODATA;
+
+	stripe = lov_stripe_number(loo->lo_lsm, index, offset);
 
 	r0 = lov_r0(loo, index);
 	if (unlikely(!r0->lo_sub[stripe]))
@@ -596,8 +628,8 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	if (IS_ERR(sub))
 		return PTR_ERR(sub);
 
-	lov_stripe_offset(loo->lo_lsm, index, cl_offset(obj, start), stripe, &suboff);
-	rc = cl_io_read_ahead(sub->sub_env, sub->sub_io,
+	lov_stripe_offset(loo->lo_lsm, index, offset, stripe, &suboff);
+	rc = cl_io_read_ahead(sub->sub_env, &sub->sub_io,
 			      cl_index(lovsub2cl(r0->lo_sub[stripe]), suboff),
 			      ra);
 
@@ -623,8 +655,8 @@ static int lov_io_read_ahead(const struct lu_env *env,
 	pps = lov_lse(loo, index)->lsme_stripe_size >> PAGE_SHIFT;
 
 	CDEBUG(D_READA,
-	       DFID " max_index = %lu, pps = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
-	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps,
+	       DFID " max_index = %lu, pps = %u, index = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
+	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps, index,
 	       lov_lse(loo, index)->lsme_stripe_size, stripe, start);
 
 	/* never exceed the end of the stripe */
@@ -659,20 +691,17 @@ static int lov_io_submit(const struct lu_env *env,
 	int index;
 	int rc = 0;
 
-	if (lio->lis_active_subios == 1) {
+	if (lio->lis_nr_subios == 1) {
 		int idx = lio->lis_single_subio_index;
 
-		LASSERT(idx < lio->lis_nr_subios);
 		sub = lov_sub_get(env, lio, idx);
 		LASSERT(!IS_ERR(sub));
-		LASSERT(sub->sub_io == &lio->lis_single_subio);
-		rc = cl_io_submit_rw(sub->sub_env, sub->sub_io,
+		LASSERT(sub == &lio->lis_single_subio);
+		rc = cl_io_submit_rw(sub->sub_env, &sub->sub_io,
 				     crt, queue);
 		return rc;
 	}
 
-	LASSERT(lio->lis_subs);
-
 	cl_page_list_init(plist);
 	while (qin->pl_nr > 0) {
 		struct cl_2queue *cl2q = &lov_env_info(env)->lti_cl2q;
@@ -693,7 +722,7 @@ static int lov_io_submit(const struct lu_env *env,
 
 		sub = lov_sub_get(env, lio, index);
 		if (!IS_ERR(sub)) {
-			rc = cl_io_submit_rw(sub->sub_env, sub->sub_io,
+			rc = cl_io_submit_rw(sub->sub_env, &sub->sub_io,
 					     crt, cl2q);
 		} else {
 			rc = PTR_ERR(sub);
@@ -724,20 +753,17 @@ static int lov_io_commit_async(const struct lu_env *env,
 	struct cl_page *page;
 	int rc = 0;
 
-	if (lio->lis_active_subios == 1) {
+	if (lio->lis_nr_subios == 1) {
 		int idx = lio->lis_single_subio_index;
 
-		LASSERT(idx < lio->lis_nr_subios);
 		sub = lov_sub_get(env, lio, idx);
 		LASSERT(!IS_ERR(sub));
-		LASSERT(sub->sub_io == &lio->lis_single_subio);
-		rc = cl_io_commit_async(sub->sub_env, sub->sub_io, queue,
+		LASSERT(sub == &lio->lis_single_subio);
+		rc = cl_io_commit_async(sub->sub_env, &sub->sub_io, queue,
 					from, to, cb);
 		return rc;
 	}
 
-	LASSERT(lio->lis_subs);
-
 	cl_page_list_init(plist);
 	while (queue->pl_nr > 0) {
 		int stripe_to = to;
@@ -761,7 +787,7 @@ static int lov_io_commit_async(const struct lu_env *env,
 
 		sub = lov_sub_get(env, lio, index);
 		if (!IS_ERR(sub)) {
-			rc = cl_io_commit_async(sub->sub_env, sub->sub_io,
+			rc = cl_io_commit_async(sub->sub_env, &sub->sub_io,
 						plist, from, stripe_to, cb);
 		} else {
 			rc = PTR_ERR(sub);
@@ -797,7 +823,8 @@ static int lov_io_fault_start(const struct lu_env *env,
 	sub = lov_sub_get(env, lio, lov_page_index(fio->ft_page));
 	if (IS_ERR(sub))
 		return PTR_ERR(sub);
-	sub->sub_io->u.ci_fault.ft_nob = fio->ft_nob;
+	sub->sub_io.u.ci_fault.ft_nob = fio->ft_nob;
+
 	return lov_io_start(env, ios);
 }
 
@@ -810,7 +837,7 @@ static void lov_io_fsync_end(const struct lu_env *env,
 
 	*written = 0;
 	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
-		struct cl_io *subio = sub->sub_io;
+		struct cl_io *subio = &sub->sub_io;
 
 		lov_io_end_wrapper(sub->sub_env, subio);
 
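
Note on the lov_io_rw_iter_init() hunk above: the new code limits each
iteration to one chunk that ends at the next stripe boundary, the end of the
layout component, or the end of the whole IO, whichever comes first. The
userspace sketch below restates that arithmetic under stated assumptions: the
names struct ext, min_u64() and chunk_end() are hypothetical and are not part
of the patch, and the component extent is assumed to be half-open,
[e_start, e_end).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a half-open component extent [e_start, e_end). */
struct ext {
	uint64_t e_start;
	uint64_t e_end;
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Return the end of the chunk that starts at pos: the next stripe
 * boundary, clamped to the component end and to the end of the whole IO.
 */
static uint64_t chunk_end(uint64_t pos, uint64_t ssize,
			  const struct ext *comp, uint64_t io_end)
{
	uint64_t stripe = pos / ssize;		/* stripe number of pos */
	uint64_t next = (stripe + 1) * ssize;	/* next stripe boundary */

	if (next <= stripe * ssize)		/* multiplication wrapped */
		next = UINT64_MAX;

	next = min_u64(next, comp->e_end);
	return min_u64(next, io_end);
}
```

In this sketch the caller would continue the loop while the returned value is
still below io_end, which mirrors how the hunk sets ci_continue.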
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index cc08e96..ba31be4 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -76,7 +76,7 @@ static struct lov_sublock_env *lov_sublock_env_get(const struct lu_env *env,
 		sub = lov_sub_get(env, lio, lls->sub_index);
 		if (!IS_ERR(sub)) {
 			subenv->lse_env = sub->sub_env;
-			subenv->lse_io  = sub->sub_io;
+			subenv->lse_io = &sub->sub_io;
 		} else {
 			subenv = (void *)sub;
 		}
@@ -114,52 +114,65 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 					  const struct cl_object *obj,
 					  struct cl_lock *lock)
 {
-	struct lov_object *loo = cl2lov(obj);
-	struct lov_layout_raid0 *r0;
-	struct lov_lock	*lovlck;
+	struct lov_object *lov = cl2lov(obj);
+	struct lov_lock *lovlck;
+	struct lu_extent ext;
 	int result = 0;
-	int index = 0;
+	int index;
 	int i;
 	int nr;
 	u64 start;
 	u64 end;
-	u64 file_start;
-	u64 file_end;
-
-	CDEBUG(D_INODE, "%p: lock/io FID " DFID "/" DFID ", lock/io clobj %p/%p\n",
-	       loo, PFID(lu_object_fid(lov2lu(loo))),
-	       PFID(lu_object_fid(&obj->co_lu)),
-	       lov2cl(loo), obj);
-
-	file_start = cl_offset(lov2cl(loo), lock->cll_descr.cld_start);
-	file_end   = cl_offset(lov2cl(loo), lock->cll_descr.cld_end + 1) - 1;
-
-	r0 = lov_r0(loo, index);
-	for (i = 0, nr = 0; i < r0->lo_nr; i++) {
-		/*
-		 * XXX for wide striping smarter algorithm is desirable,
-		 * breaking out of the loop, early.
-		 */
-		if (likely(r0->lo_sub[i]) && /* spare layout */
-		    lov_stripe_intersects(loo->lo_lsm, index, i,
-					  file_start, file_end, &start, &end))
-			nr++;
+
+	ext.e_start = cl_offset(obj, lock->cll_descr.cld_start);
+	if (lock->cll_descr.cld_end == CL_PAGE_EOF)
+		ext.e_end = OBD_OBJECT_EOF;
+	else
+		ext.e_end = cl_offset(obj, lock->cll_descr.cld_end + 1);
+
+	nr = 0;
+	for (index = lov_lsm_entry(lov->lo_lsm, ext.e_start);
+	     index != -1 && index < lov->lo_lsm->lsm_entry_count; index++) {
+		struct lov_layout_raid0 *r0 = lov_r0(lov, index);
+
+		/* assume lsm entries are sorted. */
+		if (!lu_extent_is_overlapped(&ext,
+					     &lov_lse(lov, index)->lsme_extent))
+			break;
+
+		for (i = 0; i < r0->lo_nr; i++) {
+			if (likely(r0->lo_sub[i]) && /* sparse layout */
+			    lov_stripe_intersects(lov->lo_lsm, index, i,
+						  &ext, &start, &end))
+				nr++;
+		}
 	}
-	LASSERT(nr > 0);
+	if (nr == 0)
+		return ERR_PTR(-EINVAL);
+
 	lovlck = kvzalloc(offsetof(struct lov_lock, lls_sub[nr]),
 				 GFP_NOFS);
 	if (!lovlck)
 		return ERR_PTR(-ENOMEM);
 
 	lovlck->lls_nr = nr;
-	for (i = 0, nr = 0; i < r0->lo_nr; ++i) {
-		if (likely(r0->lo_sub[i]) &&
-		    lov_stripe_intersects(loo->lo_lsm, index, i,
-					  file_start, file_end, &start, &end)) {
+	nr = 0;
+	for (index = lov_lsm_entry(lov->lo_lsm, ext.e_start);
+	     index < lov->lo_lsm->lsm_entry_count; index++) {
+		struct lov_layout_raid0 *r0 = lov_r0(lov, index);
+
+		/* assume lsm entries are sorted. */
+		if (!lu_extent_is_overlapped(&ext,
+					     &lov_lse(lov, index)->lsme_extent))
+			break;
+		for (i = 0; i < r0->lo_nr; ++i) {
 			struct lov_lock_sub *lls = &lovlck->lls_sub[nr];
-			struct cl_lock_descr *descr;
+			struct cl_lock_descr *descr = &lls->sub_lock.cll_descr;
 
-			descr = &lls->sub_lock.cll_descr;
+			if (unlikely(!r0->lo_sub[i]) ||
+			    !lov_stripe_intersects(lov->lo_lsm, index, i,
+						   &ext, &start, &end))
+				continue;
 
 			LASSERT(!descr->cld_obj);
 			descr->cld_obj   = lovsub2cl(r0->lo_sub[i]);
@@ -267,8 +280,8 @@ static void lov_lock_cancel(const struct lu_env *env,
 			cl_lock_cancel(subenv->lse_env, sublock);
 		} else {
 			CL_LOCK_DEBUG(D_ERROR, env, slice->cls_lock,
-				      "%s fails with %ld.\n",
-				      __func__, PTR_ERR(subenv));
+				      "lov_lock_cancel fails with %ld.\n",
+				      PTR_ERR(subenv));
 		}
 	}
 }
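
Note on the lov_lock_sub_init() hunk above: both passes start at the component
that contains the start of the lock extent and stop at the first component
that no longer overlaps it, which relies on the entries being sorted. The
sketch below illustrates that walk in userspace C. It is only an
approximation: struct ext, ext_overlaps(), find_entry() and
count_overlapping() are hypothetical names, and the overlap test assumes
half-open extents, which may differ from what lu_extent_is_overlapped()
actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open extent [e_start, e_end). */
struct ext {
	uint64_t e_start;
	uint64_t e_end;
};

/* Assumed overlap semantics for half-open extents. */
static bool ext_overlaps(const struct ext *a, const struct ext *b)
{
	return a->e_start < b->e_end && b->e_start < a->e_end;
}

/* Index of the component containing off, or -1 if there is none. */
static int find_entry(const struct ext *comps, int n, uint64_t off)
{
	for (int i = 0; i < n; i++)
		if (off >= comps[i].e_start && off < comps[i].e_end)
			return i;
	return -1;
}

/*
 * Count the components touched by req. The walk starts at the component
 * holding req->e_start and stops at the first non-overlapping one, so it
 * is only correct when comps[] is sorted by offset.
 */
static int count_overlapping(const struct ext *comps, int n,
			     const struct ext *req)
{
	int nr = 0;

	for (int i = find_entry(comps, n, req->e_start);
	     i != -1 && i < n; i++) {
		if (!ext_overlaps(req, &comps[i]))
			break;
		nr++;
	}
	return nr;
}
```

A count of zero corresponds to the case where the hunk returns
ERR_PTR(-EINVAL) instead of asserting.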
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 38258ce..337ded6 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -130,14 +130,13 @@ static struct cl_object *lov_sub_find(const struct lu_env *env,
 
 static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 			struct cl_object *subobj, struct lov_layout_raid0 *r0,
-			int idx)
+			struct lov_oinfo *oinfo, int idx)
 {
 	int stripe = lov_comp_stripe(idx);
 	int entry = lov_comp_entry(idx);
 	struct cl_object_header *hdr;
 	struct cl_object_header *subhdr;
 	struct cl_object_header *parent;
-	struct lov_oinfo	*oinfo;
 	int result;
 
 	if (OBD_FAIL_CHECK(OBD_FAIL_LOV_INIT)) {
@@ -155,11 +154,10 @@ static int lov_init_sub(const struct lu_env *env, struct lov_object *lov,
 	hdr    = cl_object_header(lov2cl(lov));
 	subhdr = cl_object_header(subobj);
 
-	oinfo = lov->lo_lsm->lsm_entries[0]->lsme_oinfo[idx];
 	CDEBUG(D_INODE,
 	       DFID "@%p[%d:%d] -> " DFID "@%p: ostid: " DOSTID " ost idx: %d gen: %d\n",
-	       PFID(&subhdr->coh_lu.loh_fid), subhdr, entry, stripe,
-	       PFID(&hdr->coh_lu.loh_fid), hdr, POSTID(&oinfo->loi_oi),
+	       PFID(lu_object_fid(&subobj->co_lu)), subhdr, entry, stripe,
+	       PFID(lu_object_fid(lov2lu(lov))), hdr, POSTID(&oinfo->loi_oi),
 	       oinfo->loi_ost_idx, oinfo->loi_ost_gen);
 
 	/* reuse ->coh_attr_guard to protect coh_parent change */
@@ -221,14 +219,13 @@ static int lov_page_slice_fixup(struct lov_object *lov,
 
 static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 			  struct lov_object *lov, int index,
-			  const struct cl_object_conf *conf,
 			  struct lov_layout_raid0 *r0)
 {
 	struct lov_stripe_md_entry *lse = lov_lse(lov, index);
-	struct cl_object *stripe;
 	struct lov_thread_info *lti = lov_env_info(env);
 	struct cl_object_conf *subconf = &lti->lti_stripe_conf;
 	struct lu_fid *ofid = &lti->lti_fid;
+	struct cl_object *stripe;
 	int result;
 	int psz;
 	int i;
@@ -238,20 +235,21 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 	LASSERT(r0->lo_nr <= lov_targets_nr(dev));
 
 	r0->lo_sub = kvzalloc(r0->lo_nr * sizeof(r0->lo_sub[0]),
-				     GFP_NOFS);
+			      GFP_KERNEL);
 	if (!r0->lo_sub)
 		return -ENOMEM;
 
 	psz = 0;
 	result = 0;
-	subconf->coc_inode = conf->coc_inode;
+	memset(subconf, 0, sizeof(*subconf));
+
 	/*
 	 * Create stripe cl_objects.
 	 */
 	for (i = 0; i < r0->lo_nr; ++i) {
 		struct lov_oinfo *oinfo = lse->lsme_oinfo[i];
+		int ost_idx = oinfo->loi_ost_idx;
 		struct cl_device *subdev;
-		int ost_idx;
 
 		if (lov_oinfo_is_dummy(oinfo))
 			continue;
@@ -261,7 +259,6 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 		if (result != 0)
 			goto out;
 
-		ost_idx = oinfo->loi_ost_idx;
 		if (!dev->ld_target[ost_idx]) {
 			CERROR("%s: OST %04x is not initialized\n",
 			       lov2obd(dev->ld_lov)->obd_name, ost_idx);
@@ -282,7 +279,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
 			goto out;
 		}
 
-		result = lov_init_sub(env, lov, stripe, r0,
+		result = lov_init_sub(env, lov, stripe, r0, oinfo,
 				      lov_comp_index(index, i));
 		if (result == -EAGAIN) { /* try again */
 			--i;
@@ -309,15 +306,17 @@ static int lov_init_composite(const struct lu_env *env, struct lov_device *dev,
 			      union lov_layout_state *state)
 {
 	struct lov_layout_composite *comp = &state->composite;
-	unsigned int entry_count = 1;
+	unsigned int entry_count;
 	unsigned int psz = 0;
 	int result = 0;
 	int i;
 
+	LASSERT(lsm->lsm_entry_count > 0);
 	LASSERT(!lov->lo_lsm);
 	lov->lo_lsm = lsm_addref(lsm);
 	lov->lo_layout_invalid = true;
 
+	entry_count = lsm->lsm_entry_count;
 	comp->lo_entry_count = entry_count;
 
 	comp->lo_entries = kcalloc(entry_count, sizeof(*comp->lo_entries),
@@ -328,8 +327,8 @@ static int lov_init_composite(const struct lu_env *env, struct lov_device *dev,
 	for (i = 0; i < entry_count; i++) {
 		struct lov_layout_entry *le = &comp->lo_entries[i];
 
-		result = lov_init_raid0(env, dev, lov, i, conf,
-					&le->lle_raid0);
+		le->lle_extent = lsm->lsm_entries[i]->lsme_extent;
+		result = lov_init_raid0(env, dev, lov, i, &le->lle_raid0);
 		if (result < 0)
 			break;
 
@@ -364,31 +363,30 @@ static struct cl_object *lov_find_subobj(const struct lu_env *env,
 	struct lov_thread_info *lti = lov_env_info(env);
 	struct lu_fid *ofid = &lti->lti_fid;
 	int stripe = lov_comp_stripe(index);
+	int entry = lov_comp_entry(index);
+	struct cl_object *result = NULL;
 	struct cl_device *subdev;
-	struct cl_object *result;
 	struct lov_oinfo *oinfo;
 	int ost_idx;
 	int rc;
 
-	if (lov->lo_type != LLT_COMP) {
-		result = NULL;
+	if (lov->lo_type != LLT_COMP)
+		goto out;
+
+	if (entry >= lsm->lsm_entry_count ||
+	    stripe >= lsm->lsm_entries[entry]->lsme_stripe_count)
 		goto out;
-	}
 
-	oinfo = lsm->lsm_entries[0]->lsme_oinfo[stripe];
+	oinfo = lsm->lsm_entries[entry]->lsme_oinfo[stripe];
 	ost_idx = oinfo->loi_ost_idx;
 	rc = ostid_to_fid(ofid, &oinfo->loi_oi, ost_idx);
-	if (rc) {
-		result = NULL;
+	if (rc)
 		goto out;
-	}
 
 	subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
 	result = lov_sub_find(env, subdev, ofid, NULL);
 out:
-	if (!result)
-		result = ERR_PTR(-EINVAL);
-	return result;
+	return result ? result : ERR_PTR(-EINVAL);
 }
 
 static int lov_delete_empty(const struct lu_env *env, struct lov_object *lov,
@@ -567,8 +565,8 @@ static int lov_print_composite(const struct lu_env *env, void *cookie,
 	for (i = 0; i < lsm->lsm_entry_count; i++) {
 		struct lov_stripe_md_entry *lse = lsm->lsm_entries[i];
 
-		(*p)(env, cookie, ": { 0x%08X, %u, %u, %u, %u }\n",
-		     lse->lsme_magic,
+		(*p)(env, cookie, DEXT ": { 0x%08X, %u, %u, %u, %u }\n",
+		     PEXT(&lse->lsme_extent), lse->lsme_magic,
 		     lse->lsme_id, lse->lsme_layout_gen,
 		     lse->lsme_stripe_count, lse->lsme_stripe_size);
 		lov_print_raid0(env, cookie, p, lov_r0(lov, i));
@@ -584,10 +582,10 @@ static int lov_print_released(const struct lu_env *env, void *cookie,
 	struct lov_stripe_md	*lsm = lov->lo_lsm;
 
 	(*p)(env, cookie,
-	     "released: %s, lsm{%p 0x%08X %d %u %u}:\n",
+	     "released: %s, lsm{%p 0x%08X %d %u}:\n",
 	     lov->lo_layout_invalid ? "invalid" : "valid", lsm,
 	     lsm->lsm_magic, atomic_read(&lsm->lsm_refc),
-	     lsm->lsm_entries[0]->lsme_stripe_count, lsm->lsm_layout_gen);
+	     lsm->lsm_layout_gen);
 	return 0;
 }
 
@@ -601,6 +599,7 @@ static int lov_print_released(const struct lu_env *env, void *cookie,
 static int lov_attr_get_empty(const struct lu_env *env, struct cl_object *obj,
 			      struct cl_attr *attr)
 {
+	attr->cat_blocks = 0;
 	return 0;
 }
 
@@ -659,16 +658,18 @@ static int lov_attr_get_composite(const struct lu_env *env,
 	int result = 0;
 	int index = 0;
 
-	attr->cat_blocks = 0;
 	attr->cat_size = 0;
+	attr->cat_blocks = 0;
 	lov_foreach_layout_entry(lov, entry) {
 		struct lov_layout_raid0 *r0 = &entry->lle_raid0;
 		struct cl_attr *lov_attr = &r0->lo_attr;
 
 		result = lov_attr_get_raid0(env, lov, index, r0);
-		if (result)
+		if (result != 0)
 			break;
 
+		index++;
+
 		/* merge results */
 		attr->cat_blocks += lov_attr->cat_blocks;
 		if (attr->cat_size < lov_attr->cat_size)
@@ -742,13 +743,15 @@ static enum lov_layout_type lov_type(struct lov_stripe_md *lsm)
 	if (!lsm)
 		return LLT_EMPTY;
 
-	if (lsm->lsm_magic == LOV_MAGIC_COMP_V1)
-		return LLT_EMPTY;
-
 	if (lsm->lsm_is_released)
 		return LLT_RELEASED;
 
-	return LLT_COMP;
+	if (lsm->lsm_magic == LOV_MAGIC_V1 ||
+	    lsm->lsm_magic == LOV_MAGIC_V3 ||
+	    lsm->lsm_magic == LOV_MAGIC_COMP_V1)
+		return LLT_COMP;
+
+	return LLT_EMPTY;
 }
 
 static inline void lov_conf_freeze(struct lov_object *lov)
@@ -926,6 +929,8 @@ int lov_object_init(const struct lu_env *env, struct lu_object *obj,
 				   cconf->u.coc_layout.lb_len);
 		if (IS_ERR(lsm))
 			return PTR_ERR(lsm);
+
+		dump_lsm(D_INODE, lsm);
 	}
 
 	/* no locking is necessary, as object is being created */
@@ -1090,8 +1095,8 @@ int lov_lock_init(const struct lu_env *env, struct cl_object *obj,
  * over which the mapping is spread
  *
  * \param lsm [in]		striping information for the file
- * \param fm_start [in]		logical start of mapping
- * \param fm_end [in]		logical end of mapping
+ * \param index [in]		stripe component index
+ * \param ext [in]		logical extent of mapping
  * \param start_stripe [in]	starting stripe of the mapping
  * \param stripe_count [out]	the number of stripes across which to map is
  *				returned
@@ -1099,7 +1104,7 @@ int lov_lock_init(const struct lu_env *env, struct cl_object *obj,
  * \retval last_stripe		return the last stripe of the mapping
  */
 static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, int index,
-				   u64 fm_start, u64 fm_end,
+				   struct lu_extent *ext,
 				   int start_stripe, int *stripe_count)
 {
 	struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index];
@@ -1108,7 +1113,7 @@ static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, int index,
 	u64 obd_end;
 	int i, j;
 
-	if (fm_end - fm_start >
+	if (ext->e_end - ext->e_start >
 	    lsme->lsme_stripe_size * lsme->lsme_stripe_count) {
 		last_stripe = (start_stripe < 1 ? lsme->lsme_stripe_count - 1 :
 						  start_stripe - 1);
@@ -1116,7 +1121,7 @@ static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, int index,
 	} else {
 		for (j = 0, i = start_stripe; j < lsme->lsme_stripe_count;
 		     i = (i + 1) % lsme->lsme_stripe_count, j++) {
-			if (lov_stripe_intersects(lsm, index, i, fm_start, fm_end,
+			if (lov_stripe_intersects(lsm, index, i, ext,
 						  &obd_start, &obd_end) == 0)
 				break;
 		}
@@ -1170,13 +1175,13 @@ static void fiemap_prepare_and_copy_exts(struct fiemap *fiemap,
  *
  * \param fiemap [in]		fiemap request header
  * \param lsm [in]		striping information for the file
- * \param fm_start [in]		logical start of mapping
- * \param fm_end [in]		logical end of mapping
+ * \param index [in]		stripe component index
+ * \param ext [in]		logical extent of mapping
  * \param start_stripe [out]	starting stripe will be returned in this
  */
 static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 				     struct lov_stripe_md *lsm,
-				     int index, u64 fm_start, u64 fm_end,
+				     int index, struct lu_extent *ext,
 				     int *start_stripe)
 {
 	struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index];
@@ -1209,7 +1214,7 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 	 * If we have finished mapping on previous device, shift logical
 	 * offset to start of next device
 	 */
-	if (lov_stripe_intersects(lsm, index, stripe_no, fm_start, fm_end,
+	if (lov_stripe_intersects(lsm, index, stripe_no, ext,
 				  &lun_start, &lun_end) != 0 &&
 	    local_end < lun_end) {
 		fm_end_offset = local_end;
@@ -1227,16 +1232,15 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap,
 
 struct fiemap_state {
 	struct fiemap		*fs_fm;
-	u64			fs_start;
+	struct lu_extent	fs_ext;
 	u64			fs_length;
-	u64			fs_end;
 	u64			fs_end_offset;
 	int			fs_cur_extent;
 	int			fs_cnt_need;
 	int			fs_start_stripe;
 	int			fs_last_stripe;
 	bool			fs_device_done;
-	bool			fs_finish;
+	bool			fs_finish_stripe;
 	bool			fs_enough;
 };
 
@@ -1264,8 +1268,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 
 	fs->fs_device_done = false;
 	/* Find out range of mapping on this stripe */
-	if ((lov_stripe_intersects(lsm, index, stripeno,
-				   fs->fs_start, fs->fs_end,
+	if ((lov_stripe_intersects(lsm, index, stripeno, &fs->fs_ext,
 				   &lun_start, &obd_object_end)) == 0)
 		return 0;
 
@@ -1279,16 +1282,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 	if (fs->fs_end_offset != 0 && stripeno == fs->fs_start_stripe)
 		lun_start = fs->fs_end_offset;
 
-	lun_end = fs->fs_length;
-	if (lun_end != ~0ULL) {
-		/* Handle fs->fs_start + fs->fs_length overflow */
-		if (fs->fs_start + fs->fs_length < fs->fs_start)
-			fs->fs_length = ~0ULL - fs->fs_start;
-		lun_end = lov_size_to_stripe(lsm, index,
-					     fs->fs_start + fs->fs_length,
-					     stripeno);
-	}
-
+	lun_end = lov_size_to_stripe(lsm, index, fs->fs_ext.e_end, stripeno);
 	if (lun_start == lun_end)
 		return 0;
 
@@ -1316,6 +1310,11 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 		lun_start += len_mapped_single_call;
 		fs->fs_fm->fm_length = req_fm_len - len_mapped_single_call;
 		req_fm_len = fs->fs_fm->fm_length;
+		/*
+		 * If we've already collected enough extents, request one
+		 * more to see whether we happened to reach the end of the
+		 * available extents, so that FIEMAP_EXTENT_LAST can be set.
+		 */
 		fs->fs_fm->fm_extent_count = fs->fs_enough ?
 					     1 : fs->fs_cnt_need;
 		fs->fs_fm->fm_mapped_extents = 0;
@@ -1357,7 +1356,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 			 */
 			if (stripeno == fs->fs_last_stripe) {
 				fiemap->fm_mapped_extents = 0;
-				fs->fs_finish = true;
+				fs->fs_finish_stripe = true;
 				goto obj_put;
 			}
 			break;
@@ -1366,7 +1365,6 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 			 * We've collected enough extents and there are
 			 * more extents after it.
 			 */
-			fs->fs_finish = true;
 			goto obj_put;
 		}
 
@@ -1410,7 +1408,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj,
 	} while (!ost_done && !ost_eof);
 
 	if (stripeno == fs->fs_last_stripe)
-		fs->fs_finish = true;
+		fs->fs_finish_stripe = true;
 obj_put:
 	cl_object_put(env, subobj);
 
@@ -1436,26 +1434,35 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 			     struct fiemap *fiemap, size_t *buflen)
 {
 	unsigned int buffer_size = FIEMAP_BUFFER_SIZE;
+	struct lov_stripe_md_entry *lsme;
 	struct fiemap *fm_local = NULL;
 	struct lov_stripe_md *lsm;
-	int rc = 0;
-	int entry = 0;
-	int cur_stripe;
+	loff_t whole_start;
+	loff_t whole_end;
+	int entry;
+	int start_entry;
+	int end_entry;
+	int cur_stripe = 0;
 	int stripe_count;
+	int rc = 0;
 	struct fiemap_state fs = { NULL };
 
 	lsm = lov_lsm_addref(cl2lov(obj));
 	if (!lsm)
 		return -ENODATA;
 
-	/**
-	 * If the stripe_count > 1 and the application does not understand
-	 * DEVICE_ORDER flag, it cannot interpret the extents correctly.
-	 */
-	if (lsm->lsm_entries[0]->lsme_stripe_count > 1 &&
-	    !(fiemap->fm_flags & FIEMAP_FLAG_DEVICE_ORDER)) {
-		rc = -ENOTSUPP;
-		goto out;
+	if (!(fiemap->fm_flags & FIEMAP_FLAG_DEVICE_ORDER)) {
+		/*
+		 * If the file has more than one entry or stripe and the
+		 * application does not understand the DEVICE_ORDER flag,
+		 * it cannot interpret the extents correctly.
+		 */
+		if (lsm->lsm_entry_count > 1 ||
+		    (lsm->lsm_entry_count == 1 &&
+		     lsm->lsm_entries[0]->lsme_stripe_count > 1)) {
+			rc = -ENOTSUPP;
+			goto out_lsm;
+		}
 	}
 
 	if (lsm->lsm_is_released) {
@@ -1478,49 +1485,19 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 				FIEMAP_EXTENT_UNKNOWN | FIEMAP_EXTENT_LAST;
 		}
 		rc = 0;
-		goto out;
+		goto out_lsm;
 	}
 
+	/* Shrink buffer_size if fm_extent_count extents need less space. */
 	if (fiemap_count_to_size(fiemap->fm_extent_count) < buffer_size)
 		buffer_size = fiemap_count_to_size(fiemap->fm_extent_count);
 
 	fm_local = kvzalloc(buffer_size, GFP_NOFS);
 	if (!fm_local) {
 		rc = -ENOMEM;
-		goto out;
-	}
-	fs.fs_fm = fm_local;
-	fs.fs_cnt_need = fiemap_size_to_count(buffer_size);
-
-	fs.fs_start = fiemap->fm_start;
-	/* fs_start is beyond the end of the file */
-	if (fs.fs_start > fmkey->lfik_oa.o_size) {
-		rc = -EINVAL;
-		goto out;
-	}
-	/* Calculate start stripe, last stripe and length of mapping */
-	fs.fs_start_stripe = lov_stripe_number(lsm, 0, fs.fs_start);
-	fs.fs_end = (fs.fs_length == ~0ULL) ? fmkey->lfik_oa.o_size :
-					      fs.fs_start + fs.fs_length - 1;
-	/* If fs_length != ~0ULL but fs_start+fs_length-1 exceeds file size */
-	if (fs.fs_end > fmkey->lfik_oa.o_size) {
-		fs.fs_end = fmkey->lfik_oa.o_size;
-		fs.fs_length = fs.fs_end - fs.fs_start;
+		goto out_lsm;
 	}
 
-	fs.fs_last_stripe = fiemap_calc_last_stripe(lsm, entry,
-						    fs.fs_start, fs.fs_end,
-						    fs.fs_start_stripe,
-						    &stripe_count);
-	fs.fs_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, entry,
-						     fs.fs_start, fs.fs_end,
-						     &fs.fs_start_stripe);
-	if (fs.fs_end_offset == -EINVAL) {
-		rc = -EINVAL;
-		goto out;
-	}
-
-
 	/**
 	 * Requested extent count exceeds the fiemap buffer size, shrink our
 	 * ambition.
@@ -1530,27 +1507,88 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 	if (!fiemap->fm_extent_count)
 		fs.fs_cnt_need = 0;
 
-	fs.fs_finish = false;
 	fs.fs_enough = false;
 	fs.fs_cur_extent = 0;
+	fs.fs_fm = fm_local;
+	fs.fs_cnt_need = fiemap_size_to_count(buffer_size);
+
+	whole_start = fiemap->fm_start;
+	/* whole_start is beyond the end of the file */
+	if (whole_start > fmkey->lfik_oa.o_size) {
+		rc = -EINVAL;
+		goto out_fm_local;
+	}
+	whole_end = (fiemap->fm_length == OBD_OBJECT_EOF) ?
+		     fmkey->lfik_oa.o_size :
+		     whole_start + fiemap->fm_length - 1;
+	/**
+	 * If fiemap->fm_length != OBD_OBJECT_EOF but whole_end exceeds file
+	 * size
+	 */
+	if (whole_end > fmkey->lfik_oa.o_size)
+		whole_end = fmkey->lfik_oa.o_size;
+
+	start_entry = lov_lsm_entry(lsm, whole_start);
+	end_entry = lov_lsm_entry(lsm, whole_end);
+	if (end_entry == -1)
+		end_entry = lsm->lsm_entry_count - 1;
+
+	if (start_entry == -1 || end_entry == -1) {
+		rc = -EINVAL;
+		goto out_fm_local;
+	}
+
+	for (entry = start_entry; entry <= end_entry; entry++) {
+		lsme = lsm->lsm_entries[entry];
+
+		if (entry == start_entry)
+			fs.fs_ext.e_start = whole_start;
+		else
+			fs.fs_ext.e_start = lsme->lsme_extent.e_start;
+		if (entry == end_entry)
+			fs.fs_ext.e_end = whole_end;
+		else
+			fs.fs_ext.e_end = lsme->lsme_extent.e_end - 1;
+		fs.fs_length = fs.fs_ext.e_end - fs.fs_ext.e_start + 1;
+
+		/* Calculate start stripe, last stripe and length of mapping */
+		fs.fs_start_stripe = lov_stripe_number(lsm, entry,
+						       fs.fs_ext.e_start);
+		fs.fs_last_stripe = fiemap_calc_last_stripe(lsm, entry,
+							    &fs.fs_ext,
+							    fs.fs_start_stripe,
+							    &stripe_count);
+		fs.fs_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, entry,
+							     &fs.fs_ext,
+							     &fs.fs_start_stripe);
+		/* Check each stripe */
+		for (cur_stripe = fs.fs_start_stripe; stripe_count > 0;
+		     --stripe_count,
+		     cur_stripe = (cur_stripe + 1) % lsme->lsme_stripe_count) {
+			rc = fiemap_for_stripe(env, obj, lsm, fiemap, buflen,
+					       fmkey, entry, cur_stripe, &fs);
+			if (rc < 0)
+				goto out_fm_local;
+			if (fs.fs_enough)
+				goto finish;
+			if (fs.fs_finish_stripe)
+				break;
+		} /* for each stripe */
+	} /* for covering layout component */
 
-	/* Check each stripe */
-	for (cur_stripe = fs.fs_start_stripe; stripe_count > 0;
-	     --stripe_count,
-	     cur_stripe = (cur_stripe + 1) %
-			  lsm->lsm_entries[0]->lsme_stripe_count) {
-		rc = fiemap_for_stripe(env, obj, lsm, fiemap, buflen,
-				       fmkey, 0, cur_stripe, &fs);
-		if (rc < 0)
-			goto out;
-		if (fs.fs_finish)
-			break;
-	} /* for each stripe */
+	/*
+	 * We've traversed all components; step @entry back to the last
+	 * component entry, which the last-stripe check below uses.
+	 */
+	entry--;
+finish:
 	/*
 	 * Indicate that we are returning device offsets unless file just has
 	 * single stripe
 	 */
-	if (lsm->lsm_entries[0]->lsme_stripe_count > 1)
+	if (lsm->lsm_entry_count > 1 ||
+	    (lsm->lsm_entry_count == 1 &&
+	     lsm->lsm_entries[0]->lsme_stripe_count > 1))
 		fiemap->fm_flags |= FIEMAP_FLAG_DEVICE_ORDER;
 
 	if (!fiemap->fm_extent_count)
@@ -1565,8 +1603,9 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 							FIEMAP_EXTENT_LAST;
 skip_last_device_calc:
 	fiemap->fm_mapped_extents = fs.fs_cur_extent;
-out:
+out_fm_local:
 	kvfree(fm_local);
+out_lsm:
 	lov_lsm_put(lsm);
 	return rc;
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_offset.c b/drivers/staging/lustre/lustre/lov/lov_offset.c
index 513f1fd..ab02c34 100644
--- a/drivers/staging/lustre/lustre/lov/lov_offset.c
+++ b/drivers/staging/lustre/lustre/lov/lov_offset.c
@@ -225,9 +225,19 @@ u64 lov_size_to_stripe(struct lov_stripe_md *lsm, int index, u64 file_size,
  * stripe does intersect with the lov extent.
  */
 int lov_stripe_intersects(struct lov_stripe_md *lsm, int index, int stripeno,
-			  u64 start, u64 end, u64 *obd_start, u64 *obd_end)
+			  struct lu_extent *ext, u64 *obd_start, u64 *obd_end)
 {
+	struct lov_stripe_md_entry *entry = lsm->lsm_entries[index];
 	int start_side, end_side;
+	u64 start, end;
+
+	if (!lu_extent_is_overlapped(ext, &entry->lsme_extent))
+		return 0;
+
+	start = max_t(u64, ext->e_start, entry->lsme_extent.e_start);
+	end = min_t(u64, ext->e_end, entry->lsme_extent.e_end);
+	if (end != OBD_OBJECT_EOF)
+		end--;
 
 	start_side = lov_stripe_offset(lsm, index, start, stripeno, obd_start);
 	end_side = lov_stripe_offset(lsm, index, end, stripeno, obd_end);
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 8b7a572..ba7c488 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -189,8 +189,8 @@ int lov_free_memmd(struct lov_stripe_md **lsmp)
 	int refc;
 
 	*lsmp = NULL;
-	LASSERT(atomic_read(&lsm->lsm_refc) > 0);
 	refc = atomic_dec_return(&lsm->lsm_refc);
+	LASSERT(refc >= 0);
 	if (refc == 0)
 		lsm_free(lsm);
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index e227279..f53379a 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -76,10 +76,16 @@ int lov_page_init_composite(const struct lu_env *env, struct cl_object *obj,
 	u64 offset;
 	u64	    suboff;
 	int		stripe;
-	int entry = 0;
+	int entry;
 	int		rc;
 
 	offset = cl_offset(obj, index);
+	entry = lov_lsm_entry(loo->lo_lsm, offset);
+	if (entry < 0) {
+		/* non-existing layout component */
+		lov_page_init_empty(env, obj, page, index);
+		return 0;
+	}
 
 	r0 = lov_r0(loo, entry);
 	stripe = lov_stripe_number(loo->lo_lsm, entry, offset);
-- 
1.8.3.1
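
Reader's note on the fiemap loop above: for each layout component it clamps
the requested byte range to that component before walking the stripes. A
minimal standalone sketch of that clamping follows. The names comp_extent,
byte_range and clamp_to_component are hypothetical and not part of the
patch, and the sketch assumes component extents are half-open [start, end),
which is what the "e_end - 1" in the loop suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a component extent, half-open: [start, end) */
struct comp_extent {
	uint64_t start;	/* first byte covered, inclusive */
	uint64_t end;	/* one past the last byte covered */
};

/* Inclusive byte range, playing the role of fs.fs_ext in the loop */
struct byte_range {
	uint64_t first;
	uint64_t last;
};

/*
 * Clamp the inclusive request [req_first, req_last] to component idx.
 * first_idx and last_idx are the components that contain req_first and
 * req_last; every component in between is covered in full.
 */
static struct byte_range clamp_to_component(const struct comp_extent *comps,
					    int idx, int first_idx,
					    int last_idx, uint64_t req_first,
					    uint64_t req_last)
{
	struct byte_range r;

	r.first = (idx == first_idx) ? req_first : comps[idx].start;
	r.last = (idx == last_idx) ? req_last : comps[idx].end - 1;
	return r;
}
```

With components [0, 1M), [1M, 4M) and [4M, EOF), a request for bytes 512K
through 5M - 1 is split into three ranges, one per component, and the
mapped length of each is last - first + 1.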

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 16/28] lustre: clio: getstripe support comp layout
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (14 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 15/28] lustre: clio: client side implementation for PFL James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 17/28] lustre: pfl: enhance PFID EA for PFL James Simmons
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

Make {get/set}stripe support composite layouts.

Signed-off-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24851
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dir.c          |  33 +++-
 drivers/staging/lustre/lustre/llite/file.c         |  25 ++-
 .../staging/lustre/lustre/llite/llite_internal.h   |   2 +
 drivers/staging/lustre/lustre/llite/xattr.c        |  70 +++++----
 drivers/staging/lustre/lustre/lov/lov_internal.h   |  24 +++
 drivers/staging/lustre/lustre/lov/lov_object.c     |   3 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c       | 172 ++++++++++++---------
 7 files changed, 203 insertions(+), 126 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index f1c1c9c..8fbce96 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -522,6 +522,15 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump,
 			lum_size = sizeof(struct lov_user_md_v3);
 			break;
 		}
+		case LOV_USER_MAGIC_COMP_V1: {
+			if (lump->lmm_magic !=
+			    cpu_to_le32(LOV_USER_MAGIC_COMP_V1))
+				lustre_swab_lov_comp_md_v1(
+					(struct lov_comp_md_v1 *)lump);
+			lum_size = le32_to_cpu(
+				((struct lov_comp_md_v1 *)lump)->lcm_size);
+			break;
+		}
 		case LMV_USER_MAGIC: {
 			if (lump->lmm_magic != cpu_to_le32(LMV_USER_MAGIC))
 				lustre_swab_lmv_user_md(
@@ -562,7 +571,9 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump,
 	 * LOV_USER_MAGIC_V3 have the same initial fields so we do not
 	 * need to make the distinction between the 2 versions
 	 */
-	if (set_default && mgc->u.cli.cl_mgc_mgsexp) {
+	if (set_default && mgc->u.cli.cl_mgc_mgsexp &&
+	    (!lump || le32_to_cpu(lump->lmm_magic) == LOV_USER_MAGIC_V1 ||
+	     le32_to_cpu(lump->lmm_magic) == LOV_USER_MAGIC_V3)) {
 		char *param = NULL;
 		char *buf;
 
@@ -577,23 +588,23 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump,
 		buf += strlen(buf);
 
 		/* Set root stripesize */
-		sprintf(buf, ".stripesize=%u",
-			lump ? le32_to_cpu(lump->lmm_stripe_size) : 0);
+		snprintf(buf, MGS_PARAM_MAXLEN, ".stripesize=%u",
+			 lump ? le32_to_cpu(lump->lmm_stripe_size) : 0);
 		rc = ll_send_mgc_param(mgc->u.cli.cl_mgc_mgsexp, param);
 		if (rc)
 			goto end;
 
 		/* Set root stripecount */
-		sprintf(buf, ".stripecount=%hd",
-			lump ? le16_to_cpu(lump->lmm_stripe_count) : 0);
+		snprintf(buf, MGS_PARAM_MAXLEN, ".stripecount=%hd",
+			 lump ? le16_to_cpu(lump->lmm_stripe_count) : 0);
 		rc = ll_send_mgc_param(mgc->u.cli.cl_mgc_mgsexp, param);
 		if (rc)
 			goto end;
 
 		/* Set root stripeoffset */
-		sprintf(buf, ".stripeoffset=%hd",
-			lump ? le16_to_cpu(lump->lmm_stripe_offset) :
-			(typeof(lump->lmm_stripe_offset))(-1));
+		snprintf(buf, MGS_PARAM_MAXLEN, ".stripeoffset=%hd",
+			 lump ? le16_to_cpu(lump->lmm_stripe_offset) :
+				(typeof(lump->lmm_stripe_offset))(-1));
 		rc = ll_send_mgc_param(mgc->u.cli.cl_mgc_mgsexp, param);
 
 end:
@@ -669,6 +680,10 @@ int ll_dir_getstripe(struct inode *inode, void **plmm, int *plmm_size,
 		if (cpu_to_le32(LOV_MAGIC) != LOV_MAGIC)
 			lustre_swab_lov_user_md_v3((struct lov_user_md_v3 *)lmm);
 		break;
+	case LOV_MAGIC_COMP_V1:
+		if (LOV_MAGIC != cpu_to_le32(LOV_MAGIC))
+			lustre_swab_lov_comp_md_v1((struct lov_comp_md_v1 *)lmm);
+		break;
 	case LMV_MAGIC_V1:
 		if (cpu_to_le32(LMV_MAGIC) != LMV_MAGIC)
 			lustre_swab_lmv_mds_md((union lmv_mds_md *)lmm);
@@ -1217,6 +1232,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 
 		int set_default = 0;
 
+		BUILD_BUG_ON(sizeof(struct lov_user_md_v3) <=
+			     sizeof(struct lov_comp_md_v1));
 		LASSERT(sizeof(lumv3) == sizeof(*lumv3p));
 		LASSERT(sizeof(lumv3.lmm_objects[0]) ==
 			sizeof(lumv3p->lmm_objects[0]));
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index a6f149c..8d67d1a 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1430,8 +1430,9 @@ int ll_lov_getstripe_ea_info(struct inode *inode, const char *filename,
 
 	lmm = req_capsule_server_sized_get(&req->rq_pill, &RMF_MDT_MD, lmmsize);
 
-	if ((lmm->lmm_magic != cpu_to_le32(LOV_MAGIC_V1)) &&
-	    (lmm->lmm_magic != cpu_to_le32(LOV_MAGIC_V3))) {
+	if (lmm->lmm_magic != cpu_to_le32(LOV_MAGIC_V1) &&
+	    lmm->lmm_magic != cpu_to_le32(LOV_MAGIC_V3) &&
+	    lmm->lmm_magic != cpu_to_le32(LOV_MAGIC_COMP_V1)) {
 		rc = -EPROTO;
 		goto out;
 	}
@@ -1444,9 +1445,13 @@ int ll_lov_getstripe_ea_info(struct inode *inode, const char *filename,
 	if (cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) {
 		int stripe_count;
 
-		stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
-		if (le32_to_cpu(lmm->lmm_pattern) & LOV_PATTERN_F_RELEASED)
-			stripe_count = 0;
+		if (lmm->lmm_magic == cpu_to_le32(LOV_MAGIC_V1) ||
+		    lmm->lmm_magic == cpu_to_le32(LOV_MAGIC_V3)) {
+			stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
+			if (le32_to_cpu(lmm->lmm_pattern) &
+			    LOV_PATTERN_F_RELEASED)
+				stripe_count = 0;
+		}
 
 		/* if function called for directory - we should
 		 * avoid swab not existent lsm objects
@@ -1463,6 +1468,8 @@ int ll_lov_getstripe_ea_info(struct inode *inode, const char *filename,
 				lustre_swab_lov_user_md_objects(
 				 ((struct lov_user_md_v3 *)lmm)->lmm_objects,
 				 stripe_count);
+		} else if (lmm->lmm_magic == cpu_to_le32(LOV_MAGIC_COMP_V1)) {
+			lustre_swab_lov_comp_md_v1((struct lov_comp_md_v1 *)lmm);
 		}
 	}
 
@@ -1534,14 +1541,6 @@ static int ll_lov_setstripe(struct inode *inode, struct file *file,
 	rc = ll_lov_setstripe_ea_info(inode, file->f_path.dentry, flags, klum,
 				      lum_size);
 	cl_lov_delay_create_clear(&file->f_flags);
-	if (rc == 0) {
-		__u32 gen;
-
-		put_user(0, &lum->lmm_stripe_count);
-
-		ll_layout_refresh(inode, &gen);
-		rc = ll_file_getstripe(inode, (struct lov_user_md __user *)arg);
-	}
 
 	kfree(klum);
 	return rc;
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 48424a4..e3f5450 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -927,6 +927,8 @@ static inline ssize_t ll_lov_user_md_size(const struct lov_user_md *lum)
 
 		return lov_user_md_size(lum->lmm_stripe_count,
 					LOV_USER_MAGIC_SPECIFIC);
+	case LOV_USER_MAGIC_COMP_V1:
+		return ((struct lov_comp_md_v1 *)lum)->lcm_size;
 	}
 	return -EINVAL;
 }
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index 5e27c85..3a28c4a 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -194,40 +194,53 @@ static int get_hsm_state(struct inode *inode, u32 *hus_states)
 
 static int ll_adjust_lum(struct inode *inode, struct lov_user_md *lump)
 {
+	struct lov_comp_md_v1 *comp_v1 = (struct lov_comp_md_v1 *)lump;
+	struct lov_user_md *v1 = lump;
+	bool need_clear_release = false;
+	bool release_checked = false;
+	bool is_composite = false;
+	u16 entry_count = 1;
 	int rc = 0;
+	int i;
 
 	if (!lump)
 		return 0;
 
-	/* Attributes that are saved via getxattr will always have
-	 * the stripe_offset as 0.  Instead, the MDS should be
-	 * allowed to pick the starting OST index.   b=17846
-	 */
-	if (lump->lmm_stripe_offset == 0)
-		lump->lmm_stripe_offset = -1;
+	if (lump->lmm_magic == LOV_USER_MAGIC_COMP_V1) {
+		entry_count = comp_v1->lcm_entry_count;
+		is_composite = true;
+	}
+
+	for (i = 0; i < entry_count; i++) {
+		if (lump->lmm_magic == LOV_USER_MAGIC_COMP_V1) {
+			void *ptr = comp_v1;
 
-	/* Avoid anyone directly setting the RELEASED flag. */
-	if (lump->lmm_pattern & LOV_PATTERN_F_RELEASED) {
-		/* Only if we have a released flag check if the file
-		 * was indeed archived.
+			ptr += comp_v1->lcm_entries[i].lcme_offset;
+			v1 = (struct lov_user_md *)ptr;
+		}
+
+		/* Attributes that are saved via getxattr will always have
+		 * the stripe_offset as 0.  Instead, the MDS should be
+		 * allowed to pick the starting OST index.   b=17846
 		 */
-		u32 state = HS_NONE;
-
-		rc = get_hsm_state(inode, &state);
-		if (rc)
-			return rc;
-
-		if (!(state & HS_ARCHIVED)) {
-			CDEBUG(D_VFSTRACE,
-			       "hus_states state = %x, pattern = %x\n",
-				state, lump->lmm_pattern);
-			/*
-			 * Here the state is: real file is not
-			 * archived but user is requesting to set
-			 * the RELEASED flag so we mask off the
-			 * released flag from the request
-			 */
-			lump->lmm_pattern ^= LOV_PATTERN_F_RELEASED;
+		if (!is_composite && v1->lmm_stripe_offset == 0)
+			v1->lmm_stripe_offset = -1;
+
+		/* Avoid anyone directly setting the RELEASED flag. */
+		if (v1->lmm_pattern & LOV_PATTERN_F_RELEASED) {
+			if (!release_checked) {
+				u32 state = HS_NONE;
+
+				rc = get_hsm_state(inode, &state);
+				if (rc)
+					return rc;
+
+				if (!(state & HS_ARCHIVED))
+					need_clear_release = true;
+				release_checked = true;
+			}
+			if (need_clear_release)
+				v1->lmm_pattern ^= LOV_PATTERN_F_RELEASED;
 		}
 	}
 
@@ -495,6 +508,9 @@ static ssize_t ll_getxattr_lov(struct inode *inode, void *buf, size_t buf_size)
 		 * recognizing layout gen as stripe offset when the
 		 * file is restored. See LU-2809.
 		 */
+		if (((struct lov_mds_md *)buf)->lmm_magic == LOV_MAGIC_COMP_V1)
+			goto out_env;
+
 		((struct lov_mds_md *)buf)->lmm_layout_gen = 0;
 out_env:
 		cl_env_put(env, &refcheck);
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 29325ff..9c0a4f7 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -74,6 +74,30 @@ struct lov_stripe_md {
 	struct lov_stripe_md_entry *lsm_entries[];
 };
 
+static inline size_t lov_comp_md_size(const struct lov_stripe_md *lsm)
+{
+	struct lov_stripe_md_entry *lsme;
+	size_t size;
+	int entry;
+
+	if (lsm->lsm_magic == LOV_MAGIC_V1 || lsm->lsm_magic == LOV_MAGIC_V3)
+		return lov_mds_md_size(lsm->lsm_entries[0]->lsme_stripe_count,
+				       lsm->lsm_entries[0]->lsme_magic);
+
+	LASSERT(lsm->lsm_magic == LOV_MAGIC_COMP_V1);
+
+	size = sizeof(struct lov_comp_md_v1);
+	for (entry = 0; entry < lsm->lsm_entry_count; entry++) {
+		lsme = lsm->lsm_entries[entry];
+
+		size += sizeof(*lsme);
+		size += lov_mds_md_size(lsme->lsme_stripe_count,
+					lsme->lsme_magic);
+	}
+
+	return size;
+}
+
 static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
 {
 	return lsm && !lsm->lsm_is_released;
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 337ded6..66fb6f5 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -1641,8 +1641,7 @@ static int lov_object_layout_get(const struct lu_env *env,
 		return 0;
 	}
 
-	cl->cl_size = lov_mds_md_size(lsm->lsm_entries[0]->lsme_stripe_count,
-				      lsm->lsm_magic);
+	cl->cl_size = lov_comp_md_size(lsm);
 	cl->cl_layout_gen = lsm->lsm_layout_gen;
 
 	rc = lov_lsm_pack(lsm, buf->lb_buf, buf->lb_len);
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index ba7c488..79d8a32 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -107,8 +107,8 @@ void lov_dump_lmm_v3(int level, struct lov_mds_md_v3 *lmm)
  * then return the size needed. If \a buf_size is too small then
  * return -ERANGE. Otherwise return the size of the result.
  */
-ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
-		     size_t buf_size)
+ssize_t lov_lsm_pack_v1v3(const struct lov_stripe_md *lsm, void *buf,
+			  size_t buf_size)
 {
 	struct lov_ost_data_v1 *lmm_objects;
 	struct lov_mds_md_v1 *lmmv1 = buf;
@@ -157,6 +157,88 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 	return lmm_size;
 }
 
+ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
+		     size_t buf_size)
+{
+	struct lov_comp_md_v1 *lcmv1 = buf;
+	struct lov_comp_md_entry_v1 *lcme;
+	struct lov_ost_data_v1 *lmm_objects;
+	unsigned int offset;
+	unsigned int entry;
+	unsigned int size;
+	unsigned int i;
+	size_t lmm_size;
+
+	if (lsm->lsm_magic == LOV_MAGIC_V1 || lsm->lsm_magic == LOV_MAGIC_V3)
+		return lov_lsm_pack_v1v3(lsm, buf, buf_size);
+
+	lmm_size = lov_comp_md_size(lsm);
+	if (buf_size == 0)
+		return lmm_size;
+
+	if (buf_size < lmm_size)
+		return -ERANGE;
+
+	lcmv1->lcm_magic = cpu_to_le32(lsm->lsm_magic);
+	lcmv1->lcm_size = cpu_to_le32(lmm_size);
+	lcmv1->lcm_layout_gen = cpu_to_le32(lsm->lsm_layout_gen);
+	lcmv1->lcm_entry_count = cpu_to_le16(lsm->lsm_entry_count);
+
+	offset = sizeof(*lcmv1) + sizeof(*lcme) * lsm->lsm_entry_count;
+
+	for (entry = 0; entry < lsm->lsm_entry_count; entry++) {
+		struct lov_stripe_md_entry *lsme;
+		struct lov_mds_md *lmm;
+
+		lsme = lsm->lsm_entries[entry];
+		lcme = &lcmv1->lcm_entries[entry];
+
+		lcme->lcme_id = cpu_to_le32(lsme->lsme_id);
+		lcme->lcme_extent.e_start =
+			cpu_to_le64(lsme->lsme_extent.e_start);
+		lcme->lcme_extent.e_end =
+			cpu_to_le64(lsme->lsme_extent.e_end);
+		lcme->lcme_offset = cpu_to_le32(offset);
+
+		lmm = (struct lov_mds_md *)((char *)lcmv1 + offset);
+		lmm->lmm_magic = cpu_to_le32(lsme->lsme_magic);
+		/* lmm->lmm_oi not set */
+		lmm->lmm_pattern = cpu_to_le32(lsme->lsme_pattern);
+		lmm->lmm_stripe_size = cpu_to_le32(lsme->lsme_stripe_size);
+		lmm->lmm_stripe_count = cpu_to_le16(lsme->lsme_stripe_count);
+		lmm->lmm_layout_gen = cpu_to_le16(lsme->lsme_layout_gen);
+
+		if (lsme->lsme_magic == LOV_MAGIC_V3) {
+			struct lov_mds_md_v3 *lmmv3;
+
+			lmmv3 = (struct lov_mds_md_v3 *)lmm;
+
+			strlcpy(lmmv3->lmm_pool_name, lsme->lsme_pool_name,
+				sizeof(lmmv3->lmm_pool_name));
+			lmm_objects = lmmv3->lmm_objects;
+		} else {
+			lmm_objects = ((struct lov_mds_md_v1 *)lmm)->lmm_objects;
+		}
+
+		for (i = 0; i < lsme->lsme_stripe_count; i++) {
+			struct lov_oinfo *loi = lsme->lsme_oinfo[i];
+
+			ostid_cpu_to_le(&loi->loi_oi, &lmm_objects[i].l_ost_oi);
+			lmm_objects[i].l_ost_gen =
+				cpu_to_le32(loi->loi_ost_gen);
+			lmm_objects[i].l_ost_idx =
+				cpu_to_le32(loi->loi_ost_idx);
+		}
+
+		size = lov_mds_md_size(lsme->lsme_stripe_count,
+				       lsme->lsme_magic);
+		lcme->lcme_size = cpu_to_le32(size);
+		offset += size;
+	} /* for each layout component */
+
+	return lmm_size;
+}
+
 /* Find the max stripecount we should use */
 __u16 lov_get_stripecnt(struct lov_obd *lov, __u32 magic, __u16 stripe_count)
 {
@@ -227,53 +309,23 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		  struct lov_user_md __user *lump)
 {
 	/* we use lov_user_md_v3 because it is larger than lov_user_md_v1 */
-	struct lov_user_md_v3 lum;
 	struct lov_mds_md *lmmk;
-	u32 stripe_count;
 	ssize_t lmm_size;
 	size_t lmmk_size;
-	size_t lum_size;
-	int rc;
+	int rc = 0;
 
 	if (!lsm)
 		return -ENODATA;
 
-	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
+	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3 &&
+	    lsm->lsm_magic != LOV_MAGIC_COMP_V1) {
 		CERROR("bad LSM MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
 		       lsm->lsm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
 		rc = -EIO;
 		goto out;
 	}
 
-	if (!lsm->lsm_is_released)
-		stripe_count = lsm->lsm_entries[0]->lsme_stripe_count;
-	else
-		stripe_count = 0;
-
-	/* we only need the header part from user space to get lmm_magic and
-	 * lmm_stripe_count, (the header part is common to v1 and v3)
-	 */
-	lum_size = sizeof(struct lov_user_md_v1);
-	if (copy_from_user(&lum, lump, lum_size)) {
-		rc = -EFAULT;
-		goto out;
-	}
-	if (lum.lmm_magic != LOV_USER_MAGIC_V1 &&
-	    lum.lmm_magic != LOV_USER_MAGIC_V3 &&
-	    lum.lmm_magic != LOV_USER_MAGIC_SPECIFIC) {
-		rc = -EINVAL;
-		goto out;
-	}
-
-	if (lum.lmm_stripe_count && lum.lmm_stripe_count < stripe_count) {
-		/* Return right size of stripe to user */
-		lum.lmm_stripe_count = stripe_count;
-		rc = copy_to_user(lump, &lum, lum_size);
-		rc = -EOVERFLOW;
-		goto out;
-	}
-
-	lmmk_size = lov_mds_md_size(stripe_count, lsm->lsm_magic);
+	lmmk_size = lov_comp_md_size(lsm);
 	lmmk = kvzalloc(lmmk_size, GFP_KERNEL);
 	if (!lmmk) {
 		rc = -ENOMEM;
@@ -286,54 +338,22 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		goto out_free;
 	}
 
-	/* FIXME: Bug 1185 - copy fields properly when structs change */
-	/* struct lov_user_md_v3 and struct lov_mds_md_v3 must be the same */
-	BUILD_BUG_ON(sizeof(lum) != sizeof(struct lov_mds_md_v3));
-	BUILD_BUG_ON(sizeof(lum.lmm_objects[0]) != sizeof(lmmk->lmm_objects[0]));
-
-	if (cpu_to_le32(LOV_MAGIC) != LOV_MAGIC &&
-	    (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1) ||
-	     lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3))) {
-		lustre_swab_lov_mds_md(lmmk);
-		lustre_swab_lov_user_md_objects(
+	if (cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) {
+		if (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1) ||
+		    lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3)) {
+			lustre_swab_lov_mds_md(lmmk);
+			lustre_swab_lov_user_md_objects(
 				(struct lov_user_ost_data *)lmmk->lmm_objects,
 				lmmk->lmm_stripe_count);
-	}
-
-	if (lum.lmm_magic == LOV_USER_MAGIC) {
-		/* User request for v1, we need skip lmm_pool_name */
-		if (lmmk->lmm_magic == LOV_MAGIC_V3) {
-			memmove(((struct lov_mds_md_v1 *)lmmk)->lmm_objects,
-				((struct lov_mds_md_v3 *)lmmk)->lmm_objects,
-				lmmk->lmm_stripe_count *
-				sizeof(struct lov_ost_data_v1));
-			lmm_size -= LOV_MAXPOOLNAME;
+		} else if (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_COMP_V1)) {
+			lustre_swab_lov_comp_md_v1((struct lov_comp_md_v1 *)lmmk);
 		}
-	} else {
-		/* if v3 we just have to update the lum_size */
-		lum_size = sizeof(struct lov_user_md_v3);
 	}
 
-	/* User wasn't expecting this many OST entries */
-	if (lum.lmm_stripe_count == 0) {
-		lmm_size = lum_size;
-	} else if (lum.lmm_stripe_count < lmmk->lmm_stripe_count) {
-		rc = -EOVERFLOW;
-		goto out_free;
-	}
-	/*
-	 * Have a difference between lov_mds_md & lov_user_md.
-	 * So we have to re-order the data before copy to user.
-	 */
-	lum.lmm_stripe_count = lmmk->lmm_stripe_count;
-	lum.lmm_layout_gen = lmmk->lmm_layout_gen;
-	((struct lov_user_md *)lmmk)->lmm_layout_gen = lum.lmm_layout_gen;
-	((struct lov_user_md *)lmmk)->lmm_stripe_count = lum.lmm_stripe_count;
-	if (copy_to_user(lump, lmmk, lmm_size))
+	if (copy_to_user(lump, lmmk, lmmk_size))
 		rc = -EFAULT;
 	else
 		rc = 0;
-
 out_free:
 	kvfree(lmmk);
 out:
-- 
1.8.3.1
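
Reader's note on lov_lsm_pack() above: the packed composite layout is a
header, then a table of per-component entries, then one v1/v3 blob per
component, and each entry records the byte offset of its blob. A minimal
sketch of that offset arithmetic follows. HDR_SIZE and ENTRY_SIZE are
assumed values for illustration only (the patch uses sizeof(*lcmv1) and
sizeof(*lcme)), and comp_blob_offsets is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, for illustration only */
#define HDR_SIZE	32u	/* stands in for sizeof(*lcmv1) */
#define ENTRY_SIZE	48u	/* stands in for sizeof(*lcme) */

/*
 * Store in offsets[i] the byte offset of component i's blob and return
 * the total number of bytes used. blob_sizes[i] is the size of that blob.
 */
static uint32_t comp_blob_offsets(const uint32_t *blob_sizes,
				  unsigned int count, uint32_t *offsets)
{
	/* Blobs start right after the header and the whole entry table */
	uint32_t offset = HDR_SIZE + ENTRY_SIZE * count;
	unsigned int i;

	for (i = 0; i < count; i++) {
		offsets[i] = offset;
		offset += blob_sizes[i];
	}
	return offset;
}
```

Under these assumed sizes, two components with blobs of 56 and 80 bytes
land at offsets 128 and 184, for 264 bytes in total.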

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 17/28] lustre: pfl: enhance PFID EA for PFL
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (15 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 16/28] lustre: clio: getstripe support comp layout James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 18/28] lustre: pfl: dynamic layout modification with write/truncate James Simmons
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Fan Yong <fan.yong@intel.com>

This is a miscellaneous patch containing adjustments to
store more stripe information in the OST object's PFID
EA (XATTR_NAME_FID). It is the client's duty to transfer
the stripe and PFL information to the OST via the write,
setattr and punch RPCs. The OST then stores this
information in the PFID EA.
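
Reader's note: the information sent per RPC can be sketched as below.
The field list mirrors struct ost_layout from this patch. The _sketch
suffix and fill_layout() are hypothetical names, and the packed
attribute spelling assumes GCC or Clang. For a non-composite file the
component fields are left at zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the fields in struct ost_layout; the _sketch name is ours */
struct ost_layout_sketch {
	uint32_t ol_stripe_size;
	uint32_t ol_stripe_count;
	uint64_t ol_comp_start;
	uint64_t ol_comp_end;
	uint32_t ol_comp_id;
} __attribute__((packed));

/* 28 bytes when packed, matching the wiretest check in this patch */
_Static_assert(sizeof(struct ost_layout_sketch) == 28, "unexpected size");

/* Fill the layout; component fields stay zero for plain (non-PFL) files */
static void fill_layout(struct ost_layout_sketch *ol, int is_composite,
			uint32_t stripe_size, uint32_t stripe_count,
			uint64_t comp_start, uint64_t comp_end,
			uint32_t comp_id)
{
	memset(ol, 0, sizeof(*ol));
	ol->ol_stripe_size = stripe_size;
	ol->ol_stripe_count = stripe_count;
	if (is_composite) {
		ol->ol_comp_start = comp_start;
		ol->ol_comp_end = comp_end;
		ol->ol_comp_id = comp_id;
	}
}
```

The 28-byte size plus one 4-byte padding word gives 32 bytes, which is
the slot the obdo comment in this patch equates with the old llog_cookie.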

Signed-off-by: Fan Yong <fan.yong@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24882
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 14 ++++--
 .../lustre/include/uapi/linux/lustre/lustre_user.h | 17 +++----
 drivers/staging/lustre/lustre/include/cl_object.h  |  1 +
 drivers/staging/lustre/lustre/lov/lov_internal.h   | 16 ++++++
 drivers/staging/lustre/lustre/lov/lov_io.c         |  2 +
 drivers/staging/lustre/lustre/lov/lovsub_object.c  |  4 ++
 drivers/staging/lustre/lustre/osc/osc_io.c         |  4 +-
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    | 13 ++++-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 58 +++++++++++++---------
 9 files changed, 89 insertions(+), 40 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
index e47eb52..f7a065e 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
@@ -1134,6 +1134,7 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 							    */
 
 #define OBD_MD_DEFAULT_MEA	(0x0040000000000000ULL) /* default MEA */
+#define OBD_MD_FLOSTLAYOUT	(0x0080000000000000ULL)	/* contain ost_layout */
 #define OBD_MD_FLPROJID		(0x0100000000000000ULL) /* project ID */
 
 #define OBD_MD_FLALLQUOTA (OBD_MD_FLUSRQUOTA | \
@@ -2625,9 +2626,16 @@ struct obdo {
 	__u32		o_parent_ver;
 	struct lustre_handle    o_handle;  /* brw: lock handle to prolong locks
 					    */
-	struct llog_cookie      o_lcookie; /* destroy: unlink cookie from MDS,
-					    * obsolete in 2.8, reused in OSP
-					    */
+	/* Originally this field was the llog_cookie (unlink cookie from the
+	 * MDS for destroy), obsolete since 2.8. The client now reuses it to
+	 * send layout and PFL information in IO and setattr RPCs. llog_cookie
+	 * is no longer used on the wire, so drop it from the obdo; the field
+	 * can then be enlarged in the future without affecting related RPCs.
+	 *
+	 * sizeof(ost_layout) + sizeof(__u32) == sizeof(llog_cookie).
+	 */
+	struct ost_layout	o_layout;
+	__u32			o_padding_3;
 	__u32		o_uid_h;
 	__u32		o_gid_h;
 
diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
index 67b2ae4..8e6d67b 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
@@ -154,16 +154,13 @@ static inline bool fid_is_zero(const struct lu_fid *fid)
 	return !fid->f_seq && !fid->f_oid;
 }
 
-struct filter_fid {
-	struct lu_fid	ff_parent;  /* ff_parent.f_ver == file stripe number */
-};
-
-/* keep this one for compatibility */
-struct filter_fid_old {
-	struct lu_fid	ff_parent;
-	__u64		ff_objid;
-	__u64		ff_seq;
-};
+struct ost_layout {
+	__u32	ol_stripe_size;
+	__u32	ol_stripe_count;
+	__u64	ol_comp_start;
+	__u64	ol_comp_end;
+	__u32	ol_comp_id;
+} __packed;
 
 /* Userspace should treat lu_fid as opaque, and only use the following methods
  * to print or parse them.  Other functions (e.g. compare, swab) could be moved
diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index a1e07f8..d0edeb7c 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1784,6 +1784,7 @@ struct cl_io {
 			unsigned int     sa_avalid;
 			unsigned int		sa_xvalid; /* OP_XVALID */
 			int		sa_stripe_index;
+			struct ost_layout	 sa_layout;
 			const struct lu_fid	*sa_parent_fid;
 		} ci_setattr;
 		struct cl_data_version_io {
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 9c0a4f7..e8102df 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -304,4 +304,20 @@ static inline struct obd_device *lov2obd(const struct lov_obd *lov)
 	return container_of_safe(lov, struct obd_device, u.lov);
 }
 
+static inline void lov_lsm2layout(struct lov_stripe_md *lsm,
+				  struct lov_stripe_md_entry *lsme,
+				  struct ost_layout *ol)
+{
+	ol->ol_stripe_size = lsme->lsme_stripe_size;
+	ol->ol_stripe_count = lsme->lsme_stripe_count;
+	if (lsm->lsm_magic == LOV_MAGIC_COMP_V1) {
+		ol->ol_comp_start = lsme->lsme_extent.e_start;
+		ol->ol_comp_end = lsme->lsme_extent.e_end;
+		ol->ol_comp_id = lsme->lsme_id;
+	} else {
+		ol->ol_comp_start = 0;
+		ol->ol_comp_end = 0;
+		ol->ol_comp_id = 0;
+	}
+}
 #endif
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index d9b2a81..70908b1 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -316,6 +316,8 @@ static void lov_io_sub_inherit(struct lov_io_sub *sub, struct lov_io *lio,
 						      stripe);
 			io->u.ci_setattr.sa_attr.lvb_size = new_size;
 		}
+		lov_lsm2layout(lsm, lsm->lsm_entries[index],
+			       &io->u.ci_setattr.sa_layout);
 		break;
 	}
 	case CIT_DATA_VERSION: {
diff --git a/drivers/staging/lustre/lustre/lov/lovsub_object.c b/drivers/staging/lustre/lustre/lov/lovsub_object.c
index ca7c8a0..da4b7f1 100644
--- a/drivers/staging/lustre/lustre/lov/lovsub_object.c
+++ b/drivers/staging/lustre/lustre/lov/lovsub_object.c
@@ -131,6 +131,7 @@ static void lovsub_req_attr_set(const struct lu_env *env, struct cl_object *obj,
 				struct cl_req_attr *attr)
 {
 	struct lovsub_object *subobj = cl2lovsub(obj);
+	struct lov_stripe_md *lsm = subobj->lso_super->lo_lsm;
 
 	cl_req_attr_set(env, &subobj->lso_super->lo_cl, attr);
 
@@ -139,6 +140,9 @@ static void lovsub_req_attr_set(const struct lu_env *env, struct cl_object *obj,
 	 * unconditionally. It never changes anyway.
 	 */
 	attr->cra_oa->o_stripe_idx = lov_comp_stripe(subobj->lso_index);
+	lov_lsm2layout(lsm, lsm->lsm_entries[lov_comp_entry(subobj->lso_index)],
+		       &attr->cra_oa->o_layout);
+	attr->cra_oa->o_valid |= OBD_MD_FLOSTLAYOUT;
 }
 
 static const struct cl_object_operations lovsub_ops = {
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index dabdf6d..8cd0813 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -542,7 +542,9 @@ static int osc_io_setattr_start(const struct lu_env *env,
 		oa->o_oi = loi->loi_oi;
 		obdo_set_parent_fid(oa, io->u.ci_setattr.sa_parent_fid);
 		oa->o_stripe_idx = io->u.ci_setattr.sa_stripe_index;
-		oa->o_valid |= OBD_MD_FLID | OBD_MD_FLGROUP;
+		oa->o_layout = io->u.ci_setattr.sa_layout;
+		oa->o_valid |= OBD_MD_FLID | OBD_MD_FLGROUP |
+			       OBD_MD_FLOSTLAYOUT;
 		if (ia_avalid & ATTR_CTIME) {
 			oa->o_valid |= OBD_MD_FLCTIME;
 			oa->o_ctime = attr->cat_ctime;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 9c5be30..5fadd5e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1587,6 +1587,15 @@ void lustre_swab_connect(struct obd_connect_data *ocd)
 	BUILD_BUG_ON(offsetof(typeof(*ocd), paddingF) == 0);
 }
 
+static void lustre_swab_ost_layout(struct ost_layout *ol)
+{
+	__swab32s(&ol->ol_stripe_size);
+	__swab32s(&ol->ol_stripe_count);
+	__swab64s(&ol->ol_comp_start);
+	__swab64s(&ol->ol_comp_end);
+	__swab32s(&ol->ol_comp_id);
+}
+
 static void lustre_swab_obdo(struct obdo *o)
 {
 	__swab64s(&o->o_valid);
@@ -1609,8 +1618,8 @@ static void lustre_swab_obdo(struct obdo *o)
 	__swab64s(&o->o_ioepoch);
 	__swab32s(&o->o_stripe_idx);
 	__swab32s(&o->o_parent_ver);
-	/* o_handle is opaque */
-	/* o_lcookie is swabbed elsewhere */
+	lustre_swab_ost_layout(&o->o_layout);
+	BUILD_BUG_ON(offsetof(typeof(*o), o_padding_3) == 0);
 	__swab32s(&o->o_uid_h);
 	__swab32s(&o->o_gid_h);
 	__swab64s(&o->o_data_version);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 90e6b8c..639db24 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -1128,6 +1128,30 @@ void lustre_assert_wire_constants(void)
 	LASSERTF(OBD_CKSUM_CRC32C == 0x00000004UL, "found 0x%.8xUL\n",
 		 (unsigned int)OBD_CKSUM_CRC32C);
 
+	/* Checks for struct ost_layout */
+	LASSERTF((int)sizeof(struct ost_layout) == 28, "found %lld\n",
+		 (long long)(int)sizeof(struct ost_layout));
+	LASSERTF((int)offsetof(struct ost_layout, ol_stripe_size) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct ost_layout, ol_stripe_size));
+	LASSERTF((int)sizeof(((struct ost_layout *)0)->ol_stripe_size) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct ost_layout *)0)->ol_stripe_size));
+	LASSERTF((int)offsetof(struct ost_layout, ol_stripe_count) == 4, "found %lld\n",
+		 (long long)(int)offsetof(struct ost_layout, ol_stripe_count));
+	LASSERTF((int)sizeof(((struct ost_layout *)0)->ol_stripe_count) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct ost_layout *)0)->ol_stripe_count));
+	LASSERTF((int)offsetof(struct ost_layout, ol_comp_start) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct ost_layout, ol_comp_start));
+	LASSERTF((int)sizeof(((struct ost_layout *)0)->ol_comp_start) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct ost_layout *)0)->ol_comp_start));
+	LASSERTF((int)offsetof(struct ost_layout, ol_comp_end) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct ost_layout, ol_comp_end));
+	LASSERTF((int)sizeof(((struct ost_layout *)0)->ol_comp_end) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct ost_layout *)0)->ol_comp_end));
+	LASSERTF((int)offsetof(struct ost_layout, ol_comp_id) == 24, "found %lld\n",
+		 (long long)(int)offsetof(struct ost_layout, ol_comp_id));
+	LASSERTF((int)sizeof(((struct ost_layout *)0)->ol_comp_id) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct ost_layout *)0)->ol_comp_id));
+
 	/* Checks for struct obdo */
 	LASSERTF((int)sizeof(struct obdo) == 208, "found %lld\n",
 		 (long long)(int)sizeof(struct obdo));
@@ -1215,10 +1239,14 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct obdo, o_handle));
 	LASSERTF((int)sizeof(((struct obdo *)0)->o_handle) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct obdo *)0)->o_handle));
-	LASSERTF((int)offsetof(struct obdo, o_lcookie) == 136, "found %lld\n",
-		 (long long)(int)offsetof(struct obdo, o_lcookie));
-	LASSERTF((int)sizeof(((struct obdo *)0)->o_lcookie) == 32, "found %lld\n",
-		 (long long)(int)sizeof(((struct obdo *)0)->o_lcookie));
+	LASSERTF((int)offsetof(struct obdo, o_layout) == 136, "found %lld\n",
+		 (long long)(int)offsetof(struct obdo, o_layout));
+	LASSERTF((int)sizeof(((struct obdo *)0)->o_layout) == 28, "found %lld\n",
+		 (long long)(int)sizeof(((struct obdo *)0)->o_layout));
+	LASSERTF((int)offsetof(struct obdo, o_padding_3) == 164, "found %lld\n",
+		 (long long)(int)offsetof(struct obdo, o_padding_3));
+	LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_3) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct obdo *)0)->o_padding_3));
 	LASSERTF((int)offsetof(struct obdo, o_uid_h) == 168, "found %lld\n",
 		 (long long)(int)offsetof(struct obdo, o_uid_h));
 	LASSERTF((int)sizeof(((struct obdo *)0)->o_uid_h) == 4, "found %lld\n",
@@ -1331,6 +1359,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_MD_FLGETATTRLOCK);
 	LASSERTF(OBD_MD_FLDATAVERSION == (0x0010000000000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLDATAVERSION);
+	LASSERTF(OBD_MD_FLOSTLAYOUT == (0x0080000000000000ULL), "found 0x%.16llxULL\n",
+		 OBD_MD_FLOSTLAYOUT);
 	LASSERTF(OBD_MD_FLPROJID == (0x0100000000000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLPROJID);
 
@@ -3549,26 +3579,6 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct llog_log_hdr *)0)->llh_tail) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct llog_log_hdr *)0)->llh_tail));
 
-	/* Checks for struct llog_cookie */
-	LASSERTF((int)sizeof(struct llog_cookie) == 32, "found %lld\n",
-		 (long long)(int)sizeof(struct llog_cookie));
-	LASSERTF((int)offsetof(struct llog_cookie, lgc_lgl) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_cookie, lgc_lgl));
-	LASSERTF((int)sizeof(((struct llog_cookie *)0)->lgc_lgl) == 20, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_cookie *)0)->lgc_lgl));
-	LASSERTF((int)offsetof(struct llog_cookie, lgc_subsys) == 20, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_cookie, lgc_subsys));
-	LASSERTF((int)sizeof(((struct llog_cookie *)0)->lgc_subsys) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_cookie *)0)->lgc_subsys));
-	LASSERTF((int)offsetof(struct llog_cookie, lgc_index) == 24, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_cookie, lgc_index));
-	LASSERTF((int)sizeof(((struct llog_cookie *)0)->lgc_index) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_cookie *)0)->lgc_index));
-	LASSERTF((int)offsetof(struct llog_cookie, lgc_padding) == 28, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_cookie, lgc_padding));
-	LASSERTF((int)sizeof(((struct llog_cookie *)0)->lgc_padding) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_cookie *)0)->lgc_padding));
-
 	/* Checks for struct llogd_body */
 	LASSERTF((int)sizeof(struct llogd_body) == 48, "found %lld\n",
 		 (long long)(int)sizeof(struct llogd_body));
-- 
1.8.3.1
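
The wiretest.c checks in the patch above pin struct ost_layout at 28
bytes and place o_layout at offset 136 and o_padding_3 at offset 164 in
struct obdo, so 28 + 4 bytes fill the 32 bytes formerly used by
o_lcookie. A minimal userspace sketch of that layout and of the
per-field byte swap follows. The names ending in _sketch and the
sketch_swab helpers are hypothetical, and the packed attribute is an
assumption inferred from the 28-byte size check (natural alignment
would round the struct up to 32 bytes); it is not taken from the
kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct ost_layout. Field names follow the patch;
 * the packed attribute is an assumption inferred from sizeof == 28.
 */
struct ost_layout_sketch {
	uint32_t ol_stripe_size;
	uint32_t ol_stripe_count;
	uint64_t ol_comp_start;
	uint64_t ol_comp_end;
	uint32_t ol_comp_id;
} __attribute__((packed));

/* Same offsets and sizes that lustre_assert_wire_constants() checks. */
_Static_assert(sizeof(struct ost_layout_sketch) == 28, "size");
_Static_assert(offsetof(struct ost_layout_sketch, ol_stripe_size) == 0, "off");
_Static_assert(offsetof(struct ost_layout_sketch, ol_stripe_count) == 4, "off");
_Static_assert(offsetof(struct ost_layout_sketch, ol_comp_start) == 8, "off");
_Static_assert(offsetof(struct ost_layout_sketch, ol_comp_end) == 16, "off");
_Static_assert(offsetof(struct ost_layout_sketch, ol_comp_id) == 24, "off");

/* Plain C byte swaps standing in for the kernel's __swab32s/__swab64s. */
static uint32_t sketch_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

static uint64_t sketch_swab64(uint64_t v)
{
	return ((uint64_t)sketch_swab32((uint32_t)v) << 32) |
	       sketch_swab32((uint32_t)(v >> 32));
}

/* Swap each field in place, in the order used by lustre_swab_ost_layout().
 * Fields are assigned by value, so no pointer to a packed member is taken.
 */
static void sketch_swab_ost_layout(struct ost_layout_sketch *ol)
{
	ol->ol_stripe_size = sketch_swab32(ol->ol_stripe_size);
	ol->ol_stripe_count = sketch_swab32(ol->ol_stripe_count);
	ol->ol_comp_start = sketch_swab64(ol->ol_comp_start);
	ol->ol_comp_end = sketch_swab64(ol->ol_comp_end);
	ol->ol_comp_id = sketch_swab32(ol->ol_comp_id);
}
```

Calling sketch_swab_ost_layout() twice on the same structure restores
the original values, which is a quick sanity check for any swab helper.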


* [lustre-devel] [PATCH 18/28] lustre: pfl: dynamic layout modification with write/truncate
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

* In lov_init_composite(), skip sub-object initialization for layout
  components without LCME_FL_INIT.
* Issue a layout intent RPC during write/truncate operations when
  trying to write to an uninitialized component (even at the lock
  stage).
* After the layout intent RPC is issued, restart the IO.
* Get rid of the unused lov_layout_operations::llo_install() interface.
* Add an empty mdt_layout_change() interface to handle the intent
  layout write RPC.
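
The detect/intent/restart cycle described in the list above can be
sketched in plain userspace C. This is only an illustration under
stated assumptions, not the kernel code: every sketch_* name is
hypothetical, the "MDS" is modelled as a function that marks the
overlapping components as initialized, and the caller-side restart
loop is assumed rather than taken from this patch. A non-zero count
is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a layout component (lov_stripe_md_entry). */
struct sketch_comp {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
	bool inited;		/* models the LCME_FL_INIT flag */
};

/* Hypothetical stand-in for the cl_io fields involved. */
struct sketch_io {
	uint64_t pos;
	uint64_t count;		/* assumed to be non-zero */
	bool need_write_intent;
	bool need_restart;
	int result;
};

/* Index of the component covering @off, or -1 if there is none. */
static int sketch_find_comp(const struct sketch_comp *c, int n, uint64_t off)
{
	for (int i = 0; i < n; i++)
		if (off >= c[i].start && off < c[i].end)
			return i;
	return -1;
}

/* Models the check added to lov_io_rw_iter_init(): if the last byte of
 * the write lands in an uninitialized component, flag the IO and fail
 * with -ENODATA.
 */
static int sketch_iter_init(struct sketch_io *io,
			    const struct sketch_comp *c, int n)
{
	int index = sketch_find_comp(c, n, io->pos + io->count - 1);

	if (index > 0 && !c[index].inited) {
		io->need_write_intent = true;
		io->result = -ENODATA;
		return io->result;
	}
	io->result = 0;
	return 0;
}

/* Assumed server behaviour: instantiate components overlapping the range. */
static int sketch_write_intent(struct sketch_comp *c, int n,
			       uint64_t start, uint64_t end)
{
	for (int i = 0; i < n; i++)
		if (c[i].start < end && start < c[i].end)
			c[i].inited = true;
	return 0;
}

/* Models the block added to vvp_io_fini(): send the intent, then ask
 * for a restart if it succeeded.
 */
static void sketch_io_fini(struct sketch_io *io, struct sketch_comp *c, int n)
{
	if (!io->need_write_intent)
		return;
	io->need_write_intent = false;
	io->result = sketch_write_intent(c, n, io->pos, io->pos + io->count);
	if (!io->result)
		io->need_restart = true;
}

/* Assumed caller loop: returns the number of attempts, or a negative errno. */
static int sketch_write(struct sketch_comp *c, int n,
			uint64_t pos, uint64_t count)
{
	struct sketch_io io;
	int attempts = 0;

	do {
		io = (struct sketch_io){ .pos = pos, .count = count };
		attempts++;
		sketch_iter_init(&io, c, n);
		sketch_io_fini(&io, c, n);
	} while (io.need_restart);

	return io.result < 0 ? io.result : attempts;
}
```

With a two-component layout whose second component is not yet
initialized, a write inside the first component completes in one
attempt, while a write reaching the second component takes two: the
first attempt triggers the intent, the second runs against the
instantiated component.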

Signed-off-by: Bobi Jam <bobijam@hotmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9008
Reviewed-on: https://review.whamcloud.com/25317
WC-bug-id: https://jira.whamcloud.com/browse/LU-9307
Reviewed-on: https://review.whamcloud.com/26456
WC-bug-id: https://jira.whamcloud.com/browse/LU-9311
Reviewed-on: https://review.whamcloud.com/26474
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/uapi/linux/lustre/lustre_idl.h  |  18 ++--
 drivers/staging/lustre/lustre/include/cl_object.h  |   5 +
 drivers/staging/lustre/lustre/include/lustre_sec.h |   4 +-
 drivers/staging/lustre/lustre/llite/file.c         | 104 ++++++++++++++-------
 .../staging/lustre/lustre/llite/llite_internal.h   |   1 +
 drivers/staging/lustre/lustre/llite/vvp_io.c       |  36 ++++++-
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  51 +++++++---
 drivers/staging/lustre/lustre/lov/lov_internal.h   |  22 +++++
 drivers/staging/lustre/lustre/lov/lov_io.c         |  49 ++++++++--
 drivers/staging/lustre/lustre/lov/lov_lock.c       |  11 ++-
 drivers/staging/lustre/lustre/lov/lov_object.c     |  53 +++++------
 drivers/staging/lustre/lustre/lov/lov_pack.c       |  19 ++--
 drivers/staging/lustre/lustre/lov/lov_page.c       |   2 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |  79 +++++++++-------
 drivers/staging/lustre/lustre/obdclass/genops.c    |  16 +++-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   6 +-
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |   7 +-
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |   5 +-
 18 files changed, 338 insertions(+), 150 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
index f7a065e..d1693e3 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
@@ -2772,22 +2772,22 @@ struct getparent {
 } __packed;
 
 enum {
-	LAYOUT_INTENT_ACCESS    = 0,
-	LAYOUT_INTENT_READ      = 1,
-	LAYOUT_INTENT_WRITE     = 2,
-	LAYOUT_INTENT_GLIMPSE   = 3,
-	LAYOUT_INTENT_TRUNC     = 4,
-	LAYOUT_INTENT_RELEASE   = 5,
-	LAYOUT_INTENT_RESTORE   = 6
+	LAYOUT_INTENT_ACCESS    = 0,	/** generic access */
+	LAYOUT_INTENT_READ      = 1,	/** not used */
+	LAYOUT_INTENT_WRITE     = 2,	/** write file, for comp layout */
+	LAYOUT_INTENT_GLIMPSE   = 3,	/** not used */
+	LAYOUT_INTENT_TRUNC     = 4,	/** truncate file, for comp layout */
+	LAYOUT_INTENT_RELEASE   = 5,	/** reserved for HSM release */
+	LAYOUT_INTENT_RESTORE   = 6	/** reserved for HSM restore */
 };
 
 /* enqueue layout lock with intent */
 struct layout_intent {
-	__u32 li_opc; /* intent operation for enqueue, read, write etc */
+	__u32 li_opc;	/* intent operation for enqueue, read, write etc */
 	__u32 li_flags;
 	__u64 li_start;
 	__u64 li_end;
-};
+} __packed;
 
 /**
  * On the wire version of hsm_progress structure.
diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index d0edeb7c..57ced0f 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1843,6 +1843,11 @@ struct cl_io {
 	 */
 			     ci_ignore_layout:1,
 	/**
+	 * Need MDS intervention to complete a write. This usually means the
+	 * corresponding component is not initialized for the writing extent.
+	 */
+			ci_need_write_intent:1,
+	/**
 	 * Check if layout changed after the IO finishes. Mainly for HSM
 	 * requirement. If IO occurs to openning files, it doesn't need to
 	 * verify layout because HSM won't release openning files.
diff --git a/drivers/staging/lustre/lustre/include/lustre_sec.h b/drivers/staging/lustre/lustre/include/lustre_sec.h
index d35bcbc..43ff594 100644
--- a/drivers/staging/lustre/lustre/include/lustre_sec.h
+++ b/drivers/staging/lustre/lustre/include/lustre_sec.h
@@ -65,6 +65,7 @@
 struct ptlrpc_svc_ctx;
 struct ptlrpc_cli_ctx;
 struct ptlrpc_ctx_ops;
+struct req_msg_field;
 
 /**
  * \addtogroup flavor flavor
@@ -976,7 +977,8 @@ int cli_ctx_is_eternal(struct ptlrpc_cli_ctx *ctx)
 int sptlrpc_cli_alloc_repbuf(struct ptlrpc_request *req, int msgsize);
 void sptlrpc_cli_free_repbuf(struct ptlrpc_request *req);
 int sptlrpc_cli_enlarge_reqbuf(struct ptlrpc_request *req,
-			       int segment, int newsize);
+			       const struct req_msg_field *field,
+			       int newsize);
 int  sptlrpc_cli_unwrap_early_reply(struct ptlrpc_request *req,
 				    struct ptlrpc_request **req_ret);
 void sptlrpc_cli_finish_early_reply(struct ptlrpc_request *early_req);
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 8d67d1a..009e9e8 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3680,6 +3680,7 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	lock_res_and_lock(lock);
 	lvb_ready = ldlm_is_lvb_ready(lock);
 	unlock_res_and_lock(lock);
+
 	/* checking lvb_ready is racy but this is okay. The worst case is
 	 * that multi processes may configure the file on the same time.
 	 */
@@ -3709,7 +3710,6 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 
 	/* refresh layout failed, need to wait */
 	wait_layout = rc == -EBUSY;
-
 out:
 	LDLM_LOCK_PUT(lock);
 	ldlm_lock_decref(lockh, mode);
@@ -3735,38 +3735,37 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	return rc;
 }
 
-static int ll_layout_refresh_locked(struct inode *inode)
+/**
+ * Issue layout intent RPC to MDS.
+ * @inode	file inode
+ * @intent	layout intent
+ *
+ * RETURNS:
+ * 0		on success
+ * retval < 0	error code
+ */
+static int ll_layout_intent(struct inode *inode, struct layout_intent *intent)
 {
 	struct ll_inode_info  *lli = ll_i2info(inode);
 	struct ll_sb_info     *sbi = ll_i2sbi(inode);
 	struct md_op_data     *op_data;
 	struct lookup_intent   it;
-	struct lustre_handle   lockh;
-	enum ldlm_mode	       mode;
 	struct ptlrpc_request *req;
 	int rc;
 
-again:
-	/* mostly layout lock is caching on the local side, so try to match
-	 * it before grabbing layout lock mutex.
-	 */
-	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
-			       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
-	if (mode != 0) { /* hit cached lock */
-		rc = ll_layout_lock_set(&lockh, mode, inode);
-		if (rc == -EAGAIN)
-			goto again;
-		return rc;
-	}
-
 	op_data = ll_prep_md_op_data(NULL, inode, inode, NULL,
 				     0, 0, LUSTRE_OPC_ANY, NULL);
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
-	/* have to enqueue one */
+	op_data->op_data = intent;
+	op_data->op_data_size = sizeof(*intent);
+
 	memset(&it, 0, sizeof(it));
 	it.it_op = IT_LAYOUT;
+	if (intent->li_opc == LAYOUT_INTENT_WRITE ||
+	    intent->li_opc == LAYOUT_INTENT_TRUNC)
+		it.it_flags = FMODE_WRITE;
 
 	LDLM_DEBUG_NOLOCK("%s: requeue layout lock for file " DFID "(%p)",
 			  ll_get_fsname(inode->i_sb, NULL, 0),
@@ -3779,18 +3778,11 @@ static int ll_layout_refresh_locked(struct inode *inode)
 
 	ll_finish_md_op_data(op_data);
 
-	mode = it.it_lock_mode;
-	it.it_lock_mode = 0;
-	ll_intent_drop_lock(&it);
-
-	if (rc == 0) {
-		/* set lock data in case this is a new lock */
+	/* set lock data in case this is a new lock */
+	if (!rc)
 		ll_set_lock_data(sbi->ll_md_exp, inode, &it, NULL);
-		lockh.cookie = it.it_lock_handle;
-		rc = ll_layout_lock_set(&lockh, mode, inode);
-		if (rc == -EAGAIN)
-			goto again;
-	}
+
+	ll_intent_drop_lock(&it);
 
 	return rc;
 }
@@ -3812,6 +3804,11 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct layout_intent intent = {
+		.li_opc = LAYOUT_INTENT_ACCESS,
+	};
+	struct lustre_handle lockh;
+	enum ldlm_mode mode;
 	int rc;
 
 	*gen = ll_layout_version_get(lli);
@@ -3825,18 +3822,57 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 	/* take layout lock mutex to enqueue layout lock exclusively. */
 	mutex_lock(&lli->lli_layout_mutex);
 
-	rc = ll_layout_refresh_locked(inode);
-	if (rc < 0)
-		goto out;
+	while (1) {
+		/* mostly layout lock is caching on the local side, so try to
+		 * match it before grabbing layout lock mutex.
+		 */
+		mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
+				       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
+		if (mode != 0) { /* hit cached lock */
+			rc = ll_layout_lock_set(&lockh, mode, inode);
+			if (rc == -EAGAIN)
+				continue;
+			break;
+		}
 
-	*gen = ll_layout_version_get(lli);
-out:
+		rc = ll_layout_intent(inode, &intent);
+		if (rc != 0)
+			break;
+	}
+
+	if (rc == 0)
+		*gen = ll_layout_version_get(lli);
 	mutex_unlock(&lli->lli_layout_mutex);
 
 	return rc;
 }
 
 /**
+ * Issue layout intent RPC indicating where in a file an IO is about to write.
+ *
+ * \param[in] inode    file inode.
+ * \param[in] start    start offset of file in bytes where an IO is about to
+ *                     write.
+ * \param[in] end      exclusive end offset in bytes of the write range.
+ *
+ * \retval 0   on success
+ * \retval < 0 error code
+ */
+int ll_layout_write_intent(struct inode *inode, u64 start, u64 end)
+{
+	struct layout_intent intent = {
+		.li_opc = LAYOUT_INTENT_WRITE,
+		.li_start = start,
+		.li_end = end,
+	};
+	int rc;
+
+	rc = ll_layout_intent(inode, &intent);
+
+	return rc;
+}
+
+/**
  *  This function send a restore request to the MDT
  */
 int ll_layout_restore(struct inode *inode, loff_t offset, __u64 length)
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index e3f5450..b2a1f54 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -1320,6 +1320,7 @@ static inline void d_lustre_revalidate(struct dentry *dentry)
 int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf);
 int ll_layout_refresh(struct inode *inode, __u32 *gen);
 int ll_layout_restore(struct inode *inode, loff_t start, __u64 length);
+int ll_layout_write_intent(struct inode *inode, u64 start, u64 end);
 
 int ll_xattr_init(void);
 void ll_xattr_fini(void);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index d6b27ba..5323fea 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -281,18 +281,18 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 	struct cl_object *obj = io->ci_obj;
 	struct vvp_io    *vio = cl2vvp_io(env, ios);
 	struct inode *inode = vvp_object_inode(obj);
+	int rc;
 
 	CLOBINVRNT(env, obj, vvp_object_invariant(obj));
 
 	CDEBUG(D_VFSTRACE, DFID
-	       " ignore/verify layout %d/%d, layout version %d restore needed %d\n",
+	       " ignore/verify layout %d/%d, layout version %d need write layout %d, restore needed %d\n",
 	       PFID(lu_object_fid(&obj->co_lu)),
 	       io->ci_ignore_layout, io->ci_verify_layout,
-	       vio->vui_layout_gen, io->ci_restore_needed);
+	       vio->vui_layout_gen, io->ci_need_write_intent,
+	       io->ci_restore_needed);
 
 	if (io->ci_restore_needed) {
-		int	rc;
-
 		/* file was detected release, we need to restore it
 		 * before finishing the io
 		 */
@@ -318,6 +318,34 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 		}
 	}
 
+	/**
+	 * dynamic layout change needed, send layout intent
+	 * RPC.
+	 */
+	if (io->ci_need_write_intent) {
+		loff_t start = 0;
+		loff_t end = 0;
+
+		LASSERT(io->ci_type == CIT_WRITE || cl_io_is_trunc(io));
+
+		io->ci_need_write_intent = 0;
+
+		if (io->ci_type == CIT_WRITE) {
+			start = io->u.ci_rw.crw_pos;
+			end = io->u.ci_rw.crw_pos + io->u.ci_rw.crw_count;
+		} else {
+			end = io->u.ci_setattr.sa_attr.lvb_size;
+		}
+
+		CDEBUG(D_VFSTRACE, DFID" type %d [%llx, %llx)\n",
+		       PFID(lu_object_fid(&obj->co_lu)), io->ci_type,
+		       start, end);
+		rc = ll_layout_write_intent(inode, start, end);
+		io->ci_result = rc;
+		if (!rc)
+			io->ci_need_restart = 1;
+	}
+
 	if (!io->ci_ignore_layout && io->ci_verify_layout) {
 		__u32 gen = 0;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 124c12d..fd67fc9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -117,6 +117,10 @@ static void lsme_free(struct lov_stripe_md_entry *lsme)
 	unsigned int stripe_count = lsme->lsme_stripe_count;
 	unsigned int i;
 
+	if (!lsme_inited(lsme) ||
+	    lsme->lsme_pattern & LOV_PATTERN_F_RELEASED)
+		stripe_count = 0;
+
 	for (i = 0; i < stripe_count; i++)
 		kmem_cache_free(lov_oinfo_slab, lsme->lsme_oinfo[i]);
 
@@ -141,7 +145,7 @@ void lsm_free(struct lov_stripe_md *lsm)
  */
 static struct lov_stripe_md_entry *
 lsme_unpack(struct lov_obd *lov, struct lov_mds_md *lmm, size_t buf_size,
-	    const char *pool_name, struct lov_ost_data_v1 *objects,
+	    const char *pool_name, bool inited, struct lov_ost_data_v1 *objects,
 	    loff_t *maxbytes)
 {
 	struct lov_stripe_md_entry *lsme;
@@ -159,7 +163,7 @@ void lsm_free(struct lov_stripe_md *lsm)
 		return ERR_PTR(-EINVAL);
 
 	pattern = le32_to_cpu(lmm->lmm_pattern);
-	if (pattern & LOV_PATTERN_F_RELEASED)
+	if (pattern & LOV_PATTERN_F_RELEASED || !inited)
 		stripe_count = 0;
 	else
 		stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
@@ -185,8 +189,10 @@ void lsm_free(struct lov_stripe_md *lsm)
 
 	lsme->lsme_magic = magic;
 	lsme->lsme_pattern = pattern;
+	lsme->lsme_flags = 0;
 	lsme->lsme_stripe_size = le32_to_cpu(lmm->lmm_stripe_size);
-	lsme->lsme_stripe_count = stripe_count;
+	/* preserve the possible -1 stripe count for uninstantiated component */
+	lsme->lsme_stripe_count = le16_to_cpu(lmm->lmm_stripe_count);
 	lsme->lsme_layout_gen = le16_to_cpu(lmm->lmm_layout_gen);
 
 	if (pool_name) {
@@ -282,10 +288,12 @@ void lsm_free(struct lov_stripe_md *lsm)
 
 	pattern = le32_to_cpu(lmm->lmm_pattern);
 
-	lsme = lsme_unpack(lov, lmm, buf_size, pool_name, objects, &maxbytes);
+	lsme = lsme_unpack(lov, lmm, buf_size, pool_name, true, objects,
+			   &maxbytes);
 	if (IS_ERR(lsme))
 		return ERR_CAST(lsme);
 
+	lsme->lsme_flags = LCME_FL_INIT;
 	lsme->lsme_extent.e_start = 0;
 	lsme->lsme_extent.e_end = LUSTRE_EOF;
 
@@ -371,7 +379,7 @@ static int lsm_verify_comp_md_v1(struct lov_comp_md_v1 *lcm,
 
 static struct lov_stripe_md_entry *
 lsme_unpack_comp(struct lov_obd *lov, struct lov_mds_md *lmm,
-		 size_t lmm_buf_size, loff_t *maxbytes)
+		 size_t lmm_buf_size, bool inited, loff_t *maxbytes)
 {
 	unsigned int stripe_count;
 	unsigned int magic;
@@ -380,6 +388,10 @@ static int lsm_verify_comp_md_v1(struct lov_comp_md_v1 *lcm,
 	if (stripe_count == 0)
 		return ERR_PTR(-EINVAL);
 
+	/* un-instantiated lmm contains no ost id info, i.e. lov_ost_data_v1 */
+	if (!inited)
+		stripe_count = 0;
+
 	magic = le32_to_cpu(lmm->lmm_magic);
 	if (magic != LOV_MAGIC_V1 && magic != LOV_MAGIC_V3)
 		return ERR_PTR(-EINVAL);
@@ -389,12 +401,12 @@ static int lsm_verify_comp_md_v1(struct lov_comp_md_v1 *lcm,
 
 	if (magic == LOV_MAGIC_V1) {
 		return lsme_unpack(lov, lmm, lmm_buf_size, NULL,
-				   lmm->lmm_objects, maxbytes);
+				   inited, lmm->lmm_objects, maxbytes);
 	} else {
 		struct lov_mds_md_v3 *lmm3 = (struct lov_mds_md_v3 *)lmm;
 
 		return lsme_unpack(lov, lmm, lmm_buf_size, lmm3->lmm_pool_name,
-				   lmm3->lmm_objects, maxbytes);
+				   inited, lmm3->lmm_objects, maxbytes);
 	}
 }
 
@@ -440,6 +452,7 @@ static int lsm_verify_comp_md_v1(struct lov_comp_md_v1 *lcm,
 		blob = (char *)lcm + blob_offset;
 
 		lsme = lsme_unpack_comp(lov, blob, blob_size,
+					le32_to_cpu(lcme->lcme_flags) & LCME_FL_INIT,
 					(i == entry_count - 1) ? &maxbytes :
 					NULL);
 		if (IS_ERR(lsme)) {
@@ -452,6 +465,7 @@ static int lsm_verify_comp_md_v1(struct lov_comp_md_v1 *lcm,
 
 		lsm->lsm_entries[i] = lsme;
 		lsme->lsme_id = le32_to_cpu(lcme->lcme_id);
+		lsme->lsme_flags = le32_to_cpu(lcme->lcme_flags);
 		lu_extent_le_to_cpu(&lsme->lsme_extent, &lcme->lcme_extent);
 
 		if (i == entry_count - 1) {
@@ -507,7 +521,7 @@ const struct lsm_operations *lsm_op_find(int magic)
 
 void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
 {
-	int i;
+	int i, j;
 
 	CDEBUG(level,
 	       "lsm %p, objid " DOSTID ", maxbytes %#llx, magic 0x%08X, refc: %d, entry: %u, layout_gen %u\n",
@@ -519,10 +533,23 @@ void dump_lsm(unsigned int level, const struct lov_stripe_md *lsm)
 		struct lov_stripe_md_entry *lse = lsm->lsm_entries[i];
 
 		CDEBUG(level,
-		       DEXT ": id: %u, magic 0x%08X, stripe count %u, size %u, layout_gen %u, pool: [" LOV_POOLNAMEF "]\n",
-		       PEXT(&lse->lsme_extent), lse->lsme_id, lse->lsme_magic,
-		       lse->lsme_stripe_count, lse->lsme_stripe_size,
-		       lse->lsme_layout_gen, lse->lsme_pool_name);
+		       DEXT ": id: %u, flags: %x, magic 0x%08X, layout_gen %u, stripe count %u, stripe size %u, pool: [" LOV_POOLNAMEF "]\n",
+		       PEXT(&lse->lsme_extent), lse->lsme_id, lse->lsme_flags,
+		       lse->lsme_magic, lse->lsme_layout_gen,
+		       lse->lsme_stripe_count, lse->lsme_stripe_size,
+		       lse->lsme_pool_name);
+		if (!lsme_inited(lse) ||
+		    lse->lsme_pattern & LOV_PATTERN_F_RELEASED)
+			continue;
+
+		for (j = 0; j < lse->lsme_stripe_count; j++) {
+			CDEBUG(level,
+			       "   oinfo:%p: ostid: " DOSTID " ost idx: %d gen: %d\n",
+			       lse->lsme_oinfo[j],
+			       POSTID(&lse->lsme_oinfo[j]->loi_oi),
+			       lse->lsme_oinfo[j]->loi_ost_idx,
+			       lse->lsme_oinfo[j]->loi_ost_gen);
+		}
 	}
 }
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index e8102df..5e3eae7 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -48,6 +48,7 @@ struct lov_stripe_md_entry {
 	struct lu_extent	lsme_extent;
 	u32			lsme_id;
 	u32			lsme_magic;
+	u32			lsme_flags;
 	u32			lsme_pattern;
 	u32			lsme_stripe_size;
 	u16			lsme_stripe_count;
@@ -56,6 +57,17 @@ struct lov_stripe_md_entry {
 	struct lov_oinfo       *lsme_oinfo[];
 };
 
+static inline void copy_lsm_entry(struct lov_stripe_md_entry *dst,
+				  struct lov_stripe_md_entry *src)
+{
+	unsigned int i;
+
+	for (i = 0; i < src->lsme_stripe_count; i++)
+		*dst->lsme_oinfo[i] = *src->lsme_oinfo[i];
+
+	memcpy(dst, src, offsetof(typeof(*src), lsme_oinfo));
+}
+
 struct lov_stripe_md {
 	atomic_t	lsm_refc;
 	spinlock_t	lsm_lock;
@@ -74,6 +86,16 @@ struct lov_stripe_md {
 	struct lov_stripe_md_entry *lsm_entries[];
 };
 
+static inline bool lsme_inited(const struct lov_stripe_md_entry *lsme)
+{
+	return lsme->lsme_flags & LCME_FL_INIT;
+}
+
+static inline bool lsm_entry_inited(const struct lov_stripe_md *lsm, int index)
+{
+	return lsme_inited(lsm->lsm_entries[index]);
+}
+
 static inline size_t lov_comp_md_size(const struct lov_stripe_md *lsm)
 {
 	struct lov_stripe_md_entry *lsme;
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 70908b1..8a1bb85 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -394,6 +394,11 @@ static int lov_io_iter_init(const struct lu_env *env,
 		u64 start;
 		u64 end;
 
+		CDEBUG(D_VFSTRACE, "component[%d] flags %#x\n",
+		       index, lsm->lsm_entries[index]->lsme_flags);
+		if (!lsm_entry_inited(lsm, index))
+			break;
+
 		index++;
 		if (!lu_extent_is_overlapped(&ext, &le->lle_extent))
 			continue;
@@ -442,6 +447,7 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 			       const struct cl_io_slice *ios)
 {
 	struct lov_io	*lio = cl2lov_io(env, ios);
+	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
 	struct cl_io	 *io  = ios->cis_io;
 	u64 start = io->u.ci_rw.crw_pos;
 	struct lov_stripe_md_entry *lse;
@@ -454,7 +460,7 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	if (cl_io_is_append(io))
 		return lov_io_iter_init(env, ios);
 
-	index = lov_lsm_entry(lio->lis_object->lo_lsm, io->u.ci_rw.crw_pos);
+	index = lov_lsm_entry(lsm, io->u.ci_rw.crw_pos);
 	if (index < 0) { /* non-existing layout component */
 		if (io->ci_type == CIT_READ) {
 			/* TODO: it needs to detect the next component and
@@ -476,7 +482,9 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	if (next <= start * ssize)
 		next = ~0ull;
 
-	LASSERT(io->u.ci_rw.crw_pos >= lse->lsme_extent.e_start);
+	LASSERTF(io->u.ci_rw.crw_pos >= lse->lsme_extent.e_start,
+		 "pos %lld, [%lld, %lld]\n", io->u.ci_rw.crw_pos,
+		 lse->lsme_extent.e_start, lse->lsme_extent.e_end);
 	next = min_t(u64, next, lse->lsme_extent.e_end);
 	next = min_t(u64, next, lio->lis_io_endpos);
 
@@ -486,9 +494,16 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	lio->lis_endpos = io->u.ci_rw.crw_pos + io->u.ci_rw.crw_count;
 
 	CDEBUG(D_VFSTRACE,
-	       "stripe: %llu chunk: [%llu, %llu) %llu\n",
-	       (u64)start, lio->lis_pos, lio->lis_endpos,
-	       (u64)lio->lis_io_endpos);
+	       "stripe: %llu chunk: [%llu, %llu] %llu\n",
+	       start, lio->lis_pos, lio->lis_endpos,
+	       lio->lis_io_endpos);
+
+	index = lov_lsm_entry(lsm, lio->lis_endpos - 1);
+	if (index > 0 && !lsm_entry_inited(lsm, index)) {
+		io->ci_need_write_intent = 1;
+		io->ci_result = -ENODATA;
+		return io->ci_result;
+	}
 
 	/*
 	 * XXX The following call should be optimized: we know, that
@@ -497,6 +512,26 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	return lov_io_iter_init(env, ios);
 }
 
+static int lov_io_setattr_iter_init(const struct lu_env *env,
+				    const struct cl_io_slice *ios)
+{
+	struct lov_io *lio = cl2lov_io(env, ios);
+	struct cl_io *io = ios->cis_io;
+	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
+	int index;
+
+	if (cl_io_is_trunc(io) && lio->lis_pos) {
+		index = lov_lsm_entry(lsm, lio->lis_pos - 1);
+		if (index > 0 && !lsm_entry_inited(lsm, index)) {
+			io->ci_need_write_intent = 1;
+			io->ci_result = -ENODATA;
+			return io->ci_result;
+		}
+	}
+
+	return lov_io_iter_init(env, ios);
+}
+
 static int lov_io_call(const struct lu_env *env, struct lov_io *lio,
 		       int (*iofunc)(const struct lu_env *, struct cl_io *))
 {
@@ -617,7 +652,7 @@ static int lov_io_read_ahead(const struct lu_env *env,
 
 	offset = cl_offset(obj, start);
 	index = lov_lsm_entry(loo->lo_lsm, offset);
-	if (index < 0)
+	if (index < 0 || !lsm_entry_inited(loo->lo_lsm, index))
 		return -ENODATA;
 
 	stripe = lov_stripe_number(loo->lo_lsm, index, offset);
@@ -870,7 +905,7 @@ static void lov_io_fsync_end(const struct lu_env *env,
 		},
 		[CIT_SETATTR] = {
 			.cio_fini      = lov_io_fini,
-			.cio_iter_init = lov_io_iter_init,
+			.cio_iter_init = lov_io_setattr_iter_init,
 			.cio_iter_fini = lov_io_iter_fini,
 			.cio_lock      = lov_io_lock,
 			.cio_unlock    = lov_io_unlock,
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index ba31be4..9a46424 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -132,7 +132,7 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 
 	nr = 0;
 	for (index = lov_lsm_entry(lov->lo_lsm, ext.e_start);
-	     index != -1 && index < lov->lo_lsm->lsm_entry_count; index++) {
+	     index >= 0 && index < lov->lo_lsm->lsm_entry_count; index++) {
 		struct lov_layout_raid0 *r0 = lov_r0(lov, index);
 
 		/* assume lsm entries are sorted. */
@@ -147,8 +147,11 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 				nr++;
 		}
 	}
-	if (nr == 0)
-		return ERR_PTR(-EINVAL);
+	/**
+	 * Aggressive lock request (from cl_setattr_ost) which asks for
+	 * [eof, -1) lock, could come across uninstantiated layout extent,
+	 * hence a 0 nr is possible.
+	 */
 
 	lovlck = kvzalloc(offsetof(struct lov_lock, lls_sub[nr]),
 				 GFP_NOFS);
@@ -158,7 +161,7 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env,
 	lovlck->lls_nr = nr;
 	nr = 0;
 	for (index = lov_lsm_entry(lov->lo_lsm, ext.e_start);
-	     index < lov->lo_lsm->lsm_entry_count; index++) {
+	     index >= 0 && index < lov->lo_lsm->lsm_entry_count; index++) {
 		struct lov_layout_raid0 *r0 = lov_r0(lov, index);
 
 		/* assume lsm entries are sorted. */
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 66fb6f5..680d232 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -64,8 +64,6 @@ struct lov_layout_operations {
 			  union lov_layout_state *state);
 	void (*llo_fini)(const struct lu_env *env, struct lov_object *lov,
 			 union lov_layout_state *state);
-	void (*llo_install)(const struct lu_env *env, struct lov_object *lov,
-			    union lov_layout_state *state);
 	int  (*llo_print)(const struct lu_env *env, void *cookie,
 			  lu_printer_t p, const struct lu_object *o);
 	int  (*llo_page_init)(const struct lu_env *env, struct cl_object *obj,
@@ -92,16 +90,6 @@ static void lov_lsm_put(struct lov_stripe_md *lsm)
  * Lov object layout operations.
  *
  */
-
-static void lov_install_empty(const struct lu_env *env,
-			      struct lov_object *lov,
-			      union  lov_layout_state *state)
-{
-	/*
-	 * File without objects.
-	 */
-}
-
 static int lov_init_empty(const struct lu_env *env, struct lov_device *dev,
 			  struct lov_object *lov, struct lov_stripe_md *lsm,
 			  const struct cl_object_conf *conf,
@@ -110,12 +98,6 @@ static int lov_init_empty(const struct lu_env *env, struct lov_device *dev,
 	return 0;
 }
 
-static void lov_install_composite(const struct lu_env *env,
-				  struct lov_object *lov,
-				  union lov_layout_state *state)
-{
-}
-
 static struct cl_object *lov_sub_find(const struct lu_env *env,
 				      struct cl_device *dev,
 				      const struct lu_fid *fid,
@@ -328,6 +310,14 @@ static int lov_init_composite(const struct lu_env *env, struct lov_device *dev,
 		struct lov_layout_entry *le = &comp->lo_entries[i];
 
 		le->lle_extent = lsm->lsm_entries[i]->lsme_extent;
+		/**
+		 * If the component has not been init-ed on MDS side, for
+		 * PFL layout, we'd know that the components beyond this one
+		 * will be dynamically init-ed later on file write/trunc ops.
+		 */
+		if (!lsm_entry_inited(lsm, i))
+			continue;
+
 		result = lov_init_raid0(env, dev, lov, i, &le->lle_raid0);
 		if (result < 0)
 			break;
@@ -471,13 +461,15 @@ static int lov_delete_composite(const struct lu_env *env,
 				struct lov_object *lov,
 				union lov_layout_state *state)
 {
+	struct lov_layout_composite *comp = &state->composite;
 	struct lov_layout_entry *entry;
 
 	dump_lsm(D_INODE, lov->lo_lsm);
 
 	lov_layout_wait(env, lov);
-	lov_foreach_layout_entry(lov, entry)
-		lov_delete_raid0(env, lov, &entry->lle_raid0);
+	if (comp->lo_entries)
+		lov_foreach_layout_entry(lov, entry)
+			lov_delete_raid0(env, lov, &entry->lle_raid0);
 
 	return 0;
 }
@@ -565,9 +557,9 @@ static int lov_print_composite(const struct lu_env *env, void *cookie,
 	for (i = 0; i < lsm->lsm_entry_count; i++) {
 		struct lov_stripe_md_entry *lse = lsm->lsm_entries[i];
 
-		(*p)(env, cookie, DEXT ": { 0x%08X, %u, %u, %u, %u }\n",
+		(*p)(env, cookie, DEXT ": { 0x%08X, %u, %u, %#x, %u, %u }\n",
 		     PEXT(&lse->lsme_extent), lse->lsme_magic,
-		     lse->lsme_id, lse->lsme_layout_gen,
+		     lse->lsme_id, lse->lsme_layout_gen, lse->lsme_flags,
 		     lse->lsme_stripe_count, lse->lsme_stripe_size);
 		lov_print_raid0(env, cookie, p, lov_r0(lov, i));
 	}
@@ -664,6 +656,10 @@ static int lov_attr_get_composite(const struct lu_env *env,
 		struct lov_layout_raid0 *r0 = &entry->lle_raid0;
 		struct cl_attr *lov_attr = &r0->lo_attr;
 
+		/* PFL: This component has not been init-ed. */
+		if (!lsm_entry_inited(lov->lo_lsm, index))
+			break;
+
 		result = lov_attr_get_raid0(env, lov, index, r0);
 		if (result != 0)
 			break;
@@ -691,7 +687,6 @@ static int lov_attr_get_composite(const struct lu_env *env,
 		.llo_init      = lov_init_empty,
 		.llo_delete    = lov_delete_empty,
 		.llo_fini      = lov_fini_empty,
-		.llo_install   = lov_install_empty,
 		.llo_print     = lov_print_empty,
 		.llo_page_init = lov_page_init_empty,
 		.llo_lock_init = lov_lock_init_empty,
@@ -702,7 +697,6 @@ static int lov_attr_get_composite(const struct lu_env *env,
 		.llo_init      = lov_init_released,
 		.llo_delete    = lov_delete_empty,
 		.llo_fini      = lov_fini_released,
-		.llo_install   = lov_install_empty,
 		.llo_print     = lov_print_released,
 		.llo_page_init = lov_page_init_empty,
 		.llo_lock_init = lov_lock_init_empty,
@@ -713,7 +707,6 @@ static int lov_attr_get_composite(const struct lu_env *env,
 		.llo_init	= lov_init_composite,
 		.llo_delete	= lov_delete_composite,
 		.llo_fini	= lov_fini_composite,
-		.llo_install	= lov_install_composite,
 		.llo_print	= lov_print_composite,
 		.llo_page_init	= lov_page_init_composite,
 		.llo_lock_init	= lov_lock_init_composite,
@@ -894,7 +887,6 @@ static int lov_layout_change(const struct lu_env *unused,
 		goto out;
 	}
 
-	new_ops->llo_install(env, lov, state);
 	lov->lo_type = llt;
 out:
 	cl_env_put(env, &refcheck);
@@ -937,8 +929,6 @@ int lov_object_init(const struct lu_env *env, struct lu_object *obj,
 	lov->lo_type = lov_type(lsm);
 	ops = &lov_dispatch[lov->lo_type];
 	rc = ops->llo_init(env, dev, lov, lsm, cconf, set);
-	if (!rc)
-		ops->llo_install(env, lov, set);
 
 	lov_lsm_put(lsm);
 
@@ -959,6 +949,7 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj,
 				   conf->u.coc_layout.lb_len);
 		if (IS_ERR(lsm))
 			return PTR_ERR(lsm);
+		dump_lsm(D_INODE, lsm);
 	}
 
 	lov_conf_lock(lov);
@@ -1541,6 +1532,9 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 	for (entry = start_entry; entry <= end_entry; entry++) {
 		lsme = lsm->lsm_entries[entry];
 
+		if (!lsme_inited(lsme))
+			break;
+
 		if (entry == start_entry)
 			fs.fs_ext.e_start = whole_start;
 		else
@@ -1751,6 +1745,9 @@ int lov_read_and_clear_async_rc(struct cl_object *clob)
 				int j;
 
 				lse = lsm->lsm_entries[i];
+				if (!lsme_inited(lse))
+					break;
+
 				for (j = 0; j < lse->lsme_stripe_count; j++) {
 					struct lov_oinfo *loi;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 79d8a32..32e4b33 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -146,6 +146,9 @@ ssize_t lov_lsm_pack_v1v3(const struct lov_stripe_md *lsm, void *buf,
 		lmm_objects = lmmv1->lmm_objects;
 	}
 
+	if (lsm->lsm_is_released)
+		return lmm_size;
+
 	for (i = 0; i < lsm->lsm_entries[0]->lsme_stripe_count; i++) {
 		struct lov_oinfo *loi = lsm->lsm_entries[0]->lsme_oinfo[i];
 
@@ -189,11 +192,13 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 	for (entry = 0; entry < lsm->lsm_entry_count; entry++) {
 		struct lov_stripe_md_entry *lsme;
 		struct lov_mds_md *lmm;
+		u16 stripecnt;
 
 		lsme = lsm->lsm_entries[entry];
 		lcme = &lcmv1->lcm_entries[entry];
 
 		lcme->lcme_id = cpu_to_le32(lsme->lsme_id);
+		lcme->lcme_flags = cpu_to_le32(lsme->lsme_flags);
 		lcme->lcme_extent.e_start =
 			cpu_to_le64(lsme->lsme_extent.e_start);
 		lcme->lcme_extent.e_end =
@@ -220,7 +225,13 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 			lmm_objects = ((struct lov_mds_md_v1 *)lmm)->lmm_objects;
 		}
 
-		for (i = 0; i < lsme->lsme_stripe_count; i++) {
+		if (lsme_inited(lsme) &&
+		    !(lsme->lsme_pattern & LOV_PATTERN_F_RELEASED))
+			stripecnt = lsme->lsme_stripe_count;
+		else
+			stripecnt = 0;
+
+		for (i = 0; i < stripecnt; i++) {
 			struct lov_oinfo *loi = lsme->lsme_oinfo[i];
 
 			ostid_cpu_to_le(&loi->loi_oi, &lmm_objects[i].l_ost_oi);
@@ -230,8 +241,7 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 				cpu_to_le32(loi->loi_ost_idx);
 		}
 
-		size = lov_mds_md_size(lsme->lsme_stripe_count,
-				       lsme->lsme_magic);
+		size = lov_mds_md_size(stripecnt, lsme->lsme_magic);
 		lcme->lcme_size = cpu_to_le32(size);
 		offset += size;
 	} /* for each layout component */
@@ -314,9 +324,6 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 	size_t lmmk_size;
 	int rc = 0;
 
-	if (!lsm)
-		return -ENODATA;
-
 	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3 &&
 	    lsm->lsm_magic != LOV_MAGIC_COMP_V1) {
 		CERROR("bad LSM MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index f53379a..8b68d3c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -81,7 +81,7 @@ int lov_page_init_composite(const struct lu_env *env, struct cl_object *obj,
 
 	offset = cl_offset(obj, index);
 	entry = lov_lsm_entry(loo->lo_lsm, offset);
-	if (entry < 0) {
+	if (entry < 0 || !lsm_entry_inited(loo->lo_lsm, entry)) {
 		/* non-existing layout component */
 		lov_page_init_empty(env, obj, page, index);
 		return 0;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 7d4ba9c..0abe426 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -214,20 +214,32 @@ static inline void mdc_clear_replay_flag(struct ptlrpc_request *req, int rc)
  * but this is incredibly unlikely, and questionable whether the client
  * could do MDS recovery under OOM anyways...
  */
-static void mdc_realloc_openmsg(struct ptlrpc_request *req,
-				struct mdt_body *body)
+static int mdc_save_lovea(struct ptlrpc_request *req,
+			  const struct req_msg_field *field,
+			  void *data, u32 size)
 {
-	int     rc;
-
-	/* FIXME: remove this explicit offset. */
-	rc = sptlrpc_cli_enlarge_reqbuf(req, DLM_INTENT_REC_OFF + 4,
-					body->mbo_eadatasize);
-	if (rc) {
-		CERROR("Can't enlarge segment %d size to %d\n",
-		       DLM_INTENT_REC_OFF + 4, body->mbo_eadatasize);
-		body->mbo_valid &= ~OBD_MD_FLEASIZE;
-		body->mbo_eadatasize = 0;
+	struct req_capsule *pill = &req->rq_pill;
+	int rc = 0;
+	void *lmm;
+
+	if (req_capsule_get_size(pill, field, RCL_CLIENT) < size) {
+		rc = sptlrpc_cli_enlarge_reqbuf(req, field, size);
+		if (rc) {
+			CERROR("%s: Can't enlarge ea size to %d: rc = %d\n",
+			       req->rq_export->exp_obd->obd_name,
+			       size, rc);
+			return rc;
+		}
+	} else {
+		req_capsule_shrink(pill, field, size, RCL_CLIENT);
 	}
+
+	req_capsule_set_size(pill, field, RCL_CLIENT, size);
+	lmm = req_capsule_client_get(pill, field);
+	if (lmm)
+		memcpy(lmm, data, size);
+
+	return rc;
 }
 
 static struct ptlrpc_request *
@@ -470,7 +482,7 @@ static struct ptlrpc_request *mdc_intent_getattr_pack(struct obd_export *exp,
 
 static struct ptlrpc_request *mdc_intent_layout_pack(struct obd_export *exp,
 						     struct lookup_intent *it,
-						     struct md_op_data *unused)
+						     struct md_op_data *op_data)
 {
 	struct obd_device     *obd = class_exp2obd(exp);
 	struct ptlrpc_request *req;
@@ -496,10 +508,9 @@ static struct ptlrpc_request *mdc_intent_layout_pack(struct obd_export *exp,
 
 	/* pack the layout intent request */
 	layout = req_capsule_client_get(&req->rq_pill, &RMF_LAYOUT_INTENT);
-	/* LAYOUT_INTENT_ACCESS is generic, specific operation will be
-	 * set for replication
-	 */
-	layout->li_opc = LAYOUT_INTENT_ACCESS;
+	LASSERT(op_data->op_data);
+	LASSERT(op_data->op_data_size == sizeof(*layout));
+	memcpy(layout, op_data->op_data, sizeof(*layout));
 
 	req_capsule_set_size(&req->rq_pill, &RMF_DLM_LVB, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
@@ -649,24 +660,13 @@ static int mdc_finish_enqueue(struct obd_export *exp,
 			 * (for example error one).
 			 */
 			if ((it->it_op & IT_OPEN) && req->rq_replay) {
-				void *lmm;
-
-				if (req_capsule_get_size(pill, &RMF_EADATA,
-							 RCL_CLIENT) <
-				    body->mbo_eadatasize)
-					mdc_realloc_openmsg(req, body);
-				else
-					req_capsule_shrink(pill, &RMF_EADATA,
-							   body->mbo_eadatasize,
-							   RCL_CLIENT);
-
-				req_capsule_set_size(pill, &RMF_EADATA,
-						     RCL_CLIENT,
-						     body->mbo_eadatasize);
-
-				lmm = req_capsule_client_get(pill, &RMF_EADATA);
-				if (lmm)
-					memcpy(lmm, eadata, body->mbo_eadatasize);
+				rc = mdc_save_lovea(req, &RMF_EADATA, eadata,
+						    body->mbo_eadatasize);
+				if (rc) {
+					body->mbo_valid &= ~OBD_MD_FLEASIZE;
+					body->mbo_eadatasize = 0;
+					rc = 0;
+				}
 			}
 		}
 	} else if (it->it_op & IT_LAYOUT) {
@@ -680,6 +680,15 @@ static int mdc_finish_enqueue(struct obd_export *exp,
 								lvb_len);
 			if (!lvb_data)
 				return -EPROTO;
+
+			/**
+			 * save replied layout data to the request buffer for
+			 * recovery consideration (lest MDS reinitialize
+			 * another set of OST objects).
+			 */
+			if (req->rq_transno)
+				(void)mdc_save_lovea(req, &RMF_EADATA, lvb_data,
+						     lvb_len);
 		}
 	}
 
diff --git a/drivers/staging/lustre/lustre/obdclass/genops.c b/drivers/staging/lustre/lustre/obdclass/genops.c
index 76bc73f..03df181 100644
--- a/drivers/staging/lustre/lustre/obdclass/genops.c
+++ b/drivers/staging/lustre/lustre/obdclass/genops.c
@@ -1546,6 +1546,16 @@ static inline bool obd_mod_rpc_slot_avail(struct client_obd *cli,
 	return avail;
 }
 
+static inline bool obd_skip_mod_rpc_slot(const struct lookup_intent *it)
+{
+	if (it &&
+	    (it->it_op == IT_GETATTR || it->it_op == IT_LOOKUP ||
+	     it->it_op == IT_READDIR ||
+	     (it->it_op == IT_LAYOUT && !(it->it_flags & FMODE_WRITE))))
+		return true;
+	return false;
+}
+
 /* Get a modify RPC slot from the obd client @cli according
  * to the kind of operation @opc that is going to be sent
  * and the intent @it of the operation if it applies.
@@ -1563,8 +1573,7 @@ u16 obd_get_mod_rpc_slot(struct client_obd *cli, __u32 opc,
 	/* read-only metadata RPCs don't consume a slot on MDT
 	 * for reply reconstruction
 	 */
-	if (it && (it->it_op == IT_GETATTR || it->it_op == IT_LOOKUP ||
-		   it->it_op == IT_LAYOUT || it->it_op == IT_READDIR))
+	if (obd_skip_mod_rpc_slot(it))
 		return 0;
 
 	if (opc == MDS_CLOSE)
@@ -1610,8 +1619,7 @@ void obd_put_mod_rpc_slot(struct client_obd *cli, u32 opc,
 {
 	bool close_req = false;
 
-	if (it && (it->it_op == IT_GETATTR || it->it_op == IT_LOOKUP ||
-		   it->it_op == IT_LAYOUT || it->it_op == IT_READDIR))
+	if (obd_skip_mod_rpc_slot(it))
 		return;
 
 	if (opc == MDS_CLOSE)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index d3c0dd6..a155200 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -1797,9 +1797,9 @@ int req_capsule_server_pack(struct req_capsule *pill)
  * Returns the PTLRPC request or reply (\a loc) buffer offset of a \a pill
  * corresponding to the given RMF (\a field).
  */
-static u32 __req_capsule_offset(const struct req_capsule *pill,
-				const struct req_msg_field *field,
-				enum req_location loc)
+u32 __req_capsule_offset(const struct req_capsule *pill,
+			 const struct req_msg_field *field,
+			 enum req_location loc)
 {
 	u32 offset;
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
index 0e4a215..177010c 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
@@ -88,7 +88,7 @@ void ptlrpc_set_add_new_req(struct ptlrpcd_ctl *pc,
 void ptlrpc_initiate_recovery(struct obd_import *imp);
 
 int lustre_unpack_req_ptlrpc_body(struct ptlrpc_request *req, int offset);
 int lustre_unpack_rep_ptlrpc_body(struct ptlrpc_request *req, int offset);
 
 int ptlrpc_sysfs_register_service(struct kset *parent,
 				  struct ptlrpc_service *svc);
@@ -284,6 +284,11 @@ void sptlrpc_conf_choose_flavor(enum lustre_sec_part from,
 int  sptlrpc_init(void);
 void sptlrpc_fini(void);
 
+/* layout.c */
+u32 __req_capsule_offset(const struct req_capsule *pill,
+			 const struct req_msg_field *field,
+			 enum req_location loc);
+
 static inline bool ptlrpc_recoverable_error(int rc)
 {
 	return (rc == -ENOTCONN || rc == -ENODEV);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
index 9c59871..53f4d4f 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
@@ -1611,11 +1611,14 @@ void _sptlrpc_enlarge_msg_inplace(struct lustre_msg *msg,
  * so caller should refresh its local pointers if needed.
  */
 int sptlrpc_cli_enlarge_reqbuf(struct ptlrpc_request *req,
-			       int segment, int newsize)
+			       const struct req_msg_field *field,
+			       int newsize)
 {
+	struct req_capsule *pill = &req->rq_pill;
 	struct ptlrpc_cli_ctx *ctx = req->rq_cli_ctx;
 	struct ptlrpc_sec_cops *cops;
 	struct lustre_msg *msg = req->rq_reqmsg;
+	int segment = __req_capsule_offset(pill, field, RCL_CLIENT);
 
 	LASSERT(ctx);
 	LASSERT(msg);
-- 
1.8.3.1


* [lustre-devel] [PATCH 19/28] lustre: pfl: calculate PFL file LOVEA correctly
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (17 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 18/28] lustre: pfl: dynamic layout modification with write/truncate James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 20/28] lustre: lov: keep minimum LOVEA size James Simmons
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

A PFL file can contain uninstantiated components, which still carry
the user-specified stripe count of -1.
lov_mds_md_size()/lov_user_md_size() must handle this case;
otherwise the computed LOVEA size is erroneously large.
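
The size arithmetic can be sketched in isolation as below. This is only
an illustration under stated assumptions: the header and per-stripe sizes
(SKETCH_HDR_SIZE, SKETCH_OST_DATA_SIZE) and the name sketch_md_size() are
hypothetical placeholders, not the real Lustre definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define SKETCH_HDR_SIZE		32u
#define SKETCH_OST_DATA_SIZE	24u

/*
 * An uninstantiated component keeps a stripe count of (uint16_t)-1.
 * Treat that value as "no stripes" so the size stays at the header size.
 */
uint32_t sketch_md_size(uint16_t stripes)
{
	if (stripes == (uint16_t)-1)
		stripes = 0;

	return SKETCH_HDR_SIZE + stripes * SKETCH_OST_DATA_SIZE;
}
```

Without the check, a count of 65535 would be multiplied by the per-stripe
size, which with these placeholder sizes gives well over a megabyte for a
component that has no objects at all.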

Signed-off-by: Bobi Jam <bobijam@hotmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9335
Reviewed-on: https://review.whamcloud.com/26597
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h  | 3 +++
 drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
index d1693e3..deb0f0e 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
@@ -1036,6 +1036,9 @@ struct lov_mds_md_v3 {		/* LOV EA mds/wire data (little-endian) */
 
 static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 {
+	if (stripes == (__u16)-1)
+		stripes = 0;
+
 	if (lmm_magic == LOV_MAGIC_V3)
 		return sizeof(struct lov_mds_md_v3) +
 				stripes * sizeof(struct lov_ost_data_v1);
diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
index 8e6d67b..b19adef 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
@@ -457,6 +457,9 @@ struct lov_comp_md_v1 {
 
 static inline __u32 lov_user_md_size(__u16 stripes, __u32 lmm_magic)
 {
+	if (stripes == (__u16)-1)
+		stripes = 0;
+
 	if (lmm_magic == LOV_USER_MAGIC_V1)
 		return sizeof(struct lov_user_md_v1) +
 				stripes * sizeof(struct lov_user_ost_data_v1);
-- 
1.8.3.1


* [lustre-devel] [PATCH 20/28] lustre: lov: keep minimum LOVEA size
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (18 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 19/28] lustre: pfl: calculate PFL file LOVEA correctly James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 21/28] lustre: pfl: Read should not trigger layout write intent James Simmons
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

For a PFL file, some components may be uninstantiated, and their
lov_ost_data_v1 arrays are not needed, so we should keep the LOVEA
as small as possible.

An uninstantiated component's stripe offset should be set.
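
The intended size computation can be sketched as follows. This is a
simplified model, not the Lustre code: struct sk_entry, the SK_* sizes and
sk_comp_md_size() are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define SK_COMP_HDR	32u	/* composite layout header */
#define SK_ENTRY_HDR	48u	/* per-component entry header */
#define SK_MD_HDR	32u	/* per-component layout header */
#define SK_OST_DATA	24u	/* one per-stripe object record */

struct sk_entry {
	bool		inited;		/* component instantiated? */
	uint16_t	stripe_count;
};

/* Uninstantiated components contribute headers but no stripe records. */
size_t sk_comp_md_size(const struct sk_entry *e, size_t n)
{
	size_t size = SK_COMP_HDR;
	size_t i;

	for (i = 0; i < n; i++) {
		uint16_t cnt = e[i].inited ? e[i].stripe_count : 0;

		size += SK_ENTRY_HDR + SK_MD_HDR + (size_t)cnt * SK_OST_DATA;
	}
	return size;
}
```

The point of the sketch is that the stripe count fed into the size
calculation is the effective one (zero when not instantiated), not the
raw value stored in the component.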

Signed-off-by: Bobi Jam <bobijam@hotmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9489
Reviewed-on: https://review.whamcloud.com/27089
WC-bug-id: https://jira.whamcloud.com/browse/LU-9941
Reviewed-on: https://review.whamcloud.com/28845
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_internal.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 5e3eae7..dd4dd24 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -96,6 +96,11 @@ static inline bool lsm_entry_inited(const struct lov_stripe_md *lsm, int index)
 	return lsme_inited(lsm->lsm_entries[index]);
 }
 
+static inline bool lsm_is_composite(u32 magic)
+{
+	return magic == LOV_MAGIC_COMP_V1;
+}
+
 static inline size_t lov_comp_md_size(const struct lov_stripe_md *lsm)
 {
 	struct lov_stripe_md_entry *lsme;
@@ -110,8 +115,15 @@ static inline size_t lov_comp_md_size(const struct lov_stripe_md *lsm)
 
 	size = sizeof(struct lov_comp_md_v1);
 	for (entry = 0; entry < lsm->lsm_entry_count; entry++) {
+		u16 stripe_count;
+
 		lsme = lsm->lsm_entries[entry];
 
+		if (lsme_inited(lsme))
+			stripe_count = lsme->lsme_stripe_count;
+		else
+			stripe_count = 0;
+
 		size += sizeof(*lsme);
 		size += lov_mds_md_size(lsme->lsme_stripe_count,
 					lsme->lsme_magic);
-- 
1.8.3.1


* [lustre-devel] [PATCH 21/28] lustre: pfl: Read should not trigger layout write intent
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (19 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 20/28] lustre: lov: keep minimum LOVEA size James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 22/28] lustre: pfl: fix hang with grouplocks James Simmons
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Jinshan Xiong <jinshan.xiong@gmail.com>

In lov_io_rw_iter_init(), only a write, not a read, should
trigger a layout write intent.

An append write has to make sure all uninitialized components
are instantiated.

Page mkwrite should also trigger a write intent.
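
The choice of extent for the write intent can be sketched as below. This
is a stand-alone model under assumptions: the sk_* types and names are
hypothetical, SK_EOF merely stands in for OBD_OBJECT_EOF, and a 4 KiB page
size is assumed for the mkwrite case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_EOF		UINT64_MAX	/* stands in for OBD_OBJECT_EOF */
#define SK_PAGE_SHIFT	12		/* assumed 4 KiB pages */

enum sk_io_type { SK_WRITE, SK_TRUNC, SK_MKWRITE };

struct sk_io {
	enum sk_io_type	type;
	bool		append;		/* write only */
	uint64_t	pos, count;	/* write only */
	uint64_t	new_size;	/* truncate only */
	uint64_t	page_index;	/* mkwrite only */
};

struct sk_extent {
	uint64_t start, end;		/* [start, end) */
};

/* Default to the whole file; narrow the range when the IO allows it. */
struct sk_extent sk_intent_extent(const struct sk_io *io)
{
	struct sk_extent ext = { 0, SK_EOF };

	switch (io->type) {
	case SK_WRITE:
		/* An append write keeps [0, EOF): all components needed. */
		if (!io->append) {
			ext.start = io->pos;
			ext.end = io->pos + io->count;
		}
		break;
	case SK_TRUNC:
		ext.end = io->new_size;
		break;
	case SK_MKWRITE:
		ext.start = io->page_index << SK_PAGE_SHIFT;
		ext.end = (io->page_index + 1) << SK_PAGE_SHIFT;
		break;
	}
	return ext;
}
```

A read never reaches this computation in the sketch, which mirrors the
intent of the change: reads of uninstantiated components are served as
zero-filled pages instead of requesting instantiation.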

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Bobi Jam <bobijam@hotmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9008
Reviewed-on: https://review.whamcloud.com/26499
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/vvp_io.c | 20 ++++++++++++-----
 drivers/staging/lustre/lustre/lov/lov_io.c   | 33 +++++++++++++++++-----------
 2 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 5323fea..37f415f 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -323,18 +323,26 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 	 * RPC.
 	 */
 	if (io->ci_need_write_intent) {
+		loff_t end = OBD_OBJECT_EOF;
 		loff_t start = 0;
-		loff_t end = 0;
-
-		LASSERT(io->ci_type == CIT_WRITE || cl_io_is_trunc(io));
 
 		io->ci_need_write_intent = 0;
 
+		LASSERT(io->ci_type == CIT_WRITE ||
+			cl_io_is_trunc(io) || cl_io_is_mkwrite(io));
+
 		if (io->ci_type == CIT_WRITE) {
-			start = io->u.ci_rw.crw_pos;
-			end = io->u.ci_rw.crw_pos + io->u.ci_rw.crw_count;
-		} else {
+			if (!cl_io_is_append(io)) {
+				start = io->u.ci_rw.crw_pos;
+				end = start + io->u.ci_rw.crw_count;
+			}
+		} else if (cl_io_is_trunc(io)) {
 			end = io->u.ci_setattr.sa_attr.lvb_size;
+		} else { /* mkwrite */
+			pgoff_t index = io->u.ci_fault.ft_index;
+
+			start = cl_offset(io->ci_obj, index);
+			end = cl_offset(io->ci_obj, index + 1);
 		}
 
 		CDEBUG(D_VFSTRACE, DFID" type %d [%llx, %llx)\n",
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 8a1bb85..0d809b1 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -378,6 +378,7 @@ static int lov_io_iter_init(const struct lu_env *env,
 {
 	struct lov_io	*lio = cl2lov_io(env, ios);
 	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
+	struct cl_io *io = ios->cis_io;
 	struct lov_layout_entry *le;
 	struct lov_io_sub    *sub;
 	struct lu_extent ext;
@@ -394,15 +395,28 @@ static int lov_io_iter_init(const struct lu_env *env,
 		u64 start;
 		u64 end;
 
-		CDEBUG(D_VFSTRACE, "component[%d] flags %#x\n",
-		       index, lsm->lsm_entries[index]->lsme_flags);
-		if (!lsm_entry_inited(lsm, index))
-			break;
-
 		index++;
 		if (!lu_extent_is_overlapped(&ext, &le->lle_extent))
 			continue;
 
+		CDEBUG(D_VFSTRACE, "component[%d] flags %#x\n",
+		       index - 1, lsm->lsm_entries[index - 1]->lsme_flags);
+		if (!lsm_entry_inited(lsm, index - 1)) {
+			/* truncate IO will trigger write intent as well, and
+			 * it's handled in lov_io_setattr_iter_init()
+			 */
+			if (io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io)) {
+				io->ci_need_write_intent = 1;
+				rc = -ENODATA;
+				break;
+			}
+
+			/* Read from uninitialized components should return
+			 * zero filled pages.
+			 */
+			continue;
+		}
+
 		for (stripe = 0; stripe < r0->lo_nr; stripe++) {
 			if (!lov_stripe_intersects(lsm, index - 1, stripe,
 						   &ext, &start, &end))
@@ -498,13 +512,6 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	       start, lio->lis_pos, lio->lis_endpos,
 	       lio->lis_io_endpos);
 
-	index = lov_lsm_entry(lsm, lio->lis_endpos - 1);
-	if (index > 0 && !lsm_entry_inited(lsm, index)) {
-		io->ci_need_write_intent = 1;
-		io->ci_result = -ENODATA;
-		return io->ci_result;
-	}
-
 	/*
 	 * XXX The following call should be optimized: we know, that
 	 * [lio->lis_pos, lio->lis_endpos) intersects with exactly one stripe.
@@ -520,7 +527,7 @@ static int lov_io_setattr_iter_init(const struct lu_env *env,
 	struct lov_stripe_md *lsm = lio->lis_object->lo_lsm;
 	int index;
 
-	if (cl_io_is_trunc(io) && lio->lis_pos) {
+	if (cl_io_is_trunc(io) && lio->lis_pos > 0) {
 		index = lov_lsm_entry(lsm, lio->lis_pos - 1);
 		if (index > 0 && !lsm_entry_inited(lsm, index)) {
 			io->ci_need_write_intent = 1;
-- 
1.8.3.1


* [lustre-devel] [PATCH 22/28] lustre: pfl: fix hang with grouplocks
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (20 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 21/28] lustre: pfl: Read should not trigger layout write intent James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 23/28] lustre: pfl: fix ost pool op->size handling James Simmons
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

This is a makeshift fix. When we hold a group lock on a file,
there should be no data written to the file, since during the
write IO the file's layout could change, and the write IO
would then try to update the layout, which could be blocked
by itself.

Signed-off-by: Bobi Jam <bobijam@hotmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9344
Reviewed-on: https://review.whamcloud.com/26646
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h |  2 ++
 drivers/staging/lustre/lustre/llite/file.c        | 26 +++++++++++++++++++++++
 drivers/staging/lustre/lustre/lov/lov_object.c    |  1 +
 3 files changed, 29 insertions(+)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 57ced0f..ee71f1c 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -288,6 +288,8 @@ struct cl_layout {
 	size_t		cl_size;
 	/** Layout generation. */
 	u32		cl_layout_gen;
+	/** whether layout is a composite one */
+	bool		cl_is_composite;
 };
 
 /**
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 009e9e8..5f6695f 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1551,6 +1551,7 @@ static int ll_lov_setstripe(struct inode *inode, struct file *file,
 {
 	struct ll_inode_info   *lli = ll_i2info(inode);
 	struct ll_file_data    *fd = LUSTRE_FPRIVATE(file);
+	struct cl_object *obj = lli->lli_clob;
 	struct ll_grouplock    grouplock;
 	int		     rc;
 
@@ -1572,6 +1573,31 @@ static int ll_lov_setstripe(struct inode *inode, struct file *file,
 	LASSERT(!fd->fd_grouplock.lg_lock);
 	spin_unlock(&lli->lli_lock);
 
+	/**
+	 * XXX: group lock needs to protect all OST objects while PFL
+	 * can add new OST objects during the IO, so we'd instantiate
+	 * all OST objects before getting its group lock.
+	 */
+	if (obj) {
+		struct cl_layout cl = {
+			.cl_is_composite = false,
+		};
+		struct lu_env *env;
+		u16 refcheck;
+
+		env = cl_env_get(&refcheck);
+		if (IS_ERR(env))
+			return PTR_ERR(env);
+
+		rc = cl_object_layout_get(env, obj, &cl);
+		if (!rc && cl.cl_is_composite)
+			rc = ll_layout_write_intent(inode, 0, OBD_OBJECT_EOF);
+
+		cl_env_put(env, &refcheck);
+		if (rc)
+			return rc;
+	}
+
 	rc = cl_get_grouplock(ll_i2info(inode)->lli_clob,
 			      arg, (file->f_flags & O_NONBLOCK), &grouplock);
 	if (rc)
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 680d232..e2eca01 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -1637,6 +1637,7 @@ static int lov_object_layout_get(const struct lu_env *env,
 
 	cl->cl_size = lov_comp_md_size(lsm);
 	cl->cl_layout_gen = lsm->lsm_layout_gen;
+	cl->cl_is_composite = lsm_is_composite(lsm->lsm_magic);
 
 	rc = lov_lsm_pack(lsm, buf->lb_buf, buf->lb_len);
 	lov_lsm_put(lsm);
-- 
1.8.3.1


* [lustre-devel] [PATCH 23/28] lustre: pfl: fix ost pool op->size handling
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (21 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 22/28] lustre: pfl: fix hang with grouplocks James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 24/28] lustre: lov: readahead shouldn't exceed component boundary James Simmons
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

This patch fixes a misunderstanding of ost_pool::op_size: it holds the
allocated buffer size in bytes, not the number of array elements.

Signed-off-by: Bobi Jam <bobijam@hotmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9359
Reviewed-on: https://review.whamcloud.com/26706
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_internal.h |  1 -
 drivers/staging/lustre/lustre/lov/lov_io.c       |  3 ++-
 drivers/staging/lustre/lustre/lov/lov_pool.c     | 20 +++++++++++---------
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index dd4dd24..3878cad 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -195,7 +195,6 @@ struct lsm_operations {
 })
 #endif
 
-#define pool_tgt_size(p)	((p)->pool_obds.op_size)
 #define pool_tgt_count(p)	((p)->pool_obds.op_count)
 #define pool_tgt_array(p)	((p)->pool_obds.op_array)
 #define pool_tgt_rw_sem(p)	((p)->pool_obds.op_rw_sem)
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 0d809b1..ec0d14f 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -100,7 +100,8 @@ static int lov_io_sub_init(const struct lu_env *env, struct lov_io *lio,
 
 	LASSERT(!sub->sub_env);
 
-	if (unlikely(!lov_r0(lov, index)->lo_sub[stripe]))
+	if (unlikely(!lov_r0(lov, index)->lo_sub ||
+		     !lov_r0(lov, index)->lo_sub[stripe]))
 		return -EIO;
 
 	/* obtain new environment */
diff --git a/drivers/staging/lustre/lustre/lov/lov_pool.c b/drivers/staging/lustre/lustre/lov/lov_pool.c
index c79c2ae..b90fb1c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pool.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pool.c
@@ -238,8 +238,9 @@ int lov_ost_pool_init(struct ost_pool *op, unsigned int count)
 	op->op_array = NULL;
 	op->op_count = 0;
 	init_rwsem(&op->op_rw_sem);
-	op->op_size = count;
-	op->op_array = kcalloc(op->op_size, sizeof(op->op_array[0]), GFP_NOFS);
+	op->op_size = count * sizeof(op->op_array[0]);
+	op->op_array = kcalloc(count, sizeof(op->op_array[0]),
+			       GFP_KERNEL);
 	if (!op->op_array) {
 		op->op_size = 0;
 		return -ENOMEM;
@@ -250,24 +251,25 @@ int lov_ost_pool_init(struct ost_pool *op, unsigned int count)
 /* Caller must hold write op_rwlock */
 int lov_ost_pool_extend(struct ost_pool *op, unsigned int min_count)
 {
-	__u32 *new;
-	int new_size;
+	int new_count;
+	u32 *new;
 
 	LASSERT(min_count != 0);
 
-	if (op->op_count < op->op_size)
+	if (op->op_count * sizeof(op->op_array[0]) < op->op_size)
 		return 0;
 
-	new_size = max(min_count, 2 * op->op_size);
-	new = kcalloc(new_size, sizeof(op->op_array[0]), GFP_NOFS);
+	new_count = max_t(u32, min_count,
+			  2 * op->op_size / sizeof(op->op_array[0]));
+	new = kcalloc(new_count, sizeof(op->op_array[0]), GFP_KERNEL);
 	if (!new)
 		return -ENOMEM;
 
 	/* copy old array to new one */
-	memcpy(new, op->op_array, op->op_size * sizeof(op->op_array[0]));
+	memcpy(new, op->op_array, op->op_size);
 	kfree(op->op_array);
 	op->op_array = new;
-	op->op_size = new_size;
+	op->op_size = new_count * sizeof(op->op_array[0]);
 	return 0;
 }
 
-- 
1.8.3.1
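
To illustrate the bytes-versus-count distinction that patch 23/28
addresses, here is a minimal userspace sketch. It is not the Lustre code:
struct pool, pool_init() and pool_extend() are hypothetical stand-ins for
struct ost_pool and its helpers, and calloc()/free() replace
kcalloc()/kfree(). The point is only that the size field is kept in bytes,
so it is divided by the element size when a count is needed and passed
unchanged to memcpy().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct ost_pool. */
struct pool {
	uint32_t *array;	/* element storage */
	unsigned int count;	/* number of elements in use */
	size_t size;		/* allocated buffer size in BYTES */
};

static int pool_init(struct pool *p, unsigned int count)
{
	p->count = 0;
	p->size = count * sizeof(p->array[0]);
	p->array = calloc(count, sizeof(p->array[0]));
	if (!p->array) {
		p->size = 0;
		return -1;
	}
	return 0;
}

/* Grow the array only when every allocated slot is already in use. */
static int pool_extend(struct pool *p, unsigned int min_count)
{
	size_t capacity = p->size / sizeof(p->array[0]);
	size_t new_count;
	uint32_t *new_array;

	assert(min_count != 0);

	if (p->count < capacity)	/* still room, nothing to do */
		return 0;

	new_count = 2 * capacity > min_count ? 2 * capacity : min_count;
	new_array = calloc(new_count, sizeof(p->array[0]));
	if (!new_array)
		return -1;

	/* p->size is already a byte count, so no multiplication here */
	memcpy(new_array, p->array, p->size);
	free(p->array);
	p->array = new_array;
	p->size = new_count * sizeof(p->array[0]);
	return 0;
}
```

Treating the size field as an element count in one place and as a byte
count in another would make the "is there room?" test and the copy length
disagree, which is the kind of mismatch the patch description refers to.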

* [lustre-devel] [PATCH 24/28] lustre: lov: readahead shouldn't exceed component boundary
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (22 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 23/28] lustre: pfl: fix ost pool op->size handling James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:29 ` [lustre-devel] [PATCH 25/28] lustre: uapi: support negative flags James Simmons
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Jinshan Xiong <jinshan.xiong@gmail.com>

Otherwise, it will extend the readahead RPC to the next component
while the actual lock of that component is not checked.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9340
Reviewed-on: https://review.whamcloud.com/26677
Reviewed-on: https://review.whamcloud.com/26861
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_io.c   | 23 ++++++++++++++++-------
 drivers/staging/lustre/lustre/lov/lov_page.c |  4 +++-
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index ec0d14f..9a3352f 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -684,28 +684,37 @@ static int lov_io_read_ahead(const struct lu_env *env,
 		return rc;
 
 	/**
-	 * Adjust the stripe index by layout of raid0. ra->cra_end is
+	 * Adjust the stripe index by layout of comp. ra->cra_end is
 	 * the maximum page index covered by an underlying DLM lock.
 	 * This function converts cra_end from stripe level to file
-	 * level, and make sure it's not beyond stripe boundary.
+	 * level, and make sure it's not beyond stripe and component
+	 * boundary.
 	 */
-	if (r0->lo_nr == 1)	/* single stripe file */
-		return 0;
 
 	/* cra_end is stripe level, convert it into file level */
 	ra_end = ra->cra_end;
 	if (ra_end != CL_PAGE_EOF)
-		ra_end = lov_stripe_pgoff(loo->lo_lsm, index, ra_end, stripe);
+		ra->cra_end = lov_stripe_pgoff(loo->lo_lsm, index,
+					       ra_end, stripe);
+
+	/* boundary of current component */
+	ra_end = cl_index(obj, (loff_t)lov_lse(loo, index)->lsme_extent.e_end);
+	if (ra_end != CL_PAGE_EOF && ra->cra_end >= ra_end)
+		ra->cra_end = ra_end - 1;
+
+	if (r0->lo_nr == 1) /* single stripe file */
+		return 0;
 
 	pps = lov_lse(loo, index)->lsme_stripe_size >> PAGE_SHIFT;
 
 	CDEBUG(D_READA,
 	       DFID " max_index = %lu, pps = %u, index = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
-	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps, index,
+	       PFID(lu_object_fid(lov2lu(loo))), ra->cra_end, pps, index,
 	       lov_lse(loo, index)->lsme_stripe_size, stripe, start);
 
 	/* never exceed the end of the stripe */
-	ra->cra_end = min_t(pgoff_t, ra_end, start + pps - start % pps - 1);
+	ra->cra_end = min_t(pgoff_t,
+			    ra->cra_end, start + pps - start % pps - 1);
 	return 0;
 }
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index 8b68d3c..90e2981 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -56,7 +56,9 @@ static int lov_comp_page_print(const struct lu_env *env,
 {
 	struct lov_page *lp = cl2lov_page(slice);
 
-	return (*printer)(env, cookie, LUSTRE_LOV_NAME "-page@%p, raid0\n", lp);
+	return (*printer)(env, cookie,
+			  LUSTRE_LOV_NAME "-page@%p, comp index: %x\n",
+			  lp, lp->lps_index);
 }
 
 static const struct cl_page_operations lov_comp_page_ops = {
-- 
1.8.3.1
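
The clamping that patch 24/28 performs on the readahead end can be
sketched in isolation. This is a simplified userspace model, not the
kernel function: clamp_ra_end() and PAGE_EOF are hypothetical names,
comp_end stands for the page index of the component's extent end, and pps
is the number of pages per stripe. It assumes comp_end is never 0.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_EOF UINT64_MAX	/* stand-in for CL_PAGE_EOF */

/*
 * ra_end:       file-level readahead end (page index)
 * comp_end:     first page index past the component, or PAGE_EOF
 * start:        page index where readahead starts
 * pps:          pages per stripe
 * stripe_count: number of stripes in the component
 */
static uint64_t clamp_ra_end(uint64_t ra_end, uint64_t comp_end,
			     uint64_t start, uint64_t pps,
			     unsigned int stripe_count)
{
	uint64_t stripe_last;

	/* never read ahead past the boundary of the current component */
	if (comp_end != PAGE_EOF && ra_end >= comp_end)
		ra_end = comp_end - 1;

	/* a single-stripe component has no stripe boundary to respect */
	if (stripe_count == 1)
		return ra_end;

	/* never exceed the last page of the stripe that contains start */
	assert(pps != 0);
	stripe_last = start + pps - start % pps - 1;
	return ra_end < stripe_last ? ra_end : stripe_last;
}
```

The ordering mirrors the patch: the component clamp is applied before the
single-stripe early return, so single-stripe components are also kept
inside their extent.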

* [lustre-devel] [PATCH 25/28] lustre: uapi: support negative flags
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (23 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 24/28] lustre: lov: readahead shouldn't exceed component boundary James Simmons
@ 2018-12-17 16:29 ` James Simmons
  2018-12-17 16:30 ` [lustre-devel] [PATCH 26/28] lustre: llite: return v1/v3 layout for legacy app James Simmons
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:29 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

Add LCME_FL_NEG so that 'flags' can express a negative (negated) flag.
The LCME_FL_NEG bit is not stored on disk.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/26490
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
index b19adef..833a57a 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
@@ -413,9 +413,12 @@ enum lov_comp_md_entry_flags {
 	LCME_FL_OFFLINE		= 0x00000004,   /* Not used */
 	LCME_FL_PREFERRED	= 0x00000008,	/* Not used */
 	LCME_FL_INIT		= 0x00000010,	/* instantiated */
+	LCME_FL_NEG		= 0x80000000,	/* used to indicate a negative
+						 * flag, won't be stored on disk
+						 */
 };
 
-#define LCME_KNOWN_FLAGS	LCME_FL_INIT
+#define LCME_KNOWN_FLAGS	(LCME_FL_NEG | LCME_FL_INIT)
 
 /* lcme_id can be specified as certain flags, and the the first
  * bit of lcme_id is used to indicate that the ID is representing
@@ -426,7 +429,7 @@ enum lcme_id {
 	LCME_ID_INVAL	= 0x0,
 	LCME_ID_MAX	= 0x7FFFFFFF,
 	LCME_ID_ALL	= 0xFFFFFFFF,
-	LCME_ID_NONE	= 0x80000000
+	LCME_ID_NOT_ID	= LCME_FL_NEG
 };
 
 #define LCME_ID_MASK	LCME_ID_MAX
-- 
1.8.3.1
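
One way a negative flag could be consumed is sketched below. The patch
only defines the LCME_FL_NEG bit; the matching semantics shown here
("none of the named flags is set on the component") are an assumption
made for illustration, and comp_flags_match() is a hypothetical helper
that does not appear in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LCME_FL_INIT	0x00000010u	/* value taken from the patch */
#define LCME_FL_NEG	0x80000000u	/* value taken from the patch */

/*
 * Hypothetical matcher: "wanted" is a set of LCME_FL_* bits, optionally
 * combined with LCME_FL_NEG to ask for components that do NOT carry
 * those bits.
 */
static bool comp_flags_match(uint32_t comp_flags, uint32_t wanted)
{
	uint32_t mask = wanted & ~LCME_FL_NEG;

	if (wanted & LCME_FL_NEG)
		return (comp_flags & mask) == 0;	/* negated match */

	return (comp_flags & mask) == mask;		/* plain match */
}
```

Under that assumption, (LCME_FL_NEG | LCME_FL_INIT) would select
components that have not been instantiated yet.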

* [lustre-devel] [PATCH 26/28] lustre: llite: return v1/v3 layout for legacy app
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (24 preceding siblings ...)
  2018-12-17 16:29 ` [lustre-devel] [PATCH 25/28] lustre: uapi: support negative flags James Simmons
@ 2018-12-17 16:30 ` James Simmons
  2018-12-17 16:30 ` [lustre-devel] [PATCH 27/28] lustre: llite: restore ll_file_getstripe in ll_lov_setstripe James Simmons
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:30 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

Legacy applications such as ADIO fetch the LOVEA via ioctl
LL_IOC_LOV_GETSTRIPE and blindly treat the file layout as v1/v3, so
return a reasonable v1/v3 layout in this case.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9490
Reviewed-on: https://review.whamcloud.com/27183
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_object.c |  2 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c   | 72 +++++++++++++++++++++++---
 2 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index e2eca01..ca18c1e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -1615,7 +1615,7 @@ static int lov_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 	if (!lsm)
 		return -ENODATA;
 
-	rc = lov_getstripe(cl2lov(obj), lsm, lum);
+	rc = lov_getstripe(env, cl2lov(obj), lsm, lum);
 	lov_lsm_put(lsm);
 	return rc;
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 32e4b33..10be119 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -315,12 +315,14 @@ struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, void *buf,
  * the maximum number of OST indices which will fit in the user buffer.
  * lmm_magic must be LOV_USER_MAGIC.
  */
-int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
-		  struct lov_user_md __user *lump)
+int lov_getstripe(const struct lu_env *env, struct lov_object *obj,
+		  struct lov_stripe_md *lsm, struct lov_user_md __user *lump)
 {
 	/* we use lov_user_md_v3 because it is larger than lov_user_md_v1 */
 	struct lov_mds_md *lmmk;
-	ssize_t lmm_size;
+	struct lov_user_md_v1 lum;
+	ssize_t lmm_size, lum_size = 0;
+	static bool printed;
 	size_t lmmk_size;
 	int rc = 0;
 
@@ -332,6 +334,13 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		goto out;
 	}
 
+	if (!printed) {
+		LCONSOLE_WARN("%s: using old ioctl(LL_IOC_LOV_GETSTRIPE) on " DFID ", use llapi_layout_get_by_path()\n",
+			      current->comm,
+			      PFID(&obj->lo_cl.co_lu.lo_header->loh_fid));
+		printed = true;
+	}
+
 	lmmk_size = lov_comp_md_size(lsm);
 	lmmk = kvzalloc(lmmk_size, GFP_KERNEL);
 	if (!lmmk) {
@@ -357,10 +366,61 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		}
 	}
 
-	if (copy_to_user(lump, lmmk, lmmk_size))
+	/* Legacy application passes limited buffer, we need to figure out
+	 * the user buffer size by the passed in lmm_stripe_count.
+	 */
+	if (copy_from_user(&lum, lump, sizeof(struct lov_user_md_v1))) {
 		rc = -EFAULT;
-	else
-		rc = 0;
+		goto out_free;
+	}
+
+	if (lum.lmm_magic == LOV_USER_MAGIC_V1 ||
+	    lum.lmm_magic == LOV_USER_MAGIC_V3)
+		lum_size = lov_user_md_size(lum.lmm_stripe_count,
+					    lum.lmm_magic);
+
+	if (lum_size != 0) {
+		struct lov_mds_md *comp_md = lmmk;
+
+		/* Legacy app (ADIO for instance) treats the layout as V1/V3
+		 * blindly, we'd return a reasonable V1/V3 for them.
+		 */
+		if (lmmk->lmm_magic == LOV_MAGIC_COMP_V1) {
+			struct lov_comp_md_v1 *comp_v1;
+			struct cl_object *cl_obj;
+			struct cl_attr attr;
+			int i;
+
+			attr.cat_size = 0;
+			cl_obj = cl_object_top(&obj->lo_cl);
+			cl_object_attr_get(env, cl_obj, &attr);
+
+			/* return the last instantiated component if file size
+			 * is non-zero, otherwise, return the last component.
+			 */
+			comp_v1 = (struct lov_comp_md_v1 *)lmmk;
+			i = attr.cat_size == 0 ? comp_v1->lcm_entry_count : 0;
+			for (; i < comp_v1->lcm_entry_count; i++) {
+				if (!(comp_v1->lcm_entries[i].lcme_flags &
+				    LCME_FL_INIT))
+					break;
+			}
+			if (i > 0)
+				i--;
+			comp_md = (struct lov_mds_md *)((char *)comp_v1 +
+					comp_v1->lcm_entries[i].lcme_offset);
+		}
+		if (copy_to_user(lump, comp_md, lum_size)) {
+			rc = -EFAULT;
+			goto out_free;
+		}
+	} else {
+		if (copy_to_user(lump, lmmk, lmmk_size)) {
+			rc = -EFAULT;
+			goto out_free;
+		}
+	}
+
 out_free:
 	kvfree(lmmk);
 out:
-- 
1.8.3.1
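
The component selection in patch 26/28 can be modelled on its own. The
sketch below is a userspace simplification: pick_legacy_component() is a
hypothetical name, and the composite layout is reduced to an array of
per-component flag words. It follows the loop in the patch: return the
last instantiated component when the file size is non-zero, otherwise the
last component.

```c
#include <assert.h>
#include <stdint.h>

#define LCME_FL_INIT	0x00000010u	/* component is instantiated */

/*
 * flags:     lcme_flags of each component, in layout order
 * count:     number of components (assumed > 0)
 * file_size: current file size in bytes
 *
 * Returns the index of the component whose v1/v3 blob would be handed
 * to a legacy caller.
 */
static int pick_legacy_component(const uint32_t *flags, int count,
				 uint64_t file_size)
{
	/* an empty file skips the scan and ends up on the last entry */
	int i = file_size == 0 ? count : 0;

	assert(count > 0);

	/* stop at the first component that is not instantiated */
	for (; i < count; i++) {
		if (!(flags[i] & LCME_FL_INIT))
			break;
	}

	/* step back to the last instantiated (or simply last) entry */
	if (i > 0)
		i--;
	return i;
}
```

If no component is instantiated and the file is non-empty, the scan stops
at index 0 and the first component is returned.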

* [lustre-devel] [PATCH 27/28] lustre: llite: restore ll_file_getstripe in ll_lov_setstripe
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (25 preceding siblings ...)
  2018-12-17 16:30 ` [lustre-devel] [PATCH 26/28] lustre: llite: return v1/v3 layout for legacy app James Simmons
@ 2018-12-17 16:30 ` James Simmons
  2018-12-17 16:30 ` [lustre-devel] [PATCH 28/28] lustre: lov: do not split IO for single striped file James Simmons
  2018-12-18  6:21 ` [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client NeilBrown
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:30 UTC (permalink / raw)
  To: lustre-devel

From: Bobi Jam <bobijam@hotmail.com>

Commit fafe6b4d4a6fa63cedff3bd44e6578009578b3d7 got rid of
the call to ll_file_getstripe in ll_lov_setstripe.

Add a @size parameter to the series of xxx_getstripe interfaces,
indicating the maximum buffer size that the user provides to hold the
stripe information. It is mainly for ll_lov_setstripe, which
will call ll_file_getstripe to fetch basic stripe information.

Add the LL_IOC_LOV_SETSTRIPE_NEW/LL_IOC_LOV_GETSTRIPE_NEW ioctls,
which define the interface correctly and can be used in later
Lustre versions.

Signed-off-by: Bobi Jam <bobijam@hotmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9367
Reviewed-on: https://review.whamcloud.com/26915
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/uapi/linux/lustre/lustre_user.h |  2 ++
 drivers/staging/lustre/lustre/include/cl_object.h  |  4 +--
 drivers/staging/lustre/lustre/llite/dir.c          |  5 ++-
 drivers/staging/lustre/lustre/llite/file.c         | 36 +++++++++++++++-------
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |  5 +--
 drivers/staging/lustre/lustre/lov/lov_object.c     |  4 +--
 drivers/staging/lustre/lustre/lov/lov_pack.c       | 33 ++++++++++++++------
 drivers/staging/lustre/lustre/obdclass/cl_object.c |  5 +--
 8 files changed, 64 insertions(+), 30 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
index 833a57a..e9451a5 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_user.h
@@ -227,7 +227,9 @@ struct ost_id {
 #define LL_IOC_SETFLAGS		 _IOW('f', 152, long)
 #define LL_IOC_CLRFLAGS		 _IOW('f', 153, long)
 #define LL_IOC_LOV_SETSTRIPE	    _IOW('f', 154, long)
+#define LL_IOC_LOV_SETSTRIPE_NEW	_IOWR('f', 154, struct lov_user_md)
 #define LL_IOC_LOV_GETSTRIPE	    _IOW('f', 155, long)
+#define LL_IOC_LOV_GETSTRIPE_NEW	_IOR('f', 155, struct lov_user_md)
 #define LL_IOC_LOV_SETEA		_IOW('f', 156, long)
 /*	LL_IOC_RECREATE_OBJ		157 obsolete */
 /*	LL_IOC_RECREATE_FID		158 obsolete */
diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index ee71f1c..4f0e8e2 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -390,7 +390,7 @@ struct cl_object_operations {
 	 * Object getstripe method.
 	 */
 	int (*coo_getstripe)(const struct lu_env *env, struct cl_object *obj,
-			     struct lov_user_md __user *lum);
+			     struct lov_user_md __user *lum, size_t size);
 	/**
 	 * Get FIEMAP mapping from the object.
 	 */
@@ -2057,7 +2057,7 @@ int  cl_conf_set(const struct lu_env *env, struct cl_object *obj,
 int cl_object_prune(const struct lu_env *env, struct cl_object *obj);
 void cl_object_kill(const struct lu_env *env, struct cl_object *obj);
 int  cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
-			 struct lov_user_md __user *lum);
+			 struct lov_user_md __user *lum, size_t size);
 int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 		     struct ll_fiemap_info_key *fmkey, struct fiemap *fiemap,
 		     size_t *buflen);
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 8fbce96..25cd42ee 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1224,6 +1224,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 
 		return rc;
 	}
+	case LL_IOC_LOV_SETSTRIPE_NEW:
 	case LL_IOC_LOV_SETSTRIPE: {
 		struct lov_user_md_v3 lumv3;
 		struct lov_user_md_v1 *lumv1 = (struct lov_user_md_v1 *)&lumv3;
@@ -1362,6 +1363,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case IOC_OBD_STATFS:
 		return ll_obd_statfs(inode, (void __user *)arg);
 	case LL_IOC_LOV_GETSTRIPE:
+	case LL_IOC_LOV_GETSTRIPE_NEW:
 	case LL_IOC_MDC_GETINFO:
 	case IOC_MDC_GETFILEINFO:
 	case IOC_MDC_GETFILESTRIPE: {
@@ -1404,7 +1406,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		}
 
 		if (cmd == IOC_MDC_GETFILESTRIPE ||
-		    cmd == LL_IOC_LOV_GETSTRIPE) {
+		    cmd == LL_IOC_LOV_GETSTRIPE ||
+		    cmd == LL_IOC_LOV_GETSTRIPE_NEW) {
 			lump = (struct lov_user_md __user *)arg;
 		} else {
 			struct lov_user_mds_data __user *lmdp;
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 5f6695f..1980c79 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1481,7 +1481,7 @@ int ll_lov_getstripe_ea_info(struct inode *inode, const char *filename,
 }
 
 static int ll_lov_setea(struct inode *inode, struct file *file,
-			unsigned long arg)
+			void __user *arg)
 {
 	__u64			 flags = MDS_OPEN_HAS_OBJS | FMODE_WRITE;
 	struct lov_user_md	*lump;
@@ -1496,7 +1496,7 @@ static int ll_lov_setea(struct inode *inode, struct file *file,
 	if (!lump)
 		return -ENOMEM;
 
-	if (copy_from_user(lump, (struct lov_user_md __user *)arg, lum_size)) {
+	if (copy_from_user(lump, arg, lum_size)) {
 		kvfree(lump);
 		return -EFAULT;
 	}
@@ -1509,8 +1509,7 @@ static int ll_lov_setea(struct inode *inode, struct file *file,
 	return rc;
 }
 
-static int ll_file_getstripe(struct inode *inode,
-			     struct lov_user_md __user *lum)
+static int ll_file_getstripe(struct inode *inode, void __user *lum, size_t size)
 {
 	struct lu_env *env;
 	u16 refcheck;
@@ -1520,13 +1519,13 @@ static int ll_file_getstripe(struct inode *inode,
 	if (IS_ERR(env))
 		return PTR_ERR(env);
 
-	rc = cl_object_getstripe(env, ll_i2info(inode)->lli_clob, lum);
+	rc = cl_object_getstripe(env, ll_i2info(inode)->lli_clob, lum, size);
 	cl_env_put(env, &refcheck);
 	return rc;
 }
 
 static int ll_lov_setstripe(struct inode *inode, struct file *file,
-			    unsigned long arg)
+			    void __user *arg)
 {
 	struct lov_user_md __user *lum = (struct lov_user_md __user *)arg;
 	struct lov_user_md *klum;
@@ -1540,8 +1539,22 @@ static int ll_lov_setstripe(struct inode *inode, struct file *file,
 	lum_size = rc;
 	rc = ll_lov_setstripe_ea_info(inode, file->f_path.dentry, flags, klum,
 				      lum_size);
-	cl_lov_delay_create_clear(&file->f_flags);
+	if (!rc) {
+		u32 gen;
+
+		rc = put_user(0, &lum->lmm_stripe_count);
+		if (rc)
+			goto out;
 
+		rc = ll_layout_refresh(inode, &gen);
+		if (rc)
+			goto out;
+
+		rc = ll_file_getstripe(inode, arg, lum_size);
+	}
+
+	cl_lov_delay_create_clear(&file->f_flags);
+out:
 	kfree(klum);
 	return rc;
 }
@@ -2293,9 +2306,10 @@ int ll_ioctl_fssetxattr(struct inode *inode, unsigned int cmd,
 		}
 		return 0;
 	case LL_IOC_LOV_SETSTRIPE:
-		return ll_lov_setstripe(inode, file, arg);
+	case LL_IOC_LOV_SETSTRIPE_NEW:
+		return ll_lov_setstripe(inode, file, (void __user *) arg);
 	case LL_IOC_LOV_SETEA:
-		return ll_lov_setea(inode, file, arg);
+		return ll_lov_setea(inode, file, (void __user *) arg);
 	case LL_IOC_LOV_SWAP_LAYOUTS: {
 		struct file *file2;
 		struct lustre_swap_layouts lsl;
@@ -2348,8 +2362,8 @@ int ll_ioctl_fssetxattr(struct inode *inode, unsigned int cmd,
 		return rc;
 	}
 	case LL_IOC_LOV_GETSTRIPE:
-		return ll_file_getstripe(inode,
-					 (struct lov_user_md __user *)arg);
+	case LL_IOC_LOV_GETSTRIPE_NEW:
+		return ll_file_getstripe(inode, (void __user *)arg, 0);
 	case FSFILT_IOC_GETFLAGS:
 	case FSFILT_IOC_SETFLAGS:
 		return ll_iocontrol(inode, file, cmd, arg);
diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index 96e6636..5d4c83b 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -651,8 +651,9 @@ static inline struct lov_stripe_md_entry *lov_lse(struct lov_object *lov, int i)
 }
 
 /* lov_pack.c */
-int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
-		  struct lov_user_md __user *lump);
+int lov_getstripe(const struct lu_env *env, struct lov_object *obj,
+		  struct lov_stripe_md *lsm, struct lov_user_md __user *lump,
+		  size_t size);
 
 /** @} lov */
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index ca18c1e..4696f7c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -1605,7 +1605,7 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 }
 
 static int lov_object_getstripe(const struct lu_env *env, struct cl_object *obj,
-				struct lov_user_md __user *lum)
+				struct lov_user_md __user *lum, size_t size)
 {
 	struct lov_object *lov = cl2lov(obj);
 	struct lov_stripe_md *lsm;
@@ -1615,7 +1615,7 @@ static int lov_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 	if (!lsm)
 		return -ENODATA;
 
-	rc = lov_getstripe(env, cl2lov(obj), lsm, lum);
+	rc = lov_getstripe(env, cl2lov(obj), lsm, lum, size);
 	lov_lsm_put(lsm);
 	return rc;
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 10be119..ef3c040 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -314,12 +314,16 @@ struct lov_stripe_md *lov_unpackmd(struct lov_obd *lov, void *buf,
  * @lump is a pointer to an in-core struct with lmm_ost_count indicating
  * the maximum number of OST indices which will fit in the user buffer.
  * lmm_magic must be LOV_USER_MAGIC.
+ *
+ * If @size > 0, User specified limited buffer size, usually the buffer is from
+ * ll_lov_setstripe(), and the buffer can only hold basic layout template info.
  */
 int lov_getstripe(const struct lu_env *env, struct lov_object *obj,
-		  struct lov_stripe_md *lsm, struct lov_user_md __user *lump)
+		  struct lov_stripe_md *lsm, struct lov_user_md __user *lump,
+		  size_t size)
 {
 	/* we use lov_user_md_v3 because it is larger than lov_user_md_v1 */
-	struct lov_mds_md *lmmk;
+	struct lov_mds_md *lmmk, *lmm;
 	struct lov_user_md_v1 lum;
 	ssize_t lmm_size, lum_size = 0;
 	static bool printed;
@@ -410,15 +414,24 @@ int lov_getstripe(const struct lu_env *env, struct lov_object *obj,
 			comp_md = (struct lov_mds_md *)((char *)comp_v1 +
 					comp_v1->lcm_entries[i].lcme_offset);
 		}
-		if (copy_to_user(lump, comp_md, lum_size)) {
-			rc = -EFAULT;
-			goto out_free;
-		}
+
+		lmm = comp_md;
+		lmm_size = lum_size;
 	} else {
-		if (copy_to_user(lump, lmmk, lmmk_size)) {
-			rc = -EFAULT;
-			goto out_free;
-		}
+		lmm = lmmk;
+		lmm_size = lmmk_size;
+	}
+	/**
+	 * User specified limited buffer size, usually the buffer is
+	 * from ll_lov_setstripe(), and the buffer can only hold basic
+	 * layout template info.
+	 */
+	if (size == 0 || size > lmm_size)
+		size = lmm_size;
+
+	if (copy_to_user(lump, lmm, size)) {
+		rc = -EFAULT;
+		goto out_free;
 	}
 
 out_free:
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index 09fc7e7..b2bf570 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -323,7 +323,7 @@ int cl_object_prune(const struct lu_env *env, struct cl_object *obj)
  * Get stripe information of this object.
  */
 int cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
-			struct lov_user_md __user *uarg)
+			struct lov_user_md __user *uarg, size_t size)
 {
 	struct lu_object_header *top;
 	int result = 0;
@@ -331,7 +331,8 @@ int cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 	top = obj->co_lu.lo_header;
 	list_for_each_entry(obj, &top->loh_layers, co_lu.lo_linkage) {
 		if (obj->co_ops->coo_getstripe) {
-			result = obj->co_ops->coo_getstripe(env, obj, uarg);
+			result = obj->co_ops->coo_getstripe(env, obj, uarg,
+							    size);
 			if (result)
 				break;
 		}
-- 
1.8.3.1
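
The effect of the new @size argument in patch 27/28 is a bounded copy.
A minimal userspace sketch follows; copy_layout() is a hypothetical
helper and memcpy() stands in for copy_to_user(). As in the patch, a size
of 0 means the caller gave no limit, and a limit larger than the layout
is reduced to the layout size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most user_size bytes of the layout into dst.
 * user_size == 0 means "no explicit limit": the whole layout is copied
 * and dst is assumed to be large enough.
 * Returns the number of bytes copied.
 */
static size_t copy_layout(void *dst, size_t user_size,
			  const void *layout, size_t layout_size)
{
	size_t n = user_size;

	assert(dst && layout);

	if (n == 0 || n > layout_size)
		n = layout_size;

	memcpy(dst, layout, n);
	return n;
}
```

With a limit in place, a caller such as ll_lov_setstripe() receives only
the leading part of the layout, which is why the patch description calls
it "basic stripe information".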

* [lustre-devel] [PATCH 28/28] lustre: lov: do not split IO for single striped file
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (26 preceding siblings ...)
  2018-12-17 16:30 ` [lustre-devel] [PATCH 27/28] lustre: llite: restore ll_file_getstripe in ll_lov_setstripe James Simmons
@ 2018-12-17 16:30 ` James Simmons
  2018-12-18  6:21 ` [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client NeilBrown
  28 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-17 16:30 UTC (permalink / raw)
  To: lustre-devel

From: Jinshan Xiong <jinshan.xiong@gmail.com>

The stripe size of a single-striped file is not reliable, so it
shouldn't be used to split I/O.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9841
Reviewed-on: https://review.whamcloud.com/28451
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_io.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 9a3352f..47bb618 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -466,7 +466,6 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 	struct cl_io	 *io  = ios->cis_io;
 	u64 start = io->u.ci_rw.crw_pos;
 	struct lov_stripe_md_entry *lse;
-	unsigned long ssize;
 	int index;
 	u64 next;
 
@@ -491,11 +490,15 @@ static int lov_io_rw_iter_init(const struct lu_env *env,
 
 	lse = lov_lse(lio->lis_object, index);
 
-	ssize = lse->lsme_stripe_size;
-	lov_do_div64(start, ssize);
-	next = (start + 1) * ssize;
-	if (next <= start * ssize)
-		next = ~0ull;
+	next = MAX_LFS_FILESIZE;
+	if (lse->lsme_stripe_count > 1) {
+		unsigned long ssize = lse->lsme_stripe_size;
+
+		lov_do_div64(start, ssize);
+		next = (start + 1) * ssize;
+		if (next <= start * ssize)
+			next = MAX_LFS_FILESIZE;
+	}
 
 	LASSERTF(io->u.ci_rw.crw_pos >= lse->lsme_extent.e_start,
 		 "pos %lld, [%lld, %lld]\n", io->u.ci_rw.crw_pos,
-- 
1.8.3.1
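
The boundary computation changed by patch 28/28 can be shown as a
standalone function. This is a userspace sketch under stated assumptions:
next_io_boundary() is a hypothetical name, IO_MAX is a stand-in for
MAX_LFS_FILESIZE, and plain 64-bit division replaces lov_do_div64().

```c
#include <assert.h>
#include <stdint.h>

/* stand-in for MAX_LFS_FILESIZE; the real value is platform dependent */
#define IO_MAX	UINT64_C(0x7fffffffffffffff)

/*
 * Return the file offset at which the current I/O chunk must end.
 * Only multi-stripe components are split on stripe boundaries.
 */
static uint64_t next_io_boundary(uint64_t pos, uint64_t stripe_size,
				 unsigned int stripe_count)
{
	uint64_t next = IO_MAX;

	if (stripe_count > 1) {
		uint64_t idx;

		assert(stripe_size != 0);
		idx = pos / stripe_size;	/* stripe-sized chunk index */
		next = (idx + 1) * stripe_size;
		if (next <= idx * stripe_size)	/* multiplication wrapped */
			next = IO_MAX;
	}
	return next;
}
```

For a single-stripe component the stripe size is never consulted, which
matches the intent stated in the patch description.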

* [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout
  2018-12-17 16:29 ` [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout James Simmons
@ 2018-12-17 23:54   ` NeilBrown
  2018-12-18  1:47     ` Patrick Farrell
  2018-12-27  1:57     ` James Simmons
  0 siblings, 2 replies; 41+ messages in thread
From: NeilBrown @ 2018-12-17 23:54 UTC (permalink / raw)
  To: lustre-devel

On Mon, Dec 17 2018, James Simmons wrote:

> From: Niu Yawei <yawei.niu@intel.com>
>
> Added basic structures and magic numbers for composite layout.
>

This would be a great place to (briefly) explain what PFL does and what it
is going to do with these data structures.
What are the "components" and how do they form a "composite layout"?

> +
> +enum lov_comp_md_entry_flags {
> +	LCME_FL_PRIMARY		= 0x00000001,   /* Not used */
> +	LCME_FL_STALE		= 0x00000002,   /* Not used */
> +	LCME_FL_OFFLINE		= 0x00000004,   /* Not used */
> +	LCME_FL_PREFERRED	= 0x00000008,	/* Not used */
> +	LCME_FL_INIT		= 0x00000010,	/* instantiated */
> +};
> +
> +#define LCME_KNOWN_FLAGS	LCME_FL_INIT

What is a "KNOWN" flag?  What isn't known about the other ones?

> +
> +/* lcme_id can be specified as certain flags, and the the first
                                                     ^^^^^^^
Too many "the"s.                                                     

> + * bit of lcme_id is used to indicate that the ID is representing
> + * certain LCME_FL_* but not a real ID. Which implies we can have
> + * at most 31 flags (see LCME_FL_XXX).
> + */
> +enum lcme_id {
> +	LCME_ID_INVAL	= 0x0,
> +	LCME_ID_MAX	= 0x7FFFFFFF,
> +	LCME_ID_ALL	= 0xFFFFFFFF,
> +	LCME_ID_NONE	= 0x80000000
> +};
> +
> +#define LCME_ID_MASK	LCME_ID_MAX

Why is MASK a #define, but MAX an enum ??

> +
> +struct lov_comp_md_entry_v1 {
> +	__u32			lcme_id;	/* unique id of component */
> +	__u32			lcme_flags;	/* LCME_FL_XXX */
> +	struct lu_extent	lcme_extent;	/* file extent for component */
> +	__u32			lcme_offset;	/* offset of component blob,
> +						 * start from lov_comp_md_v1
> +						 */
> +	__u32			lcme_size;	/* size of component blob */
> +	__u64			lcme_padding[2];
> +} __packed;
> +
> +enum lov_comp_md_flags;

This enum is empty, and never used.

It eventually gets some LCM_FL_* names added... maybe it should wait
until those are added??

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 02/28] lustre: lov: move code for PFL work
  2018-12-17 16:29 ` [lustre-devel] [PATCH 02/28] lustre: lov: move code for PFL work James Simmons
@ 2018-12-18  0:00   ` NeilBrown
  2018-12-27  1:59     ` James Simmons
  0 siblings, 1 reply; 41+ messages in thread
From: NeilBrown @ 2018-12-18  0:00 UTC (permalink / raw)
  To: lustre-devel

On Mon, Dec 17 2018, James Simmons wrote:

> From: Bobi Jam <bobijam@hotmail.com>
>
> Move lov_tgt_maxbytes() and lsm_free_plain() toward the top of
> lov_ea.c for upcoming PFL work.

No mention of the lsm_op_find() move??
That function is still in lov_internal.h in OpenSFS lustre !!

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
  2018-12-17 16:29 ` [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling James Simmons
@ 2018-12-18  0:09   ` NeilBrown
  2018-12-18  1:49     ` Patrick Farrell
  2018-12-27  2:04     ` James Simmons
  0 siblings, 2 replies; 41+ messages in thread
From: NeilBrown @ 2018-12-18  0:09 UTC (permalink / raw)
  To: lustre-devel

On Mon, Dec 17 2018, James Simmons wrote:

> From: Bobi Jam <bobijam@hotmail.com>
>
> Several of the struct lsm_operations functions for both v1 and v3
> are nearly identical. Let merge them together.
                        Let's
                       
>  
> -const struct lsm_operations lsm_v1_ops = {
> -	.lsm_free	    = lsm_free_plain,
> -	.lsm_stripe_by_index    = lsm_stripe_by_index_plain,
> -	.lsm_stripe_by_offset   = lsm_stripe_by_offset_plain,
> -	.lsm_lmm_verify	 = lsm_lmm_verify_v1,
> -	.lsm_unpackmd	   = lsm_unpackmd_v1,
> +const static struct lsm_operations lsm_v1_ops = {
> +	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
> +	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
> +	.lsm_lmm_verify		= lsm_lmm_verify_v1,
> +	.lsm_unpackmd		= lsm_unpackmd_v1,

I *SO* wish you would stop combining whitespace fixes with other
changes in the same patch!!!
The above hunk should just add the 'static' and remove the 'lsm_free'.
The rest is just noise and makes it harder to review the patch.


Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout
  2018-12-17 23:54   ` NeilBrown
@ 2018-12-18  1:47     ` Patrick Farrell
  2018-12-27  1:57     ` James Simmons
  1 sibling, 0 replies; 41+ messages in thread
From: Patrick Farrell @ 2018-12-18  1:47 UTC (permalink / raw)
  To: lustre-devel


Re: KNOWN.

These are on-disk flags.  It's a mask used to check for unknown flags.
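
A minimal sketch of how such a mask can be used, assuming the flag values
from the quoted patch. The helper name lcme_flags_are_known() is
hypothetical and does not appear in the series; it only illustrates the
"reject anything outside the mask" idea.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the quoted patch. */
enum lov_comp_md_entry_flags {
	LCME_FL_PRIMARY		= 0x00000001,
	LCME_FL_STALE		= 0x00000002,
	LCME_FL_OFFLINE		= 0x00000004,
	LCME_FL_PREFERRED	= 0x00000008,
	LCME_FL_INIT		= 0x00000010,
};

#define LCME_KNOWN_FLAGS	LCME_FL_INIT

/*
 * Hypothetical helper (not from the patch): returns true when no bit
 * outside LCME_KNOWN_FLAGS is set in the on-disk flags word.
 */
static bool lcme_flags_are_known(uint32_t flags)
{
	return (flags & ~(uint32_t)LCME_KNOWN_FLAGS) == 0;
}
```

With this definition only LCME_FL_INIT passes the check; the "Not used"
flags would be treated as unknown until they are added to the mask.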

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
  2018-12-18  0:09   ` NeilBrown
@ 2018-12-18  1:49     ` Patrick Farrell
  2018-12-27  2:10       ` James Simmons
  2018-12-27  2:04     ` James Simmons
  1 sibling, 1 reply; 41+ messages in thread
From: Patrick Farrell @ 2018-12-18  1:49 UTC (permalink / raw)
  To: lustre-devel


Just throwing this in, I agree with Neil very strongly here.  I've seen some patches recently with ~20 lines of change and 200 lines of whitespace.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client
  2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
                   ` (27 preceding siblings ...)
  2018-12-17 16:30 ` [lustre-devel] [PATCH 28/28] lustre: lov: do not split IO for single striped file James Simmons
@ 2018-12-18  6:21 ` NeilBrown
  2018-12-20  1:39   ` NeilBrown
  28 siblings, 1 reply; 41+ messages in thread
From: NeilBrown @ 2018-12-18  6:21 UTC (permalink / raw)
  To: lustre-devel

On Mon, Dec 17 2018, James Simmons wrote:

> This is the initial PFL port to the linux lustre client. This opens
> up feed back on the port so far. Currently sanity passes but the
> test for sanity-pfl fail as below. I have been tracking downing
> various bugs but this one remains and I haven't found out why its
> failing. So far from what I can tell is lov_io_setattr_iter_init()
> it returning -ENODATA due to lsm_entry_inited() is not initialized.

Having that invariant in cl_io_iter_fini() seems strange.
It is guaranteed to fire if cl_io_iter_init() fails - if that is not
permitted, I would expect an invariant a lot closer to the failure.

What happens if you just remove the LINVRNT() ??

NeilBrown

> Hoping that sending this out more eyes might help to see where this
> last problem is.
>
> Lustre: DEBUG MARKER: == sanity-pfl test 0: Create full components file, no reused OSTs =======
> ============================= 10:53:08 (1545061988)
> Lustre: DEBUG MARKER: create directory /lustre/lustre/d0.sanity-pfl
> Lustre: DEBUG MARKER: create comp_file
> Lustre: DEBUG MARKER: instantiate components
> LustreError: 19350:0:(cl_io.c:439:cl_io_iter_fini()) ASSERTION( io->ci_state == CIS_UNLOCKED )
> failed:
> LustreError: 19350:0:(cl_io.c:439:cl_io_iter_fini()) LBUG
> Pid: 19350, comm: dd 4.20.0-rc6+ #1 SMP PREEMPT Sat Dec 15 11:22:06 EST 2018
> Call Trace:
>   libcfs_call_trace+0x8b/0xc0 [libcfs]
>   lbug_with_loc+0x41/0x90 [libcfs]
>   cl_io_iter_fini+0x10c/0x110 [obdclass]
>   cl_io_loop+0x46/0x220 [obdclass]
>   cl_setattr_ost+0x1ed/0x2a0 [lustre]
>   ll_setattr_raw+0x797/0x980 [lustre]
>   notify_change+0x1dc/0x430
>   do_truncate+0x72/0xc0
>   do_sys_ftruncate+0xf5/0x160
>   do_syscall_64+0x68/0x38f
>
> Bobi Jam (20):
>   lustre: lov: move code for PFL work
>   lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
>   lustre: lov: fold lmm_verify() handling into lmm_unpackmd()
>   lustre: lov: create struct lov_stripe_md_entry
>   lustre: lov: add composite layout unpacking
>   lustre: lov: embedded raid0 in struct lov_layout_composite
>   lustre: lov: migrate lov raid0 to future PFL component handling
>   lustre: lov: reduce code indentation
>   lustre: lov: change lo_entries to array.
>   lustre: lov: move around PFL code and cleanups
>   lustre: lov: remove lsm_stripe_by_[index|offset]_plain
>   lustre: lov: add looping lsm_entry_count times
>   lustre: lov: create lov_comp_* wrappers
>   lustre: clio: client side implementation for PFL
>   lustre: pfl: dynamic layout modification with write/truncate
>   lustre: pfl: calculate PFL file LOVEA correctly
>   lustre: lov: keep minimum LOVEA size
>   lustre: pfl: fix hang with grouplocks
>   lustre: pfl: fix ost pool op->size handling
>   lustre: llite: restore ll_file_getstripe in ll_lov_setstripe
>
> Fan Yong (1):
>   lustre: pfl: enhance PFID EA for PFL
>
> Jinshan Xiong (3):
>   lustre: pfl: Read should not trigger layout write intent
>   lustre: lov: readahead shouldn't exceed component boundary
>   lustre: lov: do not split IO for single striped file
>
> Niu Yawei (4):
>   lustre: pfl: Basic data structures for composite layout
>   lustre: clio: getstripe support comp layout
>   lustre: uapi: support negative flags
>   lustre: llite: return v1/v3 layout for legacy app
>
>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  |  36 +-
>  .../lustre/include/uapi/linux/lustre/lustre_user.h |  88 ++-
>  drivers/staging/lustre/lustre/include/cl_object.h  |  12 +-
>  drivers/staging/lustre/lustre/include/lustre_sec.h |   4 +-
>  .../staging/lustre/lustre/include/lustre_swab.h    |   1 +
>  drivers/staging/lustre/lustre/include/obd.h        |   4 -
>  drivers/staging/lustre/lustre/llite/dir.c          |  38 +-
>  drivers/staging/lustre/lustre/llite/file.c         | 185 +++--
>  .../staging/lustre/lustre/llite/llite_internal.h   |   3 +
>  drivers/staging/lustre/lustre/llite/vvp_io.c       |  44 +-
>  drivers/staging/lustre/lustre/llite/xattr.c        |  70 +-
>  .../staging/lustre/lustre/lov/lov_cl_internal.h    | 191 ++---
>  drivers/staging/lustre/lustre/lov/lov_ea.c         | 570 ++++++++++----
>  drivers/staging/lustre/lustre/lov/lov_internal.h   | 175 +++--
>  drivers/staging/lustre/lustre/lov/lov_io.c         | 651 +++++++++-------
>  drivers/staging/lustre/lustre/lov/lov_lock.c       |  94 ++-
>  drivers/staging/lustre/lustre/lov/lov_merge.c      |  12 +-
>  drivers/staging/lustre/lustre/lov/lov_object.c     | 833 ++++++++++++---------
>  drivers/staging/lustre/lustre/lov/lov_offset.c     |  65 +-
>  drivers/staging/lustre/lustre/lov/lov_pack.c       | 364 +++++----
>  drivers/staging/lustre/lustre/lov/lov_page.c       |  42 +-
>  drivers/staging/lustre/lustre/lov/lov_pool.c       |  20 +-
>  drivers/staging/lustre/lustre/lov/lovsub_object.c  |  23 +-
>  drivers/staging/lustre/lustre/mdc/mdc_locks.c      |  79 +-
>  drivers/staging/lustre/lustre/obdclass/cl_object.c |   5 +-
>  drivers/staging/lustre/lustre/obdclass/genops.c    |  16 +-
>  drivers/staging/lustre/lustre/osc/osc_io.c         |   4 +-
>  drivers/staging/lustre/lustre/ptlrpc/layout.c      |   6 +-
>  .../staging/lustre/lustre/ptlrpc/pack_generic.c    |  84 ++-
>  .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |   7 +-
>  drivers/staging/lustre/lustre/ptlrpc/sec.c         |   5 +-
>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 125 +++-
>  32 files changed, 2483 insertions(+), 1373 deletions(-)
>
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client
  2018-12-18  6:21 ` [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client NeilBrown
@ 2018-12-20  1:39   ` NeilBrown
  2018-12-27  1:53     ` James Simmons
  0 siblings, 1 reply; 41+ messages in thread
From: NeilBrown @ 2018-12-20  1:39 UTC (permalink / raw)
  To: lustre-devel

On Tue, Dec 18 2018, NeilBrown wrote:

> On Mon, Dec 17 2018, James Simmons wrote:
>
>> This is the initial PFL port to the linux lustre client. This opens
>> up feed back on the port so far. Currently sanity passes but the
>> test for sanity-pfl fail as below. I have been tracking downing
>> various bugs but this one remains and I haven't found out why its
>> failing. So far from what I can tell is lov_io_setattr_iter_init()
>> it returning -ENODATA due to lsm_entry_inited() is not initialized.
>
> Having that invariant in cl_io_iter_fini() seems strange.
> It is guaranteed to fire if cl_io_iter_init() fails - if that is not
> permitted, I would expect an invariant a lot closer to the failure.
>
> What happens if you just remove the LINVRNT() ??

I dug through the code some more, and I'm sure that LINVRNT() is wrong.

The cl_io_iter_init() call is meant to fail early, before ci_state gets to
CIS_LOCKED, let alone CIS_UNLOCKED.  It sets ->ci_need_write_intent when
it records the failure.  The code is then meant to fall through to
the cl_io_fini() call in cl_setattr_ost(), which calls into vvp_io_fini(),
which notices ->ci_need_write_intent, and calls ll_layout_write_intent(),
which presumably initializes the things that weren't initialized before.
This also sets ->ci_need_restart = 1 so that cl_setattr_ost() loops
around to "again:" and calls cl_io_init() again.

So the invariant in cl_io_iter_fini() should probably be

	LINVRNT(io->ci_state == CIS_INIT || io->ci_state == CIS_UNLOCKED);

or something like that.  Maybe needs CIS_IT_ENDED as well.

	LINVRNT(io->ci_state <= CIS_INIT || io->ci_state >= CIS_UNLOCKED);

??
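
To make the difference between the strict check and the relaxed range
check concrete, here is a minimal standalone sketch. The enum ordering
below is an assumption modelled on the cl_io state machine (only the
relative order of the states matters here), and both predicate names
are hypothetical, not functions from the Lustre tree.

```c
#include <stdbool.h>

/* Assumed ordering of the I/O states; treat the exact list as illustrative. */
enum cl_io_state {
	CIS_ZERO,
	CIS_INIT,
	CIS_IT_STARTED,
	CIS_LOCKED,
	CIS_IO_GOING,
	CIS_IO_FINISHED,
	CIS_UNLOCKED,
	CIS_IT_ENDED,
	CIS_FINI,
};

/* Current check: only accepts an iteration that ran all the way to unlock. */
static bool iter_fini_state_ok_strict(enum cl_io_state s)
{
	return s == CIS_UNLOCKED;
}

/*
 * Relaxed check: also accepts an iteration that never started (the
 * early-failure path) and one that has already ended.
 */
static bool iter_fini_state_ok_relaxed(enum cl_io_state s)
{
	return s <= CIS_INIT || s >= CIS_UNLOCKED;
}
```

Under these assumptions the relaxed form still rejects the states in
the middle of an iteration (CIS_IT_STARTED through CIS_IO_FINISHED),
which is presumably what the invariant is really meant to catch.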

Thanks,
NeilBrown

>
> NeilBrown
>
>> Hoping that sending this out more eyes might help to see where this
>> last problem is.
>>
>> Lustre: DEBUG MARKER: == sanity-pfl test 0: Create full components file, no reused OSTs =======
>> ============================= 10:53:08 (1545061988)
>> Lustre: DEBUG MARKER: create directory /lustre/lustre/d0.sanity-pfl
>> Lustre: DEBUG MARKER: create comp_file
>> Lustre: DEBUG MARKER: instantiate components
>> LustreError: 19350:0:(cl_io.c:439:cl_io_iter_fini()) ASSERTION( io->ci_state == CIS_UNLOCKED )
>> failed:
>> LustreError: 19350:0:(cl_io.c:439:cl_io_iter_fini()) LBUG
>> Pid: 19350, comm: dd 4.20.0-rc6+ #1 SMP PREEMPT Sat Dec 15 11:22:06 EST 2018
>> Call Trace:
>>   libcfs_call_trace+0x8b/0xc0 [libcfs]
>>   lbug_with_loc+0x41/0x90 [libcfs]
>>   cl_io_iter_fini+0x10c/0x110 [obdclass]
>>   cl_io_loop+0x46/0x220 [obdclass]
>>   cl_setattr_ost+0x1ed/0x2a0 [lustre]
>>   ll_setattr_raw+0x797/0x980 [lustre]
>>   notify_change+0x1dc/0x430
>>   do_truncate+0x72/0xc0
>>   do_sys_ftruncate+0xf5/0x160
>>   do_syscall_64+0x68/0x38f
>>
>> Bobi Jam (20):
>>   lustre: lov: move code for PFL work
>>   lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
>>   lustre: lov: fold lmm_verify() handling into lmm_unpackmd()
>>   lustre: lov: create struct lov_stripe_md_entry
>>   lustre: lov: add composite layout unpacking
>>   lustre: lov: embedded raid0 in struct lov_layout_composite
>>   lustre: lov: migrate lov raid0 to future PFL component handling
>>   lustre: lov: reduce code indentation
>>   lustre: lov: change lo_entries to array.
>>   lustre: lov: move around PFL code and cleanups
>>   lustre: lov: remove lsm_stripe_by_[index|offset]_plain
>>   lustre: lov: add looping lsm_entry_count times
>>   lustre: lov: create lov_comp_* wrappers
>>   lustre: clio: client side implementation for PFL
>>   lustre: pfl: dynamic layout modification with write/truncate
>>   lustre: pfl: calculate PFL file LOVEA correctly
>>   lustre: lov: keep minimum LOVEA size
>>   lustre: pfl: fix hang with grouplocks
>>   lustre: pfl: fix ost pool op->size handling
>>   lustre: llite: restore ll_file_getstripe in ll_lov_setstripe
>>
>> Fan Yong (1):
>>   lustre: pfl: enhance PFID EA for PFL
>>
>> Jinshan Xiong (3):
>>   lustre: pfl: Read should not trigger layout write intent
>>   lustre: lov: readahead shouldn't exceed component boundary
>>   lustre: lov: do not split IO for single striped file
>>
>> Niu Yawei (4):
>>   lustre: pfl: Basic data structures for composite layout
>>   lustre: clio: getstripe support comp layout
>>   lustre: uapi: support negative flags
>>   lustre: llite: return v1/v3 layout for legacy app
>>
>>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  |  36 +-
>>  .../lustre/include/uapi/linux/lustre/lustre_user.h |  88 ++-
>>  drivers/staging/lustre/lustre/include/cl_object.h  |  12 +-
>>  drivers/staging/lustre/lustre/include/lustre_sec.h |   4 +-
>>  .../staging/lustre/lustre/include/lustre_swab.h    |   1 +
>>  drivers/staging/lustre/lustre/include/obd.h        |   4 -
>>  drivers/staging/lustre/lustre/llite/dir.c          |  38 +-
>>  drivers/staging/lustre/lustre/llite/file.c         | 185 +++--
>>  .../staging/lustre/lustre/llite/llite_internal.h   |   3 +
>>  drivers/staging/lustre/lustre/llite/vvp_io.c       |  44 +-
>>  drivers/staging/lustre/lustre/llite/xattr.c        |  70 +-
>>  .../staging/lustre/lustre/lov/lov_cl_internal.h    | 191 ++---
>>  drivers/staging/lustre/lustre/lov/lov_ea.c         | 570 ++++++++++----
>>  drivers/staging/lustre/lustre/lov/lov_internal.h   | 175 +++--
>>  drivers/staging/lustre/lustre/lov/lov_io.c         | 651 +++++++++-------
>>  drivers/staging/lustre/lustre/lov/lov_lock.c       |  94 ++-
>>  drivers/staging/lustre/lustre/lov/lov_merge.c      |  12 +-
>>  drivers/staging/lustre/lustre/lov/lov_object.c     | 833 ++++++++++++---------
>>  drivers/staging/lustre/lustre/lov/lov_offset.c     |  65 +-
>>  drivers/staging/lustre/lustre/lov/lov_pack.c       | 364 +++++----
>>  drivers/staging/lustre/lustre/lov/lov_page.c       |  42 +-
>>  drivers/staging/lustre/lustre/lov/lov_pool.c       |  20 +-
>>  drivers/staging/lustre/lustre/lov/lovsub_object.c  |  23 +-
>>  drivers/staging/lustre/lustre/mdc/mdc_locks.c      |  79 +-
>>  drivers/staging/lustre/lustre/obdclass/cl_object.c |   5 +-
>>  drivers/staging/lustre/lustre/obdclass/genops.c    |  16 +-
>>  drivers/staging/lustre/lustre/osc/osc_io.c         |   4 +-
>>  drivers/staging/lustre/lustre/ptlrpc/layout.c      |   6 +-
>>  .../staging/lustre/lustre/ptlrpc/pack_generic.c    |  84 ++-
>>  .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |   7 +-
>>  drivers/staging/lustre/lustre/ptlrpc/sec.c         |   5 +-
>>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 125 +++-
>>  32 files changed, 2483 insertions(+), 1373 deletions(-)
>>
>> -- 
>> 1.8.3.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client
  2018-12-20  1:39   ` NeilBrown
@ 2018-12-27  1:53     ` James Simmons
  0 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-27  1:53 UTC (permalink / raw)
  To: lustre-devel


> On Tue, Dec 18 2018, NeilBrown wrote:
> 
> > On Mon, Dec 17 2018, James Simmons wrote:
> >
> >> This is the initial PFL port to the linux lustre client. This opens
> >> up feed back on the port so far. Currently sanity passes but the
> >> test for sanity-pfl fail as below. I have been tracking downing
> >> various bugs but this one remains and I haven't found out why its
> >> failing. So far from what I can tell is lov_io_setattr_iter_init()
> >> it returning -ENODATA due to lsm_entry_inited() is not initialized.
> >
> > Having that invariant in cl_io_iter_fini() seems strange.
> > It is guaranteed to fire if cl_io_iter_init() fails - if that is not
> > permitted, I would expect an invariant a lot closer to the failure.
> >
> > What happens if you just remove the LINVRNT() ??
> 
> I dug through the code some more, and I'm sure that LINVRNT() is wrong.
> 
> The cl_io_iter_init() call is meant to fail early, before ci_state gets to
> CIS_LOCKED, let alone CIS_UNLOCKED.  It sets ->ci_need_write_intent when
> it records the failure.  The code is then meant to fall through to
> the cl_io_fini() call in cl_setattr_ost(), which calls into vvp_io_fini(),
> which notices ->ci_need_write_intent, and calls ll_layout_write_intent(),
> which presumably initializes the things that weren't initialized before.
> This also sets ->ci_need_restart = 1 so that cl_setattr_ost() loops
> around to "again:" and calls cl_io_init() again.
> 
> So the invariant in cl_io_iter_fini() should probably be
> 
> 	LINVRNT(io->ci_state == CIS_INIT || io->ci_state == CIS_UNLOCKED);
> 
> or something like that.  Maybe needs CIS_IT_ENDED as well.
> 
> 	LINVRNT(io->ci_state <= CIS_INIT || io->ci_state >= CIS_UNLOCKED);
> 
> ??

You are right. I spent two weeks thinking I did the port wrong :-( I used
the second version, which worked, and saw only sanity-pfl test 11 failing.
I opened a ticket on this issue:

https://jira.whamcloud.com/browse/LU-11828

and have pushed a patch for Bobi Jam to look at. We should have something
worked out soon. So PFL mostly worked outside of that. I will combine this
fix with a bunch of others. I tracked down the majority of the causes of the
failures seen in the sanity testing.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout
  2018-12-17 23:54   ` NeilBrown
  2018-12-18  1:47     ` Patrick Farrell
@ 2018-12-27  1:57     ` James Simmons
  1 sibling, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-27  1:57 UTC (permalink / raw)
  To: lustre-devel


> > From: Niu Yawei <yawei.niu@intel.com>
> >
> > Added basic structures and magic numbers for composite layout.
> >
> 
> This would be a great place to (briefly) explain what PFL does and what it
> is going to do with these data structures.
> What are the "components" and how do they form a "composite layout" ??

Would the link http://wiki.lustre.org/PFL_Prototype_High_Level_Design
be enough?
 
> > +
> > +enum lov_comp_md_entry_flags {
> > +	LCME_FL_PRIMARY		= 0x00000001,   /* Not used */
> > +	LCME_FL_STALE		= 0x00000002,   /* Not used */
> > +	LCME_FL_OFFLINE		= 0x00000004,   /* Not used */
> > +	LCME_FL_PREFERRED	= 0x00000008,	/* Not used */
> > +	LCME_FL_INIT		= 0x00000010,	/* instantiated */
> > +};
> > +
> > +#define LCME_KNOWN_FLAGS	LCME_FL_INIT
> 
> What is a "KNOWN" flag?  What isn't known about the other ones?

Patrick answered this one :-)

> > + * bit of lcme_id is used to indicate that the ID is representing
> > + * certain LCME_FL_* but not a real ID. Which implies we can have
> > + * at most 31 flags (see LCME_FL_XXX).
> > + */
> > +enum lcme_id {
> > +	LCME_ID_INVAL	= 0x0,
> > +	LCME_ID_MAX	= 0x7FFFFFFF,
> > +	LCME_ID_ALL	= 0xFFFFFFFF,
> > +	LCME_ID_NONE	= 0x80000000
> > +};
> > +
> > +#define LCME_ID_MASK	LCME_ID_MAX
> 
> Why is MASK a #define, but MAX an enum ??

Actually I looked and this mask is only needed for server code. Sadly 
developers are still using lustre_idl.h and lustre_user.h as dumpsters :-(
Will remove.

> > +
> > +struct lov_comp_md_entry_v1 {
> > +	__u32			lcme_id;	/* unique id of component */
> > +	__u32			lcme_flags;	/* LCME_FL_XXX */
> > +	struct lu_extent	lcme_extent;	/* file extent for component */
> > +	__u32			lcme_offset;	/* offset of component blob,
> > +						 * start from lov_comp_md_v1
> > +						 */
> > +	__u32			lcme_size;	/* size of component blob */
> > +	__u64			lcme_padding[2];
> > +} __packed;
> > +
> > +enum lov_comp_md_flags;
> 
> This enum is empty, and never used.
> 
> It eventually gets some LCM_FL_* names added... maybe it should wait
> until those are added??

Also removed. Strange that this exists in the 2.10 branch but it's not used.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 02/28] lustre: lov: move code for PFL work
  2018-12-18  0:00   ` NeilBrown
@ 2018-12-27  1:59     ` James Simmons
  0 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-27  1:59 UTC (permalink / raw)
  To: lustre-devel


> On Mon, Dec 17 2018, James Simmons wrote:
> 
> > From: Bobi Jam <bobijam@hotmail.com>
> >
> > Move lov_tgt_maxbytes() and lsm_free_plain() toward the top of
> > lov_ea.c for upcoming PFL work.
> 
> No mention of the lsm_op_find() move??
> That function is still in lov_internal.h in OpenSFS lustre !!

Added details of that move. This is one of those changes that Greg
asked for in the past. He didn't like extern struct .... in a header
file along with an inline function to handle those external structs.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
  2018-12-18  0:09   ` NeilBrown
  2018-12-18  1:49     ` Patrick Farrell
@ 2018-12-27  2:04     ` James Simmons
  1 sibling, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-27  2:04 UTC (permalink / raw)
  To: lustre-devel


> > -const struct lsm_operations lsm_v1_ops = {
> > -	.lsm_free	    = lsm_free_plain,
> > -	.lsm_stripe_by_index    = lsm_stripe_by_index_plain,
> > -	.lsm_stripe_by_offset   = lsm_stripe_by_offset_plain,
> > -	.lsm_lmm_verify	 = lsm_lmm_verify_v1,
> > -	.lsm_unpackmd	   = lsm_unpackmd_v1,
> > +const static struct lsm_operations lsm_v1_ops = {
> > +	.lsm_stripe_by_index	= lsm_stripe_by_index_plain,
> > +	.lsm_stripe_by_offset	= lsm_stripe_by_offset_plain,
> > +	.lsm_lmm_verify		= lsm_lmm_verify_v1,
> > +	.lsm_unpackmd		= lsm_unpackmd_v1,
> 
> I *SO* wish you would stop combining whitespace fixes with other
> changes in the same patch!!!
> The above hunk should just add the 'static' and remove the 'lsm_free'.
> The rest is just noise and makes it harder to review the patch.

I can do that for the next update. This is just the early version. The
reason that got in is that I often do a comparison of the same files between
the LTS and the linux client. For me that is the easiest way to see if I
made a porting mistake.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
  2018-12-18  1:49     ` Patrick Farrell
@ 2018-12-27  2:10       ` James Simmons
  0 siblings, 0 replies; 41+ messages in thread
From: James Simmons @ 2018-12-27  2:10 UTC (permalink / raw)
  To: lustre-devel


> Just throwing this in, I agree with Neil very strongly here. I've seen some patches recently with ~20 lines of change and 200 lines of
> whitespace.

Must be a patch for the OpenSFS branch :-) Is this one of Arshad Hussain's
patches? Those are meant to be cleanup / checkpatch fixing patches.
Thankfully Arshad is doing this work. For a long time the policy for
the OpenSFS branch was to do all the cleanups with real fixes. The
reasoning was not to lose "real" fixes in the noise of cleanups. With
Arshad that seems to be changing now.

> ___________________________________________________________________________________________________________________________________________
> From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
> Sent: Monday, December 17, 2018 6:09:57 PM
> To: James Simmons; Andreas Dilger; Oleg Drokin; Bobi Jam; Jinshan Xiong
> Cc: Lustre Development List
> Subject: Re: [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling
> On Mon, Dec 17 2018, James Simmons wrote:
> 
> > From: Bobi Jam <bobijam@hotmail.com>
> >
> > Several of the struct lsm_operations functions for both v1 and v3
> > are nearly identical. Let merge them together.
>                         Let's
>
> >
> > -const struct lsm_operations lsm_v1_ops = {
> > -     .lsm_free           = lsm_free_plain,
> > -     .lsm_stripe_by_index    = lsm_stripe_by_index_plain,
> > -     .lsm_stripe_by_offset   = lsm_stripe_by_offset_plain,
> > -     .lsm_lmm_verify  = lsm_lmm_verify_v1,
> > -     .lsm_unpackmd      = lsm_unpackmd_v1,
> > +const static struct lsm_operations lsm_v1_ops = {
> > +     .lsm_stripe_by_index    = lsm_stripe_by_index_plain,
> > +     .lsm_stripe_by_offset   = lsm_stripe_by_offset_plain,
> > +     .lsm_lmm_verify         = lsm_lmm_verify_v1,
> > +     .lsm_unpackmd           = lsm_unpackmd_v1,
> 
> I *SO* wish you would stop combining white-spaces fixes with other
> changes in the same patch!!!
> The above hunk should just add the 'static' and remove the 'lsm_free'.
> The rest is just noise and makes it harder to review the patch.
> 
> 
> Thanks,
> NeilBrown
> 
> 


Thread overview: 41+ messages
2018-12-17 16:29 [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 01/28] lustre: pfl: Basic data structures for composite layout James Simmons
2018-12-17 23:54   ` NeilBrown
2018-12-18  1:47     ` Patrick Farrell
2018-12-27  1:57     ` James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 02/28] lustre: lov: move code for PFL work James Simmons
2018-12-18  0:00   ` NeilBrown
2018-12-27  1:59     ` James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 03/28] lustre: lov: merge lov_mds_md_v3 and lov_mds_md_v1 handling James Simmons
2018-12-18  0:09   ` NeilBrown
2018-12-18  1:49     ` Patrick Farrell
2018-12-27  2:10       ` James Simmons
2018-12-27  2:04     ` James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 04/28] lustre: lov: fold lmm_verify() handling into lmm_unpackmd() James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 05/28] lustre: lov: create struct lov_stripe_md_entry James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 06/28] lustre: lov: add composite layout unpacking James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 07/28] lustre: lov: embedded raid0 in struct lov_layout_composite James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 08/28] lustre: lov: migrate lov raid0 to future PFL component handling James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 09/28] lustre: lov: reduce code indentation James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 10/28] lustre: lov: change lo_entries to array James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 11/28] lustre: lov: move around PFL code and cleanups James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 12/28] lustre: lov: remove lsm_stripe_by_[index|offset]_plain James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 13/28] lustre: lov: add looping lsm_entry_count times James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 14/28] lustre: lov: create lov_comp_* wrappers James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 15/28] lustre: clio: client side implementation for PFL James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 16/28] lustre: clio: getstripe support comp layout James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 17/28] lustre: pfl: enhance PFID EA for PFL James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 18/28] lustre: pfl: dynamic layout modification with write/truncate James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 19/28] lustre: pfl: calculate PFL file LOVEA correctly James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 20/28] lustre: lov: keep minimum LOVEA size James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 21/28] lustre: pfl: Read should not trigger layout write intent James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 22/28] lustre: pfl: fix hang with grouplocks James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 23/28] lustre: pfl: fix ost pool op->size handling James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 24/28] lustre: lov: readahead shouldn't exceed component boundary James Simmons
2018-12-17 16:29 ` [lustre-devel] [PATCH 25/28] lustre: uapi: support negative flags James Simmons
2018-12-17 16:30 ` [lustre-devel] [PATCH 26/28] lustre: llite: return v1/v3 layout for legacy app James Simmons
2018-12-17 16:30 ` [lustre-devel] [PATCH 27/28] lustre: llite: restore ll_file_getstripe in ll_lov_setstripe James Simmons
2018-12-17 16:30 ` [lustre-devel] [PATCH 28/28] lustre: lov: do not split IO for single striped file James Simmons
2018-12-18  6:21 ` [lustre-devel] [PATCH RFC 00/28] lustre: PFL port to linux client NeilBrown
2018-12-20  1:39   ` NeilBrown
2018-12-27  1:53     ` James Simmons
