lustre-devel-lustre.org archive mirror
* [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52
@ 2021-06-13 23:11 James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 01/27] lustre: uapi: add mdt_hash_name James Simmons
                   ` (26 more replies)
  0 siblings, 27 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

Update the Linux client to support OpenSFS version 2.14.52

Alex Zhuravlev (1):
  lustre: mdc: start changelog thread upon first access

Alexander Boyko (1):
  lustre: llog: changelog purge deletes plain llog

Andreas Dilger (2):
  lnet: libcfs: allow comma-separated masks
  lustre: lmv: change default hash type to crush

Artem Blagodarenko (2):
  lustre: obd: check if sbi->ll_md_exp is initialized
  lnet: do not crash if lnet_sock_getaddr returns error

Bobi Jam (2):
  lustre: flr: mmap write/punch does not stale other mirrors
  lustre: llite: refresh layout after mirror merge/split

Chris Horn (2):
  lnet: Fix destination NID for discovery PUSH
  lnet: Check if discovery toggled off in ping reply

James Simmons (2):
  lustre: uapi: rename CONFIG_T_* to MGS_CFG_T_*
  lustre: rename tgt_pool_* functions.

Lai Siyao (2):
  lustre: llite: default lsm update may memory leak
  lustre: pcc: don't alloc FID in LLITE for pcc open

Li Xi (1):
  lustre: osc: cleanup comment in osc_object_is_contended

Mr NeilBrown (1):
  lnet: o2iblnd: fix bug in list_first_entry() change.

Olaf Faaland (1):
  lnet: simplify lnet_ni_add_interface

Oleg Drokin (1):
  lustre: update version to 2.14.52

Patrick Farrell (2):
  lustre: osc: Batch gang_lookup cbs
  lustre: llite: Return errors for aio

Qian Yingjin (1):
  lustre: ptlrpc: move more members in PTLRPC request into pill

Sebastien Buisson (2):
  lustre: sec: forbid file rename from enc to unencrypted dir
  lustre: llite: add selinux testing

Sergey Cheremencev (1):
  lustre: quota: default OST Pool Quotas

Vitaly Fertman (2):
  lustre: ptlrpc: do not match reply with resent RPC
  lustre: vvp: wait for nrpages to be updated

Yang Sheng (1):
  lustre: uapi: add mdt_hash_name

 fs/lustre/include/cl_object.h           |   5 ++
 fs/lustre/include/lu_object.h           |  12 +--
 fs/lustre/include/lustre_disk.h         |   3 +-
 fs/lustre/include/lustre_lmv.h          |  19 +++--
 fs/lustre/include/lustre_net.h          |  75 ++--------------
 fs/lustre/include/lustre_osc.h          |   7 +-
 fs/lustre/include/lustre_req_layout.h   |  78 ++++++++++++++++-
 fs/lustre/include/obd.h                 |   2 +-
 fs/lustre/include/obd_class.h           |  11 +--
 fs/lustre/include/obd_support.h         |   5 ++
 fs/lustre/llite/dcache.c                |   6 +-
 fs/lustre/llite/dir.c                   |  32 ++++---
 fs/lustre/llite/file.c                  |   6 +-
 fs/lustre/llite/llite_internal.h        |   5 +-
 fs/lustre/llite/llite_lib.c             | 110 ++++++++++++------------
 fs/lustre/llite/llite_mmap.c            |   4 +-
 fs/lustre/llite/llite_nfs.c             |   2 +-
 fs/lustre/llite/namei.c                 |  48 ++++++-----
 fs/lustre/llite/rw26.c                  |   3 +-
 fs/lustre/llite/statahead.c             |   2 +-
 fs/lustre/llite/vvp_io.c                |   3 +-
 fs/lustre/llite/vvp_object.c            |   9 +-
 fs/lustre/lmv/lmv_intent.c              |   6 +-
 fs/lustre/lmv/lmv_obd.c                 |   5 +-
 fs/lustre/lov/lov_io.c                  |   8 +-
 fs/lustre/lov/lov_obd.c                 |  12 +--
 fs/lustre/lov/lov_pool.c                |  10 +--
 fs/lustre/mdc/mdc_acl.c                 |   3 +-
 fs/lustre/mdc/mdc_changelog.c           |  54 ++++++++----
 fs/lustre/mdc/mdc_dev.c                 |  52 +++++------
 fs/lustre/mdc/mdc_internal.h            |  36 ++++----
 fs/lustre/mdc/mdc_lib.c                 | 146 ++++++++++++++++---------------
 fs/lustre/mdc/mdc_locks.c               |  10 +--
 fs/lustre/mdc/mdc_reint.c               |  13 +--
 fs/lustre/mdc/mdc_request.c             |  43 +++++-----
 fs/lustre/mgc/mgc_internal.h            |   8 +-
 fs/lustre/mgc/mgc_request.c             |  29 ++++---
 fs/lustre/obdclass/cl_page.c            |   3 +
 fs/lustre/obdclass/llog.c               |   4 +
 fs/lustre/obdclass/lprocfs_status.c     |   1 +
 fs/lustre/obdclass/lu_tgt_pool.c        |  28 +++---
 fs/lustre/obdclass/obd_mount.c          |   4 +-
 fs/lustre/obdecho/echo_client.c         |   4 +-
 fs/lustre/osc/osc_cache.c               | 147 +++++++++++++++++---------------
 fs/lustre/osc/osc_io.c                  |  33 ++++---
 fs/lustre/osc/osc_lock.c                |  19 +++--
 fs/lustre/osc/osc_object.c              |   4 -
 fs/lustre/ptlrpc/client.c               |  21 +++--
 fs/lustre/ptlrpc/layout.c               |  32 +++++--
 fs/lustre/ptlrpc/niobuf.c               |  20 +++--
 fs/lustre/ptlrpc/pack_generic.c         |  70 ++++++++-------
 fs/lustre/ptlrpc/ptlrpc_internal.h      |   2 +-
 fs/lustre/ptlrpc/sec.c                  |   6 +-
 fs/lustre/ptlrpc/sec_plain.c            |   4 +-
 fs/lustre/ptlrpc/service.c              |  18 +++-
 fs/lustre/ptlrpc/wiretest.c             |  10 ++-
 include/linux/lnet/lib-types.h          |   2 +
 include/uapi/linux/lustre/lustre_idl.h  |  14 +--
 include/uapi/linux/lustre/lustre_user.h |  14 +--
 include/uapi/linux/lustre/lustre_ver.h  |   4 +-
 net/lnet/klnds/o2iblnd/o2iblnd.c        |   2 +-
 net/lnet/klnds/socklnd/socklnd.c        |   5 +-
 net/lnet/libcfs/libcfs_string.c         |   6 +-
 net/lnet/lnet/acceptor.c                |   5 +-
 net/lnet/lnet/config.c                  |  27 +++---
 net/lnet/lnet/peer.c                    |  80 +++++++----------
 66 files changed, 798 insertions(+), 673 deletions(-)

-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

* [lustre-devel] [PATCH 01/27] lustre: uapi: add mdt_hash_name
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 02/27] lustre: uapi: rename CONFIG_T_* to MGS_CFG_T_* James Simmons
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Yang Sheng, Lustre Development List

From: Yang Sheng <ys@whamcloud.com>

Add the mdt_hash_name[] table to map LMV_HASH_TYPE_* values to their
string names. It will be used to enhance debugging information.
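
For illustration only (not part of this patch), the lookup can be
sketched in userspace C as below. The enum member names and values
other than LMV_HASH_TYPE_MAX are assumptions, and hash_type_to_name()
is a hypothetical helper. The mask works only because
LMV_HASH_TYPE_MAX is assumed to be a power of two.

```c
#include <stddef.h>

/* Assumed stand-in for the enum in lustre_user.h. */
enum lmv_hash_type {
	LMV_HASH_TYPE_UNKNOWN	= 0,
	LMV_HASH_TYPE_ALL_CHARS	= 1,
	LMV_HASH_TYPE_FNV_1A_64	= 2,
	LMV_HASH_TYPE_CRUSH	= 3,
	LMV_HASH_TYPE_MAX,
};

/* Same strings, in the same order, as the table added by the patch. */
static const char *const mdt_hash_name[] = {
	"none",
	"all_char",
	"fnv_1a_64",
	"crush",
};

/* Hypothetical helper: drop the flag bits, then index the name table. */
static const char *hash_type_to_name(unsigned int hash_type)
{
	/* LMV_HASH_TYPE_MAX is 4 here, so the mask keeps the low two bits. */
	unsigned int idx = hash_type & (LMV_HASH_TYPE_MAX - 1);

	return mdt_hash_name[idx];
}
```

Any flag bits set above the type field are masked off before the
table is indexed, so the index stays inside the four-entry array.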

WC-bug-id: https://jira.whamcloud.com/browse/LU-11776
Lustre-commit: 00141b1a746d4733 ("LU-11776 utils: add support lfs find with mdt hash flag")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39340
Reviewed-by: James Simmons <jsimmons@infradead.org>
WC-bug-id: https://jira.whamcloud.com/browse/LU-8837
Lustre-commit: 5ecd5a5ecfb880236 ("LU-14291 lustre: limit header scope for server only handling")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43096
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
 fs/lustre/include/lustre_lmv.h          | 19 +++++++++++++------
 fs/lustre/llite/llite_lib.c             |  2 +-
 include/uapi/linux/lustre/lustre_user.h | 11 +++++++----
 3 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/fs/lustre/include/lustre_lmv.h b/fs/lustre/include/lustre_lmv.h
index a74f0a5..6861dd0 100644
--- a/fs/lustre/include/lustre_lmv.h
+++ b/fs/lustre/include/lustre_lmv.h
@@ -115,16 +115,20 @@ static inline bool lmv_dir_bad_hash(const struct lmv_stripe_md *lsm)
 
 static inline void lsm_md_dump(int mask, const struct lmv_stripe_md *lsm)
 {
+	bool valid_hash = lmv_dir_bad_hash(lsm);
 	int i;
 
 	/* If lsm_md_magic == LMV_MAGIC_FOREIGN pool_name may not be a null
 	 * terminated string so only print LOV_MAXPOOLNAME bytes.
 	 */
 	CDEBUG(mask,
-	       "magic %#x stripe count %d master mdt %d hash type %#x max inherit %hhu version %d migrate offset %d migrate hash %#x pool %.*s\n",
+	       "magic %#x stripe count %d master mdt %d hash type %s:%#x max inherit %hhu version %d migrate offset %d migrate hash %#x pool %.*s\n",
 	       lsm->lsm_md_magic, lsm->lsm_md_stripe_count,
-	       lsm->lsm_md_master_mdt_index, lsm->lsm_md_hash_type,
-	       lsm->lsm_md_max_inherit, lsm->lsm_md_layout_version,
+	       lsm->lsm_md_master_mdt_index,
+	       valid_hash ? "invalid hash" :
+			    mdt_hash_name[lsm->lsm_md_hash_type & (LMV_HASH_TYPE_MAX - 1)],
+	       lsm->lsm_md_hash_type, lsm->lsm_md_max_inherit,
+	       lsm->lsm_md_layout_version,
 	       lsm->lsm_md_migrate_offset, lsm->lsm_md_migrate_hash,
 	       LOV_MAXPOOLNAME, lsm->lsm_md_pool_name);
 
@@ -404,10 +408,13 @@ static inline bool lmv_user_magic_supported(u32 lum_magic)
 
 #define LMV_DEBUG(mask, lmv, msg)					\
 	CDEBUG(mask,							\
-	       "%s LMV: magic=%#x count=%u index=%u hash=%#x version=%u migrate offset=%u migrate hash=%u.\n",\
+	       "%s LMV: magic=%#x count=%u index=%u hash=%s:%#x version=%u migrate offset=%u migrate hash=%s:%u.\n",\
 	       msg, (lmv)->lmv_magic, (lmv)->lmv_stripe_count,          \
-	       (lmv)->lmv_master_mdt_index, (lmv)->lmv_hash_type,       \
-	       (lmv)->lmv_layout_version, (lmv)->lmv_migrate_offset,    \
+	       (lmv)->lmv_master_mdt_index,				\
+	       mdt_hash_name[(lmv)->lmv_hash_type & (LMV_HASH_TYPE_MAX - 1)],\
+	       (lmv)->lmv_hash_type, (lmv)->lmv_layout_version,		\
+	       (lmv)->lmv_migrate_offset,				\
+	       mdt_hash_name[(lmv)->lmv_migrate_hash & (LMV_HASH_TYPE_MAX - 1)],\
 	       (lmv)->lmv_migrate_hash)
 
 /* master LMV is sane */
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index 0c914c9..39bdee0 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -2772,7 +2772,7 @@ void ll_open_cleanup(struct super_block *sb, struct ptlrpc_request *open_req)
 
 	op_data->op_fid1 = body->mbo_fid1;
 	op_data->op_open_handle = body->mbo_open_handle;
-	op_data->op_mod_time = get_seconds();
+	op_data->op_mod_time = ktime_get_real_seconds();
 	md_close(exp, op_data, NULL, &close_req);
 	ptlrpc_req_finished(close_req);
 	ll_finish_md_op_data(op_data);
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index bcb9f86..972678f 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -694,11 +694,14 @@ enum lmv_hash_type {
 	LMV_HASH_TYPE_MAX,
 };
 
-#define LMV_HASH_TYPE_DEFAULT LMV_HASH_TYPE_FNV_1A_64
+static __attribute__((unused)) const char *mdt_hash_name[] = {
+	"none",
+	"all_char",
+	"fnv_1a_64",
+	"crush",
+};
 
-#define LMV_HASH_NAME_ALL_CHARS	"all_char"
-#define LMV_HASH_NAME_FNV_1A_64	"fnv_1a_64"
-#define LMV_HASH_NAME_CRUSH	"crush"
+#define LMV_HASH_TYPE_DEFAULT LMV_HASH_TYPE_FNV_1A_64
 
 /* Right now only the lower part(0-16bits) of lmv_hash_type is being used,
  * and the higher part will be the flag to indicate the status of object,
-- 
1.8.3.1

* [lustre-devel] [PATCH 02/27] lustre: uapi: rename CONFIG_T_* to MGS_CFG_T_*
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 01/27] lustre: uapi: add mdt_hash_name James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 03/27] lnet: o2iblnd: fix bug in list_first_entry() change James Simmons
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

The Linux kernel uses CONFIG_* as a way to determine if a feature
is available. Using CONFIG_* in a UAPI header is considered an error,
and in the most recent kernels it breaks the build. While we don't
have any CONFIG_* in our UAPI headers, we do have CONFIG_T_*,
which is used for config logs. This naming confuses the Linux
kernel build system, so rename these constants to MGS_CFG_T_*
instead.
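
As a rough sketch of what the typed enum buys us (not part of this
patch), the switch in mgc_name2resid() can be reduced to the
standalone C below. cfg_type_to_resname() is a hypothetical name, and
returning -1 for an unknown type is an assumption made for the
example.

```c
/* Mirrors the renamed enum from lustre_idl.h. */
enum mgs_cfg_type {
	MGS_CFG_T_CONFIG	= 0,
	MGS_CFG_T_SPTLRPC	= 1,
	MGS_CFG_T_RECOVER	= 2,
	MGS_CFG_T_PARAMS	= 3,
	MGS_CFG_T_MAX
};

/*
 * Hypothetical helper: pick the second resource name for a log type.
 * Config and sptlrpc logs share one lock, so both map to 0.
 * Returns 0 on success, -1 for a type outside the known range.
 */
static int cfg_type_to_resname(enum mgs_cfg_type type,
			       unsigned long long *resname)
{
	switch (type) {
	case MGS_CFG_T_CONFIG:
	case MGS_CFG_T_SPTLRPC:
		*resname = 0;
		return 0;
	case MGS_CFG_T_RECOVER:
	case MGS_CFG_T_PARAMS:
		*resname = type;
		return 0;
	default:
		return -1;
	}
}
```

With a named enum in the prototype, a caller passing an unrelated
integer is easier to spot in review than with a plain int.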

WC-bug-id: https://jira.whamcloud.com/browse/LU-14651
Lustre-commit: 4d5a2eba617780ea ("LU-14651 uapi: rename CONFIG_T_* to MGS_CFG_T_*")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43494
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
 fs/lustre/include/lustre_disk.h        |  3 ++-
 fs/lustre/include/obd_class.h          |  2 +-
 fs/lustre/mgc/mgc_internal.h           |  8 ++++----
 fs/lustre/mgc/mgc_request.c            | 27 ++++++++++++++-------------
 fs/lustre/ptlrpc/wiretest.c            |  8 ++++----
 include/uapi/linux/lustre/lustre_idl.h | 13 +++++++------
 6 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/fs/lustre/include/lustre_disk.h b/fs/lustre/include/lustre_disk.h
index 81a0d40..d8686fc 100644
--- a/fs/lustre/include/lustre_disk.h
+++ b/fs/lustre/include/lustre_disk.h
@@ -154,7 +154,8 @@ struct lustre_sb_info {
 int lmd_parse(char *options, struct lustre_mount_data *lmd);
 
 /* mgc_request.c */
-int mgc_fsname2resid(char *fsname, struct ldlm_res_id *res_id, int type);
+int mgc_fsname2resid(char *fsname, struct ldlm_res_id *res_id,
+		     enum mgs_cfg_type type);
 
 /** @} disk */
 
diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h
index eb52733..5cbed01 100644
--- a/fs/lustre/include/obd_class.h
+++ b/fs/lustre/include/obd_class.h
@@ -205,7 +205,7 @@ struct config_llog_data {
 	struct config_llog_data	       *cld_recover;	/* imperative recover log */
 	struct obd_export	       *cld_mgcexp;
 	struct mutex			cld_lock;
-	int				cld_type;
+	enum mgs_cfg_type		cld_type;
 	unsigned int			cld_stopping:1, /*
 							 * we were told to stop
 							 * watching
diff --git a/fs/lustre/mgc/mgc_internal.h b/fs/lustre/mgc/mgc_internal.h
index e323f90..a2a09d4 100644
--- a/fs/lustre/mgc/mgc_internal.h
+++ b/fs/lustre/mgc/mgc_internal.h
@@ -43,14 +43,14 @@
 
 int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld);
 
-static inline int cld_is_sptlrpc(struct config_llog_data *cld)
+static inline bool cld_is_sptlrpc(struct config_llog_data *cld)
 {
-	return cld->cld_type == CONFIG_T_SPTLRPC;
+	return cld->cld_type == MGS_CFG_T_SPTLRPC;
 }
 
-static inline int cld_is_recover(struct config_llog_data *cld)
+static inline bool cld_is_recover(struct config_llog_data *cld)
 {
-	return cld->cld_type == CONFIG_T_RECOVER;
+	return cld->cld_type == MGS_CFG_T_RECOVER;
 }
 
 #endif  /* _MGC_INTERNAL_H */
diff --git a/fs/lustre/mgc/mgc_request.c b/fs/lustre/mgc/mgc_request.c
index c2ad5d3..5ea965c 100644
--- a/fs/lustre/mgc/mgc_request.c
+++ b/fs/lustre/mgc/mgc_request.c
@@ -50,7 +50,7 @@
 #include "mgc_internal.h"
 
 static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
-			  int type)
+			  enum mgs_cfg_type type)
 {
 	u64 resname = 0;
 
@@ -69,12 +69,12 @@ static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
 	res_id->name[0] = cpu_to_le64(resname);
 	/* XXX: unfortunately, sptlprc and config llog share one lock */
 	switch (type) {
-	case CONFIG_T_CONFIG:
-	case CONFIG_T_SPTLRPC:
+	case MGS_CFG_T_CONFIG:
+	case MGS_CFG_T_SPTLRPC:
 		resname = 0;
 		break;
-	case CONFIG_T_RECOVER:
-	case CONFIG_T_PARAMS:
+	case MGS_CFG_T_RECOVER:
+	case MGS_CFG_T_PARAMS:
 		resname = type;
 		break;
 	default:
@@ -86,7 +86,8 @@ static int mgc_name2resid(char *name, int len, struct ldlm_res_id *res_id,
 	return 0;
 }
 
-int mgc_fsname2resid(char *fsname, struct ldlm_res_id *res_id, int type)
+int mgc_fsname2resid(char *fsname, struct ldlm_res_id *res_id,
+		     enum mgs_cfg_type type)
 {
 	/* fsname is at most 8 chars long, maybe contain "-".
 	 * e.g. "lustre", "SUN-000"
@@ -96,7 +97,7 @@ int mgc_fsname2resid(char *fsname, struct ldlm_res_id *res_id, int type)
 EXPORT_SYMBOL(mgc_fsname2resid);
 
 static int mgc_logname2resid(char *logname, struct ldlm_res_id *res_id,
-			     int type)
+			     enum mgs_cfg_type type)
 {
 	char *name_end;
 	int len;
@@ -190,7 +191,7 @@ struct config_llog_data *config_log_find(char *logname,
 static
 struct config_llog_data *do_config_log_add(struct obd_device *obd,
 					   char *logname,
-					   int type,
+					   enum mgs_cfg_type type,
 					   struct config_llog_instance *cfg,
 					   struct super_block *sb)
 {
@@ -258,13 +259,13 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 	LASSERT(lcfg.cfg_instance);
 	strcat(logname, "-cliir");
 
-	cld = do_config_log_add(obd, logname, CONFIG_T_RECOVER, &lcfg, sb);
+	cld = do_config_log_add(obd, logname, MGS_CFG_T_RECOVER, &lcfg, sb);
 	return cld;
 }
 
 static struct config_llog_data *
 config_log_find_or_add(struct obd_device *obd, char *logname,
-		       struct super_block *sb, int type,
+		       struct super_block *sb, enum mgs_cfg_type type,
 		       struct config_llog_instance *cfg)
 {
 	struct config_llog_instance lcfg = *cfg;
@@ -314,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 
 	if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
 		sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
-						     CONFIG_T_SPTLRPC, cfg);
+						     MGS_CFG_T_SPTLRPC, cfg);
 		if (IS_ERR(sptlrpc_cld)) {
 			CERROR("can't create sptlrpc log: %s\n", seclogname);
 			rc = PTR_ERR(sptlrpc_cld);
@@ -324,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 
 	if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
 		params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
-						    CONFIG_T_PARAMS, cfg);
+						    MGS_CFG_T_PARAMS, cfg);
 		if (IS_ERR(params_cld)) {
 			rc = PTR_ERR(params_cld);
 			CERROR("%s: can't create params log: rc = %d\n",
@@ -333,7 +334,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 		}
 	}
 
-	cld = do_config_log_add(obd, logname, CONFIG_T_CONFIG, cfg, sb);
+	cld = do_config_log_add(obd, logname, MGS_CFG_T_CONFIG, cfg, sb);
 	if (IS_ERR(cld)) {
 		CERROR("can't create log: %s\n", logname);
 		rc = PTR_ERR(cld);
diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c
index 71f9e32..03fd815 100644
--- a/fs/lustre/ptlrpc/wiretest.c
+++ b/fs/lustre/ptlrpc/wiretest.c
@@ -4151,10 +4151,10 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
 
-	BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
-	BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
-	BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
-	BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
+	BUILD_BUG_ON(MGS_CFG_T_CONFIG != 0);
+	BUILD_BUG_ON(MGS_CFG_T_SPTLRPC != 1);
+	BUILD_BUG_ON(MGS_CFG_T_RECOVER != 2);
+	BUILD_BUG_ON(MGS_CFG_T_PARAMS != 3);
 
 	/* Checks for struct mgs_config_res */
 	LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h
index c79010b..d62b3cd 100644
--- a/include/uapi/linux/lustre/lustre_idl.h
+++ b/include/uapi/linux/lustre/lustre_idl.h
@@ -2362,17 +2362,18 @@ struct mgs_nidtbl_entry {
 	} u;
 };
 
-enum {
-	CONFIG_T_CONFIG  = 0,
-	CONFIG_T_SPTLRPC = 1,
-	CONFIG_T_RECOVER = 2,
-	CONFIG_T_PARAMS  = 3,
+enum mgs_cfg_type {
+	MGS_CFG_T_CONFIG	= 0,
+	MGS_CFG_T_SPTLRPC	= 1,
+	MGS_CFG_T_RECOVER	= 2,
+	MGS_CFG_T_PARAMS	= 3,
+	MGS_CFG_T_MAX
 };
 
 struct mgs_config_body {
 	char		mcb_name[MTI_NAME_MAXLEN]; /* logname */
 	__u64		mcb_offset;    /* next index of config log to request */
-	__u16		mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
+	__u16		mcb_type;      /* type of log: MGS_CFG_T_[CONFIG|RECOVER] */
 	__u8		mcb_nm_cur_pass;
 	__u8		mcb_bits;      /* bits unit size of config log */
 	__u32		mcb_units;     /* # of units for bulk transfer */
-- 
1.8.3.1

* [lustre-devel] [PATCH 03/27] lnet: o2iblnd: fix bug in list_first_entry() change.
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 01/27] lustre: uapi: add mdt_hash_name James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 02/27] lustre: uapi: rename CONFIG_T_* to MGS_CFG_T_* James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 04/27] lustre: flr: mmap write/punch does not stale other mirrors James Simmons
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

This comparison should be != NULL, else a NULL pointer could be
dereferenced.
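
To make the failure mode concrete, here is a simplified userspace
sketch (not the kernel code). struct pool, first_or_null() and
fail_all() are hypothetical stand-ins for struct kib_pool,
list_first_entry_or_null() and the loop in kiblnd_fail_poolset().
With "== NULL" the body would run only when the list is empty and
would dereference NULL at once. With "!= NULL" the loop drains the
list.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct kib_pool on a singly linked list. */
struct pool {
	struct pool *next;
	int failed;
};

/* Stand-in for list_first_entry_or_null(): first entry, or NULL if empty. */
static struct pool *first_or_null(struct pool *head)
{
	return head;
}

/*
 * Mark every pool on *list as failed and move it to *zombies.
 * Returns the number of pools moved.
 */
static int fail_all(struct pool **list, struct pool **zombies)
{
	struct pool *po;
	int moved = 0;

	/* Loop while an entry exists; the body may then safely use po. */
	while ((po = first_or_null(*list)) != NULL) {
		*list = po->next;	/* unlink from the source list */
		po->failed = 1;
		po->next = *zombies;	/* push onto the zombie list */
		*zombies = po;
		moved++;
	}
	return moved;
}
```

The real function also checks po_allocated before choosing the
destination list; that detail is left out here.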

Fixes: 4d8bf0c25f10 ("lustre: use list_first_entry() in lnet/klnds subdirectory.")
WC-bug-id: https://jira.whamcloud.com/browse/LU-12678
Lustre-commit: 0024460d797490ae ("LU-12678 o2iblnd: fix bug in list_first_entry() change.")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/43558
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/o2iblnd/o2iblnd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c
index d670180..d722e6c 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd.c
@@ -1721,7 +1721,7 @@ static void kiblnd_fail_poolset(struct kib_poolset *ps, struct list_head *zombie
 	spin_lock(&ps->ps_lock);
 	while ((po = list_first_entry_or_null(&ps->ps_pool_list,
 					      struct kib_pool,
-					      po_list)) == NULL) {
+					      po_list)) != NULL) {
 		po->po_failed = 1;
 		if (!po->po_allocated)
 			list_move(&po->po_list, zombies);
-- 
1.8.3.1

* [lustre-devel] [PATCH 04/27] lustre: flr: mmap write/punch does not stale other mirrors
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (2 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 03/27] lnet: o2iblnd: fix bug in list_first_entry() change James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 05/27] lustre: llite: default lsm update may memory leak James Simmons
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Bobi Jam <bobijam@whamcloud.com>

mmap write and punch/fallocate do not mark the other mirrors stale,
which leaves an FLR file with different content in different mirrors.
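
The widened check in lov_io_mirror_write_intent() can be sketched as
the standalone predicate below (an illustration, not the patch
itself). struct io_sketch and its boolean fields are hypothetical
simplifications of struct cl_io and the cl_io_is_*() helpers.

```c
#include <stdbool.h>

/* Reduced set of IO types, for illustration only. */
enum cl_io_type { CIT_READ, CIT_WRITE, CIT_SETATTR, CIT_FAULT };

/* Hypothetical, flattened view of the cl_io fields the check reads. */
struct io_sketch {
	enum cl_io_type type;
	bool fault_writable;	/* stands in for io->u.ci_fault.ft_writable */
	bool is_trunc;		/* stands in for cl_io_is_trunc(io) */
	bool is_fallocate;	/* stands in for cl_io_is_fallocate(io) */
	bool is_mkwrite;	/* stands in for cl_io_is_mkwrite(io) */
};

/* Same test as the new cl_io_is_fault_writable() helper. */
static bool io_is_fault_writable(const struct io_sketch *io)
{
	return io->type == CIT_FAULT && io->fault_writable;
}

/* True if the IO may modify data, so other mirrors must be marked stale. */
static bool io_needs_write_intent(const struct io_sketch *io)
{
	return io->type == CIT_WRITE || io->is_mkwrite ||
	       io->is_fallocate || io->is_trunc ||
	       io_is_fault_writable(io);
}
```

A fault on a read-only mapping still skips the write intent; only a
fault on a VM_WRITE mapping, fallocate, and the cases that were
already covered qualify.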

WC-bug-id: https://jira.whamcloud.com/browse/LU-14647
Lustre-commit: 03511484c668355c ("LU-14647 flr: mmap write/punch does not stale other mirrors")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43470
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/cl_object.h | 5 +++++
 fs/lustre/llite/llite_mmap.c  | 4 +++-
 fs/lustre/llite/vvp_io.c      | 3 ++-
 fs/lustre/lov/lov_io.c        | 8 +++++---
 4 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h
index b69c04a..c615091 100644
--- a/fs/lustre/include/cl_object.h
+++ b/fs/lustre/include/cl_object.h
@@ -2456,6 +2456,11 @@ int cl_io_lru_reserve(const struct lu_env *env, struct cl_io *io,
 int cl_io_read_ahead(const struct lu_env *env, struct cl_io *io,
 		     pgoff_t start, struct cl_read_ahead *ra);
 
+static inline int cl_io_is_fault_writable(const struct cl_io *io)
+{
+	return io->ci_type == CIT_FAULT && io->u.ci_fault.ft_writable;
+}
+
 /**
  * True, if @io is an O_APPEND write(2).
  */
diff --git a/fs/lustre/llite/llite_mmap.c b/fs/lustre/llite/llite_mmap.c
index a234a83..ebcb8d9 100644
--- a/fs/lustre/llite/llite_mmap.c
+++ b/fs/lustre/llite/llite_mmap.c
@@ -117,6 +117,9 @@ struct vm_area_struct *our_vma(struct mm_struct *mm, unsigned long addr,
 	else if (vma->vm_flags & VM_RAND_READ)
 		io->ci_rand_read = 1;
 
+	if (vma->vm_flags & VM_WRITE)
+		fio->ft_writable = 1;
+
 	rc = cl_io_init(env, io, CIT_FAULT, io->ci_obj);
 	if (rc == 0) {
 		struct vvp_io *vio = vvp_env_io(env);
@@ -128,7 +131,6 @@ struct vm_area_struct *our_vma(struct mm_struct *mm, unsigned long addr,
 		io->ci_lockreq = CILR_MANDATORY;
 		vio->vui_fd = fd;
 	} else {
-		LASSERT(rc < 0);
 		cl_io_fini(env, io);
 		if (io->ci_need_restart)
 			goto restart;
diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c
index 12a28d9..12314fd 100644
--- a/fs/lustre/llite/vvp_io.c
+++ b/fs/lustre/llite/vvp_io.c
@@ -363,7 +363,8 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 		io->ci_need_write_intent = 0;
 
 		LASSERT(io->ci_type == CIT_WRITE || cl_io_is_fallocate(io) ||
-			cl_io_is_trunc(io) || cl_io_is_mkwrite(io));
+			cl_io_is_trunc(io) || cl_io_is_mkwrite(io) ||
+			cl_io_is_fault_writable(io));
 
 		CDEBUG(D_VFSTRACE, DFID" write layout, type %u " DEXT "\n",
 		       PFID(lu_object_fid(&obj->co_lu)), io->ci_type,
diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c
index 86e3fbd..9012ad6 100644
--- a/fs/lustre/lov/lov_io.c
+++ b/fs/lustre/lov/lov_io.c
@@ -221,8 +221,9 @@ static int lov_io_mirror_write_intent(struct lov_io *lio,
 	*ext = (typeof(*ext)) { lio->lis_pos, lio->lis_endpos };
 	io->ci_need_write_intent = 0;
 
-	if (!(io->ci_type == CIT_WRITE || cl_io_is_trunc(io) ||
-	      cl_io_is_mkwrite(io)))
+	if (!(io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io) ||
+	      cl_io_is_fallocate(io) || cl_io_is_trunc(io) ||
+	      cl_io_is_fault_writable(io)))
 		return 0;
 
 	/* FLR: check if it needs to send a write intent RPC to server.
@@ -574,7 +575,8 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 	/* check if it needs to instantiate layout */
 	if (!(io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io) ||
 	      cl_io_is_fallocate(io) ||
-	      (cl_io_is_trunc(io) && io->u.ci_setattr.sa_attr.lvb_size > 0))) {
+	      (cl_io_is_trunc(io) && io->u.ci_setattr.sa_attr.lvb_size > 0)) ||
+	      cl_io_is_fault_writable(io)) {
 		result = 0;
 		goto out;
 	}
-- 
1.8.3.1

* [lustre-devel] [PATCH 05/27] lustre: llite: default lsm update may memory leak
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (3 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 04/27] lustre: flr: mmap write/punch does not stale other mirrors James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 06/27] lustre: pcc: don't alloc FID in LLITE for pcc open James Simmons
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lai Siyao, Lustre Development List

From: Lai Siyao <lai.siyao@whamcloud.com>

ll_update_default_lsm_md() should check whether lli_default_lsm_md
is set before setting it to the data from lustre_md, and if it is set,
release the old data to avoid a memory leak.
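
The ownership rule can be sketched in plain C as below (illustrative
only). The struct names and update_default_md() are hypothetical,
strings stand in for struct lmv_stripe_md, free() stands in for
lmv_free_memmd(), and the lli_lsm_sem locking is omitted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins: a cached default layout and an incoming one. */
struct inode_sketch {
	char *default_md;	/* plays the role of lli_default_lsm_md */
};

struct md_sketch {
	char *default_md;	/* plays the role of md->default_lmv */
};

/*
 * Install md->default_md as the cached default. If the cached value is
 * already equal, do nothing and leave ownership with the caller.
 * Otherwise free the old value first, then take ownership of the new one.
 */
static void update_default_md(struct inode_sketch *ino, struct md_sketch *md)
{
	if (ino->default_md && md->default_md &&
	    strcmp(ino->default_md, md->default_md) == 0)
		return;

	free(ino->default_md);		/* free(NULL) is a no-op */
	ino->default_md = md->default_md;
	md->default_md = NULL;		/* ownership moved to the inode */
}
```

Freeing unconditionally before the assignment covers both the first
initialization and the replacement case with one code path, which is
the shape the patched function ends up with.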

WC-bug-id: https://jira.whamcloud.com/browse/LU-14004
Lustre-commit: cd2ad336177f8f31 ("LU-14004 llite: default lsm update may memory leak")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40103
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/llite_lib.c | 35 ++++++++++++++++-------------------
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index 39bdee0..e98972d 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -1505,30 +1505,27 @@ static void ll_update_default_lsm_md(struct inode *inode, struct lustre_md *md)
 			}
 			up_write(&lli->lli_lsm_sem);
 		}
-	} else if (lli->lli_default_lsm_md) {
-		/* update default lsm if it changes */
+		return;
+	}
+
+	if (lli->lli_default_lsm_md) {
+		/* do nothing if default lsm isn't changed */
 		down_read(&lli->lli_lsm_sem);
 		if (lli->lli_default_lsm_md &&
-		    !lsm_md_eq(lli->lli_default_lsm_md, md->default_lmv)) {
-			up_read(&lli->lli_lsm_sem);
-			down_write(&lli->lli_lsm_sem);
-			if (lli->lli_default_lsm_md)
-				lmv_free_memmd(lli->lli_default_lsm_md);
-			lli->lli_default_lsm_md = md->default_lmv;
-			lsm_md_dump(D_INODE, md->default_lmv);
-			md->default_lmv = NULL;
-			up_write(&lli->lli_lsm_sem);
-		} else {
+		    lsm_md_eq(lli->lli_default_lsm_md, md->default_lmv)) {
 			up_read(&lli->lli_lsm_sem);
+			return;
 		}
-	} else {
-		/* init default lsm */
-		down_write(&lli->lli_lsm_sem);
-		lli->lli_default_lsm_md = md->default_lmv;
-		lsm_md_dump(D_INODE, md->default_lmv);
-		md->default_lmv = NULL;
-		up_write(&lli->lli_lsm_sem);
+		up_read(&lli->lli_lsm_sem);
 	}
+
+	down_write(&lli->lli_lsm_sem);
+	if (lli->lli_default_lsm_md)
+		lmv_free_memmd(lli->lli_default_lsm_md);
+	lli->lli_default_lsm_md = md->default_lmv;
+	lsm_md_dump(D_INODE, md->default_lmv);
+	md->default_lmv = NULL;
+	up_write(&lli->lli_lsm_sem);
 }
 
 static int ll_update_lsm_md(struct inode *inode, struct lustre_md *md)
-- 
1.8.3.1

* [lustre-devel] [PATCH 06/27] lustre: pcc: don't alloc FID in LLITE for pcc open
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (4 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 05/27] lustre: llite: default lsm update may memory leak James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 07/27] lustre: quota: default OST Pool Quotas James Simmons
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lai Siyao, Lustre Development List

From: Lai Siyao <lai.siyao@whamcloud.com>

ll_lookup_it(IT_OPEN) always allocates the FID on MDT0 for a PCC open,
but the open request is sent to the MDT that the name hash points to.
That MDT may differ from the one holding the FID, which triggers an
assertion in osp_md_create() because the file is created remotely.

This FID allocation is not necessary. It can be left to the LMV layer
in lmv_intent_open(), because the MDT is chosen in LMV. Once the open
is done, the allocated FID can be used to initialize the PCC inode.
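
The ordering problem described above can be illustrated with a small
userspace sketch. It is a toy model, not Lustre code: every toy_* name,
the hash function, and the sequence base are assumptions made up for
illustration. It only shows why allocating the FID before the target MDT
is known can leave the FID and the request on different MDTs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in Lustre. */
#define TOY_MDT_COUNT	4
#define TOY_SEQ_BASE	0x200000400ULL

struct toy_fid {
	uint64_t seq;	/* encodes the owning MDT in this toy model */
	uint32_t oid;
};

/* Stand-in for the name hash that selects the target MDT. */
static unsigned int toy_name_to_mdt(const char *name)
{
	unsigned int h = 0;

	while (*name)
		h = h * 31 + (unsigned char)*name++;
	return h % TOY_MDT_COUNT;
}

/* Allocate a FID that belongs to the given MDT. */
static struct toy_fid toy_fid_alloc(unsigned int mdt)
{
	static uint32_t next_oid = 1;
	struct toy_fid fid = { .seq = TOY_SEQ_BASE + mdt, .oid = next_oid++ };

	return fid;
}

static unsigned int toy_fid_to_mdt(const struct toy_fid *fid)
{
	assert(fid->seq >= TOY_SEQ_BASE);
	return (unsigned int)(fid->seq - TOY_SEQ_BASE);
}

/* Old order: FID taken from MDT0 first, target chosen by name hash later.
 * Returns 1 if the FID lives on the target MDT, 0 on a mismatch.
 */
static int toy_open_old(const char *name, struct toy_fid *fid,
			unsigned int *target)
{
	*fid = toy_fid_alloc(0);
	*target = toy_name_to_mdt(name);
	return toy_fid_to_mdt(fid) == *target;
}

/* New order: choose the target first, then allocate the FID on it. */
static int toy_open_new(const char *name, struct toy_fid *fid,
			unsigned int *target)
{
	*target = toy_name_to_mdt(name);
	*fid = toy_fid_alloc(*target);
	return toy_fid_to_mdt(fid) == *target;
}
```

In this model the old order only works when the name happens to hash to
MDT0, while the new order always yields a FID on the chosen target.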

WC-bug-id: https://jira.whamcloud.com/browse/LU-13852
Lustre-commit: 223728a97c397e6e ("LU-13852 pcc: don't alloc FID in LLITE for pcc open")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39568
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/namei.c    | 29 +++++++++++------------------
 fs/lustre/lmv/lmv_intent.c |  6 ++----
 2 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c
index f5f34b0..a2f5d8d 100644
--- a/fs/lustre/llite/namei.c
+++ b/fs/lustre/llite/namei.c
@@ -912,8 +912,6 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 	}
 
 	if (pca && pca->pca_dataset) {
-		struct pcc_dataset *dataset = pca->pca_dataset;
-
 		lum = kzalloc(sizeof(*lum), GFP_NOFS);
 		if (!lum) {
 			retval = ERR_PTR(-ENOMEM);
@@ -924,22 +922,7 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 		lum->lmm_pattern = LOV_PATTERN_F_RELEASED | LOV_PATTERN_RAID0;
 		op_data->op_data = lum;
 		op_data->op_data_size = sizeof(*lum);
-		op_data->op_archive_id = dataset->pccd_rwid;
-
-		rc = obd_fid_alloc(NULL, ll_i2mdexp(parent), &op_data->op_fid2,
-				   op_data);
-		if (rc) {
-			retval = ERR_PTR(rc);
-			goto out;
-		}
-
-		rc = pcc_inode_create(parent->i_sb, dataset, &op_data->op_fid2,
-				      &pca->pca_dentry);
-		if (rc) {
-			retval = ERR_PTR(rc);
-			goto out;
-		}
-
+		op_data->op_archive_id = pca->pca_dataset->pccd_rwid;
 		it->it_flags |= MDS_OPEN_PCC;
 	}
 
@@ -980,6 +963,16 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 		goto out;
 	}
 
+	if (pca && pca->pca_dataset) {
+		rc = pcc_inode_create(parent->i_sb, pca->pca_dataset,
+				      &op_data->op_fid2,
+				      &pca->pca_dentry);
+		if (rc) {
+			retval = ERR_PTR(rc);
+			goto out;
+		}
+	}
+
 	/* dir layout may change */
 	ll_unlock_md_op_lsm(op_data);
 	rc = ll_lookup_it_finish(req, it, parent, &dentry,
diff --git a/fs/lustre/lmv/lmv_intent.c b/fs/lustre/lmv/lmv_intent.c
index 398bd17..88201e6 100644
--- a/fs/lustre/lmv/lmv_intent.c
+++ b/fs/lustre/lmv/lmv_intent.c
@@ -349,8 +349,7 @@ static int lmv_intent_open(struct obd_export *exp, struct md_op_data *op_data,
 		op_data->op_mds = tgt->ltd_index;
 	} else {
 		LASSERT(fid_is_sane(&op_data->op_fid1));
-		LASSERT(it->it_flags & MDS_OPEN_PCC ||
-			fid_is_zero(&op_data->op_fid2));
+		LASSERT(fid_is_zero(&op_data->op_fid2));
 		LASSERT(op_data->op_name);
 
 		tgt = lmv_locate_tgt(lmv, op_data);
@@ -361,8 +360,7 @@ static int lmv_intent_open(struct obd_export *exp, struct md_op_data *op_data,
 	/* If it is ready to open the file by FID, do not need
 	 * allocate FID at all, otherwise it will confuse MDT
 	 */
-	if ((it->it_op & IT_CREAT) && !(it->it_flags & MDS_OPEN_BY_FID ||
-					it->it_flags & MDS_OPEN_PCC)) {
+	if ((it->it_op & IT_CREAT) && !(it->it_flags & MDS_OPEN_BY_FID)) {
 		/*
 		 * For lookup(IT_CREATE) cases allocate new fid and setup FLD
 		 * for it.
-- 
1.8.3.1


* [lustre-devel] [PATCH 07/27] lustre: quota: default OST Pool Quotas
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (5 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 06/27] lustre: pcc: don't alloc FID in LLITE for pcc open James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 08/27] lustre: rename tgt_pool_* functions James Simmons
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Sergey Cheremencev, Lustre Development List

From: Sergey Cheremencev <sergey.cheremencev@hpe.com>

This patch adds the ability to set default quota limits per OST pool.
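
A minimal userspace sketch of the permission rule that the two new
commands inherit in quotactl_ioctl() follows. The command values are
copied from the lustre_user.h hunk of this patch; toy_quotactl_check()
and its boolean parameters are hypothetical, and the sketch assumes that
check_owner() returns nonzero when the caller does not own the quota ID.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Values taken from the lustre_user.h hunk in this patch. */
#define LUSTRE_Q_GETDEFAULT_POOL	0x800013
#define LUSTRE_Q_SETDEFAULT_POOL	0x800014

/* Hypothetical helper: is_admin stands in for capable(CAP_SYS_ADMIN),
 * is_owner for "check_owner(type, id) == 0".
 */
static int toy_quotactl_check(unsigned int cmd, bool is_admin, bool is_owner)
{
	switch (cmd) {
	case LUSTRE_Q_SETDEFAULT_POOL:
		/* Setting a default pool limit is admin-only. */
		return is_admin ? 0 : -EPERM;
	case LUSTRE_Q_GETDEFAULT_POOL:
		/* Reading is allowed for the owner or an admin. */
		return (!is_owner && !is_admin) ? -EPERM : 0;
	default:
		/* Toy choice only; the real ioctl handles many more cases. */
		return -EINVAL;
	}
}
```

For example, a non-admin owner may read the default pool quota but may
not change it.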

HPE-bug-id: LUS-9133
WC-bug-id: https://jira.whamcloud.com/browse/LU-13952
Lustre-commit: 25a70a88c9eb35b7 ("LU-13952 quota: default OST Pool Quotas")
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/39873
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/dir.c                   | 2 ++
 include/uapi/linux/lustre/lustre_user.h | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 13676c1..d7466f3 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -1096,12 +1096,14 @@ int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
 	case LUSTRE_Q_SETDEFAULT:
 	case LUSTRE_Q_SETQUOTAPOOL:
 	case LUSTRE_Q_SETINFOPOOL:
+	case LUSTRE_Q_SETDEFAULT_POOL:
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		break;
 	case Q_GETQUOTA:
 	case LUSTRE_Q_GETDEFAULT:
 	case LUSTRE_Q_GETQUOTAPOOL:
+	case LUSTRE_Q_GETDEFAULT_POOL:
 		if (check_owner(type, id) && !capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		break;
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index 972678f..aae6642 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -975,7 +975,8 @@ static inline void obd_uuid2fsname(char *buf, char *uuid, int buflen)
 #define LUSTRE_Q_SETQUOTAPOOL	0x800010	/* set user pool quota */
 #define LUSTRE_Q_GETINFOPOOL	0x800011	/* get pool quota info */
 #define LUSTRE_Q_SETINFOPOOL	0x800012	/* set pool quota info */
-
+#define LUSTRE_Q_GETDEFAULT_POOL	0x800013 /* get default pool quota*/
+#define LUSTRE_Q_SETDEFAULT_POOL	0x800014 /* set default pool quota */
 /* In the current Lustre implementation, the grace time is either the time
  * or the timestamp to be used after some quota ID exceeds the soft limt,
  * 48 bits should be enough, its high 16 bits can be used as quota flags.
-- 
1.8.3.1


* [lustre-devel] [PATCH 08/27] lustre: rename tgt_pool_* functions.
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (6 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 07/27] lustre: quota: default OST Pool Quotas James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 09/27] lustre: llite: refresh layout after mirror merge/split James Simmons
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

Functions starting with tgt_* represent target-handling code used by
Lustre servers. Now that the pool functions are used by both clients
and servers, rename them to lu_tgt_* to mirror how lu_tgt_desc_* is
used, since both represent remote server targets.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14291
Lustre-commit: 7f3b06a0ab527c3a ("LU-14291 lustre: rename tgt_pool_* functions.")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43624
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
---
 fs/lustre/include/lu_object.h    | 12 ++++++------
 fs/lustre/lov/lov_obd.c          | 12 ++++++------
 fs/lustre/lov/lov_pool.c         | 10 +++++-----
 fs/lustre/obdclass/lu_tgt_pool.c | 28 ++++++++++++++--------------
 4 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/fs/lustre/include/lu_object.h b/fs/lustre/include/lu_object.h
index b1d7577..bbc4533 100644
--- a/fs/lustre/include/lu_object.h
+++ b/fs/lustre/include/lu_object.h
@@ -1413,12 +1413,12 @@ struct lu_tgt_pool {
 	struct rw_semaphore op_rw_sem;	/* to protect lu_tgt_pool use */
 };
 
-int tgt_pool_init(struct lu_tgt_pool *op, unsigned int count);
-int tgt_pool_add(struct lu_tgt_pool *op, u32 idx, unsigned int min_count);
-int tgt_pool_remove(struct lu_tgt_pool *op, u32 idx);
-int tgt_pool_free(struct lu_tgt_pool *op);
-int tgt_check_index(int idx, struct lu_tgt_pool *osts);
-int tgt_pool_extend(struct lu_tgt_pool *op, unsigned int min_count);
+int lu_tgt_pool_init(struct lu_tgt_pool *op, unsigned int count);
+int lu_tgt_pool_add(struct lu_tgt_pool *op, u32 idx, unsigned int min_count);
+int lu_tgt_pool_remove(struct lu_tgt_pool *op, u32 idx);
+int lu_tgt_pool_free(struct lu_tgt_pool *op);
+int lu_tgt_check_index(int idx, struct lu_tgt_pool *osts);
+int lu_tgt_pool_extend(struct lu_tgt_pool *op, unsigned int min_count);
 
 /* bitflags used in rr / qos allocation */
 enum lq_flag {
diff --git a/fs/lustre/lov/lov_obd.c b/fs/lustre/lov/lov_obd.c
index 42a137d..61159fd 100644
--- a/fs/lustre/lov/lov_obd.c
+++ b/fs/lustre/lov/lov_obd.c
@@ -95,7 +95,7 @@ void lov_tgts_putref(struct obd_device *obd)
 			 * being the maximum tgt index for computing the
 			 * mds_max_easize. So we can't shrink it.
 			 */
-			tgt_pool_remove(&lov->lov_packed, i);
+			lu_tgt_pool_remove(&lov->lov_packed, i);
 			lov->lov_tgts[i] = NULL;
 			lov->lov_death_row--;
 		}
@@ -544,7 +544,7 @@ static int lov_add_target(struct obd_device *obd, struct obd_uuid *uuidp,
 		return -ENOMEM;
 	}
 
-	rc = tgt_pool_add(&lov->lov_packed, index, lov->lov_tgt_size);
+	rc = lu_tgt_pool_add(&lov->lov_packed, index, lov->lov_tgt_size);
 	if (rc) {
 		mutex_unlock(&lov->lov_lock);
 		kfree(tgt);
@@ -763,7 +763,7 @@ int lov_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	if (rc)
 		goto out_hash;
 
-	rc = tgt_pool_init(&lov->lov_packed, 0);
+	rc = lu_tgt_pool_init(&lov->lov_packed, 0);
 	if (rc)
 		goto out_pool;
 
@@ -777,7 +777,7 @@ int lov_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	return 0;
 
 out_tunables:
-	tgt_pool_free(&lov->lov_packed);
+	lu_tgt_pool_free(&lov->lov_packed);
 out_pool:
 	lov_pool_hash_destroy(&lov->lov_pools_hash_body);
 out_hash:
@@ -804,7 +804,7 @@ static int lov_cleanup(struct obd_device *obd)
 		lov_pool_del(obd, pool->pool_name);
 	}
 	lov_pool_hash_destroy(&lov->lov_pools_hash_body);
-	tgt_pool_free(&lov->lov_packed);
+	lu_tgt_pool_free(&lov->lov_packed);
 
 	lprocfs_obd_cleanup(obd);
 	if (lov->lov_tgts) {
@@ -1254,7 +1254,7 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 			continue;
 
 		if (pool &&
-		    tgt_check_index(tgt->ltd_index, &pool->pool_obds))
+		    lu_tgt_check_index(tgt->ltd_index, &pool->pool_obds))
 			continue;
 
 		if (!tgt->ltd_active || tgt->ltd_reap) {
diff --git a/fs/lustre/lov/lov_pool.c b/fs/lustre/lov/lov_pool.c
index d01c475..25e980f 100644
--- a/fs/lustre/lov/lov_pool.c
+++ b/fs/lustre/lov/lov_pool.c
@@ -82,7 +82,7 @@ void lov_pool_putref(struct pool_desc *pool)
 	CDEBUG(D_INFO, "pool %p\n", pool);
 	if (atomic_dec_and_test(&pool->pool_refcount)) {
 		LASSERT(list_empty(&pool->pool_list));
-		tgt_pool_free(&pool->pool_obds);
+		lu_tgt_pool_free(&pool->pool_obds);
 		kfree_rcu(pool, rcu);
 	}
 }
@@ -268,7 +268,7 @@ int lov_pool_new(struct obd_device *obd, char *poolname)
 	 * up to deletion
 	 */
 	atomic_set(&new_pool->pool_refcount, 1);
-	rc = tgt_pool_init(&new_pool->pool_obds, 0);
+	rc = lu_tgt_pool_init(&new_pool->pool_obds, 0);
 	if (rc)
 		goto out_err;
 
@@ -310,7 +310,7 @@ int lov_pool_new(struct obd_device *obd, char *poolname)
 	lov->lov_pool_count--;
 	spin_unlock(&obd->obd_dev_lock);
 	debugfs_remove_recursive(new_pool->pool_debugfs_entry);
-	tgt_pool_free(&new_pool->pool_obds);
+	lu_tgt_pool_free(&new_pool->pool_obds);
 	kfree(new_pool);
 
 	return rc;
@@ -401,7 +401,7 @@ int lov_pool_add(struct obd_device *obd, char *poolname, char *ostname)
 		goto out;
 	}
 
-	rc = tgt_pool_add(&pool->pool_obds, lov_idx, lov->lov_tgt_size);
+	rc = lu_tgt_pool_add(&pool->pool_obds, lov_idx, lov->lov_tgt_size);
 	if (rc)
 		goto out;
 
@@ -453,7 +453,7 @@ int lov_pool_remove(struct obd_device *obd, char *poolname, char *ostname)
 		goto out;
 	}
 
-	tgt_pool_remove(&pool->pool_obds, lov_idx);
+	lu_tgt_pool_remove(&pool->pool_obds, lov_idx);
 
 	CDEBUG(D_CONFIG, "%s removed from " LOV_POOLNAMEF "\n", ostname,
 	       poolname);
diff --git a/fs/lustre/obdclass/lu_tgt_pool.c b/fs/lustre/obdclass/lu_tgt_pool.c
index a8e1028..8f52fb4 100644
--- a/fs/lustre/obdclass/lu_tgt_pool.c
+++ b/fs/lustre/obdclass/lu_tgt_pool.c
@@ -29,7 +29,7 @@
  * This file is part of Lustre, http://www.lustre.org/
  */
 /*
- * lustre/target/tgt_pool.c
+ * lustre/obdclass/lu_tgt_pool.c
  *
  * This file handles creation, lookup, and removal of pools themselves, as
  * well as adding and removing targets to pools.
@@ -60,7 +60,7 @@
  *		negative error number on failure
  */
 #define POOL_INIT_COUNT 2
-int tgt_pool_init(struct lu_tgt_pool *op, unsigned int count)
+int lu_tgt_pool_init(struct lu_tgt_pool *op, unsigned int count)
 {
 	if (count == 0)
 		count = POOL_INIT_COUNT;
@@ -77,7 +77,7 @@ int tgt_pool_init(struct lu_tgt_pool *op, unsigned int count)
 
 	return 0;
 }
-EXPORT_SYMBOL(tgt_pool_init);
+EXPORT_SYMBOL(lu_tgt_pool_init);
 
 /**
  * Increase the op_array size to hold more targets in this pool.
@@ -92,7 +92,7 @@ int tgt_pool_init(struct lu_tgt_pool *op, unsigned int count)
  * Return:	0 on success
  *		negative error number on failure.
  */
-int tgt_pool_extend(struct lu_tgt_pool *op, unsigned int min_count)
+int lu_tgt_pool_extend(struct lu_tgt_pool *op, unsigned int min_count)
 {
 	u32 *new;
 	u32 new_size;
@@ -116,7 +116,7 @@ int tgt_pool_extend(struct lu_tgt_pool *op, unsigned int min_count)
 
 	return 0;
 }
-EXPORT_SYMBOL(tgt_pool_extend);
+EXPORT_SYMBOL(lu_tgt_pool_extend);
 
 /**
  * Add a new target to an existing pool.
@@ -131,14 +131,14 @@ int tgt_pool_extend(struct lu_tgt_pool *op, unsigned int min_count)
  * Return:	0 if target could be added to the pool
  *		negative error if target \a idx was not added
  */
-int tgt_pool_add(struct lu_tgt_pool *op, u32 idx, unsigned int min_count)
+int lu_tgt_pool_add(struct lu_tgt_pool *op, u32 idx, unsigned int min_count)
 {
 	unsigned int i;
 	int rc = 0;
 
 	down_write(&op->op_rw_sem);
 
-	rc = tgt_pool_extend(op, min_count);
+	rc = lu_tgt_pool_extend(op, min_count);
 	if (rc)
 		goto out;
 
@@ -156,7 +156,7 @@ int tgt_pool_add(struct lu_tgt_pool *op, u32 idx, unsigned int min_count)
 	up_write(&op->op_rw_sem);
 	return rc;
 }
-EXPORT_SYMBOL(tgt_pool_add);
+EXPORT_SYMBOL(lu_tgt_pool_add);
 
 /**
  * Remove an existing pool from the system.
@@ -172,7 +172,7 @@ int tgt_pool_add(struct lu_tgt_pool *op, u32 idx, unsigned int min_count)
  * Return:	0 on success
  *		negative error number on failure
  */
-int tgt_pool_remove(struct lu_tgt_pool *op, u32 idx)
+int lu_tgt_pool_remove(struct lu_tgt_pool *op, u32 idx)
 {
 	unsigned int i;
 
@@ -192,9 +192,9 @@ int tgt_pool_remove(struct lu_tgt_pool *op, u32 idx)
 	up_write(&op->op_rw_sem);
 	return -EINVAL;
 }
-EXPORT_SYMBOL(tgt_pool_remove);
+EXPORT_SYMBOL(lu_tgt_pool_remove);
 
-int tgt_check_index(int idx, struct lu_tgt_pool *osts)
+int lu_tgt_check_index(int idx, struct lu_tgt_pool *osts)
 {
 	int rc = 0, i;
 
@@ -208,7 +208,7 @@ int tgt_check_index(int idx, struct lu_tgt_pool *osts)
 	up_read(&osts->op_rw_sem);
 	return rc;
 }
-EXPORT_SYMBOL(tgt_check_index);
+EXPORT_SYMBOL(lu_tgt_check_index);
 
 /**
  * Free the pool after it was emptied and removed from /proc.
@@ -221,7 +221,7 @@ int tgt_check_index(int idx, struct lu_tgt_pool *osts)
  *
  * Return:	0 on success or if pool was already freed
  */
-int tgt_pool_free(struct lu_tgt_pool *op)
+int lu_tgt_pool_free(struct lu_tgt_pool *op)
 {
 	if (op->op_size == 0)
 		return 0;
@@ -236,4 +236,4 @@ int tgt_pool_free(struct lu_tgt_pool *op)
 	up_write(&op->op_rw_sem);
 	return 0;
 }
-EXPORT_SYMBOL(tgt_pool_free);
+EXPORT_SYMBOL(lu_tgt_pool_free);
-- 
1.8.3.1


* [lustre-devel] [PATCH 09/27] lustre: llite: refresh layout after mirror merge/split
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (7 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 08/27] lustre: rename tgt_pool_* functions James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 10/27] lustre: ptlrpc: do not match reply with resent RPC James Simmons
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Bobi Jam <bobijam@whamcloud.com>

Mirror merge/split updates the file's LOVEA and revokes the client's
layout lock, but the client issuing the layout change needs to refresh
its layout (lov->lsm) as well.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14549
Lustre-commit: bd7a20f8be4644eb ("LU-14549 llite: refresh layout after mirror merge/split")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43716
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 26aa7be..7c14cf2 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -3520,6 +3520,8 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc,
 	case LL_LEASE_LAYOUT_SPLIT:
 		if (layout_file)
 			fput(layout_file);
+
+		ll_layout_refresh(inode, &fd->fd_layout_version);
 		break;
 	case LL_LEASE_PCC_ATTACH:
 		if (!rc)
-- 
1.8.3.1


* [lustre-devel] [PATCH 10/27] lustre: ptlrpc: do not match reply with resent RPC
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (8 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 09/27] lustre: llite: refresh layout after mirror merge/split James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 11/27] lustre: vvp: wait for nrpages to be updated James Simmons
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Vitaly Fertman, Lustre Development List

From: Vitaly Fertman <c17818@cray.com>

The server can filter by connection ID and drop late-arriving RPCs
from previous connections, but the client has no equivalent filtering
for replies. A late reply to an earlier send can therefore be matched
against a resent RPC, which is a problem in some cases.

Allocate new matchbits for resends and match replies by them instead
of by xid. Connect RPCs are the exception, for interoperability with
old servers: at connect time we do not yet know whether the server
supports this.
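
The matching rule can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the ptlrpc code: all
toy_* names are hypothetical, the BULK_MBITS and bulk MD count handling
of ptlrpc_set_mbits() is left out, and a plain counter stands in for
ptlrpc_next_xid().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in Lustre. */
enum toy_opc { TOY_OPC_CONNECT, TOY_OPC_ENQUEUE };

struct toy_req {
	enum toy_opc opc;
	uint64_t xid;		/* stays the same across resends here */
	uint64_t mbits;		/* match bits carried in the request */
	bool resend;
};

static uint64_t toy_xid_counter = 1000;

/* Stand-in for a monotonically increasing xid generator. */
static uint64_t toy_next_xid(void)
{
	return ++toy_xid_counter;
}

/* Client side: choose the match bits the reply buffer is posted with.
 * Connect RPCs always use the xid, since server support is unknown.
 */
static uint64_t toy_client_reply_mbits(struct toy_req *req,
				       bool peer_has_rep_mbits)
{
	bool rep_mbits = req->opc != TOY_OPC_CONNECT && peer_has_rep_mbits;

	if (!rep_mbits)
		return req->xid;

	/* First send reuses the xid, every resend gets fresh match bits. */
	req->mbits = req->resend ? toy_next_xid() : req->xid;
	return req->mbits;
}

/* Server side: reply with the request's match bits if it recorded any,
 * otherwise fall back to the xid.
 */
static uint64_t toy_server_reply_mbits(uint64_t xid, uint64_t rep_mbits)
{
	return rep_mbits ? rep_mbits : xid;
}

/* A reply lands only in a buffer posted with identical match bits. */
static bool toy_reply_matches(uint64_t posted, uint64_t incoming)
{
	return posted == incoming;
}
```

In this model a reply to the first send carries the old match bits and
so cannot land in the buffer posted for the resend, while the reply to
the resend itself still matches.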

HPE-bug-id: LUS-9596
WC-bug-id: https://jira.whamcloud.com/browse/LU-14594
Lustre-commit: 057fafc018d7369d ("LU-14594 ptlrpc: do not match reply with resent RPC")
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://es-gerrit.dev.cray.com/158446
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/43242
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lustre_net.h         |  3 +++
 fs/lustre/include/obd_support.h        |  1 +
 fs/lustre/llite/llite_lib.c            |  6 ++++--
 fs/lustre/obdclass/lprocfs_status.c    |  1 +
 fs/lustre/obdclass/obd_mount.c         |  4 +++-
 fs/lustre/obdecho/echo_client.c        |  4 +++-
 fs/lustre/ptlrpc/client.c              | 21 ++++++++++-----------
 fs/lustre/ptlrpc/niobuf.c              | 20 +++++++++++++++-----
 fs/lustre/ptlrpc/pack_generic.c        | 18 ++++++++++++++++++
 fs/lustre/ptlrpc/ptlrpc_internal.h     |  2 +-
 fs/lustre/ptlrpc/service.c             | 18 ++++++++++++++++--
 fs/lustre/ptlrpc/wiretest.c            |  2 ++
 include/uapi/linux/lustre/lustre_idl.h |  1 +
 13 files changed, 78 insertions(+), 23 deletions(-)

diff --git a/fs/lustre/include/lustre_net.h b/fs/lustre/include/lustre_net.h
index f84ee46..c894d0f 100644
--- a/fs/lustre/include/lustre_net.h
+++ b/fs/lustre/include/lustre_net.h
@@ -867,6 +867,8 @@ struct ptlrpc_request {
 	u64				rq_xid;
 	/** bulk match bits */
 	u64				rq_mbits;
+	/** reply match bits */
+	u64				rq_rep_mbits;
 	/**
 	 * List item to for replay list. Not yet committed requests get linked
 	 * there.
@@ -2104,6 +2106,7 @@ int lustre_shrink_msg(struct lustre_msg *msg, int segment,
 timeout_t lustre_msg_get_service_timeout(struct lustre_msg *msg);
 char *lustre_msg_get_jobid(struct lustre_msg *msg);
 u32 lustre_msg_get_cksum(struct lustre_msg *msg);
+u64 lustre_msg_get_mbits(struct lustre_msg *msg);
 u32 lustre_msg_calc_cksum(struct lustre_msg *msg, u32 buf);
 void lustre_msg_set_handle(struct lustre_msg *msg,
 			   struct lustre_handle *handle);
diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index 4628fab..962a99b 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -366,6 +366,7 @@
 #define OBD_FAIL_PTLRPC_ROUND_XID			0x530
 #define OBD_FAIL_PTLRPC_CONNECT_RACE			0x531
 #define OBD_FAIL_PTLRPC_IDLE_RACE			0x533
+#define OBD_FAIL_PTLRPC_ENQ_RESEND			0x534
 
 #define OBD_FAIL_OBD_PING_NET				0x600
 /*	OBD_FAIL_OBD_LOG_CANCEL_NET	0x601 obsolete since 1.5 */
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index e98972d..66444fe 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -315,7 +315,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt)
 				   OBD_CONNECT2_PCC |
 				   OBD_CONNECT2_CRUSH | OBD_CONNECT2_LSEEK |
 				   OBD_CONNECT2_GETATTR_PFID |
-				   OBD_CONNECT2_DOM_LVB;
+				   OBD_CONNECT2_DOM_LVB |
+				   OBD_CONNECT2_REP_MBITS;
 
 	if (sbi->ll_flags & LL_SBI_LRU_RESIZE)
 		data->ocd_connect_flags |= OBD_CONNECT_LRU_RESIZE;
@@ -519,7 +520,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt)
 				  OBD_CONNECT_FLAGS2 | OBD_CONNECT_GRANT_SHRINK;
 
 	data->ocd_connect_flags2 = OBD_CONNECT2_LOCKAHEAD |
-				   OBD_CONNECT2_INC_XID | OBD_CONNECT2_LSEEK;
+				   OBD_CONNECT2_INC_XID | OBD_CONNECT2_LSEEK |
+				   OBD_CONNECT2_REP_MBITS;
 
 	if (!OBD_FAIL_CHECK(OBD_FAIL_OSC_CONNECT_GRANT_PARAM))
 		data->ocd_connect_flags |= OBD_CONNECT_GRANT_PARAM;
diff --git a/fs/lustre/obdclass/lprocfs_status.c b/fs/lustre/obdclass/lprocfs_status.c
index cd5a2fa..0cad91d 100644
--- a/fs/lustre/obdclass/lprocfs_status.c
+++ b/fs/lustre/obdclass/lprocfs_status.c
@@ -130,6 +130,7 @@
 	"getattr_pfid",		/* 0x20000 */
 	"lseek",		/* 0x40000 */
 	"dom_lvb",		/* 0x80000 */
+	"reply_mbits",		/* 0x100000 */
 	NULL
 };
 
diff --git a/fs/lustre/obdclass/obd_mount.c b/fs/lustre/obdclass/obd_mount.c
index 0a5e338..19684fb 100644
--- a/fs/lustre/obdclass/obd_mount.c
+++ b/fs/lustre/obdclass/obd_mount.c
@@ -395,7 +395,9 @@ int lustre_start_mgc(struct super_block *sb)
 	/* We connect to the MGS at setup, and don't disconnect until cleanup */
 	data->ocd_connect_flags = OBD_CONNECT_VERSION | OBD_CONNECT_AT |
 				  OBD_CONNECT_FULL20 | OBD_CONNECT_IMP_RECOV |
-				  OBD_CONNECT_LVB_TYPE | OBD_CONNECT_BULK_MBITS;
+				  OBD_CONNECT_LVB_TYPE |
+				  OBD_CONNECT_BULK_MBITS | OBD_CONNECT_FLAGS2;
+	data->ocd_connect_flags2 = OBD_CONNECT2_REP_MBITS;
 
 	if (lmd_is_client(lsi->lsi_lmd) &&
 	    lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR)
diff --git a/fs/lustre/obdecho/echo_client.c b/fs/lustre/obdecho/echo_client.c
index 0452942..c3a12ce 100644
--- a/fs/lustre/obdecho/echo_client.c
+++ b/fs/lustre/obdecho/echo_client.c
@@ -1653,7 +1653,9 @@ static int echo_client_setup(const struct lu_env *env,
 				 OBD_CONNECT_BRW_SIZE |
 				 OBD_CONNECT_GRANT | OBD_CONNECT_FULL20 |
 				 OBD_CONNECT_64BITHASH | OBD_CONNECT_LVB_TYPE |
-				 OBD_CONNECT_FID;
+				 OBD_CONNECT_FID | OBD_CONNECT_FLAGS2;
+	ocd->ocd_connect_flags2 = OBD_CONNECT2_REP_MBITS;
+
 	ocd->ocd_brw_size = DT_MAX_BRW_SIZE;
 	ocd->ocd_version = LUSTRE_VERSION_CODE;
 	ocd->ocd_group = FID_SEQ_ECHO;
diff --git a/fs/lustre/ptlrpc/client.c b/fs/lustre/ptlrpc/client.c
index a812b29..83d269c 100644
--- a/fs/lustre/ptlrpc/client.c
+++ b/fs/lustre/ptlrpc/client.c
@@ -3223,12 +3223,11 @@ u64 ptlrpc_next_xid(void)
  * request to ensure previous bulk fails and avoid problems with lost replies
  * and therefore several transfers landing into the same buffer from different
  * sending attempts.
+ * Also, to avoid previous reply landing to a different sending attempt.
  */
-void ptlrpc_set_bulk_mbits(struct ptlrpc_request *req)
+void ptlrpc_set_mbits(struct ptlrpc_request *req)
 {
-	struct ptlrpc_bulk_desc *bd = req->rq_bulk;
-
-	LASSERT(bd);
+	int md_count = req->rq_bulk ? req->rq_bulk->bd_md_count : 1;
 
 	/*
 	 * Generate new matchbits for all resend requests, including
@@ -3244,7 +3243,7 @@ void ptlrpc_set_bulk_mbits(struct ptlrpc_request *req)
 		 * 'resend for the -EINPROGRESS resend'. To make it simple,
 		 * we opt to generate mbits for all resend cases.
 		 */
-		if (OCD_HAS_FLAG(&bd->bd_import->imp_connect_data,
+		if (OCD_HAS_FLAG(&req->rq_import->imp_connect_data,
 				 BULK_MBITS)) {
 			req->rq_mbits = ptlrpc_next_xid();
 		} else {
@@ -3256,15 +3255,15 @@ void ptlrpc_set_bulk_mbits(struct ptlrpc_request *req)
 			req->rq_mbits = req->rq_xid;
 		}
 
-		CDEBUG(D_HA, "resend bulk old x%llu new x%llu\n",
+		CDEBUG(D_HA, "resend with new mbits old x%llu new x%llu\n",
 		       old_mbits, req->rq_mbits);
 	} else if (!(lustre_msg_get_flags(req->rq_reqmsg) & MSG_REPLAY)) {
 		/* Request being sent first time, use xid as matchbits. */
-		if (OCD_HAS_FLAG(&bd->bd_import->imp_connect_data, BULK_MBITS)
-		    || req->rq_mbits == 0) {
+		if (OCD_HAS_FLAG(&req->rq_import->imp_connect_data,
+				 BULK_MBITS) || req->rq_mbits == 0) {
 			req->rq_mbits = req->rq_xid;
 		} else {
-			req->rq_mbits -= bd->bd_md_count - 1;
+			req->rq_mbits -= md_count - 1;
 		}
 	} else {
 		/*
@@ -3279,12 +3278,12 @@ void ptlrpc_set_bulk_mbits(struct ptlrpc_request *req)
 	 * that server can infer the number of bulks that were prepared,
 	 * see LU-1431
 	 */
-	req->rq_mbits += bd->bd_md_count - 1;
+	req->rq_mbits += md_count - 1;
 
 	/* Set rq_xid as rq_mbits to indicate the final bulk for the old
 	 * server which does not support OBD_CONNECT_BULK_MBITS. LU-6808
 	 */
-	if (!OCD_HAS_FLAG(&bd->bd_import->imp_connect_data, BULK_MBITS))
+	if (!OCD_HAS_FLAG(&req->rq_import->imp_connect_data, BULK_MBITS))
 		req->rq_xid = req->rq_mbits;
 }
 
diff --git a/fs/lustre/ptlrpc/niobuf.c b/fs/lustre/ptlrpc/niobuf.c
index cf9940b..614bb63 100644
--- a/fs/lustre/ptlrpc/niobuf.c
+++ b/fs/lustre/ptlrpc/niobuf.c
@@ -432,7 +432,8 @@ int ptlrpc_send_reply(struct ptlrpc_request *req, int flags)
 			  LNET_ACK_REQ : LNET_NOACK_REQ,
 			  &rs->rs_cb_id, req->rq_self, req->rq_source,
 			  ptlrpc_req2svc(req)->srv_rep_portal,
-			  req->rq_xid, req->rq_reply_off, NULL);
+			  req->rq_rep_mbits ? req->rq_rep_mbits : req->rq_xid,
+			  req->rq_reply_off, NULL);
 out:
 	if (unlikely(rc != 0))
 		ptlrpc_req_drop_rs(req);
@@ -487,7 +488,9 @@ int ptlrpc_error(struct ptlrpc_request *req)
 int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 {
 	int rc;
+	u32 opc;
 	unsigned int mpflag = 0;
+	bool rep_mbits = false;
 	struct lnet_handle_md bulk_cookie;
 	struct ptlrpc_connection *connection;
 	struct lnet_me *reply_me;
@@ -550,8 +553,14 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 			  "Allocating new XID for resend on EINPROGRESS");
 	}
 
-	if (request->rq_bulk) {
-		ptlrpc_set_bulk_mbits(request);
+	opc = lustre_msg_get_opc(request->rq_reqmsg);
+	if (opc != OST_CONNECT && opc != MDS_CONNECT &&
+	    opc != MGS_CONNECT && OCD_HAS_FLAG(&imp->imp_connect_data, FLAGS2))
+		rep_mbits = imp->imp_connect_data.ocd_connect_flags2 &
+			    OBD_CONNECT2_REP_MBITS;
+
+	if (request->rq_bulk || rep_mbits) {
+		ptlrpc_set_mbits(request);
 		lustre_msg_set_mbits(request->rq_reqmsg, request->rq_mbits);
 	}
 
@@ -624,8 +633,9 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 		} else {
 			reply_me = LNetMEAttach(request->rq_reply_portal,
 						connection->c_peer,
-						request->rq_xid, 0,
-						LNET_UNLINK, LNET_INS_AFTER);
+						rep_mbits ? request->rq_mbits :
+						request->rq_xid,
+						0, LNET_UNLINK, LNET_INS_AFTER);
 		}
 
 		if (IS_ERR(reply_me)) {
diff --git a/fs/lustre/ptlrpc/pack_generic.c b/fs/lustre/ptlrpc/pack_generic.c
index 047573a..133202d 100644
--- a/fs/lustre/ptlrpc/pack_generic.c
+++ b/fs/lustre/ptlrpc/pack_generic.c
@@ -1230,6 +1230,24 @@ u32 lustre_msg_get_cksum(struct lustre_msg *msg)
 	}
 }
 
+u64 lustre_msg_get_mbits(struct lustre_msg *msg)
+{
+	switch (msg->lm_magic) {
+	case LUSTRE_MSG_MAGIC_V2: {
+		struct ptlrpc_body *pb = lustre_msg_ptlrpc_body(msg);
+
+		if (!pb) {
+			CERROR("invalid msg %p: no ptlrpc body!\n", msg);
+			return 0;
+		}
+		return pb->pb_mbits;
+	}
+	default:
+		CERROR("incorrect message magic: %08x\n", msg->lm_magic);
+		return 0;
+	}
+}
+
 u32 lustre_msg_calc_cksum(struct lustre_msg *msg, u32 buf)
 {
 	switch (msg->lm_magic) {
diff --git a/fs/lustre/ptlrpc/ptlrpc_internal.h b/fs/lustre/ptlrpc/ptlrpc_internal.h
index 62c3c97..f1f414c 100644
--- a/fs/lustre/ptlrpc/ptlrpc_internal.h
+++ b/fs/lustre/ptlrpc/ptlrpc_internal.h
@@ -75,7 +75,7 @@ void ptlrpc_set_add_new_req(struct ptlrpcd_ctl *pc,
 void ptlrpc_expired_set(struct ptlrpc_request_set *set);
 time64_t ptlrpc_set_next_timeout(struct ptlrpc_request_set *set);
 void ptlrpc_resend_req(struct ptlrpc_request *request);
-void ptlrpc_set_bulk_mbits(struct ptlrpc_request *req);
+void ptlrpc_set_mbits(struct ptlrpc_request *req);
 void ptlrpc_assign_next_xid_nolock(struct ptlrpc_request *req);
 u64 ptlrpc_known_replied_xid(struct obd_import *imp);
 void ptlrpc_add_unreplied(struct ptlrpc_request *req);
diff --git a/fs/lustre/ptlrpc/service.c b/fs/lustre/ptlrpc/service.c
index 3d9192d..2917ca3 100644
--- a/fs/lustre/ptlrpc/service.c
+++ b/fs/lustre/ptlrpc/service.c
@@ -1554,6 +1554,7 @@ static int ptlrpc_server_handle_req_in(struct ptlrpc_service_part *svcpt,
 	struct ptlrpc_service *svc = svcpt->scp_service;
 	struct ptlrpc_request *req;
 	u32 deadline;
+	u32 opc;
 	int rc;
 
 	spin_lock(&svcpt->scp_lock);
@@ -1608,8 +1609,9 @@ static int ptlrpc_server_handle_req_in(struct ptlrpc_service_part *svcpt,
 		goto err_req;
 	}
 
+	opc = lustre_msg_get_opc(req->rq_reqmsg);
 	if (OBD_FAIL_CHECK(OBD_FAIL_PTLRPC_DROP_REQ_OPC) &&
-	    lustre_msg_get_opc(req->rq_reqmsg) == cfs_fail_val) {
+	    opc == cfs_fail_val) {
 		CERROR("drop incoming rpc opc %u, x%llu\n",
 		       cfs_fail_val, req->rq_xid);
 		goto err_req;
@@ -1623,7 +1625,7 @@ static int ptlrpc_server_handle_req_in(struct ptlrpc_service_part *svcpt,
 		goto err_req;
 	}
 
-	switch (lustre_msg_get_opc(req->rq_reqmsg)) {
+	switch (opc) {
 	case MDS_WRITEPAGE:
 	case OST_WRITE:
 		req->rq_bulk_write = 1;
@@ -1688,8 +1690,20 @@ static int ptlrpc_server_handle_req_in(struct ptlrpc_service_part *svcpt,
 		req->rq_svc_thread->t_env->le_ses = &req->rq_session;
 	}
 
+
+	if (unlikely(OBD_FAIL_PRECHECK(OBD_FAIL_PTLRPC_ENQ_RESEND) &&
+		     (opc == LDLM_ENQUEUE) &&
+		     (lustre_msg_get_flags(req->rq_reqmsg) & MSG_RESENT)))
+		OBD_FAIL_TIMEOUT(OBD_FAIL_PTLRPC_ENQ_RESEND, 6);
+
 	ptlrpc_at_add_timed(req);
 
+	if (opc != OST_CONNECT && opc != MDS_CONNECT &&
+	    opc != MGS_CONNECT && req->rq_export) {
+		if (exp_connect_flags2(req->rq_export) & OBD_CONNECT2_REP_MBITS)
+			req->rq_rep_mbits = lustre_msg_get_mbits(req->rq_reqmsg);
+	}
+
 	/* Move it over to the request processing queue */
 	rc = ptlrpc_server_request_add(svcpt, req);
 	if (rc)
diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c
index 03fd815..db97748 100644
--- a/fs/lustre/ptlrpc/wiretest.c
+++ b/fs/lustre/ptlrpc/wiretest.c
@@ -1250,6 +1250,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_CONNECT2_LSEEK);
 	LASSERTF(OBD_CONNECT2_DOM_LVB == 0x80000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT2_DOM_LVB);
+	LASSERTF(OBD_CONNECT2_REP_MBITS == 0x100000ULL, "found 0x%.16llxULL\n",
+		 OBD_CONNECT2_REP_MBITS);
 	LASSERTF(OBD_CKSUM_CRC32 == 0x00000001UL, "found 0x%.8xUL\n",
 		 (unsigned int)OBD_CKSUM_CRC32);
 	LASSERTF(OBD_CKSUM_ADLER == 0x00000002UL, "found 0x%.8xUL\n",
diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h
index d62b3cd..813e4fc 100644
--- a/include/uapi/linux/lustre/lustre_idl.h
+++ b/include/uapi/linux/lustre/lustre_idl.h
@@ -839,6 +839,7 @@ struct ptlrpc_body_v2 {
 #define OBD_CONNECT2_GETATTR_PFID     0x20000ULL /* pack parent FID in getattr */
 #define OBD_CONNECT2_LSEEK	      0x40000ULL /* SEEK_HOLE/DATA RPC */
 #define OBD_CONNECT2_DOM_LVB	      0x80000ULL /* pack DOM glimpse data in LVB */
+#define OBD_CONNECT2_REP_MBITS	     0x100000ULL /* match reply by mbits, not xid */
 /* XXX README XXX:
  * Please DO NOT add flag values here before first ensuring that this same
  * flag value is not in use on some other branch.  Please clear any such
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

* [lustre-devel] [PATCH 11/27] lustre: vvp: wait for nrpages to be updated
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (9 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 10/27] lustre: ptlrpc: do not match reply with resent RPC James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 12/27] lustre: obd: check if sbi->ll_md_exp is initialized James Simmons
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Vitaly Fertman, Lustre Development List

From: Vitaly Fertman <c17818@cray.com>

The truncate_inode_pages() documentation notes that a page may still
be in the process of deletion when the function returns. Wait for the
other thread, which is inside __delete_from_page_cache(), to finish so
that nrpages is updated.

HPE-bug-id: LUS-8842
WC-bug-id: https://jira.whamcloud.com/browse/LU-14644
Lustre-commit: 7d5d004506650c37 ("LU-14644 vvp: wait for nrpages to be updated")
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://es-gerrit.dev.cray.com/158557
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/43464
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/llite_internal.h |  1 +
 fs/lustre/llite/llite_lib.c      | 51 ++++++++++++++++++++++------------------
 fs/lustre/llite/vvp_object.c     |  9 ++-----
 3 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index 72aa564..a1e5e468 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -1204,6 +1204,7 @@ int ll_statfs_internal(struct ll_sb_info *sbi, struct obd_statfs *osfs,
 int ll_update_inode(struct inode *inode, struct lustre_md *md);
 void ll_update_inode_flags(struct inode *inode, unsigned int ext_flags);
 int ll_read_inode2(struct inode *inode, void *opaque);
+void ll_truncate_inode_pages_final(struct inode *inode);
 void ll_delete_inode(struct inode *inode);
 int ll_iocontrol(struct inode *inode, struct file *file,
 		 unsigned int cmd, unsigned long arg);
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index 66444fe..fe49030 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -2471,6 +2471,33 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md)
 	return 0;
 }
 
+void ll_truncate_inode_pages_final(struct inode *inode)
+{
+	struct address_space *mapping = &inode->i_data;
+	unsigned long nrpages;
+	unsigned long flags;
+
+	truncate_inode_pages_final(mapping);
+
+	/* Workaround for LU-118: Note nrpages may not be totally updated when
+	 * truncate_inode_pages() returns, as there can be a page in the process
+	 * of deletion (inside __delete_from_page_cache()) in the specified
+	 * range. Thus mapping->nrpages can be non-zero when this function
+	 * returns even after truncation of the whole mapping.  Only do this if
+	 * npages isn't already zero.
+	 */
+	nrpages = mapping->nrpages;
+	if (nrpages) {
+		xa_lock_irqsave(&mapping->i_pages, flags);
+		nrpages = mapping->nrpages;
+		xa_unlock_irqrestore(&mapping->i_pages, flags);
+	} /* Workaround end */
+
+	LASSERTF(nrpages == 0, "%s: inode="DFID"(%p) nrpages=%lu, see https://jira.whamcloud.com/browse/LU-118\n",
+		 ll_i2sbi(inode)->ll_fsname,
+		 PFID(ll_inode2fid(inode)), inode, nrpages);
+}
+
 int ll_read_inode2(struct inode *inode, void *opaque)
 {
 	struct lustre_md *md = opaque;
@@ -2519,9 +2546,6 @@ int ll_read_inode2(struct inode *inode, void *opaque)
 void ll_delete_inode(struct inode *inode)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
-	struct address_space *mapping = &inode->i_data;
-	unsigned long nrpages;
-	unsigned long flags;
 
 	if (S_ISREG(inode->i_mode) && lli->lli_clob) {
 		/* It is last chance to write out dirty pages,
@@ -2534,27 +2558,8 @@ void ll_delete_inode(struct inode *inode)
 		cl_sync_file_range(inode, 0, OBD_OBJECT_EOF, inode->i_nlink ?
 				   CL_FSYNC_LOCAL : CL_FSYNC_DISCARD, 1);
 	}
-	truncate_inode_pages_final(mapping);
-
-	/* Workaround for LU-118: Note nrpages may not be totally updated when
-	 * truncate_inode_pages() returns, as there can be a page in the process
-	 * of deletion (inside __delete_from_page_cache()) in the specified
-	 * range. Thus mapping->nrpages can be non-zero when this function
-	 * returns even after truncation of the whole mapping.  Only do this if
-	 * npages isn't already zero.
-	 */
-	nrpages = mapping->nrpages;
-	if (nrpages) {
-		xa_lock_irqsave(&mapping->i_pages, flags);
-		nrpages = mapping->nrpages;
-		xa_unlock_irqrestore(&mapping->i_pages, flags);
-	} /* Workaround end */
-
-	LASSERTF(nrpages == 0,
-		 "%s: inode="DFID"(%p) nrpages=%lu, see https://jira.whamcloud.com/browse/LU-118\n",
-		 ll_i2sbi(inode)->ll_fsname,
-		 PFID(ll_inode2fid(inode)), inode, nrpages);
 
+	ll_truncate_inode_pages_final(inode);
 	ll_clear_inode(inode);
 	clear_inode(inode);
 }
diff --git a/fs/lustre/llite/vvp_object.c b/fs/lustre/llite/vvp_object.c
index e999caa..096d996 100644
--- a/fs/lustre/llite/vvp_object.c
+++ b/fs/lustre/llite/vvp_object.c
@@ -164,13 +164,8 @@ static int vvp_prune(const struct lu_env *env, struct cl_object *obj)
 		return rc;
 	}
 
-	truncate_inode_pages(inode->i_mapping, 0);
-	if (inode->i_mapping->nrpages) {
-		CDEBUG(D_VFSTRACE, DFID ": still has %lu pages remaining\n",
-		       PFID(lu_object_fid(&obj->co_lu)),
-		       inode->i_mapping->nrpages);
-		return -EIO;
-	}
+	ll_truncate_inode_pages_final(inode);
+	clear_bit(AS_EXITING, &inode->i_mapping->flags);
 
 	return 0;
 }
-- 
1.8.3.1

* [lustre-devel] [PATCH 12/27] lustre: obd: check if sbi->ll_md_exp is initialized
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (10 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 11/27] lustre: vvp: wait for nrpages to be updated James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 13/27] lustre: osc: Batch gang_lookup cbs James Simmons
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Artem Blagodarenko, Lustre Development List

From: Artem Blagodarenko <artem.blagodarenko@hpe.com>

A NULL pointer dereference at the start of obd_statfs() is possible
because of a race between ll_fill_super() and lctl.

ll_md_exp is initialized in ll_fill_super()->
client_common_fill_super(), but if the mount process gets stuck
in lustre_process_log() it never reaches client_common_fill_super().
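
The ordering of the check can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: struct device,
struct export and statfs_checked() are hypothetical stand-ins, not the
Lustre types or API. The point is that the export pointer is tested
before anything is read through it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct obd_device and struct obd_export. */
struct device {
	int active;
};

struct export {
	struct device *dev;
};

/* Validate the export and its device before any dereference. */
int statfs_checked(const struct export *exp)
{
	const struct device *dev;

	if (!exp || !exp->dev)
		return -EINVAL;

	/* Only now is it safe to read through the pointers. */
	dev = exp->dev;
	return dev->active ? 0 : -ENODEV;
}
```

A caller that races with mount and passes a NULL export gets -EINVAL
back instead of faulting.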

HPE-bug-id: LUS-9150
WC-bug-id: https://jira.whamcloud.com/browse/LU-13942
Lustre-commit: 1de8c3739d6bac76b ("LU-13942 obd: check if sbi->ll_md_exp is initialized")
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://es-gerrit.dev.cray.com/157732
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/39812
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd_class.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h
index 5cbed01..4cc5a7df 100644
--- a/fs/lustre/include/obd_class.h
+++ b/fs/lustre/include/obd_class.h
@@ -921,12 +921,13 @@ static inline int obd_statfs(const struct lu_env *env, struct obd_export *exp,
 			     struct obd_statfs *osfs, time64_t max_age,
 			     u32 flags)
 {
-	struct obd_device *obd = exp->exp_obd;
+	struct obd_device *obd;
 	int rc = 0;
 
-	if (unlikely(!obd))
+	if (unlikely(!exp) || !exp->exp_obd)
 		return -EINVAL;
 
+	obd = exp->exp_obd;
 	rc = obd_check_dev_active(obd);
 	if (rc)
 		return rc;
-- 
1.8.3.1

* [lustre-devel] [PATCH 13/27] lustre: osc: Batch gang_lookup cbs
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (11 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 12/27] lustre: obd: check if sbi->ll_md_exp is initialized James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 14/27] lustre: llite: Return errors for aio James Simmons
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Patrick Farrell, Alexander Zarochentsev, Lustre Development List

From: Patrick Farrell <paf@cray.com>

The osc_page_gang_lookup callbacks can be trivially
converted to operate in batches rather than one page at a
time.  This improves cancellation time for locks protecting
large numbers of pages by about 10%.  After landing another
optimization (LU-11290 ldlm: page discard speedup), the gain
is 6% when canceling a lock for a 30GB cached file.

Truncate-to-zero time (with one lock protecting many pages)
improved by about 5-10% as well.  Lock weighing
performance should also improve slightly, but is
tricky to benchmark.
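
The batching idea can be sketched in userspace C. This is a minimal
illustration, not the Lustre code: batch_cb, sum_cb() and
walk_batched() are hypothetical names, and the callback shape is only
modelled on the (pvec, count, cbdata) form of the new
osc_page_gang_cbt. The walker makes one indirect call per batch
instead of one per item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical batched callback type: one call handles @count items. */
typedef bool (*batch_cb)(void **pvec, int count, void *cbdata);

/* Example callback: add every integer in the batch to *cbdata. */
bool sum_cb(void **pvec, int count, void *cbdata)
{
	long *total = cbdata;
	int i;

	for (i = 0; i < count; i++)
		*total += *(int *)pvec[i];
	return true;
}

/*
 * Walk @items in chunks of at most @batch entries, invoking @cb once
 * per chunk.  Stops early if @cb returns false.  Returns the number
 * of callback invocations made.
 */
int walk_batched(void **items, int nitems, int batch,
		 batch_cb cb, void *cbdata)
{
	int calls = 0;
	int pos;

	assert(batch > 0);
	for (pos = 0; pos < nitems; pos += batch) {
		int count = nitems - pos < batch ? nitems - pos : batch;

		calls++;
		if (!cb(items + pos, count, cbdata))
			break;
	}
	return calls;
}
```

With five items and a batch size of two, the callback runs three times
rather than five, which is where the per-call overhead is saved.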

HPE-bug-id: LUS-6432
WC-bug-id: https://jira.whamcloud.com/browse/LU-11290
Lustre-commit: 0d6d0b7bc95a82de ("LU-11290 osc: Batch gang_lookup cbs")
Signed-off-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-on: https://review.whamcloud.com/33089
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lustre_osc.h |   7 +-
 fs/lustre/mdc/mdc_dev.c        |  46 +++++++------
 fs/lustre/osc/osc_cache.c      | 147 ++++++++++++++++++++++-------------------
 fs/lustre/osc/osc_io.c         |  33 +++++----
 fs/lustre/osc/osc_lock.c       |  19 ++++--
 5 files changed, 138 insertions(+), 114 deletions(-)

diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h
index f83d1e6..0947677 100644
--- a/fs/lustre/include/lustre_osc.h
+++ b/fs/lustre/include/lustre_osc.h
@@ -629,14 +629,13 @@ static inline void osc_io_unplug(const struct lu_env *env,
 	(void)__osc_io_unplug(env, cli, osc, 0);
 }
 
-typedef bool (*osc_page_gang_cbt)(const struct lu_env *, struct cl_io *,
-				  struct osc_page *, void *);
+typedef bool (*osc_page_gang_cbt)(const struct lu_env *env, struct cl_io *io,
+				  void **pvec, int count, void *cbdata);
 bool osc_page_gang_lookup(const struct lu_env *env, struct cl_io *io,
 			  struct osc_object *osc, pgoff_t start, pgoff_t end,
 			  osc_page_gang_cbt cb, void *cbdata);
-
 bool osc_discard_cb(const struct lu_env *env, struct cl_io *io,
-		    struct osc_page *ops, void *cbdata);
+		    void **pvec, int count, void *cbdata);
 
 /* osc_dev.c */
 int osc_device_init(const struct lu_env *env, struct lu_device *d,
diff --git a/fs/lustre/mdc/mdc_dev.c b/fs/lustre/mdc/mdc_dev.c
index 70f8987..0db05b5 100644
--- a/fs/lustre/mdc/mdc_dev.c
+++ b/fs/lustre/mdc/mdc_dev.c
@@ -183,33 +183,37 @@ struct ldlm_lock *mdc_dlmlock_at_pgoff(const struct lu_env *env,
  * Check if page @page is covered by an extra lock or discard it.
  */
 static bool mdc_check_and_discard_cb(const struct lu_env *env, struct cl_io *io,
-				    struct osc_page *ops, void *cbdata)
+				     void **pvec, int count, void *cbdata)
 {
 	struct osc_thread_info *info = osc_env_info(env);
 	struct osc_object *osc = cbdata;
 	pgoff_t index;
-
-	index = osc_index(ops);
-	if (index >= info->oti_fn_index) {
-		struct ldlm_lock *tmp;
-		struct cl_page *page = ops->ops_cl.cpl_page;
-
-		/* refresh non-overlapped index */
-		tmp = mdc_dlmlock_at_pgoff(env, osc, index,
-					   OSC_DAP_FL_TEST_LOCK | OSC_DAP_FL_AST);
-		if (tmp) {
-			info->oti_fn_index = CL_PAGE_EOF;
-			LDLM_LOCK_PUT(tmp);
-		} else if (cl_page_own(env, io, page) == 0) {
-			/* discard the page */
-			cl_page_discard(env, io, page);
-			cl_page_disown(env, io, page);
-		} else {
-			LASSERT(page->cp_state == CPS_FREEING);
+	int i;
+
+	for (i = 0; i < count; i++) {
+		struct osc_page *ops = pvec[i];
+
+		index = osc_index(ops);
+		if (index >= info->oti_fn_index) {
+			struct ldlm_lock *tmp;
+			struct cl_page *page = ops->ops_cl.cpl_page;
+
+			/* refresh non-overlapped index */
+			tmp = mdc_dlmlock_at_pgoff(env, osc, index,
+						   OSC_DAP_FL_TEST_LOCK | OSC_DAP_FL_AST);
+			if (tmp) {
+				info->oti_fn_index = CL_PAGE_EOF;
+				LDLM_LOCK_PUT(tmp);
+			} else if (cl_page_own(env, io, page) == 0) {
+				/* discard the page */
+				cl_page_discard(env, io, page);
+				cl_page_disown(env, io, page);
+			} else {
+				LASSERT(page->cp_state == CPS_FREEING);
+			}
 		}
+		info->oti_next_index = index + 1;
 	}
-
-	info->oti_next_index = index + 1;
 	return true;
 }
 
diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c
index fc8079a..8dd12b1 100644
--- a/fs/lustre/osc/osc_cache.c
+++ b/fs/lustre/osc/osc_cache.c
@@ -3171,11 +3171,10 @@ bool osc_page_gang_lookup(const struct lu_env *env, struct cl_io *io,
 		spin_unlock(&osc->oo_tree_lock);
 		tree_lock = false;
 
+		res = (*cb)(env, io, pvec, j, cbdata);
+
 		for (i = 0; i < j; ++i) {
 			ops = pvec[i];
-			if (res)
-				res = (*cb)(env, io, ops, cbdata);
-
 			page = ops->ops_cl.cpl_page;
 			lu_ref_del(&page->cp_reference, "gang_lookup", current);
 			cl_pagevec_put(env, page, pagevec);
@@ -3204,55 +3203,93 @@ bool osc_page_gang_lookup(const struct lu_env *env, struct cl_io *io,
  * Check if page @page is covered by an extra lock or discard it.
  */
 static bool check_and_discard_cb(const struct lu_env *env, struct cl_io *io,
-				 struct osc_page *ops, void *cbdata)
+				 void **pvec, int count, void *cbdata)
 {
 	struct osc_thread_info *info = osc_env_info(env);
 	struct osc_object *osc = cbdata;
-	struct cl_page *page = ops->ops_cl.cpl_page;
-	pgoff_t index;
-	bool discard = false;
-
-	index = osc_index(ops);
-	/* negative lock caching */
-	if (index < info->oti_ng_index) {
-		discard = true;
-	} else if (index >= info->oti_fn_index) {
-		struct ldlm_lock *tmp;
-
-		/* refresh non-overlapped index */
-		tmp = osc_dlmlock_at_pgoff(env, osc, index,
-					   OSC_DAP_FL_TEST_LOCK |
-					   OSC_DAP_FL_AST | OSC_DAP_FL_RIGHT);
-		if (tmp) {
-			u64 end = tmp->l_policy_data.l_extent.end;
-			u64 start = tmp->l_policy_data.l_extent.start;
-
-			/* no lock covering this page */
-			if (index < cl_index(osc2cl(osc), start)) {
-				/* no lock at @index, first lock at @start */
-				info->oti_ng_index = cl_index(osc2cl(osc),
-							      start);
+	int i;
+
+	for (i = 0; i < count; i++) {
+		struct osc_page *ops = pvec[i];
+		struct cl_page *page = ops->ops_cl.cpl_page;
+		pgoff_t index = osc_index(ops);
+		bool discard = false;
+
+		/* negative lock caching */
+		if (index < info->oti_ng_index) {
+			discard = true;
+		} else if (index >= info->oti_fn_index) {
+			struct ldlm_lock *tmp;
+
+			/* refresh non-overlapped index */
+			tmp = osc_dlmlock_at_pgoff(env, osc, index,
+						   OSC_DAP_FL_TEST_LOCK |
+						   OSC_DAP_FL_AST | OSC_DAP_FL_RIGHT);
+			if (tmp) {
+				u64 end = tmp->l_policy_data.l_extent.end;
+				u64 start = tmp->l_policy_data.l_extent.start;
+
+				/* no lock covering this page */
+				if (index < cl_index(osc2cl(osc), start)) {
+					/* no lock at @index,
+					 * first lock at @start
+					 */
+					info->oti_ng_index = cl_index(osc2cl(osc),
+								      start);
+					discard = true;
+				} else {
+					/* Cache the first-non-overlapped
+					 * index so as to skip all pages
+					 * within [index, oti_fn_index).
+					 * This is safe because if tmp lock
+					 * is canceled, it will discard these
+					 * pages.
+					 */
+					info->oti_fn_index = cl_index(osc2cl(osc),
+								      end + 1);
+					if (end == OBD_OBJECT_EOF)
+						info->oti_fn_index = CL_PAGE_EOF;
+				}
+				LDLM_LOCK_PUT(tmp);
+			} else {
+				info->oti_ng_index = CL_PAGE_EOF;
 				discard = true;
+			}
+		}
+
+		if (discard) {
+			if (cl_page_own(env, io, page) == 0) {
+				/* discard the page */
+				cl_page_discard(env, io, page);
+				cl_page_disown(env, io, page);
 			} else {
-				/* Cache the first-non-overlapped index so as to
-				 * skip all pages within [index, oti_fn_index).
-				 * This is safe because if tmp lock is canceled,
-				 * it will discard these pages.
-				 */
-				info->oti_fn_index = cl_index(osc2cl(osc),
-							      end + 1);
-				if (end == OBD_OBJECT_EOF)
-					info->oti_fn_index = CL_PAGE_EOF;
+				LASSERT(page->cp_state == CPS_FREEING);
 			}
-			LDLM_LOCK_PUT(tmp);
-		} else {
-			info->oti_ng_index = CL_PAGE_EOF;
-			discard = true;
 		}
+
+		info->oti_next_index = index + 1;
 	}
+	return true;
+}
 
-	if (discard) {
+bool osc_discard_cb(const struct lu_env *env, struct cl_io *io,
+		    void **pvec, int count, void *cbdata)
+{
+	struct osc_thread_info *info = osc_env_info(env);
+	int i;
+
+	for (i = 0; i < count; i++) {
+		struct osc_page *ops = pvec[i];
+		struct cl_page *page = ops->ops_cl.cpl_page;
+
+		/* page is top page. */
+		info->oti_next_index = osc_index(ops) + 1;
 		if (cl_page_own(env, io, page) == 0) {
+			if (page->cp_type == CPT_CACHEABLE &&
+			    PageDirty(cl_page_vmpage(page)))
+				CL_PAGE_DEBUG(D_ERROR, env, page,
+					      "discard dirty page?\n");
+
 			/* discard the page */
 			cl_page_discard(env, io, page);
 			cl_page_disown(env, io, page);
@@ -3261,32 +3298,6 @@ static bool check_and_discard_cb(const struct lu_env *env, struct cl_io *io,
 		}
 	}
 
-	info->oti_next_index = index + 1;
-
-	return true;
-}
-
-bool osc_discard_cb(const struct lu_env *env, struct cl_io *io,
-		    struct osc_page *ops, void *cbdata)
-{
-	struct osc_thread_info *info = osc_env_info(env);
-	struct cl_page *page = ops->ops_cl.cpl_page;
-
-	/* page is top page. */
-	info->oti_next_index = osc_index(ops) + 1;
-	if (cl_page_own(env, io, page) == 0) {
-		if (page->cp_type == CPT_CACHEABLE &&
-		    PageDirty(cl_page_vmpage(page)))
-			CL_PAGE_DEBUG(D_ERROR, env, page,
-				      "discard dirty page?\n");
-
-		/* discard the page */
-		cl_page_discard(env, io, page);
-		cl_page_disown(env, io, page);
-	} else {
-		LASSERT(page->cp_state == CPS_FREEING);
-	}
-
 	return true;
 }
 EXPORT_SYMBOL(osc_discard_cb);
diff --git a/fs/lustre/osc/osc_io.c b/fs/lustre/osc/osc_io.c
index b792c22..de214ba 100644
--- a/fs/lustre/osc/osc_io.c
+++ b/fs/lustre/osc/osc_io.c
@@ -491,22 +491,27 @@ static int osc_async_upcall(void *a, int rc)
  * Checks that there are no pages being written in the extent being truncated.
  */
 static bool trunc_check_cb(const struct lu_env *env, struct cl_io *io,
-			  struct osc_page *ops, void *cbdata)
+			   void **pvec, int count, void *cbdata)
 {
-	struct cl_page *page = ops->ops_cl.cpl_page;
-	struct osc_async_page *oap;
-	u64 start = *(u64 *)cbdata;
-
-	oap = &ops->ops_oap;
-	if (oap->oap_cmd & OBD_BRW_WRITE &&
-	    !list_empty(&oap->oap_pending_item))
-		CL_PAGE_DEBUG(D_ERROR, env, page, "exists %llu/%s.\n",
-			      start, current->comm);
-
-	if (PageLocked(page->cp_vmpage))
-		CDEBUG(D_CACHE, "page %p index %lu locked for %d.\n",
-		       ops, osc_index(ops), oap->oap_cmd & OBD_BRW_RWMASK);
+	int i;
 
+	for (i = 0; i < count; i++) {
+		struct osc_page *ops = pvec[i];
+		struct cl_page *page = ops->ops_cl.cpl_page;
+		struct osc_async_page *oap;
+		u64 start = *(u64 *)cbdata;
+
+		oap = &ops->ops_oap;
+		if (oap->oap_cmd & OBD_BRW_WRITE &&
+		    !list_empty(&oap->oap_pending_item))
+			CL_PAGE_DEBUG(D_ERROR, env, page, "exists %llu/%s.\n",
+				      start, current->comm);
+
+		if (PageLocked(page->cp_vmpage))
+			CDEBUG(D_CACHE, "page %p index %lu locked for %d.\n",
+			       ops, osc_index(ops),
+			       oap->oap_cmd & OBD_BRW_RWMASK);
+	}
 	return true;
 }
 
diff --git a/fs/lustre/osc/osc_lock.c b/fs/lustre/osc/osc_lock.c
index e0de371..422f3e5 100644
--- a/fs/lustre/osc/osc_lock.c
+++ b/fs/lustre/osc/osc_lock.c
@@ -647,16 +647,21 @@ int osc_ldlm_glimpse_ast(struct ldlm_lock *dlmlock, void *data)
 EXPORT_SYMBOL(osc_ldlm_glimpse_ast);
 
 static bool weigh_cb(const struct lu_env *env, struct cl_io *io,
-		     struct osc_page *ops, void *cbdata)
+		     void **pvec, int count, void *cbdata)
 {
-	struct cl_page *page = ops->ops_cl.cpl_page;
+	int i;
 
-	if (cl_page_is_vmlocked(env, page) ||
-	    PageDirty(page->cp_vmpage) ||
-	    PageWriteback(page->cp_vmpage))
-		return false;
+	for (i = 0; i < count; i++) {
+		struct osc_page *ops = pvec[i];
+		struct cl_page *page = ops->ops_cl.cpl_page;
 
-	*(pgoff_t *)cbdata = osc_index(ops) + 1;
+		if (cl_page_is_vmlocked(env, page) ||
+		    PageDirty(page->cp_vmpage) ||
+		    PageWriteback(page->cp_vmpage))
+			return false;
+
+		*(pgoff_t *)cbdata = osc_index(ops) + 1;
+	}
 	return true;
 }
 
-- 
1.8.3.1

* [lustre-devel] [PATCH 14/27] lustre: llite: Return errors for aio
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (12 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 13/27] lustre: osc: Batch gang_lookup cbs James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 15/27] lnet: do not crash if lnet_sock_getaddr returns error James Simmons
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Patrick Farrell, Lustre Development List

From: Patrick Farrell <farr0186@gmail.com>

The aio code incorrectly discards errors from
ll_direct_rw_pages.  Fix this and add a test for this.
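
The fix amounts to reporting "queued" only when no error was recorded.
A minimal userspace sketch, with assumptions stated: the kernel-internal
EIOCBQUEUED code is not available to userspace, so SKETCH_EIOCBQUEUED
is a stand-in defined here, and aio_submit_result() is a hypothetical
helper rather than a Lustre function.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the kernel-internal EIOCBQUEUED code (assumption). */
#define SKETCH_EIOCBQUEUED 529

/*
 * Keep an earlier error if there is one; only a clean submission is
 * reported as queued.
 */
long aio_submit_result(long result)
{
	if (result == 0)
		result = -SKETCH_EIOCBQUEUED;
	return result;
}
```

Before the fix, the equivalent of this helper returned the queued code
unconditionally, so an error such as -EIO or -ENOMEM was lost.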

WC-bug-id: https://jira.whamcloud.com/browse/LU-14687
Lustre-commit: 3e1f8d30cb0209b3 ("LU-14687 llite: Return errors for aio")
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-on: https://review.whamcloud.com/43722
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd_support.h | 1 +
 fs/lustre/llite/rw26.c          | 3 ++-
 fs/lustre/obdclass/cl_page.c    | 3 +++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index 962a99b..188e552 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -479,6 +479,7 @@
 #define OBD_FAIL_LLITE_SHORT_COMMIT			0x1415
 #define OBD_FAIL_LLITE_CREATE_FILE_PAUSE2		0x1416
 #define OBD_FAIL_LLITE_RACE_MOUNT			0x1417
+#define OBD_FAIL_LLITE_PAGE_ALLOC			0x1418
 
 #define OBD_FAIL_FID_INDIR				0x1501
 #define OBD_FAIL_FID_INLMA				0x1502
diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c
index 74f3b0b..2de956d 100644
--- a/fs/lustre/llite/rw26.c
+++ b/fs/lustre/llite/rw26.c
@@ -435,7 +435,8 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 			vio->u.readwrite.vui_written += tot_bytes;
 		else
 			vio->u.readwrite.vui_read += tot_bytes;
-		result = -EIOCBQUEUED;
+		if (result == 0)
+			result = -EIOCBQUEUED;
 	}
 
 	return result;
diff --git a/fs/lustre/obdclass/cl_page.c b/fs/lustre/obdclass/cl_page.c
index 4b6386c..1c9e91d 100644
--- a/fs/lustre/obdclass/cl_page.c
+++ b/fs/lustre/obdclass/cl_page.c
@@ -158,6 +158,9 @@ static struct cl_page *__cl_page_alloc(struct cl_object *o)
 	struct cl_page *cl_page = NULL;
 	unsigned short bufsize = cl_object_header(o)->coh_page_bufsize;
 
+	if (OBD_FAIL_CHECK(OBD_FAIL_LLITE_PAGE_ALLOC))
+		return NULL;
+
 check:
 	/* the number of entries in cl_page_kmem_array is expected to
 	 * only be 2-3 entries, so the lookup overhead should be low.
-- 
1.8.3.1

* [lustre-devel] [PATCH 15/27] lnet: do not crash if lnet_sock_getaddr returns error
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (13 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 14/27] lustre: llite: Return errors for aio James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 16/27] lustre: sec: forbid file rename from enc to unencrypted dir James Simmons
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Artem Blagodarenko, Lustre Development List

From: Artem Blagodarenko <artem.blagodarenko@hpe.com>

Some network issues lead to a panic in ksocknal_accept():

rc = lnet_sock_getaddr(sock, true, &peer_ip, &peer_port);
LASSERT(rc == 0); /* we succeeded before */

Let's pass this error to the caller.
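
The change in pattern can be sketched in userspace C. The names
fake_getaddr() and accept_peer() are hypothetical; fake_getaddr() only
stands in for a lookup that can fail at runtime, as lnet_sock_getaddr()
does. The assertion that remains guards a programming error (a NULL
output pointer), while the runtime failure is logged and returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical address lookup that fails when the peer is gone. */
int fake_getaddr(int connected, unsigned int *addr)
{
	if (!connected)
		return -ENOTCONN;
	*addr = 0x7f000001;
	return 0;
}

/* Log the failure and hand it to the caller instead of asserting. */
int accept_peer(int connected, unsigned int *addr)
{
	int rc;

	assert(addr != NULL);	/* caller bug, not a runtime condition */

	rc = fake_getaddr(connected, addr);
	if (rc != 0) {
		fprintf(stderr, "Can't determine new connection's address\n");
		return rc;
	}
	return 0;
}
```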

HPE-bug-id: LUS-9256
WC-bug-id: https://jira.whamcloud.com/browse/LU-13950
Lustre-commit: 48a9ea82eb30bbb ("LU-13950 lnet: do not crash if lnet_sock_getaddr returns error")
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://es-gerrit.dev.cray.com/157753
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-on: https://review.whamcloud.com/39834
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c | 5 ++++-
 net/lnet/lnet/acceptor.c         | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 3a667e5..eb8c736 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -744,7 +744,10 @@ struct ksock_peer_ni *
 	struct sockaddr_storage peer;
 
 	rc = lnet_sock_getaddr(sock, true, &peer);
-	LASSERT(!rc);			/* we succeeded before */
+	if (rc != 0) {
+		CERROR("Can't determine new connection's address\n");
+		return rc;
+	}
 
 	cr = kzalloc(sizeof(*cr), GFP_NOFS);
 	if (!cr) {
diff --git a/net/lnet/lnet/acceptor.c b/net/lnet/lnet/acceptor.c
index b301ffa..3708b89 100644
--- a/net/lnet/lnet/acceptor.c
+++ b/net/lnet/lnet/acceptor.c
@@ -200,7 +200,10 @@ struct socket *
 	LASSERT(sizeof(cr) <= 16);		/* not too big for the stack */
 
 	rc = lnet_sock_getaddr(sock, true, &peer);
-	LASSERT(!rc);				/* we succeeded before */
+	if (rc != 0) {
+		CERROR("Can't determine new connection's address\n");
+		return rc;
+	}
 
 	if (!lnet_accept_magic(magic, LNET_PROTO_ACCEPTOR_MAGIC)) {
 		if (lnet_accept_magic(magic, LNET_PROTO_MAGIC)) {
-- 
1.8.3.1

* [lustre-devel] [PATCH 16/27] lustre: sec: forbid file rename from enc to unencrypted dir
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (14 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 15/27] lnet: do not crash if lnet_sock_getaddr returns error James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 17/27] lustre: mdc: start changelog thread upon first access James Simmons
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Sebastien Buisson <sbuisson@ddn.com>

fscrypt allows renaming an encrypted file from an encrypted directory
into an unencrypted directory. But it leaves the file encrypted,
sitting in an unencrypted directory, which can lead to unexpected
issues.
So just prevent this kind of rename, and adapt sanity-sec test_47
accordingly.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14629
Lustre-commit: 1158386ac9c6a63 ("LU-14629 sec: forbid file rename from enc to unencrypted dir")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/43404
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
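A minimal userspace sketch of the added check, for illustration only.
The helper name rename_enc_check() and its boolean parameters are
hypothetical stand-ins for IS_ENCRYPTED(src) and IS_ENCRYPTED(tgt);
the real check sits in ll_rename() right after fscrypt_prepare_rename().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the new ll_rename() check: src_dir_encrypted and
 * tgt_dir_encrypted stand in for IS_ENCRYPTED(src) and IS_ENCRYPTED(tgt).
 */
static int rename_enc_check(bool src_dir_encrypted, bool tgt_dir_encrypted)
{
	/* moving out of an encrypted dir into a plain one is refused */
	if (src_dir_encrypted && !tgt_dir_encrypted)
		return -EXDEV;
	return 0;
}
```

Only the encrypted-to-unencrypted combination is rejected; the other
three combinations pass through to the rest of the rename path.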
 fs/lustre/llite/namei.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c
index a2f5d8d..43cbfbd 100644
--- a/fs/lustre/llite/namei.c
+++ b/fs/lustre/llite/namei.c
@@ -1792,6 +1792,11 @@ static int ll_rename(struct inode *src, struct dentry *src_dchild,
 	err = fscrypt_prepare_rename(src, src_dchild, tgt, tgt_dchild, flags);
 	if (err)
 		return err;
+	/* we prevent an encrypted file from being renamed
+	 * into an unencrypted dir
+	 */
+	if (IS_ENCRYPTED(src) && !IS_ENCRYPTED(tgt))
+		return -EXDEV;
 
 	if (src_dchild->d_inode)
 		mode = src_dchild->d_inode->i_mode;
-- 
1.8.3.1


* [lustre-devel] [PATCH 17/27] lustre: mdc: start changelog thread upon first access
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (15 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 16/27] lustre: sec: forbid file rename from enc to unencrypted dir James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 18/27] lustre: llog: changelog purge deletes plain llog James Simmons
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Alex Zhuravlev <bzzz@whamcloud.com>

Start the changelog thread on first access (read or poll) instead of
in open(), leaving the caller a chance to set CHANGELOG_FLAG_FOLLOW.
Otherwise the thread (started from open()) can reach the end
of the changelog and exit early.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14663
Lustre-commit: 72a08ea547dceb54 ("LU-14663 mdc: start changelog thread upon first access")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43513
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
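A single-threaded sketch of why deferring the start matters, under
stated assumptions: struct reader and the reader_*() helpers are
hypothetical, the follow field stands in for CHANGELOG_FLAG_FOLLOW,
and the locking done by chlg_start_thread() is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reader state, loosely modelled on chlg_reader_state. */
struct reader {
	bool follow;		/* stands in for CHANGELOG_FLAG_FOLLOW */
	bool producer_started;	/* stands in for crs_prod_task != NULL */
	bool producer_follows;	/* flag value the producer saw at start */
};

/* open(): set up state but do not start the producer yet */
static void reader_open(struct reader *r)
{
	r->follow = false;
	r->producer_started = false;
	r->producer_follows = false;
}

/* caller configures the reader between open() and the first read() */
static void reader_set_follow(struct reader *r)
{
	r->follow = true;
}

/* first read()/poll(): start the producer once, with the current flags */
static void reader_first_access(struct reader *r)
{
	if (r->producer_started)
		return;
	r->producer_follows = r->follow;
	r->producer_started = true;
}
```

Because the producer is started at first access, it sees the follow
flag that was set after open(); starting it in reader_open() would
have captured the flag while it was still clear.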
 fs/lustre/mdc/mdc_changelog.c | 54 +++++++++++++++++++++++++++++--------------
 1 file changed, 37 insertions(+), 17 deletions(-)

diff --git a/fs/lustre/mdc/mdc_changelog.c b/fs/lustre/mdc/mdc_changelog.c
index 31c6c8a..d366720 100644
--- a/fs/lustre/mdc/mdc_changelog.c
+++ b/fs/lustre/mdc/mdc_changelog.c
@@ -363,6 +363,33 @@ static int chlg_load(void *args)
 	return rc;
 }
 
+static int chlg_start_thread(struct file *file)
+{
+	struct chlg_reader_state *crs = file->private_data;
+	struct task_struct *task;
+	int rc = 0;
+
+	if (likely(crs->crs_prod_task))
+		return 0;
+	if (unlikely(file->f_mode & FMODE_READ) == 0)
+		return 0;
+
+	mutex_lock(&crs->crs_lock);
+	if (!crs->crs_prod_task) {
+		task = kthread_run(chlg_load, crs, "chlg_load_thread");
+		if (IS_ERR(task)) {
+			rc = PTR_ERR(task);
+			CERROR("%s: cannot start changelog thread: rc = %d\n",
+			       crs->crs_ced->ced_name, rc);
+			goto out;
+		}
+		crs->crs_prod_task = task;
+	}
+out:
+	mutex_unlock(&crs->crs_lock);
+	return rc;
+}
+
 /**
  * Read handler, dequeues records from the chlg_reader_state if any.
  * No partial records are copied to userland so this function can return less
@@ -396,6 +423,10 @@ static ssize_t chlg_read(struct file *file, char __user *buff, size_t count,
 			return -EAGAIN;
 	}
 
+	rc = chlg_start_thread(file);
+	if (rc)
+		return rc;
+
 	rc = wait_event_interruptible(crs->crs_waitq_cons,
 			crs->crs_rec_count > 0 || crs->crs_eof || crs->crs_err);
 
@@ -601,8 +632,6 @@ static int chlg_open(struct inode *inode, struct file *file)
 {
 	struct chlg_reader_state *crs;
 	struct chlg_registered_dev *dev;
-	struct task_struct *task;
-	int rc;
 
 	dev = container_of(inode->i_cdev, struct chlg_registered_dev, ced_cdev);
 
@@ -620,24 +649,10 @@ static int chlg_open(struct inode *inode, struct file *file)
 	init_waitqueue_head(&crs->crs_waitq_prod);
 	init_waitqueue_head(&crs->crs_waitq_cons);
 
-	if (file->f_mode & FMODE_READ) {
-		task = kthread_run(chlg_load, crs, "chlg_load_thread");
-		if (IS_ERR(task)) {
-			rc = PTR_ERR(task);
-			CERROR("%s: cannot start changelog thread: rc = %d\n",
-			       dev->ced_name, rc);
-			goto err_crs;
-		}
-		crs->crs_prod_task = task;
-	}
+	crs->crs_prod_task = NULL;
 
 	file->private_data = crs;
 	return 0;
-
-err_crs:
-	kref_put(&dev->ced_refs, chlg_dev_clear);
-	kfree(crs);
-	return rc;
 }
 
 /**
@@ -679,6 +694,11 @@ static unsigned int chlg_poll(struct file *file, poll_table *wait)
 {
 	struct chlg_reader_state *crs = file->private_data;
 	unsigned int mask = 0;
+	int rc;
+
+	rc = chlg_start_thread(file);
+	if (rc)
+		return rc;
 
 	mutex_lock(&crs->crs_lock);
 	poll_wait(file, &crs->crs_waitq_cons, wait);
-- 
1.8.3.1


* [lustre-devel] [PATCH 18/27] lustre: llog: changelog purge deletes plain llog
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (16 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 17/27] lustre: mdc: start changelog thread upon first access James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 19/27] lnet: libcfs: allow comma-separated masks James Simmons
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Alexander Boyko, Lustre Development List

From: Alexander Boyko <alexander.boyko@hpe.com>

When a large number of records is cancelled, changelog purge can
delete a whole plain llog file instead of cancelling its records
one by one.
The patch also fixes the race between llog_destroy and llog_next_block.

HPE-bug-id: LUS-9950
WC-bug-id: https://jira.whamcloud.com/browse/LU-14688
Lustre-commit: d813c75df6798efb ("LU-14688 mdt: changelog purge deletes plain llog")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/43719
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/obdclass/llog.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/lustre/obdclass/llog.c b/fs/lustre/obdclass/llog.c
index c342734..768bc47 100644
--- a/fs/lustre/obdclass/llog.c
+++ b/fs/lustre/obdclass/llog.c
@@ -323,6 +323,10 @@ static int llog_process_thread(void *arg)
 			CDEBUG(D_OTHER,
 			       "cur_offset %llu, chunk_offset %llu, buf_offset %u, rc = %d\n",
 			       cur_offset, (u64)chunk_offset, buf_offset, rc);
+		if (rc == -ESTALE) {
+			rc = 0;
+			goto out;
+		}
 		/* we`ve tried to reread the chunk, but there is no
 		 * new records
 		 */
-- 
1.8.3.1


* [lustre-devel] [PATCH 19/27] lnet: libcfs: allow comma-separated masks
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (17 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 18/27] lustre: llog: changelog purge deletes plain llog James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 20/27] lustre: osc: cleanup comment in osc_object_is_contended James Simmons
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Andreas Dilger <adilger@whamcloud.com>

For debug and changelog mask names, allow a comma-separated list
of names to be given, so that the space-separated list does not
need to be quoted for use.

Fix a couple of test cases where the debug parameter is set and
printed overly verbosely during tests.

WC-bug-id: https://jira.whamcloud.com/browse/LU-13055
Lustre-commit: 6b6fde1026311a28 ("LU-13055 libcfs: allow comma-separated masks")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43741
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
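A simplified userspace sketch of the separator handling, assuming a
made-up name table (demo_names) and an absolute-only parser without
the '+'/'-' operators or minmask that cfs_str2mask() supports. It is
meant only to show that commas and whitespace are now interchangeable.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical mask names; bit i corresponds to demo_names[i]. */
static const char *const demo_names[] = { "trace", "inode", "super" };
#define DEMO_NR (sizeof(demo_names) / sizeof(demo_names[0]))

/* Simplified, absolute-only variant of the cfs_str2mask() loop.
 * Returns 0 and fills *mask, or -1 on an unknown token.
 */
static int demo_str2mask(const char *str, unsigned int *mask)
{
	unsigned int newmask = 0;

	while (*str != '\0') {
		size_t len = 0, i;

		/* skip separators: whitespace or comma */
		while (isspace((unsigned char)*str) || *str == ',')
			str++;
		if (*str == '\0')
			break;

		/* find token length */
		while (str[len] != '\0' && !isspace((unsigned char)str[len]) &&
		       str[len] != ',')
			len++;

		/* match token against the name table */
		for (i = 0; i < DEMO_NR; i++) {
			if (strlen(demo_names[i]) == len &&
			    strncmp(str, demo_names[i], len) == 0) {
				newmask |= 1u << i;
				break;
			}
		}
		if (i == DEMO_NR)
			return -1;
		str += len;
	}
	*mask = newmask;
	return 0;
}
```

With this sketch, "trace inode" and "trace,inode" yield the same mask,
so a caller no longer has to quote a space-separated list.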
 net/lnet/libcfs/libcfs_string.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/lnet/libcfs/libcfs_string.c b/net/lnet/libcfs/libcfs_string.c
index d2460f3..4259f8b8 100644
--- a/net/lnet/libcfs/libcfs_string.c
+++ b/net/lnet/libcfs/libcfs_string.c
@@ -52,14 +52,14 @@ int cfs_str2mask(const char *str, const char *(*bit2str)(int bit),
 	char op = '\0';
 	int newmask = minmask, i, len, found = 0;
 
-	/* <str> must be a list of tokens separated by whitespace
+	/* <str> must be a list of tokens separated by whitespace or comma,
 	 * and optionally an operator ('+' or '-').  If an operator
 	 * appears first in <str>, '*oldmask' is used as the starting point
 	 * (relative), otherwise minmask is used (absolute).  An operator
 	 * applies to all following tokens up to the next operator.
 	 */
 	while (*str != '\0') {
-		while (isspace(*str))
+		while (isspace(*str) || *str == ',')
 			str++;
 		if (*str == '\0')
 			break;
@@ -77,7 +77,7 @@ int cfs_str2mask(const char *str, const char *(*bit2str)(int bit),
 		/* find token length */
 		len = 0;
 		while (str[len] != '\0' && !isspace(str[len]) &&
-		       str[len] != '+' && str[len] != '-')
+		       str[len] != '+' && str[len] != '-' && str[len] != ',')
 			len++;
 
 		/* match token */
-- 
1.8.3.1


* [lustre-devel] [PATCH 20/27] lustre: osc: cleanup comment in osc_object_is_contended
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (18 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 19/27] lnet: libcfs: allow comma-separated masks James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 21/27] lnet: simplify lnet_ni_add_interface James Simmons
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Li Xi, Lustre Development List

From: Li Xi <lixi@ddn.com>

ll_file_is_contended() does not exist any more, so the comment
is invalid.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14702
Lustre-commit: 269a157c600a68fa ("LU-14702 osc: cleanup comment in osc_object_is_contended")
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/43775
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/osc/osc_object.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/fs/lustre/osc/osc_object.c b/fs/lustre/osc/osc_object.c
index 8f36789..0dd926a 100644
--- a/fs/lustre/osc/osc_object.c
+++ b/fs/lustre/osc/osc_object.c
@@ -344,10 +344,6 @@ int osc_object_is_contended(struct osc_object *obj)
 	if (!obj->oo_contended)
 		return 0;
 
-	/*
-	 * I like copy-paste. the code is copied from
-	 * ll_file_is_contended.
-	 */
 	retry_time = ktime_add_ns(obj->oo_contention_time,
 				  osc_contention_time * NSEC_PER_SEC);
 	if (ktime_after(ktime_get(), retry_time)) {
-- 
1.8.3.1


* [lustre-devel] [PATCH 21/27] lnet: simplify lnet_ni_add_interface
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (19 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 20/27] lustre: osc: cleanup comment in osc_object_is_contended James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 22/27] lustre: lmv: change default hash type to crush James Simmons
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Olaf Faaland <faaland1@llnl.gov>

Remove an unnecessary counter and move the comment before
the relevant code.  Improve error messages.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14665
Lustre-commit: b77a6d86936c32bb ("LU-14665 lnet: simplify lnet_ni_add_interface")
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/43525
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/config.c | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/net/lnet/lnet/config.c b/net/lnet/lnet/config.c
index 5f4b90b..0117611 100644
--- a/net/lnet/lnet/config.c
+++ b/net/lnet/lnet/config.c
@@ -369,31 +369,26 @@ struct lnet_net *
 static int
 lnet_ni_add_interface(struct lnet_ni *ni, char *iface)
 {
-	int niface = 0;
-
 	if (!ni)
 		return -ENOMEM;
 
-	/* Allocate a separate piece of memory and copy
-	 * into it the string, so we don't have
-	 * a depencency on the tokens string.  This way we
-	 * can free the tokens at the end of the function.
-	 * The newly allocated ni_interface[] can be
-	 * freed when freeing the NI */
-	if (ni->ni_interface)
-		niface++;
-
-	if (niface >= 1) {
-		LCONSOLE_ERROR_MSG(0x115, "Too many interfaces "
-				   "for net %s\n",
-				   libcfs_net2str(LNET_NIDNET(ni->ni_nid)));
+	if (ni->ni_interface) {
+		LCONSOLE_ERROR_MSG(0x115, "%s: interface %s already set for net %s: rc = %d\n",
+				   iface, ni->ni_interface,
+				   libcfs_net2str(LNET_NIDNET(ni->ni_nid)),
+				   -EINVAL);
 		return -EINVAL;
 	}
 
+	/* Allocate memory for the interface, so the code parsing input into
+	 * tokens and adding interfaces can free the input safely.
+	 * ni->ni_interface is freed in lnet_ni_free().
+	 */
 	ni->ni_interface = kstrdup(iface, GFP_KERNEL);
 
 	if (!ni->ni_interface) {
-		CERROR("Can't allocate net interface name\n");
+		CERROR("%s: cannot allocate net interface name: rc = %d\n",
+		       iface, -ENOMEM);
 		return -ENOMEM;
 	}
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 22/27] lustre: lmv: change default hash type to crush
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (20 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 21/27] lnet: simplify lnet_ni_add_interface James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 23/27] lustre: ptlrpc: move more members in PTLRPC request into pill James Simmons
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Andreas Dilger <adilger@whamcloud.com>

Change the default hash type to CRUSH to minimize the number
of directory entries that need to be migrated.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14459
Lustre-commit: bb60caa1c6e7c14c ("LU-14459 lmv: change default hash type to crush")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43684
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
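A small sketch of the fallback expression used for servers without
CRUSH support. The DEMO_* constants and demo_fallback_hash() are
hypothetical and their values are illustrative assumptions, not the
uapi definitions; the point is only that (hash ^ type) clears exactly
the type bits and leaves any flag bits alone.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; treat them as assumptions, not the uapi. */
#define DEMO_HASH_TYPE_MASK	0x0000ffffu
#define DEMO_HASH_UNKNOWN	0u
#define DEMO_HASH_FNV_1A_64	2u
#define DEMO_HASH_CRUSH		3u
#define DEMO_HASH_FLAG		0x80000000u	/* some flag in the high bits */

/* For a server without CRUSH support: replace a CRUSH or unknown type
 * with FNV_1A_64 while leaving the high flag bits untouched.
 */
static uint32_t demo_fallback_hash(uint32_t hash)
{
	uint32_t type = hash & DEMO_HASH_TYPE_MASK;

	if (type == DEMO_HASH_CRUSH || type == DEMO_HASH_UNKNOWN)
		/* hash ^ type clears exactly the type bits */
		hash = (hash ^ type) | DEMO_HASH_FNV_1A_64;
	return hash;
}
```

Any other explicitly requested type is passed through unchanged, which
matches the single if-branch in the diff.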
 fs/lustre/llite/dir.c                   | 26 ++++++++++----------------
 include/uapi/linux/lustre/lustre_user.h |  2 +-
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index d7466f3..bd15fee 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -417,23 +417,17 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 	    !OBD_FAIL_CHECK(OBD_FAIL_LLITE_NO_CHECK_DEAD))
 		return -ENOENT;
 
+	/* MDS < 2.14 doesn't support 'crush' hash type, and cannot handle
+	 * unknown hash if client doesn't set a valid one. switch to fnv_1a_64.
+	 */
 	if (!(exp_connect_flags2(sbi->ll_md_exp) & OBD_CONNECT2_CRUSH)) {
-		if ((lump->lum_hash_type & LMV_HASH_TYPE_MASK) ==
-		     LMV_HASH_TYPE_CRUSH) {
-			/* if server doesn't support 'crush' hash type,
-			 * switch to fnv_1a_64.
-			 */
-			lump->lum_hash_type &= ~LMV_HASH_TYPE_MASK;
-			lump->lum_hash_type |= LMV_HASH_TYPE_FNV_1A_64;
-		} else if ((lump->lum_hash_type & LMV_HASH_TYPE_MASK) ==
-			    LMV_HASH_TYPE_UNKNOWN) {
-			/* from 2.14 MDT will choose default hash type if client
-			 * doesn't set a valid one, while old server doesn't
-			 * handle it.
-			 */
-			lump->lum_hash_type &= ~LMV_HASH_TYPE_MASK;
-			lump->lum_hash_type |= LMV_HASH_TYPE_DEFAULT;
-		}
+		enum lmv_hash_type type = lump->lum_hash_type &
+					  LMV_HASH_TYPE_MASK;
+
+		if (type == LMV_HASH_TYPE_CRUSH ||
+		    type == LMV_HASH_TYPE_UNKNOWN)
+			lump->lum_hash_type = (lump->lum_hash_type ^ type) |
+					      LMV_HASH_TYPE_FNV_1A_64;
 	}
 
 	if (unlikely(!lmv_user_magic_supported(cpu_to_le32(lump->lum_magic))))
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index aae6642..49b013c 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -701,7 +701,7 @@ static __attribute__((unused)) const char *mdt_hash_name[] = {
 	"crush",
 };
 
-#define LMV_HASH_TYPE_DEFAULT LMV_HASH_TYPE_FNV_1A_64
+#define LMV_HASH_TYPE_DEFAULT LMV_HASH_TYPE_CRUSH
 
 /* Right now only the lower part(0-16bits) of lmv_hash_type is being used,
  * and the higher part will be the flag to indicate the status of object,
-- 
1.8.3.1


* [lustre-devel] [PATCH 23/27] lustre: ptlrpc: move more members in PTLRPC request into pill
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (21 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 22/27] lustre: lmv: change default hash type to crush James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 24/27] lustre: llite: add selinux testing James Simmons
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Qian Yingjin <qian@ddn.com>

Some data members in the data structure @ptlrpc_request can be
moved into the data structure @req_capsule:
/** Request message - what client sent */
struct lustre_msg *rq_reqmsg;
/** Reply message - server response */
struct lustre_msg *rq_repmsg;
/** Fields that help to see if request and reply were swabbed */
u32 rq_req_swab_mask;
u32 rq_rep_swab_mask;

After these data structures are reorganized, @req_capsule can be
used more generally, and it becomes easier to pack and unpack
sub-requests in a batch PtlRPC request for the upcoming batch
metadata processing.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14138
Lustre-commit: f75d2a1fc9b17b38 ("LU-14138 ptlrpc: move more members in PTLRPC request into pill")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/40669
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
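A cut-down sketch of the pattern this patch applies, with hypothetical
demo_* types and helpers: the swab mask lives in the capsule, the
helpers take the capsule rather than the request, and a #define keeps
the old rq_* field name usable on the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for req_capsule and ptlrpc_request. */
struct demo_capsule {
	uint32_t rc_req_swab_mask;
};

struct demo_request {
	struct demo_capsule rq_pill;
};

/* Old field name keeps working as an alias into the embedded capsule. */
#define rq_req_swab_mask rq_pill.rc_req_swab_mask

/* true if the request buffer at @index was already swabbed */
static bool demo_req_swabbed(const struct demo_capsule *pill, size_t index)
{
	assert(index < sizeof(pill->rc_req_swab_mask) * 8);
	return pill->rc_req_swab_mask & (UINT32_C(1) << index);
}

/* mark the request buffer at @index as swabbed, at most once */
static void demo_set_req_swabbed(struct demo_capsule *pill, size_t index)
{
	assert(index < sizeof(pill->rc_req_swab_mask) * 8);
	assert(!demo_req_swabbed(pill, index));
	pill->rc_req_swab_mask |= UINT32_C(1) << index;
}
```

Callers that only hold a capsule can use the helpers directly, while
existing code that reads req->rq_req_swab_mask keeps compiling through
the alias.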
 fs/lustre/include/lustre_net.h        |  72 +----------------
 fs/lustre/include/lustre_req_layout.h |  78 +++++++++++++++++-
 fs/lustre/include/obd.h               |   2 +-
 fs/lustre/include/obd_class.h         |   4 +-
 fs/lustre/llite/dcache.c              |   6 +-
 fs/lustre/llite/dir.c                 |   2 +-
 fs/lustre/llite/file.c                |   4 +-
 fs/lustre/llite/llite_internal.h      |   4 +-
 fs/lustre/llite/llite_lib.c           |  16 ++--
 fs/lustre/llite/llite_nfs.c           |   2 +-
 fs/lustre/llite/namei.c               |   8 +-
 fs/lustre/llite/statahead.c           |   2 +-
 fs/lustre/lmv/lmv_obd.c               |   5 +-
 fs/lustre/mdc/mdc_acl.c               |   3 +-
 fs/lustre/mdc/mdc_dev.c               |   6 +-
 fs/lustre/mdc/mdc_internal.h          |  36 ++++-----
 fs/lustre/mdc/mdc_lib.c               | 146 +++++++++++++++++-----------------
 fs/lustre/mdc/mdc_locks.c             |  10 +--
 fs/lustre/mdc/mdc_reint.c             |  13 +--
 fs/lustre/mdc/mdc_request.c           |  43 +++++-----
 fs/lustre/mgc/mgc_request.c           |   2 +-
 fs/lustre/ptlrpc/layout.c             |  32 ++++++--
 fs/lustre/ptlrpc/pack_generic.c       |  52 +++++-------
 fs/lustre/ptlrpc/sec.c                |   6 +-
 fs/lustre/ptlrpc/sec_plain.c          |   4 +-
 25 files changed, 282 insertions(+), 276 deletions(-)

diff --git a/fs/lustre/include/lustre_net.h b/fs/lustre/include/lustre_net.h
index c894d0f..f72f7c6 100644
--- a/fs/lustre/include/lustre_net.h
+++ b/fs/lustre/include/lustre_net.h
@@ -770,6 +770,10 @@ struct ptlrpc_srv_req {
 #define rq_user_desc		rq_srv.sr_user_desc
 #define rq_ops			rq_srv.sr_ops
 #define rq_rqbd			rq_srv.sr_rqbd
+#define rq_reqmsg		rq_pill.rc_reqmsg
+#define rq_repmsg		rq_pill.rc_repmsg
+#define rq_req_swab_mask	rq_pill.rc_req_swab_mask
+#define rq_rep_swab_mask	rq_pill.rc_rep_swab_mask
 
 /**
  * Represents remote procedure call.
@@ -857,10 +861,6 @@ struct ptlrpc_request {
 	int				rq_replen;
 	/** Pool if request is from preallocated list */
 	struct ptlrpc_request_pool     *rq_pool;
-	/** Request message - what client sent */
-	struct lustre_msg	       *rq_reqmsg;
-	/** Reply message - server response */
-	struct lustre_msg	       *rq_repmsg;
 	/** Transaction number */
 	u64				rq_transno;
 	/** xid */
@@ -932,10 +932,6 @@ struct ptlrpc_request {
 
 	/** @} */
 
-	/** Fields that help to see if request and reply were swabbed or not */
-	u32			rq_req_swab_mask;
-	u32			rq_rep_swab_mask;
-
 	/** how many early replies (for stats) */
 	int			rq_early_count;
 
@@ -1011,62 +1007,6 @@ static inline bool ptlrpc_nrs_req_can_move(struct ptlrpc_request *req)
 /** @} nrs */
 
 /**
- * Returns true if request buffer at offset @index was already swabbed
- */
-static inline bool lustre_req_swabbed(struct ptlrpc_request *req, size_t index)
-{
-	LASSERT(index < sizeof(req->rq_req_swab_mask) * 8);
-	return req->rq_req_swab_mask & BIT(index);
-}
-
-/**
- * Returns true if request reply buffer at offset @index was already swabbed
- */
-static inline bool lustre_rep_swabbed(struct ptlrpc_request *req, size_t index)
-{
-	LASSERT(index < sizeof(req->rq_rep_swab_mask) * 8);
-	return req->rq_rep_swab_mask & BIT(index);
-}
-
-/**
- * Returns true if request needs to be swabbed into local cpu byteorder
- */
-static inline bool ptlrpc_req_need_swab(struct ptlrpc_request *req)
-{
-	return lustre_req_swabbed(req, MSG_PTLRPC_HEADER_OFF);
-}
-
-/**
- * Returns true if request reply needs to be swabbed into local cpu byteorder
- */
-static inline bool ptlrpc_rep_need_swab(struct ptlrpc_request *req)
-{
-	return lustre_rep_swabbed(req, MSG_PTLRPC_HEADER_OFF);
-}
-
-/**
- * Mark request buffer at offset @index that it was already swabbed
- */
-static inline void lustre_set_req_swabbed(struct ptlrpc_request *req,
-					  size_t index)
-{
-	LASSERT(index < sizeof(req->rq_req_swab_mask) * 8);
-	LASSERT((req->rq_req_swab_mask & BIT(index)) == 0);
-	req->rq_req_swab_mask |= BIT(index);
-}
-
-/**
- * Mark request reply buffer at offset @index that it was already swabbed
- */
-static inline void lustre_set_rep_swabbed(struct ptlrpc_request *req,
-					  size_t index)
-{
-	LASSERT(index < sizeof(req->rq_rep_swab_mask) * 8);
-	LASSERT((req->rq_rep_swab_mask & BIT(index)) == 0);
-	req->rq_rep_swab_mask |= BIT(index);
-}
-
-/**
  * Convert numerical request phase value @phase into text string description
  */
 static inline const char *
@@ -2047,10 +1987,6 @@ struct ptlrpc_service *ptlrpc_register_service(struct ptlrpc_service_conf *conf,
 				 MDS_REG_MAXREQSIZE : OUT_MAXREQSIZE)
 #define PTLRPC_MAX_BUFLEN	(OST_IO_MAXREQSIZE > MD_MAX_BUFLEN ? \
 				 OST_IO_MAXREQSIZE : MD_MAX_BUFLEN)
-bool ptlrpc_buf_need_swab(struct ptlrpc_request *req, const int inout,
-			  u32 index);
-void ptlrpc_buf_set_swabbed(struct ptlrpc_request *req, const int inout,
-			    u32 index);
 int ptlrpc_unpack_rep_msg(struct ptlrpc_request *req, int len);
 int ptlrpc_unpack_req_msg(struct ptlrpc_request *req, int len);
 
diff --git a/fs/lustre/include/lustre_req_layout.h b/fs/lustre/include/lustre_req_layout.h
index f6ebda3..9f22134b 100644
--- a/fs/lustre/include/lustre_req_layout.h
+++ b/fs/lustre/include/lustre_req_layout.h
@@ -62,10 +62,17 @@ enum req_location {
 #define REQ_MAX_FIELD_NR 12
 
 struct req_capsule {
-	struct ptlrpc_request		*rc_req;
-	const struct req_format		*rc_fmt;
-	enum req_location		 rc_loc;
-	u32				 rc_area[RCL_NR][REQ_MAX_FIELD_NR];
+	struct ptlrpc_request	*rc_req;
+	/** Request message - what client sent */
+	struct lustre_msg	*rc_reqmsg;
+	/** Reply message - server response */
+	struct lustre_msg	*rc_repmsg;
+	/** Fields that help to see if request and reply were swabbed or not */
+	u32			 rc_req_swab_mask;
+	u32			 rc_rep_swab_mask;
+	const struct req_format	*rc_fmt;
+	enum req_location	 rc_loc;
+	u32			 rc_area[RCL_NR][REQ_MAX_FIELD_NR];
 };
 
 void req_capsule_init(struct req_capsule *pill, struct ptlrpc_request *req,
@@ -117,6 +124,69 @@ int req_capsule_field_present(const struct req_capsule *pill,
 void req_capsule_shrink(struct req_capsule *pill,
 			const struct req_msg_field *field,
 			u32 newlen, enum req_location loc);
+bool req_capsule_need_swab(struct req_capsule *pill, enum req_location loc,
+			   u32 index);
+void req_capsule_set_swabbed(struct req_capsule *pill, enum req_location loc,
+			     u32 index);
+
+/**
+ * Returns true if request buffer at offset \a index was already swabbed
+ */
+static inline bool req_capsule_req_swabbed(struct req_capsule *pill,
+					   size_t index)
+{
+	LASSERT(index < sizeof(pill->rc_req_swab_mask) * 8);
+	return pill->rc_req_swab_mask & BIT(index);
+}
+
+/**
+ * Returns true if request reply buffer at offset \a index was already swabbed
+ */
+static inline bool req_capsule_rep_swabbed(struct req_capsule *pill,
+					   size_t index)
+{
+	LASSERT(index < sizeof(pill->rc_rep_swab_mask) * 8);
+	return pill->rc_rep_swab_mask & BIT(index);
+}
+
+/**
+ * Returns true if request needs to be swabbed into local cpu byteorder
+ */
+static inline bool req_capsule_req_need_swab(struct req_capsule *pill)
+{
+	return req_capsule_req_swabbed(pill, MSG_PTLRPC_HEADER_OFF);
+}
+
+/**
+ * Returns true if request reply needs to be swabbed into local cpu byteorder
+ */
+static inline bool req_capsule_rep_need_swab(struct req_capsule *pill)
+{
+	return req_capsule_rep_swabbed(pill, MSG_PTLRPC_HEADER_OFF);
+}
+
+/**
+ * Mark request buffer at offset \a index that it was already swabbed
+ */
+static inline void req_capsule_set_req_swabbed(struct req_capsule *pill,
+					       size_t index)
+{
+	LASSERT(index < sizeof(pill->rc_req_swab_mask) * 8);
+	LASSERT((pill->rc_req_swab_mask & BIT(index)) == 0);
+	pill->rc_req_swab_mask |= BIT(index);
+}
+
+/**
+ * Mark request reply buffer at offset \a index that it was already swabbed
+ */
+static inline void req_capsule_set_rep_swabbed(struct req_capsule *pill,
+					       size_t index)
+{
+	LASSERT(index < sizeof(pill->rc_rep_swab_mask) * 8);
+	LASSERT((pill->rc_rep_swab_mask & BIT(index)) == 0);
+	pill->rc_rep_swab_mask |= BIT(index);
+}
+
 int  req_layout_init(void);
 void req_layout_fini(void);
 
diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h
index 678953a..86d7839 100644
--- a/fs/lustre/include/obd.h
+++ b/fs/lustre/include/obd.h
@@ -1028,7 +1028,7 @@ struct md_ops {
 
 	int (*init_ea_size)(struct obd_export *, u32, u32);
 
-	int (*get_lustre_md)(struct obd_export *, struct ptlrpc_request *,
+	int (*get_lustre_md)(struct obd_export *exp, struct req_capsule *pill,
 			     struct obd_export *, struct obd_export *,
 			     struct lustre_md *);
 
diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h
index 4cc5a7df..2fe4ea2 100644
--- a/fs/lustre/include/obd_class.h
+++ b/fs/lustre/include/obd_class.h
@@ -1432,7 +1432,7 @@ static inline int md_unlink(struct obd_export *exp, struct md_op_data *op_data,
 }
 
 static inline int md_get_lustre_md(struct obd_export *exp,
-				   struct ptlrpc_request *req,
+				   struct req_capsule *pill,
 				   struct obd_export *dt_exp,
 				   struct obd_export *md_exp,
 				   struct lustre_md *md)
@@ -1443,7 +1443,7 @@ static inline int md_get_lustre_md(struct obd_export *exp,
 	if (rc)
 		return rc;
 
-	return MDP(exp->exp_obd, get_lustre_md)(exp, req, dt_exp, md_exp, md);
+	return MDP(exp->exp_obd, get_lustre_md)(exp, pill, dt_exp, md_exp, md);
 }
 
 static inline int md_free_lustre_md(struct obd_export *exp,
diff --git a/fs/lustre/llite/dcache.c b/fs/lustre/llite/dcache.c
index 24af33e..4162f46 100644
--- a/fs/lustre/llite/dcache.c
+++ b/fs/lustre/llite/dcache.c
@@ -202,17 +202,13 @@ int ll_revalidate_it_finish(struct ptlrpc_request *request,
 			    struct lookup_intent *it,
 			    struct inode *inode)
 {
-	int rc = 0;
-
 	if (!request)
 		return 0;
 
 	if (it_disposition(it, DISP_LOOKUP_NEG))
 		return -ENOENT;
 
-	rc = ll_prep_inode(&inode, request, NULL, it);
-
-	return rc;
+	return ll_prep_inode(&inode, &request->rq_pill, NULL, it);
 }
 
 void ll_lookup_finish_locks(struct lookup_intent *it, struct inode *inode)
diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index bd15fee..3432034 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -486,7 +486,7 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 	if (err)
 		goto out_request;
 
-	err = ll_prep_inode(&inode, request, parent->i_sb, NULL);
+	err = ll_prep_inode(&inode, &request->rq_pill, parent->i_sb, NULL);
 	if (err)
 		goto out_inode;
 
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 7c14cf2..2dcf25f 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -665,7 +665,7 @@ static int ll_intent_file_open(struct dentry *de, void *lmm, int lmmsize,
 		goto out;
 	}
 
-	rc = ll_prep_inode(&inode, req, NULL, itp);
+	rc = ll_prep_inode(&inode, &req->rq_pill, NULL, itp);
 
 	if (!rc && itp->it_lock_mode) {
 		u64 bits = 0;
@@ -4531,7 +4531,7 @@ int ll_get_fid_by_name(struct inode *parent, const char *name,
 		*fid = body->mbo_fid1;
 
 	if (inode)
-		rc = ll_prep_inode(inode, req, parent->i_sb, NULL);
+		rc = ll_prep_inode(inode, &req->rq_pill, parent->i_sb, NULL);
 out_req:
 	ptlrpc_req_finished(req);
 	return rc;
diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index a1e5e468..3674af9 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -1213,7 +1213,7 @@ int ll_iocontrol(struct inode *inode, struct file *file,
 int ll_remount_fs(struct super_block *sb, int *flags, char *data);
 int ll_show_options(struct seq_file *seq, struct dentry *dentry);
 void ll_dirty_page_discard_warn(struct page *page, int ioret);
-int ll_prep_inode(struct inode **inode, struct ptlrpc_request *req,
+int ll_prep_inode(struct inode **inode, struct req_capsule *pill,
 		  struct super_block *sb, struct lookup_intent *it);
 int ll_obd_statfs(struct inode *inode, void __user *arg);
 int ll_get_max_mdsize(struct ll_sb_info *sbi, int *max_mdsize);
@@ -1229,9 +1229,9 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data,
 void ll_finish_md_op_data(struct md_op_data *op_data);
 int ll_get_obd_name(struct inode *inode, unsigned int cmd, unsigned long arg);
 void ll_compute_rootsquash_state(struct ll_sb_info *sbi);
-void ll_open_cleanup(struct super_block *sb, struct ptlrpc_request *open_req);
 ssize_t ll_copy_user_md(const struct lov_user_md __user *md,
 			struct lov_user_md **kbuf);
+void ll_open_cleanup(struct super_block *sb, struct req_capsule *pill);
 
 void ll_dom_finish_open(struct inode *inode, struct ptlrpc_request *req);
 
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index fe49030..646bff8 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -650,8 +650,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt)
 		goto out_lock_cn_cb;
 	}
 
-	err = md_get_lustre_md(sbi->ll_md_exp, request, sbi->ll_dt_exp,
-			       sbi->ll_md_exp, &lmd);
+	err = md_get_lustre_md(sbi->ll_md_exp, &request->rq_pill,
+			       sbi->ll_dt_exp, sbi->ll_md_exp, &lmd);
 	if (err) {
 		CERROR("failed to understand root inode md: rc = %d\n", err);
 		ptlrpc_req_finished(request);
@@ -1723,7 +1723,7 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data)
 		return rc;
 	}
 
-	rc = md_get_lustre_md(sbi->ll_md_exp, request, sbi->ll_dt_exp,
+	rc = md_get_lustre_md(sbi->ll_md_exp, &request->rq_pill, sbi->ll_dt_exp,
 			      sbi->ll_md_exp, &md);
 	if (rc) {
 		ptlrpc_req_finished(request);
@@ -2762,14 +2762,14 @@ int ll_remount_fs(struct super_block *sb, int *flags, char *data)
  * @sb:		super block for this file-system
- * @open_req:	pointer to the original open request
+ * @pill:	request capsule (pill) of the original open request
  */
-void ll_open_cleanup(struct super_block *sb, struct ptlrpc_request *open_req)
+void ll_open_cleanup(struct super_block *sb, struct req_capsule *pill)
 {
 	struct mdt_body	*body;
 	struct md_op_data *op_data;
 	struct ptlrpc_request *close_req = NULL;
 	struct obd_export *exp = ll_s2sbi(sb)->ll_md_exp;
 
-	body = req_capsule_server_get(&open_req->rq_pill, &RMF_MDT_BODY);
+	body = req_capsule_server_get(pill, &RMF_MDT_BODY);
 	op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
 	if (!op_data)
 		return;
@@ -2782,7 +2782,7 @@ void ll_open_cleanup(struct super_block *sb, struct ptlrpc_request *open_req)
 	ll_finish_md_op_data(op_data);
 }
 
-int ll_prep_inode(struct inode **inode, struct ptlrpc_request *req,
+int ll_prep_inode(struct inode **inode, struct req_capsule *pill,
 		  struct super_block *sb, struct lookup_intent *it)
 {
 	struct ll_sb_info *sbi = NULL;
@@ -2792,7 +2792,7 @@ int ll_prep_inode(struct inode **inode, struct ptlrpc_request *req,
 
 	LASSERT(*inode || sb);
 	sbi = sb ? ll_s2sbi(sb) : ll_i2sbi(*inode);
-	rc = md_get_lustre_md(sbi->ll_md_exp, req, sbi->ll_dt_exp,
+	rc = md_get_lustre_md(sbi->ll_md_exp, pill, sbi->ll_dt_exp,
 			      sbi->ll_md_exp, &md);
 	if (rc)
 		goto out;
@@ -2878,7 +2878,7 @@ int ll_prep_inode(struct inode **inode, struct ptlrpc_request *req,
 
 	if (rc != 0 && it && it->it_op & IT_OPEN) {
 		ll_intent_drop_lock(it);
-		ll_open_cleanup(sb ? sb : (*inode)->i_sb, req);
+		ll_open_cleanup(sb ? sb : (*inode)->i_sb, pill);
 	}
 
 	return rc;
diff --git a/fs/lustre/llite/llite_nfs.c b/fs/lustre/llite/llite_nfs.c
index bf15023..6be2309 100644
--- a/fs/lustre/llite/llite_nfs.c
+++ b/fs/lustre/llite/llite_nfs.c
@@ -101,7 +101,7 @@ struct inode *search_inode_for_lustre(struct super_block *sb,
 		       PFID(fid), rc);
 		return ERR_PTR(rc);
 	}
-	rc = ll_prep_inode(&inode, req, sb, NULL);
+	rc = ll_prep_inode(&inode, &req->rq_pill, sb, NULL);
 	ptlrpc_req_finished(req);
 	if (rc)
 		return ERR_PTR(rc);
diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c
index 43cbfbd..9eab6fe 100644
--- a/fs/lustre/llite/namei.c
+++ b/fs/lustre/llite/namei.c
@@ -649,7 +649,7 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request,
 		struct mdt_body *body = req_capsule_server_get(pill,
 							       &RMF_MDT_BODY);
 
-		rc = ll_prep_inode(&inode, request, (*de)->d_sb, it);
+		rc = ll_prep_inode(&inode, &request->rq_pill, (*de)->d_sb, it);
 		if (rc)
 			return rc;
 
@@ -789,7 +789,7 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request,
 out:
 	if (rc != 0 && it->it_op & IT_OPEN) {
 		ll_intent_drop_lock(it);
-		ll_open_cleanup((*de)->d_sb, request);
+		ll_open_cleanup((*de)->d_sb, &request->rq_pill);
 	}
 
 	return rc;
@@ -1249,7 +1249,7 @@ static struct inode *ll_create_node(struct inode *dir, struct lookup_intent *it)
 	LASSERT(it_disposition(it, DISP_ENQ_CREATE_REF));
 	request = it->it_request;
 	it_clear_disposition(it, DISP_ENQ_CREATE_REF);
-	rc = ll_prep_inode(&inode, request, dir->i_sb, it);
+	rc = ll_prep_inode(&inode, &request->rq_pill, dir->i_sb, it);
 	if (rc) {
 		inode = ERR_PTR(rc);
 		goto out;
@@ -1485,7 +1485,7 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry,
 
 	ll_update_times(request, dir);
 
-	err = ll_prep_inode(&inode, request, dir->i_sb, NULL);
+	err = ll_prep_inode(&inode, &request->rq_pill, dir->i_sb, NULL);
 	if (err)
 		goto err_exit;
 
diff --git a/fs/lustre/llite/statahead.c b/fs/lustre/llite/statahead.c
index 995a9e1..282cb5a 100644
--- a/fs/lustre/llite/statahead.c
+++ b/fs/lustre/llite/statahead.c
@@ -663,7 +663,7 @@ static void sa_instantiate(struct ll_statahead_info *sai,
 		goto out;
 	}
 
-	rc = ll_prep_inode(&child, req, dir->i_sb, it);
+	rc = ll_prep_inode(&child, &req->rq_pill, dir->i_sb, it);
 	if (rc)
 		goto out;
 
diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c
index fb89047..56d22d1 100644
--- a/fs/lustre/lmv/lmv_obd.c
+++ b/fs/lustre/lmv/lmv_obd.c
@@ -3368,7 +3368,7 @@ static enum ldlm_mode lmv_lock_match(struct obd_export *exp, u64 flags,
 }
 
 static int lmv_get_lustre_md(struct obd_export *exp,
-			     struct ptlrpc_request *req,
+			     struct req_capsule *pill,
 			     struct obd_export *dt_exp,
 			     struct obd_export *md_exp,
 			     struct lustre_md *md)
@@ -3378,7 +3378,8 @@ static int lmv_get_lustre_md(struct obd_export *exp,
 
 	if (!tgt || !tgt->ltd_exp)
 		return -EINVAL;
-	return md_get_lustre_md(tgt->ltd_exp, req, dt_exp, md_exp, md);
+
+	return md_get_lustre_md(tgt->ltd_exp, pill, dt_exp, md_exp, md);
 }
 
 static int lmv_free_lustre_md(struct obd_export *exp, struct lustre_md *md)
diff --git a/fs/lustre/mdc/mdc_acl.c b/fs/lustre/mdc/mdc_acl.c
index 6814045..8126390 100644
--- a/fs/lustre/mdc/mdc_acl.c
+++ b/fs/lustre/mdc/mdc_acl.c
@@ -24,9 +24,8 @@
 
 #include "mdc_internal.h"
 
-int mdc_unpack_acl(struct ptlrpc_request *req, struct lustre_md *md)
+int mdc_unpack_acl(struct req_capsule *pill, struct lustre_md *md)
 {
-	struct req_capsule *pill = &req->rq_pill;
 	struct mdt_body	*body = md->body;
 	struct posix_acl *acl;
 	void *buf;
diff --git a/fs/lustre/mdc/mdc_dev.c b/fs/lustre/mdc/mdc_dev.c
index 0db05b5..1c28f80 100644
--- a/fs/lustre/mdc/mdc_dev.c
+++ b/fs/lustre/mdc/mdc_dev.c
@@ -563,12 +563,12 @@ static int mdc_lock_upcall(void *cookie, struct lustre_handle *lockh,
 }
 
 /* This is needed only for old servers (before 2.14) support */
-int mdc_fill_lvb(struct ptlrpc_request *req, struct ost_lvb *lvb)
+int mdc_fill_lvb(struct req_capsule *pill, struct ost_lvb *lvb)
 {
 	struct mdt_body *body;
 
 	/* get LVB data from mdt_body otherwise */
-	body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
+	body = req_capsule_server_get(pill, &RMF_MDT_BODY);
 	if (!body)
 		return -EPROTO;
 
@@ -590,7 +590,7 @@ int mdc_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req,
 
 	/* needed only for glimpse from an old server (< 2.14) */
 	if (glimpse && !exp_connect_dom_lvb(exp))
-		rc = mdc_fill_lvb(req, &ols->ols_lvb);
+		rc = mdc_fill_lvb(&req->rq_pill, &ols->ols_lvb);
 
 	if (glimpse && errcode == ELDLM_LOCK_ABORTED) {
 		struct ldlm_reply *rep;
diff --git a/fs/lustre/mdc/mdc_internal.h b/fs/lustre/mdc/mdc_internal.h
index 06b0637..fab40bd 100644
--- a/fs/lustre/mdc/mdc_internal.h
+++ b/fs/lustre/mdc/mdc_internal.h
@@ -37,37 +37,37 @@
 
 int mdc_tunables_init(struct obd_device *obd);
 
-void mdc_pack_body(struct ptlrpc_request *req, const struct lu_fid *fid,
+void mdc_pack_body(struct req_capsule *pill, const struct lu_fid *fid,
 		   u64 valid, size_t ea_size, u32 suppgid, u32 flags);
-void mdc_swap_layouts_pack(struct ptlrpc_request *req,
+void mdc_swap_layouts_pack(struct req_capsule *pill,
 			   struct md_op_data *op_data);
-void mdc_readdir_pack(struct ptlrpc_request *req, u64 pgoff, size_t size,
+void mdc_readdir_pack(struct req_capsule *pill, u64 pgoff, size_t size,
 		      const struct lu_fid *fid);
-void mdc_getattr_pack(struct ptlrpc_request *req, u64 valid, u32 flags,
+void mdc_getattr_pack(struct req_capsule *pill, u64 valid, u32 flags,
 		      struct md_op_data *data, size_t ea_size);
-void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_setattr_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		      void *ea, size_t ealen);
-void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_create_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		     const void *data, size_t datalen, umode_t mode, uid_t uid,
 		     gid_t gid, kernel_cap_t capability, u64 rdev);
-void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_open_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		   umode_t mode, u64 rdev, u64 flags, const void *data,
 		   size_t datalen);
-void mdc_file_secctx_pack(struct ptlrpc_request *req,
+void mdc_file_secctx_pack(struct req_capsule *pill,
 			  const char *secctx_name,
 			  const void *secctx, size_t secctx_size);
-void mdc_file_encctx_pack(struct ptlrpc_request *req,
+void mdc_file_encctx_pack(struct req_capsule *pill,
 			  const void *encctx, size_t encctx_size);
-void mdc_file_sepol_pack(struct ptlrpc_request *req);
+void mdc_file_sepol_pack(struct req_capsule *pill);
 
-void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
-void mdc_link_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
-void mdc_rename_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_unlink_pack(struct req_capsule *pill, struct md_op_data *op_data);
+void mdc_link_pack(struct req_capsule *pill, struct md_op_data *op_data);
+void mdc_rename_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		     const char *old, size_t oldlen,
 		     const char *new, size_t newlen);
-void mdc_migrate_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_migrate_pack(struct req_capsule *pill, struct md_op_data *op_data,
 			const char *name, size_t namelen);
-void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data);
+void mdc_close_pack(struct req_capsule *pill, struct md_op_data *op_data);
 
 /* mdc/mdc_locks.c */
 int mdc_set_lock_data(struct obd_export *exp,
@@ -158,10 +158,10 @@ static inline int mdc_prep_elc_req(struct obd_export *exp,
 }
 
 #ifdef CONFIG_LUSTRE_FS_POSIX_ACL
-int mdc_unpack_acl(struct ptlrpc_request *req, struct lustre_md *md);
+int mdc_unpack_acl(struct req_capsule *pill, struct lustre_md *md);
 #else
 static inline
-int mdc_unpack_acl(struct ptlrpc_request *req, struct lustre_md *md)
+int mdc_unpack_acl(struct req_capsule *pill, struct lustre_md *md)
 {
 	return 0;
 }
@@ -190,7 +190,7 @@ static inline unsigned long hash_x_index(u64 hash, int hash64)
 int mdc_ldlm_blocking_ast(struct ldlm_lock *dlmlock,
 			  struct ldlm_lock_desc *new, void *data, int flag);
 int mdc_ldlm_glimpse_ast(struct ldlm_lock *dlmlock, void *data);
-int mdc_fill_lvb(struct ptlrpc_request *req, struct ost_lvb *lvb);
+int mdc_fill_lvb(struct req_capsule *pill, struct ost_lvb *lvb);
 
 /* the minimum inline repsize should be PAGE_SIZE at least */
 #define MDC_DOM_DEF_INLINE_REPSIZE max(8192UL, PAGE_SIZE)
diff --git a/fs/lustre/mdc/mdc_lib.c b/fs/lustre/mdc/mdc_lib.c
index 37fcb38..ccaa0f2 100644
--- a/fs/lustre/mdc/mdc_lib.c
+++ b/fs/lustre/mdc/mdc_lib.c
@@ -51,11 +51,10 @@ static void __mdc_pack_body(struct mdt_body *b, u32 suppgid)
 	b->mbo_capability = current_cap().cap[0];
 }
 
-void mdc_swap_layouts_pack(struct ptlrpc_request *req,
+void mdc_swap_layouts_pack(struct req_capsule *pill,
 			   struct md_op_data *op_data)
 {
-	struct mdt_body *b = req_capsule_client_get(&req->rq_pill,
-						    &RMF_MDT_BODY);
+	struct mdt_body *b = req_capsule_client_get(pill, &RMF_MDT_BODY);
 
 	__mdc_pack_body(b, op_data->op_suppgids[0]);
 	b->mbo_fid1 = op_data->op_fid1;
@@ -63,11 +62,11 @@ void mdc_swap_layouts_pack(struct ptlrpc_request *req,
 	b->mbo_valid |= OBD_MD_FLID;
 }
 
-void mdc_pack_body(struct ptlrpc_request *req, const struct lu_fid *fid,
+void mdc_pack_body(struct req_capsule *pill, const struct lu_fid *fid,
 		   u64 valid, size_t ea_size, u32 suppgid, u32 flags)
 {
-	struct mdt_body *b = req_capsule_client_get(&req->rq_pill,
-						    &RMF_MDT_BODY);
+	struct mdt_body *b = req_capsule_client_get(pill, &RMF_MDT_BODY);
+
 	b->mbo_valid = valid;
 	b->mbo_eadatasize = ea_size;
 	b->mbo_flags = flags;
@@ -81,7 +80,7 @@ void mdc_pack_body(struct ptlrpc_request *req, const struct lu_fid *fid,
 /**
  * Pack a name (path component) into a request
  *
- * @req:	request
+ * @pill:	request pill
  * @field:	request field (usually RMF_NAME)
  * @name:	path component
  * @name_len:	length of path component
@@ -91,7 +90,7 @@ void mdc_pack_body(struct ptlrpc_request *req, const struct lu_fid *fid,
  * @name must be '\0' terminated of length @name_len and represent
  * a single path component (not contain '/').
  */
-static void mdc_pack_name(struct ptlrpc_request *req,
+static void mdc_pack_name(struct req_capsule *pill,
 			  const struct req_msg_field *field,
 			  const char *name, size_t name_len)
 {
@@ -99,8 +98,8 @@ static void mdc_pack_name(struct ptlrpc_request *req,
 	size_t cpy_len;
 	char *buf;
 
-	buf = req_capsule_client_get(&req->rq_pill, field);
-	buf_size = req_capsule_get_size(&req->rq_pill, field, RCL_CLIENT);
+	buf = req_capsule_client_get(pill, field);
+	buf_size = req_capsule_get_size(pill, field, RCL_CLIENT);
 
 	LASSERT(name && name_len && buf && buf_size == name_len + 1);
 
@@ -108,12 +107,11 @@ static void mdc_pack_name(struct ptlrpc_request *req,
 
 	LASSERT(lu_name_is_valid_2(buf, cpy_len));
 	if (cpy_len != name_len)
-		CDEBUG(D_DENTRY, "%s: %s len %zd != %zd, concurrent rename?\n",
-		       req->rq_export->exp_obd->obd_name, buf, name_len,
-		       cpy_len);
+		CDEBUG(D_DENTRY, "%s len %zd != %zd, concurrent rename?\n",
+		       buf, name_len, cpy_len);
 }
 
-void mdc_file_secctx_pack(struct ptlrpc_request *req, const char *secctx_name,
+void mdc_file_secctx_pack(struct req_capsule *pill, const char *secctx_name,
 			  const void *secctx, size_t secctx_size)
 {
 	size_t buf_size;
@@ -122,22 +120,22 @@ void mdc_file_secctx_pack(struct ptlrpc_request *req, const char *secctx_name,
 	if (!secctx_name)
 		return;
 
-	buf = req_capsule_client_get(&req->rq_pill, &RMF_FILE_SECCTX_NAME);
-	buf_size = req_capsule_get_size(&req->rq_pill, &RMF_FILE_SECCTX_NAME,
+	buf = req_capsule_client_get(pill, &RMF_FILE_SECCTX_NAME);
+	buf_size = req_capsule_get_size(pill, &RMF_FILE_SECCTX_NAME,
 					RCL_CLIENT);
 
 	LASSERT(buf_size == strlen(secctx_name) + 1);
 	memcpy(buf, secctx_name, buf_size);
 
-	buf = req_capsule_client_get(&req->rq_pill, &RMF_FILE_SECCTX);
-	buf_size = req_capsule_get_size(&req->rq_pill, &RMF_FILE_SECCTX,
+	buf = req_capsule_client_get(pill, &RMF_FILE_SECCTX);
+	buf_size = req_capsule_get_size(pill, &RMF_FILE_SECCTX,
 					RCL_CLIENT);
 
 	LASSERT(buf_size == secctx_size);
 	memcpy(buf, secctx, buf_size);
 }
 
-void mdc_file_encctx_pack(struct ptlrpc_request *req,
+void mdc_file_encctx_pack(struct req_capsule *pill,
 			  const void *encctx, size_t encctx_size)
 {
 	void *buf;
@@ -146,35 +144,36 @@ void mdc_file_encctx_pack(struct ptlrpc_request *req,
 	if (!encctx)
 		return;
 
-	buf = req_capsule_client_get(&req->rq_pill, &RMF_FILE_ENCCTX);
-	buf_size = req_capsule_get_size(&req->rq_pill, &RMF_FILE_ENCCTX,
+	buf = req_capsule_client_get(pill, &RMF_FILE_ENCCTX);
+	buf_size = req_capsule_get_size(pill, &RMF_FILE_ENCCTX,
 					RCL_CLIENT);
 
 	LASSERT(buf_size == encctx_size);
 	memcpy(buf, encctx, buf_size);
 }
 
-void mdc_file_sepol_pack(struct ptlrpc_request *req)
+void mdc_file_sepol_pack(struct req_capsule *pill)
 {
 	void *buf;
 	size_t buf_size;
+	struct ptlrpc_request *req = pill->rc_req;
 
 	if (strlen(req->rq_sepol) == 0)
 		return;
 
-	buf = req_capsule_client_get(&req->rq_pill, &RMF_SELINUX_POL);
-	buf_size = req_capsule_get_size(&req->rq_pill, &RMF_SELINUX_POL,
+	buf = req_capsule_client_get(pill, &RMF_SELINUX_POL);
+	buf_size = req_capsule_get_size(pill, &RMF_SELINUX_POL,
 					RCL_CLIENT);
 
 	LASSERT(buf_size == strlen(req->rq_sepol) + 1);
 	snprintf(buf, strlen(req->rq_sepol) + 1, "%s", req->rq_sepol);
 }
 
-void mdc_readdir_pack(struct ptlrpc_request *req, u64 pgoff, size_t size,
+void mdc_readdir_pack(struct req_capsule *pill, u64 pgoff, size_t size,
 		      const struct lu_fid *fid)
 {
-	struct mdt_body *b = req_capsule_client_get(&req->rq_pill,
-						    &RMF_MDT_BODY);
+	struct mdt_body *b = req_capsule_client_get(pill, &RMF_MDT_BODY);
+
 	b->mbo_fid1 = *fid;
 	b->mbo_valid |= OBD_MD_FLID;
 	b->mbo_size = pgoff;			/* !! */
@@ -184,7 +183,7 @@ void mdc_readdir_pack(struct ptlrpc_request *req, u64 pgoff, size_t size,
 }
 
 /* packing of MDS records */
-void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_create_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		     const void *data, size_t datalen, umode_t mode,
 		     uid_t uid, gid_t gid, kernel_cap_t cap_effective,
 		     u64 rdev)
@@ -195,7 +194,7 @@ void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 
 	BUILD_BUG_ON(sizeof(struct mdt_rec_reint) !=
 		     sizeof(struct mdt_rec_create));
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 
 	rec->cr_opcode = REINT_CREATE;
 	rec->cr_fsuid = uid;
@@ -220,21 +219,21 @@ void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec->cr_bias = op_data->op_bias;
 	rec->cr_umask = current_umask();
 
-	mdc_pack_name(req, &RMF_NAME, op_data->op_name, op_data->op_namelen);
+	mdc_pack_name(pill, &RMF_NAME, op_data->op_name, op_data->op_namelen);
 	if (data) {
-		tmp = req_capsule_client_get(&req->rq_pill, &RMF_EADATA);
+		tmp = req_capsule_client_get(pill, &RMF_EADATA);
 		memcpy(tmp, data, datalen);
 	}
 
-	mdc_file_secctx_pack(req, op_data->op_file_secctx_name,
+	mdc_file_secctx_pack(pill, op_data->op_file_secctx_name,
 			     op_data->op_file_secctx,
 			     op_data->op_file_secctx_size);
 
-	mdc_file_encctx_pack(req, op_data->op_file_encctx,
+	mdc_file_encctx_pack(pill, op_data->op_file_encctx,
 			     op_data->op_file_encctx_size);
 
 	/* pack SELinux policy info if any */
-	mdc_file_sepol_pack(req);
+	mdc_file_sepol_pack(pill);
 }
 
 static inline u64 mds_pack_open_flags(u64 flags)
@@ -269,7 +268,7 @@ static inline u64 mds_pack_open_flags(u64 flags)
 }
 
 /* packing of MDS records */
-void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_open_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		   umode_t mode, u64 rdev, u64 flags, const void *lmm,
 		   size_t lmmlen)
 {
@@ -279,7 +278,7 @@ void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 
 	BUILD_BUG_ON(sizeof(struct mdt_rec_reint) !=
 		     sizeof(struct mdt_rec_create));
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 
 	/* XXX do something about time, uid, gid */
 	rec->cr_opcode = REINT_OPEN;
@@ -300,26 +299,26 @@ void mdc_open_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec->cr_open_handle_old = op_data->op_open_handle;
 
 	if (op_data->op_name) {
-		mdc_pack_name(req, &RMF_NAME, op_data->op_name,
+		mdc_pack_name(pill, &RMF_NAME, op_data->op_name,
 			      op_data->op_namelen);
 
 		if (op_data->op_bias & MDS_CREATE_VOLATILE)
 			cr_flags |= MDS_OPEN_VOLATILE;
 
-		mdc_file_secctx_pack(req, op_data->op_file_secctx_name,
+		mdc_file_secctx_pack(pill, op_data->op_file_secctx_name,
 				     op_data->op_file_secctx,
 				     op_data->op_file_secctx_size);
 
-		mdc_file_encctx_pack(req, op_data->op_file_encctx,
+		mdc_file_encctx_pack(pill, op_data->op_file_encctx,
 				     op_data->op_file_encctx_size);
 
 		/* pack SELinux policy info if any */
-		mdc_file_sepol_pack(req);
+		mdc_file_sepol_pack(pill);
 	}
 
 	if (lmm) {
 		cr_flags |= MDS_OPEN_HAS_EA;
-		tmp = req_capsule_client_get(&req->rq_pill, &RMF_EADATA);
+		tmp = req_capsule_client_get(pill, &RMF_EADATA);
 		memcpy(tmp, lmm, lmmlen);
 		if (cr_flags & MDS_OPEN_PCC) {
 			LASSERT(op_data);
@@ -420,7 +419,7 @@ static void mdc_ioepoch_pack(struct mdt_ioepoch *epoch,
 	epoch->mio_padding = 0;
 }
 
-void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_setattr_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		      void *ea, size_t ealen)
 {
 	struct mdt_rec_setattr *rec;
@@ -428,13 +427,13 @@ void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 
 	BUILD_BUG_ON(sizeof(struct mdt_rec_reint) !=
 					sizeof(struct mdt_rec_setattr));
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 	mdc_setattr_pack_rec(rec, op_data);
 
 	if (ealen == 0)
 		return;
 
-	lum = req_capsule_client_get(&req->rq_pill, &RMF_EADATA);
+	lum = req_capsule_client_get(pill, &RMF_EADATA);
 	if (!ea) { /* Remove LOV EA */
 		lum->lmm_magic = cpu_to_le32(LOV_USER_MAGIC_V1);
 		lum->lmm_stripe_size = 0;
@@ -446,13 +445,13 @@ void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	}
 }
 
-void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
+void mdc_unlink_pack(struct req_capsule *pill, struct md_op_data *op_data)
 {
 	struct mdt_rec_unlink *rec;
 
 	BUILD_BUG_ON(sizeof(struct mdt_rec_reint) !=
 		     sizeof(struct mdt_rec_unlink));
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 
 	rec->ul_opcode = op_data->op_cli_flags & CLI_RM_ENTRY ?
 			 REINT_RMENTRY : REINT_UNLINK;
@@ -467,19 +466,19 @@ void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 	rec->ul_time = op_data->op_mod_time;
 	rec->ul_bias = op_data->op_bias;
 
-	mdc_pack_name(req, &RMF_NAME, op_data->op_name, op_data->op_namelen);
+	mdc_pack_name(pill, &RMF_NAME, op_data->op_name, op_data->op_namelen);
 
 	/* pack SELinux policy info if any */
-	mdc_file_sepol_pack(req);
+	mdc_file_sepol_pack(pill);
 }
 
-void mdc_link_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
+void mdc_link_pack(struct req_capsule *pill, struct md_op_data *op_data)
 {
 	struct mdt_rec_link *rec;
 
 	BUILD_BUG_ON(sizeof(struct mdt_rec_reint) !=
 		     sizeof(struct mdt_rec_link));
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 
 	rec->lk_opcode = REINT_LINK;
 	rec->lk_fsuid = op_data->op_fsuid; /* current->fsuid; */
@@ -492,13 +491,13 @@ void mdc_link_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 	rec->lk_time = op_data->op_mod_time;
 	rec->lk_bias = op_data->op_bias;
 
-	mdc_pack_name(req, &RMF_NAME, op_data->op_name, op_data->op_namelen);
+	mdc_pack_name(pill, &RMF_NAME, op_data->op_name, op_data->op_namelen);
 
 	/* pack SELinux policy info if any */
-	mdc_file_sepol_pack(req);
+	mdc_file_sepol_pack(pill);
 }
 
-static void mdc_close_intent_pack(struct ptlrpc_request *req,
+static void mdc_close_intent_pack(struct req_capsule *pill,
 				  struct md_op_data *op_data)
 {
 	enum mds_op_bias bias = op_data->op_bias;
@@ -508,7 +507,7 @@ static void mdc_close_intent_pack(struct ptlrpc_request *req,
 	if (!(bias & (MDS_CLOSE_INTENT | MDS_CLOSE_MIGRATE)))
 		return;
 
-	data = req_capsule_client_get(&req->rq_pill, &RMF_CLOSE_DATA);
+	data = req_capsule_client_get(pill, &RMF_CLOSE_DATA);
 	LASSERT(data);
 
 	lock = ldlm_handle2lock(&op_data->op_lease_handle);
@@ -534,7 +533,7 @@ static void mdc_close_intent_pack(struct ptlrpc_request *req,
 		} else {
 			size_t count = sync->resync_count;
 
-			memcpy(req_capsule_client_get(&req->rq_pill, &RMF_U32),
+			memcpy(req_capsule_client_get(pill, &RMF_U32),
 				op_data->op_data, count * sizeof(u32));
 		}
 	} else if (bias & MDS_PCC_ATTACH) {
@@ -542,7 +541,7 @@ static void mdc_close_intent_pack(struct ptlrpc_request *req,
 	}
 }
 
-void mdc_rename_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_rename_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		     const char *old, size_t oldlen,
 		     const char *new, size_t newlen)
 {
@@ -550,7 +549,7 @@ void mdc_rename_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 
 	BUILD_BUG_ON(sizeof(struct mdt_rec_reint) !=
 		     sizeof(struct mdt_rec_rename));
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 
 	/* XXX do something about time, uid, gid */
 	rec->rn_opcode = REINT_RENAME;
@@ -565,16 +564,16 @@ void mdc_rename_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec->rn_mode = op_data->op_mode;
 	rec->rn_bias = op_data->op_bias;
 
-	mdc_pack_name(req, &RMF_NAME, old, oldlen);
+	mdc_pack_name(pill, &RMF_NAME, old, oldlen);
 
 	if (new)
-		mdc_pack_name(req, &RMF_SYMTGT, new, newlen);
+		mdc_pack_name(pill, &RMF_SYMTGT, new, newlen);
 
 	/* pack SELinux policy info if any */
-	mdc_file_sepol_pack(req);
+	mdc_file_sepol_pack(pill);
 }
 
-void mdc_migrate_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
+void mdc_migrate_pack(struct req_capsule *pill, struct md_op_data *op_data,
 		      const char *name, size_t namelen)
 {
 	struct mdt_rec_rename *rec;
@@ -582,7 +581,7 @@ void mdc_migrate_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 
 	BUILD_BUG_ON(sizeof(struct mdt_rec_reint) !=
 		     sizeof(struct mdt_rec_rename));
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 
 	rec->rn_opcode	 = REINT_MIGRATE;
 	rec->rn_fsuid	 = op_data->op_fsuid;
@@ -596,25 +595,24 @@ void mdc_migrate_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec->rn_mode	 = op_data->op_mode;
 	rec->rn_bias	 = op_data->op_bias;
 
-	mdc_pack_name(req, &RMF_NAME, name, namelen);
+	mdc_pack_name(pill, &RMF_NAME, name, namelen);
 
 	if (op_data->op_bias & MDS_CLOSE_MIGRATE) {
 		struct mdt_ioepoch *epoch;
 
-		mdc_close_intent_pack(req, op_data);
-		epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
+		mdc_close_intent_pack(pill, op_data);
+		epoch = req_capsule_client_get(pill, &RMF_MDT_EPOCH);
 		mdc_ioepoch_pack(epoch, op_data);
 	}
 
-	ea = req_capsule_client_get(&req->rq_pill, &RMF_EADATA);
+	ea = req_capsule_client_get(pill, &RMF_EADATA);
 	memcpy(ea, op_data->op_data, op_data->op_data_size);
 }
 
-void mdc_getattr_pack(struct ptlrpc_request *req, u64 valid, u32 flags,
+void mdc_getattr_pack(struct req_capsule *pill, u64 valid, u32 flags,
 		      struct md_op_data *op_data, size_t ea_size)
 {
-	struct mdt_body *b = req_capsule_client_get(&req->rq_pill,
-						    &RMF_MDT_BODY);
+	struct mdt_body *b = req_capsule_client_get(pill, &RMF_MDT_BODY);
 
 	b->mbo_valid = valid;
 	if (op_data->op_bias & MDS_CROSS_REF)
@@ -628,17 +626,17 @@ void mdc_getattr_pack(struct ptlrpc_request *req, u64 valid, u32 flags,
 	b->mbo_valid |= OBD_MD_FLID;
 
 	if (op_data->op_name)
-		mdc_pack_name(req, &RMF_NAME, op_data->op_name,
+		mdc_pack_name(pill, &RMF_NAME, op_data->op_name,
 			      op_data->op_namelen);
 }
 
-void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
+void mdc_close_pack(struct req_capsule *pill, struct md_op_data *op_data)
 {
 	struct mdt_ioepoch *epoch;
 	struct mdt_rec_setattr *rec;
 
-	epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
-	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
+	epoch = req_capsule_client_get(pill, &RMF_MDT_EPOCH);
+	rec = req_capsule_client_get(pill, &RMF_REC_REINT);
 
 	mdc_setattr_pack_rec(rec, op_data);
 	/*
@@ -654,5 +652,5 @@ void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 		rec->sa_valid &= ~MDS_ATTR_ATIME;
 
 	mdc_ioepoch_pack(epoch, op_data);
-	mdc_close_intent_pack(req, op_data);
+	mdc_close_intent_pack(pill, op_data);
 }
diff --git a/fs/lustre/mdc/mdc_locks.c b/fs/lustre/mdc/mdc_locks.c
index 5373ec9..4135c3a 100644
--- a/fs/lustre/mdc/mdc_locks.c
+++ b/fs/lustre/mdc/mdc_locks.c
@@ -348,8 +348,8 @@ static int mdc_save_lovea(struct ptlrpc_request *req, void *data, u32 size)
 	lit->opc = (u64)it->it_op;
 
 	/* pack the intended request */
-	mdc_open_pack(req, op_data, it->it_create_mode, 0, it->it_flags, lmm,
-		      lmmsize);
+	mdc_open_pack(&req->rq_pill, op_data, it->it_create_mode, 0,
+		      it->it_flags, lmm, lmmsize);
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     mdt_md_capsule_size);
@@ -487,11 +487,11 @@ static int mdc_save_lovea(struct ptlrpc_request *req, void *data, u32 size)
 					 exp->exp_connect_data.ocd_max_easize);
 
 	/* pack the intended request */
-	mdc_pack_body(req, &op_data->op_fid1, op_data->op_valid,
+	mdc_pack_body(&req->rq_pill, &op_data->op_fid1, op_data->op_valid,
 		      ea_vals_buf_size, -1, 0);
 
 	/* get SELinux policy info if any */
-	mdc_file_sepol_pack(req);
+	mdc_file_sepol_pack(&req->rq_pill);
 
 	req_capsule_set_size(&req->rq_pill, &RMF_EADATA, RCL_SERVER,
 			     GA_DEFAULT_EA_NAME_LEN * GA_DEFAULT_EA_NUM);
@@ -559,7 +559,7 @@ static int mdc_save_lovea(struct ptlrpc_request *req, void *data, u32 size)
 		easize = obd->u.cli.cl_max_mds_easize;
 
 	/* pack the intended request */
-	mdc_getattr_pack(req, valid, it->it_flags, op_data, easize);
+	mdc_getattr_pack(&req->rq_pill, valid, it->it_flags, op_data, easize);
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER, easize);
 	req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER, acl_bufsize);
diff --git a/fs/lustre/mdc/mdc_reint.c b/fs/lustre/mdc/mdc_reint.c
index 786b23d..3f4e28a 100644
--- a/fs/lustre/mdc/mdc_reint.c
+++ b/fs/lustre/mdc/mdc_reint.c
@@ -139,7 +139,7 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 		CDEBUG(D_INODE, "setting mtime %lld, ctime %lld\n",
 		       op_data->op_attr.ia_mtime.tv_sec,
 		       op_data->op_attr.ia_ctime.tv_sec);
-	mdc_setattr_pack(req, op_data, ea, ealen);
+	mdc_setattr_pack(&req->rq_pill, op_data, ea, ealen);
 
 	req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER, 0);
 
@@ -227,7 +227,7 @@ int mdc_create(struct obd_export *exp, struct md_op_data *op_data,
 	 * mdc_create_pack() fills msg->bufs[1] with name and msg->bufs[2] with
 	 * tgt, for symlinks or lov MD data.
 	 */
-	mdc_create_pack(req, op_data, data, datalen, mode, uid,
+	mdc_create_pack(&req->rq_pill, op_data, data, datalen, mode, uid,
 			gid, cap_effective, rdev);
 
 	ptlrpc_request_set_replen(req);
@@ -325,7 +325,7 @@ int mdc_unlink(struct obd_export *exp, struct md_op_data *op_data,
 		return rc;
 	}
 
-	mdc_unlink_pack(req, op_data);
+	mdc_unlink_pack(&req->rq_pill, op_data);
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
@@ -381,7 +381,7 @@ int mdc_link(struct obd_export *exp, struct md_op_data *op_data,
 		return rc;
 	}
 
-	mdc_link_pack(req, op_data);
+	mdc_link_pack(&req->rq_pill, op_data);
 	ptlrpc_request_set_replen(req);
 
 	rc = mdc_reint(req, LUSTRE_IMP_FULL);
@@ -457,9 +457,10 @@ int mdc_rename(struct obd_export *exp, struct md_op_data *op_data,
 		ldlm_cli_cancel_list(&cancels, count, req, 0);
 
 	if (op_data->op_cli_flags & CLI_MIGRATE)
-		mdc_migrate_pack(req, op_data, old, oldlen);
+		mdc_migrate_pack(&req->rq_pill, op_data, old, oldlen);
 	else
-		mdc_rename_pack(req, op_data, old, oldlen, new, newlen);
+		mdc_rename_pack(&req->rq_pill, op_data, old, oldlen,
+				new, newlen);
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c
index 7df2c59..1fb9c46 100644
--- a/fs/lustre/mdc/mdc_request.c
+++ b/fs/lustre/mdc/mdc_request.c
@@ -116,7 +116,7 @@ static int mdc_get_root(struct obd_export *exp, const char *fileset,
 		ptlrpc_request_free(req);
 		return rc;
 	}
-	mdc_pack_body(req, NULL, 0, 0, -1, 0);
+	mdc_pack_body(&req->rq_pill, NULL, 0, 0, -1, 0);
 	if (fileset) {
 		char *name = req_capsule_client_get(&req->rq_pill, &RMF_NAME);
 
@@ -225,7 +225,7 @@ static int mdc_getattr(struct obd_export *exp, struct md_op_data *op_data,
 	}
 
 again:
-	mdc_pack_body(req, &op_data->op_fid1, op_data->op_valid,
+	mdc_pack_body(&req->rq_pill, &op_data->op_fid1, op_data->op_valid,
 		      op_data->op_mode, -1, 0);
 	req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER, acl_bufsize);
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
@@ -281,7 +281,7 @@ static int mdc_getattr_name(struct obd_export *exp, struct md_op_data *op_data,
 	}
 
 again:
-	mdc_pack_body(req, &op_data->op_fid1, op_data->op_valid,
+	mdc_pack_body(&req->rq_pill, &op_data->op_fid1, op_data->op_valid,
 		      op_data->op_mode, op_data->op_suppgids[0], 0);
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     op_data->op_mode);
@@ -391,7 +391,8 @@ static int mdc_xattr_common(struct obd_export *exp,
 		rec->sx_flags = flags;
 
 	} else {
-		mdc_pack_body(req, fid, valid, output_size, suppgid, flags);
+		mdc_pack_body(&req->rq_pill, fid, valid, output_size,
+			      suppgid, flags);
 	}
 
 	if (xattr_name) {
@@ -403,7 +404,7 @@ static int mdc_xattr_common(struct obd_export *exp,
 		memcpy(tmp, input, input_size);
 	}
 
-	mdc_file_sepol_pack(req);
+	mdc_file_sepol_pack(&req->rq_pill);
 
 	if (req_capsule_has_field(&req->rq_pill, &RMF_EADATA, RCL_SERVER))
 		req_capsule_set_size(&req->rq_pill, &RMF_EADATA,
@@ -510,13 +511,11 @@ static int mdc_getxattr(struct obd_export *exp, const struct lu_fid *fid,
 }
 
 
-static int mdc_get_lustre_md(struct obd_export *exp,
-			     struct ptlrpc_request *req,
+static int mdc_get_lustre_md(struct obd_export *exp, struct req_capsule *pill,
 			     struct obd_export *dt_exp,
 			     struct obd_export *md_exp,
 			     struct lustre_md *md)
 {
-	struct req_capsule *pill = &req->rq_pill;
 	int rc;
 
 	LASSERT(md);
@@ -624,7 +623,7 @@ static int mdc_get_lustre_md(struct obd_export *exp,
 	 * in reply buffer.
 	 */
 	if (md->body->mbo_valid & OBD_MD_FLACL)
-		rc = mdc_unpack_acl(req, md);
+		rc = mdc_unpack_acl(pill, md);
 
 out:
 	if (rc)
@@ -940,7 +939,7 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 		op_data->op_xvalid &= ~(OP_XVALID_LAZYSIZE |
 					OP_XVALID_LAZYBLOCKS);
 
-	mdc_close_pack(req, op_data);
+	mdc_close_pack(&req->rq_pill, op_data);
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
@@ -1034,7 +1033,7 @@ static int mdc_getpage(struct obd_export *exp, const struct lu_fid *fid,
 	for (i = 0; i < npages; i++)
 		desc->bd_frag_ops->add_kiov_frag(desc, pages[i], 0, PAGE_SIZE);
 
-	mdc_readdir_pack(req, offset, PAGE_SIZE * npages, fid);
+	mdc_readdir_pack(&req->rq_pill, offset, PAGE_SIZE * npages, fid);
 
 	ptlrpc_request_set_replen(req);
 	rc = ptlrpc_queue_wait(req);
@@ -1727,7 +1726,7 @@ static int mdc_ioc_hsm_progress(struct obd_export *exp,
 		goto out;
 	}
 
-	mdc_pack_body(req, NULL, 0, 0, -1, 0);
+	mdc_pack_body(&req->rq_pill, NULL, 0, 0, -1, 0);
 
 	/* Copy hsm_progress struct */
 	req_hpk = req_capsule_client_get(&req->rq_pill, &RMF_MDS_HSM_PROGRESS);
@@ -1786,7 +1785,7 @@ static int mdc_ioc_hsm_ct_register(struct obd_import *imp, u32 archive_count,
 		return -ENOMEM;
 	}
 
-	mdc_pack_body(req, NULL, 0, 0, -1, 0);
+	mdc_pack_body(&req->rq_pill, NULL, 0, 0, -1, 0);
 
 	archive_array = req_capsule_client_get(&req->rq_pill,
 					       &RMF_MDS_HSM_ARCHIVE);
@@ -1828,7 +1827,7 @@ static int mdc_ioc_hsm_current_action(struct obd_export *exp,
 		return rc;
 	}
 
-	mdc_pack_body(req, &op_data->op_fid1, 0, 0,
+	mdc_pack_body(&req->rq_pill, &op_data->op_fid1, 0, 0,
 		      op_data->op_suppgids[0], 0);
 
 	ptlrpc_request_set_replen(req);
@@ -1864,7 +1863,7 @@ static int mdc_ioc_hsm_ct_unregister(struct obd_import *imp)
 		goto out;
 	}
 
-	mdc_pack_body(req, NULL, 0, 0, -1, 0);
+	mdc_pack_body(&req->rq_pill, NULL, 0, 0, -1, 0);
 
 	ptlrpc_request_set_replen(req);
 
@@ -1893,7 +1892,7 @@ static int mdc_ioc_hsm_state_get(struct obd_export *exp,
 		return rc;
 	}
 
-	mdc_pack_body(req, &op_data->op_fid1, 0, 0,
+	mdc_pack_body(&req->rq_pill, &op_data->op_fid1, 0, 0,
 		      op_data->op_suppgids[0], 0);
 
 	ptlrpc_request_set_replen(req);
@@ -1934,7 +1933,7 @@ static int mdc_ioc_hsm_state_set(struct obd_export *exp,
 		return rc;
 	}
 
-	mdc_pack_body(req, &op_data->op_fid1, 0, 0,
+	mdc_pack_body(&req->rq_pill, &op_data->op_fid1, 0, 0,
 		      op_data->op_suppgids[0], 0);
 
 	/* Copy states */
@@ -1983,7 +1982,7 @@ static int mdc_ioc_hsm_request(struct obd_export *exp,
 		return rc;
 	}
 
-	mdc_pack_body(req, NULL, 0, 0, -1, 0);
+	mdc_pack_body(&req->rq_pill, NULL, 0, 0, -1, 0);
 
 	/* Copy hsm_request struct */
 	req_hr = req_capsule_client_get(&req->rq_pill, &RMF_MDS_HSM_REQUEST);
@@ -2115,7 +2114,7 @@ static int mdc_ioc_swap_layouts(struct obd_export *exp,
 		return rc;
 	}
 
-	mdc_swap_layouts_pack(req, op_data);
+	mdc_swap_layouts_pack(&req->rq_pill, op_data);
 
 	payload = req_capsule_client_get(&req->rq_pill, &RMF_SWAP_LAYOUTS);
 	LASSERT(payload);
@@ -2308,7 +2307,7 @@ static int mdc_get_info_rpc(struct obd_export *exp,
 	if (rc == 0 || rc == -EREMOTE) {
 		tmp = req_capsule_server_get(&req->rq_pill, &RMF_GETINFO_VAL);
 		memcpy(val, tmp, vallen);
-		if (ptlrpc_rep_need_swab(req)) {
+		if (req_capsule_rep_need_swab(&req->rq_pill)) {
 			if (KEY_IS(KEY_FID2PATH))
 				lustre_swab_fid2path(val);
 		}
@@ -2560,7 +2559,7 @@ static int mdc_fsync(struct obd_export *exp, const struct lu_fid *fid,
 		return rc;
 	}
 
-	mdc_pack_body(req, fid, 0, 0, -1, 0);
+	mdc_pack_body(&req->rq_pill, fid, 0, 0, -1, 0);
 
 	ptlrpc_request_set_replen(req);
 
@@ -2627,7 +2626,7 @@ static int mdc_rmfid(struct obd_export *exp, struct fid_array *fa,
 	tmp = req_capsule_client_get(&req->rq_pill, &RMF_FID_ARRAY);
 	memcpy(tmp, fa->fa_fids, flen);
 
-	mdc_pack_body(req, NULL, 0, 0, -1, 0);
+	mdc_pack_body(&req->rq_pill, NULL, 0, 0, -1, 0);
 	b = req_capsule_client_get(&req->rq_pill, &RMF_MDT_BODY);
 	b->mbo_ctime = ktime_get_real_seconds();
 
diff --git a/fs/lustre/mgc/mgc_request.c b/fs/lustre/mgc/mgc_request.c
index 5ea965c..1dfc74b 100644
--- a/fs/lustre/mgc/mgc_request.c
+++ b/fs/lustre/mgc/mgc_request.c
@@ -1442,7 +1442,7 @@ static int mgc_process_recover_log(struct obd_device *obd,
 		goto out;
 	}
 
-	mne_swab = ptlrpc_rep_need_swab(req);
+	mne_swab = req_capsule_rep_need_swab(&req->rq_pill);
 
 	for (i = 0; i < nrpages && ealen > 0; i++) {
 		int rc2;
diff --git a/fs/lustre/ptlrpc/layout.c b/fs/lustre/ptlrpc/layout.c
index 8bbe68b..836b2a2 100644
--- a/fs/lustre/ptlrpc/layout.c
+++ b/fs/lustre/ptlrpc/layout.c
@@ -1756,7 +1756,7 @@ void req_capsule_init(struct req_capsule *pill,
 	if (req && pill == &req->rq_pill && req->rq_pill_init)
 		return;
 
-	memset(pill, 0, sizeof(*pill));
+	pill->rc_fmt = NULL;
 	pill->rc_req = req;
 	pill->rc_loc = location;
 	req_capsule_init_area(pill);
@@ -1780,10 +1780,7 @@ static int __req_format_is_sane(const struct req_format *fmt)
 static struct lustre_msg *__req_msg(const struct req_capsule *pill,
 				    enum req_location loc)
 {
-	struct ptlrpc_request *req;
-
-	req = pill->rc_req;
-	return loc == RCL_CLIENT ? req->rq_reqmsg : req->rq_repmsg;
+	return loc == RCL_CLIENT ? pill->rc_reqmsg : pill->rc_repmsg;
 }
 
 /**
@@ -1881,6 +1878,26 @@ u32 __req_capsule_offset(const struct req_capsule *pill,
 	return offset;
 }
 
+void req_capsule_set_swabbed(struct req_capsule *pill, enum req_location loc,
+			    u32 index)
+{
+	if (loc == RCL_CLIENT)
+		req_capsule_set_req_swabbed(pill, index);
+	else
+		req_capsule_set_rep_swabbed(pill, index);
+}
+
+bool req_capsule_need_swab(struct req_capsule *pill, enum req_location loc,
+			   u32 index)
+{
+	if (loc == RCL_CLIENT)
+		return (req_capsule_req_need_swab(pill) &&
+			!req_capsule_req_swabbed(pill, index));
+
+	return (req_capsule_rep_need_swab(pill) &&
+	       !req_capsule_rep_swabbed(pill, index));
+}
+
 /**
  * Helper for __req_capsule_get(); swabs value / array of values and/or dumps
  * them if desired.
@@ -1898,12 +1915,11 @@ u32 __req_capsule_offset(const struct req_capsule *pill,
 	int size;
 	int rc = 0;
 	bool do_swab;
-	bool inout = loc == RCL_CLIENT;
 	bool array = field->rmf_flags & RMF_F_STRUCT_ARRAY;
 
 	swabber = swabber ?: field->rmf_swabber;
 
-	if (ptlrpc_buf_need_swab(pill->rc_req, inout, offset) &&
+	if (req_capsule_need_swab(pill, loc, offset) &&
 	    (swabber || field->rmf_swab_len) && value)
 		do_swab = true;
 	else
@@ -1968,7 +1984,7 @@ u32 __req_capsule_offset(const struct req_capsule *pill,
 		}
 	}
 	if (do_swab)
-		ptlrpc_buf_set_swabbed(pill->rc_req, inout, offset);
+		req_capsule_set_swabbed(pill, loc, offset);
 
 	return 0;
 }
diff --git a/fs/lustre/ptlrpc/pack_generic.c b/fs/lustre/ptlrpc/pack_generic.c
index 133202d..6710e6b 100644
--- a/fs/lustre/ptlrpc/pack_generic.c
+++ b/fs/lustre/ptlrpc/pack_generic.c
@@ -72,25 +72,6 @@ u32 lustre_msg_hdr_size(u32 magic, u32 count)
 	}
 }
 
-void ptlrpc_buf_set_swabbed(struct ptlrpc_request *req, const int inout,
-			    u32 index)
-{
-	if (inout)
-		lustre_set_req_swabbed(req, index);
-	else
-		lustre_set_rep_swabbed(req, index);
-}
-
-bool ptlrpc_buf_need_swab(struct ptlrpc_request *req, const int inout,
-			  u32 index)
-{
-	if (inout)
-		return (ptlrpc_req_need_swab(req) &&
-			!lustre_req_swabbed(req, index));
-
-	return (ptlrpc_rep_need_swab(req) && !lustre_rep_swabbed(req, index));
-}
-
 /* early reply size */
 u32 lustre_msg_early_size(void)
 {
@@ -576,7 +557,8 @@ int ptlrpc_unpack_req_msg(struct ptlrpc_request *req, int len)
 
 	rc = __lustre_unpack_msg(req->rq_reqmsg, len);
 	if (rc == 1) {
-		lustre_set_req_swabbed(req, MSG_PTLRPC_HEADER_OFF);
+		req_capsule_set_req_swabbed(&req->rq_pill,
+					    MSG_PTLRPC_HEADER_OFF);
 		rc = 0;
 	}
 	return rc;
@@ -588,26 +570,30 @@ int ptlrpc_unpack_rep_msg(struct ptlrpc_request *req, int len)
 
 	rc = __lustre_unpack_msg(req->rq_repmsg, len);
 	if (rc == 1) {
-		lustre_set_rep_swabbed(req, MSG_PTLRPC_HEADER_OFF);
+		req_capsule_set_rep_swabbed(&req->rq_pill,
+					    MSG_PTLRPC_HEADER_OFF);
 		rc = 0;
 	}
 	return rc;
 }
 
-static inline int lustre_unpack_ptlrpc_body_v2(struct ptlrpc_request *req,
-					       const int inout, int offset)
+static inline int
+lustre_unpack_ptlrpc_body_v2(struct ptlrpc_request *req,
+			     enum req_location loc, int offset)
 {
 	struct ptlrpc_body *pb;
-	struct lustre_msg_v2 *m = inout ? req->rq_reqmsg : req->rq_repmsg;
+	struct lustre_msg_v2 *m;
+
+	m = loc == RCL_CLIENT ? req->rq_reqmsg : req->rq_repmsg;
 
 	pb = lustre_msg_buf_v2(m, offset, sizeof(struct ptlrpc_body_v2));
 	if (!pb) {
 		CERROR("error unpacking ptlrpc body\n");
 		return -EFAULT;
 	}
-	if (ptlrpc_buf_need_swab(req, inout, offset)) {
+	if (req_capsule_need_swab(&req->rq_pill, loc, offset)) {
 		lustre_swab_ptlrpc_body(pb);
-		ptlrpc_buf_set_swabbed(req, inout, offset);
+		req_capsule_set_swabbed(&req->rq_pill, loc, offset);
 	}
 
 	if ((pb->pb_version & ~LUSTRE_VERSION_MASK) != PTLRPC_MSG_VERSION) {
@@ -615,7 +601,7 @@ static inline int lustre_unpack_ptlrpc_body_v2(struct ptlrpc_request *req,
 		return -EINVAL;
 	}
 
-	if (!inout)
+	if (loc == RCL_SERVER)
 		pb->pb_status = ptlrpc_status_ntoh(pb->pb_status);
 
 	return 0;
@@ -625,7 +611,7 @@ int lustre_unpack_req_ptlrpc_body(struct ptlrpc_request *req, int offset)
 {
 	switch (req->rq_reqmsg->lm_magic) {
 	case LUSTRE_MSG_MAGIC_V2:
-		return lustre_unpack_ptlrpc_body_v2(req, 1, offset);
+		return lustre_unpack_ptlrpc_body_v2(req, RCL_CLIENT, offset);
 	default:
 		CERROR("bad lustre msg magic: %08x\n",
 		       req->rq_reqmsg->lm_magic);
@@ -637,7 +623,7 @@ int lustre_unpack_rep_ptlrpc_body(struct ptlrpc_request *req, int offset)
 {
 	switch (req->rq_repmsg->lm_magic) {
 	case LUSTRE_MSG_MAGIC_V2:
-		return lustre_unpack_ptlrpc_body_v2(req, 0, offset);
+		return lustre_unpack_ptlrpc_body_v2(req, RCL_SERVER, offset);
 	default:
 		CERROR("bad lustre msg magic: %08x\n",
 		       req->rq_repmsg->lm_magic);
@@ -2454,7 +2440,8 @@ static inline int req_ptlrpc_body_swabbed(struct ptlrpc_request *req)
 
 	switch (req->rq_reqmsg->lm_magic) {
 	case LUSTRE_MSG_MAGIC_V2:
-		return lustre_req_swabbed(req, MSG_PTLRPC_BODY_OFF);
+		return req_capsule_req_swabbed(&req->rq_pill,
+					       MSG_PTLRPC_BODY_OFF);
 	default:
 		CERROR("bad lustre msg magic: %#08X\n",
 		       req->rq_reqmsg->lm_magic);
@@ -2469,7 +2456,8 @@ static inline int rep_ptlrpc_body_swabbed(struct ptlrpc_request *req)
 
 	switch (req->rq_repmsg->lm_magic) {
 	case LUSTRE_MSG_MAGIC_V2:
-		return lustre_rep_swabbed(req, MSG_PTLRPC_BODY_OFF);
+		return req_capsule_rep_swabbed(&req->rq_pill,
+					       MSG_PTLRPC_BODY_OFF);
 	default:
 		/* uninitialized yet */
 		return 0;
@@ -2491,7 +2479,7 @@ void _debug_req(struct ptlrpc_request *req,
 	if (req->rq_repmsg)
 		rep_ok = true;
 
-	if (ptlrpc_req_need_swab(req)) {
+	if (req_capsule_req_need_swab(&req->rq_pill)) {
 		req_ok = req_ok && req_ptlrpc_body_swabbed(req);
 		rep_ok = rep_ok && rep_ptlrpc_body_swabbed(req);
 	}
diff --git a/fs/lustre/ptlrpc/sec.c b/fs/lustre/ptlrpc/sec.c
index c65cf89..7e6b681 100644
--- a/fs/lustre/ptlrpc/sec.c
+++ b/fs/lustre/ptlrpc/sec.c
@@ -961,7 +961,8 @@ static int do_cli_unwrap_reply(struct ptlrpc_request *req)
 	rc = __lustre_unpack_msg(req->rq_repdata, req->rq_repdata_len);
 	switch (rc) {
 	case 1:
-		lustre_set_rep_swabbed(req, MSG_PTLRPC_HEADER_OFF);
+		req_capsule_set_rep_swabbed(&req->rq_pill,
+					    MSG_PTLRPC_HEADER_OFF);
 	case 0:
 		break;
 	default:
@@ -2090,7 +2091,8 @@ int sptlrpc_svc_unwrap_request(struct ptlrpc_request *req)
 	rc = __lustre_unpack_msg(msg, req->rq_reqdata_len);
 	switch (rc) {
 	case 1:
-		lustre_set_req_swabbed(req, MSG_PTLRPC_HEADER_OFF);
+		req_capsule_set_req_swabbed(&req->rq_pill,
+					    MSG_PTLRPC_HEADER_OFF);
 	case 0:
 		break;
 	default:
diff --git a/fs/lustre/ptlrpc/sec_plain.c b/fs/lustre/ptlrpc/sec_plain.c
index 7920ab0..0d1c591 100644
--- a/fs/lustre/ptlrpc/sec_plain.c
+++ b/fs/lustre/ptlrpc/sec_plain.c
@@ -221,7 +221,7 @@ int plain_ctx_verify(struct ptlrpc_cli_ctx *ctx, struct ptlrpc_request *req)
 		return -EPROTO;
 	}
 
-	swabbed = ptlrpc_rep_need_swab(req);
+	swabbed = req_capsule_rep_need_swab(&req->rq_pill);
 
 	phdr = lustre_msg_buf(msg, PLAIN_PACK_HDR_OFF, sizeof(*phdr));
 	if (!phdr) {
@@ -736,7 +736,7 @@ static int plain_accept(struct ptlrpc_request *req)
 		return SECSVC_DROP;
 	}
 
-	swabbed = ptlrpc_req_need_swab(req);
+	swabbed = req_capsule_req_need_swab(&req->rq_pill);
 
 	phdr = lustre_msg_buf(msg, PLAIN_PACK_HDR_OFF, sizeof(*phdr));
 	if (!phdr) {
-- 
1.8.3.1


* [lustre-devel] [PATCH 24/27] lustre: llite: add selinux testing
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (22 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 23/27] lustre: ptlrpc: move more members in PTLRPC request into pill James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 25/27] lnet: Fix destination NID for discovery PUSH James Simmons
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Saurabh Tandan, Lustre Development List

From: Sebastien Buisson <sbuisson@ddn.com>

New test sanity-selinux.sh aims at exercising SELinux support
on the client side, as implemented according to LU-5560.
This patch adds new fail_locs in CLIO.

WC-bug-id: https://jira.whamcloud.com/browse/LU-5560
Lustre-commit: bfca8338e5f2ae1b ("LU-5560 tests: add sanity-selinux.sh")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-on: http://review.whamcloud.com/15818
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd_support.h | 3 +++
 fs/lustre/llite/dir.c           | 2 ++
 fs/lustre/llite/namei.c         | 6 ++++++
 3 files changed, 11 insertions(+)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index 188e552..1e8cebf 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -468,6 +468,9 @@
 #define OBD_FAIL_LLITE_LOST_LAYOUT			0x1407
 #define OBD_FAIL_LLITE_NO_CHECK_DEAD			0x1408
 #define OBD_FAIL_GETATTR_DELAY				0x1409
+#define OBD_FAIL_LLITE_CREATE_FILE_PAUSE		0x1409
+#define OBD_FAIL_LLITE_NEWNODE_PAUSE			0x140a
+#define OBD_FAIL_LLITE_SETDIRSTRIPE_PAUSE		0x140b
 #define OBD_FAIL_LLITE_CREATE_NODE_PAUSE		0x140c
 #define OBD_FAIL_LLITE_IMUTEX_SEC			0x140e
 #define OBD_FAIL_LLITE_IMUTEX_NOSEC			0x140f
diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 3432034..fa8e697 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -486,6 +486,8 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 	if (err)
 		goto out_request;
 
+	CFS_FAIL_TIMEOUT(OBD_FAIL_LLITE_SETDIRSTRIPE_PAUSE, cfs_fail_val);
+
 	err = ll_prep_inode(&inode, &request->rq_pill, parent->i_sb, NULL);
 	if (err)
 		goto out_inode;
diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c
index 9eab6fe..f42e872 100644
--- a/fs/lustre/llite/namei.c
+++ b/fs/lustre/llite/namei.c
@@ -1156,6 +1156,8 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 	else if (de)
 		dentry = de;
 
+	CFS_FAIL_TIMEOUT(OBD_FAIL_LLITE_CREATE_FILE_PAUSE, cfs_fail_val);
+
 	if (!rc) {
 		if (it_disposition(it, DISP_OPEN_CREATE)) {
 			/* Dentry instantiated in ll_create_it. */
@@ -1485,6 +1487,8 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry,
 
 	ll_update_times(request, dir);
 
+	CFS_FAIL_TIMEOUT(OBD_FAIL_LLITE_NEWNODE_PAUSE, cfs_fail_val);
+
 	err = ll_prep_inode(&inode, &request->rq_pill, dir->i_sb, NULL);
 	if (err)
 		goto err_exit;
@@ -1575,6 +1579,8 @@ static int ll_create_nd(struct inode *dir, struct dentry *dentry,
 	ktime_t kstart = ktime_get();
 	int rc;
 
+	CFS_FAIL_TIMEOUT(OBD_FAIL_LLITE_CREATE_FILE_PAUSE, cfs_fail_val);
+
 	CDEBUG(D_VFSTRACE,
 	       "VFS Op:name=%pd, dir=" DFID "(%p), flags=%u, excl=%d\n",
 	       dentry, PFID(ll_inode2fid(dir)), dir, mode, want_excl);
-- 
1.8.3.1


* [lustre-devel] [PATCH 25/27] lnet: Fix destination NID for discovery PUSH
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (23 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 24/27] lustre: llite: add selinux testing James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 26/27] lnet: Check if discovery toggled off in ping reply James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 27/27] lustre: update version to 2.14.52 James Simmons
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Chris Horn, Lustre Development List

From: Chris Horn <chris.horn@hpe.com>

If we're sending a discovery PUSH after receiving a discovery
REPLY then we want to send via the same NID that the reply was
sent to. This introduces a challenge in selecting an appropriate
destination NID for the PUSH because lnet_select_pathway() will not
run the MR selection algorithm for choosing a peer NI if the source
NI has been specified.

It is reasonable to assume that the NID used by the message
originator in sending the REPLY is a suitable destination for the
discovery PUSH. Thus, we record this NID in the same location we
currently record the lp_disc_src_nid, and use it when sending the
PUSH. With this change, the only other user of lnet_peer_select_nid()
is lnet_peer_send_ping(). In the ping case we do not set a source NID,
so lnet_select_pathway() is free to choose any peer NI. So this change
allows us to get rid of lnet_peer_select_nid() altogether.

Alternatively, we would need to reproduce a lot of the path selection
algorithm inside lnet_peer_select_nid() in order to avoid sending to
unhealthy NIDs. It seems undesirable and unnecessary to duplicate that
logic.
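
The selection rule described above can be sketched in user-space C as
follows. This is an illustration only, not the kernel code: nid_t,
NID_ANY, struct peer_sketch and both helper functions are hypothetical
stand-ins for lnet_nid_t, LNET_NID_ANY, struct lnet_peer and the logic
spread across the reply handler and lnet_peer_send_push().

```c
#include <stdint.h>

/* Hypothetical stand-ins for lnet_nid_t and LNET_NID_ANY. */
typedef uint64_t nid_t;
#define NID_ANY ((nid_t)-1)

/* Hypothetical, reduced view of struct lnet_peer. */
struct peer_sketch {
	nid_t primary_nid;
	nid_t disc_src_nid;	/* local NID the REPLY was sent to */
	nid_t disc_dst_nid;	/* remote NID the REPLY came from */
};

/* On a discovery REPLY, remember both ends of the path it used. */
void record_reply_path(struct peer_sketch *lp, nid_t target, nid_t source)
{
	lp->disc_src_nid = target;
	lp->disc_dst_nid = source;
}

/*
 * Choose the PUSH destination: prefer the NID recorded from the REPLY,
 * otherwise fall back to the primary NID. Both recorded NIDs are reset
 * afterwards so the next discovery round starts from scratch.
 */
nid_t pick_push_dst(struct peer_sketch *lp)
{
	nid_t dst = lp->disc_dst_nid != NID_ANY ? lp->disc_dst_nid
						 : lp->primary_nid;

	lp->disc_src_nid = NID_ANY;
	lp->disc_dst_nid = NID_ANY;
	return dst;
}
```

Because a fallback to the primary NID always exists in this scheme, the
destination can never be "any", which is why the -EHOSTUNREACH check is
dropped in the diff below.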

HPE-bug-id: LUS-9333
WC-bug-id: https://jira.whamcloud.com/browse/LU-14660
Lustre-commit: dce2f7d1987711dfd ("LU-14660 lnet: Fix destination NID for discovery PUSH")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/43507
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-types.h |  2 ++
 net/lnet/lnet/peer.c           | 52 ++++++++++--------------------------------
 2 files changed, 14 insertions(+), 40 deletions(-)

diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index d898066..cb0a950 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -640,6 +640,8 @@ struct lnet_peer {
 
 	/* source NID to use during discovery */
 	lnet_nid_t		lp_disc_src_nid;
+	/* destination NID to use during discovery */
+	lnet_nid_t		lp_disc_dst_nid;
 
 	/* net to perform discovery on */
 	u32			lp_disc_net_id;
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index d66a302..7630aff 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -221,6 +221,7 @@
 	spin_lock_init(&lp->lp_lock);
 	lp->lp_primary_nid = nid;
 	lp->lp_disc_src_nid = LNET_NID_ANY;
+	lp->lp_disc_dst_nid = LNET_NID_ANY;
 	if (lnet_peers_start_down())
 		lp->lp_alive = false;
 	else
@@ -2515,6 +2516,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 	spin_lock(&lp->lp_lock);
 
 	lp->lp_disc_src_nid = ev->target.nid;
+	lp->lp_disc_dst_nid = ev->source.nid;
 
 	/*
 	 * If some kind of error happened the contents of message
@@ -3221,8 +3223,10 @@ static int lnet_peer_data_present(struct lnet_peer *lp)
 			 * received by lp, we need to set the discovery source
 			 * NID for new_lp to the NID stored in lp.
 			 */
-			if (lp->lp_disc_src_nid != LNET_NID_ANY)
+			if (lp->lp_disc_src_nid != LNET_NID_ANY) {
 				new_lp->lp_disc_src_nid = lp->lp_disc_src_nid;
+				new_lp->lp_disc_dst_nid = lp->lp_disc_dst_nid;
+			}
 			spin_unlock(&new_lp->lp_lock);
 			spin_unlock(&lp->lp_lock);
 
@@ -3273,41 +3277,10 @@ static int lnet_peer_ping_failed(struct lnet_peer *lp)
 	return rc ? rc : LNET_REDISCOVER_PEER;
 }
 
-/*
- * Select NID to send a Ping or Push to.
- */
-static lnet_nid_t lnet_peer_select_nid(struct lnet_peer *lp)
-{
-	struct lnet_peer_ni *lpni;
-
-	/* Look for a direct-connected NID for this peer. */
-	lpni = NULL;
-	while ((lpni = lnet_get_next_peer_ni_locked(lp, NULL, lpni)) != NULL) {
-		if (!lnet_get_net_locked(lpni->lpni_peer_net->lpn_net_id))
-			continue;
-		break;
-	}
-	if (lpni)
-		return lpni->lpni_nid;
-
-	/* Look for a routed-connected NID for this peer. */
-	lpni = NULL;
-	while ((lpni = lnet_get_next_peer_ni_locked(lp, NULL, lpni)) != NULL) {
-		if (!lnet_find_rnet_locked(lpni->lpni_peer_net->lpn_net_id))
-			continue;
-		break;
-	}
-	if (lpni)
-		return lpni->lpni_nid;
-
-	return LNET_NID_ANY;
-}
-
 /* Active side of ping. */
 static int lnet_peer_send_ping(struct lnet_peer *lp)
 __must_hold(&lp->lp_lock)
 {
-	lnet_nid_t pnid;
 	int nnis;
 	int rc;
 	int cpt;
@@ -3319,12 +3292,11 @@ static int lnet_peer_send_ping(struct lnet_peer *lp)
 	cpt = lnet_net_lock_current();
 	/* Refcount for MD. */
 	lnet_peer_addref_locked(lp);
-	pnid = lnet_peer_select_nid(lp);
 	lnet_net_unlock(cpt);
 
 	nnis = max_t(int, lp->lp_data_nnis, LNET_INTERFACES_MIN);
 
-	rc = lnet_send_ping(pnid, &lp->lp_ping_mdh, nnis, lp,
+	rc = lnet_send_ping(lp->lp_primary_nid, &lp->lp_ping_mdh, nnis, lp,
 			    the_lnet.ln_dc_handler, false);
 	/* if LNetMDBind in lnet_send_ping fails we need to decrement the
 	 * refcount on the peer, otherwise LNetMDUnlink will be called
@@ -3445,18 +3417,17 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 		CERROR("Can't bind push source MD: %d\n", rc);
 		goto fail_error;
 	}
+
 	cpt = lnet_net_lock_current();
 	/* Refcount for MD. */
 	lnet_peer_addref_locked(lp);
 	id.pid = LNET_PID_LUSTRE;
-	id.nid = lnet_peer_select_nid(lp);
+	if (lp->lp_disc_dst_nid != LNET_NID_ANY)
+		id.nid = lp->lp_disc_dst_nid;
+	else
+		id.nid = lp->lp_primary_nid;
 	lnet_net_unlock(cpt);
 
-	if (id.nid == LNET_NID_ANY) {
-		rc = -EHOSTUNREACH;
-		goto fail_unlink;
-	}
-
 	rc = LNetPut(lp->lp_disc_src_nid, lp->lp_push_mdh,
 		     LNET_ACK_REQ, id, LNET_RESERVED_PORTAL,
 		     LNET_PROTO_PING_MATCHBITS, 0, 0);
@@ -3466,6 +3437,7 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 	 * scratch
 	 */
 	lp->lp_disc_src_nid = LNET_NID_ANY;
+	lp->lp_disc_dst_nid = LNET_NID_ANY;
 	if (rc)
 		goto fail_unlink;
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 26/27] lnet: Check if discovery toggled off in ping reply
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (24 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 25/27] lnet: Fix destination NID for discovery PUSH James Simmons
@ 2021-06-13 23:11 ` James Simmons
  2021-06-13 23:11 ` [lustre-devel] [PATCH 27/27] lustre: update version to 2.14.52 James Simmons
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Chris Horn, Lustre Development List

From: Chris Horn <chris.horn@hpe.com>

If a peer is initially discovered and found to have discovery
enabled, but the peer later reloads LNet with discovery disabled,
then we can delete the peer and re-create it the next time the peer
is discovered.

It is safe to delete and re-create the peer as long as it wasn't
configured manually.
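
A rough user-space sketch of that safety check is shown here. The flag
names follow the LNET_PEER_* state bits used in the diff below, but the
bit values are placeholders and should_mark_for_deletion() is a
hypothetical helper. It models only the inner test, which applies once
the ping reply shows discovery is off.

```c
#include <stdbool.h>

/* Placeholder bit values; the kernel's actual definitions may differ. */
#define PEER_MULTI_RAIL		(1u << 0)
#define PEER_NO_DISCOVERY	(1u << 1)
#define PEER_CONFIGURED		(1u << 2)
#define PEER_DISCOVERED		(1u << 3)

/*
 * A peer may be deleted and re-created only if it was not configured
 * manually and is still considered discovery capable, and it has
 * either been discovered already or is treated as multi-rail.
 */
bool should_mark_for_deletion(unsigned int state)
{
	if (state & (PEER_CONFIGURED | PEER_NO_DISCOVERY))
		return false;

	return (state & (PEER_DISCOVERED | PEER_MULTI_RAIL)) != 0;
}
```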

In lnet_peer_deletion(), we need to use list_del_init() when removing
the peer from the discovery queue because the lnet_peer_del() code
path can result in a call to lnet_peer_queue_for_discovery() where
we check if the lp_dc_list is empty.
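
Why the emptiness check matters can be shown with a minimal user-space
list. This is a sketch, not <linux/list.h>: struct list_node and the
node_*() helpers are hypothetical, and node_del() writes NULL where the
kernel's list_del() writes poison values.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_node {
	struct list_node *next, *prev;
};

/* An initialised node points at itself and therefore tests as empty. */
void node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

bool node_empty(const struct list_node *n)
{
	return n->next == n;
}

/* Insert n right after head. */
void node_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink only: the node's own pointers are left unusable. */
void node_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;
	n->prev = NULL;
}

/* Unlink and re-initialise, so a later node_empty(n) returns true. */
void node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}
```

After node_del() the removed node does not test as empty, so a later
emptiness check would wrongly conclude it is still queued. After
node_del_init() the same check gives the expected answer.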

HPE-bug-id: LUS-9178
Fixes: 7ec94557b1 ("lnet: Prevent discovery on peer marked deletion")
WC-bug-id: https://jira.whamcloud.com/browse/LU-14661
Lustre-commit: 143893381d428466 ("LU-14661 lnet: Check if discovery toggled off in ping reply")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/43508
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/peer.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 7630aff..2fc784d 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -2254,22 +2254,34 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	/* The peer may have discovery disabled at its end. Set
 	 * NO_DISCOVERY as appropriate.
 	 */
-	if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_DISCOVERY)) {
+	if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_DISCOVERY) ||
+	    lnet_peer_discovery_disabled) {
 		CDEBUG(D_NET, "Peer %s has discovery disabled\n",
 		       libcfs_nid2str(lp->lp_primary_nid));
-		/* Mark the peer for deletion if we already know about it
-		 * and it's going from discovery set to no discovery set
+
+		/* Detect whether this peer has toggled discovery from on to
+		 * off and whether we can delete and re-create the peer. Peers
+		 * that were manually configured cannot be deleted by discovery.
+		 * We need to delete this peer and re-create it if the peer was
+		 * not configured manually, is currently considered DD capable,
+		 * and either:
+		 * 1. We've already discovered the peer (the peer has toggled
+		 *    the discovery feature from on to off), or
+		 * 2. The peer is considered MR, but it was not user configured
+		 *    (this was a "temporary" peer created via the kernel APIs
+		 *     that we're discovering for the first time)
 		 */
-		if (!(lp->lp_state & (LNET_PEER_NO_DISCOVERY |
-				      LNET_PEER_DISCOVERING)) &&
-		    lp->lp_state & LNET_PEER_DISCOVERED) {
+		if (!(lp->lp_state & (LNET_PEER_CONFIGURED |
+				      LNET_PEER_NO_DISCOVERY)) &&
+		    (lp->lp_state & (LNET_PEER_DISCOVERED |
+				     LNET_PEER_MULTI_RAIL))) {
 			CDEBUG(D_NET, "Marking %s:0x%x for deletion\n",
 			       libcfs_nid2str(lp->lp_primary_nid),
 			       lp->lp_state);
 			lp->lp_state |= LNET_PEER_MARK_DELETION;
 		}
 		lp->lp_state |= LNET_PEER_NO_DISCOVERY;
-	} else if (lp->lp_state & LNET_PEER_NO_DISCOVERY) {
+	} else {
 		CDEBUG(D_NET, "Peer %s has discovery enabled\n",
 		       libcfs_nid2str(lp->lp_primary_nid));
 		lp->lp_state &= ~LNET_PEER_NO_DISCOVERY;
@@ -3083,7 +3095,7 @@ static int lnet_peer_deletion(struct lnet_peer *lp)
 	 * of deleting it.
 	 */
 	if (!list_empty(&lp->lp_dc_list))
-		list_del(&lp->lp_dc_list);
+		list_del_init(&lp->lp_dc_list);
 	list_for_each_entry_safe(route, tmp,
 				 &lp->lp_routes,
 				 lr_gwlist)
-- 
1.8.3.1
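
The deletion condition introduced in lnet_peer_push_event() above can be read as a standalone predicate. The following is a minimal sketch for illustration only: the PEER_* bit values and the helper name peer_should_mark_deletion() are hypothetical and are not taken from the LNet headers; only the shape of the boolean expression mirrors the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical flag values, chosen only for this sketch. The real
 * LNET_PEER_* definitions live in the LNet headers and may differ.
 */
enum {
	PEER_MULTI_RAIL   = 1 << 0,
	PEER_NO_DISCOVERY = 1 << 1,
	PEER_CONFIGURED   = 1 << 2,
	PEER_DISCOVERED   = 1 << 3,
};

/*
 * Hypothetical helper mirroring the condition in the patch: a peer is
 * marked for deletion when it was not manually configured, is not
 * already flagged NO_DISCOVERY, and is either already discovered or
 * currently considered Multi-Rail.
 */
static bool peer_should_mark_deletion(unsigned int state)
{
	return !(state & (PEER_CONFIGURED | PEER_NO_DISCOVERY)) &&
	       (state & (PEER_DISCOVERED | PEER_MULTI_RAIL));
}
```

Under these assumptions, a discovered peer or a temporary Multi-Rail peer satisfies the predicate, while a manually configured peer or one already flagged NO_DISCOVERY does not.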

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org


* [lustre-devel] [PATCH 27/27] lustre: update version to 2.14.52
  2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
                   ` (25 preceding siblings ...)
  2021-06-13 23:11 ` [lustre-devel] [PATCH 26/27] lnet: Check if discovery toggled off in ping reply James Simmons
@ 2021-06-13 23:11 ` James Simmons
  26 siblings, 0 replies; 28+ messages in thread
From: James Simmons @ 2021-06-13 23:11 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Oleg Drokin <green@whamcloud.com>

New tag 2.14.52

Signed-off-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/uapi/linux/lustre/lustre_ver.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/lustre/lustre_ver.h b/include/uapi/linux/lustre/lustre_ver.h
index 604b1df..a840eca 100644
--- a/include/uapi/linux/lustre/lustre_ver.h
+++ b/include/uapi/linux/lustre/lustre_ver.h
@@ -3,9 +3,9 @@
 
 #define LUSTRE_MAJOR 2
 #define LUSTRE_MINOR 14
-#define LUSTRE_PATCH 51
+#define LUSTRE_PATCH 52
 #define LUSTRE_FIX 0
-#define LUSTRE_VERSION_STRING "2.14.51"
+#define LUSTRE_VERSION_STRING "2.14.52"
 
 #define OBD_OCD_VERSION(major, minor, patch, fix)			\
 	(((major) << 24) + ((minor) << 16) + ((patch) << 8) + (fix))
-- 
1.8.3.1
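
For reference, the effect of the bump on the packed version code can be checked with a small sketch. The macros below are copied from the lustre_ver.h hunk above; the helper lustre_version_code() is a hypothetical name introduced only for this illustration.

```c
#include <stdint.h>

/* Values as they appear in lustre_ver.h after this patch. */
#define LUSTRE_MAJOR 2
#define LUSTRE_MINOR 14
#define LUSTRE_PATCH 52
#define LUSTRE_FIX 0

#define OBD_OCD_VERSION(major, minor, patch, fix)			\
	(((major) << 24) + ((minor) << 16) + ((patch) << 8) + (fix))

/*
 * Hypothetical helper: packs the four components into one integer,
 * one byte per component, so versions compare as plain integers.
 * For 2.14.52.0 this is 0x020E3400.
 */
static inline uint32_t lustre_version_code(void)
{
	return OBD_OCD_VERSION(LUSTRE_MAJOR, LUSTRE_MINOR,
			       LUSTRE_PATCH, LUSTRE_FIX);
}
```

Because each component occupies its own byte, the previous tag 2.14.51 packs to 0x020E3300 and compares lower than the new value.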



end of thread, other threads:[~2021-06-13 23:13 UTC | newest]

Thread overview: 28+ messages
2021-06-13 23:11 [lustre-devel] [PATCH 00/27] lustre: sync to 2.14.52 James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 01/27] lustre: uapi: add mdt_hash_name James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 02/27] lustre: uapi: rename CONFIG_T_* to MGS_CFG_T_* James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 03/27] lnet: o2iblnd: fix bug in list_first_entry() change James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 04/27] lustre: flr: mmap write/punch does not stale other mirrors James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 05/27] lustre: llite: default lsm update may memory leak James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 06/27] lustre: pcc: don't alloc FID in LLITE for pcc open James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 07/27] lustre: quota: default OST Pool Quotas James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 08/27] lustre: rename tgt_pool_* functions James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 09/27] lustre: llite: refresh layout after mirror merge/split James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 10/27] lustre: ptlrpc: do not match reply with resent RPC James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 11/27] lustre: vvp: wait for nrpages to be updated James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 12/27] lustre: obd: check if sbi->ll_md_exp is initialized James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 13/27] lustre: osc: Batch gang_lookup cbs James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 14/27] lustre: llite: Return errors for aio James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 15/27] lnet: do not crash if lnet_sock_getaddr returns error James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 16/27] lustre: sec: forbid file rename from enc to unencrypted dir James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 17/27] lustre: mdc: start changelog thread upon first access James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 18/27] lustre: llog: changelog purge deletes plain llog James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 19/27] lnet: libcfs: allow comma-separated masks James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 20/27] lustre: osc: cleanup comment in osc_object_is_contended James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 21/27] lnet: simplify lnet_ni_add_interface James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 22/27] lustre: lmv: change default hash type to crush James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 23/27] lustre: ptlrpc: move more members in PTLRPC request into pill James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 24/27] lustre: llite: add selinux testing James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 25/27] lnet: Fix destination NID for discovery PUSH James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 26/27] lnet: Check if discovery toggled off in ping reply James Simmons
2021-06-13 23:11 ` [lustre-devel] [PATCH 27/27] lustre: update version to 2.14.52 James Simmons
