* [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10
@ 2018-07-31  2:25 James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 01/31] lustre: osc: Send RPCs when extents are full James Simmons
                   ` (31 more replies)
  0 siblings, 32 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

This series covers the missing patches that landed between the start
of the lustre 2.10 development cycle and just before the PFL feature
landed. It contains several bug fixes as well as cleanups. It is based
on top of the recent list patches as well as my llite mount code
patch series.

Abrarahmed Momin (1):
  lustre: llite: Remove OBD_FAIL_OSC_CONNECT_CKSUM

Amir Shehata (1):
  lustre: lnet: Fix route hops print

Andreas Dilger (2):
  lustre: libcfs: reduce libcfs checksum speed test time
  lustre: obdclass: improve missing operation message

Andriy Skulysh (1):
  lustre: osc: hung in osc_destroy()

Bruno Faccini (3):
  lustre: obdclass: obdclass module cleanup upon load error
  lustre: obdclass: handle early requests vs CT registering
  lustre: llite: handle client racy case during create

Chris Horn (1):
  lustre: llite: Return -ERESTARTSYS in range_lock()

Di Wang (1):
  lustre: lmv: honour the specified stripe index

Dmitry Eremin (1):
  lustre: clio: remove unused members from struct cl_thread_info

Fan Yong (1):
  lustre: fid: race between client_fid_fini and seq_client_flush

Gu Zheng (1):
  lustre: libcfs: avoid overflow of crypto bandwidth calculation

Hongchao Zhang (1):
  lustre: mgc: relate sptlrpc & param to MGC

James Simmons (1):
  lustre: docs: update TODO file

Jinshan Xiong (1):
  lustre: llite: ignore layout for ll_writepages()

John L. Hammond (5):
  lustre: llite: return small device numbers for compat stat()
  lustre: obd: remove OBD_NOTIFY_SYNC{,_NONBLOCK}
  lustre: obd: remove OBD_NOTIFY_CONFIG
  lustre: obdclass: use static initializer macros where possible
  lustre: obd: remove unused data parameter from obd_notify()

Niu Yawei (4):
  lustre: llite: don't zero timestamps internally
  lustre: config: don't attach sub logs for LWP
  lustre: llite: buggy special handling on MULTIMODRPCS
  lustre: config: move config types into lustre_idl.h

Patrick Farrell (2):
  lustre: osc: Send RPCs when extents are full
  lustre: llite: reduce jobstats race window

Sebastien Buisson (1):
  lustre: obd: add 'network' client mount option

Sonia Sharma (1):
  lustre: lnet: removal of obsolete LNDs

Steve Guminski (2):
  lustre: obd: change positional struct initializers to C99
  lustre: lnet: change positional struct initializers to C99

 drivers/staging/lustre/TODO                        | 51 +-----------
 .../lustre/include/linux/libcfs/libcfs_crypto.h    |  4 +-
 .../lustre/include/uapi/linux/lnet/nidstr.h        | 18 ++---
 .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
 drivers/staging/lustre/lnet/libcfs/linux-crypto.c  | 33 ++++++--
 drivers/staging/lustre/lnet/lnet/api-ni.c          |  9 +--
 drivers/staging/lustre/lnet/lnet/lo.c              | 20 +++--
 drivers/staging/lustre/lnet/lnet/router_proc.c     |  2 +-
 drivers/staging/lustre/lnet/selftest/framework.c   |  2 +-
 drivers/staging/lustre/lustre/fid/fid_request.c    | 21 +++--
 drivers/staging/lustre/lustre/include/cl_object.h  |  1 -
 drivers/staging/lustre/lustre/include/lu_object.h  |  6 --
 .../staging/lustre/lustre/include/lustre_disk.h    |  1 +
 drivers/staging/lustre/lustre/include/lustre_net.h |  3 +-
 drivers/staging/lustre/lustre/include/obd.h        | 10 +--
 drivers/staging/lustre/lustre/include/obd_class.h  | 66 +++++++---------
 .../staging/lustre/lustre/include/obd_support.h    |  4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      | 42 ++++++----
 drivers/staging/lustre/lustre/llite/file.c         | 16 +++-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c | 12 ++-
 .../staging/lustre/lustre/llite/llite_internal.h   | 10 ++-
 drivers/staging/lustre/lustre/llite/llite_lib.c    | 26 +++----
 drivers/staging/lustre/lustre/llite/namei.c        |  9 +--
 drivers/staging/lustre/lustre/llite/range_lock.c   |  2 +-
 drivers/staging/lustre/lustre/llite/rw.c           | 15 ++--
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        | 51 +++++-------
 drivers/staging/lustre/lustre/lov/lov_obd.c        | 63 ++++-----------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    | 17 ++--
 drivers/staging/lustre/lustre/mgc/mgc_request.c    | 47 ++++++-----
 .../staging/lustre/lustre/obdclass/cl_internal.h   | 45 +----------
 drivers/staging/lustre/lustre/obdclass/cl_io.c     | 21 +----
 drivers/staging/lustre/lustre/obdclass/cl_object.c | 16 +---
 drivers/staging/lustre/lustre/obdclass/class_obd.c | 91 +++++++++++++++-------
 drivers/staging/lustre/lustre/obdclass/genops.c    |  5 +-
 .../staging/lustre/lustre/obdclass/kernelcomm.c    |  8 ++
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 15 ----
 .../staging/lustre/lustre/obdclass/lustre_peer.c   | 16 +---
 .../staging/lustre/lustre/obdclass/obd_config.c    | 46 ++++++++++-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c | 38 +++++++++
 drivers/staging/lustre/lustre/osc/osc_cache.c      | 45 +++++++----
 .../staging/lustre/lustre/osc/osc_cl_internal.h    |  2 +-
 drivers/staging/lustre/lustre/osc/osc_object.c     |  4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    | 18 +++--
 drivers/staging/lustre/lustre/ptlrpc/client.c      |  4 +-
 drivers/staging/lustre/lustre/ptlrpc/events.c      |  4 +
 drivers/staging/lustre/lustre/ptlrpc/import.c      |  5 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++
 47 files changed, 561 insertions(+), 478 deletions(-)

-- 
1.8.3.1


* [lustre-devel] [PATCH 01/31] lustre: osc: Send RPCs when extents are full
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
@ 2018-07-31  2:25 ` James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 02/31] lustre: obd: add 'network' client mount option James Simmons
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

From: Patrick Farrell <paf@cray.com>

Currently, Lustre decides to send an RPC under a number of
conditions (such as memory pressure or lock cancellation);
one of the conditions it looks for is "enough dirty pages
to fill an RPC". This worked fine when only one process
could be dirtying pages at a time, but in newer Lustre
versions, more than one process can write to the same
file (and the same osc object) at once.

In this case, the "count dirty pages method" will see there
are enough dirty pages to fill an RPC, but since the dirty
pages are being created by multiple writers, they are not
contiguous and will not fit into one RPC. This resulted in
many RPCs of less than full size being sent, despite a
good I/O pattern. (Earlier versions of Lustre usually
sent only full RPCs when presented with this pattern.)

Instead, we remove this check and add extents to a special
full extent list when they reach max pages per RPC, then
send from that list. (This is similar to high priority
and urgent extents.)
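
As a rough illustration of the difference between the two triggers,
here is a userspace sketch. It is not the kernel code: the toy_*
types and functions are hypothetical stand-ins, and the boolean flag
only models membership of the full extent list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_extent {
	unsigned int nr_pages;   /* pages currently in this extent */
	unsigned int max_pages;  /* max pages per RPC for this extent */
	bool on_full_list;       /* models membership of the full list */
};

struct toy_object {
	struct toy_extent *exts;
	size_t nr_exts;
};

/* Old trigger: total dirty pages across all extents fill one RPC. */
static bool old_makes_rpc(const struct toy_object *obj, unsigned int max_pages)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < obj->nr_exts; i++)
		total += obj->exts[i].nr_pages;
	return total >= max_pages;
}

/* Models the release path: a full extent is queued on the full list. */
static void toy_extent_release(struct toy_extent *ext)
{
	assert(ext->nr_pages <= ext->max_pages);
	if (ext->nr_pages == ext->max_pages)
		ext->on_full_list = true;
}

/* New trigger: send only when at least one extent is actually full. */
static bool new_makes_rpc(const struct toy_object *obj)
{
	size_t i;

	for (i = 0; i < obj->nr_exts; i++)
		if (obj->exts[i].on_full_list)
			return true;
	return false;
}
```

With two half-full extents from two writers, the old trigger fires
even though no single extent can fill an RPC, while the new one waits
until an extent is full.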

With a good I/O pattern, such as those typically used in benchmarking,
it should be possible to send only full size RPCs. This
patch achieves that without degrading performance in other
cases.

In IOR tests with multiple writers to a single file,
this patch improves performance severalfold, bringing it
level with earlier versions (single striped files)
or well above them (very high speed OSTs, files
with many stripes).

Supporting data is provided in LU-8515.

Signed-off-by: Patrick Farrell <paf@cray.com>
Intel-bug-id: https://jira.whamcloud.com/browse/LU-8515
Reviewed-on: https://review.whamcloud.com/22012
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/osc/osc_cache.c      | 45 ++++++++++++++--------
 .../staging/lustre/lustre/osc/osc_cl_internal.h    |  2 +-
 drivers/staging/lustre/lustre/osc/osc_object.c     |  4 +-
 3 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 87d0d16..e44822a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -634,6 +634,10 @@ void osc_extent_release(const struct lu_env *env, struct osc_extent *ext)
 			if (ext->oe_urgent)
 				list_move_tail(&ext->oe_link,
 					       &obj->oo_urgent_exts);
+			else if (ext->oe_nr_pages == ext->oe_mppr) {
+				list_move_tail(&ext->oe_link,
+					       &obj->oo_full_exts);
+			}
 		}
 		osc_object_unlock(obj);
 
@@ -1790,9 +1794,10 @@ static int osc_makes_rpc(struct client_obd *cli, struct osc_object *osc,
 			CDEBUG(D_CACHE, "cache waiters forcing RPC\n");
 			return 1;
 		}
-		if (atomic_read(&osc->oo_nr_writes) >=
-		    cli->cl_max_pages_per_rpc)
+		if (!list_empty(&osc->oo_full_exts)) {
+			CDEBUG(D_CACHE, "full extent ready, make an RPC\n");
 			return 1;
+		}
 	} else {
 		if (atomic_read(&osc->oo_nr_reads) == 0)
 			return 0;
@@ -1963,6 +1968,7 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
 
 	EASSERT((ext->oe_state == OES_CACHE || ext->oe_state == OES_LOCK_DONE),
 		ext);
+	OSC_EXTENT_DUMP(D_CACHE, ext, "trying to add this extent\n");
 
 	if (!data->erd_max_extents)
 		return 0;
@@ -2085,19 +2091,22 @@ static unsigned int get_write_extents(struct osc_object *obj,
 				 struct osc_extent, oe_link);
 		if (!try_to_add_extent_for_io(cli, ext, &data))
 			return data.erd_page_count;
+	}
+	if (data.erd_page_count == data.erd_max_pages)
+		return data.erd_page_count;
 
-		if (!ext->oe_intree)
-			continue;
-
-		while ((ext = next_extent(ext)) != NULL) {
-			if ((ext->oe_state != OES_CACHE) ||
-			    (!list_empty(&ext->oe_link) &&
-			     ext->oe_owner))
-				continue;
-
-			if (!try_to_add_extent_for_io(cli, ext, &data))
-				return data.erd_page_count;
-		}
+	/*
+	 * One key difference between full extents and other extents: full
+	 * extents can usually only be added if the rpclist was empty, so if we
+	 * can't add one, we continue on to trying to add normal extents.  This
+	 * is so we don't miss adding extra extents to an RPC containing high
+	 * priority or urgent extents.
+	 */
+	while (!list_empty(&obj->oo_full_exts)) {
+		ext = list_entry(obj->oo_full_exts.next,
+				 struct osc_extent, oe_link);
+		if (!try_to_add_extent_for_io(cli, ext, &data))
+			break;
 	}
 	if (data.erd_page_count == data.erd_max_pages)
 		return data.erd_page_count;
@@ -2879,8 +2888,12 @@ int osc_cache_truncate_start(const struct lu_env *env, struct osc_object *obj,
 			osc_update_pending(obj, OBD_BRW_WRITE,
 					   -ext->oe_nr_pages);
 		}
-		EASSERT(list_empty(&ext->oe_link), ext);
-		list_add_tail(&ext->oe_link, &list);
+		/* This extent could be on the full extents list, that's OK */
+		EASSERT(!ext->oe_hp && !ext->oe_urgent, ext);
+		if (!list_empty(&ext->oe_link))
+			list_move_tail(&ext->oe_link, &list);
+		else
+			list_add_tail(&ext->oe_link, &list);
 
 		ext = next_extent(ext);
 	}
diff --git a/drivers/staging/lustre/lustre/osc/osc_cl_internal.h b/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
index d86d3f7..da04c2c 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
@@ -139,7 +139,7 @@ struct osc_object {
 	 */
 	struct list_head	   oo_hp_exts; /* list of hp extents */
 	struct list_head	   oo_urgent_exts; /* list of writeback extents */
-	struct list_head	   oo_rpc_exts;
+	struct list_head	   oo_full_exts;
 
 	struct list_head	   oo_reading_exts;
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_object.c b/drivers/staging/lustre/lustre/osc/osc_object.c
index 8424018..b9bf2b8 100644
--- a/drivers/staging/lustre/lustre/osc/osc_object.c
+++ b/drivers/staging/lustre/lustre/osc/osc_object.c
@@ -85,7 +85,7 @@ static int osc_object_init(const struct lu_env *env, struct lu_object *obj,
 	osc->oo_root.rb_node = NULL;
 	INIT_LIST_HEAD(&osc->oo_hp_exts);
 	INIT_LIST_HEAD(&osc->oo_urgent_exts);
-	INIT_LIST_HEAD(&osc->oo_rpc_exts);
+	INIT_LIST_HEAD(&osc->oo_full_exts);
 	INIT_LIST_HEAD(&osc->oo_reading_exts);
 	atomic_set(&osc->oo_nr_reads, 0);
 	atomic_set(&osc->oo_nr_writes, 0);
@@ -111,7 +111,7 @@ static void osc_object_free(const struct lu_env *env, struct lu_object *obj)
 	LASSERT(!osc->oo_root.rb_node);
 	LASSERT(list_empty(&osc->oo_hp_exts));
 	LASSERT(list_empty(&osc->oo_urgent_exts));
-	LASSERT(list_empty(&osc->oo_rpc_exts));
+	LASSERT(list_empty(&osc->oo_full_exts));
 	LASSERT(list_empty(&osc->oo_reading_exts));
 	LASSERT(atomic_read(&osc->oo_nr_reads) == 0);
 	LASSERT(atomic_read(&osc->oo_nr_writes) == 0);
-- 
1.8.3.1


* [lustre-devel] [PATCH 02/31] lustre: obd: add 'network' client mount option
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 01/31] lustre: osc: Send RPCs when extents are full James Simmons
@ 2018-07-31  2:25 ` James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 03/31] lustre: obd: change positional struct initializers to C99 James Simmons
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

From: Sebastien Buisson <sbuisson@ddn.com>

Add a 'network' mount option on client side. All connections made by
the client must be on the LNet network specified in the 'network'
option.

This option is useful when a node has several Lustre client mount
points, each using a different network. It also helps when running
Lustre clients from containers, by restricting each container to a
specific network.

This new option is implemented by modifying two config commands:
- setup: add a fourth parameter, which is the net to restrict
  connections to. This parameter will be passed down to
  ptlrpc_uuid_to_peer() so that the client only connects to peers on
  the restricted network.
- add_conn: skip this command if the uuid to connect to is not on
  the restricted network.
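
A userspace sketch of the peer filter may help. It assumes a NID
packs the network number in the upper 32 bits and the address in the
lower 32 bits, and that the restriction is passed as a NID with
address 0. All toy_* names are hypothetical. Unlike the real code,
which picks the closest matching peer, this sketch returns the first
allowed one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_nid_t;

#define TOY_NID_ANY ((toy_nid_t)-1)	/* "no restriction" marker */

static uint32_t toy_nidnet(toy_nid_t nid)
{
	return (uint32_t)(nid >> 32);
}

static uint32_t toy_nidaddr(toy_nid_t nid)
{
	return (uint32_t)(nid & 0xffffffffu);
}

static toy_nid_t toy_mknid(uint32_t net, uint32_t addr)
{
	return ((toy_nid_t)net << 32) | addr;
}

/* A peer is rejected only when a net restriction is set and differs. */
static bool toy_peer_allowed(toy_nid_t refnet_nid, toy_nid_t dst_nid)
{
	if (refnet_nid != TOY_NID_ANY && toy_nidaddr(refnet_nid) == 0 &&
	    toy_nidnet(dst_nid) != toy_nidnet(refnet_nid))
		return false;
	return true;
}

/* Return the index of the first allowed peer, or -1 if there is none. */
static int toy_pick_peer(const toy_nid_t *nids, int count,
			 toy_nid_t refnet_nid)
{
	int i;

	assert(nids != NULL || count == 0);
	for (i = 0; i < count; i++)
		if (toy_peer_allowed(refnet_nid, nids[i]))
			return i;
	return -1;
}
```

Passing TOY_NID_ANY keeps the old behaviour, since every candidate
peer is then accepted.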

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-7845
Reviewed-on: https://review.whamcloud.com/19792
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/include/lustre_disk.h    |  1 +
 drivers/staging/lustre/lustre/include/lustre_net.h |  3 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      | 28 ++++++++++++++-
 .../staging/lustre/lustre/obdclass/obd_config.c    | 42 ++++++++++++++++++++++
 drivers/staging/lustre/lustre/obdclass/obd_mount.c | 38 ++++++++++++++++++++
 drivers/staging/lustre/lustre/ptlrpc/client.c      |  4 ++-
 drivers/staging/lustre/lustre/ptlrpc/events.c      |  4 +++
 7 files changed, 117 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_disk.h b/drivers/staging/lustre/lustre/include/lustre_disk.h
index d5fadde..103eb6a 100644
--- a/drivers/staging/lustre/lustre/include/lustre_disk.h
+++ b/drivers/staging/lustre/lustre/include/lustre_disk.h
@@ -87,6 +87,7 @@ struct lustre_mount_data {
 	__u32	*lmd_exclude;	/* array of OSTs to ignore */
 	char	*lmd_mgs;	/* MGS nid */
 	char	*lmd_osd_type;	/* OSD type */
+	char    *lmd_nidnet;	/* network to restrict this client to */
 };
 
 #define LMD_FLG_SERVER		0x0001	/* Mounting a server */
diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index dcad90b..361b897 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -1812,7 +1812,8 @@ static inline int ptlrpc_client_bulk_active(struct ptlrpc_request *req)
 
 void ptlrpc_init_client(int req_portal, int rep_portal, char *name,
 			struct ptlrpc_client *);
-struct ptlrpc_connection *ptlrpc_uuid_to_connection(struct obd_uuid *uuid);
+struct ptlrpc_connection *ptlrpc_uuid_to_connection(struct obd_uuid *uuid,
+						    lnet_nid_t nid4refnet);
 
 int ptlrpc_queue_wait(struct ptlrpc_request *req);
 int ptlrpc_replay_req(struct ptlrpc_request *req);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 07baea7..5da8c88 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -55,6 +55,7 @@ static int import_set_conn(struct obd_import *imp, struct obd_uuid *uuid,
 {
 	struct ptlrpc_connection *ptlrpc_conn;
 	struct obd_import_conn *imp_conn = NULL, *item;
+	lnet_nid_t nid4refnet = LNET_NID_ANY;
 	int rc = 0;
 
 	if (!create && !priority) {
@@ -62,7 +63,12 @@ static int import_set_conn(struct obd_import *imp, struct obd_uuid *uuid,
 		return -EINVAL;
 	}
 
-	ptlrpc_conn = ptlrpc_uuid_to_connection(uuid);
+	if (imp->imp_connection &&
+	    imp->imp_connection->c_remote_uuid.uuid[0] == 0)
+		/* nid4refnet is used to restrict network connections */
+		nid4refnet = imp->imp_connection->c_self;
+
+	ptlrpc_conn = ptlrpc_uuid_to_connection(uuid, nid4refnet);
 	if (!ptlrpc_conn) {
 		CDEBUG(D_HA, "can't find connection %s\n", uuid->uuid);
 		return -ENOENT;
@@ -233,6 +239,7 @@ void client_destroy_import(struct obd_import *imp)
  * 1 - client UUID
  * 2 - server UUID
  * 3 - inactive-on-startup
+ * 4 - restrictive net
  */
 int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 {
@@ -242,6 +249,10 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	int rq_portal, rp_portal, connect_op;
 	char *name = obddev->obd_type->typ_name;
 	enum ldlm_ns_type ns_type = LDLM_NS_TYPE_UNKNOWN;
+	struct ptlrpc_connection fake_conn = {
+		.c_self = 0,
+		.c_remote_uuid.uuid[0] = 0
+	};
 	int rc;
 
 	/* In a more perfect world, we would hang a ptlrpc_client off of
@@ -412,11 +423,26 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	       LUSTRE_CFG_BUFLEN(lcfg, 1));
 	class_import_put(imp);
 
+	if (lustre_cfg_buf(lcfg, 4)) {
+		u32 refnet = libcfs_str2net(lustre_cfg_string(lcfg, 4));
+
+		if (refnet == LNET_NIDNET(LNET_NID_ANY)) {
+			rc = -EINVAL;
+			CERROR("%s: bad mount option 'network=%s': rc = %d\n",
+			       obddev->obd_name, lustre_cfg_string(lcfg, 4),
+			       rc);
+			goto err_import;
+		}
+		fake_conn.c_self = LNET_MKNID(refnet, 0);
+		imp->imp_connection = &fake_conn;
+	}
+
 	rc = client_import_add_conn(imp, &server_uuid, 1);
 	if (rc) {
 		CERROR("can't add initial connection\n");
 		goto err_import;
 	}
+	imp->imp_connection = NULL;
 
 	cli->cl_import = imp;
 	/* cli->cl_max_mds_easize updated by mdc_init_ea_size() */
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c b/drivers/staging/lustre/lustre/obdclass/obd_config.c
index cfcd17e..6d47435 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
@@ -40,6 +40,8 @@
 #include <linux/uaccess.h>
 #include <linux/string.h>
 
+#include <uapi/linux/lustre/lustre_idl.h>
+#include <lustre_disk.h>
 #include <uapi/linux/lustre/lustre_ioctl.h>
 #include <llog_swab.h>
 #include <lprocfs_status.h>
@@ -1280,6 +1282,7 @@ int class_config_llog_handler(const struct lu_env *env,
 				lcfg->lcfg_command = LCFG_LOV_ADD_INA;
 		}
 
+		lustre_cfg_bufs_reset(&bufs, NULL);
 		lustre_cfg_bufs_init(&bufs, lcfg);
 
 		if (clli && clli->cfg_instance &&
@@ -1323,6 +1326,45 @@ int class_config_llog_handler(const struct lu_env *env,
 						   clli->cfg_obdname);
 		}
 
+		/* Add net info to setup command
+		 * if given on command line.
+		 * So config log will be:
+		 * [0]: client name
+		 * [1]: client UUID
+		 * [2]: server UUID
+		 * [3]: inactive-on-startup
+		 * [4]: restrictive net
+		 */
+		if (clli && clli->cfg_sb && s2lsi(clli->cfg_sb)) {
+			struct lustre_sb_info *lsi = s2lsi(clli->cfg_sb);
+			char *nidnet = lsi->lsi_lmd->lmd_nidnet;
+
+			if (lcfg->lcfg_command == LCFG_SETUP &&
+			    lcfg->lcfg_bufcount != 2 && nidnet) {
+				CDEBUG(D_CONFIG,
+				       "Adding net %s info to setup command for client %s\n",
+				       nidnet, lustre_cfg_string(lcfg, 0));
+				lustre_cfg_bufs_set_string(&bufs, 4, nidnet);
+			}
+		}
+
+		/* Skip add_conn command if uuid is not on restricted net */
+		if (clli && clli->cfg_sb && s2lsi(clli->cfg_sb)) {
+			struct lustre_sb_info *lsi = s2lsi(clli->cfg_sb);
+			char *uuid_str = lustre_cfg_string(lcfg, 1);
+
+			if (lcfg->lcfg_command == LCFG_ADD_CONN &&
+			    lsi->lsi_lmd->lmd_nidnet &&
+			    LNET_NIDNET(libcfs_str2nid(uuid_str)) !=
+			    libcfs_str2net(lsi->lsi_lmd->lmd_nidnet)) {
+				CDEBUG(D_CONFIG, "skipping add_conn for %s\n",
+				       uuid_str);
+				rc = 0;
+				/* No processing! */
+				break;
+			}
+		}
+
 		lcfg_len = lustre_cfg_len(bufs.lcfg_bufcount, bufs.lcfg_buflen);
 		lcfg_new = kzalloc(lcfg_len, GFP_NOFS);
 		if (!lcfg_new) {
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index b84bca4..1d88e8c 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -550,6 +550,7 @@ static int lustre_free_lsi(struct super_block *sb)
 		kfree(lsi->lsi_lmd->lmd_mgs);
 		kfree(lsi->lsi_lmd->lmd_osd_type);
 		kfree(lsi->lsi_lmd->lmd_params);
+		kfree(lsi->lsi_lmd->lmd_nidnet);
 
 		kfree(lsi->lsi_lmd);
 	}
@@ -827,6 +828,27 @@ static int lmd_parse_mgssec(struct lustre_mount_data *lmd, char *ptr)
 	return 0;
 }
 
+static int lmd_parse_network(struct lustre_mount_data *lmd, char *ptr)
+{
+	char *tail;
+	int length;
+
+	kfree(lmd->lmd_nidnet);
+	lmd->lmd_nidnet = NULL;
+
+	tail = strchr(ptr, ',');
+	if (!tail)
+		length = strlen(ptr);
+	else
+		length = tail - ptr;
+
+	lmd->lmd_nidnet = kstrndup(ptr, length, GFP_KERNEL);
+	if (!lmd->lmd_nidnet)
+		return -ENOMEM;
+
+	return 0;
+}
+
 static int lmd_parse_string(char **handle, char *ptr)
 {
 	char   *tail;
@@ -1146,6 +1168,11 @@ int lmd_parse(char *options, struct lustre_mount_data *lmd)
 			 */
 			*s1 = '\0';
 			break;
+		} else if (strncmp(s1, "network=", 8) == 0) {
+			rc = lmd_parse_network(lmd, s1 + 8);
+			if (rc)
+				goto invalid;
+			clear++;
 		}
 
 		/* Find next opt */
@@ -1192,6 +1219,17 @@ int lmd_parse(char *options, struct lustre_mount_data *lmd)
 			if (!lmd->lmd_fileset)
 				return -ENOMEM;
 		}
+	} else {
+		/* server mount */
+		if (lmd->lmd_nidnet) {
+			/* 'network=' mount option forbidden for server */
+			kfree(lmd->lmd_nidnet);
+			lmd->lmd_nidnet = NULL;
+			rc = -EINVAL;
+			CERROR("%s: option 'network=' not allowed for Lustre servers: rc = %d\n",
+			       devname, rc);
+			return rc;
+		}
 	}
 
 	/* Freed in lustre_free_lsi */
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 91dd098..4bf26a4 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -77,7 +77,8 @@ void ptlrpc_init_client(int req_portal, int rep_portal, char *name,
 /**
  * Return PortalRPC connection for remote uud \a uuid
  */
-struct ptlrpc_connection *ptlrpc_uuid_to_connection(struct obd_uuid *uuid)
+struct ptlrpc_connection *ptlrpc_uuid_to_connection(struct obd_uuid *uuid,
+						    lnet_nid_t nid4refnet)
 {
 	struct ptlrpc_connection *c;
 	lnet_nid_t self;
@@ -89,6 +90,7 @@ struct ptlrpc_connection *ptlrpc_uuid_to_connection(struct obd_uuid *uuid)
 	 * before accessing its values.
 	 * coverity[uninit_use_in_call]
 	 */
+	peer.nid = nid4refnet;
 	err = ptlrpc_uuid_to_peer(uuid, &peer, &self);
 	if (err != 0) {
 		CNETERR("cannot find peer %s!\n", uuid->uuid);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c b/drivers/staging/lustre/lustre/ptlrpc/events.c
index 130bacc..ebf985e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/events.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/events.c
@@ -462,6 +462,10 @@ int ptlrpc_uuid_to_peer(struct obd_uuid *uuid,
 
 	/* Choose the matching UUID that's closest */
 	while (lustre_uuid_to_peer(uuid->uuid, &dst_nid, count++) == 0) {
+		if (peer->nid != LNET_NID_ANY && LNET_NIDADDR(peer->nid) == 0 &&
+		    LNET_NIDNET(dst_nid) != LNET_NIDNET(peer->nid))
+			continue;
+
 		dist = LNetDist(dst_nid, &src_nid, &order);
 		if (dist < 0)
 			continue;
-- 
1.8.3.1


* [lustre-devel] [PATCH 03/31] lustre: obd: change positional struct initializers to C99
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 01/31] lustre: osc: Send RPCs when extents are full James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 02/31] lustre: obd: add 'network' client mount option James Simmons
@ 2018-07-31  2:25 ` James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 04/31] lustre: lmv: honour the specified stripe index James Simmons
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

From: Steve Guminski <stephenx.guminski@intel.com>

This patch makes no functional changes.  Struct initializers in the
obd directory that use C89 or GCC-only syntax are updated to C99
syntax.  Whitespace is changed to match the code style guidelines.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, tolerates changes in the
struct definition, and clears any members that are not given an
explicit value.
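
For readers less familiar with the two styles, here is a minimal
standalone comparison. The struct and its member names are
hypothetical and only stand in for the real one.

```c
#include <assert.h>

/* Hypothetical two-member struct used only for illustration. */
struct toy_cat_data {
	int first_idx;
	int last_idx;
};

/* Positional form: correctness depends on the member order. */
static struct toy_cat_data toy_make_positional(void)
{
	struct toy_cat_data cd = {5, 0};

	return cd;
}

/* C99 designated form: members are named, the rest are zeroed. */
static struct toy_cat_data toy_make_designated(void)
{
	struct toy_cat_data cd = {
		.first_idx = 5,
	};

	/* last_idx was not mentioned, so the language zeroes it. */
	assert(cd.last_idx == 0);
	return cd;
}
```

Both functions produce the same value today, but only the designated
form stays correct if the members are ever reordered.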

The following struct initializers have been updated:

lustre/obdclass/obd_config.c:
        struct llog_process_cat_data

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-6210
Reviewed-on: https://review.whamcloud.com/23697
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/obdclass/obd_config.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c b/drivers/staging/lustre/lustre/obdclass/obd_config.c
index 6d47435..d962f0c 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
@@ -1417,7 +1417,9 @@ int class_config_llog_handler(const struct lu_env *env,
 int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
 			    char *name, struct config_llog_instance *cfg)
 {
-	struct llog_process_cat_data	 cd = {0, 0};
+	struct llog_process_cat_data cd = {
+		.lpcd_first_idx = 0,
+	};
 	struct llog_handle		*llh;
 	llog_cb_t			 callback;
 	int				 rc;
-- 
1.8.3.1


* [lustre-devel] [PATCH 04/31] lustre: lmv: honour the specified stripe index
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (2 preceding siblings ...)
  2018-07-31  2:25 ` [lustre-devel] [PATCH 03/31] lustre: obd: change positional struct initializers to C99 James Simmons
@ 2018-07-31  2:25 ` James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 05/31] lustre: llite: return small device numbers for compat stat() James Simmons
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

From: Di Wang <di.wang@intel.com>

When creating a striped directory, the specified
stripe index should always be used, even if the parent
has a default stripe index.
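
The intended priority order can be sketched in userspace as follows.
The function and constant names are hypothetical, and the sketch
leaves out the flag check and the write-back of the chosen offset
that the real function performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_OFFSET ((uint32_t)-1)	/* "not specified" marker */

/*
 * Choose the target MDS:
 * 1. the stripe offset explicitly requested for the new directory,
 * 2. otherwise the parent's default stripe offset,
 * 3. otherwise the MDS the parent lives on.
 */
static void toy_choose_mds(uint32_t requested, uint32_t parent_default,
			   uint32_t parent_mds, uint32_t *mds)
{
	assert(mds != NULL);

	if (requested != TOY_NO_OFFSET)
		*mds = requested;
	else if (parent_default != TOY_NO_OFFSET)
		*mds = parent_default;
	else
		*mds = parent_mds;
}
```

Before this change the first two checks were effectively in the
opposite order, so a parent default could override an explicit
request.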

Signed-off-by: Di Wang <di.wang@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-8994
Reviewed-on: https://review.whamcloud.com/24777
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c | 44 +++++++++++------------------
 1 file changed, 17 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 3da5a0a..bbb1ddf 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1128,7 +1128,8 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp,
 static int lmv_placement_policy(struct obd_device *obd,
 				struct md_op_data *op_data, u32 *mds)
 {
-	struct lmv_obd	  *lmv = &obd->u.lmv;
+	struct lmv_obd *lmv = &obd->u.lmv;
+	struct lmv_user_md *lum;
 
 	LASSERT(mds);
 
@@ -1137,34 +1138,23 @@ static int lmv_placement_policy(struct obd_device *obd,
 		return 0;
 	}
 
-	if (op_data->op_default_stripe_offset != -1) {
-		*mds = op_data->op_default_stripe_offset;
-		return 0;
-	}
-
-	/**
-	 * If stripe_offset is provided during setdirstripe
-	 * (setdirstripe -i xx), xx MDS will be chosen.
+	lum = op_data->op_data;
+	/* Choose MDS by
+	 * 1. See if the stripe offset is specified by lum.
+	 * 2. Then check if there is default stripe offset.
+	 * 3. Finally choose MDS by name hash if the parent
+	 *    is striped directory. (see lmv_locate_mds()).
 	 */
-	if (op_data->op_cli_flags & CLI_SET_MEA && op_data->op_data) {
-		struct lmv_user_md *lum;
-
-		lum = op_data->op_data;
-		if (le32_to_cpu(lum->lum_stripe_offset) != (__u32)-1) {
-			*mds = le32_to_cpu(lum->lum_stripe_offset);
-		} else {
-			/*
-			 * -1 means default, which will be in the same MDT with
-			 * the stripe
-			 */
-			*mds = op_data->op_mds;
-			lum->lum_stripe_offset = cpu_to_le32(op_data->op_mds);
-		}
+	if (op_data->op_cli_flags & CLI_SET_MEA && lum &&
+	    le32_to_cpu(lum->lum_stripe_offset) != (u32)-1) {
+		*mds = le32_to_cpu(lum->lum_stripe_offset);
+	} else if (op_data->op_default_stripe_offset != (u32)-1) {
+		*mds = op_data->op_default_stripe_offset;
+		op_data->op_mds = *mds;
+		/* Correct the stripe offset in lum */
+		if (lum)
+			lum->lum_stripe_offset = cpu_to_le32(*mds);
 	} else {
-		/*
-		 * Allocate new fid on target according to operation type and
-		 * parent home mds.
-		 */
 		*mds = op_data->op_mds;
 	}
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 05/31] lustre: llite: return small device numbers for compat stat()
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (3 preceding siblings ...)
  2018-07-31  2:25 ` [lustre-devel] [PATCH 04/31] lustre: lmv: honour the specified stripe index James Simmons
@ 2018-07-31  2:25 ` James Simmons
  2018-07-31  2:25 ` [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window James Simmons
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

From: "John L. Hammond" <jhammond@whamcloud.com>

The compat_sys_*stat*() syscalls will fail unless the device's major
and minor numbers are both less than 256. So in ll_getattr_it(), if
we are in 32 bit compat mode then coerce the device numbers into the
expected format.
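
The coercion amounts to masking each half of the device number to 8
bits. A userspace sketch, assuming a 32-bit device number with a
20-bit minor field; the toy_* helpers are hypothetical stand-ins for
the kernel macros:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MINORBITS 20
#define TOY_MINORMASK ((1u << TOY_MINORBITS) - 1)

static uint32_t toy_major(uint32_t dev)
{
	return dev >> TOY_MINORBITS;
}

static uint32_t toy_minor(uint32_t dev)
{
	return dev & TOY_MINORMASK;
}

static uint32_t toy_mkdev(uint32_t major, uint32_t minor)
{
	return (major << TOY_MINORBITS) | minor;
}

/* Truncate major and minor to 8 bits each for 32-bit compat callers. */
static uint32_t toy_compat_dev(uint32_t dev)
{
	uint32_t out = toy_mkdev(toy_major(dev) & 0xff,
				 toy_minor(dev) & 0xff);

	assert(toy_major(out) < 256 && toy_minor(out) < 256);
	return out;
}
```

Note that masking can map distinct devices to the same compat device
number; the sketch makes no attempt to avoid that.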

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-8855
Reviewed-on: https://review.whamcloud.com/23877
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/file.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 3f0f379..684877c 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3248,14 +3248,20 @@ int ll_getattr(const struct path *path, struct kstat *stat,
 	OBD_FAIL_TIMEOUT(OBD_FAIL_GETATTR_DELAY, 30);
 
 	stat->dev = inode->i_sb->s_dev;
-	if (ll_need_32bit_api(sbi))
+	if (ll_need_32bit_api(sbi)) {
 		stat->ino = cl_fid_build_ino(&lli->lli_fid, 1);
-	else
+		stat->dev = MKDEV(MAJOR(inode->i_sb->s_dev) & 0xff,
+				  MINOR(inode->i_sb->s_dev) & 0xff);
+		stat->rdev = MKDEV(MAJOR(inode->i_rdev) & 0xff,
+				   MINOR(inode->i_rdev) & 0xff);
+	} else {
+		stat->dev = inode->i_sb->s_dev;
+		stat->rdev = inode->i_rdev;
 		stat->ino = inode->i_ino;
+	}
 	stat->mode = inode->i_mode;
 	stat->uid = inode->i_uid;
 	stat->gid = inode->i_gid;
-	stat->rdev = inode->i_rdev;
 	stat->atime = inode->i_atime;
 	stat->mtime = inode->i_mtime;
 	stat->ctime = inode->i_ctime;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (4 preceding siblings ...)
  2018-07-31  2:25 ` [lustre-devel] [PATCH 05/31] lustre: llite: return small device numbers for compat stat() James Simmons
@ 2018-07-31  2:25 ` James Simmons
  2018-07-31  4:05   ` Patrick Farrell
  2018-07-31  2:25 ` [lustre-devel] [PATCH 07/31] lustre: lnet: change positional struct initializers to C99 James Simmons
                   ` (25 subsequent siblings)
  31 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

From: Patrick Farrell <paf@cray.com>

In the current code, lli_jobid is set to zero on every call
to lustre_get_jobid.  This causes problems, because it's
used asynchronously to set the job id in RPCs, and some
RPCs will falsely get no jobid set.  (For small IO sizes,
this can be up to 60% of RPCs.)

It would be very expensive to put hard synchronization
between this and every outbound RPC, and it's OK to very
rarely get an RPC without correct job stats info.

This patch only updates the lli_jobid when the job id has
changed, which leaves only a very small window for reading
an inconsistent job id.
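
The idea can be sketched in userspace as follows. This is only an
illustration under assumptions: SK_JOBID_SIZE and sk_refresh_jobid()
are hypothetical names, and only the "process name + uid" case is
shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SK_JOBID_SIZE 32	/* hypothetical, not LUSTRE_JOBID_SIZE */

/*
 * Build the new ID in a private buffer, then write the shared buffer
 * only if the value differs. Returns 1 if the shared buffer was
 * written, 0 if it was left alone.
 */
static int sk_refresh_jobid(char *shared, const char *comm,
			    unsigned int uid)
{
	char tmp[SK_JOBID_SIZE] = { 0 };

	assert(shared && comm);
	snprintf(tmp, sizeof(tmp), "%s.%u", comm, uid);
	if (strcmp(shared, tmp) == 0)
		return 0;	/* unchanged: readers never see it cleared */
	strcpy(shared, tmp);
	return 1;
}
```

A concurrent reader can still observe a half-written ID during the
strcpy(), but only on the calls where the ID really changes.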

Signed-off-by: Patrick Farrell <paf@cray.com>
WC-id: https://jira.whamcloud.com/browse/LU-8926
Reviewed-on: https://review.whamcloud.com/24253
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  1 +
 drivers/staging/lustre/lustre/obdclass/class_obd.c | 20 ++++++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index c0861b9..72b118a 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -894,6 +894,7 @@ void ll_lli_init(struct ll_inode_info *lli)
 		lli->lli_async_rc = 0;
 	}
 	mutex_init(&lli->lli_layout_mutex);
+	memset(lli->lli_jobid, 0, LUSTRE_JOBID_SIZE);
 }
 
 int ll_fill_super(struct super_block *sb)
diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c
index cdaf729..87327ef 100644
--- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
+++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
@@ -95,26 +95,34 @@
  */
 int lustre_get_jobid(char *jobid)
 {
-	memset(jobid, 0, LUSTRE_JOBID_SIZE);
+	char tmp_jobid[LUSTRE_JOBID_SIZE] = { 0 };
+
 	/* Jobstats isn't enabled */
 	if (strcmp(obd_jobid_var, JOBSTATS_DISABLE) == 0)
-		return 0;
+		goto out_cache_jobid;
 
 	/* Use process name + fsuid as jobid */
 	if (strcmp(obd_jobid_var, JOBSTATS_PROCNAME_UID) == 0) {
-		snprintf(jobid, LUSTRE_JOBID_SIZE, "%s.%u",
+		snprintf(tmp_jobid, LUSTRE_JOBID_SIZE, "%s.%u",
 			 current->comm,
 			 from_kuid(&init_user_ns, current_fsuid()));
-		return 0;
+		goto out_cache_jobid;
 	}
 
 	/* Whole node dedicated to single job */
 	if (strcmp(obd_jobid_var, JOBSTATS_NODELOCAL) == 0) {
-		strcpy(jobid, obd_jobid_node);
-		return 0;
+		strcpy(tmp_jobid, obd_jobid_node);
+		goto out_cache_jobid;
 	}
 
 	return -ENOENT;
+
+out_cache_jobid:
+	/* Only replace the job ID if it changed. */
+	if (strcmp(jobid, tmp_jobid) != 0)
+		strcpy(jobid, tmp_jobid);
+
+	return 0;
 }
 EXPORT_SYMBOL(lustre_get_jobid);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 07/31] lustre: lnet: change positional struct initializers to C99
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (5 preceding siblings ...)
  2018-07-31  2:25 ` [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window James Simmons
@ 2018-07-31  2:25 ` James Simmons
  2018-07-31 22:32   ` NeilBrown
  2018-07-31  2:26 ` [lustre-devel] [PATCH 08/31] lustre: llite: don't zero timestamps internally James Simmons
                   ` (24 subsequent siblings)
  31 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:25 UTC (permalink / raw)
  To: lustre-devel

From: Steve Guminski <stephenx.guminski@intel.com>

This patch makes no functional changes. Struct initializers in the
lnet directory that use C89 or GCC-only syntax are updated to C99
syntax. Whitespace is corrected to match coding style guidelines.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.
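
A small standalone sketch of the difference, using a hypothetical
struct sk_lnd that is only loosely modelled on an LND operations
table (none of the sk_* names exist in the tree):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct for illustration only. */
struct sk_lnd {
	int	refcount;
	int	type;
	int	(*startup)(void);
	int	(*send)(int len);
	int	(*notify)(void);
};

static int sk_startup(void)
{
	return 0;
}

static int sk_send(int len)
{
	return len;
}

/* Named fields may appear in any order; omitted ones become 0/NULL. */
static const struct sk_lnd sk_lolnd = {
	.type		= 9,
	.send		= sk_send,
	.startup	= sk_startup,
};

static int sk_lnd_has_notify(const struct sk_lnd *lnd)
{
	assert(lnd);
	return lnd->notify != NULL;
}
```

Here refcount and notify are never mentioned, yet they are
guaranteed to be 0 and NULL, and reordering the members of
struct sk_lnd would not change the result.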

The following struct initializers have been updated:

lnet/lnet/api-ni.c:
        lnet_process_id_t id
lnet/lnet/lo.c:
        lnd_t the_lolnd
lnet/selftest/framework.c:
        struct lst_sid LST_INVALID_SID

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-6210
Reviewed-on: https://review.whamcloud.com/23493
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c        |  3 ++-
 drivers/staging/lustre/lnet/lnet/lo.c            | 20 +++++++++-----------
 drivers/staging/lustre/lnet/selftest/framework.c |  2 +-
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index fea0373..e517893 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -884,7 +884,8 @@ struct lnet_ni  *
 		     struct lnet_handle_md *md_handle,
 		     int ni_count, bool set_eq)
 {
-	struct lnet_process_id id = {LNET_NID_ANY, LNET_PID_ANY};
+	struct lnet_process_id id = { .nid = LNET_NID_ANY,
+				      .pid = LNET_PID_ANY };
 	struct lnet_handle_me me_handle;
 	struct lnet_md md = { NULL };
 	int rc, rc2;
diff --git a/drivers/staging/lustre/lnet/lnet/lo.c b/drivers/staging/lustre/lnet/lnet/lo.c
index 7456b98..dd16cdf 100644
--- a/drivers/staging/lustre/lnet/lnet/lo.c
+++ b/drivers/staging/lustre/lnet/lnet/lo.c
@@ -91,15 +91,13 @@
 }
 
 struct lnet_lnd the_lolnd = {
-	/* .lnd_list       = */ {&the_lolnd.lnd_list, &the_lolnd.lnd_list},
-	/* .lnd_refcount   = */ 0,
-	/* .lnd_type       = */ LOLND,
-	/* .lnd_startup    = */ lolnd_startup,
-	/* .lnd_shutdown   = */ lolnd_shutdown,
-	/* .lnt_ctl        = */ NULL,
-	/* .lnd_send       = */ lolnd_send,
-	/* .lnd_recv       = */ lolnd_recv,
-	/* .lnd_eager_recv = */ NULL,
-	/* .lnd_notify     = */ NULL,
-	/* .lnd_accept     = */ NULL
+	.lnd_list	= {
+				.next	= &the_lolnd.lnd_list,
+				.prev	= &the_lolnd.lnd_list
+			},
+	.lnd_type	= LOLND,
+	.lnd_startup	= lolnd_startup,
+	.lnd_shutdown	= lolnd_shutdown,
+	.lnd_send	= lolnd_send,
+	.lnd_recv	= lolnd_recv,
 };
diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c
index 03a64e3..944a2a6 100644
--- a/drivers/staging/lustre/lnet/selftest/framework.c
+++ b/drivers/staging/lustre/lnet/selftest/framework.c
@@ -40,7 +40,7 @@
 
 #include "selftest.h"
 
-struct lst_sid LST_INVALID_SID = {LNET_NID_ANY, -1};
+struct lst_sid LST_INVALID_SID = { .ses_nid = LNET_NID_ANY, .ses_stamp = -1 };
 
 static int session_timeout = 100;
 module_param(session_timeout, int, 0444);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 08/31] lustre: llite: don't zero timestamps internally
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (6 preceding siblings ...)
  2018-07-31  2:25 ` [lustre-devel] [PATCH 07/31] lustre: lnet: change positional struct initializers to C99 James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31 22:31   ` NeilBrown
  2018-07-31  2:26 ` [lustre-devel] [PATCH 09/31] lustre: mgc: relate sptlrpc & param to MGC James Simmons
                   ` (23 subsequent siblings)
  31 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

In ll_md_blocking_ast(), we zero all timestamps to keep these
'leftovers' from interfering with the new timestamps from the MDS,
especially when the timestamps are set back by other clients. It's
not quite right to change timestamps in this way, because:

1. The pending lock can be matched by getattr, so these zero
   timestamps can be fetched by an application in a small race window.

2. It doesn't make sense to zero the mtime and ctime, because we
   always use the newest ctime and mtime from the MDS when merging
   attributes, so they won't interfere with new timestamps set by
   other clients.

Instead, set a new lli_update_atime flag so that ll_merge_attr()
takes the atime from the MDS even if it is older than the local
inode atime.
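
The flag-based approach taken by the diff can be sketched in
userspace roughly as below. struct sk_inode and the sk_* helpers are
hypothetical stand-ins, and locking is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the inode/lli pair. */
struct sk_inode {
	int64_t		atime;		/* what stat() would report */
	int64_t		mds_atime;	/* latest value from the MDS */
	unsigned int	update_atime:1;	/* force-accept the MDS atime */
};

/* Lock cancelled: remember to refresh atime, but do not zero it. */
static void sk_lock_cancelled(struct sk_inode *ino)
{
	assert(ino);
	ino->update_atime = 1;
}

/* Take the MDS atime if it is newer, or if a refresh was requested. */
static void sk_merge_atime(struct sk_inode *ino)
{
	assert(ino);
	if (ino->atime < ino->mds_atime || ino->update_atime) {
		ino->atime = ino->mds_atime;
		ino->update_atime = 0;
	}
}
```

Between sk_lock_cancelled() and the next sk_merge_atime() a reader
still sees the old, valid atime rather than zero.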

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-9033
Reviewed-on: https://review.whamcloud.com/24984
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/file.c           | 4 +++-
 drivers/staging/lustre/lustre/llite/llite_internal.h | 5 +++++
 drivers/staging/lustre/lustre/llite/namei.c          | 6 +-----
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 684877c..e4aefb5 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1024,8 +1024,10 @@ int ll_merge_attr(const struct lu_env *env, struct inode *inode)
 	 * POSIX. Solving this problem needs to send an RPC to MDT for each
 	 * read, this will hurt performance.
 	 */
-	if (inode->i_atime.tv_sec < lli->lli_atime)
+	if (inode->i_atime.tv_sec < lli->lli_atime || lli->lli_update_atime) {
 		inode->i_atime.tv_sec = lli->lli_atime;
+		lli->lli_update_atime = 0;
+	}
 	inode->i_mtime.tv_sec = lli->lli_mtime;
 	inode->i_ctime.tv_sec = lli->lli_ctime;
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 8399501..f6c8daf 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -138,6 +138,11 @@ struct ll_inode_info {
 	s64				lli_ctime;
 	spinlock_t			lli_agl_lock;
 
+	/* update atime from MDS no matter if it's older than
+	 * local inode atime.
+	 */
+	unsigned int			lli_update_atime:1;
+
 	/* Try to make the d::member and f::member are aligned. Before using
 	 * these members, make clear whether it is directory or not.
 	 */
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 134cc31..e541f78 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -265,11 +265,7 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 		if (bits & MDS_INODELOCK_UPDATE) {
 			struct ll_inode_info *lli = ll_i2info(inode);
 
-			spin_lock(&lli->lli_lock);
-			inode->i_mtime.tv_sec = 0;
-			inode->i_atime.tv_sec = 0;
-			inode->i_ctime.tv_sec = 0;
-			spin_unlock(&lli->lli_lock);
+			lli->lli_update_atime = 1;
 		}
 
 		if ((bits & MDS_INODELOCK_UPDATE) && S_ISDIR(inode->i_mode)) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 09/31] lustre: mgc: relate sptlrpc & param to MGC
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (7 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 08/31] lustre: llite: don't zero timestamps internally James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 10/31] lustre: lnet: removal of obsolete LNDs James Simmons
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Hongchao Zhang <hongchao@whamcloud.com>

If sptlrpc or params config logs come from different MGCs, they
should be regarded as different logs. This patch binds these
config logs to the MGC obd device to separate them.

The fix for a bug discovered later is also included in this
patch. Since sb is NULL for config_log_find_or_add(), the
cfg_instance field was being set to the obd device. This
confused the sptlrpc layer, so for now cfg_instance is set
to NULL in the sptlrpc case. This will be resolved with the
move to kobjects.
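
A minimal sketch of a find-or-add lookup keyed on both log name and
owning instance, assuming a fixed-size table instead of the real
list handling. struct sk_cld and sk_find_or_add() are hypothetical
and carry no refcounting or locking.

```c
#include <assert.h>
#include <string.h>

#define SK_MAX_LOGS 8

struct sk_cld {
	const char	*name;
	const void	*instance;	/* e.g. the owning MGC device */
};

static struct sk_cld sk_logs[SK_MAX_LOGS];
static int sk_nlogs;

/* Return the entry matching name and instance, adding it if absent. */
static struct sk_cld *sk_find_or_add(const char *name,
				     const void *instance)
{
	int i;

	assert(name);
	for (i = 0; i < sk_nlogs; i++)
		if (sk_logs[i].instance == instance &&
		    strcmp(sk_logs[i].name, name) == 0)
			return &sk_logs[i];
	if (sk_nlogs == SK_MAX_LOGS)
		return NULL;	/* table full */
	sk_logs[sk_nlogs].name = name;
	sk_logs[sk_nlogs].instance = instance;
	return &sk_logs[sk_nlogs++];
}
```

With this keying, two owners asking for a log of the same name get
separate entries, while a repeated request from one owner reuses
its existing entry.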

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Intel-bug-id: https://jira.whamcloud.com/browse/LU-9034
Reviewed-on: https://review.whamcloud.com/24988
Intel-bug-id: https://jira.whamcloud.com/browse/LU-9567
Reviewed-on: https://review.whamcloud.com/27320
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/mgc/mgc_request.c | 36 ++++++++++++++-----------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 32df804..82acac0 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -263,18 +263,23 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 }
 
 static struct config_llog_data *
-config_params_log_add(struct obd_device *obd,
-		      struct config_llog_instance *cfg, struct super_block *sb)
+config_log_find_or_add(struct obd_device *obd, char *logname,
+		       struct super_block *sb, int type,
+		       struct config_llog_instance *cfg)
 {
 	struct config_llog_instance	lcfg = *cfg;
 	struct config_llog_data		*cld;
 
-	lcfg.cfg_instance = sb;
+	lcfg.cfg_instance = sb ? (void *)sb : (void *)obd;
 
-	cld = do_config_log_add(obd, PARAMS_FILENAME, CONFIG_T_PARAMS,
-				&lcfg, sb);
+	if (type == CONFIG_T_SPTLRPC)
+		lcfg.cfg_instance = NULL;
 
-	return cld;
+	cld = config_log_find(logname, &lcfg);
+	if (unlikely(cld))
+		return cld;
+
+	return do_config_log_add(obd, logname, type, &lcfg, sb);
 }
 
 /** Add this log to the list of active logs watched by an MGC.
@@ -310,17 +315,16 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 	memcpy(seclogname, logname, ptr - logname);
 	strcpy(seclogname + (ptr - logname), "-sptlrpc");
 
-	sptlrpc_cld = config_log_find(seclogname, NULL);
-	if (!sptlrpc_cld) {
-		sptlrpc_cld = do_config_log_add(obd, seclogname,
-						CONFIG_T_SPTLRPC, NULL, NULL);
-		if (IS_ERR(sptlrpc_cld)) {
-			CERROR("can't create sptlrpc log: %s\n", seclogname);
-			rc = PTR_ERR(sptlrpc_cld);
-			goto out_err;
-		}
+	sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
+					     CONFIG_T_SPTLRPC, cfg);
+	if (IS_ERR(sptlrpc_cld)) {
+		CERROR("can't create sptlrpc log: %s\n", seclogname);
+		rc = PTR_ERR(sptlrpc_cld);
+		goto out_err;
 	}
-	params_cld = config_params_log_add(obd, cfg, sb);
+
+	params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
+					    CONFIG_T_PARAMS, cfg);
 	if (IS_ERR(params_cld)) {
 		rc = PTR_ERR(params_cld);
 		CERROR("%s: can't create params log: rc = %d\n",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 10/31] lustre: lnet: removal of obsolete LNDs
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (8 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 09/31] lustre: mgc: relate sptlrpc & param to MGC James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print James Simmons
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Sonia Sharma <sharmaso@whamcloud.com>

The obsolete LNDs were already removed. Comment out the
name<->network number mappings for them.

Signed-off-by: Sonia Sharma <sharmaso@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-8769
Reviewed-on: https://review.whamcloud.com/23621
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/include/uapi/linux/lnet/nidstr.h    | 18 +++++++++---------
 drivers/staging/lustre/lnet/lnet/api-ni.c              |  6 ------
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/nidstr.h b/drivers/staging/lustre/include/uapi/linux/lnet/nidstr.h
index 882074e..3354e5a 100644
--- a/drivers/staging/lustre/include/uapi/linux/lnet/nidstr.h
+++ b/drivers/staging/lustre/include/uapi/linux/lnet/nidstr.h
@@ -38,18 +38,18 @@ enum {
 	 * Only add to these values (i.e. don't ever change or redefine them):
 	 * network addresses depend on them...
 	 */
-	QSWLND		= 1,
+	/*QSWLND	= 1, removed v2_7_50			*/
 	SOCKLND		= 2,
-	GMLND		= 3,
-	PTLLND		= 4,
+	/*GMLND		= 3, removed v2_0_0-rc1a-16-gc660aac	*/
+	/*PTLLND	= 4, removed v2_7_50			*/
 	O2IBLND		= 5,
-	CIBLND		= 6,
-	OPENIBLND	= 7,
-	IIBLND		= 8,
+	/*CIBLND        = 6, removed v2_0_0-rc1a-175-gd2b8a0e	*/
+	/*OPENIBLND	= 7, removed v2_0_0-rc1a-175-gd2b8a0e	*/
+	/*IIBLND	= 8, removed v2_0_0-rc1a-175-gd2b8a0e	*/
 	LOLND		= 9,
-	RALND		= 10,
-	VIBLND		= 11,
-	MXLND		= 12,
+	/*RALND		= 10, removed v2_7_50_0-34-g8be9e41	*/
+	/*VIBLND	= 11, removed v2_0_0-rc1a-175-gd2b8a0e	*/
+	/*MXLND		= 12, removed v2_7_50_0-34-g8be9e41	*/
 	GNILND		= 13,
 	GNIIPLND	= 14,
 };
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index e517893..a949ac2 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1207,12 +1207,6 @@ struct lnet_ni  *
 
 	LASSERT(libcfs_isknown_lnd(lnd_type));
 
-	if (lnd_type == CIBLND || lnd_type == OPENIBLND ||
-	    lnd_type == IIBLND || lnd_type == VIBLND) {
-		CERROR("LND %s obsoleted\n", libcfs_lnd2str(lnd_type));
-		goto failed0;
-	}
-
 	/* Make sure this new NI is unique. */
 	lnet_net_lock(LNET_LOCK_EX);
 	rc = lnet_net_unique(LNET_NIDNET(ni->ni_nid), &the_lnet.ln_nis);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (9 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 10/31] lustre: lnet: removal of obsolete LNDs James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31 22:38   ` NeilBrown
  2018-07-31  2:26 ` [lustre-devel] [PATCH 12/31] lustre: obdclass: obdclass module cleanup upon load error James Simmons
                   ` (20 subsequent siblings)
  31 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Amir Shehata <ashehata@whamcloud.com>

The default number of hops for a route is -1. This is
currently being printed as %u. Change that to %d to
make it print out properly.
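
The effect is easy to reproduce in userspace. The helper below is a
hypothetical example and assumes a 32 bit unsigned int, in which
case the unsigned form of -1 comes out as 4294967295.

```c
#include <assert.h>
#include <stdio.h>

/* Format a hop count; -1 is the "not set" default. */
static int sk_format_hops(char *buf, size_t len, int hops, int as_signed)
{
	assert(buf && len > 0);
	if (as_signed)
		return snprintf(buf, len, "%4d", hops);
	/* The cast makes the old, misleading output explicit. */
	return snprintf(buf, len, "%4u", (unsigned int)hops);
}
```

With as_signed set, the field holds "  -1" and stays within its
four columns; without it the ten-digit value overflows the column.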

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-9078
Reviewed-on: https://review.whamcloud.com/25250
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
index 8856798..aa98ce5 100644
--- a/drivers/staging/lustre/lnet/lnet/router_proc.c
+++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
@@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
 			int alive = lnet_is_route_alive(route);
 
 			s += snprintf(s, tmpstr + tmpsiz - s,
-				      "%-8s %4u %8u %7s %s\n",
+				      "%-8s %4d %8u %7s %s\n",
 				      libcfs_net2str(net), hops,
 				      priority,
 				      alive ? "up" : "down",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 12/31] lustre: obdclass: obdclass module cleanup upon load error
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (10 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 13/31] lustre: config: don't attach sub logs for LWP James Simmons
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Bruno Faccini <bruno.faccini@intel.com>

Fix the obdclass_init() error paths so that they proceed with
cleanup. In particular, this avoids a crash on the next load
attempt, which happened because the previous miscdevice had not
been deregistered and was thus still referenced in misc_list
after the module was unmapped.
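
The general shape of such an unwind ladder, reduced to three
hypothetical stages in userspace (sk_stage_init(), sk_stage_fini()
and sk_module_init() are invented for this sketch):

```c
#include <assert.h>

static int sk_live;	/* number of stages currently initialised */

static int sk_stage_init(int fail)
{
	if (fail)
		return -1;
	sk_live++;
	return 0;
}

static void sk_stage_fini(void)
{
	sk_live--;
}

/* fail_at: 1..3 makes that stage fail, 0 lets every stage succeed. */
static int sk_module_init(int fail_at)
{
	int err;

	assert(fail_at >= 0 && fail_at <= 3);
	err = sk_stage_init(fail_at == 1);
	if (err)
		return err;
	err = sk_stage_init(fail_at == 2);
	if (err)
		goto cleanup_first;
	err = sk_stage_init(fail_at == 3);
	if (err)
		goto cleanup_second;
	return 0;

	/* Labels run in reverse order of initialisation. */
cleanup_second:
	sk_stage_fini();
cleanup_first:
	sk_stage_fini();
	return err;
}
```

Each failure jumps to the label that undoes everything set up so
far, so a failed load leaves nothing registered behind.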

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-6499
Reviewed-on: https://review.whamcloud.com/22544
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/include/obd_support.h    |  1 +
 drivers/staging/lustre/lustre/obdclass/class_obd.c | 53 ++++++++++++++++++----
 2 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 87806e8..726cc4d 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -359,6 +359,7 @@
 #define OBD_FAIL_OBD_IDX_READ_NET	0x607
 #define OBD_FAIL_OBD_IDX_READ_BREAK	 0x608
 #define OBD_FAIL_OBD_NO_LRU		 0x609
+#define OBD_FAIL_OBDCLASS_MODULE_LOAD	 0x60a
 
 #define OBD_FAIL_TGT_REPLY_NET	   0x700
 #define OBD_FAIL_TGT_CONN_RACE	   0x701
diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c
index 87327ef..04e55fc 100644
--- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
+++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
@@ -469,19 +469,19 @@ static int __init obdclass_init(void)
 
 	err = obd_init_checks();
 	if (err)
-		return err;
+		goto cleanup_zombie_impexp;
 
 	class_init_uuidlist();
 	err = class_handle_init();
 	if (err)
-		return err;
+		goto cleanup_uuidlist;
 
 	INIT_LIST_HEAD(&obd_types);
 
 	err = misc_register(&obd_psdev);
 	if (err) {
 		CERROR("cannot register OBD miscdevices: err %d\n", err);
-		return err;
+		goto cleanup_class_handle;
 	}
 
 	/* This struct is already zeroed for us (static global) */
@@ -499,25 +499,62 @@ static int __init obdclass_init(void)
 
 	err = obd_init_caches();
 	if (err)
-		return err;
+		goto cleanup_deregister;
 
 	err = class_procfs_init();
 	if (err)
-		return err;
+		goto cleanup_caches;
 
 	err = obd_sysctl_init();
 	if (err)
-		return err;
+		goto cleanup_class_procfs;
 
 	err = lu_global_init();
 	if (err)
-		return err;
+		goto cleanup_class_procfs;
 
 	err = cl_global_init();
 	if (err != 0)
-		return err;
+		goto cleanup_lu_global;
 
 	err = llog_info_init();
+	if (err)
+		goto cleanup_cl_global;
+
+	/* simulate a late OOM situation now to require all
+	 * alloc'ed/initialized resources to be freed
+	 */
+	if (!OBD_FAIL_CHECK(OBD_FAIL_OBDCLASS_MODULE_LOAD))
+		return 0;
+
+	/* force error to ensure module will be unloaded/cleaned */
+	err = -ENOMEM;
+
+	llog_info_fini();
+
+cleanup_cl_global:
+	cl_global_fini();
+
+cleanup_lu_global:
+	lu_global_fini();
+
+cleanup_class_procfs:
+	class_procfs_clean();
+
+cleanup_caches:
+	obd_cleanup_caches();
+
+cleanup_deregister:
+	misc_deregister(&obd_psdev);
+
+cleanup_class_handle:
+	class_handle_cleanup();
+
+cleanup_uuidlist:
+	class_exit_uuidlist();
+
+cleanup_zombie_impexp:
+	obd_zombie_impexp_stop();
 
 	return err;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 13/31] lustre: config: don't attach sub logs for LWP
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (11 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 12/31] lustre: obdclass: obdclass module cleanup upon load error James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31 22:41   ` NeilBrown
  2018-07-31  2:26 ` [lustre-devel] [PATCH 14/31] lustre: llite: buggy special handling on MULTIMODRPCS James Simmons
                   ` (18 subsequent siblings)
  31 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

A Lustre target processes the client log to retrieve MDT NIDs and
start LWPs. This goes through the same mgc_process_config() code path
as processing the target config log, so sub clds for security,
nodemap, param & recovery are attached unnecessarily.

The mgc subsystem is used by both server and client. This change
allows us to cleanly handle the future case when the mgc layer
would be built with server code. This way server specific config
logs will only be processed when a server mount occurs.
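
The bitmask selection can be sketched as below. The SK_* macros
mirror the layout of the new CONFIG_T_* defines, but they and
sk_count_sub_logs() are stand-ins written for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SK_BIT(n)	(1U << (n))
#define SK_T_CONFIG	SK_BIT(0)
#define SK_T_SPTLRPC	SK_BIT(1)
#define SK_T_RECOVER	SK_BIT(2)
#define SK_T_PARAMS	SK_BIT(3)
#define SK_SUB_CLIENT	(SK_T_SPTLRPC | SK_T_RECOVER | SK_T_PARAMS)

/* Count how many sub logs a caller passing sub_clds would attach. */
static int sk_count_sub_logs(uint32_t sub_clds)
{
	int n = 0;

	/* The main config log is not a sub log in this sketch. */
	assert(!(sub_clds & SK_T_CONFIG));
	if (sub_clds & SK_T_SPTLRPC)
		n++;
	if (sub_clds & SK_T_RECOVER)
		n++;
	if (sub_clds & SK_T_PARAMS)
		n++;
	return n;
}
```

A client mount would pass SK_SUB_CLIENT and attach all three, while
a caller that leaves the mask at zero attaches none. That is why the
old enum values had to become distinct bits.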

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-9081
Reviewed-on: https://review.whamcloud.com/25293
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd_class.h | 19 +++++++-----
 drivers/staging/lustre/lustre/llite/llite_lib.c   |  1 +
 drivers/staging/lustre/lustre/mgc/mgc_request.c   | 37 +++++++++++++----------
 3 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index adfe2ab..e772e3d 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -153,17 +153,22 @@ struct config_llog_instance {
 	llog_cb_t	    cfg_callback;
 	int		    cfg_last_idx; /* for partial llog processing */
 	int		    cfg_flags;
+	u32		    cfg_sub_clds;
 };
 
 int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
 			    char *name, struct config_llog_instance *cfg);
-enum {
-	CONFIG_T_CONFIG  = 0,
-	CONFIG_T_SPTLRPC = 1,
-	CONFIG_T_RECOVER = 2,
-	CONFIG_T_PARAMS  = 3,
-	CONFIG_T_MAX     = 4
-};
+
+#define CONFIG_T_CONFIG		BIT(0)
+#define CONFIG_T_SPTLRPC	BIT(1)
+#define CONFIG_T_RECOVER	BIT(2)
+#define CONFIG_T_PARAMS		BIT(3)
+
+/* Sub clds should be attached to the config_llog_data when processing
+ * config log for client or server target.
+ */
+#define CONFIG_SUB_CLIENT	(CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
+				 CONFIG_T_PARAMS)
 
 #define PARAMS_FILENAME	"params"
 #define LCTL_UPCALL	"lctl"
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 72b118a..71eb42d 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -949,6 +949,7 @@ int ll_fill_super(struct super_block *sb)
 	cfg->cfg_instance = sb;
 	cfg->cfg_uuid = lsi->lsi_llsbi->ll_sb_uuid;
 	cfg->cfg_callback = class_config_llog_handler;
+	cfg->cfg_sub_clds = CONFIG_SUB_CLIENT;
 	/* set up client obds */
 	err = lustre_process_log(sb, profilenm, cfg);
 	if (err < 0)
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 82acac0..06fcc7e 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -293,8 +293,8 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 {
 	struct lustre_sb_info *lsi = s2lsi(sb);
 	struct config_llog_data *cld;
-	struct config_llog_data *sptlrpc_cld;
-	struct config_llog_data *params_cld;
+	struct config_llog_data *sptlrpc_cld = NULL;
+	struct config_llog_data *params_cld = NULL;
 	struct config_llog_data *recover_cld = NULL;
 	char			seclogname[32];
 	char			*ptr;
@@ -315,21 +315,25 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 	memcpy(seclogname, logname, ptr - logname);
 	strcpy(seclogname + (ptr - logname), "-sptlrpc");
 
-	sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
-					     CONFIG_T_SPTLRPC, cfg);
-	if (IS_ERR(sptlrpc_cld)) {
-		CERROR("can't create sptlrpc log: %s\n", seclogname);
-		rc = PTR_ERR(sptlrpc_cld);
-		goto out_err;
+	if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
+		sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
+						     CONFIG_T_SPTLRPC, cfg);
+		if (IS_ERR(sptlrpc_cld)) {
+			CERROR("can't create sptlrpc log: %s\n", seclogname);
+			rc = PTR_ERR(sptlrpc_cld);
+			goto out_err;
+		}
 	}
 
-	params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
-					    CONFIG_T_PARAMS, cfg);
-	if (IS_ERR(params_cld)) {
-		rc = PTR_ERR(params_cld);
-		CERROR("%s: can't create params log: rc = %d\n",
-		       obd->obd_name, rc);
-		goto out_sptlrpc;
+	if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
+		params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
+						    CONFIG_T_PARAMS, cfg);
+		if (IS_ERR(params_cld)) {
+			rc = PTR_ERR(params_cld);
+			CERROR("%s: can't create params log: rc = %d\n",
+			       obd->obd_name, rc);
+			goto out_sptlrpc;
+		}
 	}
 
 	cld = do_config_log_add(obd, logname, CONFIG_T_CONFIG, cfg, sb);
@@ -340,7 +344,8 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 	}
 
 	LASSERT(lsi->lsi_lmd);
-	if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR)) {
+	if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
+	    cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
 		ptr = strrchr(seclogname, '-');
 		if (ptr) {
 			*ptr = 0;
-- 
1.8.3.1


* [lustre-devel] [PATCH 14/31] lustre: llite: buggy special handling on MULTIMODRPCS
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (12 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 13/31] lustre: config: don't attach sub logs for LWP James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 15/31] lustre: clio: remove unused members from struct cl_thread_info James Simmons
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

There is some special handling of the MULTIMODRPCS flag in
client_connect_import(). It looks unnecessary and buggy:
the MULTIMODRPCS flag would be cleared unexpectedly from
imp_connect_data on reconnect.

This patch removes the special handling code and treats
MULTIMODRPCS normally, just like the other flags.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-9115
Reviewed-on: https://review.whamcloud.com/25435
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c   | 12 ------------
 drivers/staging/lustre/lustre/llite/llite_lib.c |  2 +-
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 5da8c88..c36d1e4 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -512,7 +512,6 @@ int client_connect_import(const struct lu_env *env,
 	struct obd_import       *imp    = cli->cl_import;
 	struct obd_connect_data *ocd;
 	struct lustre_handle    conn    = { 0 };
-	bool is_mdc = false;
 	int		     rc;
 
 	*exp = NULL;
@@ -539,18 +538,12 @@ int client_connect_import(const struct lu_env *env,
 	ocd = &imp->imp_connect_data;
 	if (data) {
 		*ocd = *data;
-		is_mdc = !strncmp(imp->imp_obd->obd_type->typ_name,
-				  LUSTRE_MDC_NAME, 3);
-		if (is_mdc)
-			data->ocd_connect_flags |= OBD_CONNECT_MULTIMODRPCS;
 		imp->imp_connect_flags_orig = data->ocd_connect_flags;
 		imp->imp_connect_flags2_orig = data->ocd_connect_flags2;
 	}
 
 	rc = ptlrpc_connect_import(imp);
 	if (rc != 0) {
-		if (data && is_mdc)
-			data->ocd_connect_flags &= ~OBD_CONNECT_MULTIMODRPCS;
 		LASSERT(imp->imp_state == LUSTRE_IMP_DISCON);
 		goto out_ldlm;
 	}
@@ -561,11 +554,6 @@ int client_connect_import(const struct lu_env *env,
 			 ocd->ocd_connect_flags, "old %#llx, new %#llx\n",
 			 data->ocd_connect_flags, ocd->ocd_connect_flags);
 		data->ocd_connect_flags = ocd->ocd_connect_flags;
-		/* clear the flag as it was not set and is not known
-		 * by upper layers
-		 */
-		if (is_mdc)
-			data->ocd_connect_flags &= ~OBD_CONNECT_MULTIMODRPCS;
 	}
 
 	ptlrpc_pinger_add_import(imp);
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 71eb42d..ccb5bda 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -211,7 +211,7 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt)
 				  OBD_CONNECT_DIR_STRIPE |
 				  OBD_CONNECT_BULK_MBITS |
 				  OBD_CONNECT_SUBTREE |
-				  OBD_CONNECT_FLAGS2;
+				  OBD_CONNECT_FLAGS2 | OBD_CONNECT_MULTIMODRPCS;
 
 	data->ocd_connect_flags2 = 0;
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 15/31] lustre: clio: remove unused members from struct cl_thread_info
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (13 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 14/31] lustre: llite: buggy special handling on MULTIMODRPCS James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 16/31] lustre: obd: remove OBD_NOTIFY_SYNC{, _NONBLOCK} James Simmons
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Dmitry Eremin <dmitry.eremin@intel.com>

The pointer to the topmost ongoing IO in the thread and
other members are not used any more.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-8888
Reviewed-on: https://review.whamcloud.com/24062
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |  1 -
 .../staging/lustre/lustre/obdclass/cl_internal.h   | 45 +---------------------
 drivers/staging/lustre/lustre/obdclass/cl_io.c     | 21 +---------
 drivers/staging/lustre/lustre/obdclass/cl_object.c | 16 +-------
 4 files changed, 4 insertions(+), 79 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 58af22e..382bfe8 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -2291,7 +2291,6 @@ int cl_io_commit_async(const struct lu_env *env, struct cl_io *io,
 		       cl_commit_cbt cb);
 int cl_io_read_ahead(const struct lu_env *env, struct cl_io *io,
 		     pgoff_t start, struct cl_read_ahead *ra);
-int cl_io_is_going(const struct lu_env *env);
 
 /**
  * True, iff \a io is an O_APPEND write(2).
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_internal.h b/drivers/staging/lustre/lustre/obdclass/cl_internal.h
index a0db830..8770e32 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_internal.h
+++ b/drivers/staging/lustre/lustre/obdclass/cl_internal.h
@@ -37,57 +37,14 @@
 #ifndef _CL_INTERNAL_H
 #define _CL_INTERNAL_H
 
-#define CLT_PVEC_SIZE (14)
-
-/**
- * Possible levels of the nesting. Currently this is 2: there are "top"
- * entities (files, extent locks), and "sub" entities (stripes and stripe
- * locks). This is used only for debugging counters right now.
- */
-enum clt_nesting_level {
-	CNL_TOP,
-	CNL_SUB,
-	CNL_NR
-};
-
 /**
  * Thread local state internal for generic cl-code.
  */
 struct cl_thread_info {
-	/*
-	 * Common fields.
-	 */
-	struct cl_io	 clt_io;
-	struct cl_2queue     clt_queue;
-
-	/*
-	 * Fields used by cl_lock.c
-	 */
-	struct cl_lock_descr clt_descr;
-	struct cl_page_list  clt_list;
-	/** @} debugging */
-
-	/*
-	 * Fields used by cl_page.c
-	 */
-	struct cl_page      *clt_pvec[CLT_PVEC_SIZE];
-
-	/*
-	 * Fields used by cl_io.c
-	 */
 	/**
-	 * Pointer to the topmost ongoing IO in this thread.
-	 */
-	struct cl_io	*clt_current_io;
-	/**
-	 * Used for submitting a sync io.
+	 * Used for submitting a sync I/O.
 	 */
 	struct cl_sync_io    clt_anchor;
-	/**
-	 * Fields used by cl_lock_discard_pages().
-	 */
-	pgoff_t	      clt_next_index;
-	pgoff_t	      clt_fn_index; /* first non-overlapped index */
 };
 
 struct cl_thread_info *cl_env_info(const struct lu_env *env);
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_io.c b/drivers/staging/lustre/lustre/obdclass/cl_io.c
index 2c77e72..3a96d4a 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_io.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_io.c
@@ -68,14 +68,6 @@ static inline int cl_io_is_loopable(const struct cl_io *io)
 }
 
 /**
- * Returns true iff there is an IO ongoing in the given environment.
- */
-int cl_io_is_going(const struct lu_env *env)
-{
-	return cl_env_info(env)->clt_current_io != NULL;
-}
-
-/**
  * cl_io invariant that holds at all times when exported cl_io_*() functions
  * are entered and left.
  */
@@ -100,7 +92,6 @@ static int cl_io_invariant(const struct cl_io *io)
 void cl_io_fini(const struct lu_env *env, struct cl_io *io)
 {
 	struct cl_io_slice    *slice;
-	struct cl_thread_info *info;
 
 	LINVRNT(cl_io_type_is_valid(io->ci_type));
 	LINVRNT(cl_io_invariant(io));
@@ -119,9 +110,6 @@ void cl_io_fini(const struct lu_env *env, struct cl_io *io)
 		slice->cis_io = NULL;
 	}
 	io->ci_state = CIS_FINI;
-	info = cl_env_info(env);
-	if (info->clt_current_io == io)
-		info->clt_current_io = NULL;
 
 	/* sanity check for layout change */
 	switch (io->ci_type) {
@@ -184,11 +172,8 @@ static int cl_io_init0(const struct lu_env *env, struct cl_io *io,
 int cl_io_sub_init(const struct lu_env *env, struct cl_io *io,
 		   enum cl_io_type iot, struct cl_object *obj)
 {
-	struct cl_thread_info *info = cl_env_info(env);
-
 	LASSERT(obj != cl_object_top(obj));
-	if (!info->clt_current_io)
-		info->clt_current_io = io;
+
 	return cl_io_init0(env, io, iot, obj);
 }
 EXPORT_SYMBOL(cl_io_sub_init);
@@ -206,12 +191,8 @@ int cl_io_sub_init(const struct lu_env *env, struct cl_io *io,
 int cl_io_init(const struct lu_env *env, struct cl_io *io,
 	       enum cl_io_type iot, struct cl_object *obj)
 {
-	struct cl_thread_info *info = cl_env_info(env);
-
 	LASSERT(obj == cl_object_top(obj));
-	LASSERT(!info->clt_current_io);
 
-	info->clt_current_io = io;
 	return cl_io_init0(env, io, iot, obj);
 }
 EXPORT_SYMBOL(cl_io_init);
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index 42cce2d..d1d7bec 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -973,20 +973,8 @@ struct cl_thread_info *cl_env_info(const struct lu_env *env)
 	return lu_context_key_get(&env->le_ctx, &cl_key);
 }
 
-/* defines cl0_key_{init,fini}() */
-LU_KEY_INIT_FINI(cl0, struct cl_thread_info);
-
-static void *cl_key_init(const struct lu_context *ctx,
-			 struct lu_context_key *key)
-{
-	return cl0_key_init(ctx, key);
-}
-
-static void cl_key_fini(const struct lu_context *ctx,
-			struct lu_context_key *key, void *data)
-{
-	cl0_key_fini(ctx, key, data);
-}
+/* defines cl_key_{init,fini}() */
+LU_KEY_INIT_FINI(cl, struct cl_thread_info);
 
 static struct lu_context_key cl_key = {
 	.lct_tags = LCT_CL_THREAD,
-- 
1.8.3.1


* [lustre-devel] [PATCH 16/31] lustre: obd: remove OBD_NOTIFY_SYNC{, _NONBLOCK}
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (14 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 15/31] lustre: clio: remove unused members from struct cl_thread_info James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 17/31] lustre: obdclass: handle early requests vs CT registering James Simmons
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: "John L. Hammond" <jhammond@whamcloud.com>

None of the OBD notify handlers listen for OBD_NOTIFY_SYNC{,_NONBLOCK}
events so remove them and related code in lov_notify().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-8403
Reviewed-on: https://review.whamcloud.com/21421
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h |  3 --
 drivers/staging/lustre/lustre/lov/lov_obd.c | 52 +++++------------------------
 2 files changed, 9 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 62f85a1..5bf2be8 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -491,9 +491,6 @@ enum obd_notify_event {
 	OBD_NOTIFY_INACTIVE,
 	/* Connect data for import were changed */
 	OBD_NOTIFY_OCD,
-	/* Sync request */
-	OBD_NOTIFY_SYNC_NONBLOCK,
-	OBD_NOTIFY_SYNC,
 	/* Configuration event */
 	OBD_NOTIFY_CONFIG,
 	/* Administratively deactivate/activate event */
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 0dd471c..85d3b29 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -429,10 +429,8 @@ static int lov_notify(struct obd_device *obd, struct obd_device *watched,
 	struct lov_obd *lov = &obd->u.lov;
 
 	down_read(&lov->lov_notify_lock);
-	if (!lov->lov_connects) {
-		up_read(&lov->lov_notify_lock);
-		return rc;
-	}
+	if (!lov->lov_connects)
+		goto out_notify_lock;
 
 	if (ev == OBD_NOTIFY_ACTIVE || ev == OBD_NOTIFY_INACTIVE ||
 	    ev == OBD_NOTIFY_ACTIVATE || ev == OBD_NOTIFY_DEACTIVATE) {
@@ -441,12 +439,13 @@ static int lov_notify(struct obd_device *obd, struct obd_device *watched,
 		LASSERT(watched);
 
 		if (strcmp(watched->obd_type->typ_name, LUSTRE_OSC_NAME)) {
-			up_read(&lov->lov_notify_lock);
 			CERROR("unexpected notification of %s %s!\n",
 			       watched->obd_type->typ_name,
 			       watched->obd_name);
-			return -EINVAL;
+			rc = -EINVAL;
+			goto out_notify_lock;
 		}
+
 		uuid = &watched->u.cli.cl_target_uuid;
 
 		/* Set OSC as active before notifying the observer, so the
@@ -454,53 +453,20 @@ static int lov_notify(struct obd_device *obd, struct obd_device *watched,
 		 */
 		rc = lov_set_osc_active(obd, uuid, ev);
 		if (rc < 0) {
-			up_read(&lov->lov_notify_lock);
 			CERROR("event(%d) of %s failed: %d\n", ev,
 			       obd_uuid2str(uuid), rc);
-			return rc;
+			goto out_notify_lock;
 		}
 		/* active event should be pass lov target index as data */
 		data = &rc;
 	}
 
 	/* Pass the notification up the chain. */
-	if (watched) {
-		rc = obd_notify_observer(obd, watched, ev, data);
-	} else {
-		/* NULL watched means all osc's in the lov (only for syncs) */
-		/* sync event should be send lov idx as data */
-		struct lov_obd *lov = &obd->u.lov;
-		int i, is_sync;
-
-		data = &i;
-		is_sync = (ev == OBD_NOTIFY_SYNC) ||
-			  (ev == OBD_NOTIFY_SYNC_NONBLOCK);
-
-		obd_getref(obd);
-		for (i = 0; i < lov->desc.ld_tgt_count; i++) {
-			if (!lov->lov_tgts[i])
-				continue;
-
-			/* don't send sync event if target not
-			 * connected/activated
-			 */
-			if (is_sync &&  !lov->lov_tgts[i]->ltd_active)
-				continue;
-
-			rc = obd_notify_observer(obd, lov->lov_tgts[i]->ltd_obd,
-						 ev, data);
-			if (rc) {
-				CERROR("%s: notify %s of %s failed %d\n",
-				       obd->obd_name,
-				       obd->obd_observer->obd_name,
-				       lov->lov_tgts[i]->ltd_obd->obd_name,
-				       rc);
-			}
-		}
-		obd_putref(obd);
-	}
+	rc = obd_notify_observer(obd, watched, ev, data);
 
+out_notify_lock:
 	up_read(&lov->lov_notify_lock);
+
 	return rc;
 }
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 17/31] lustre: obdclass: handle early requests vs CT registering
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (15 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 16/31] lustre: obd: remove OBD_NOTIFY_SYNC{, _NONBLOCK} James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 18/31] lustre: libcfs: avoid overflow of crypto bandwidth calculation James Simmons
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Bruno Faccini <bruno.faccini@intel.com>

This patch addresses cases where CDT may start to send requests
before CT has fully registered with all MDTs and thus when the KUC
pipe kernel side has still not been initialized in
lmv_hsm_ct_register().

This avoids oopses caused by kkuc_groups[KUC_GRP_HSM] being
uninitialized (zeroed); we rely on the CDT to retry later.
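
The guard can be illustrated outside the kernel. The following is a minimal
user-space sketch, not the Lustre code: struct list_head is re-declared
locally, and demo_groups, demo_group_register() and demo_group_put() are
hypothetical names. It assumes the group table lives in zero-filled static
storage, so a head that was never registered has next == NULL and must not
be walked.

```c
#include <errno.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical group table. Static storage is zero-filled, so every
 * head has next == NULL until a registration initializes it.
 */
#define DEMO_GRP_MAX 4
static struct list_head demo_groups[DEMO_GRP_MAX];

/* Models the first registration: turn the head into an empty ring. */
static void demo_group_register(unsigned int group)
{
	demo_groups[group].next = &demo_groups[group];
	demo_groups[group].prev = &demo_groups[group];
}

/* Returns -EAGAIN if the head was never initialized, otherwise the
 * number of entries walked (0 for an empty, initialized list).
 */
static int demo_group_put(unsigned int group)
{
	struct list_head *pos;
	int visited = 0;

	/* Without this check the loop below would dereference NULL. */
	if (!demo_groups[group].next)
		return -EAGAIN;

	for (pos = demo_groups[group].next; pos != &demo_groups[group];
	     pos = pos->next)
		visited++;

	return visited;
}
```

In this sketch a caller that gets -EAGAIN is expected to retry later, which
mirrors the role the description assigns to the CDT.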

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-9038
Reviewed-on: https://review.whamcloud.com/25050
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/obdclass/kernelcomm.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/staging/lustre/lustre/obdclass/kernelcomm.c b/drivers/staging/lustre/lustre/obdclass/kernelcomm.c
index 63067a7..304288d 100644
--- a/drivers/staging/lustre/lustre/obdclass/kernelcomm.c
+++ b/drivers/staging/lustre/lustre/obdclass/kernelcomm.c
@@ -183,6 +183,14 @@ int libcfs_kkuc_group_put(unsigned int group, void *payload)
 	int one_success = 0;
 
 	down_write(&kg_sem);
+
+	if (unlikely(!kkuc_groups[group].next) ||
+	    unlikely(OBD_FAIL_CHECK(OBD_FAIL_MDS_HSM_CT_REGISTER_NET))) {
+		/* no agent have fully registered, CDT will retry */
+		up_write(&kg_sem);
+		return -EAGAIN;
+	}
+
 	list_for_each_entry(reg, &kkuc_groups[group], kr_chain) {
 		if (reg->kr_fp) {
 			rc = libcfs_kkuc_msg_put(reg->kr_fp, payload);
-- 
1.8.3.1


* [lustre-devel] [PATCH 18/31] lustre: libcfs: avoid overflow of crypto bandwidth calculation
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (16 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 17/31] lustre: obdclass: handle early requests vs CT registering James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 19/31] lustre: obd: remove OBD_NOTIFY_CONFIG James Simmons
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Gu Zheng <gzheng@ddn.com>

bcount and buf_len are both int, and the calculation contains no
cast to a wider type:

tmp = ((bcount * buf_len / jiffies_to_msecs(end - start)) *
       1000) / (1024 * 1024);

The product bcount * buf_len may overflow on modern fast machines.
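
As an illustration only, the arithmetic can be reproduced in user space.
The helper names demo_bw_32() and demo_bw_64() are hypothetical. The 32-bit
version uses uint32_t so that the wrap-around is well defined (with plain
int the overflow would be undefined behavior), and the 64-bit version
assumes a platform where the wider type really is 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit model of the old arithmetic. uint32_t is used on purpose:
 * unsigned wrap-around is defined, signed int overflow is not.
 */
static uint32_t demo_bw_32(uint32_t bcount, uint32_t buf_len,
			   uint32_t elapsed_ms)
{
	assert(elapsed_ms != 0);
	return ((bcount * buf_len / elapsed_ms) * 1000) / (1024 * 1024);
}

/* Same formula with a 64-bit iteration counter, so the product
 * bcount * buf_len is computed in 64 bits.
 */
static uint64_t demo_bw_64(uint64_t bcount, uint32_t buf_len,
			   uint32_t elapsed_ms)
{
	assert(elapsed_ms != 0);
	return ((bcount * buf_len / elapsed_ms) * 1000) / (1024 * 1024);
}
```

For 5000 iterations over a 1 MiB buffer in 1000 ms, the product is
5242880000, which does not fit in 32 bits. demo_bw_64() yields 5000 MB/s,
while demo_bw_32() wraps and yields 903.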

Signed-off-by: Gu Zheng <gzheng@ddn.com>
WC-id: https://jira.whamcloud.com/browse/LU-9116
Reviewed-on: https://review.whamcloud.com/25436
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/libcfs/linux-crypto.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux-crypto.c b/drivers/staging/lustre/lnet/libcfs/linux-crypto.c
index 21ff9bf..cfff54d 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux-crypto.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux-crypto.c
@@ -313,7 +313,8 @@ static void cfs_crypto_performance_test(enum cfs_crypto_hash_alg hash_alg)
 	int buf_len = max(PAGE_SIZE, 1048576UL);
 	void *buf;
 	unsigned long start, end;
-	int bcount, err = 0;
+	unsigned long bcount;
+	int err = 0;
 	struct page *page;
 	unsigned char hash[CFS_CRYPTO_HASH_DIGESTSIZE_MAX];
 	unsigned int hash_len = sizeof(hash);
-- 
1.8.3.1


* [lustre-devel] [PATCH 19/31] lustre: obd: remove OBD_NOTIFY_CONFIG
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (17 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 18/31] lustre: libcfs: avoid overflow of crypto bandwidth calculation James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 20/31] lustre: llite: Remove OBD_FAIL_OSC_CONNECT_CKSUM James Simmons
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: "John L. Hammond" <jhammond@whamcloud.com>

None of the OBD notify handlers listen for the OBD_NOTIFY_CONFIG
event so remove it and its sole use in server_start_targets()
which is on the server side.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-8403
Reviewed-on: https://review.whamcloud.com/21422
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 5bf2be8..10e3bb8 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -491,8 +491,6 @@ enum obd_notify_event {
 	OBD_NOTIFY_INACTIVE,
 	/* Connect data for import were changed */
 	OBD_NOTIFY_OCD,
-	/* Configuration event */
-	OBD_NOTIFY_CONFIG,
 	/* Administratively deactivate/activate event */
 	OBD_NOTIFY_DEACTIVATE,
 	OBD_NOTIFY_ACTIVATE
-- 
1.8.3.1


* [lustre-devel] [PATCH 20/31] lustre: llite: Remove OBD_FAIL_OSC_CONNECT_CKSUM
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (18 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 19/31] lustre: obd: remove OBD_NOTIFY_CONFIG James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 21/31] lustre: osc: hung in osc_destroy() James Simmons
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Abrarahmed Momin <abrar.habib@seagate.com>

Remove OBD_FAIL_OSC_CONNECT_CKSUM as all clients and servers
since 1.8 support OBD_CONNECT_CKSUM. No reason to check
interoperability with older servers anymore.

Signed-off-by: Abrarahmed Momin <abrar.habib@seagate.com>
Seagate-bug-id: MRP-1421
WC-id: https://jira.whamcloud.com/browse/LU-5361
Reviewed-on: https://review.whamcloud.com/23644
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/include/obd_support.h    |  2 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    | 22 ++++++++++------------
 drivers/staging/lustre/lustre/ptlrpc/import.c      |  5 ++---
 3 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 726cc4d..80b9935 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -312,7 +312,7 @@
 #define OBD_FAIL_OSC_CHECKSUM_RECEIVE    0x408
 #define OBD_FAIL_OSC_CHECKSUM_SEND       0x409
 #define OBD_FAIL_OSC_BRW_PREP_REQ2       0x40a
-#define OBD_FAIL_OSC_CONNECT_CKSUM       0x40b
+/* #define OBD_FAIL_OSC_CONNECT_CKSUM	 0x40b Obsolete since 2.9 */
 #define OBD_FAIL_OSC_CKSUM_ADLER_ONLY    0x40c
 #define OBD_FAIL_OSC_DIO_PAUSE	   0x40d
 #define OBD_FAIL_OSC_OBJECT_CONTENTION   0x40e
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index ccb5bda..cd5b064 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -395,19 +395,17 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt)
 	if (!OBD_FAIL_CHECK(OBD_FAIL_OSC_CONNECT_GRANT_PARAM))
 		data->ocd_connect_flags |= OBD_CONNECT_GRANT_PARAM;
 
-	if (!OBD_FAIL_CHECK(OBD_FAIL_OSC_CONNECT_CKSUM)) {
-		/* OBD_CONNECT_CKSUM should always be set, even if checksums are
-		 * disabled by default, because it can still be enabled on the
-		 * fly via /sys. As a consequence, we still need to come to an
-		 * agreement on the supported algorithms at connect time
-		 */
-		data->ocd_connect_flags |= OBD_CONNECT_CKSUM;
+	/* OBD_CONNECT_CKSUM should always be set, even if checksums are
+	 * disabled by default, because it can still be enabled on the
+	 * fly via /sys. As a consequence, we still need to come to an
+	 * agreement on the supported algorithms at connect time
+	 */
+	data->ocd_connect_flags |= OBD_CONNECT_CKSUM;
 
-		if (OBD_FAIL_CHECK(OBD_FAIL_OSC_CKSUM_ADLER_ONLY))
-			data->ocd_cksum_types = OBD_CKSUM_ADLER;
-		else
-			data->ocd_cksum_types = cksum_types_supported_client();
-	}
+	if (OBD_FAIL_CHECK(OBD_FAIL_OSC_CKSUM_ADLER_ONLY))
+		data->ocd_cksum_types = OBD_CKSUM_ADLER;
+	else
+		data->ocd_cksum_types = cksum_types_supported_client();
 
 	data->ocd_connect_flags |= OBD_CONNECT_LRU_RESIZE;
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index 4db0d89..07dc87d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -814,12 +814,11 @@ static int ptlrpc_connect_set_flags(struct obd_import *imp,
 		 * the checksum types it doesn't support
 		 */
 		if (!(ocd->ocd_cksum_types & cksum_types_supported_client())) {
-			LCONSOLE_WARN("The negotiation of the checksum algorithm to use with server %s failed (%x/%x), disabling checksums\n",
+			LCONSOLE_ERROR("The negotiation of the checksum algorithm to use with server %s failed (%x/%x), disabling checksums\n",
 				      obd2cli_tgt(imp->imp_obd),
 				      ocd->ocd_cksum_types,
 				      cksum_types_supported_client());
-			cli->cl_checksum = 0;
-			cli->cl_supp_cksum_types = OBD_CKSUM_ADLER;
+			return -EPROTO;
 		} else {
 			cli->cl_supp_cksum_types = ocd->ocd_cksum_types;
 		}
-- 
1.8.3.1


* [lustre-devel] [PATCH 21/31] lustre: osc: hung in osc_destroy()
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (19 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 20/31] lustre: llite: Remove OBD_FAIL_OSC_CONNECT_CKSUM James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 22/31] lustre: libcfs: reduce libcfs checksum speed test time James Simmons
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Andriy Skulysh <c17819@cray.com>

cl_destroy_in_flight becomes < 0 because
osc_can_send_destroy() does not increment
cl_destroy_in_flight if l_wait_event() gets
interrupted by a signal, but the request is
still sent and the request's interpret
function decrements the counter.

Don't send the OST_DESTROY request when the wait
is interrupted by a signal; return -EINTR instead.
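
The accounting problem can be modeled in a few lines of user-space C. This
is a simplified sketch under stated assumptions, not the osc code: struct
demo_client and all demo_*() helpers are hypothetical, the sleep is
replaced by a boolean "interrupted" input, and the check_rc parameter
selects between the old behavior (ignore the wait result) and the fixed
one.

```c
#include <errno.h>
#include <stdbool.h>

struct demo_client {
	int destroy_in_flight;	/* models cl_destroy_in_flight */
	int max_in_flight;
};

/* Models osc_can_send_destroy(): takes a slot only if one is free. */
static bool demo_can_send(struct demo_client *cli)
{
	if (cli->destroy_in_flight < cli->max_in_flight) {
		cli->destroy_in_flight++;
		return true;
	}
	return false;
}

/* Models the interpret callback: always releases one slot. */
static void demo_interpret(struct demo_client *cli)
{
	cli->destroy_in_flight--;
}

/* Models the wait. An interrupted wait returns without taking a slot. */
static int demo_wait_for_slot(struct demo_client *cli, bool interrupted)
{
	if (interrupted)
		return -EINTR;
	/* Another request completes, then this waiter takes its slot. */
	demo_interpret(cli);
	demo_can_send(cli);
	return 0;
}

/* check_rc == false models the old code, true models the fix. */
static int demo_destroy(struct demo_client *cli, bool interrupted,
			bool check_rc)
{
	if (!demo_can_send(cli)) {
		int rc = demo_wait_for_slot(cli, interrupted);

		if (check_rc && rc)
			return rc;	/* request is never sent */
	}
	demo_interpret(cli);	/* request sent and later completed */
	return 0;
}
```

With one slot already held by another request, the unchecked path sends
without owning a slot, so the counter ends up at -1 once both requests
complete. The checked path returns -EINTR and leaves the counter balanced.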

Signed-off-by: Andriy Skulysh <c17819@cray.com>
Seagate-bug-id: MRP-3834
WC-id: https://jira.whamcloud.com/browse/LU-8624
Reviewed-on: https://review.whamcloud.com/22588
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/osc/osc_request.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 21497ea..b7f8e07 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -652,8 +652,12 @@ static int osc_destroy(const struct lu_env *env, struct obd_export *exp,
 		 * Wait until the number of on-going destroy RPCs drops
 		 * under max_rpc_in_flight
 		 */
-		l_wait_event_abortable_exclusive(cli->cl_destroy_waitq,
-					       osc_can_send_destroy(cli));
+		rc = l_wait_event_abortable_exclusive(cli->cl_destroy_waitq,
+						      osc_can_send_destroy(cli));
+		if (rc) {
+			ptlrpc_request_free(req);
+			return rc;
+		}
 	}
 
 	/* Do not wait for response */
-- 
1.8.3.1


* [lustre-devel] [PATCH 22/31] lustre: libcfs: reduce libcfs checksum speed test time
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (20 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 21/31] lustre: osc: hung in osc_destroy() James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 23/31] lustre: llite: Return -ERESTARTSYS in range_lock() James Simmons
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Andreas Dilger <adilger@whamcloud.com>

Loading the libcfs module is getting increasingly slow due to
multiple checksum types being speed tested at startup (8 different
checksums * 1s per checksum).

Reduce the number of checksum algorithms checked at module load
time to the ones that actually need the speed (i.e. the bulk
data checksums), and reduce the amount of time taken to compute the
checksum. The other checksum types typically do not need the speed,
but rather are selected by the configuration.

Precompute the checksum speeds and supported types for the OST so
they are not recomputed for each new client that connects.

This reduces the module load time from 8.0s to 0.76s in my testing.
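
The combination of an enum sentinel and on-demand measurement described
above can be sketched as follows. This is a user-space illustration with
hypothetical names (demo_hash_alg, demo_measure(), demo_init(),
demo_speed()); the measurement is a placeholder that returns a fixed
number, and a speed of 0 is assumed to mean "not measured yet".

```c
enum demo_hash_alg {
	DEMO_ALG_NULL = 0,
	DEMO_ALG_ADLER32,
	DEMO_ALG_CRC32,
	DEMO_ALG_CRC32C,
	/* algorithms before here are speed-tested up front */
	DEMO_ALG_MD5,
	DEMO_ALG_SHA1,
	DEMO_ALG_MAX,
	DEMO_ALG_SPEED_MAX = DEMO_ALG_MD5
};

static int demo_speeds[DEMO_ALG_MAX];	/* 0 means not yet measured */
static int demo_measurements;		/* counts how often we measured */

/* Placeholder for the real timed test; returns a made-up speed. */
static int demo_measure(int alg)
{
	demo_measurements++;
	return 100 + alg;
}

/* Load-time work: only the entries below the sentinel are measured. */
static void demo_init(void)
{
	int alg;

	for (alg = 0; alg < DEMO_ALG_SPEED_MAX; alg++)
		demo_speeds[alg] = demo_measure(alg);
}

/* Lookup: measure on first request if it was skipped at load time. */
static int demo_speed(int alg)
{
	if (demo_speeds[alg] == 0)
		demo_speeds[alg] = demo_measure(alg);
	return demo_speeds[alg];
}
```

Placing the sentinel inside the enum keeps the up-front set contiguous, so
moving an algorithm into or out of it only requires reordering the enum
entries.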

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-9201
Reviewed-on: https://review.whamcloud.com/25923
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/linux/libcfs/libcfs_crypto.h    |  4 ++-
 drivers/staging/lustre/lnet/libcfs/linux-crypto.c  | 30 +++++++++++++++++-----
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
index 176fae7..ca8620b 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
@@ -46,13 +46,15 @@ enum cfs_crypto_hash_alg {
 	CFS_HASH_ALG_NULL       = 0,
 	CFS_HASH_ALG_ADLER32,
 	CFS_HASH_ALG_CRC32,
+	CFS_HASH_ALG_CRC32C,
+	/* hashes before here will be speed-tested at module load */
 	CFS_HASH_ALG_MD5,
 	CFS_HASH_ALG_SHA1,
 	CFS_HASH_ALG_SHA256,
 	CFS_HASH_ALG_SHA384,
 	CFS_HASH_ALG_SHA512,
-	CFS_HASH_ALG_CRC32C,
 	CFS_HASH_ALG_MAX,
+	CFS_HASH_ALG_SPEED_MAX = CFS_HASH_ALG_MD5,
 	CFS_HASH_ALG_UNKNOWN	= 0xff
 };
 
diff --git a/drivers/staging/lustre/lnet/libcfs/linux-crypto.c b/drivers/staging/lustre/lnet/libcfs/linux-crypto.c
index cfff54d..b206e3c 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux-crypto.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux-crypto.c
@@ -300,7 +300,10 @@ int cfs_crypto_hash_final(struct ahash_request *req,
 /**
  * Compute the speed of specified hash function
  *
- * Run a speed test on the given hash algorithm on buffer of the given size.
+ * Run a speed test on the given hash algorithm on buffer using a 1MB buffer
+ * size.  This is a reasonable buffer size for Lustre RPCs, even if the actual
+ * RPC size is larger or smaller.
+ *
  * The speed is stored internally in the cfs_crypto_hash_speeds[] array, and
  * is available through the cfs_crypto_hash_speed() function.
  *
@@ -329,8 +332,8 @@ static void cfs_crypto_performance_test(enum cfs_crypto_hash_alg hash_alg)
 	memset(buf, 0xAD, PAGE_SIZE);
 	kunmap(page);
 
-	for (start = jiffies, end = start + msecs_to_jiffies(MSEC_PER_SEC),
-	     bcount = 0; time_before(jiffies, end); bcount++) {
+	for (start = jiffies, end = start + msecs_to_jiffies(MSEC_PER_SEC / 4),
+	     bcount = 0; time_before(jiffies, end) && err == 0; bcount++) {
 		struct ahash_request *hdesc;
 		int i;
 
@@ -373,8 +376,12 @@ static void cfs_crypto_performance_test(enum cfs_crypto_hash_alg hash_alg)
 /**
  * hash speed in Mbytes per second for valid hash algorithm
  *
- * Return the performance of the specified \a hash_alg that was previously
- * computed using cfs_crypto_performance_test().
+ * Return the performance of the specified \a hash_alg that was
+ * computed using cfs_crypto_performance_test().  If the performance
+ * has not yet been computed, do that when it is first requested.
+ * That avoids computing the speed when it is not actually needed.
+ * To avoid competing threads computing the checksum speed at the
+ * same time, only compute a single checksum speed at one time.
  *
  * \param[in] hash_alg	hash algorithm id (CFS_HASH_ALG_*)
  *
@@ -384,8 +391,17 @@ static void cfs_crypto_performance_test(enum cfs_crypto_hash_alg hash_alg)
  */
 int cfs_crypto_hash_speed(enum cfs_crypto_hash_alg hash_alg)
 {
-	if (hash_alg < CFS_HASH_ALG_MAX)
+	if (hash_alg < CFS_HASH_ALG_MAX) {
+		if (unlikely(cfs_crypto_hash_speeds[hash_alg] == 0)) {
+			static DEFINE_MUTEX(crypto_hash_speed_mutex);
+
+			mutex_lock(&crypto_hash_speed_mutex);
+			if (cfs_crypto_hash_speeds[hash_alg] == 0)
+				cfs_crypto_performance_test(hash_alg);
+			mutex_unlock(&crypto_hash_speed_mutex);
+		}
 		return cfs_crypto_hash_speeds[hash_alg];
+	}
 	return -ENOENT;
 }
 EXPORT_SYMBOL(cfs_crypto_hash_speed);
@@ -412,7 +428,7 @@ static int cfs_crypto_test_hashes(void)
 {
 	enum cfs_crypto_hash_alg hash_alg;
 
-	for (hash_alg = 0; hash_alg < CFS_HASH_ALG_MAX; hash_alg++)
+	for (hash_alg = 0; hash_alg < CFS_HASH_ALG_SPEED_MAX; hash_alg++)
 		cfs_crypto_performance_test(hash_alg);
 
 	return 0;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread
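
Note on patch 22: the diff above combines two ideas. Only the algorithms
ordered before CFS_HASH_ALG_SPEED_MAX are benchmarked at module load, and
cfs_crypto_hash_speed() benchmarks any other algorithm on first request,
re-checking under a mutex so that only one thread runs the test. The
following is a minimal userspace sketch of that pattern, not the libcfs
code: the shortened enum, measure(), test_hashes(), hash_speed() and
measure_count() are hypothetical names, the benchmark is replaced by a
fixed formula, and pthread/C11 atomics stand in for the kernel primitives.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical miniature of enum cfs_crypto_hash_alg. */
enum alg {
	ALG_NULL = 0,
	ALG_ADLER32,
	ALG_CRC32,
	ALG_CRC32C,
	/* algorithms before here are measured eagerly at startup */
	ALG_MD5,
	ALG_SHA1,
	ALG_MAX,
	ALG_SPEED_MAX = ALG_MD5,
};

static atomic_int speeds[ALG_MAX];	/* 0 means "not measured yet" */
static int measure_calls;		/* how often the slow path ran */
static pthread_mutex_t speed_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for the real benchmark. Call it either with speed_mutex
 * held or from single-threaded startup code.
 */
static void measure(enum alg alg)
{
	assert(alg < ALG_MAX);
	measure_calls++;
	atomic_store(&speeds[alg], 100 * ((int)alg + 1));
}

/* Eager pass: only the bulk-data checksums are measured up front. */
void test_hashes(void)
{
	for (int alg = 0; alg < ALG_SPEED_MAX; alg++)
		measure((enum alg)alg);
}

/* Lazy lookup: measure on first use, at most once per algorithm. */
int hash_speed(enum alg alg)
{
	if ((int)alg < 0 || alg >= ALG_MAX)
		return -ENOENT;

	if (atomic_load(&speeds[alg]) == 0) {
		pthread_mutex_lock(&speed_mutex);
		/* Re-check: another thread may have finished the test. */
		if (atomic_load(&speeds[alg]) == 0)
			measure(alg);
		pthread_mutex_unlock(&speed_mutex);
	}
	return atomic_load(&speeds[alg]);
}

int measure_count(void)
{
	return measure_calls;
}
```

In this sketch a caller runs test_hashes() once at startup and afterwards
only calls hash_speed(). The second check inside the lock is what keeps
two racing callers from both running the benchmark.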

* [lustre-devel] [PATCH 23/31] lustre: llite: Return -ERESTARTSYS in range_lock()
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (21 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 22/31] lustre: libcfs: reduce libcfs checksum speed test time James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 24/31] lustre: obdclass: use static initializer macros where possible James Simmons
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Chris Horn <hornc@cray.com>

If we return -ERESTARTSYS rather than -EINTR then the syscall can be
retried rather than failing with -EINTR.

Signed-off-by: Chris Horn <hornc@cray.com>
WC-id: https://jira.whamcloud.com/browse/LU-8735
Reviewed-on: https://review.whamcloud.com/23259
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/range_lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/range_lock.c b/drivers/staging/lustre/lustre/llite/range_lock.c
index acdb0dc..d37da8e 100644
--- a/drivers/staging/lustre/lustre/llite/range_lock.c
+++ b/drivers/staging/lustre/lustre/llite/range_lock.c
@@ -157,7 +157,7 @@ int range_lock(struct range_lock_tree *tree, struct range_lock *lock)
 
 		if (signal_pending(current)) {
 			range_unlock(tree, lock);
-			rc = -EINTR;
+			rc = -ERESTARTSYS;
 			goto out;
 		}
 		spin_lock(&tree->rlt_lock);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread
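
Note on patch 23: the difference between the two return codes only shows
up on the kernel's signal-delivery path. The sketch below is a deliberately
simplified model of that decision, not kernel code: signal_exit_path(),
struct result and enum outcome are hypothetical names, and the local
ERESTARTSYS define mirrors the kernel-internal value, which user space
never sees.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal code; it is translated before returning to user space. */
#define ERESTARTSYS 512

enum outcome { OUTCOME_RETURN, OUTCOME_RESTART };

struct result {
	enum outcome what;	/* restart the syscall or return to the caller */
	int user_errno;		/* errno seen by user space when returning */
};

/*
 * Simplified model: rc is the negative error a syscall returned after
 * being interrupted by a signal that has a handler installed.
 */
struct result signal_exit_path(int rc, bool handler_has_sa_restart)
{
	struct result r = { OUTCOME_RETURN, -rc };

	assert(rc < 0);

	if (rc == -ERESTARTSYS) {
		if (handler_has_sa_restart)
			r.what = OUTCOME_RESTART;	/* transparently retried */
		else
			r.user_errno = EINTR;		/* surfaced as EINTR */
	}
	/* A plain -EINTR is never restarted, whatever the handler flags. */
	return r;
}
```

Under this model, returning -ERESTARTSYS from range_lock() lets a handler
installed with SA_RESTART hide the interruption from the application,
while a handler without SA_RESTART still produces EINTR as before.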

* [lustre-devel] [PATCH 24/31] lustre: obdclass: use static initializer macros where possible
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (22 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 23/31] lustre: llite: Return -ERESTARTSYS in range_lock() James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h James Simmons
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: "John L. Hammond" <jhammond@whamcloud.com>

In obdclass, replace module load time initialization of several
atomics, lists, locks, mutexes, wait queues, etc. with static
initialization using the kernel-provided macros.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-9010
Reviewed-on: https://review.whamcloud.com/24827
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lu_object.h  |  6 -----
 drivers/staging/lustre/lustre/include/obd_class.h  |  6 -----
 drivers/staging/lustre/lustre/obdclass/class_obd.c | 26 +++++-----------------
 drivers/staging/lustre/lustre/obdclass/genops.c    |  5 ++++-
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 15 -------------
 .../staging/lustre/lustre/obdclass/lustre_peer.c   | 16 ++-----------
 6 files changed, 12 insertions(+), 62 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index 4153db7..47f8021 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -330,12 +330,6 @@ struct lu_device_type {
 	 * Number of existing device type instances.
 	 */
 	atomic_t				ldt_device_nr;
-	/**
-	 * Linkage into a global list of all device types.
-	 *
-	 * \see lu_device_types.
-	 */
-	struct list_head			      ldt_linkage;
 };
 
 /**
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index e772e3d..184da99 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -48,7 +48,6 @@
 #define OBD_STATFS_FOR_MDT0	0x0004
 
 /* OBD Device Declarations */
-extern struct obd_device *obd_devs[MAX_OBD_DEVICES];
 extern rwlock_t obd_dev_lock;
 
 /* OBD Operations Declarations */
@@ -59,7 +58,6 @@
 struct lu_device_type;
 
 /* genops.c */
-extern struct list_head obd_types;
 struct obd_export *class_conn2export(struct lustre_handle *conn);
 int class_register_type(struct obd_ops *dt_ops, struct md_ops *md_ops,
 			const char *name, struct lu_device_type *ldt);
@@ -133,7 +131,6 @@ void class_decref(struct obd_device *obd,
 int class_config_llog_handler(const struct lu_env *env,
 			      struct llog_handle *handle,
 			      struct llog_rec_hdr *rec, void *data);
-int class_add_uuid(const char *uuid, __u64 nid);
 
 /* obdecho */
 void lprocfs_echo_init_vars(struct lprocfs_static_vars *lvars);
@@ -1576,13 +1573,10 @@ struct lwp_register_item {
 int class_add_uuid(const char *uuid, __u64 nid);
 int class_del_uuid(const char *uuid);
 int class_check_uuid(struct obd_uuid *uuid, __u64 nid);
-void class_init_uuidlist(void);
-void class_exit_uuidlist(void);
 
 /* class_obd.c */
 extern char obd_jobid_node[];
 extern struct miscdevice obd_psdev;
-extern spinlock_t obd_types_lock;
 int class_procfs_init(void);
 int class_procfs_clean(void);
 
diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c
index 04e55fc..05ae6e1 100644
--- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
+++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
@@ -49,10 +49,6 @@
 #include <uapi/linux/lnet/libcfs_ioctl.h>
 #include "llog_internal.h"
 
-struct obd_device *obd_devs[MAX_OBD_DEVICES];
-struct list_head obd_types;
-DEFINE_RWLOCK(obd_dev_lock);
-
 /* The following are visible and mutable through /sys/fs/lustre. */
 unsigned int obd_debug_peer_on_timeout;
 EXPORT_SYMBOL(obd_debug_peer_on_timeout);
@@ -455,28 +451,25 @@ static int obd_init_checks(void)
 
 static int __init obdclass_init(void)
 {
-	int i, err;
+	int err;
 
 	LCONSOLE_INFO("Lustre: Build Version: " LUSTRE_VERSION_STRING "\n");
 
-	spin_lock_init(&obd_types_lock);
-
 	err = libcfs_setup();
 	if (err)
 		return err;
 
-	obd_zombie_impexp_init();
+	err = obd_zombie_impexp_init();
+	if (err)
+		return err;
 
 	err = obd_init_checks();
 	if (err)
 		goto cleanup_zombie_impexp;
 
-	class_init_uuidlist();
 	err = class_handle_init();
 	if (err)
-		goto cleanup_uuidlist;
-
-	INIT_LIST_HEAD(&obd_types);
+		goto cleanup_zombie_impexp;
 
 	err = misc_register(&obd_psdev);
 	if (err) {
@@ -484,10 +477,6 @@ static int __init obdclass_init(void)
 		goto cleanup_class_handle;
 	}
 
-	/* This struct is already zeroed for us (static global) */
-	for (i = 0; i < class_devno_max(); i++)
-		obd_devs[i] = NULL;
-
 	/* Default the dirty page cache cap to 1/2 of system memory.
 	 * For clients with less memory, a larger fraction is needed
 	 * for other purposes (mostly for BGL).
@@ -550,9 +539,6 @@ static int __init obdclass_init(void)
 cleanup_class_handle:
 	class_handle_cleanup();
 
-cleanup_uuidlist:
-	class_exit_uuidlist();
-
 cleanup_zombie_impexp:
 	obd_zombie_impexp_stop();
 
@@ -571,7 +557,7 @@ static void obdclass_exit(void)
 	class_procfs_clean();
 
 	class_handle_cleanup();
-	class_exit_uuidlist();
+	class_del_uuid(NULL); /* Delete all UUIDs. */
 	obd_zombie_impexp_stop();
 }
 
diff --git a/drivers/staging/lustre/lustre/obdclass/genops.c b/drivers/staging/lustre/lustre/obdclass/genops.c
index 8454b44..532418e 100644
--- a/drivers/staging/lustre/lustre/obdclass/genops.c
+++ b/drivers/staging/lustre/lustre/obdclass/genops.c
@@ -41,7 +41,10 @@
 #include <lprocfs_status.h>
 #include <lustre_kernelcomm.h>
 
-spinlock_t obd_types_lock;
+static DEFINE_SPINLOCK(obd_types_lock);
+static LIST_HEAD(obd_types);
+DEFINE_RWLOCK(obd_dev_lock);
+static struct obd_device *obd_devs[MAX_OBD_DEVICES];
 
 static struct kmem_cache *obd_device_cachep;
 struct kmem_cache *obdo_cachep;
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 2d24eb6..cb57abf 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -764,35 +764,20 @@ struct lu_object *lu_object_find_slice(const struct lu_env *env,
 }
 EXPORT_SYMBOL(lu_object_find_slice);
 
-/**
- * Global list of all device types.
- */
-static LIST_HEAD(lu_device_types);
-
 int lu_device_type_init(struct lu_device_type *ldt)
 {
 	int result = 0;
 
 	atomic_set(&ldt->ldt_device_nr, 0);
-	INIT_LIST_HEAD(&ldt->ldt_linkage);
 	if (ldt->ldt_ops->ldto_init)
 		result = ldt->ldt_ops->ldto_init(ldt);
 
-	if (!result) {
-		spin_lock(&obd_types_lock);
-		list_add(&ldt->ldt_linkage, &lu_device_types);
-		spin_unlock(&obd_types_lock);
-	}
-
 	return result;
 }
 EXPORT_SYMBOL(lu_device_type_init);
 
 void lu_device_type_fini(struct lu_device_type *ldt)
 {
-	spin_lock(&obd_types_lock);
-	list_del_init(&ldt->ldt_linkage);
-	spin_unlock(&obd_types_lock);
 	if (ldt->ldt_ops->ldto_fini)
 		ldt->ldt_ops->ldto_fini(ldt);
 }
diff --git a/drivers/staging/lustre/lustre/obdclass/lustre_peer.c b/drivers/staging/lustre/lustre/obdclass/lustre_peer.c
index 7fc62b7..5705b0a 100644
--- a/drivers/staging/lustre/lustre/obdclass/lustre_peer.c
+++ b/drivers/staging/lustre/lustre/obdclass/lustre_peer.c
@@ -51,20 +51,8 @@ struct uuid_nid_data {
 };
 
 /* FIXME: This should probably become more elegant than a global linked list */
-static struct list_head	g_uuid_list;
-static spinlock_t	g_uuid_lock;
-
-void class_init_uuidlist(void)
-{
-	INIT_LIST_HEAD(&g_uuid_list);
-	spin_lock_init(&g_uuid_lock);
-}
-
-void class_exit_uuidlist(void)
-{
-	/* delete all */
-	class_del_uuid(NULL);
-}
+static LIST_HEAD(g_uuid_list);
+static DEFINE_SPINLOCK(g_uuid_lock);
 
 int lustre_uuid_to_peer(const char *uuid, lnet_nid_t *peer_nid, int index)
 {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread
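
Note on patch 24: a statically initialized list and lock are usable before
any init function has run, which is why class_init_uuidlist() could be
dropped. The sketch below shows the same idea in user space, assuming
pthread in place of the kernel spinlock. The list_head macros are local
copies written for illustration, and uuid_list, uuid_lock, uuid_add() and
uuid_count() are hypothetical names rather than the lustre_peer.c symbols.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace copies of the kernel's list primitives, for illustration. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Both objects are valid at load time; no init function is required. */
static LIST_HEAD(uuid_list);
static pthread_mutex_t uuid_lock = PTHREAD_MUTEX_INITIALIZER;

struct uuid_entry {
	struct list_head link;
	const char *uuid;
};

/* Insert an entry at the head of the global list. */
void uuid_add(struct uuid_entry *e)
{
	assert(e != NULL);
	pthread_mutex_lock(&uuid_lock);
	e->link.next = uuid_list.next;
	e->link.prev = &uuid_list;
	uuid_list.next->prev = &e->link;
	uuid_list.next = &e->link;
	pthread_mutex_unlock(&uuid_lock);
}

/* Count the entries by walking the list until it wraps to the head. */
size_t uuid_count(void)
{
	size_t n = 0;

	pthread_mutex_lock(&uuid_lock);
	for (struct list_head *p = uuid_list.next; p != &uuid_list; p = p->next)
		n++;
	pthread_mutex_unlock(&uuid_lock);
	return n;
}
```

Because the head points at itself from the start, uuid_count() is safe to
call before anything has been added, with no ordering dependency on a
module init routine.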

* [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (23 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 24/31] lustre: obdclass: use static initializer macros where possible James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31 22:47   ` NeilBrown
  2018-07-31  2:26 ` [lustre-devel] [PATCH 26/31] lustre: obd: remove unused data parameter from obd_notify() James Simmons
                   ` (6 subsequent siblings)
  31 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Niu Yawei <yawei.niu@intel.com>

Move config type values CONFIG_T_XXX into lustre_idl.h since they
will be put on the wire when reading config logs.

Add missing wire checks for mgs_nidtbl_entry, mgs_config_body and
mgs_config_res.

Redefine CONFIG_SUB_XXX for the sub clds attached to the config log.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-9216
Reviewed-on: https://review.whamcloud.com/26022
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
 drivers/staging/lustre/lustre/include/obd_class.h  | 12 +--
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |  6 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++++
 4 files changed, 103 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
index c9b32ef..bd3b45a 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
@@ -2111,11 +2111,19 @@ struct mgs_nidtbl_entry {
 	} u;
 };
 
+enum {
+	CONFIG_T_CONFIG  = 0,
+	CONFIG_T_SPTLRPC = 1,
+	CONFIG_T_RECOVER = 2,
+	CONFIG_T_PARAMS  = 3,
+	CONFIG_T_MAX
+};
+
 struct mgs_config_body {
 	char		mcb_name[MTI_NAME_MAXLEN]; /* logname */
 	__u64		mcb_offset;    /* next index of config log to request */
 	__u16		mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
-	__u8		mcb_reserved;
+	__u8		mcb_nm_cur_pass;
 	__u8		mcb_bits;      /* bits unit size of config log */
 	__u32		mcb_units;     /* # of units for bulk transfer */
 };
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 184da99..647cc22 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -156,16 +156,16 @@ struct config_llog_instance {
 int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
 			    char *name, struct config_llog_instance *cfg);
 
-#define CONFIG_T_CONFIG		BIT(0)
-#define CONFIG_T_SPTLRPC	BIT(1)
-#define CONFIG_T_RECOVER	BIT(2)
-#define CONFIG_T_PARAMS		BIT(3)
+#define CONFIG_SUB_CONFIG	BIT(0)
+#define CONFIG_SUB_SPTLRPC	BIT(1)
+#define CONFIG_SUB_RECOVER	BIT(2)
+#define CONFIG_SUB_PARAMS	BIT(3)
 
 /* Sub clds should be attached to the config_llog_data when processing
  * config log for client or server target.
  */
-#define CONFIG_SUB_CLIENT	(CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
-				 CONFIG_T_PARAMS)
+#define CONFIG_SUB_CLIENT	(CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
+				 CONFIG_SUB_PARAMS)
 
 #define PARAMS_FILENAME	"params"
 #define LCTL_UPCALL	"lctl"
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 06fcc7e..833e6a0 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -315,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 	memcpy(seclogname, logname, ptr - logname);
 	strcpy(seclogname + (ptr - logname), "-sptlrpc");
 
-	if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
+	if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
 		sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
 						     CONFIG_T_SPTLRPC, cfg);
 		if (IS_ERR(sptlrpc_cld)) {
@@ -325,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 		}
 	}
 
-	if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
+	if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
 		params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
 						    CONFIG_T_PARAMS, cfg);
 		if (IS_ERR(params_cld)) {
@@ -345,7 +345,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
 
 	LASSERT(lsi->lsi_lmd);
 	if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
-	    cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
+	    cfg->cfg_sub_clds & CONFIG_SUB_RECOVER) {
 		ptr = strrchr(seclogname, '-');
 		if (ptr) {
 			*ptr = 0;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 2f081ed..09b1298 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -3629,6 +3629,91 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct mgs_target_info *)0)->mti_params) == 4096, "found %lld\n",
 		 (long long)(int)sizeof(((struct mgs_target_info *)0)->mti_params));
 
+	/* Checks for struct mgs_nidtbl_entry */
+	LASSERTF((int)sizeof(struct mgs_nidtbl_entry) == 24, "found %lld\n",
+		 (long long)(int)sizeof(struct mgs_nidtbl_entry));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_version) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_version));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_instance) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_instance));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_index) == 12, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_index));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_length) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_length));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_type) == 20, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_type));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_type) == 21, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_type));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_size) == 22, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_size));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_count) == 23, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_count));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count));
+	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, u.nids[0]) == 24, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_nidtbl_entry, u.nids[0]));
+	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]));
+
+	/* Checks for struct mgs_config_body */
+	LASSERTF((int)sizeof(struct mgs_config_body) == 80, "found %lld\n",
+		 (long long)(int)sizeof(struct mgs_config_body));
+	LASSERTF((int)offsetof(struct mgs_config_body, mcb_name) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_body, mcb_name));
+	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_name) == 64, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_name));
+	LASSERTF((int)offsetof(struct mgs_config_body, mcb_offset) == 64, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_body, mcb_offset));
+	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_offset) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_offset));
+	LASSERTF((int)offsetof(struct mgs_config_body, mcb_type) == 72, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_body, mcb_type));
+	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_type) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_type));
+	LASSERTF((int)offsetof(struct mgs_config_body, mcb_nm_cur_pass) == 74, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_body, mcb_nm_cur_pass));
+	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass));
+	LASSERTF((int)offsetof(struct mgs_config_body, mcb_bits) == 75, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_body, mcb_bits));
+	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_bits) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_bits));
+	LASSERTF((int)offsetof(struct mgs_config_body, mcb_units) == 76, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_body, mcb_units));
+	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
+
+	BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
+	BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
+	BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
+	BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
+
+	/* Checks for struct mgs_config_res */
+	LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
+		 (long long)(int)sizeof(struct mgs_config_res));
+	LASSERTF((int)offsetof(struct mgs_config_res, mcr_offset) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_res, mcr_offset));
+	LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_offset) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_offset));
+	LASSERTF((int)offsetof(struct mgs_config_res, mcr_size) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_config_res, mcr_size));
+	LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_size) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_size));
+
 	/* Checks for struct lustre_capa */
 	LASSERTF((int)sizeof(struct lustre_capa) == 120, "found %lld\n",
 		 (long long)(int)sizeof(struct lustre_capa));
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 58+ messages in thread
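
Note on patch 25: after this change CONFIG_T_* are consecutive integers
sent on the wire, while CONFIG_SUB_* are in-memory bit flags selecting
which sub clds to attach. The sketch below restates both sets with the
values from the diff and adds compile-time layout checks in the spirit of
wiretest.c. It is a userspace illustration only: the struct uses stdint
types instead of __u64 and friends, MTI_NAME_MAXLEN is assumed to be 64
(consistent with the 64-byte mcb_name check above), and config_sub_mask()
is a hypothetical helper that does not exist in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Wire values: plain consecutive integers, as defined in the patch. */
enum {
	CONFIG_T_CONFIG  = 0,
	CONFIG_T_SPTLRPC = 1,
	CONFIG_T_RECOVER = 2,
	CONFIG_T_PARAMS  = 3,
	CONFIG_T_MAX
};

/* In-memory flags selecting which sub clds to attach. */
#define CONFIG_SUB_CONFIG	BIT(0)
#define CONFIG_SUB_SPTLRPC	BIT(1)
#define CONFIG_SUB_RECOVER	BIT(2)
#define CONFIG_SUB_PARAMS	BIT(3)
#define CONFIG_SUB_CLIENT	(CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
				 CONFIG_SUB_PARAMS)

#define MTI_NAME_MAXLEN 64	/* assumption, see the note above */

struct mgs_config_body {
	char		mcb_name[MTI_NAME_MAXLEN];
	uint64_t	mcb_offset;
	uint16_t	mcb_type;
	uint8_t		mcb_nm_cur_pass;
	uint8_t		mcb_bits;
	uint32_t	mcb_units;
};

/* Compile-time equivalents of a few of the LASSERTF() wire checks. */
static_assert(sizeof(struct mgs_config_body) == 80, "wire size changed");
static_assert(offsetof(struct mgs_config_body, mcb_offset) == 64, "mcb_offset");
static_assert(offsetof(struct mgs_config_body, mcb_type) == 72, "mcb_type");
static_assert(offsetof(struct mgs_config_body, mcb_units) == 76, "mcb_units");

/* Hypothetical helper: map a wire type to its in-memory flag. */
unsigned int config_sub_mask(int type)
{
	assert(type >= 0 && type < CONFIG_T_MAX);
	return BIT(type);
}
```

Keeping the two namespaces separate means a flag value can never be sent
by accident in mcb_type, which carries only the small integer type.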

* [lustre-devel] [PATCH 26/31] lustre: obd: remove unused data parameter from obd_notify()
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (24 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 27/31] lustre: llite: handle client racy case during create James Simmons
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: "John L. Hammond" <jhammond@whamcloud.com>

Remove the unused data parameter from obd_notify() and related
functions.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-8403
Reviewed-on: https://review.whamcloud.com/24428
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h        |  4 ++--
 drivers/staging/lustre/lustre/include/obd_class.h  | 25 ++++++++--------------
 drivers/staging/lustre/lustre/llite/lcommon_misc.c | 12 +++++------
 .../staging/lustre/lustre/llite/llite_internal.h   |  5 ++---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |  7 +++---
 drivers/staging/lustre/lustre/lov/lov_obd.c        | 13 +++++------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |  6 +++---
 drivers/staging/lustre/lustre/osc/osc_request.c    | 10 ++++-----
 8 files changed, 34 insertions(+), 48 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 10e3bb8..333c703 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -502,7 +502,7 @@ enum obd_notify_event {
  */
 struct obd_notify_upcall {
 	int (*onu_upcall)(struct obd_device *host, struct obd_device *watched,
-			  enum obd_notify_event ev, void *owner, void *data);
+			  enum obd_notify_event ev, void *owner);
 	/* Opaque datum supplied by upper layer listener */
 	void *onu_owner;
 };
@@ -861,7 +861,7 @@ struct obd_ops {
 			    enum obd_import_event);
 
 	int (*notify)(struct obd_device *obd, struct obd_device *watched,
-		      enum obd_notify_event ev, void *data);
+		      enum obd_notify_event ev);
 
 	int (*health_check)(const struct lu_env *env, struct obd_device *);
 	struct obd_uuid *(*get_uuid)(struct obd_export *exp);
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 647cc22..50d5ddb 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1074,8 +1074,7 @@ static inline void obd_import_event(struct obd_device *obd,
 
 static inline int obd_notify(struct obd_device *obd,
 			     struct obd_device *watched,
-			     enum obd_notify_event ev,
-			     void *data)
+			     enum obd_notify_event ev)
 {
 	int rc;
 
@@ -1094,35 +1093,29 @@ static inline int obd_notify(struct obd_device *obd,
 	}
 
 	OBD_COUNTER_INCREMENT(obd, notify);
-	rc = OBP(obd, notify)(obd, watched, ev, data);
+	rc = OBP(obd, notify)(obd, watched, ev);
 	return rc;
 }
 
 static inline int obd_notify_observer(struct obd_device *observer,
 				      struct obd_device *observed,
-				      enum obd_notify_event ev,
-				      void *data)
+				      enum obd_notify_event ev)
 {
-	int rc1;
-	int rc2;
-
 	struct obd_notify_upcall *onu;
+	int rc = 0;
+	int rc2 = 0;
 
 	if (observer->obd_observer)
-		rc1 = obd_notify(observer->obd_observer, observed, ev, data);
-	else
-		rc1 = 0;
+		rc = obd_notify(observer->obd_observer, observed, ev);
+
 	/*
 	 * Also, call non-obd listener, if any
 	 */
 	onu = &observer->obd_upcall;
 	if (onu->onu_upcall)
-		rc2 = onu->onu_upcall(observer, observed, ev,
-				      onu->onu_owner, NULL);
-	else
-		rc2 = 0;
+		rc2 = onu->onu_upcall(observer, observed, ev, onu->onu_owner);
 
-	return rc1 ? rc1 : rc2;
+	return rc ? rc : rc2;
 }
 
 static inline int obd_quotactl(struct obd_export *exp,
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_misc.c b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
index a246b95..80563a2 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
@@ -76,14 +76,12 @@ int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 }
 
 /**
- * This function is used as an upcall-callback hooked by liblustre and llite
- * clients into obd_notify() listeners chain to handle notifications about
- * change of import connect_flags. See llu_fsswop_mount() and
- * lustre_common_fill_super().
+ * This function is used as an upcall-callback hooked llite clients
+ * into obd_notify() listeners chain to handle notifications about
+ * change of import connect_flags. See lustre_common_fill_super().
  */
-int cl_ocd_update(struct obd_device *host,
-		  struct obd_device *watched,
-		  enum obd_notify_event ev, void *owner, void *data)
+int cl_ocd_update(struct obd_device *host, struct obd_device *watched,
+		  enum obd_notify_event ev, void *owner)
 {
 	struct lustre_client_ocd *lco;
 	struct client_obd	*cli;
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index f6c8daf..00fe706 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -669,9 +669,8 @@ static inline bool ll_sbi_has_fast_read(struct ll_sb_info *sbi)
 
 /* llite/lcommon_misc.c */
 int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp);
-int cl_ocd_update(struct obd_device *host,
-		  struct obd_device *watched,
-		  enum obd_notify_event ev, void *owner, void *data);
+int cl_ocd_update(struct obd_device *host, struct obd_device *watched,
+		  enum obd_notify_event ev, void *owner);
 int cl_get_grouplock(struct cl_object *obj, unsigned long gid, int nonblock,
 		     struct ll_grouplock *cg);
 void cl_put_grouplock(struct ll_grouplock *cg);
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index bbb1ddf..55db904 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -140,7 +140,7 @@ static struct obd_uuid *lmv_get_uuid(struct obd_export *exp)
 }
 
 static int lmv_notify(struct obd_device *obd, struct obd_device *watched,
-		      enum obd_notify_event ev, void *data)
+		      enum obd_notify_event ev)
 {
 	struct obd_connect_data *conn_data;
 	struct lmv_obd	  *lmv = &obd->u.lmv;
@@ -182,7 +182,7 @@ static int lmv_notify(struct obd_device *obd, struct obd_device *watched,
 	 * Pass the notification up the chain.
 	 */
 	if (obd->obd_observer)
-		rc = obd_notify(obd->obd_observer, watched, ev, data);
+		rc = obd_notify(obd->obd_observer, watched, ev);
 
 	return rc;
 }
@@ -330,8 +330,7 @@ static int lmv_connect_mdc(struct obd_device *obd, struct lmv_tgt_desc *tgt)
 		 * Tell the observer about the new target.
 		 */
 		rc = obd_notify(obd->obd_observer, mdc_exp->exp_obd,
-				OBD_NOTIFY_ACTIVE,
-				(void *)(tgt - lmv->tgts[0]));
+				OBD_NOTIFY_ACTIVE);
 		if (rc) {
 			obd_disconnect(mdc_exp);
 			return rc;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 85d3b29..07f6b1b 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -121,7 +121,7 @@ static void lov_putref(struct obd_device *obd)
 static int lov_set_osc_active(struct obd_device *obd, struct obd_uuid *uuid,
 			      enum obd_notify_event ev);
 static int lov_notify(struct obd_device *obd, struct obd_device *watched,
-		      enum obd_notify_event ev, void *data);
+		      enum obd_notify_event ev);
 
 int lov_connect_obd(struct obd_device *obd, __u32 index, int activate,
 		    struct obd_connect_data *data)
@@ -244,7 +244,7 @@ static int lov_connect(const struct lu_env *env,
 			continue;
 
 		rc = lov_notify(obd, lov->lov_tgts[i]->ltd_exp->exp_obd,
-				OBD_NOTIFY_CONNECT, (void *)&i);
+				OBD_NOTIFY_CONNECT);
 		if (rc) {
 			CERROR("%s error sending notify %d\n",
 			       obd->obd_name, rc);
@@ -423,7 +423,7 @@ static int lov_set_osc_active(struct obd_device *obd, struct obd_uuid *uuid,
 }
 
 static int lov_notify(struct obd_device *obd, struct obd_device *watched,
-		      enum obd_notify_event ev, void *data)
+		      enum obd_notify_event ev)
 {
 	int rc = 0;
 	struct lov_obd *lov = &obd->u.lov;
@@ -457,12 +457,10 @@ static int lov_notify(struct obd_device *obd, struct obd_device *watched,
 			       obd_uuid2str(uuid), rc);
 			goto out_notify_lock;
 		}
-		/* active event should be pass lov target index as data */
-		data = &rc;
 	}
 
 	/* Pass the notification up the chain. */
-	rc = obd_notify_observer(obd, watched, ev, data);
+	rc = obd_notify_observer(obd, watched, ev);
 
 out_notify_lock:
 	up_read(&lov->lov_notify_lock);
@@ -590,8 +588,7 @@ static int lov_add_target(struct obd_device *obd, struct obd_uuid *uuidp,
 	}
 
 	rc = lov_notify(obd, tgt->ltd_exp->exp_obd,
-			active ? OBD_NOTIFY_CONNECT : OBD_NOTIFY_INACTIVE,
-			(void *)&index);
+			active ? OBD_NOTIFY_CONNECT : OBD_NOTIFY_INACTIVE);
 
 out:
 	if (rc) {
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index bfa07d7..c2f0a54 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2520,7 +2520,7 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
 		if (cli->cl_seq)
 			seq_client_flush(cli->cl_seq);
 
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_INACTIVE, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_INACTIVE);
 		break;
 	}
 	case IMP_EVENT_INVALIDATE: {
@@ -2531,7 +2531,7 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
 		break;
 	}
 	case IMP_EVENT_ACTIVE:
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_ACTIVE, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_ACTIVE);
 		/* redo the kuc registration after reconnecting */
 		if (rc == 0)
 			/* re-register HSM agents */
@@ -2540,7 +2540,7 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
 						       (void *)imp);
 		break;
 	case IMP_EVENT_OCD:
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_OCD, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_OCD);
 		break;
 	case IMP_EVENT_DISCON:
 	case IMP_EVENT_DEACTIVATE:
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index b7f8e07..b2b55a7 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2678,7 +2678,7 @@ static int osc_import_event(struct obd_device *obd,
 		break;
 	}
 	case IMP_EVENT_INACTIVE: {
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_INACTIVE, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_INACTIVE);
 		break;
 	}
 	case IMP_EVENT_INVALIDATE: {
@@ -2704,7 +2704,7 @@ static int osc_import_event(struct obd_device *obd,
 		break;
 	}
 	case IMP_EVENT_ACTIVE: {
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_ACTIVE, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_ACTIVE);
 		break;
 	}
 	case IMP_EVENT_OCD: {
@@ -2717,15 +2717,15 @@ static int osc_import_event(struct obd_device *obd,
 		if (ocd->ocd_connect_flags & OBD_CONNECT_REQPORTAL)
 			imp->imp_client->cli_request_portal = OST_REQUEST_PORTAL;
 
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_OCD, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_OCD);
 		break;
 	}
 	case IMP_EVENT_DEACTIVATE: {
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_DEACTIVATE, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_DEACTIVATE);
 		break;
 	}
 	case IMP_EVENT_ACTIVATE: {
-		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_ACTIVATE, NULL);
+		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_ACTIVATE);
 		break;
 	}
 	default:
-- 
1.8.3.1

* [lustre-devel] [PATCH 27/31] lustre: llite: handle client racy case during create
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (25 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 26/31] lustre: obd: remove unused data parameter from obd_notify() James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 28/31] lustre: obdclass: improve missing operation message James Simmons
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Bruno Faccini <bruno.faccini@intel.com>

In rare situations on the client, concurrent access by FID can race
with create. As a result of the race, a d_alias can already be
present, which was not expected when the original code/LBUG was
written.

One identified scenario is concurrent access to the inode through
the .lustre/fid/<[FID]> method.

The final fix is to remove the inaccurate
LASSERT(hlist_empty(&inode->i_dentry)); in ll_create_node().

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-8907
Reviewed-on: https://review.whamcloud.com/25296
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd_support.h | 1 +
 drivers/staging/lustre/lustre/llite/namei.c         | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 80b9935..1832193 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -441,6 +441,7 @@
 #define OBD_FAIL_MAKE_LOVEA_HOLE		    0x1406
 #define OBD_FAIL_LLITE_LOST_LAYOUT		    0x1407
 #define OBD_FAIL_GETATTR_DELAY			    0x1409
+#define OBD_FAIL_LLITE_CREATE_NODE_PAUSE	    0x140c
 
 #define OBD_FAIL_FID_INDIR	0x1501
 #define OBD_FAIL_FID_INLMA	0x1502
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index e541f78..da5854e 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -802,7 +802,8 @@ static struct inode *ll_create_node(struct inode *dir, struct lookup_intent *it)
 		goto out;
 	}
 
-	LASSERT(hlist_empty(&inode->i_dentry));
+	/* Pause to allow for a race with concurrent access by fid */
+	OBD_FAIL_TIMEOUT(OBD_FAIL_LLITE_CREATE_NODE_PAUSE, cfs_fail_val);
 
 	/* We asked for a lock on the directory, but were granted a
 	 * lock on the inode.  Since we finally have an inode pointer,
-- 
1.8.3.1

* [lustre-devel] [PATCH 28/31] lustre: obdclass: improve missing operation message
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (26 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 27/31] lustre: llite: handle client racy case during create James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 29/31] lustre: llite: ignore layout for ll_writepages() James Simmons
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Andreas Dilger <adilger@whamcloud.com>

Some tests in the past reported missing OBD operations, so improve
the error message to include the device name to make it easier to
debug where this is happening.
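
As a rough userspace illustration of the idea (not the actual
OBD_CHECK_* macros), the sketch below builds a "missing operation"
message that leads with the device name and stringifies the operation
with #op. The names dev_sketch, ops_sketch, sketch_connect, sketch_ops
and MISSING_OP_MSG are hypothetical and exist only for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the OBD device and ops tables. */
struct ops_sketch {
	int (*connect)(void);
	int (*statfs)(void);
};

struct dev_sketch {
	const char *name;
	const struct ops_sketch *ops;
};

/*
 * Evaluates to 0 when the operation exists. Otherwise formats a message
 * that leads with the device name and evaluates to snprintf()'s result.
 */
#define MISSING_OP_MSG(buf, len, dev, op)				\
	(((dev)->ops && (dev)->ops->op) ? 0 :				\
	 snprintf((buf), (len), "%s: no obd_" #op " operation",	\
		  (dev)->name))

static int sketch_connect(void)
{
	return 0;
}

static const struct ops_sketch sketch_ops = {
	.connect = sketch_connect,	/* .statfs is deliberately left NULL */
};
```

With a device named "osc0" that uses sketch_ops, checking statfs
produces "osc0: no obd_statfs operation", so the log line identifies
the device without a lookup by minor number.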

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
WC-id: https://jira.whamcloud.com/browse/LU-1095
Reviewed-on: https://review.whamcloud.com/25586
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd_class.h | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 50d5ddb..fd9d99b 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -398,7 +398,7 @@ static inline int obd_check_dev_active(struct obd_device *obd)
 		return -EOPNOTSUPP;				\
 	}							\
 	if (!OBT((exp)->exp_obd) || !MDP((exp)->exp_obd, op)) {	\
-		CERROR("obd_" #op ": dev %s/%d no operation\n", \
+		CERROR("%s: obd_" #op ": dev %d no operation\n",\
 			(exp)->exp_obd->obd_name,		\
 			(exp)->exp_obd->obd_minor);		\
 		return -EOPNOTSUPP;				\
@@ -409,8 +409,8 @@ static inline int obd_check_dev_active(struct obd_device *obd)
 do {									\
 	if (!OBT(obd) || !OBP((obd), op)) {				\
 		if (err)						\
-			CERROR("obd_" #op ": dev %d no operation\n",	\
-				obd->obd_minor);			\
+			CERROR("%s: no obd_" #op " operation\n",	\
+				obd->obd_name);				\
 		return err;						\
 	}								\
 } while (0)
@@ -425,19 +425,15 @@ static inline int obd_check_dev_active(struct obd_device *obd)
 		CERROR("obd_" #op ": cleaned up obd\n");	\
 		return -EOPNOTSUPP;				\
 	}							\
-	if (!OBT((exp)->exp_obd) || !OBP((exp)->exp_obd, op)) {	\
-		CERROR("obd_" #op ": dev %d no operation\n",	\
-			(exp)->exp_obd->obd_minor);		\
-		return -EOPNOTSUPP;				\
-	}							\
+	OBD_CHECK_DT_OP((exp)->exp_obd, op, -EOPNOTSUPP);	\
 } while (0)
 
 #define CTXT_CHECK_OP(ctxt, op, err)					\
 do {									\
 	if (!OBT(ctxt->loc_obd) || !CTXTP((ctxt), op)) {		\
 		if (err)						\
-			CERROR("lop_" #op ": dev %d no operation\n",	\
-				ctxt->loc_obd->obd_minor);		\
+			CERROR("%s: no lop_" #op " operation\n",	\
+				ctxt->loc_obd->obd_name);		\
 		return err;						\
 	}								\
 } while (0)
-- 
1.8.3.1

* [lustre-devel] [PATCH 29/31] lustre: llite: ignore layout for ll_writepages()
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (27 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 28/31] lustre: obdclass: improve missing operation message James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31  2:26 ` [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush James Simmons
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Jinshan Xiong <jinshan.xiong@gmail.com>

ll_writepages() can be called inside the direct IO context. If the
layout changes during this time, layout_conf() has to wait for active
IO to complete before applying the layout change, which results in a
deadlock.

ll_writepages() should ignore the layout to avoid this problem. This
is safe because, as long as pages exist, the layout won't be changed
on this client.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
WC-id: https://jira.whamcloud.com/browse/LU-9129
Reviewed-on: https://review.whamcloud.com/25474
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/rw.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/rw.c b/drivers/staging/lustre/lustre/llite/rw.c
index 59747da..49ac723 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++ b/drivers/staging/lustre/lustre/llite/rw.c
@@ -1000,13 +1000,11 @@ int ll_writepage(struct page *vmpage, struct writeback_control *wbc)
 int ll_writepages(struct address_space *mapping, struct writeback_control *wbc)
 {
 	struct inode *inode = mapping->host;
-	struct ll_sb_info *sbi = ll_i2sbi(inode);
 	loff_t start;
 	loff_t end;
 	enum cl_fsync_mode mode;
 	int range_whole = 0;
 	int result;
-	int ignore_layout = 0;
 
 	if (wbc->range_cyclic) {
 		start = mapping->writeback_index << PAGE_SHIFT;
@@ -1024,17 +1022,14 @@ int ll_writepages(struct address_space *mapping, struct writeback_control *wbc)
 	if (wbc->sync_mode == WB_SYNC_ALL)
 		mode = CL_FSYNC_LOCAL;
 
-	if (sbi->ll_umounting)
-		/* if the mountpoint is being umounted, all pages have to be
-		 * evicted to avoid hitting LBUG when truncate_inode_pages()
-		 * is called later on.
-		 */
-		ignore_layout = 1;
-
 	if (!ll_i2info(inode)->lli_clob)
 		return 0;
 
-	result = cl_sync_file_range(inode, start, end, mode, ignore_layout);
+	/* for directio, it would call writepages() to evict cached pages
+	 * inside the IO context of write, which will cause deadlock at
+	 * layout_conf since it waits for active IOs to complete.
+	 */
+	result = cl_sync_file_range(inode, start, end, mode, 1);
 	if (result > 0) {
 		wbc->nr_to_write -= result;
 		result = 0;
-- 
1.8.3.1

* [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (28 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 29/31] lustre: llite: ignore layout for ll_writepages() James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-07-31 22:55   ` NeilBrown
  2018-07-31  2:26 ` [lustre-devel] [PATCH 31/31] lustre: docs: update TODO file James Simmons
  2018-08-01  3:41 ` [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 NeilBrown
  31 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

From: Fan Yong <fan.yong@intel.com>

When a client mount fails or the client is unmounted,
client_fid_fini() is called. At that time, an async connection
failure can trigger seq_client_flush(), whose parameter may already
have been freed by client_fid_fini() in a race.

Introduce client_obd::cl_seq_rwsem to protect client_obd::cl_seq.

Signed-off-by: Fan Yong <fan.yong@intel.com>
WC-id: https://jira.whamcloud.com/browse/LU-9224
Reviewed-on: https://review.whamcloud.com/26079
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/fid/fid_request.c | 21 +++++++++++++++------
 drivers/staging/lustre/lustre/include/obd.h     |  1 +
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c   |  2 ++
 drivers/staging/lustre/lustre/mdc/mdc_request.c | 11 +++++++++--
 4 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/fid/fid_request.c b/drivers/staging/lustre/lustre/fid/fid_request.c
index a34fd90..f91242c 100644
--- a/drivers/staging/lustre/lustre/fid/fid_request.c
+++ b/drivers/staging/lustre/lustre/fid/fid_request.c
@@ -343,11 +343,14 @@ int client_fid_init(struct obd_device *obd,
 {
 	struct client_obd *cli = &obd->u.cli;
 	char *prefix;
-	int rc;
+	int rc = 0;
 
+	down_write(&cli->cl_seq_rwsem);
 	cli->cl_seq = kzalloc(sizeof(*cli->cl_seq), GFP_NOFS);
-	if (!cli->cl_seq)
-		return -ENOMEM;
+	if (!cli->cl_seq) {
+		rc = -ENOMEM;
+		goto out_free_lock;
+	}
 
 	prefix = kzalloc(MAX_OBD_NAME + 5, GFP_NOFS);
 	if (!prefix) {
@@ -361,10 +364,14 @@ int client_fid_init(struct obd_device *obd,
 	seq_client_init(cli->cl_seq, exp, type, prefix);
 	kfree(prefix);
 
-	return 0;
 out_free_seq:
-	kfree(cli->cl_seq);
-	cli->cl_seq = NULL;
+	if (rc) {
+		kfree(cli->cl_seq);
+		cli->cl_seq = NULL;
+	}
+out_free_lock:
+	up_write(&cli->cl_seq_rwsem);
+
 	return rc;
 }
 EXPORT_SYMBOL(client_fid_init);
@@ -373,11 +380,13 @@ int client_fid_fini(struct obd_device *obd)
 {
 	struct client_obd *cli = &obd->u.cli;
 
+	down_write(&cli->cl_seq_rwsem);
 	if (cli->cl_seq) {
 		seq_client_fini(cli->cl_seq);
 		kfree(cli->cl_seq);
 		cli->cl_seq = NULL;
 	}
+	up_write(&cli->cl_seq_rwsem);
 
 	return 0;
 }
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 333c703..3c0dbb6 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -333,6 +333,7 @@ struct client_obd {
 
 	/* sequence manager */
 	struct lu_client_seq    *cl_seq;
+	struct rw_semaphore	 cl_seq_rwsem;
 
 	atomic_t	     cl_resends; /* resend count */
 
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index c36d1e4..32eda4f 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -308,6 +308,8 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	}
 
 	init_rwsem(&cli->cl_sem);
+	cli->cl_seq = NULL;
+	init_rwsem(&cli->cl_seq_rwsem);
 	cli->cl_conn_count = 0;
 	memcpy(server_uuid.uuid, lustre_cfg_buf(lcfg, 2),
 	       min_t(unsigned int, LUSTRE_CFG_BUFLEN(lcfg, 2),
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index c2f0a54..a759da2 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2517,8 +2517,10 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
 		 * Flush current sequence to make client obtain new one
 		 * from server in case of disconnect/reconnect.
 		 */
+		down_read(&cli->cl_seq_rwsem);
 		if (cli->cl_seq)
 			seq_client_flush(cli->cl_seq);
+		up_read(&cli->cl_seq_rwsem);
 
 		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_INACTIVE);
 		break;
@@ -2557,9 +2559,14 @@ int mdc_fid_alloc(const struct lu_env *env, struct obd_export *exp,
 		  struct lu_fid *fid, struct md_op_data *op_data)
 {
 	struct client_obd *cli = &exp->exp_obd->u.cli;
-	struct lu_client_seq *seq = cli->cl_seq;
+	int rc = -EIO;
 
-	return seq_client_alloc_fid(env, seq, fid);
+	down_read(&cli->cl_seq_rwsem);
+	if (cli->cl_seq)
+		rc = seq_client_alloc_fid(env, cli->cl_seq, fid);
+	up_read(&cli->cl_seq_rwsem);
+
+	return rc;
 }
 
 static struct obd_uuid *mdc_get_uuid(struct obd_export *exp)
-- 
1.8.3.1

* [lustre-devel] [PATCH 31/31] lustre: docs: update TODO file
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (29 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush James Simmons
@ 2018-07-31  2:26 ` James Simmons
  2018-08-01  3:41 ` [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 NeilBrown
  31 siblings, 0 replies; 58+ messages in thread
From: James Simmons @ 2018-07-31  2:26 UTC (permalink / raw)
  To: lustre-devel

With several bugs fixed we can remove them from the TODO file.
Also update several email addresses.

Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/TODO | 51 ++++-----------------------------------------
 1 file changed, 4 insertions(+), 47 deletions(-)

diff --git a/drivers/staging/lustre/TODO b/drivers/staging/lustre/TODO
index 5332cdb..942280b 100644
--- a/drivers/staging/lustre/TODO
+++ b/drivers/staging/lustre/TODO
@@ -17,10 +17,6 @@ addressed:
 *
 ******************************************************************************
 
-https://jira.hpdd.intel.com/browse/LU-100086
-
-LNET_MINOR conflicts with USERIO_MINOR
-
 ------------------------------------------------------------------------------
 
 https://jira.hpdd.intel.com/browse/LU-8130
@@ -29,14 +25,6 @@ Fix and simplify libcfs hash handling
 
 ------------------------------------------------------------------------------
 
-https://jira.hpdd.intel.com/browse/LU-8703
-
-The current way we handle SMP is wrong. Platforms like ARM and KNL can have
-core and NUMA setups with things like NUMA nodes with no cores. We need to
-handle such cases. This work also greatly simplified the lustre SMP code.
-
-------------------------------------------------------------------------------
-
 https://jira.hpdd.intel.com/browse/LU-9019
 
 Replace libcfs time API with standard kernel APIs. Also migrate away from
@@ -53,14 +41,9 @@ Poor performance for the ko2iblnd driver. This is related to many of the
 patches below that are missing from the linux client.
 ------------------------------------------------------------------------------
 
-https://jira.hpdd.intel.com/browse/LU-9886
-
-Crash in upstream kiblnd_handle_early_rxs()
-------------------------------------------------------------------------------
-
 https://jira.hpdd.intel.com/browse/LU-10394 / LU-10526 / LU-10089
 
-Default to default to using MEM_REG
+Default to using MEM_REG
 ------------------------------------------------------------------------------
 
 https://jira.hpdd.intel.com/browse/LU-10459
@@ -98,11 +81,6 @@ https://jira.hpdd.intel.com/browse/LU-10129
 query device capabilities
 ------------------------------------------------------------------------------
 
-https://jira.hpdd.intel.com/browse/LU-10015
-
-fix race at kiblnd_connect_peer
-------------------------------------------------------------------------------
-
 https://jira.hpdd.intel.com/browse/LU-9983
 
 allow for discontiguous fragments
@@ -123,21 +101,11 @@ https://jira.hpdd.intel.com/browse/LU-9507
 Don't Assert On Reconnect with MultiQP
 ------------------------------------------------------------------------------
 
-https://jira.hpdd.intel.com/browse/LU-9472
-
-Fix FastReg map/unmap for MLX5
-------------------------------------------------------------------------------
-
 https://jira.hpdd.intel.com/browse/LU-9425
 
 Turn on 2 sges by default
 ------------------------------------------------------------------------------
 
-https://jira.hpdd.intel.com/browse/LU-8943
-
-Enable Multiple OPA Endpoints between Nodes
-------------------------------------------------------------------------------
-
 https://jira.hpdd.intel.com/browse/LU-5718
 
 multiple sges for work request
@@ -286,17 +254,6 @@ https://jira.hpdd.intel.com/browse/LU-9862
 Patch that landed for LU-7890 leads to static checker errors
 ------------------------------------------------------------------------------
 
-https://jira.hpdd.intel.com/browse/LU-9868
-
-dcache/namei fixes for lustre
-------------------------------------------------------------------------------
-
-https://jira.hpdd.intel.com/browse/LU-10467
-
-use standard linux wait_events macros work by Neil Brown
-
-------------------------------------------------------------------------------
-
-Please send any patches to Greg Kroah-Hartman <greg@kroah.com>, Andreas Dilger
-<andreas.dilger@intel.com>, James Simmons <jsimmons@infradead.org> and
-Oleg Drokin <oleg.drokin@intel.com>.
+Please send any patches to NeilBrown <neilb@suse.com>, Andreas Dilger
+<adilger@whamcloud.com>, James Simmons <jsimmons@infradead.org> and
+Oleg Drokin <green@whamcloud.com>.
-- 
1.8.3.1

* [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window
  2018-07-31  2:25 ` [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window James Simmons
@ 2018-07-31  4:05   ` Patrick Farrell
  2018-08-02  3:52     ` James Simmons
  0 siblings, 1 reply; 58+ messages in thread
From: Patrick Farrell @ 2018-07-31  4:05 UTC (permalink / raw)
  To: lustre-devel

I'm puzzled, James - Why is "cache_jobid" in there?  Isn't that from Ben Evans' work?  This patch landed before all of that...

________________________________
From: James Simmons <jsimmons@infradead.org>
Sent: Monday, July 30, 2018 9:25:58 PM
To: Andreas Dilger; Oleg Drokin; NeilBrown
Cc: Lustre Development List; Patrick Farrell; James Simmons
Subject: [PATCH 06/31] lustre: llite: reduce jobstats race window

From: Patrick Farrell <paf@cray.com>

In the current code, lli_jobid is set to zero on every call
to lustre_get_jobid.  This causes problems, because it's
used asynchronously to set the job id in RPCs, and some
RPCs will falsely get no jobid set.  (For small IO sizes,
this can be up to 60% of RPCs.)

It would be very expensive to put hard synchronization
between this and every outbound RPC, and it's OK to very
rarely get an RPC without correct job stats info.

This patch only updates the lli_jobid when the job id has
changed, which leaves only a very small window for reading
an inconsistent job id.
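
The "only write when changed" idea can be sketched in plain userspace
C. This is a simplified illustration, not the kernel function:
JOBID_SIZE and update_jobid() are hypothetical names, and only the
"process name + uid" format is modelled. The new value is built in a
temporary buffer, and the shared buffer is rewritten only if the two
differ, so a concurrent reader almost never sees a zeroed or
half-written id.

```c
#include <stdio.h>
#include <string.h>

#define JOBID_SIZE 32	/* hypothetical stand-in for LUSTRE_JOBID_SIZE */

/*
 * Build the job id in a local buffer and copy it into the cached
 * buffer only when it differs.  Returns 1 if the cache was rewritten,
 * 0 if it was left untouched.
 */
static int update_jobid(char cached[JOBID_SIZE], const char *comm,
			unsigned int uid)
{
	char tmp[JOBID_SIZE] = { 0 };

	snprintf(tmp, sizeof(tmp), "%s.%u", comm, uid);
	if (strcmp(cached, tmp) == 0)
		return 0;

	strcpy(cached, tmp);	/* tmp is NUL-terminated and fits */
	return 1;
}
```

Calling update_jobid() twice with the same process name and uid
rewrites the cache only on the first call; the second call leaves the
buffer untouched.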

Signed-off-by: Patrick Farrell <paf@cray.com>
WC-id: https://jira.whamcloud.com/browse/LU-8926
Reviewed-on: https://review.whamcloud.com/24253
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  1 +
 drivers/staging/lustre/lustre/obdclass/class_obd.c | 20 ++++++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index c0861b9..72b118a 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -894,6 +894,7 @@ void ll_lli_init(struct ll_inode_info *lli)
                 lli->lli_async_rc = 0;
         }
         mutex_init(&lli->lli_layout_mutex);
+       memset(lli->lli_jobid, 0, LUSTRE_JOBID_SIZE);
 }

 int ll_fill_super(struct super_block *sb)
diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c
index cdaf729..87327ef 100644
--- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
+++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
@@ -95,26 +95,34 @@
  */
 int lustre_get_jobid(char *jobid)
 {
-       memset(jobid, 0, LUSTRE_JOBID_SIZE);
+       char tmp_jobid[LUSTRE_JOBID_SIZE] = { 0 };
+
         /* Jobstats isn't enabled */
         if (strcmp(obd_jobid_var, JOBSTATS_DISABLE) == 0)
-               return 0;
+               goto out_cache_jobid;

         /* Use process name + fsuid as jobid */
         if (strcmp(obd_jobid_var, JOBSTATS_PROCNAME_UID) == 0) {
-               snprintf(jobid, LUSTRE_JOBID_SIZE, "%s.%u",
+               snprintf(tmp_jobid, LUSTRE_JOBID_SIZE, "%s.%u",
                          current->comm,
                          from_kuid(&init_user_ns, current_fsuid()));
-               return 0;
+               goto out_cache_jobid;
         }

         /* Whole node dedicated to single job */
         if (strcmp(obd_jobid_var, JOBSTATS_NODELOCAL) == 0) {
-               strcpy(jobid, obd_jobid_node);
-               return 0;
+               strcpy(tmp_jobid, obd_jobid_node);
+               goto out_cache_jobid;
         }

         return -ENOENT;
+
+out_cache_jobid:
+       /* Only replace the job ID if it changed. */
+       if (strcmp(jobid, tmp_jobid) != 0)
+               strcpy(jobid, tmp_jobid);
+
+       return 0;
 }
 EXPORT_SYMBOL(lustre_get_jobid);

--
1.8.3.1

* [lustre-devel] [PATCH 08/31] lustre: llite: don't zero timestamps internally
  2018-07-31  2:26 ` [lustre-devel] [PATCH 08/31] lustre: llite: don't zero timestamps internally James Simmons
@ 2018-07-31 22:31   ` NeilBrown
  0 siblings, 0 replies; 58+ messages in thread
From: NeilBrown @ 2018-07-31 22:31 UTC (permalink / raw)
  To: lustre-devel

On Mon, Jul 30 2018, James Simmons wrote:
> --- a/drivers/staging/lustre/lustre/llite/llite_internal.h
> +++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
> @@ -138,6 +138,11 @@ struct ll_inode_info {
>  	s64				lli_ctime;
>  	spinlock_t			lli_agl_lock;
>  
> +	/* update atime from MDS no matter if it's older than
> +	 * local inode atime.
> +	 */
> +	unsigned int			lli_update_atime:1;
> +

Why not make this another flag bit in lli_flags ??

I might add a patch to do that.

NeilBrown

* [lustre-devel] [PATCH 07/31] lustre: lnet: change positional struct initializers to C99
  2018-07-31  2:25 ` [lustre-devel] [PATCH 07/31] lustre: lnet: change positional struct initializers to C99 James Simmons
@ 2018-07-31 22:32   ` NeilBrown
  0 siblings, 0 replies; 58+ messages in thread
From: NeilBrown @ 2018-07-31 22:32 UTC (permalink / raw)
  To: lustre-devel

On Mon, Jul 30 2018, James Simmons wrote:
> --- a/drivers/staging/lustre/lnet/lnet/lo.c
> +++ b/drivers/staging/lustre/lnet/lnet/lo.c
> @@ -91,15 +91,13 @@
>  }
>  
>  struct lnet_lnd the_lolnd = {
> -	/* .lnd_list       = */ {&the_lolnd.lnd_list, &the_lolnd.lnd_list},
> -	/* .lnd_refcount   = */ 0,
> -	/* .lnd_type       = */ LOLND,
> -	/* .lnd_startup    = */ lolnd_startup,
> -	/* .lnd_shutdown   = */ lolnd_shutdown,
> -	/* .lnt_ctl        = */ NULL,
> -	/* .lnd_send       = */ lolnd_send,
> -	/* .lnd_recv       = */ lolnd_recv,
> -	/* .lnd_eager_recv = */ NULL,
> -	/* .lnd_notify     = */ NULL,
> -	/* .lnd_accept     = */ NULL
> +	.lnd_list	= {
> +				.next	= &the_lolnd.lnd_list,
> +				.prev	= &the_lolnd.lnd_list
> +			},

That would be better as
        .lnd_list       = LIST_HEAD_INIT(the_lolnd.lnd_list),

I'll queue a patch to make that change.
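
For reference, a small userspace sketch of why the two spellings are
equivalent. The list_head type and LIST_HEAD_INIT macro are re-declared
here in simplified form so the example compiles outside the kernel;
lnd_sketch and list_is_self_linked() are hypothetical names used only
for illustration.

```c
/* Simplified userspace copies of the kernel definitions. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Hypothetical structure embedding a list head, like struct lnet_lnd. */
struct lnd_sketch {
	struct list_head lnd_list;
	int lnd_type;
};

/* Open-coded form: both pointers are spelled out by hand. */
static struct lnd_sketch explicit_lnd = {
	.lnd_list	= {
		.next	= &explicit_lnd.lnd_list,
		.prev	= &explicit_lnd.lnd_list
	},
	.lnd_type	= 1,
};

/* Macro form: same result, with the intent stated once. */
static struct lnd_sketch macro_lnd = {
	.lnd_list	= LIST_HEAD_INIT(macro_lnd.lnd_list),
	.lnd_type	= 1,
};

/* An empty list head points at itself in both directions. */
static int list_is_self_linked(const struct list_head *h)
{
	return h->next == h && h->prev == h;
}
```

Both initializers yield a list head that points at itself, which is
the empty-list state.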

NeilBrown

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-07-31  2:26 ` [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print James Simmons
@ 2018-07-31 22:38   ` NeilBrown
  2018-07-31 23:54     ` Amir Shehata
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-07-31 22:38 UTC (permalink / raw)
  To: lustre-devel

On Mon, Jul 30 2018, James Simmons wrote:

> From: Amir Shehata <ashehata@whamcloud.com>
>
> The default number of hops for a route is -1. This is
> currently being printed as %u. Change that to %d to
> make it print out properly.

-1 hops???  I wish I could hop -1 times - it would be a good party
trick!!

What does -1 mean?  Unlimited (just a guess).  If so, could we print
"unlimited"?

I'm fine with having magic numbers in the code, but I don't like them to
leak out.

NeilBrown
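
A small userspace sketch of the two points in this exchange: a -1 rendered through %u shows up as a large unsigned value, and the magic number can be replaced by a label before it reaches the user. The helper names are hypothetical, and the label "default" is an assumption based only on the commit message's wording; the thread does not establish what -1 actually means.

```c
#include <stdio.h>

/* Old behaviour: the value is converted to unsigned before printing. */
static inline int sketch_format_hops_unsigned(char *buf, size_t len, int hops)
{
	/* The explicit cast turns -1 into UINT_MAX. */
	return snprintf(buf, len, "%u", (unsigned int)hops);
}

/*
 * Reviewer's suggestion: print a word instead of the magic number.
 * The word chosen here is an assumption, not taken from the Lustre source.
 */
static inline int sketch_format_hops(char *buf, size_t len, int hops)
{
	if (hops == -1)
		return snprintf(buf, len, "%s", "default");
	return snprintf(buf, len, "%d", hops);
}
```

Printing a label would change the output format of the routes file, so any tooling that parses that column would need checking first.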

>
> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
> WC-id: https://jira.whamcloud.com/browse/LU-9078
> Reviewed-on: https://review.whamcloud.com/25250
> Reviewed-by: Olaf Weber <olaf@sgi.com>
> Reviewed-by: Doug Oucharek <dougso@me.com>
> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
> index 8856798..aa98ce5 100644
> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>  			int alive = lnet_is_route_alive(route);
>  
>  			s += snprintf(s, tmpstr + tmpsiz - s,
> -				      "%-8s %4u %8u %7s %s\n",
> +				      "%-8s %4d %8u %7s %s\n",
>  				      libcfs_net2str(net), hops,
>  				      priority,
>  				      alive ? "up" : "down",
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 13/31] lustre: config: don't attach sub logs for LWP
  2018-07-31  2:26 ` [lustre-devel] [PATCH 13/31] lustre: config: don't attach sub logs for LWP James Simmons
@ 2018-07-31 22:41   ` NeilBrown
  0 siblings, 0 replies; 58+ messages in thread
From: NeilBrown @ 2018-07-31 22:41 UTC (permalink / raw)
  To: lustre-devel

On Mon, Jul 30 2018, James Simmons wrote:

> From: Niu Yawei <yawei.niu@intel.com>
>
> When a Lustre target processes the client log to retrieve MDT NIDs and
> start LWPs, it follows the same mgc_process_config() code path used for
> the target config log, so sub clds for security, nodemap, param and
> recovery are attached unnecessarily.
>
> The mgc subsystem is used by both server and client. This change
> allows us to cleanly handle the future case when the mgc layer
> would be built with server code. This way server specific config
> logs will only be processed when a server mount occurs.
>
> Signed-off-by: Niu Yawei <yawei.niu@intel.com>
> WC-id: https://jira.whamcloud.com/browse/LU-9081
> Reviewed-on: https://review.whamcloud.com/25293
> Reviewed-by: Fan Yong <fan.yong@intel.com>
> Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/include/obd_class.h | 19 +++++++-----
>  drivers/staging/lustre/lustre/llite/llite_lib.c   |  1 +
>  drivers/staging/lustre/lustre/mgc/mgc_request.c   | 37 +++++++++++++----------
>  3 files changed, 34 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
> index adfe2ab..e772e3d 100644
> --- a/drivers/staging/lustre/lustre/include/obd_class.h
> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
> @@ -153,17 +153,22 @@ struct config_llog_instance {
>  	llog_cb_t	    cfg_callback;
>  	int		    cfg_last_idx; /* for partial llog processing */
>  	int		    cfg_flags;
> +	u32		    cfg_sub_clds;
>  };
>  
>  int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
>  			    char *name, struct config_llog_instance *cfg);
> -enum {
> -	CONFIG_T_CONFIG  = 0,
> -	CONFIG_T_SPTLRPC = 1,
> -	CONFIG_T_RECOVER = 2,
> -	CONFIG_T_PARAMS  = 3,
> -	CONFIG_T_MAX     = 4
> -};
> +
> +#define CONFIG_T_CONFIG		BIT(0)
> +#define CONFIG_T_SPTLRPC	BIT(1)
> +#define CONFIG_T_RECOVER	BIT(2)
> +#define CONFIG_T_PARAMS		BIT(3)

This could still be an enum:

enum {
	CONFIG_T_CONFIG  = BIT(0),
	CONFIG_T_SPTLRPC = BIT(1),
	CONFIG_T_RECOVER = BIT(2),
	CONFIG_T_PARAMS  = BIT(3),
};

and I'm glad "CONFIG_T_MAX" is gone - as the MAX was clearly '3', not '4'.

NeilBrown
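
A userspace sketch of the enum form suggested above, together with a mask test like the ones in mgc_request.c. SKETCH_BIT is a local stand-in for the kernel's BIT() macro, and SKETCH_SUB_CLIENT and the helper function are hypothetical names introduced for this example.

```c
#define SKETCH_BIT(n)	(1U << (n))	/* local stand-in for BIT() */

enum {
	CONFIG_T_CONFIG  = SKETCH_BIT(0),
	CONFIG_T_SPTLRPC = SKETCH_BIT(1),
	CONFIG_T_RECOVER = SKETCH_BIT(2),
	CONFIG_T_PARAMS  = SKETCH_BIT(3),
};

/* Mask of the sub clds a client would attach, mirroring the patch's idea. */
#define SKETCH_SUB_CLIENT	(CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
				 CONFIG_T_PARAMS)

/* Non-zero when the requested sub cld bit is present in the mask. */
static inline int sketch_wants_sub_cld(unsigned int cfg_sub_clds,
				       unsigned int which)
{
	return (cfg_sub_clds & which) != 0;
}
```

Compared with a run of #defines, the enum groups the values under one declaration and makes them visible to a debugger, while the mask arithmetic stays the same.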

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
  2018-07-31  2:26 ` [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h James Simmons
@ 2018-07-31 22:47   ` NeilBrown
  2018-07-31 23:04     ` Patrick Farrell
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-07-31 22:47 UTC (permalink / raw)
  To: lustre-devel

On Mon, Jul 30 2018, James Simmons wrote:

> From: Niu Yawei <yawei.niu@intel.com>
>
> Move config type values CONFIG_T_XXX into lustre_idl.h since they
> will be put on wire when reading config logs.
>
> Add missing wire checks for mgs_nidtbl_entry, mgs_config_body and
> mgs_config_res.
>
> Redefine CONFIG_SUB_XXX for the sub clds attached on config log.
>
> Signed-off-by: Niu Yawei <yawei.niu@intel.com>
> WC-id: https://jira.whamcloud.com/browse/LU-9216
> Reviewed-on: https://review.whamcloud.com/26022
> Reviewed-by: Fan Yong <fan.yong@intel.com>
> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
>  drivers/staging/lustre/lustre/include/obd_class.h  | 12 +--
>  drivers/staging/lustre/lustre/mgc/mgc_request.c    |  6 +-
>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++++
>  4 files changed, 103 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
> index c9b32ef..bd3b45a 100644
> --- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
> +++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
> @@ -2111,11 +2111,19 @@ struct mgs_nidtbl_entry {
>  	} u;
>  };
>  
> +enum {
> +	CONFIG_T_CONFIG  = 0,
> +	CONFIG_T_SPTLRPC = 1,
> +	CONFIG_T_RECOVER = 2,
> +	CONFIG_T_PARAMS  = 3,
> +	CONFIG_T_MAX

Arrrgggh.  It's back.  I thought we had killed CONFIG_T_MAX (which isn't
a MAX).
It's never used, so it'll have to go.

NeilBrown


> +};
> +
>  struct mgs_config_body {
>  	char		mcb_name[MTI_NAME_MAXLEN]; /* logname */
>  	__u64		mcb_offset;    /* next index of config log to request */
>  	__u16		mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
> -	__u8		mcb_reserved;
> +	__u8		mcb_nm_cur_pass;
>  	__u8		mcb_bits;      /* bits unit size of config log */
>  	__u32		mcb_units;     /* # of units for bulk transfer */
>  };
> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
> index 184da99..647cc22 100644
> --- a/drivers/staging/lustre/lustre/include/obd_class.h
> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
> @@ -156,16 +156,16 @@ struct config_llog_instance {
>  int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
>  			    char *name, struct config_llog_instance *cfg);
>  
> -#define CONFIG_T_CONFIG		BIT(0)
> -#define CONFIG_T_SPTLRPC	BIT(1)
> -#define CONFIG_T_RECOVER	BIT(2)
> -#define CONFIG_T_PARAMS		BIT(3)
> +#define CONFIG_SUB_CONFIG	BIT(0)
> +#define CONFIG_SUB_SPTLRPC	BIT(1)
> +#define CONFIG_SUB_RECOVER	BIT(2)
> +#define CONFIG_SUB_PARAMS	BIT(3)
>  
>  /* Sub clds should be attached to the config_llog_data when processing
>   * config log for client or server target.
>   */
> -#define CONFIG_SUB_CLIENT	(CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
> -				 CONFIG_T_PARAMS)
> +#define CONFIG_SUB_CLIENT	(CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
> +				 CONFIG_SUB_PARAMS)
>  
>  #define PARAMS_FILENAME	"params"
>  #define LCTL_UPCALL	"lctl"
> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> index 06fcc7e..833e6a0 100644
> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> @@ -315,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>  	memcpy(seclogname, logname, ptr - logname);
>  	strcpy(seclogname + (ptr - logname), "-sptlrpc");
>  
> -	if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
> +	if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
>  		sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
>  						     CONFIG_T_SPTLRPC, cfg);
>  		if (IS_ERR(sptlrpc_cld)) {
> @@ -325,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>  		}
>  	}
>  
> -	if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
> +	if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
>  		params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
>  						    CONFIG_T_PARAMS, cfg);
>  		if (IS_ERR(params_cld)) {
> @@ -345,7 +345,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>  
>  	LASSERT(lsi->lsi_lmd);
>  	if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
> -	    cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
> +	    cfg->cfg_sub_clds & CONFIG_SUB_RECOVER) {
>  		ptr = strrchr(seclogname, '-');
>  		if (ptr) {
>  			*ptr = 0;
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
> index 2f081ed..09b1298 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
> @@ -3629,6 +3629,91 @@ void lustre_assert_wire_constants(void)
>  	LASSERTF((int)sizeof(((struct mgs_target_info *)0)->mti_params) == 4096, "found %lld\n",
>  		 (long long)(int)sizeof(((struct mgs_target_info *)0)->mti_params));
>  
> +	/* Checks for struct mgs_nidtbl_entry */
> +	LASSERTF((int)sizeof(struct mgs_nidtbl_entry) == 24, "found %lld\n",
> +		 (long long)(int)sizeof(struct mgs_nidtbl_entry));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_version) == 0, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_version));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version) == 8, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_instance) == 8, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_instance));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance) == 4, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_index) == 12, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_index));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index) == 4, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_length) == 16, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_length));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length) == 4, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_type) == 20, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_type));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type) == 1, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_type) == 21, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_type));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type) == 1, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_size) == 22, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_size));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size) == 1, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_count) == 23, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_count));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count) == 1, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count));
> +	LASSERTF((int)offsetof(struct mgs_nidtbl_entry, u.nids[0]) == 24, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_nidtbl_entry, u.nids[0]));
> +	LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]) == 8, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]));
> +
> +	/* Checks for struct mgs_config_body */
> +	LASSERTF((int)sizeof(struct mgs_config_body) == 80, "found %lld\n",
> +		 (long long)(int)sizeof(struct mgs_config_body));
> +	LASSERTF((int)offsetof(struct mgs_config_body, mcb_name) == 0, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_body, mcb_name));
> +	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_name) == 64, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_name));
> +	LASSERTF((int)offsetof(struct mgs_config_body, mcb_offset) == 64, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_body, mcb_offset));
> +	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_offset) == 8, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_offset));
> +	LASSERTF((int)offsetof(struct mgs_config_body, mcb_type) == 72, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_body, mcb_type));
> +	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_type) == 2, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_type));
> +	LASSERTF((int)offsetof(struct mgs_config_body, mcb_nm_cur_pass) == 74, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_body, mcb_nm_cur_pass));
> +	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass) == 1, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass));
> +	LASSERTF((int)offsetof(struct mgs_config_body, mcb_bits) == 75, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_body, mcb_bits));
> +	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_bits) == 1, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_bits));
> +	LASSERTF((int)offsetof(struct mgs_config_body, mcb_units) == 76, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_body, mcb_units));
> +	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
> +
> +	BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
> +	BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
> +	BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
> +	BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
> +
> +	/* Checks for struct mgs_config_res */
> +	LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
> +		 (long long)(int)sizeof(struct mgs_config_res));
> +	LASSERTF((int)offsetof(struct mgs_config_res, mcr_offset) == 0, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_res, mcr_offset));
> +	LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_offset) == 8, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_offset));
> +	LASSERTF((int)offsetof(struct mgs_config_res, mcr_size) == 8, "found %lld\n",
> +		 (long long)(int)offsetof(struct mgs_config_res, mcr_size));
> +	LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_size) == 8, "found %lld\n",
> +		 (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_size));
> +
>  	/* Checks for struct lustre_capa */
>  	LASSERTF((int)sizeof(struct lustre_capa) == 120, "found %lld\n",
>  		 (long long)(int)sizeof(struct lustre_capa));
> -- 
> 1.8.3.1
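
The wire checks quoted above pin down the size and member offsets of each on-wire struct. A userspace sketch of the same idea for mgs_config_body, using C11 _Static_assert so a layout change fails at build time: the struct below is a hypothetical copy built from fixed-width types, with the expected numbers taken from the LASSERTF lines in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NAME_MAXLEN 64	/* the patch asserts a 64-byte mcb_name */

/* Hypothetical userspace copy of struct mgs_config_body. */
struct sketch_config_body {
	char	 mcb_name[SKETCH_NAME_MAXLEN];	/* logname */
	uint64_t mcb_offset;
	uint16_t mcb_type;
	uint8_t	 mcb_nm_cur_pass;
	uint8_t	 mcb_bits;
	uint32_t mcb_units;
};

/* Same numbers as the runtime checks in the patch, verified at compile time. */
_Static_assert(sizeof(struct sketch_config_body) == 80, "size");
_Static_assert(offsetof(struct sketch_config_body, mcb_name) == 0, "mcb_name");
_Static_assert(offsetof(struct sketch_config_body, mcb_offset) == 64, "mcb_offset");
_Static_assert(offsetof(struct sketch_config_body, mcb_type) == 72, "mcb_type");
_Static_assert(offsetof(struct sketch_config_body, mcb_nm_cur_pass) == 74,
	       "mcb_nm_cur_pass");
_Static_assert(offsetof(struct sketch_config_body, mcb_bits) == 75, "mcb_bits");
_Static_assert(offsetof(struct sketch_config_body, mcb_units) == 76, "mcb_units");
```

The patch itself mixes both styles: runtime LASSERTF for the struct layout and BUILD_BUG_ON for the CONFIG_T_* values. This sketch only shows the compile-time variant.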

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush
  2018-07-31  2:26 ` [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush James Simmons
@ 2018-07-31 22:55   ` NeilBrown
  2018-08-01  0:44     ` Yong, Fan
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-07-31 22:55 UTC (permalink / raw)
  To: lustre-devel

On Mon, Jul 30 2018, James Simmons wrote:

> From: Fan Yong <fan.yong@intel.com>
>
> client_fid_fini() is called when a client mount fails or the client is
> unmounted. At that point an asynchronous connection failure can trigger
> seq_client_flush(), whose argument may already have been freed by
> client_fid_fini().
>
> Introduce client_obd::cl_seq_rwsem to protect client_obd::cl_seq.

This looks odd...

I think the cl_seq_rwsem is being used like a refcount on cl_seq,
to prevent it from being freed while it is still in use.
If I'm correct, then I would much prefer that a refcount was used.

Is this more than just a disguised refcount?

NeilBrown
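
To make the refcount alternative concrete, here is a minimal userspace sketch using C11 atomics. The struct and function names are hypothetical stand-ins for lu_client_seq and its users. The sketch covers only the lifetime rule (the last put frees the object); it does not show how an asynchronous path would safely look up the pointer while teardown is clearing it, which would still need some form of protection.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct lu_client_seq with an embedded refcount. */
struct sketch_seq {
	atomic_int refs;
	int	   flushed;
};

static int sketch_seq_freed;	/* counts frees so the sketch can be checked */

static inline struct sketch_seq *sketch_seq_alloc(void)
{
	struct sketch_seq *seq = calloc(1, sizeof(*seq));

	if (seq)
		atomic_init(&seq->refs, 1);	/* the owner's reference */
	return seq;
}

static inline void sketch_seq_get(struct sketch_seq *seq)
{
	atomic_fetch_add(&seq->refs, 1);
}

static inline void sketch_seq_put(struct sketch_seq *seq)
{
	/* fetch_sub returns the old value; 1 means this was the last reference. */
	if (atomic_fetch_sub(&seq->refs, 1) == 1) {
		sketch_seq_freed++;
		free(seq);
	}
}
```

With this shape, teardown drops the owner's reference instead of calling kfree() directly, and a flush that already holds its own reference keeps the object alive until it finishes.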


>
> Signed-off-by: Fan Yong <fan.yong@intel.com>
> WC-id: https://jira.whamcloud.com/browse/LU-9224
> Reviewed-on: https://review.whamcloud.com/26079
> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/fid/fid_request.c | 21 +++++++++++++++------
>  drivers/staging/lustre/lustre/include/obd.h     |  1 +
>  drivers/staging/lustre/lustre/ldlm/ldlm_lib.c   |  2 ++
>  drivers/staging/lustre/lustre/mdc/mdc_request.c | 11 +++++++++--
>  4 files changed, 27 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/fid/fid_request.c b/drivers/staging/lustre/lustre/fid/fid_request.c
> index a34fd90..f91242c 100644
> --- a/drivers/staging/lustre/lustre/fid/fid_request.c
> +++ b/drivers/staging/lustre/lustre/fid/fid_request.c
> @@ -343,11 +343,14 @@ int client_fid_init(struct obd_device *obd,
>  {
>  	struct client_obd *cli = &obd->u.cli;
>  	char *prefix;
> -	int rc;
> +	int rc = 0;
>  
> +	down_write(&cli->cl_seq_rwsem);
>  	cli->cl_seq = kzalloc(sizeof(*cli->cl_seq), GFP_NOFS);
> -	if (!cli->cl_seq)
> -		return -ENOMEM;
> +	if (!cli->cl_seq) {
> +		rc = -ENOMEM;
> +		goto out_free_lock;
> +	}
>  
>  	prefix = kzalloc(MAX_OBD_NAME + 5, GFP_NOFS);
>  	if (!prefix) {
> @@ -361,10 +364,14 @@ int client_fid_init(struct obd_device *obd,
>  	seq_client_init(cli->cl_seq, exp, type, prefix);
>  	kfree(prefix);
>  
> -	return 0;
>  out_free_seq:
> -	kfree(cli->cl_seq);
> -	cli->cl_seq = NULL;
> +	if (rc) {
> +		kfree(cli->cl_seq);
> +		cli->cl_seq = NULL;
> +	}
> +out_free_lock:
> +	up_write(&cli->cl_seq_rwsem);
> +
>  	return rc;
>  }
>  EXPORT_SYMBOL(client_fid_init);
> @@ -373,11 +380,13 @@ int client_fid_fini(struct obd_device *obd)
>  {
>  	struct client_obd *cli = &obd->u.cli;
>  
> +	down_write(&cli->cl_seq_rwsem);
>  	if (cli->cl_seq) {
>  		seq_client_fini(cli->cl_seq);
>  		kfree(cli->cl_seq);
>  		cli->cl_seq = NULL;
>  	}
> +	up_write(&cli->cl_seq_rwsem);
>  
>  	return 0;
>  }
> diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
> index 333c703..3c0dbb6 100644
> --- a/drivers/staging/lustre/lustre/include/obd.h
> +++ b/drivers/staging/lustre/lustre/include/obd.h
> @@ -333,6 +333,7 @@ struct client_obd {
>  
>  	/* sequence manager */
>  	struct lu_client_seq    *cl_seq;
> +	struct rw_semaphore	 cl_seq_rwsem;
>  
>  	atomic_t	     cl_resends; /* resend count */
>  
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
> index c36d1e4..32eda4f 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
> @@ -308,6 +308,8 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
>  	}
>  
>  	init_rwsem(&cli->cl_sem);
> +	cli->cl_seq = NULL;
> +	init_rwsem(&cli->cl_seq_rwsem);
>  	cli->cl_conn_count = 0;
>  	memcpy(server_uuid.uuid, lustre_cfg_buf(lcfg, 2),
>  	       min_t(unsigned int, LUSTRE_CFG_BUFLEN(lcfg, 2),
> diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
> index c2f0a54..a759da2 100644
> --- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
> +++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
> @@ -2517,8 +2517,10 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
>  		 * Flush current sequence to make client obtain new one
>  		 * from server in case of disconnect/reconnect.
>  		 */
> +		down_read(&cli->cl_seq_rwsem);
>  		if (cli->cl_seq)
>  			seq_client_flush(cli->cl_seq);
> +		up_read(&cli->cl_seq_rwsem);
>  
>  		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_INACTIVE);
>  		break;
> @@ -2557,9 +2559,14 @@ int mdc_fid_alloc(const struct lu_env *env, struct obd_export *exp,
>  		  struct lu_fid *fid, struct md_op_data *op_data)
>  {
>  	struct client_obd *cli = &exp->exp_obd->u.cli;
> -	struct lu_client_seq *seq = cli->cl_seq;
> +	int rc = -EIO;
>  
> -	return seq_client_alloc_fid(env, seq, fid);
> +	down_read(&cli->cl_seq_rwsem);
> +	if (cli->cl_seq)
> +		rc = seq_client_alloc_fid(env, cli->cl_seq, fid);
> +	up_read(&cli->cl_seq_rwsem);
> +
> +	return rc;
>  }
>  
>  static struct obd_uuid *mdc_get_uuid(struct obd_export *exp)
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
  2018-07-31 22:47   ` NeilBrown
@ 2018-07-31 23:04     ` Patrick Farrell
  2018-08-01  0:23       ` NeilBrown
  0 siblings, 1 reply; 58+ messages in thread
From: Patrick Farrell @ 2018-07-31 23:04 UTC (permalink / raw)
  To: lustre-devel

Neil,

Do you object to the concept, or only to this one because it's unused?
Having a MAX makes it easy to write sanity checks like "< MYENUM_MAX", and they keep working if the enum is extended.  Seems useful to me.

- Patrick
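
A short sketch of the sanity-check pattern described above. The enum and helper names are hypothetical; the sentinel is called COUNT here rather than MAX, on the assumption that the objection earlier in the thread is to the name (the largest valid value is 3, while the sentinel is 4) rather than to the pattern.

```c
#include <stdbool.h>

enum sketch_config_type {
	SKETCH_T_CONFIG  = 0,
	SKETCH_T_SPTLRPC = 1,
	SKETCH_T_RECOVER = 2,
	SKETCH_T_PARAMS  = 3,
	SKETCH_T_COUNT		/* one past the last valid value */
};

/* Reject anything outside the known range, e.g. a type read off the wire. */
static inline bool sketch_config_type_valid(unsigned int type)
{
	return type < SKETCH_T_COUNT;
}
```

Because the sentinel has no explicit value, adding a new type before it extends the accepted range without touching the check.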
________________________________
From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
Sent: Tuesday, July 31, 2018 5:47:28 PM
To: James Simmons; Andreas Dilger; Oleg Drokin
Cc: Lustre Development List
Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h

On Mon, Jul 30 2018, James Simmons wrote:

> From: Niu Yawei <yawei.niu@intel.com>
>
> Move config type values CONFIG_T_XXX into lustre_idl.h since they
> will be put on wire when reading config logs.
>
> Add missing wire checks for mgs_nidtbl_entry, mgs_config_body and
> mgs_config_res.
>
> Redefine CONFIG_SUB_XXX for the sub clds attached on config log.
>
> Signed-off-by: Niu Yawei <yawei.niu@intel.com>
> WC-id: https://jira.whamcloud.com/browse/LU-9216
> Reviewed-on: https://review.whamcloud.com/26022
> Reviewed-by: Fan Yong <fan.yong@intel.com>
> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
>  drivers/staging/lustre/lustre/include/obd_class.h  | 12 +--
>  drivers/staging/lustre/lustre/mgc/mgc_request.c    |  6 +-
>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++++
>  4 files changed, 103 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
> index c9b32ef..bd3b45a 100644
> --- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
> +++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
> @@ -2111,11 +2111,19 @@ struct mgs_nidtbl_entry {
>        } u;
>  };
>
> +enum {
> +     CONFIG_T_CONFIG  = 0,
> +     CONFIG_T_SPTLRPC = 1,
> +     CONFIG_T_RECOVER = 2,
> +     CONFIG_T_PARAMS  = 3,
> +     CONFIG_T_MAX

Arrrgggh.  It's back.  I thought we had killed CONFIG_T_MAX (which isn't
a MAX).
It's never used, so it'll have to go.

NeilBrown


> +};
> +
>  struct mgs_config_body {
>        char            mcb_name[MTI_NAME_MAXLEN]; /* logname */
>        __u64           mcb_offset;    /* next index of config log to request */
>        __u16           mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
> -     __u8            mcb_reserved;
> +     __u8            mcb_nm_cur_pass;
>        __u8            mcb_bits;      /* bits unit size of config log */
>        __u32           mcb_units;     /* # of units for bulk transfer */
>  };
> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
> index 184da99..647cc22 100644
> --- a/drivers/staging/lustre/lustre/include/obd_class.h
> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
> @@ -156,16 +156,16 @@ struct config_llog_instance {
>  int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
>                            char *name, struct config_llog_instance *cfg);
>
> -#define CONFIG_T_CONFIG              BIT(0)
> -#define CONFIG_T_SPTLRPC     BIT(1)
> -#define CONFIG_T_RECOVER     BIT(2)
> -#define CONFIG_T_PARAMS              BIT(3)
> +#define CONFIG_SUB_CONFIG    BIT(0)
> +#define CONFIG_SUB_SPTLRPC   BIT(1)
> +#define CONFIG_SUB_RECOVER   BIT(2)
> +#define CONFIG_SUB_PARAMS    BIT(3)
>
>  /* Sub clds should be attached to the config_llog_data when processing
>   * config log for client or server target.
>   */
> -#define CONFIG_SUB_CLIENT    (CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
> -                              CONFIG_T_PARAMS)
> +#define CONFIG_SUB_CLIENT    (CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
> +                              CONFIG_SUB_PARAMS)
>
>  #define PARAMS_FILENAME      "params"
>  #define LCTL_UPCALL  "lctl"
> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> index 06fcc7e..833e6a0 100644
> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> @@ -315,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>        memcpy(seclogname, logname, ptr - logname);
>        strcpy(seclogname + (ptr - logname), "-sptlrpc");
>
> -     if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
> +     if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
>                sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
>                                                     CONFIG_T_SPTLRPC, cfg);
>                if (IS_ERR(sptlrpc_cld)) {
> @@ -325,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>                }
>        }
>
> -     if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
> +     if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
>                params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
>                                                    CONFIG_T_PARAMS, cfg);
>                if (IS_ERR(params_cld)) {
> @@ -345,7 +345,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>
>        LASSERT(lsi->lsi_lmd);
>        if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
> -         cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
> +         cfg->cfg_sub_clds & CONFIG_SUB_RECOVER) {
>                ptr = strrchr(seclogname, '-');
>                if (ptr) {
>                        *ptr = 0;
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
> index 2f081ed..09b1298 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
> @@ -3629,6 +3629,91 @@ void lustre_assert_wire_constants(void)
>        LASSERTF((int)sizeof(((struct mgs_target_info *)0)->mti_params) == 4096, "found %lld\n",
>                 (long long)(int)sizeof(((struct mgs_target_info *)0)->mti_params));
>
> +     /* Checks for struct mgs_nidtbl_entry */
> +     LASSERTF((int)sizeof(struct mgs_nidtbl_entry) == 24, "found %lld\n",
> +              (long long)(int)sizeof(struct mgs_nidtbl_entry));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_version) == 0, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_version));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version) == 8, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_instance) == 8, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_instance));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance) == 4, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_index) == 12, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_index));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index) == 4, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_length) == 16, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_length));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length) == 4, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_type) == 20, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_type));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type) == 1, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_type) == 21, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_type));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type) == 1, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_size) == 22, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_size));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size) == 1, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_count) == 23, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_count));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count) == 1, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count));
> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, u.nids[0]) == 24, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, u.nids[0]));
> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]) == 8, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]));
> +
> +     /* Checks for struct mgs_config_body */
> +     LASSERTF((int)sizeof(struct mgs_config_body) == 80, "found %lld\n",
> +              (long long)(int)sizeof(struct mgs_config_body));
> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_name) == 0, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_body, mcb_name));
> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_name) == 64, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_name));
> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_offset) == 64, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_body, mcb_offset));
> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_offset) == 8, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_offset));
> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_type) == 72, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_body, mcb_type));
> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_type) == 2, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_type));
> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_nm_cur_pass) == 74, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_body, mcb_nm_cur_pass));
> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass) == 1, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass));
> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_bits) == 75, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_body, mcb_bits));
> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_bits) == 1, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_bits));
> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_units) == 76, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_body, mcb_units));
> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
> +
> +     BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
> +     BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
> +     BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
> +     BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
> +
> +     /* Checks for struct mgs_config_res */
> +     LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
> +              (long long)(int)sizeof(struct mgs_config_res));
> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_offset) == 0, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_res, mcr_offset));
> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_offset) == 8, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_offset));
> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_size) == 8, "found %lld\n",
> +              (long long)(int)offsetof(struct mgs_config_res, mcr_size));
> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_size) == 8, "found %lld\n",
> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_size));
> +
>        /* Checks for struct lustre_capa */
>        LASSERTF((int)sizeof(struct lustre_capa) == 120, "found %lld\n",
>                 (long long)(int)sizeof(struct lustre_capa));
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-07-31 22:38   ` NeilBrown
@ 2018-07-31 23:54     ` Amir Shehata
  2018-08-01  0:32       ` NeilBrown
  0 siblings, 1 reply; 58+ messages in thread
From: Amir Shehata @ 2018-07-31 23:54 UTC (permalink / raw)
  To: lustre-devel

Hops and priority both serve to select the preferred route. If you have multiple gateways leading to the same destination, you select the one with the highest priority (0 being the highest), then the one with the fewest hops. If you don't specify hops, the route is treated as the least favoured when other routes have hops specified. If hops and priority are equal between routes, you select the one with the most credits available; if that is also equal, you select round robin.
 
In that sense hops and priority really serve the same purpose: selecting the preferred route. If it were up to me I would keep only one of them, but for historical reasons both are kept.
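
To make the ordering concrete, here is a minimal sketch of the comparison as described above. It is not the LNet code: struct route_sketch and route_compare() are hypothetical names, and the fields are simplified assumptions made for illustration only.

```c
/* Hypothetical, simplified route record; not the real LNet structure. */
struct route_sketch {
	unsigned int priority;	/* 0 is the most preferred */
	int hops;		/* -1 means hops were not specified */
	int credits;		/* credits currently available */
};

/*
 * Return a negative value if route a is preferred, a positive value if
 * route b is preferred, and 0 if they tie on every criterion.
 */
static int route_compare(const struct route_sketch *a,
			 const struct route_sketch *b)
{
	/* 1. Lower priority number wins. */
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;

	/* 2. Fewer hops wins; unspecified hops (-1) is least favoured. */
	if (a->hops != b->hops) {
		if (a->hops == -1)
			return 1;
		if (b->hops == -1)
			return -1;
		return a->hops < b->hops ? -1 : 1;
	}

	/* 3. More available credits wins. */
	if (a->credits != b->credits)
		return a->credits > b->credits ? -1 : 1;

	/* 4. Full tie: the caller would fall back to round robin. */
	return 0;
}
```

A caller would walk the candidate gateways, keep the one for which route_compare() reports a preference, and rotate among the candidates only when the result is 0.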

Therefore, I'm not sure "unlimited" actually conveys the correct interpretation of that value. Note that there could be user scripts out there that already parse the output, so changing the -1 could break them. Changing it would also create an inconsistency between the server and client.

thanks
amir
________________________________________
From: NeilBrown [neilb at suse.com]
Sent: Tuesday, July 31, 2018 3:38 PM
To: James Simmons; Andreas Dilger; Oleg Drokin
Cc: Lustre Development List; Amir Shehata; James Simmons
Subject: Re: [PATCH 11/31] lustre: lnet: Fix route hops print

On Mon, Jul 30 2018, James Simmons wrote:

> From: Amir Shehata <ashehata@whamcloud.com>
>
> The default number of hops for  a route is -1. This is
> currently being printed as %u. Change that to %d to
> make it print out properly.

-1 hops???  I wish I could hop -1 times - it would be a good party
trick!!

What does -1 mean?  Unlimited (just a guess).  If so, could we print
"unlimited"??

I'm fine with having magic numbers in the code, but I don't like them to
leak out.

NeilBrown

>
> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
> WC-id: https://jira.whamcloud.com/browse/LU-9078
> Reviewed-on: https://review.whamcloud.com/25250
> Reviewed-by: Olaf Weber <olaf@sgi.com>
> Reviewed-by: Doug Oucharek <dougso@me.com>
> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
> index 8856798..aa98ce5 100644
> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>                       int alive = lnet_is_route_alive(route);
>
>                       s += snprintf(s, tmpstr + tmpsiz - s,
> -                                   "%-8s %4u %8u %7s %s\n",
> +                                   "%-8s %4d %8u %7s %s\n",
>                                     libcfs_net2str(net), hops,
>                                     priority,
>                                     alive ? "up" : "down",
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
  2018-07-31 23:04     ` Patrick Farrell
@ 2018-08-01  0:23       ` NeilBrown
  2018-08-01  0:40         ` Patrick Farrell
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-08-01  0:23 UTC (permalink / raw)
  To: lustre-devel

On Tue, Jul 31 2018, Patrick Farrell wrote:

> Neil,
>
> Do you have an objection to the concept, or just because this one's not used?
> Having a MAX makes it easy to write things like < MYENUM_MAX as sanity checking code, and then if the enum is added to, it still works.  Seems useful to me.

I object to the name.  "MAX" is short for "MAXIMUM" which means the
highest value that is actually used.  When comparing something to the
maximum it makes sense to say
   if (foo <= maximum)

but it rarely makes sense to say
   if (foo < maximum)

If you want a count of the number of values, use MYENUM_CNT or
MYENUM_NUM. This can sensibly be one more than the maximum value.
But if you have MYENUM_MAX, make sure it is the maximum meaningful
value for the enum.
</rant>
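
A small sketch of the naming distinction, using made-up identifiers (the CFG_SKETCH_* names and both helper functions are hypothetical and do not exist in the Lustre tree):

```c
#include <stdbool.h>

/* Hypothetical enum that mirrors the shape of the CONFIG_T_* values. */
enum cfg_type_sketch {
	CFG_SKETCH_CONFIG  = 0,
	CFG_SKETCH_SPTLRPC = 1,
	CFG_SKETCH_RECOVER = 2,
	CFG_SKETCH_PARAMS  = 3,
	CFG_SKETCH_NUM,				/* count of values: 4 */
	CFG_SKETCH_MAX = CFG_SKETCH_NUM - 1	/* highest valid value: 3 */
};

/* A real maximum is compared with <=. */
static bool cfg_type_valid_by_max(unsigned int type)
{
	return type <= CFG_SKETCH_MAX;
}

/* A count is compared with <. */
static bool cfg_type_valid_by_num(unsigned int type)
{
	return type < CFG_SKETCH_NUM;
}
```

Both checks accept exactly the same values and both keep working when a new entry is added before CFG_SKETCH_NUM; the only difference is that each name matches the comparison used with it.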

Thanks,
NeilBrown


>
> - Patrick
> ________________________________
> From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
> Sent: Tuesday, July 31, 2018 5:47:28 PM
> To: James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
>
> On Mon, Jul 30 2018, James Simmons wrote:
>
>> From: Niu Yawei <yawei.niu@intel.com>
>>
>> Move config type values CONFIG_T_XXX into lustre_idl.h since they
>> will be put on wire when reading config logs.
>>
>> Add missing wire checks for mgs_nidtbl_entry, mgs_config_body and
>> mgs_config_res.
>>
>> Redefine CONFIG_SUB_XXX for the sub clds attached on config log.
>>
>> Signed-off-by: Niu Yawei <yawei.niu@intel.com>
>> WC-id: https://jira.whamcloud.com/browse/LU-9216
>> Reviewed-on: https://review.whamcloud.com/26022
>> Reviewed-by: Fan Yong <fan.yong@intel.com>
>> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>> ---
>>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
>>  drivers/staging/lustre/lustre/include/obd_class.h  | 12 +--
>>  drivers/staging/lustre/lustre/mgc/mgc_request.c    |  6 +-
>>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++++
>>  4 files changed, 103 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>> index c9b32ef..bd3b45a 100644
>> --- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>> +++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>> @@ -2111,11 +2111,19 @@ struct mgs_nidtbl_entry {
>>        } u;
>>  };
>>
>> +enum {
>> +     CONFIG_T_CONFIG  = 0,
>> +     CONFIG_T_SPTLRPC = 1,
>> +     CONFIG_T_RECOVER = 2,
>> +     CONFIG_T_PARAMS  = 3,
>> +     CONFIG_T_MAX
>
> Arrrgggh.  It's back.  I thought we had killed CONFIG_T_MAX (which isn't
> a MAX).
> It's never used, so it'll have to go.
>
> NeilBrown
>
>
>> +};
>> +
>>  struct mgs_config_body {
>>        char            mcb_name[MTI_NAME_MAXLEN]; /* logname */
>>        __u64           mcb_offset;    /* next index of config log to request */
>>        __u16           mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
>> -     __u8            mcb_reserved;
>> +     __u8            mcb_nm_cur_pass;
>>        __u8            mcb_bits;      /* bits unit size of config log */
>>        __u32           mcb_units;     /* # of units for bulk transfer */
>>  };
>> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
>> index 184da99..647cc22 100644
>> --- a/drivers/staging/lustre/lustre/include/obd_class.h
>> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
>> @@ -156,16 +156,16 @@ struct config_llog_instance {
>>  int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
>>                            char *name, struct config_llog_instance *cfg);
>>
>> -#define CONFIG_T_CONFIG              BIT(0)
>> -#define CONFIG_T_SPTLRPC     BIT(1)
>> -#define CONFIG_T_RECOVER     BIT(2)
>> -#define CONFIG_T_PARAMS              BIT(3)
>> +#define CONFIG_SUB_CONFIG    BIT(0)
>> +#define CONFIG_SUB_SPTLRPC   BIT(1)
>> +#define CONFIG_SUB_RECOVER   BIT(2)
>> +#define CONFIG_SUB_PARAMS    BIT(3)
>>
>>  /* Sub clds should be attached to the config_llog_data when processing
>>   * config log for client or server target.
>>   */
>> -#define CONFIG_SUB_CLIENT    (CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
>> -                              CONFIG_T_PARAMS)
>> +#define CONFIG_SUB_CLIENT    (CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
>> +                              CONFIG_SUB_PARAMS)
>>
>>  #define PARAMS_FILENAME      "params"
>>  #define LCTL_UPCALL  "lctl"
>> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>> index 06fcc7e..833e6a0 100644
>> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
>> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>> @@ -315,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>        memcpy(seclogname, logname, ptr - logname);
>>        strcpy(seclogname + (ptr - logname), "-sptlrpc");
>>
>> -     if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
>>                sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
>>                                                     CONFIG_T_SPTLRPC, cfg);
>>                if (IS_ERR(sptlrpc_cld)) {
>> @@ -325,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>                }
>>        }
>>
>> -     if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
>>                params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
>>                                                    CONFIG_T_PARAMS, cfg);
>>                if (IS_ERR(params_cld)) {
>> @@ -345,7 +345,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>
>>        LASSERT(lsi->lsi_lmd);
>>        if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
>> -         cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
>> +         cfg->cfg_sub_clds & CONFIG_SUB_RECOVER) {
>>                ptr = strrchr(seclogname, '-');
>>                if (ptr) {
>>                        *ptr = 0;
>> diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>> index 2f081ed..09b1298 100644
>> --- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>> +++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>> @@ -3629,6 +3629,91 @@ void lustre_assert_wire_constants(void)
>>        LASSERTF((int)sizeof(((struct mgs_target_info *)0)->mti_params) == 4096, "found %lld\n",
>>                 (long long)(int)sizeof(((struct mgs_target_info *)0)->mti_params));
>>
>> +     /* Checks for struct mgs_nidtbl_entry */
>> +     LASSERTF((int)sizeof(struct mgs_nidtbl_entry) == 24, "found %lld\n",
>> +              (long long)(int)sizeof(struct mgs_nidtbl_entry));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_version) == 0, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_version));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_instance) == 8, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_instance));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_index) == 12, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_index));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_length) == 16, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_length));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_type) == 20, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_type));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_type) == 21, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_type));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_size) == 22, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_size));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_count) == 23, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_count));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, u.nids[0]) == 24, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, u.nids[0]));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]));
>> +
>> +     /* Checks for struct mgs_config_body */
>> +     LASSERTF((int)sizeof(struct mgs_config_body) == 80, "found %lld\n",
>> +              (long long)(int)sizeof(struct mgs_config_body));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_name) == 0, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_name));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_name) == 64, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_name));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_offset) == 64, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_offset));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_offset) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_offset));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_type) == 72, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_type));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_type) == 2, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_type));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_nm_cur_pass) == 74, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_nm_cur_pass));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_bits) == 75, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_bits));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_bits) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_bits));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_units) == 76, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_units));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
>> +
>> +     BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
>> +     BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
>> +     BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
>> +     BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
>> +
>> +     /* Checks for struct mgs_config_res */
>> +     LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
>> +              (long long)(int)sizeof(struct mgs_config_res));
>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_offset) == 0, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_offset));
>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_offset) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_offset));
>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_size) == 8, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_size));
>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_size) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_size));
>> +
>>        /* Checks for struct lustre_capa */
>>        LASSERTF((int)sizeof(struct lustre_capa) == 120, "found %lld\n",
>>                 (long long)(int)sizeof(struct lustre_capa));
>> --
>> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-07-31 23:54     ` Amir Shehata
@ 2018-08-01  0:32       ` NeilBrown
  2018-08-01  1:15         ` Amir Shehata
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-08-01  0:32 UTC (permalink / raw)
  To: lustre-devel


Hi Amir,
 thanks for the background.

 I had to chuckle at "0 being the highest", though I know that this
 distortion is not something specific to lustre.

 Your description seems to suggest that "-1" means "unknown" with an
 implication that the number of hops might be high, and best not to take
 the risk.

 Your point about compatibility with scripts has some validity, though
 it is annoying to have to support such ugly interfaces indefinitely.
 Are there really likely to be dependencies? lustre has only been
 printing -1 since Feb last year when this patch went upstream.
 That was presumably an ABI change, as it would have printed 4294967295
 (UINT_MAX) previously.  Did that cause any problems?

Thanks,
NeilBrown


On Tue, Jul 31 2018, Amir Shehata wrote:

> Hops and priority both serve to select the preferred route. If you have multiple gateways leading to the same destination, you select the one with the highest priority (0 being the highest), then the one with the fewest hops. If you don't specify hops, the route is treated as the least favoured when other routes have hops specified. If hops and priority are equal between routes, you select the one with the most credits available; if that is also equal, you select round robin.
>  
> In that sense hops and priority really serve the same purpose: selecting the preferred route. If it were up to me I would keep only one of them, but for historical reasons both are kept.
>
> Therefore, I'm not sure "unlimited" actually conveys the correct interpretation of that value. Note that there could be user scripts out there that already parse the output, so changing the -1 could break them. Changing it would also create an inconsistency between the server and client.
>
> thanks
> amir
> ________________________________________
> From: NeilBrown [neilb at suse.com]
> Sent: Tuesday, July 31, 2018 3:38 PM
> To: James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List; Amir Shehata; James Simmons
> Subject: Re: [PATCH 11/31] lustre: lnet: Fix route hops print
>
> On Mon, Jul 30 2018, James Simmons wrote:
>
>> From: Amir Shehata <ashehata@whamcloud.com>
>>
>> The default number of hops for  a route is -1. This is
>> currently being printed as %u. Change that to %d to
>> make it print out properly.
>
> -1 hops???  I wish I could hop -1 times - it would be a good party
> trick!!
>
> What does -1 mean?  Unlimited (just a guess).  If so, could we print
> "unlimited"??
>
> I'm fine with having magic numbers in the code, but I don't like them to
> leak out.
>
> NeilBrown
>
>>
>> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
>> WC-id: https://jira.whamcloud.com/browse/LU-9078
>> Reviewed-on: https://review.whamcloud.com/25250
>> Reviewed-by: Olaf Weber <olaf@sgi.com>
>> Reviewed-by: Doug Oucharek <dougso@me.com>
>> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>> ---
>>  drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
>> index 8856798..aa98ce5 100644
>> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
>> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
>> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>>                       int alive = lnet_is_route_alive(route);
>>
>>                       s += snprintf(s, tmpstr + tmpsiz - s,
>> -                                   "%-8s %4u %8u %7s %s\n",
>> +                                   "%-8s %4d %8u %7s %s\n",
>>                                     libcfs_net2str(net), hops,
>>                                     priority,
>>                                     alive ? "up" : "down",
>> --
>> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
  2018-08-01  0:23       ` NeilBrown
@ 2018-08-01  0:40         ` Patrick Farrell
  2018-08-01  2:10           ` NeilBrown
  0 siblings, 1 reply; 58+ messages in thread
From: Patrick Farrell @ 2018-08-01  0:40 UTC (permalink / raw)
  To: lustre-devel


Huh.  I see your point, and the logic of it is obviously sound, but I'm curious - I don't think I've ever seen names like those you gave used.  I've only seen MAX.

Am I wrong about that, or is MAX indeed the dominant choice?

I also think in context, used in caps, MAX is clearly a special value without other meaning.
________________________________
From: NeilBrown <neilb@suse.com>
Sent: Tuesday, July 31, 2018 7:23:21 PM
To: Patrick Farrell; James Simmons; Andreas Dilger; Oleg Drokin
Cc: Lustre Development List
Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h

On Tue, Jul 31 2018, Patrick Farrell wrote:

> Neil,
>
> Do you have an objection to the concept, or just because this one's not used?
> Having a MAX makes it easy to write things like < MYENUM_MAX as sanity checking code, and then if the enum is added to, it still works.  Seems useful to me.

I object to the name.  "MAX" is short for "MAXIMUM" which means the
highest value that is actually used.  When comparing something to the
maximum it makes sense to say
   if (foo <= maximum)

but it rarely makes sense to say
   if (foo < maximum)

If you want a count of the number of values, use MYENUM_CNT or
MYENUM_NUM. This can sensibly be one more than the maximum value.
But if you have MYENUM_MAX, make sure it is the maximum meaningful
value for the enum.
</rant>

Thanks,
NeilBrown


>
> - Patrick
> ________________________________
> From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
> Sent: Tuesday, July 31, 2018 5:47:28 PM
> To: James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
>
> On Mon, Jul 30 2018, James Simmons wrote:
>
>> From: Niu Yawei <yawei.niu@intel.com>
>>
>> Move config type values CONFIG_T_XXX into lustre_idl.h since they
>> will be put on wire when reading config logs.
>>
>> Add missing wire checks for mgs_nidtbl_entry, mgs_config_body and
>> mgs_config_res.
>>
>> Redefine CONFIG_SUB_XXX for the sub clds attached on config log.
>>
>> Signed-off-by: Niu Yawei <yawei.niu@intel.com>
>> WC-id: https://jira.whamcloud.com/browse/LU-9216
>> Reviewed-on: https://review.whamcloud.com/26022
>> Reviewed-by: Fan Yong <fan.yong@intel.com>
>> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>> ---
>>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
>>  drivers/staging/lustre/lustre/include/obd_class.h  | 12 +--
>>  drivers/staging/lustre/lustre/mgc/mgc_request.c    |  6 +-
>>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++++
>>  4 files changed, 103 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>> index c9b32ef..bd3b45a 100644
>> --- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>> +++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>> @@ -2111,11 +2111,19 @@ struct mgs_nidtbl_entry {
>>        } u;
>>  };
>>
>> +enum {
>> +     CONFIG_T_CONFIG  = 0,
>> +     CONFIG_T_SPTLRPC = 1,
>> +     CONFIG_T_RECOVER = 2,
>> +     CONFIG_T_PARAMS  = 3,
>> +     CONFIG_T_MAX
>
> Arrrgggh.  It's back.  I thought we had killed CONFIG_T_MAX (which isn't
> a MAX).
> It's never used, so it'll have to go.
>
> NeilBrown
>
>
>> +};
>> +
>>  struct mgs_config_body {
>>        char            mcb_name[MTI_NAME_MAXLEN]; /* logname */
>>        __u64           mcb_offset;    /* next index of config log to request */
>>        __u16           mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
>> -     __u8            mcb_reserved;
>> +     __u8            mcb_nm_cur_pass;
>>        __u8            mcb_bits;      /* bits unit size of config log */
>>        __u32           mcb_units;     /* # of units for bulk transfer */
>>  };
>> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
>> index 184da99..647cc22 100644
>> --- a/drivers/staging/lustre/lustre/include/obd_class.h
>> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
>> @@ -156,16 +156,16 @@ struct config_llog_instance {
>>  int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
>>                            char *name, struct config_llog_instance *cfg);
>>
>> -#define CONFIG_T_CONFIG              BIT(0)
>> -#define CONFIG_T_SPTLRPC     BIT(1)
>> -#define CONFIG_T_RECOVER     BIT(2)
>> -#define CONFIG_T_PARAMS              BIT(3)
>> +#define CONFIG_SUB_CONFIG    BIT(0)
>> +#define CONFIG_SUB_SPTLRPC   BIT(1)
>> +#define CONFIG_SUB_RECOVER   BIT(2)
>> +#define CONFIG_SUB_PARAMS    BIT(3)
>>
>>  /* Sub clds should be attached to the config_llog_data when processing
>>   * config log for client or server target.
>>   */
>> -#define CONFIG_SUB_CLIENT    (CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
>> -                              CONFIG_T_PARAMS)
>> +#define CONFIG_SUB_CLIENT    (CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
>> +                              CONFIG_SUB_PARAMS)
>>
>>  #define PARAMS_FILENAME      "params"
>>  #define LCTL_UPCALL  "lctl"
>> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>> index 06fcc7e..833e6a0 100644
>> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
>> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>> @@ -315,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>        memcpy(seclogname, logname, ptr - logname);
>>        strcpy(seclogname + (ptr - logname), "-sptlrpc");
>>
>> -     if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
>>                sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
>>                                                     CONFIG_T_SPTLRPC, cfg);
>>                if (IS_ERR(sptlrpc_cld)) {
>> @@ -325,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>                }
>>        }
>>
>> -     if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
>>                params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
>>                                                    CONFIG_T_PARAMS, cfg);
>>                if (IS_ERR(params_cld)) {
>> @@ -345,7 +345,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>
>>        LASSERT(lsi->lsi_lmd);
>>        if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
>> -         cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
>> +         cfg->cfg_sub_clds & CONFIG_SUB_RECOVER) {
>>                ptr = strrchr(seclogname, '-');
>>                if (ptr) {
>>                        *ptr = 0;
>> diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>> index 2f081ed..09b1298 100644
>> --- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>> +++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>> @@ -3629,6 +3629,91 @@ void lustre_assert_wire_constants(void)
>>        LASSERTF((int)sizeof(((struct mgs_target_info *)0)->mti_params) == 4096, "found %lld\n",
>>                 (long long)(int)sizeof(((struct mgs_target_info *)0)->mti_params));
>>
>> +     /* Checks for struct mgs_nidtbl_entry */
>> +     LASSERTF((int)sizeof(struct mgs_nidtbl_entry) == 24, "found %lld\n",
>> +              (long long)(int)sizeof(struct mgs_nidtbl_entry));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_version) == 0, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_version));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_instance) == 8, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_instance));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_index) == 12, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_index));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_length) == 16, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_length));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_type) == 20, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_type));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_type) == 21, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_type));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_size) == 22, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_size));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_count) == 23, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_count));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count));
>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, u.nids[0]) == 24, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, u.nids[0]));
>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]));
>> +
>> +     /* Checks for struct mgs_config_body */
>> +     LASSERTF((int)sizeof(struct mgs_config_body) == 80, "found %lld\n",
>> +              (long long)(int)sizeof(struct mgs_config_body));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_name) == 0, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_name));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_name) == 64, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_name));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_offset) == 64, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_offset));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_offset) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_offset));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_type) == 72, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_type));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_type) == 2, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_type));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_nm_cur_pass) == 74, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_nm_cur_pass));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_bits) == 75, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_bits));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_bits) == 1, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_bits));
>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_units) == 76, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_units));
>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
>> +
>> +     BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
>> +     BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
>> +     BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
>> +     BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
>> +
>> +     /* Checks for struct mgs_config_res */
>> +     LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
>> +              (long long)(int)sizeof(struct mgs_config_res));
>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_offset) == 0, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_offset));
>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_offset) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_offset));
>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_size) == 8, "found %lld\n",
>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_size));
>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_size) == 8, "found %lld\n",
>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_size));
>> +
>>        /* Checks for struct lustre_capa */
>>        LASSERTF((int)sizeof(struct lustre_capa) == 120, "found %lld\n",
>>                 (long long)(int)sizeof(struct lustre_capa));
>> --
>> 1.8.3.1
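
The wire checks in the patch above run at module load time through LASSERTF. As a rough illustration only, the same layout can be pinned down at compile time in userspace C11. The struct name mgs_config_body_sketch and the NAME_MAXLEN constant are made up for this sketch; the field names, sizes and offsets are the ones the patch asserts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the patch asserts sizeof(mcb_name) == 64, so 64 stands in
 * for MTI_NAME_MAXLEN here.
 */
#define NAME_MAXLEN 64

/* Hypothetical userspace copy of struct mgs_config_body. */
struct mgs_config_body_sketch {
	char     mcb_name[NAME_MAXLEN]; /* logname */
	uint64_t mcb_offset;            /* next index of config log to request */
	uint16_t mcb_type;              /* type of log */
	uint8_t  mcb_nm_cur_pass;
	uint8_t  mcb_bits;              /* bits unit size of config log */
	uint32_t mcb_units;             /* # of units for bulk transfer */
};

/* Same numbers as the LASSERTF checks, but a mismatch breaks the build. */
static_assert(sizeof(struct mgs_config_body_sketch) == 80, "size");
static_assert(offsetof(struct mgs_config_body_sketch, mcb_name) == 0, "mcb_name");
static_assert(offsetof(struct mgs_config_body_sketch, mcb_offset) == 64, "mcb_offset");
static_assert(offsetof(struct mgs_config_body_sketch, mcb_type) == 72, "mcb_type");
static_assert(offsetof(struct mgs_config_body_sketch, mcb_nm_cur_pass) == 74, "mcb_nm_cur_pass");
static_assert(offsetof(struct mgs_config_body_sketch, mcb_bits) == 75, "mcb_bits");
static_assert(offsetof(struct mgs_config_body_sketch, mcb_units) == 76, "mcb_units");
```

Whether compile-time checks would suit wiretest.c is a separate question; the sketch only shows that the asserted offsets are mutually consistent.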

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush
  2018-07-31 22:55   ` NeilBrown
@ 2018-08-01  0:44     ` Yong, Fan
  2018-08-01  2:58       ` NeilBrown
  0 siblings, 1 reply; 58+ messages in thread
From: Yong, Fan @ 2018-08-01  0:44 UTC (permalink / raw)
  To: lustre-devel

The client_obd::cl_seq_rwsem protects the client_obd::cl_seq pointer itself, not the members inside client_obd::cl_seq. I do not think a simple refcount can work here. For example, if client_fid_fini() is destroying client_obd::cl_seq, we need some mechanism, such as a mutex, to prevent others from accessing client_obd::cl_seq. In that case, even if someone could acquire a refcount (after client_fid_fini() has started), it still cannot prevent the in-progress destroy.


--
Cheers,
Nasf

-----Original Message-----
From: NeilBrown [mailto:neilb at suse.com] 
Sent: Wednesday, August 1, 2018 6:55 AM
To: James Simmons <jsimmons@infradead.org>; Andreas Dilger <adilger@whamcloud.com>; Oleg Drokin <green@whamcloud.com>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>; Yong, Fan <fan.yong@intel.com>; James Simmons <jsimmons@infradead.org>
Subject: Re: [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush

On Mon, Jul 30 2018, James Simmons wrote:

> From: Fan Yong <fan.yong@intel.com>
>
> When the client mount failed or umount, the client_fid_fini() will be 
> called. At that time, the async connection failure will trigger
> seq_client_flush() which parameter may have been released by the
> client_fid_fini() by race.
>
> Introduce client_obd::cl_seq_rwsem to protect client_obd::cl_seq.

This looks odd..

I think the cl_seq_rwsem is being used like a refcount on cl_seq, to prevent it from being freed while it is still in use.
If I'm correct, then I would much prefer that a refcount was used.

Is this more than just a disguised refcount?

NeilBrown


>
> Signed-off-by: Fan Yong <fan.yong@intel.com>
> WC-id: https://jira.whamcloud.com/browse/LU-9224
> Reviewed-on: https://review.whamcloud.com/26079
> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/fid/fid_request.c | 21 +++++++++++++++------
>  drivers/staging/lustre/lustre/include/obd.h     |  1 +
>  drivers/staging/lustre/lustre/ldlm/ldlm_lib.c   |  2 ++
>  drivers/staging/lustre/lustre/mdc/mdc_request.c | 11 +++++++++--
>  4 files changed, 27 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/fid/fid_request.c 
> b/drivers/staging/lustre/lustre/fid/fid_request.c
> index a34fd90..f91242c 100644
> --- a/drivers/staging/lustre/lustre/fid/fid_request.c
> +++ b/drivers/staging/lustre/lustre/fid/fid_request.c
> @@ -343,11 +343,14 @@ int client_fid_init(struct obd_device *obd,  {
>  	struct client_obd *cli = &obd->u.cli;
>  	char *prefix;
> -	int rc;
> +	int rc = 0;
>  
> +	down_write(&cli->cl_seq_rwsem);
>  	cli->cl_seq = kzalloc(sizeof(*cli->cl_seq), GFP_NOFS);
> -	if (!cli->cl_seq)
> -		return -ENOMEM;
> +	if (!cli->cl_seq) {
> +		rc = -ENOMEM;
> +		goto out_free_lock;
> +	}
>  
>  	prefix = kzalloc(MAX_OBD_NAME + 5, GFP_NOFS);
>  	if (!prefix) {
> @@ -361,10 +364,14 @@ int client_fid_init(struct obd_device *obd,
>  	seq_client_init(cli->cl_seq, exp, type, prefix);
>  	kfree(prefix);
>  
> -	return 0;
>  out_free_seq:
> -	kfree(cli->cl_seq);
> -	cli->cl_seq = NULL;
> +	if (rc) {
> +		kfree(cli->cl_seq);
> +		cli->cl_seq = NULL;
> +	}
> +out_free_lock:
> +	up_write(&cli->cl_seq_rwsem);
> +
>  	return rc;
>  }
>  EXPORT_SYMBOL(client_fid_init);
> @@ -373,11 +380,13 @@ int client_fid_fini(struct obd_device *obd)  {
>  	struct client_obd *cli = &obd->u.cli;
>  
> +	down_write(&cli->cl_seq_rwsem);
>  	if (cli->cl_seq) {
>  		seq_client_fini(cli->cl_seq);
>  		kfree(cli->cl_seq);
>  		cli->cl_seq = NULL;
>  	}
> +	up_write(&cli->cl_seq_rwsem);
>  
>  	return 0;
>  }
> diff --git a/drivers/staging/lustre/lustre/include/obd.h 
> b/drivers/staging/lustre/lustre/include/obd.h
> index 333c703..3c0dbb6 100644
> --- a/drivers/staging/lustre/lustre/include/obd.h
> +++ b/drivers/staging/lustre/lustre/include/obd.h
> @@ -333,6 +333,7 @@ struct client_obd {
>  
>  	/* sequence manager */
>  	struct lu_client_seq    *cl_seq;
> +	struct rw_semaphore	 cl_seq_rwsem;
>  
>  	atomic_t	     cl_resends; /* resend count */
>  
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c 
> b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
> index c36d1e4..32eda4f 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
> @@ -308,6 +308,8 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
>  	}
>  
>  	init_rwsem(&cli->cl_sem);
> +	cli->cl_seq = NULL;
> +	init_rwsem(&cli->cl_seq_rwsem);
>  	cli->cl_conn_count = 0;
>  	memcpy(server_uuid.uuid, lustre_cfg_buf(lcfg, 2),
>  	       min_t(unsigned int, LUSTRE_CFG_BUFLEN(lcfg, 2),
> diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c 
> b/drivers/staging/lustre/lustre/mdc/mdc_request.c
> index c2f0a54..a759da2 100644
> --- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
> +++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
> @@ -2517,8 +2517,10 @@ static int mdc_import_event(struct obd_device *obd, struct obd_import *imp,
>  		 * Flush current sequence to make client obtain new one
>  		 * from server in case of disconnect/reconnect.
>  		 */
> +		down_read(&cli->cl_seq_rwsem);
>  		if (cli->cl_seq)
>  			seq_client_flush(cli->cl_seq);
> +		up_read(&cli->cl_seq_rwsem);
>  
>  		rc = obd_notify_observer(obd, obd, OBD_NOTIFY_INACTIVE);
>  		break;
> @@ -2557,9 +2559,14 @@ int mdc_fid_alloc(const struct lu_env *env, struct obd_export *exp,
>  		  struct lu_fid *fid, struct md_op_data *op_data)  {
>  	struct client_obd *cli = &exp->exp_obd->u.cli;
> -	struct lu_client_seq *seq = cli->cl_seq;
> +	int rc = -EIO;
>  
> -	return seq_client_alloc_fid(env, seq, fid);
> +	down_read(&cli->cl_seq_rwsem);
> +	if (cli->cl_seq)
> +		rc = seq_client_alloc_fid(env, cli->cl_seq, fid);
> +	up_read(&cli->cl_seq_rwsem);
> +
> +	return rc;
>  }
>  
>  static struct obd_uuid *mdc_get_uuid(struct obd_export *exp)
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-08-01  0:32       ` NeilBrown
@ 2018-08-01  1:15         ` Amir Shehata
  2018-08-01  2:50           ` NeilBrown
  0 siblings, 1 reply; 58+ messages in thread
From: Amir Shehata @ 2018-08-01  1:15 UTC (permalink / raw)
  To: lustre-devel

Hi Neil,

This issue actually came up because of LU-6060, which changed the behavior for LLNL. The behavior then was changed again by LU-6851: https://jira.whamcloud.com/browse/LU-6851 (if you'd like more background)

As a result of LU-6851 we were printing the unsigned value of -1. That's why we ended up printing it as -1, which is more bearable than just printing a large unsigned value.
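
For anyone who wants to see the difference, here is a small standalone sketch. format_hops() is a made-up helper, not the router_proc.c code, and the unsigned output assumes a 32-bit int:

```c
#include <stdio.h>

/*
 * Format a hop count either the old way (%u) or the patched way (%d).
 * Hypothetical helper for illustration only.
 */
static void format_hops(char *buf, size_t len, int hops, int as_signed)
{
	if (as_signed)
		snprintf(buf, len, "%4d", hops);
	else
		snprintf(buf, len, "%4u", (unsigned int)hops);
}
```

With hops set to -1, the signed form yields "  -1" and the unsigned form yields "4294967295".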

I'm not disagreeing that it would be better to print a clearer value; "unknown" does sound like it conveys the correct meaning. However, I've sometimes run into cases where changing user-facing interfaces caused problems for user scripts. Whether that would be the case here, I'm not 100% sure. We can always make the change and then wait for tickets to be opened.

However, of more concern to me is that if we make changes like this to the upstream client, it's probably a good idea to also make them in the whamcloud repo, so that the client and server do not diverge (LNet is common between them). This particular case might not be very significant, but other issues might come up that are more significant.

The code bases are already diverging significantly between the upstream client and the master repo, which makes porting features from master to upstream a difficult task. Do we have a strategy on how to deal with this?

thanks
amir
________________________________________
From: NeilBrown [neilb at suse.com]
Sent: Tuesday, July 31, 2018 5:32 PM
To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
Cc: Lustre Development List
Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print

Hi Amir,
 thanks for the background.

 I had to chuckle at "0 being the highest", though I know that this
 distortion is not something specific to lustre.

 Your description seems to suggest that "-1" means "unknown" with an
 implication that the number of hops might be high, and best not to take
 the risk.
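
 The selection order described in the quoted message below (lowest
 priority value first, then fewest hops with unspecified hops ranked
 last, then most credits) can be sketched as a comparator. This is only
 an illustration of that description; struct route and route_compare()
 are hypothetical and are not the LNet implementation.

```c
#include <limits.h>

/* Hypothetical route record holding only the fields used for selection. */
struct route {
	unsigned int priority; /* 0 is the most preferred */
	int hops;              /* -1 means "not specified" */
	int credits;           /* more available credits is better */
};

/* Rank unspecified hops after every specified hop count. */
static unsigned int hops_rank(int hops)
{
	return hops < 0 ? UINT_MAX : (unsigned int)hops;
}

/*
 * Returns <0 if a is preferred, >0 if b is preferred, and 0 on a tie,
 * in which case the caller would fall back to round robin.
 */
static int route_compare(const struct route *a, const struct route *b)
{
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	if (hops_rank(a->hops) != hops_rank(b->hops))
		return hops_rank(a->hops) < hops_rank(b->hops) ? -1 : 1;
	if (a->credits != b->credits)
		return a->credits > b->credits ? -1 : 1;
	return 0;
}
```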

 Your point about compatibility with scripts has some validity, though
 it is annoying to have to support such ugly interfaces indefinitely.
 Are there really likely to be dependencies? lustre has only been
 printing -1 since Feb last year when this patch went upstream.
 That was presumably an abi change as it would have printed MAXINT-1
 previously.  Did that cause any problems?

Thanks,
NeilBrown


On Tue, Jul 31 2018, Amir Shehata wrote:

> The way hop and priority work in the code is they serve to select the preferred route. If you have multiple gateways leading to the same destination, you select the one with the highest priority (0 being the highest), followed by selecting the one with the least number of hops. If you don't specify hops, then it's actually treated as the least favoured if there are other routes with hops specified. If hops and priority are equivalent between routes, then you select the one with the most credits available, if that's equivalent you select in round robin.
>
> In that sense hops and priority really serve the same purpose, select the preferred route. If it was up to me I would keep only one of them, but for historical reasons, both are kept.
>
> Therefore, I'm not sure if "unlimited" actually relays the correct interpretation of that value. Note there could be user scripts out there that are already parsing the output. So by changing the -1 you could break the scripts. Also changing that will create an inconsistency between the server and client.
>
> thanks
> amir
> ________________________________________
> From: NeilBrown [neilb at suse.com]
> Sent: Tuesday, July 31, 2018 3:38 PM
> To: James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List; Amir Shehata; James Simmons
> Subject: Re: [PATCH 11/31] lustre: lnet: Fix route hops print
>
> On Mon, Jul 30 2018, James Simmons wrote:
>
>> From: Amir Shehata <ashehata@whamcloud.com>
>>
>> The default number of hops for  a route is -1. This is
>> currently being printed as %u. Change that to %d to
>> make it print out properly.
>
> -1 hops???  I wish I could hop -1 times - it would be a good party
> trick!!
>
> What does -1 mean?  Unlimited (just a guess).  If so, could we print
> "unlimited"??
>
> I'm fine with having magic numbers in the code, but I don't like them to
> leak out.
>
> NeilBrown
>
>>
>> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
>> WC-id: https://jira.whamcloud.com/browse/LU-9078
>> Reviewed-on: https://review.whamcloud.com/25250
>> Reviewed-by: Olaf Weber <olaf@sgi.com>
>> Reviewed-by: Doug Oucharek <dougso@me.com>
>> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>> ---
>>  drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
>> index 8856798..aa98ce5 100644
>> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
>> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
>> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>>                       int alive = lnet_is_route_alive(route);
>>
>>                       s += snprintf(s, tmpstr + tmpsiz - s,
>> -                                   "%-8s %4u %8u %7s %s\n",
>> +                                   "%-8s %4d %8u %7s %s\n",
>>                                     libcfs_net2str(net), hops,
>>                                     priority,
>>                                     alive ? "up" : "down",
>> --
>> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
  2018-08-01  0:40         ` Patrick Farrell
@ 2018-08-01  2:10           ` NeilBrown
  2018-08-01  3:23             ` Patrick Farrell
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-08-01  2:10 UTC (permalink / raw)
  To: lustre-devel

On Wed, Aug 01 2018, Patrick Farrell wrote:

> Huh.  I see your point, and the logic of it is obviously sound, but I'm curious - I don't think I've ever seen names like those you gave used.  I've only seen MAX.

uses "count":
  enum acpi_bus_device_type
  include/linux/acpi.h
  enum cgroup_subsys_id

uses "reserved" as in "this number and larger are reserved"
   include/acpi/actbl*
   enum bh_state_bits (uses "BH_PrivateStart", which is similar)

uses "last" (just as bad as max)
   enum amd_asic_type
   enum req_opf
   include/linux/ccp.h



uses "max"
   include/drm/bridge/dw_hdmi.h
   enum drm_color_encoding
   enum drm_color_range
   enum drm_sched_priority
   enum wb_reason
   enum backlight_type
   enum rdmacg_resource_type
   include/linux/clk/ti.h
   include/linux/crypto.h

uses "size" (as good as 'num' or 'count')
   include/drm/bridge/mhl.h

uses "num"
   enum drm_global_types
   enum ttm_ref_type
   asn1_ber_bytecode.h (actually 'nr' not 'num')
   enum wb_stat_item (actually uses "NR" prefix)
   enum bcm963xx_nvram_nand_part (__..._NR_PARTS)
   enum blkg_rwstat_type
   enum req_flag_bits

I was looking at all matches of "git grep -w 'enum.*{' include/" and
gave up when I had done include/[a-k]* and include/linux/[a-c]*, about
5% of the way.

Clearly having a name to represent the number of names in the enum is a
common need.  There are several alternatives people use.  "max" might be
a little more common than "num", but I don't think it is a clear winner.
Count, cnt, size, reserved, private, num, nr are all good and meaningful.
max and last are wrong - unless the number actually is the max or the
last, which it usually isn't.
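
To make the distinction concrete, here is a small sketch with both names defined so that each means what it says. The cfg_type enum and cfg_type_name() are invented for illustration and are not taken from any of the headers listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum: _NUM is a count, _MAX is the highest valid value. */
enum cfg_type {
	CFG_T_CONFIG = 0,
	CFG_T_SPTLRPC,
	CFG_T_RECOVER,
	CFG_T_PARAMS,
	CFG_T_NUM,                /* number of values above */
	CFG_T_MAX = CFG_T_NUM - 1 /* really the maximum */
};

static_assert(CFG_T_MAX == CFG_T_PARAMS, "MAX must be the last real value");

/* A count is the natural size for a lookup table. */
static const char *const cfg_type_names[CFG_T_NUM] = {
	"config", "sptlrpc", "recover", "params",
};

/* Returns NULL for anything outside 0..CFG_T_MAX. */
static const char *cfg_type_name(int type)
{
	if (type < 0 || type >= CFG_T_NUM)
		return NULL;
	return cfg_type_names[type];
}
```

With this naming, "type < CFG_T_NUM" and "type <= CFG_T_MAX" both read correctly and mean the same thing.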

>
> Am I wrong about that, or is MAX indeed the dominant choice?
>
> I also think in context, used in caps, MAX is clearly a special value
> without other meaning.

"clearly" except where it isn't clear.
Not that long ago:

http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-May/120870.html

The code *looked* right - it is an error if the id number is larger than
the MAX.  That makes sense.
But it was wrong, because the MAX wasn't the maximum, it was one more.
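
The shape of that mistake, reduced to a sketch. ID_MAX and the two helpers are made-up names, not the code from the linked thread:

```c
/* Misnamed: this is the number of ids, so the valid ids are 0..3. */
#define ID_MAX 4

/* Looks right ("error if larger than the MAX") but lets id == 4 through. */
static int id_valid_buggy(unsigned int id)
{
	return !(id > ID_MAX);
}

/* Correct check for a constant that is one more than the largest id. */
static int id_valid_fixed(unsigned int id)
{
	return id < ID_MAX;
}
```

Had the constant been called ID_NUM, the first check would have looked wrong on sight.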

NeilBrown


> ________________________________
> From: NeilBrown <neilb@suse.com>
> Sent: Tuesday, July 31, 2018 7:23:21 PM
> To: Patrick Farrell; James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
>
> On Tue, Jul 31 2018, Patrick Farrell wrote:
>
>> Neil,
>>
>> Do you have an objection to the concept, or just because this one's not used?
>> Having a MAX makes it easy to write things like < MYENUM_MAX as sanity checking code, and then if the enum is added to, it still works.  Seems useful to me.
>
> I object to the name.  "MAX" is short for "MAXIMUM" which means the
> highest value that is actually used.  When comparing something to the
> maximum it makes sense to say
>    if (foo <= maximum)
>
> but it rarely makes sense to say
>    if (foo < maximum)
>
> If you want a count of the number of values, use MYENUM_CNT or
> MYENUM_NUM. This can sensibly be one more than the maximum value.
> But if you have MYENUM_MAX, make sure it is the maximum meaningful
> value for the enum.
> </rant>
>
> Thanks,
> NeilBrown
>
>
>>
>> - Patrick
>> ________________________________
>> From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
>> Sent: Tuesday, July 31, 2018 5:47:28 PM
>> To: James Simmons; Andreas Dilger; Oleg Drokin
>> Cc: Lustre Development List
>> Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
>>
>> On Mon, Jul 30 2018, James Simmons wrote:
>>
>>> From: Niu Yawei <yawei.niu@intel.com>
>>>
>>> Move config type values CONFIG_T_XXX into lustre_idl.h since they
>>> will be put on wire when reading config logs.
>>>
>>> Add missing wire checks for mgs_nidtbl_entry, mgs_config_body and
>>> mgs_config_res.
>>>
>>> Redefine CONFIG_SUB_XXX for the sub clds attached on config log.
>>>
>>> Signed-off-by: Niu Yawei <yawei.niu@intel.com>
>>> WC-id: https://jira.whamcloud.com/browse/LU-9216
>>> Reviewed-on: https://review.whamcloud.com/26022
>>> Reviewed-by: Fan Yong <fan.yong@intel.com>
>>> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
>>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>>> ---
>>>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
>>>  drivers/staging/lustre/lustre/include/obd_class.h  | 12 +--
>>>  drivers/staging/lustre/lustre/mgc/mgc_request.c    |  6 +-
>>>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++++
>>>  4 files changed, 103 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>>> index c9b32ef..bd3b45a 100644
>>> --- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>>> +++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>>> @@ -2111,11 +2111,19 @@ struct mgs_nidtbl_entry {
>>>        } u;
>>>  };
>>>
>>> +enum {
>>> +     CONFIG_T_CONFIG  = 0,
>>> +     CONFIG_T_SPTLRPC = 1,
>>> +     CONFIG_T_RECOVER = 2,
>>> +     CONFIG_T_PARAMS  = 3,
>>> +     CONFIG_T_MAX
>>
>> Arrrgggh.  It's back.  I thought we had killed CONFIG_T_MAX (which isn't
>> a MAX).
>> It's never used, so it'll have to go.
>>
>> NeilBrown
>>
>>
>>> +};
>>> +
>>>  struct mgs_config_body {
>>>        char            mcb_name[MTI_NAME_MAXLEN]; /* logname */
>>>        __u64           mcb_offset;    /* next index of config log to request */
>>>        __u16           mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
>>> -     __u8            mcb_reserved;
>>> +     __u8            mcb_nm_cur_pass;
>>>        __u8            mcb_bits;      /* bits unit size of config log */
>>>        __u32           mcb_units;     /* # of units for bulk transfer */
>>>  };
>>> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
>>> index 184da99..647cc22 100644
>>> --- a/drivers/staging/lustre/lustre/include/obd_class.h
>>> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
>>> @@ -156,16 +156,16 @@ struct config_llog_instance {
>>>  int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
>>>                            char *name, struct config_llog_instance *cfg);
>>>
>>> -#define CONFIG_T_CONFIG              BIT(0)
>>> -#define CONFIG_T_SPTLRPC     BIT(1)
>>> -#define CONFIG_T_RECOVER     BIT(2)
>>> -#define CONFIG_T_PARAMS              BIT(3)
>>> +#define CONFIG_SUB_CONFIG    BIT(0)
>>> +#define CONFIG_SUB_SPTLRPC   BIT(1)
>>> +#define CONFIG_SUB_RECOVER   BIT(2)
>>> +#define CONFIG_SUB_PARAMS    BIT(3)
>>>
>>>  /* Sub clds should be attached to the config_llog_data when processing
>>>   * config log for client or server target.
>>>   */
>>> -#define CONFIG_SUB_CLIENT    (CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
>>> -                              CONFIG_T_PARAMS)
>>> +#define CONFIG_SUB_CLIENT    (CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
>>> +                              CONFIG_SUB_PARAMS)
>>>
>>>  #define PARAMS_FILENAME      "params"
>>>  #define LCTL_UPCALL  "lctl"
>>> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>>> index 06fcc7e..833e6a0 100644
>>> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
>>> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>>> @@ -315,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>>        memcpy(seclogname, logname, ptr - logname);
>>>        strcpy(seclogname + (ptr - logname), "-sptlrpc");
>>>
>>> -     if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
>>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
>>>                sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
>>>                                                     CONFIG_T_SPTLRPC, cfg);
>>>                if (IS_ERR(sptlrpc_cld)) {
>>> @@ -325,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>>                }
>>>        }
>>>
>>> -     if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
>>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
>>>                params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
>>>                                                    CONFIG_T_PARAMS, cfg);
>>>                if (IS_ERR(params_cld)) {
>>> @@ -345,7 +345,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>>
>>>        LASSERT(lsi->lsi_lmd);
>>>        if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
>>> -         cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
>>> +         cfg->cfg_sub_clds & CONFIG_SUB_RECOVER) {
>>>                ptr = strrchr(seclogname, '-');
>>>                if (ptr) {
>>>                        *ptr = 0;
>>> diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>>> index 2f081ed..09b1298 100644
>>> --- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>>> +++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>>> @@ -3629,6 +3629,91 @@ void lustre_assert_wire_constants(void)
>>>        LASSERTF((int)sizeof(((struct mgs_target_info *)0)->mti_params) == 4096, "found %lld\n",
>>>                 (long long)(int)sizeof(((struct mgs_target_info *)0)->mti_params));
>>>
>>> +     /* Checks for struct mgs_nidtbl_entry */
>>> +     LASSERTF((int)sizeof(struct mgs_nidtbl_entry) == 24, "found %lld\n",
>>> +              (long long)(int)sizeof(struct mgs_nidtbl_entry));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_version) == 0, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_version));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_instance) == 8, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_instance));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_index) == 12, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_index));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_length) == 16, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_length));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_type) == 20, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_type));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_type) == 21, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_type));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_size) == 22, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_size));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_count) == 23, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_count));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, u.nids[0]) == 24, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, u.nids[0]));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]));
>>> +
>>> +     /* Checks for struct mgs_config_body */
>>> +     LASSERTF((int)sizeof(struct mgs_config_body) == 80, "found %lld\n",
>>> +              (long long)(int)sizeof(struct mgs_config_body));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_name) == 0, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_name));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_name) == 64, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_name));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_offset) == 64, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_offset));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_offset) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_offset));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_type) == 72, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_type));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_type) == 2, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_type));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_nm_cur_pass) == 74, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_nm_cur_pass));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_bits) == 75, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_bits));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_bits) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_bits));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_units) == 76, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_units));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
>>> +
>>> +     BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
>>> +     BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
>>> +     BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
>>> +     BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
>>> +
>>> +     /* Checks for struct mgs_config_res */
>>> +     LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
>>> +              (long long)(int)sizeof(struct mgs_config_res));
>>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_offset) == 0, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_offset));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_offset) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_offset));
>>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_size) == 8, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_size));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_size) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_size));
>>> +
>>>        /* Checks for struct lustre_capa */
>>>        LASSERTF((int)sizeof(struct lustre_capa) == 120, "found %lld\n",
>>>                 (long long)(int)sizeof(struct lustre_capa));
>>> --
>>> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-08-01  1:15         ` Amir Shehata
@ 2018-08-01  2:50           ` NeilBrown
  2018-08-01 17:11             ` Amir Shehata
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-08-01  2:50 UTC (permalink / raw)
  To: lustre-devel


Hi Amir,
 I think I'm happy to let this slide.  I don't like magic numbers, but
 this one isn't important enough to justify the problems that might be
 caused by changing it.

 To answer your broader question:
> The code bases are already diverging significantly between the
> upstream client and the master repo, which makes porting features from
> master to upstream a difficult task. Do we have a strategy on how to
> deal with this? 

 The long term strategy is just to get the work done so that the client
 code - and then all the kernel code - can be deleted from the master
 repo and can live solely in Linux.  I know that is still quite a way
 away.

 Shorter term, there are no magic answers.  Yes it is difficult but it
 is far from impossible.  My plan has always been to get the code that
 is already in drivers/staging into a reasonable state, then start
 forward-porting patches from master.  If other people do some of the
 forward-porting, that just makes me happier.
 If you think there is too much churn in my lustre tree, then just
 provide patches based on some old commit - I'm quite happy to receive
 patches based on fairly old code, and to do the final steps of
 forward-porting/conflict resolution myself (I have lots of practice).

NeilBrown


On Wed, Aug 01 2018, Amir Shehata wrote:

> Hi Neil,
>
> This issue actually came up because of LU-6060, which changed the behavior for LLNL. The behavior then was changed again by LU-6851: https://jira.whamcloud.com/browse/LU-6851 (if you'd like more background)
>
> As a result of LU-6851 we were printing the unsigned value of -1. That's why we ended up printing it as -1, which is more bearable than just printing a large unsigned value.
>
> I'm not disagreeing that it would be better to print a clearer value; "unknown" does sound like it conveys the correct meaning. However, I've sometimes run into issues where changing user-facing interfaces caused problems for user scripts. Whether that would be the case here, I'm not 100% sure. We can always make the change and then wait for tickets to be opened.
>
> However, I think of more concern to me is that if we make changes like this to the upstreamed client, it's probably a good idea to also make them to the whamcloud repo as well, so as not to diverge the client and server (LNet is common between them). This particular case, might not be very significant, but other issues might come up that are of more significance. 
>
> The code bases are already diverging significantly between the upstream client and the master repo, which makes porting features from master to upstream a difficult task. Do we have a strategy on how to deal with this?
>
> thanks
> amir
> ________________________________________
> From: NeilBrown [neilb at suse.com]
> Sent: Tuesday, July 31, 2018 5:32 PM
> To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print
>
> Hi Amir,
>  thanks for the background.
>
>  I had to chuckle at "0 being the highest", though I know that this
>  distortion is not something specific to lustre.
>
>  Your description seems to suggest that "-1" means "unknown" with an
>  implication that the number of hops might be high, and best not to take
>  the risk.
>
>  Your point about compatibility with scripts has some validity, though
>  it is annoying to have to support such ugly interfaces indefinitely.
>  Are there really likely to be dependencies? lustre has only been
>  printing -1 since Feb last year when this patch went upstream.
>  That was presumably an abi change as it would have printed MAXINT-1
>  previously.  Did that cause any problems?
>
> Thanks,
> NeilBrown
>
>
> On Tue, Jul 31 2018, Amir Shehata wrote:
>
>> The way hop and priority work in the code is they serve to select the preferred route. If you have multiple gateways leading to the same destination, you select the one with the highest priority (0 being the highest), followed by selecting the one with the least number of hops. If you don't specify hops, then it's actually treated as the least favoured if there are other routes with hops specified. If hops and priority are equivalent between routes, then you select the one with the most credits available, if that's equivalent you select in round robin.
>>
>> In that sense hops and priority really serve the same purpose, select the preferred route. If it was up to me I would keep only one of them, but for historical reasons, both are kept.
>>
>> Therefore, I'm not sure if "unlimited" actually relays the correct interpretation of that value. Note there could be user scripts out there that are already parsing the output. So by changing the -1 you could break the scripts. Also changing that will create an inconsistency between the server and client.
>>
>> thanks
>> amir
>> ________________________________________
>> From: NeilBrown [neilb at suse.com]
>> Sent: Tuesday, July 31, 2018 3:38 PM
>> To: James Simmons; Andreas Dilger; Oleg Drokin
>> Cc: Lustre Development List; Amir Shehata; James Simmons
>> Subject: Re: [PATCH 11/31] lustre: lnet: Fix route hops print
>>
>> On Mon, Jul 30 2018, James Simmons wrote:
>>
>>> From: Amir Shehata <ashehata@whamcloud.com>
>>>
>>> The default number of hops for  a route is -1. This is
>>> currently being printed as %u. Change that to %d to
>>> make it print out properly.
>>
>> -1 hops???  I wish I could hop -1 times - it would be a good party
>> trick!!
>>
>> What does -1 mean?  Unlimited (just a guess).  If so, could we print
>> "unlimited"??
>>
>> I'm fine with having magic numbers in the code, but I don't like them to
>> leak out.
>>
>> NeilBrown
>>
>>>
>>> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
>>> WC-id: https://jira.whamcloud.com/browse/LU-9078
>>> Reviewed-on: https://review.whamcloud.com/25250
>>> Reviewed-by: Olaf Weber <olaf@sgi.com>
>>> Reviewed-by: Doug Oucharek <dougso@me.com>
>>> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
>>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>>> ---
>>>  drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>> index 8856798..aa98ce5 100644
>>> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
>>> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>>>                       int alive = lnet_is_route_alive(route);
>>>
>>>                       s += snprintf(s, tmpstr + tmpsiz - s,
>>> -                                   "%-8s %4u %8u %7s %s\n",
>>> +                                   "%-8s %4d %8u %7s %s\n",
>>>                                     libcfs_net2str(net), hops,
>>>                                     priority,
>>>                                     alive ? "up" : "down",
>>> --
>>> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush
  2018-08-01  0:44     ` Yong, Fan
@ 2018-08-01  2:58       ` NeilBrown
  2018-08-01  4:15         ` Yong, Fan
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-08-01  2:58 UTC (permalink / raw)
  To: lustre-devel

On Wed, Aug 01 2018, Yong, Fan wrote:

> The client_obd::cl_seq_rwsem protects the client_obd::cl_seq itself,
> not the internal members inside client_obd::cl_seq. I do not think a
> simple refcount can work here. For example, if client_fid_fini()
> is destroying the client_obd::cl_seq, we need some mechanism, such as
> a mutex, to prevent others from accessing the client_obd::cl_seq. In
> such a case, even if someone could acquire a refcount (after
> client_fid_fini() starts), it still cannot prevent the in-progress
> destroy.

A common pattern with refcounts is to free the object when the refcount
reaches zero.
   kref_put(&cli->cl_refcount, free_cl_seq)

or similar.
So we start with a refcount of 1,
the code that currently does 'down_read()' instead does
   if (kref_get_unless_zero(&cli->cl_refcount)) {
              do something with cli->cl_seq
              kref_put(.....);
   }

and client_fid_fini() just does the kref_put().
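
To make the pattern above concrete, here is a minimal user-space sketch.
It is not the lustre code: it uses C11 atomics in place of struct kref,
and every name in it (struct client, struct seq_state, ref_get_unless_zero,
ref_put and so on) is a hypothetical stand-in chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct lu_client_seq. */
struct seq_state {
	int next_fid;
};

/* Hypothetical stand-in for the client_obd fields under discussion. */
struct client {
	atomic_int refcount;	/* plays the role of a struct kref */
	struct seq_state *seq;	/* plays the role of cli->cl_seq */
	int released;		/* set on release, for demonstration only */
};

/* Start with a refcount of 1; that reference is dropped by client_fini(). */
static int client_init(struct client *cli)
{
	cli->seq = calloc(1, sizeof(*cli->seq));
	if (!cli->seq)
		return -1;
	cli->released = 0;
	atomic_init(&cli->refcount, 1);
	return 0;
}

/* Same idea as kref_get_unless_zero(): only succeed while the object is live. */
static bool ref_get_unless_zero(struct client *cli)
{
	int old = atomic_load(&cli->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop re-checks for zero. */
		if (atomic_compare_exchange_weak(&cli->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Same idea as kref_put(): whoever drops the last reference frees the state. */
static void ref_put(struct client *cli)
{
	if (atomic_fetch_sub(&cli->refcount, 1) == 1) {
		free(cli->seq);
		cli->seq = NULL;
		cli->released = 1;
	}
}

/*
 * What a user of cl_seq would do instead of down_read().  The increment
 * of next_fid is not itself thread-safe; real code would still need its
 * own protection for the members, as noted earlier in the thread.
 */
static int client_alloc_fid(struct client *cli)
{
	int fid;

	if (!ref_get_unless_zero(cli))
		return -1;	/* teardown has already started */
	fid = cli->seq->next_fid++;
	ref_put(cli);
	return fid;
}

/* What client_fid_fini() would do: drop the initial reference and return. */
static void client_fini(struct client *cli)
{
	ref_put(cli);
}
```

Note that client_fini() does not wait for anyone: if a user still holds a
reference, the free happens later, in that user's ref_put().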

An important question is whether client_fid_fini() really needs to wait
for seq_client_flush() or seq_client_alloc_fid() to complete.
If it does, then we probably can't do much better than the rwsem.
If it doesn't, then kref_put() is the better way to go.

In either case, I cannot see any justification for holding the lock in
client_fid_init().
It would be better to completely initialize the lu_client_seq, and then
atomically assign it to cli->cl_seq.
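
As a rough illustration of "initialize completely, then publish", here is
a user-space sketch with C11 atomics.  The struct and function names are
hypothetical; in kernel code the equivalent store/load pair would more
likely be smp_store_release()/smp_load_acquire(), which is an assumption
about how one might write it, not what the patch does.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct lu_client_seq. */
struct seq_state {
	char name[32];
	unsigned long long width;
};

/* Hypothetical stand-in for the cl_seq pointer in client_obd. */
struct client {
	_Atomic(struct seq_state *) seq;
};

/* The pointer starts out NULL, meaning "no sequence client yet". */
static void client_prepare(struct client *cli)
{
	atomic_init(&cli->seq, (struct seq_state *)NULL);
}

/*
 * Build the object privately, with no lock held, and make it visible
 * only once every field is set.  The release store orders the field
 * writes before the pointer becomes visible to readers.
 */
static int client_seq_init(struct client *cli, const char *name,
			   unsigned long long width)
{
	struct seq_state *seq = calloc(1, sizeof(*seq));

	if (!seq)
		return -1;
	strncpy(seq->name, name, sizeof(seq->name) - 1);
	seq->width = width;
	atomic_store_explicit(&cli->seq, seq, memory_order_release);
	return 0;
}

/* Readers see either NULL or a fully initialized object, never a partial one. */
static unsigned long long client_seq_width(struct client *cli)
{
	struct seq_state *seq =
		atomic_load_explicit(&cli->seq, memory_order_acquire);

	return seq ? seq->width : 0;
}
```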

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
  2018-08-01  2:10           ` NeilBrown
@ 2018-08-01  3:23             ` Patrick Farrell
  0 siblings, 0 replies; 58+ messages in thread
From: Patrick Farrell @ 2018-08-01  3:23 UTC (permalink / raw)
  To: lustre-devel

Ah, that last one is a good example.


All right, I'm converted.  Thanks.


- Patrick

________________________________
From: NeilBrown <neilb@suse.com>
Sent: Tuesday, July 31, 2018 9:10:46 PM
To: Patrick Farrell; James Simmons; Andreas Dilger; Oleg Drokin
Cc: Lustre Development List
Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h

On Wed, Aug 01 2018, Patrick Farrell wrote:

> Huh.  I see your point, and the logic of it is obviously sound, but I'm curious - I don't think I've ever seen names like those you gave used.  I've only seen MAX.

uses "count":
  enum acpi_bus_device_type
  include/linux/acpi.h
  enum cgroup_subsys_id

uses "reserved" as in "this number and larger are reserved"
   include/acpi/actbl*
   enum bh_state_bits (uses "BH_PrivateStart", which is similar)

uses "last" (just as bad as max)
   enum amd_asic_type
   enum req_opf
   include/linux/ccp.h



uses "max"
   include/drm/bridge/dw_hdmi.h
   enum drm_color_encoding
   enum drm_color_range
   enum drm_sched_priority
   enum wb_reason
   enum backlight_type
   enum rdmacg_resource_type
   include/linux/clk/ti.h
   include/linux/crypto.h

uses "size" (as good as 'num' or 'count')
   include/drm/bridge/mhl.h

uses "num"
   enum drm_global_types
   enum ttm_ref_type
   asn1_ber_bytecode.h (actually 'nr' not 'num')
   enum wb_stat_item (actually uses "NR" prefix)
   enum bcm963xx_nvram_nand_part (__..._NR_PARTS)
   enum blkg_rwstat_type
   enum req_flag_bits

I was looking at all matches of "git grep -w 'enum.*{' include/" and
gave up when I had done include/[a-k]* and include/linux/[a-c]*. about
55 of the way.

Clearly having a name to represent the number of names in the enum is a
common need.  There are several alternatives people use.  "max" might be
a little more common than "num", but I don't think it is a clear winner.
Count, cnt, size, reserved, private, num, nr are all good and meaningful.
max and last are wrong - unless the number actually is the max or the
last, which it usually isn't.

>
> Am I wrong about that, or is MAX indeed the dominant choice?
>
> I also think in context, used in caps, MAX is clearly a special value
> without other meaning.

"clearly" except where it isn't clear.
Not that long ago:

http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-May/120870.html

The code *looked* right - it is an error if the id number is larger than
the MAX.  That makes sense.
But it was wrong, because the MAX wasn't the maximum, it was one more.
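
A small sketch of that kind of off-by-one, for illustration only.  The
enum mirrors the CONFIG_T_* values from the patch, but the DEMO_* names
and the helper functions are hypothetical and are not taken from the
driver linked above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum, modelled on the CONFIG_T_* values quoted in this thread. */
enum demo_type {
	DEMO_T_CONFIG  = 0,
	DEMO_T_SPTLRPC = 1,
	DEMO_T_RECOVER = 2,
	DEMO_T_PARAMS  = 3,
	DEMO_T_MAX	/* misnamed: evaluates to 4, one past the largest valid value */
};

/* A name that says what the sentinel really is: the number of valid values. */
#define DEMO_T_NUM DEMO_T_MAX

static const char *const demo_names[DEMO_T_NUM] = {
	"config", "sptlrpc", "recover", "params",
};

/* Reads naturally ("reject anything above the max") but lets id == 4 through. */
static bool demo_id_valid_looks_right(int id)
{
	return id >= 0 && !(id > DEMO_T_MAX);
}

/* Comparing against a count makes the strict "<" the obvious choice. */
static bool demo_id_valid(int id)
{
	return id >= 0 && id < DEMO_T_NUM;
}

/* Only index the table after the strict check, otherwise id == 4 overruns it. */
static const char *demo_name(int id)
{
	return demo_id_valid(id) ? demo_names[id] : NULL;
}
```

With the sentinel called _NUM or _CNT, the first check would look wrong
at a glance; with _MAX it looks right and is not.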

NeilBrown


> ________________________________
> From: NeilBrown <neilb@suse.com>
> Sent: Tuesday, July 31, 2018 7:23:21 PM
> To: Patrick Farrell; James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
>
> On Tue, Jul 31 2018, Patrick Farrell wrote:
>
>> Neil,
>>
>> Do you have an objection to the concept, or just because this one's not used?
>> Having a MAX makes it easy to write things like < MYENUM_MAX as sanity checking code, and then if the enum is added to, it still works.  Seems useful to me.
>
> I object to the name.  "MAX" is short for "MAXIMUM" which means the
> highest value that is actually used.  When comparing something to the
> maximum it makes sense to say
>    if (foo <= maximum)
>
> but it rarely makes sense to say
>    if (foo < maximum)
>
> If you want a count of the number of values, use MYENUM_CNT or
> MYENUM_NUM. This can sensibly be one more than the maximum value.
> But if you have MYENUM_MAX, make sure it is the maximum meaningful
> value for the enum.
> </rant>
>
> Thanks,
> NeilBrown
>
>
>>
>> - Patrick
>> ________________________________
>> From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
>> Sent: Tuesday, July 31, 2018 5:47:28 PM
>> To: James Simmons; Andreas Dilger; Oleg Drokin
>> Cc: Lustre Development List
>> Subject: Re: [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h
>>
>> On Mon, Jul 30 2018, James Simmons wrote:
>>
>>> From: Niu Yawei <yawei.niu@intel.com>
>>>
>>> Move config type values CONFIG_T_XXX into lustre_idl.h since they
>>> will be put on wire when reading config logs.
>>>
>>> Add missing wire checks for mgs_nidtbl_entry, mgs_config_body and
>>> mgs_config_res.
>>>
>>> Redefine CONFIG_SUB_XXX for the sub clds attached on config log.
>>>
>>> Signed-off-by: Niu Yawei <yawei.niu@intel.com>
>>> WC-id: https://jira.whamcloud.com/browse/LU-9216
>>> Reviewed-on: https://review.whamcloud.com/26022
>>> Reviewed-by: Fan Yong <fan.yong@intel.com>
>>> Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
>>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>>> ---
>>>  .../lustre/include/uapi/linux/lustre/lustre_idl.h  | 10 ++-
>>>  drivers/staging/lustre/lustre/include/obd_class.h  | 12 +--
>>>  drivers/staging/lustre/lustre/mgc/mgc_request.c    |  6 +-
>>>  drivers/staging/lustre/lustre/ptlrpc/wiretest.c    | 85 ++++++++++++++++++++++
>>>  4 files changed, 103 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>>> index c9b32ef..bd3b45a 100644
>>> --- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>>> +++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
>>> @@ -2111,11 +2111,19 @@ struct mgs_nidtbl_entry {
>>>        } u;
>>>  };
>>>
>>> +enum {
>>> +     CONFIG_T_CONFIG  = 0,
>>> +     CONFIG_T_SPTLRPC = 1,
>>> +     CONFIG_T_RECOVER = 2,
>>> +     CONFIG_T_PARAMS  = 3,
>>> +     CONFIG_T_MAX
>>
>> Arrrgggh.  It's back.  I thought we had killed CONFIG_T_MAX (which isn't
>> a MAX).
>> It's never used, so it'll have to go.
>>
>> NeilBrown
>>
>>
>>> +};
>>> +
>>>  struct mgs_config_body {
>>>        char            mcb_name[MTI_NAME_MAXLEN]; /* logname */
>>>        __u64           mcb_offset;    /* next index of config log to request */
>>>        __u16           mcb_type;      /* type of log: CONFIG_T_[CONFIG|RECOVER] */
>>> -     __u8            mcb_reserved;
>>> +     __u8            mcb_nm_cur_pass;
>>>        __u8            mcb_bits;      /* bits unit size of config log */
>>>        __u32           mcb_units;     /* # of units for bulk transfer */
>>>  };
>>> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
>>> index 184da99..647cc22 100644
>>> --- a/drivers/staging/lustre/lustre/include/obd_class.h
>>> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
>>> @@ -156,16 +156,16 @@ struct config_llog_instance {
>>>  int class_config_parse_llog(const struct lu_env *env, struct llog_ctxt *ctxt,
>>>                            char *name, struct config_llog_instance *cfg);
>>>
>>> -#define CONFIG_T_CONFIG              BIT(0)
>>> -#define CONFIG_T_SPTLRPC     BIT(1)
>>> -#define CONFIG_T_RECOVER     BIT(2)
>>> -#define CONFIG_T_PARAMS              BIT(3)
>>> +#define CONFIG_SUB_CONFIG    BIT(0)
>>> +#define CONFIG_SUB_SPTLRPC   BIT(1)
>>> +#define CONFIG_SUB_RECOVER   BIT(2)
>>> +#define CONFIG_SUB_PARAMS    BIT(3)
>>>
>>>  /* Sub clds should be attached to the config_llog_data when processing
>>>   * config log for client or server target.
>>>   */
>>> -#define CONFIG_SUB_CLIENT    (CONFIG_T_SPTLRPC | CONFIG_T_RECOVER | \
>>> -                              CONFIG_T_PARAMS)
>>> +#define CONFIG_SUB_CLIENT    (CONFIG_SUB_SPTLRPC | CONFIG_SUB_RECOVER | \
>>> +                              CONFIG_SUB_PARAMS)
>>>
>>>  #define PARAMS_FILENAME      "params"
>>>  #define LCTL_UPCALL  "lctl"
>>> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>>> index 06fcc7e..833e6a0 100644
>>> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
>>> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
>>> @@ -315,7 +315,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>>        memcpy(seclogname, logname, ptr - logname);
>>>        strcpy(seclogname + (ptr - logname), "-sptlrpc");
>>>
>>> -     if (cfg->cfg_sub_clds & CONFIG_T_SPTLRPC) {
>>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_SPTLRPC) {
>>>                sptlrpc_cld = config_log_find_or_add(obd, seclogname, NULL,
>>>                                                     CONFIG_T_SPTLRPC, cfg);
>>>                if (IS_ERR(sptlrpc_cld)) {
>>> @@ -325,7 +325,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>>                }
>>>        }
>>>
>>> -     if (cfg->cfg_sub_clds & CONFIG_T_PARAMS) {
>>> +     if (cfg->cfg_sub_clds & CONFIG_SUB_PARAMS) {
>>>                params_cld = config_log_find_or_add(obd, PARAMS_FILENAME, sb,
>>>                                                    CONFIG_T_PARAMS, cfg);
>>>                if (IS_ERR(params_cld)) {
>>> @@ -345,7 +345,7 @@ struct config_llog_data *do_config_log_add(struct obd_device *obd,
>>>
>>>        LASSERT(lsi->lsi_lmd);
>>>        if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR) &&
>>> -         cfg->cfg_sub_clds & CONFIG_T_RECOVER) {
>>> +         cfg->cfg_sub_clds & CONFIG_SUB_RECOVER) {
>>>                ptr = strrchr(seclogname, '-');
>>>                if (ptr) {
>>>                        *ptr = 0;
>>> diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>>> index 2f081ed..09b1298 100644
>>> --- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>>> +++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
>>> @@ -3629,6 +3629,91 @@ void lustre_assert_wire_constants(void)
>>>        LASSERTF((int)sizeof(((struct mgs_target_info *)0)->mti_params) == 4096, "found %lld\n",
>>>                 (long long)(int)sizeof(((struct mgs_target_info *)0)->mti_params));
>>>
>>> +     /* Checks for struct mgs_nidtbl_entry */
>>> +     LASSERTF((int)sizeof(struct mgs_nidtbl_entry) == 24, "found %lld\n",
>>> +              (long long)(int)sizeof(struct mgs_nidtbl_entry));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_version) == 0, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_version));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_version));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_instance) == 8, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_instance));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_instance));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_index) == 12, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_index));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_index));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_length) == 16, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_length));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_length));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_type) == 20, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_type));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_type));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_type) == 21, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_type));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_type));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_size) == 22, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_size));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_size));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, mne_nid_count) == 23, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, mne_nid_count));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->mne_nid_count));
>>> +     LASSERTF((int)offsetof(struct mgs_nidtbl_entry, u.nids[0]) == 24, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_nidtbl_entry, u.nids[0]));
>>> +     LASSERTF((int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_nidtbl_entry *)0)->u.nids[0]));
>>> +
>>> +     /* Checks for struct mgs_config_body */
>>> +     LASSERTF((int)sizeof(struct mgs_config_body) == 80, "found %lld\n",
>>> +              (long long)(int)sizeof(struct mgs_config_body));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_name) == 0, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_name));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_name) == 64, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_name));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_offset) == 64, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_offset));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_offset) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_offset));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_type) == 72, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_type));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_type) == 2, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_type));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_nm_cur_pass) == 74, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_nm_cur_pass));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_nm_cur_pass));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_bits) == 75, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_bits));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_bits) == 1, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_bits));
>>> +     LASSERTF((int)offsetof(struct mgs_config_body, mcb_units) == 76, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_body, mcb_units));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
>>> +
>>> +     BUILD_BUG_ON(CONFIG_T_CONFIG != 0);
>>> +     BUILD_BUG_ON(CONFIG_T_SPTLRPC != 1);
>>> +     BUILD_BUG_ON(CONFIG_T_RECOVER != 2);
>>> +     BUILD_BUG_ON(CONFIG_T_PARAMS != 3);
>>> +
>>> +     /* Checks for struct mgs_config_res */
>>> +     LASSERTF((int)sizeof(struct mgs_config_res) == 16, "found %lld\n",
>>> +              (long long)(int)sizeof(struct mgs_config_res));
>>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_offset) == 0, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_offset));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_offset) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_offset));
>>> +     LASSERTF((int)offsetof(struct mgs_config_res, mcr_size) == 8, "found %lld\n",
>>> +              (long long)(int)offsetof(struct mgs_config_res, mcr_size));
>>> +     LASSERTF((int)sizeof(((struct mgs_config_res *)0)->mcr_size) == 8, "found %lld\n",
>>> +              (long long)(int)sizeof(((struct mgs_config_res *)0)->mcr_size));
>>> +
>>>        /* Checks for struct lustre_capa */
>>>        LASSERTF((int)sizeof(struct lustre_capa) == 120, "found %lld\n",
>>>                 (long long)(int)sizeof(struct lustre_capa));
>>> --
>>> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10
  2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
                   ` (30 preceding siblings ...)
  2018-07-31  2:26 ` [lustre-devel] [PATCH 31/31] lustre: docs: update TODO file James Simmons
@ 2018-08-01  3:41 ` NeilBrown
  31 siblings, 0 replies; 58+ messages in thread
From: NeilBrown @ 2018-08-01  3:41 UTC (permalink / raw)
  To: lustre-devel

On Mon, Jul 30 2018, James Simmons wrote:

> This covers all the missing patches landed from the start of the
> lustre 2.10 development cycle until right before the PFL feature
> landed. Several bug fixes as well as cleanups. This is based on
> top of the recent list patches as well as my mount code for llite
> patch series.

I've applied all these to my lustre-testing branch, plus the fixups
I promised.
I cannot review them for correctness, just for whether the code looks
sensible and appears consistent with upstream style.

If my testing shows no problems and I don't hear any complaints I
suspect they will migrate to 'lustre' in a week or so.

BTW I've created a new branch 'lustre-wip' which contains patches that
depend on changes to non-lustre code, and so are likely to be delayed
more than I would like...

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush
  2018-08-01  2:58       ` NeilBrown
@ 2018-08-01  4:15         ` Yong, Fan
  0 siblings, 0 replies; 58+ messages in thread
From: Yong, Fan @ 2018-08-01  4:15 UTC (permalink / raw)
  To: lustre-devel

Logically, client_fid_fini() should wait until all users of client_obd::cl_seq (flush(), fid allocation, or any new users added in the future) have finished or aborted their work before it destroys client_obd::cl_seq. The rwsem provides the framework for that.

> In either case, I cannot see any justification for holding the lock in client_fid_init().
As for holding the rwsem during initialization: honestly, it is unnecessary, since currently there is neither concurrent initialization nor any other access during initialization. Because there is no contention on the rwsem at that point, holding it does not add much overhead. On the other hand, if someone later changes the caller logic so that concurrent initialization becomes possible, then we would need to hold the rwsem during initialization instead of atomically assigning to cli->cl_seq.

--
Cheers,
Nasf

-----Original Message-----
From: NeilBrown [mailto:neilb at suse.com] 
Sent: Wednesday, August 1, 2018 10:58 AM
To: Yong, Fan <fan.yong@intel.com>; James Simmons <jsimmons@infradead.org>; Andreas Dilger <adilger@whamcloud.com>; Oleg Drokin <green@whamcloud.com>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>; James Simmons <jsimmons@infradead.org>
Subject: RE: [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush

On Wed, Aug 01 2018, Yong, Fan wrote:

> The client_obd::cl_seq_rwsem protects the client_obd::cl_seq itself,
> not the internal members inside client_obd::cl_seq. I do not think
> a simple refcount can work here. For example, if client_fid_fini()
> is destroying the client_obd::cl_seq, we need some mechanism, such as
> a mutex, to prevent others from accessing the client_obd::cl_seq. In
> such a case, even if someone could acquire a refcount (after
> client_fid_fini() has started), it still cannot prevent the
> in-progress destroy.

A common pattern with refcounts is to free the object when the refcount reaches zero.
   kref_put(&cli->cl_refcount, free_cl_seq);

or similar.
So we start with a refcount of 1,
the code that currently does 'down_read()' instead does
   if (kref_get_unless_zero(&cli->cl_refcount)) {
              do something with cli->cl_seq
              kref_put(.....);
   }

and client_fid_fini() just does the kref_put().

An important question is whether client_fid_fini() really needs to wait for seq_client_flush() or seq_client_alloc_fid() to complete.
If it does, then we probably can't do much better than the rwsem.
If it doesn't, then kref_put() is the better way to go.

In either case, I cannot see any justification for holding the lock in client_fid_init().
It would be better to completely initialize the lu_client_seq, and then atomically assign it to cli->cl_seq.

Thanks,
NeilBrown
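
The "get unless zero" pattern described above can be sketched in userspace C11. This is only an illustration of the idea, not the kernel's kref implementation: struct seq_obj, seq_get_unless_zero(), seq_put() and seq_release() are hypothetical names, and the sketch says nothing about how the pointer to the object itself is kept valid while a reader tries to take a reference.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for lu_client_seq with an embedded refcount. */
struct seq_obj {
	atomic_int refcount;	/* starts at 1, held by the init side */
	int        freed;	/* set by the release hook, illustration only */
};

/* Take a reference only if the object is still live (count > 0). */
static bool seq_get_unless_zero(struct seq_obj *seq)
{
	int old = atomic_load(&seq->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&seq->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; whoever drops the last one runs the release hook. */
static bool seq_put(struct seq_obj *seq, void (*release)(struct seq_obj *))
{
	if (atomic_fetch_sub(&seq->refcount, 1) == 1) {
		release(seq);
		return true;
	}
	return false;
}

/* A real release hook would free the object; here we only mark it. */
static void seq_release(struct seq_obj *seq)
{
	seq->freed = 1;
}
```

With this shape the fini path just calls seq_put() and does not wait for anyone. If fini really must wait for in-flight users to complete, the refcount alone is not enough, which is the case where the rwsem remains the simpler choice.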

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-08-01  2:50           ` NeilBrown
@ 2018-08-01 17:11             ` Amir Shehata
  2018-08-02  2:09               ` NeilBrown
  0 siblings, 1 reply; 58+ messages in thread
From: Amir Shehata @ 2018-08-01 17:11 UTC (permalink / raw)
  To: lustre-devel

Hi Neil,

Thanks for the explanation. 

Speaking specifically for LNet, I'm not sure it's feasible to remove the code from the master repo. As I mentioned, LNet is a common piece between the client and server sides; both of them rely on it. I believe LNet is also used by DVS, a Cray-developed layer that I'm not very familiar with. So I don't think just deleting it from the master repo would work.

Over the past years, there have been discussions about making it a standalone module that can be pulled in as a dependency. This approach makes a bit more sense to me.

What are your thoughts on that?

thanks
amir
________________________________________
From: NeilBrown [neilb at suse.com]
Sent: Tuesday, July 31, 2018 7:50 PM
To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
Cc: Lustre Development List
Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print

Hi Amir,
 I think I'm happy to let this slide.  I don't like magic numbers, but
 this one isn't important enough to justify the problems that might be
 caused by changing it.

 To answer your broader question:
> The code bases are already diverging significantly between the
> upstream client and the master repo, which makes porting features from
> master to upstream a difficult task. Do we have a strategy on how to
> deal with this?

 The long term strategy is just to get the work done so that the client
 code - and then all the kernel code - can be deleted from the master
 repo and can live solely in Linux.  I know that is still quite a way
 away.

 Shorter term, there are no magic answers.  Yes it is difficult but it
 is far from impossible.  My plan has always been to get the code that
 is already in drivers/staging into a reasonable state, then start
 forward-porting patches from master.  If other people do some of the
 forward-porting, that just makes me happier.
 If you think there is too much churn in my lustre tree, then just
 provide patches based on some old commit - I'm quite happy to receive
 patches based on fairly old code, and to do the final steps of
 forward-porting/conflict resolution myself (I have lots of practice).

NeilBrown


On Wed, Aug 01 2018, Amir Shehata wrote:

> Hi Neil,
>
> This issue actually came up because of LU-6060, which changed the behavior for LLNL. The behavior then was changed again by LU-6851: https://jira.whamcloud.com/browse/LU-6851 (if you'd like more background)
>
> As a result of LU-6851 we were printing the unsigned value of -1. That's why we ended up printing it as -1, which is more bearable than just printing a large unsigned value.
>
> I'm not disagreeing that it would be better to print a clearer value; "unknown" does sound like it conveys the correct meaning. However, I've sometimes run into issues where changing user-facing interfaces caused problems for user scripts. Whether that would be the case here, I'm not 100% sure. We can always make the change and then wait for tickets to be opened.
>
> However, what concerns me more is that if we make changes like this to the upstreamed client, it's probably a good idea to also make them in the whamcloud repo, so as not to diverge the client and server (LNet is common between them). This particular case might not be very significant, but other issues of more significance might come up.
>
> The code bases are already diverging significantly between the upstream client and the master repo, which makes porting features from master to upstream a difficult task. Do we have a strategy on how to deal with this?
>
> thanks
> amir
> ________________________________________
> From: NeilBrown [neilb at suse.com]
> Sent: Tuesday, July 31, 2018 5:32 PM
> To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print
>
> Hi Amir,
>  thanks for the background.
>
>  I had to chuckle at "0 being the highest", though I know that this
>  distortion is not something specific to lustre.
>
>  Your description seems to suggest that "-1" means "unknown" with an
>  implication that the number of hops might be high, and best not to take
>  the risk.
>
>  Your point about compatibility with scripts has some validity, though
>  it is annoying to have to support such ugly interfaces indefinitely.
>  Are there really likely to be dependencies? lustre has only been
>  printing -1 since Feb last year when this patch went upstream.
>  That was presumably an abi change as it would have printed MAXINT-1
>  previously.  Did that cause any problems?
>
> Thanks,
> NeilBrown
>
>
> On Tue, Jul 31 2018, Amir Shehata wrote:
>
>> Hops and priority both serve to select the preferred route. If you have multiple gateways leading to the same destination, you select the one with the highest priority (0 being the highest), followed by the one with the fewest hops. If you don't specify hops, the route is treated as the least favoured when there are other routes with hops specified. If hops and priority are equal between routes, you select the one with the most credits available; if that is also equal, you select in round robin.
>>
>> In that sense hops and priority really serve the same purpose, select the preferred route. If it was up to me I would keep only one of them, but for historical reasons, both are kept.
>>
>> Therefore, I'm not sure if "unlimited" actually relays the correct interpretation of that value. Note there could be user scripts out there that are already parsing the output. So by changing the -1 you could break the scripts. Also changing that will create an inconsistency between the server and client.
>>
>> thanks
>> amir
>> ________________________________________
>> From: NeilBrown [neilb at suse.com]
>> Sent: Tuesday, July 31, 2018 3:38 PM
>> To: James Simmons; Andreas Dilger; Oleg Drokin
>> Cc: Lustre Development List; Amir Shehata; James Simmons
>> Subject: Re: [PATCH 11/31] lustre: lnet: Fix route hops print
>>
>> On Mon, Jul 30 2018, James Simmons wrote:
>>
>>> From: Amir Shehata <ashehata@whamcloud.com>
>>>
>>> The default number of hops for  a route is -1. This is
>>> currently being printed as %u. Change that to %d to
>>> make it print out properly.
>>
>> -1 hops???  I wish I could hop -1 times - it would be a good party
>> trick!!
>>
>> What does -1 mean?  Unlimited (just a guess).  If so, could we print
>> "unlimited"??
>>
>> I'm fine with having magic numbers in the code, but I don't like them to
>> leak out.
>>
>> NeilBrown
>>
>>>
>>> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
>>> WC-id: https://jira.whamcloud.com/browse/LU-9078
>>> Reviewed-on: https://review.whamcloud.com/25250
>>> Reviewed-by: Olaf Weber <olaf@sgi.com>
>>> Reviewed-by: Doug Oucharek <dougso@me.com>
>>> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
>>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>>> ---
>>>  drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>> index 8856798..aa98ce5 100644
>>> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
>>> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>>>                       int alive = lnet_is_route_alive(route);
>>>
>>>                       s += snprintf(s, tmpstr + tmpsiz - s,
>>> -                                   "%-8s %4u %8u %7s %s\n",
>>> +                                   "%-8s %4d %8u %7s %s\n",
>>>                                     libcfs_net2str(net), hops,
>>>                                     priority,
>>>                                     alive ? "up" : "down",
>>> --
>>> 1.8.3.1
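
The selection order described earlier in this thread (priority first, then hops, then available credits, then round robin) can be sketched as a comparison function. This is a simplified illustration and not the LNet code: struct route, its fields, the seq counter used for the round-robin tie-break, and all function names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified route record; not the real LNet structure. */
struct route {
	unsigned int priority;	/* lower value = higher priority, 0 is best */
	int          hops;	/* -1 means "not specified" */
	int          credits;	/* credits currently available */
	unsigned int seq;	/* last-used counter for round robin */
};

/* Treat unspecified hops as the worst possible hop count. */
static unsigned int effective_hops(const struct route *r)
{
	return r->hops < 0 ? (unsigned int)-1 : (unsigned int)r->hops;
}

/* Return true if route a should be preferred over route b. */
static bool route_is_better(const struct route *a, const struct route *b)
{
	if (a->priority != b->priority)
		return a->priority < b->priority;
	if (effective_hops(a) != effective_hops(b))
		return effective_hops(a) < effective_hops(b);
	if (a->credits != b->credits)
		return a->credits > b->credits;
	/* Round robin: prefer the route that was used least recently. */
	return a->seq < b->seq;
}

/* Pick the preferred route from an array, or NULL if it is empty. */
static const struct route *select_route(const struct route *routes, size_t n)
{
	const struct route *best = NULL;

	for (size_t i = 0; i < n; i++)
		if (!best || route_is_better(&routes[i], best))
			best = &routes[i];
	return best;
}
```

Mapping -1 to the largest unsigned value is one way to get the "least favoured" behaviour for unspecified hops without a special case in the comparison itself.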

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-08-01 17:11             ` Amir Shehata
@ 2018-08-02  2:09               ` NeilBrown
  2018-08-03  0:01                 ` Andreas Dilger
  0 siblings, 1 reply; 58+ messages in thread
From: NeilBrown @ 2018-08-02  2:09 UTC (permalink / raw)
  To: lustre-devel


Once lustre and lnet are part of upstream Linux, what is the value of
keeping any of it in the master repo?
There would be a need to keep it only to support old versions of
Linux, which will hopefully be fewer and fewer over time.
It might make sense to backport the upstream-linux code to those
particular versions where it is needed, and do all development work in
upstream Linux, and just backport.

NeilBrown


On Wed, Aug 01 2018, Amir Shehata wrote:

> Hi Neil,
>
> Thanks for the explanation. 
>
> Speaking specifically for LNet, I'm not sure it's feasible to remove the code from the master repo. As I mentioned, LNet is a common piece between the client and server sides; both of them rely on it. I believe LNet is also used by DVS, a Cray-developed layer that I'm not very familiar with. So I don't think just deleting it from the master repo would work.
>
> Over the past years, there have been discussions about making it a standalone module that can be pulled in as a dependency. This approach makes a bit more sense to me.
>
> What are your thoughts on that?
>
> thanks
> amir
> ________________________________________
> From: NeilBrown [neilb at suse.com]
> Sent: Tuesday, July 31, 2018 7:50 PM
> To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print
>
> Hi Amir,
>  I think I'm happy to let this slide.  I don't like magic numbers, but
>  this one isn't important enough to justify the problems that might be
>  caused by changing it.
>
>  To answer your broader question:
>> The code bases are already diverging significantly between the
>> upstream client and the master repo, which makes porting features from
>> master to upstream a difficult task. Do we have a strategy on how to
>> deal with this?
>
>  The long term strategy is just to get the work done so that the client
>  code - and then all the kernel code - can be deleted from the master
>  repo and can live solely in Linux.  I know that is still quite a way
>  away.
>
>  Shorter term, there are no magic answers.  Yes it is difficult but it
>  is far from impossible.  My plan has always been to get the code that
>  is already in drivers/staging into a reasonable state, then start
>  forward-porting patches from master.  If other people do some of the
>  forward-porting, that just makes me happier.
>  If you think there is too much churn in my lustre tree, then just
>  provide patches based on some old commit - I'm quite happy to receive
>  patches based on fairly old code, and to do the final steps of
>  forward-porting/conflict resolution myself (I have lots of practice).
>
> NeilBrown
>
>
> On Wed, Aug 01 2018, Amir Shehata wrote:
>
>> Hi Neil,
>>
>> This issue actually came up because of LU-6060, which changed the behavior for LLNL. The behavior then was changed again by LU-6851: https://jira.whamcloud.com/browse/LU-6851 (if you'd like more background)
>>
>> As a result of LU-6851 we were printing the unsigned value of -1. That's why we ended up printing it as -1, which is more bearable than just printing a large unsigned value.
>>
>> I'm not disagreeing that it would be better to print a clearer value; "unknown" does sound like it conveys the correct meaning. However, I've sometimes run into issues where changing user-facing interfaces caused problems for user scripts. Whether that would be the case here, I'm not 100% sure. We can always make the change and then wait for tickets to be opened.
>>
>> However, what concerns me more is that if we make changes like this to the upstreamed client, it's probably a good idea to also make them in the whamcloud repo, so as not to diverge the client and server (LNet is common between them). This particular case might not be very significant, but other issues of more significance might come up.
>>
>> The code bases are already diverging significantly between the upstream client and the master repo, which makes porting features from master to upstream a difficult task. Do we have a strategy on how to deal with this?
>>
>> thanks
>> amir
>> ________________________________________
>> From: NeilBrown [neilb at suse.com]
>> Sent: Tuesday, July 31, 2018 5:32 PM
>> To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
>> Cc: Lustre Development List
>> Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print
>>
>> Hi Amir,
>>  thanks for the background.
>>
>>  I had to chuckle at "0 being the highest", though I know that this
>>  distortion is not something specific to lustre.
>>
>>  Your description seems to suggest that "-1" means "unknown" with an
>>  implication that the number of hops might be high, and best not to take
>>  the risk.
>>
>>  Your point about compatibility with scripts has some validity, though
>>  it is annoying to have to support such ugly interfaces indefinitely.
>>  Are there really likely to be dependencies? lustre has only been
>>  printing -1 since Feb last year when this patch went upstream.
>>  That was presumably an abi change as it would have printed MAXINT-1
>>  previously.  Did that cause any problems?
>>
>> Thanks,
>> NeilBrown
>>
>>
>> On Tue, Jul 31 2018, Amir Shehata wrote:
>>
>>> Hops and priority both serve to select the preferred route. If you have multiple gateways leading to the same destination, you select the one with the highest priority (0 being the highest), followed by the one with the fewest hops. If you don't specify hops, the route is treated as the least favoured when there are other routes with hops specified. If hops and priority are equal between routes, you select the one with the most credits available; if that is also equal, you select in round robin.
>>>
>>> In that sense hops and priority really serve the same purpose, select the preferred route. If it was up to me I would keep only one of them, but for historical reasons, both are kept.
>>>
>>> Therefore, I'm not sure if "unlimited" actually relays the correct interpretation of that value. Note there could be user scripts out there that are already parsing the output. So by changing the -1 you could break the scripts. Also changing that will create an inconsistency between the server and client.
>>>
>>> thanks
>>> amir
>>> ________________________________________
>>> From: NeilBrown [neilb at suse.com]
>>> Sent: Tuesday, July 31, 2018 3:38 PM
>>> To: James Simmons; Andreas Dilger; Oleg Drokin
>>> Cc: Lustre Development List; Amir Shehata; James Simmons
>>> Subject: Re: [PATCH 11/31] lustre: lnet: Fix route hops print
>>>
>>> On Mon, Jul 30 2018, James Simmons wrote:
>>>
>>>> From: Amir Shehata <ashehata@whamcloud.com>
>>>>
>>>> The default number of hops for  a route is -1. This is
>>>> currently being printed as %u. Change that to %d to
>>>> make it print out properly.
>>>
>>> -1 hops???  I wish I could hop -1 times - it would be a good party
>>> trick!!
>>>
>>> What does -1 mean?  Unlimited (just a guess).  If so, could we print
>>> "unlimited"??
>>>
>>> I'm fine with having magic numbers in the code, but I don't like them to
>>> leak out.
>>>
>>> NeilBrown
>>>
>>>>
>>>> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
>>>> WC-id: https://jira.whamcloud.com/browse/LU-9078
>>>> Reviewed-on: https://review.whamcloud.com/25250
>>>> Reviewed-by: Olaf Weber <olaf@sgi.com>
>>>> Reviewed-by: Doug Oucharek <dougso@me.com>
>>>> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
>>>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>>>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>>>> ---
>>>>  drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>>> index 8856798..aa98ce5 100644
>>>> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
>>>> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>>> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>>>>                       int alive = lnet_is_route_alive(route);
>>>>
>>>>                       s += snprintf(s, tmpstr + tmpsiz - s,
>>>> -                                   "%-8s %4u %8u %7s %s\n",
>>>> +                                   "%-8s %4d %8u %7s %s\n",
>>>>                                     libcfs_net2str(net), hops,
>>>>                                     priority,
>>>>                                     alive ? "up" : "down",
>>>> --
>>>> 1.8.3.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window
  2018-07-31  4:05   ` Patrick Farrell
@ 2018-08-02  3:52     ` James Simmons
  2018-08-02  3:58       ` Patrick Farrell
  0 siblings, 1 reply; 58+ messages in thread
From: James Simmons @ 2018-08-02  3:52 UTC (permalink / raw)
  To: lustre-devel


> I'm puzzled, James - Why is "cache_jobid" in there?  Isn't that from Ben Evans' work?  This patch landed before all of that...

All backported patches have the potential to be modified so that they pass
checkpatch and follow the preferred coding standards. One of the common
complaints was that lustre tends to use generic goto labels, which can make
grepping the code more challenging. So I often change generic goto labels to
something with more meat. In this case I picked a nice name that came from
a later patch :-)
 
> ______________________________________________________________________________________________________________________________
> From: James Simmons <jsimmons@infradead.org>
> Sent: Monday, July 30, 2018 9:25:58 PM
> To: Andreas Dilger; Oleg Drokin; NeilBrown
> Cc: Lustre Development List; Patrick Farrell; James Simmons
> Subject: [PATCH 06/31] lustre: llite: reduce jobstats race window
> From: Patrick Farrell <paf@cray.com>
>
> In the current code, lli_jobid is set to zero on every call
> to lustre_get_jobid.  This causes problems, because it's
> used asynchronously to set the job id in RPCs, and some
> RPCs will falsely get no jobid set.  (For small IO sizes,
> this can be up to 60% of RPCs.)
>
> It would be very expensive to put hard synchronization
> between this and every outbound RPC, and it's OK to very
> rarely get an RPC without correct job stats info.
>
> This patch only updates the lli_jobid when the job id has
> changed, which leaves only a very small window for reading
> an inconsistent job id.
>
> Signed-off-by: Patrick Farrell <paf@cray.com>
> WC-id: https://jira.whamcloud.com/browse/LU-8926
> Reviewed-on: https://review.whamcloud.com/24253
> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> Reviewed-by: Chris Horn <hornc@cray.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/llite/llite_lib.c    |  1 +
>  drivers/staging/lustre/lustre/obdclass/class_obd.c | 20 ++++++++++++++------
>  2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
> index c0861b9..72b118a 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
> @@ -894,6 +894,7 @@ void ll_lli_init(struct ll_inode_info *lli)
>                  lli->lli_async_rc = 0;
>          }
>          mutex_init(&lli->lli_layout_mutex);
> +       memset(lli->lli_jobid, 0, LUSTRE_JOBID_SIZE);
>  }
>
>  int ll_fill_super(struct super_block *sb)
> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> index cdaf729..87327ef 100644
> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> @@ -95,26 +95,34 @@
>   */
>  int lustre_get_jobid(char *jobid)
>  {
> -       memset(jobid, 0, LUSTRE_JOBID_SIZE);
> +       char tmp_jobid[LUSTRE_JOBID_SIZE] = { 0 };
> +
>          /* Jobstats isn't enabled */
>          if (strcmp(obd_jobid_var, JOBSTATS_DISABLE) == 0)
> -               return 0;
> +               goto out_cache_jobid;
>
>          /* Use process name + fsuid as jobid */
>          if (strcmp(obd_jobid_var, JOBSTATS_PROCNAME_UID) == 0) {
> -               snprintf(jobid, LUSTRE_JOBID_SIZE, "%s.%u",
> +               snprintf(tmp_jobid, LUSTRE_JOBID_SIZE, "%s.%u",
>                           current->comm,
>                           from_kuid(&init_user_ns, current_fsuid()));
> -               return 0;
> +               goto out_cache_jobid;
>          }
>
>          /* Whole node dedicated to single job */
>          if (strcmp(obd_jobid_var, JOBSTATS_NODELOCAL) == 0) {
> -               strcpy(jobid, obd_jobid_node);
> -               return 0;
> +               strcpy(tmp_jobid, obd_jobid_node);
> +               goto out_cache_jobid;
>          }
>
>          return -ENOENT;
> +
> +out_cache_jobid:
> +       /* Only replace the job ID if it changed. */
> +       if (strcmp(jobid, tmp_jobid) != 0)
> +               strcpy(jobid, tmp_jobid);
> +
> +       return 0;
>  }
>  EXPORT_SYMBOL(lustre_get_jobid);
>
> --
> 1.8.3.1
> 
> 
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window
  2018-08-02  3:52     ` James Simmons
@ 2018-08-02  3:58       ` Patrick Farrell
  0 siblings, 0 replies; 58+ messages in thread
From: Patrick Farrell @ 2018-08-02  3:58 UTC (permalink / raw)
  To: lustre-devel

Fair enough.  It doesn't really make sense in context, but that's fine since it's temporary.


________________________________
From: James Simmons <jsimmons@infradead.org>
Sent: Wednesday, August 1, 2018 10:52:21 PM
To: Patrick Farrell
Cc: Andreas Dilger; Oleg Drokin; NeilBrown; Lustre Development List
Subject: Re: [PATCH 06/31] lustre: llite: reduce jobstats race window


> I'm puzzled, James - Why is "cache_jobid" in there?  Isn't that from Ben Evans' work?  This patch landed before all of that...

All backported patches have the potential to be modified so that they pass
checkpatch and follow the preferred coding standards. One of the common
complaints was that lustre tends to use generic goto labels, which can make
grepping the code more challenging. So I often change generic goto labels to
something with more meat. In this case I picked a nice name that came from
a later patch :-)

> ______________________________________________________________________________________________________________________________
> From: James Simmons <jsimmons@infradead.org>
> Sent: Monday, July 30, 2018 9:25:58 PM
> To: Andreas Dilger; Oleg Drokin; NeilBrown
> Cc: Lustre Development List; Patrick Farrell; James Simmons
> Subject: [PATCH 06/31] lustre: llite: reduce jobstats race window
> From: Patrick Farrell <paf@cray.com>
>
> In the current code, lli_jobid is set to zero on every call
> to lustre_get_jobid.  This causes problems, because it's
> used asynchronously to set the job id in RPCs, and some
> RPCs will falsely get no jobid set.  (For small IO sizes,
> this can be up to 60% of RPCs.)
>
> It would be very expensive to put hard synchronization
> between this and every outbound RPC, and it's OK to very
> rarely get an RPC without correct job stats info.
>
> This patch only updates the lli_jobid when the job id has
> changed, which leaves only a very small window for reading
> an inconsistent job id.
>
> Signed-off-by: Patrick Farrell <paf@cray.com>
> WC-id: https://jira.whamcloud.com/browse/LU-8926
> Reviewed-on: https://review.whamcloud.com/24253
> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> Reviewed-by: Chris Horn <hornc@cray.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/llite/llite_lib.c    |  1 +
>  drivers/staging/lustre/lustre/obdclass/class_obd.c | 20 ++++++++++++++------
>  2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
> index c0861b9..72b118a 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
> @@ -894,6 +894,7 @@ void ll_lli_init(struct ll_inode_info *lli)
>                  lli->lli_async_rc = 0;
>          }
>          mutex_init(&lli->lli_layout_mutex);
> +       memset(lli->lli_jobid, 0, LUSTRE_JOBID_SIZE);
>  }
>
>  int ll_fill_super(struct super_block *sb)
> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> index cdaf729..87327ef 100644
> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> @@ -95,26 +95,34 @@
>   */
>  int lustre_get_jobid(char *jobid)
>  {
> -       memset(jobid, 0, LUSTRE_JOBID_SIZE);
> +       char tmp_jobid[LUSTRE_JOBID_SIZE] = { 0 };
> +
>          /* Jobstats isn't enabled */
>          if (strcmp(obd_jobid_var, JOBSTATS_DISABLE) == 0)
> -               return 0;
> +               goto out_cache_jobid;
>
>          /* Use process name + fsuid as jobid */
>          if (strcmp(obd_jobid_var, JOBSTATS_PROCNAME_UID) == 0) {
> -               snprintf(jobid, LUSTRE_JOBID_SIZE, "%s.%u",
> +               snprintf(tmp_jobid, LUSTRE_JOBID_SIZE, "%s.%u",
>                           current->comm,
>                           from_kuid(&init_user_ns, current_fsuid()));
> -               return 0;
> +               goto out_cache_jobid;
>          }
>
>          /* Whole node dedicated to single job */
>          if (strcmp(obd_jobid_var, JOBSTATS_NODELOCAL) == 0) {
> -               strcpy(jobid, obd_jobid_node);
> -               return 0;
> +               strcpy(tmp_jobid, obd_jobid_node);
> +               goto out_cache_jobid;
>          }
>
>          return -ENOENT;
> +
> +out_cache_jobid:
> +       /* Only replace the job ID if it changed. */
> +       if (strcmp(jobid, tmp_jobid) != 0)
> +               strcpy(jobid, tmp_jobid);
> +
> +       return 0;
>  }
>  EXPORT_SYMBOL(lustre_get_jobid);
>
> --
> 1.8.3.1
>
>
>
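
The update-only-when-changed idea described in the commit message above can be tried outside the kernel. What follows is a minimal user-space sketch, not the Lustre code: JOBID_SIZE and cache_jobid() are hypothetical names, and the shared buffer is assumed to be NUL-terminated and at least JOBID_SIZE bytes long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define JOBID_SIZE 32	/* hypothetical stand-in for LUSTRE_JOBID_SIZE */

/*
 * Build the new job ID in a local buffer and write to the shared buffer
 * only when the value differs. Returns 1 if the shared buffer was
 * rewritten and 0 if it was left alone.
 */
static int cache_jobid(char *shared, const char *comm, unsigned int uid)
{
	char tmp[JOBID_SIZE] = { 0 };

	assert(shared != NULL && comm != NULL);

	/* snprintf() truncates if needed, so tmp always fits in JOBID_SIZE. */
	snprintf(tmp, sizeof(tmp), "%s.%u", comm, uid);

	/* Only replace the job ID if it changed. */
	if (strcmp(shared, tmp) == 0)
		return 0;

	strcpy(shared, tmp);
	return 1;
}
```

Calling cache_jobid() twice with the same name and uid rewrites the buffer only the first time, so a concurrent reader sees a half-written ID only in the short window when the value really changes.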

* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-08-02  2:09               ` NeilBrown
@ 2018-08-03  0:01                 ` Andreas Dilger
  2018-08-03  1:58                   ` NeilBrown
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Dilger @ 2018-08-03  0:01 UTC (permalink / raw)
  To: lustre-devel

On Aug 1, 2018, at 20:09, NeilBrown <neilb@suse.com> wrote:
> 
> 
> Once lustre and lnet are part of upstream Linux, what is the value of
> keeping any of it in the master repo?
> There would be a need to keep it only to support old versions of
> Linux, which will hopefully be less and less over time.
> It might make sense to backport the upstream-linux code to those particular
> versions where it is needed, and do all development work in upstream
> Linux, and just backport.

I think the main reason is that none of the distros (which is what our
customers use) will be running an upstream kernel for years after the
client and later the server have landed upstream.  Also, since the
upstream kernel doesn't allow any kind of interoperability code,
backporting the upstream client to the older kernels will essentially
involve recreating all of the interop code that lives in the out-of-tree
client and server today, since there are a thousand small API changes
going into the kernel that affect the Lustre code.

I suspect there will be at least a couple of years of overlap, until
we start seeing the distro kernels including a version of Lustre, then
we can deprecate the out-of-tree code and only keep it maintained for
older kernels, as happens today for older Lustre releases.

Cheers, Andreas

> On Wed, Aug 01 2018, Amir Shehata wrote:
> 
>> Hi Neil,
>> 
>> Thanks for the explanation.
>> 
>> Speaking specifically for LNet, I'm not sure it's feasible to remove the code from the master repo. As I mentioned, LNet is a common piece between the client and server sides; both of them rely on it. I believe LNet is also used by DVS, a Cray-developed layer that I'm not very familiar with. So I don't think just deleting it from the master repo would work.
>> 
>> Over the past years, there have been discussions about making it a standalone module that can be pulled in as a dependency. This approach makes a bit more sense to me.
>> 
>> What are your thoughts on that?
>> 
>> thanks
>> amir
>> ________________________________________
>> From: NeilBrown [neilb at suse.com]
>> Sent: Tuesday, July 31, 2018 7:50 PM
>> To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
>> Cc: Lustre Development List
>> Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print
>> 
>> Hi Amir,
>> I think I'm happy to let this slide.  I don't like magic numbers, but
>> this one isn't important enough to justify the problems that might be
>> caused by changing it.
>> 
>> To answer your broader question:
>>> The code bases are already diverging significantly between the
>>> upstream client and the master repo, which makes porting features from
>>> master to upstream a difficult task. Do we have a strategy on how to
>>> deal with this?
>> 
>> The long term strategy is just to get the work done so that the client
>> code - and then all the kernel code - can be deleted from the master
>> repo and can live solely in Linux.  I know that is still quite a way
>> away.
>> 
>> Shorter term, there are no magic answers.  Yes it is difficult but it
>> is far from impossible.  My plan has always been to get the code that
>> is already in drivers/staging into a reasonable state, then start
>> forward-porting patches from master.  If other people do some of the
>> forward-porting, that just makes me happier.
>> If you think there is too much churn in my lustre tree, then just
>> provide patches based on some old commit - I'm quite happy to receive
>> patches based on fairly old code, and to do the final steps of
>> forward-porting/conflict resolution myself (I have lots of practice).
>> 
>> NeilBrown
>> 
>> 
>> On Wed, Aug 01 2018, Amir Shehata wrote:
>> 
>>> Hi Neil,
>>> 
>>> This issue actually came up because of LU-6060, which changed the behavior for LLNL. The behavior then was changed again by LU-6851: https://jira.whamcloud.com/browse/LU-6851 (if you'd like more background)
>>> 
>>> As a result of LU-6851 we were printing the unsigned value of -1. That's why we ended up printing it as -1, which is more bearable than just printing a large unsigned value.
>>> 
>>> I'm not disagreeing that it'll be better to print a clearer value; "unknown" does sound like it conveys the correct meaning. However, I've sometimes run into issues where changing user-facing interfaces caused problems for user scripts. Whether that would be the case here, I'm not 100% sure. We can always make the change and then wait for tickets to be opened.
>>> 
>>> However, what concerns me more is that if we make changes like this to the upstreamed client, it's probably a good idea to also make them to the whamcloud repo, so as not to diverge the client and server (LNet is common between them). This particular case might not be very significant, but other issues might come up that are more significant.
>>> 
>>> The code bases are already diverging significantly between the upstream client and the master repo, which makes porting features from master to upstream a difficult task. Do we have a strategy on how to deal with this?
>>> 
>>> thanks
>>> amir
>>> ________________________________________
>>> From: NeilBrown [neilb at suse.com]
>>> Sent: Tuesday, July 31, 2018 5:32 PM
>>> To: Amir Shehata; James Simmons; Andreas Dilger; Oleg Drokin
>>> Cc: Lustre Development List
>>> Subject: RE: [PATCH 11/31] lustre: lnet: Fix route hops print
>>> 
>>> Hi Amir,
>>> thanks for the background.
>>> 
>>> I had to chuckle at "0 being the highest", though I know that this
>>> distortion is not something specific to lustre.
>>> 
>>> Your description seems to suggest that "-1" means "unknown" with an
>>> implication that the number of hops might be high, and best not to take
>>> the risk.
>>> 
>>> Your point about compatibility with scripts has some validity, though
>>> it is annoying to have to support such ugly interfaces indefinitely.
>>> Are there really likely to be dependencies? lustre has only been
>>> printing -1 since Feb last year when this patch went upstream.
>>> That was presumably an abi change as it would have printed MAXINT-1
>>> previously.  Did that cause any problems?
>>> 
>>> Thanks,
>>> NeilBrown
>>> 
>>> 
>>> On Tue, Jul 31 2018, Amir Shehata wrote:
>>> 
>>>> The way hop and priority work in the code is they serve to select the preferred route. If you have multiple gateways leading to the same destination, you select the one with the highest priority (0 being the highest), followed by selecting the one with the least number of hops. If you don't specify hops, then it's actually treated as the least favoured if there are other routes with hops specified. If hops and priority are equivalent between routes, then you select the one with the most credits available, if that's equivalent you select in round robin.
>>>> 
>>>> In that sense hops and priority really serve the same purpose, select the preferred route. If it was up to me I would keep only one of them, but for historical reasons, both are kept.
>>>> 
>>>> Therefore, I'm not sure if "unlimited" actually relays the correct interpretation of that value. Note there could be user scripts out there that are already parsing the output. So by changing the -1 you could break the scripts. Also changing that will create an inconsistency between the server and client.
>>>> 
>>>> thanks
>>>> amir
>>>> ________________________________________
>>>> From: NeilBrown [neilb at suse.com]
>>>> Sent: Tuesday, July 31, 2018 3:38 PM
>>>> To: James Simmons; Andreas Dilger; Oleg Drokin
>>>> Cc: Lustre Development List; Amir Shehata; James Simmons
>>>> Subject: Re: [PATCH 11/31] lustre: lnet: Fix route hops print
>>>> 
>>>> On Mon, Jul 30 2018, James Simmons wrote:
>>>> 
>>>>> From: Amir Shehata <ashehata@whamcloud.com>
>>>>> 
>>>>> The default number of hops for a route is -1. This is
>>>>> currently being printed as %u. Change that to %d to
>>>>> make it print out properly.
>>>> 
>>>> -1 hops???  I wish I could hop -1 times - it would be a good party
>>>> trick!!
>>>> 
>>>> What does -1 mean?  Unlimited (just a guess).  If so, could we print
>>>> "unlimited"??
>>>> 
>>>> I'm fine with having magic numbers in the code, but I don't like them to
>>>> leak out.
>>>> 
>>>> NeilBrown
>>>> 
>>>>> 
>>>>> Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
>>>>> WC-id: https://jira.whamcloud.com/browse/LU-9078
>>>>> Reviewed-on: https://review.whamcloud.com/25250
>>>>> Reviewed-by: Olaf Weber <olaf@sgi.com>
>>>>> Reviewed-by: Doug Oucharek <dougso@me.com>
>>>>> Reviewed-by: James Simmons <uja.ornl@yahoo.com>
>>>>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>>>>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>>>>> ---
>>>>> drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>> 
>>>>> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>>>> index 8856798..aa98ce5 100644
>>>>> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
>>>>> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
>>>>> @@ -218,7 +218,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>>>>>                      int alive = lnet_is_route_alive(route);
>>>>> 
>>>>>                      s += snprintf(s, tmpstr + tmpsiz - s,
>>>>> -                                   "%-8s %4u %8u %7s %s\n",
>>>>> +                                   "%-8s %4d %8u %7s %s\n",
>>>>>                                    libcfs_net2str(net), hops,
>>>>>                                    priority,
>>>>>                                    alive ? "up" : "down",
>>>>> --
>>>>> 1.8.3.1
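
The effect of the one-character format change in the patch above can be reproduced in user space. This is a small sketch with a hypothetical helper, format_hops(); it assumes the hop count is held in a plain int and is not taken from router_proc.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the same hop count twice: once as the old code did (%4u) and
 * once as the patched code does (%4d). Both buffers must hold len bytes.
 */
static void format_hops(int hops, char *as_u, char *as_d, size_t len)
{
	assert(as_u != NULL && as_d != NULL && len > 0);

	/* With a 32-bit int, -1 converted to unsigned prints as 4294967295. */
	snprintf(as_u, len, "%4u", (unsigned int)hops);

	/* The signed conversion keeps the default value readable as -1. */
	snprintf(as_d, len, "%4d", hops);
}
```

For a default route with hops set to -1, the %4d buffer holds "  -1", while the %4u buffer holds a large positive number that overflows the four-character column.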

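The route preference order Amir describes in the quoted text (priority first, then hop count, then available credits, then round robin) can be written as a comparison function. This is only an illustrative sketch: struct route_sketch, hops_rank() and route_compare() are hypothetical names, not the LNet data structures or functions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a route entry; not the LNet structure. */
struct route_sketch {
	unsigned int priority;	/* 0 is the most preferred */
	int hops;		/* -1 means hops were not specified */
	int credits;		/* credits currently available */
};

/* Unspecified hops rank behind every explicit hop count. */
static int hops_rank(int hops)
{
	return hops < 0 ? INT_MAX : hops;
}

/*
 * Returns a negative value if a is preferred, a positive value if b is
 * preferred, and 0 on a full tie (the caller would then round-robin).
 */
static int route_compare(const struct route_sketch *a,
			 const struct route_sketch *b)
{
	assert(a != NULL && b != NULL);

	/* Lower priority number wins. */
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;

	/* Fewer hops wins; unspecified hops lose to any explicit count. */
	if (hops_rank(a->hops) != hops_rank(b->hops))
		return hops_rank(a->hops) < hops_rank(b->hops) ? -1 : 1;

	/* More available credits wins. */
	if (a->credits != b->credits)
		return a->credits > b->credits ? -1 : 1;

	return 0;
}
```

Mapping -1 to INT_MAX inside hops_rank() keeps the "not specified" case out of the comparison itself, which is one way to avoid leaking the magic number any further than the struct field.
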
Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





* [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print
  2018-08-03  0:01                 ` Andreas Dilger
@ 2018-08-03  1:58                   ` NeilBrown
  0 siblings, 0 replies; 58+ messages in thread
From: NeilBrown @ 2018-08-03  1:58 UTC (permalink / raw)
  To: lustre-devel

On Fri, Aug 03 2018, Andreas Dilger wrote:

> On Aug 1, 2018, at 20:09, NeilBrown <neilb@suse.com> wrote:
>> 
>> 
>> Once lustre and lnet are part of upstream Linux, what is the value of
>> keeping any of it in the master repo?
>> There would be a need to keep it only to support old versions of
>> Linux, which will hopefully be less and less over time.
>> It might make sense to backport the upstream-linux code to those particular
>> versions where it is needed, and do all development work in upstream
>> Linux, and just backport.
>
> I think the main reason is that none of the distros (which is what our
> customers use) will be running an upstream kernel for years after the
> client and later the server have landed upstream.  Also, since the
> upstream kernel doesn't allow any kind of interoperability code,
> backporting the upstream client to the older kernels will essentially
> involve recreating all of the interop code that lives in the out-of-tree
> client and server today, since there are a thousand small API changes
> going into the kernel that affect the Lustre code.
>
> I suspect there will be at least a couple of years of overlap, until
> we start seeing the distro kernels including a version of Lustre, then
> we can deprecate the out-of-tree code and only keep it maintained for
> older kernels, as happens today for older Lustre releases.

Yes, there will be overlap.  It would be worth some effort to minimize
that if we can.
Once we have something credible in mainline, I'd like to encourage
vendors to take a back-port of that, at least for their next SP.
Hopefully we can then do development against mainline, and backport that
to whatever still needs to be supported by out-of-tree code.

Thanks,
NeilBrown

end of thread, last message 2018-08-03  1:58 UTC

Thread overview: 58+ messages
2018-07-31  2:25 [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 James Simmons
2018-07-31  2:25 ` [lustre-devel] [PATCH 01/31] lustre: osc: Send RPCs when extents are full James Simmons
2018-07-31  2:25 ` [lustre-devel] [PATCH 02/31] lustre: obd: add 'network' client mount option James Simmons
2018-07-31  2:25 ` [lustre-devel] [PATCH 03/31] lustre: obd: change positional struct initializers to C99 James Simmons
2018-07-31  2:25 ` [lustre-devel] [PATCH 04/31] lustre: lmv: honour the specified stripe index James Simmons
2018-07-31  2:25 ` [lustre-devel] [PATCH 05/31] lustre: llite: return small device numbers for compat stat() James Simmons
2018-07-31  2:25 ` [lustre-devel] [PATCH 06/31] lustre: llite: reduce jobstats race window James Simmons
2018-07-31  4:05   ` Patrick Farrell
2018-08-02  3:52     ` James Simmons
2018-08-02  3:58       ` Patrick Farrell
2018-07-31  2:25 ` [lustre-devel] [PATCH 07/31] lustre: lnet: change positional struct initializers to C99 James Simmons
2018-07-31 22:32   ` NeilBrown
2018-07-31  2:26 ` [lustre-devel] [PATCH 08/31] lustre: llite: don't zero timestamps internally James Simmons
2018-07-31 22:31   ` NeilBrown
2018-07-31  2:26 ` [lustre-devel] [PATCH 09/31] lustre: mgc: relate sptlrpc & param to MGC James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 10/31] lustre: lnet: removal of obsolete LNDs James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 11/31] lustre: lnet: Fix route hops print James Simmons
2018-07-31 22:38   ` NeilBrown
2018-07-31 23:54     ` Amir Shehata
2018-08-01  0:32       ` NeilBrown
2018-08-01  1:15         ` Amir Shehata
2018-08-01  2:50           ` NeilBrown
2018-08-01 17:11             ` Amir Shehata
2018-08-02  2:09               ` NeilBrown
2018-08-03  0:01                 ` Andreas Dilger
2018-08-03  1:58                   ` NeilBrown
2018-07-31  2:26 ` [lustre-devel] [PATCH 12/31] lustre: obdclass: obdclass module cleanup upon load error James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 13/31] lustre: config: don't attach sub logs for LWP James Simmons
2018-07-31 22:41   ` NeilBrown
2018-07-31  2:26 ` [lustre-devel] [PATCH 14/31] lustre: llite: buggy special handling on MULTIMODRPCS James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 15/31] lustre: clio: remove unused members from struct cl_thread_info James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 16/31] lustre: obd: remove OBD_NOTIFY_SYNC{, _NONBLOCK} James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 17/31] lustre: obdclass: handle early requests vs CT registering James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 18/31] lustre: libcfs: avoid overflow of crypto bandwidth calculation James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 19/31] lustre: obd: remove OBD_NOTIFY_CONFIG James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 20/31] lustre: llite: Remove OBD_FAIL_OSC_CONNECT_CKSUM James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 21/31] lustre: osc: hung in osc_destroy() James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 22/31] lustre: libcfs: reduce libcfs checksum speed test time James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 23/31] lustre: llite: Return -ERESTARTSYS in range_lock() James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 24/31] lustre: obdclass: use static initializer macros where possible James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 25/31] lustre: config: move config types into lustre_idl.h James Simmons
2018-07-31 22:47   ` NeilBrown
2018-07-31 23:04     ` Patrick Farrell
2018-08-01  0:23       ` NeilBrown
2018-08-01  0:40         ` Patrick Farrell
2018-08-01  2:10           ` NeilBrown
2018-08-01  3:23             ` Patrick Farrell
2018-07-31  2:26 ` [lustre-devel] [PATCH 26/31] lustre: obd: remove unused data parameter from obd_notify() James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 27/31] lustre: llite: handle client racy case during create James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 28/31] lustre: obdclass: improve missing operation message James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 29/31] lustre: llite: ignore layout for ll_writepages() James Simmons
2018-07-31  2:26 ` [lustre-devel] [PATCH 30/31] lustre: fid: race between client_fid_fini and seq_client_flush James Simmons
2018-07-31 22:55   ` NeilBrown
2018-08-01  0:44     ` Yong, Fan
2018-08-01  2:58       ` NeilBrown
2018-08-01  4:15         ` Yong, Fan
2018-07-31  2:26 ` [lustre-devel] [PATCH 31/31] lustre: docs: update TODO file James Simmons
2018-08-01  3:41 ` [lustre-devel] [PATCH 00/31] lustre: missing fixes and cleanups from lustre 2.10 NeilBrown
