linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/40] Sync upstream lustre client LNet core
@ 2015-11-20 23:35 James Simmons
  2015-11-20 23:35 ` [PATCH 01/40] staging: lustre: drop *_t from end of struct lnet_text_buf James Simmons
                   ` (40 more replies)
  0 siblings, 41 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, James Simmons

This is the majority of the fixes that have gone into the LNet layer.
Apart from a few remaining patches, this brings LNet close to what is
running in production worldwide.

This patch series requires the "remove IOC_LIBCFS_PING_TEST ioctl"
patch to land first.

Amir Shehata (19):
  staging: lustre: Dynamic LNet Configuration (DLC)
  staging: lustre: Dynamic LNet Configuration (DLC) dynamic routing
  staging: lustre: DLC Feature dynamic net config
  staging: lustre: Dynamic LNet Configuration (DLC) IOCTL changes
  staging: lustre: Dynamic LNet Configuration (DLC) show command
  staging: lustre: fix crash due to NULL networks string
  staging: lustre: DLC user/kernel space glue code
  staging: lustre: fix kernel crash when network failed to start
  staging: lustre: improve LNet clean up code and API
  staging: lustre: return appropriate errno when adding route
  staging: lustre: startup lnet acceptor thread dynamically
  staging: lustre: reject invalid net configuration for lnet
  staging: lustre: return -EEXIST if NI is not unique
  staging: lustre: handle lnet_check_routes() errors
  staging: lustre: improvement to router checker
  staging: lustre: prevent assert on LNet module unload
  staging: lustre: remove messages from lazy portal on NI shutdown
  staging: lustre: remove unnecessary NULL check in IOC_LIBCFS_GET_NET
  staging: lustre: Allocate the correct number of rtr buffers

Bruno Faccini (1):
  staging: lustre: avoid race during lnet acceptor thread termination

Chris Horn (2):
  staging: lustre: reflect down routes in /proc/sys/lnet/routes
  staging: lustre: Use lnet_is_route_alive for router aliveness

Doug Oucharek (1):
  staging: lustre: Remove LASSERTS from router checker

Frank Zago (1):
  staging: lustre: do not memset after LIBCFS_ALLOC

James Simmons (4):
  staging: lustre: drop *_t from end of struct lnet_text_buf
  staging: lustre: eliminate obsolete Cray SeaStar support
  staging: lustre: Fixes to make lnetctl function as expected.
  staging: lustre: test for sk_sleep presence in compact-2.6.h

John L. Hammond (3):
  staging: lustre: remove uses of IS_ERR_VALUE()
  staging: lustre: remove LUSTRE_{,SRV_}LNET_PID
  staging: lustre: assume a kernel build

Liang Zhen (3):
  staging: lustre: fix failure handle of create reply
  staging: lustre: return +ve for blocked lnet message
  staging: lustre: copy out libcfs ioctl inline buffer

Sebastien Buisson (1):
  staging: lustre: fix 'NULL pointer dereference' errors for LNet

frank zago (5):
  staging: lustre: make local functions static for LNet ni
  staging: lustre: add sparse annotation __user wherever needed for lnet
  staging: lustre: make some lnet functions static
  staging: lustre: missed a few cases of using NULL instead of 0
  staging: lustre: remove unnecessary EXPORT_SYMBOL from lnet layer

 .../staging/lustre/include/linux/libcfs/libcfs.h   |    2 -
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |   89 ++-
 .../lustre/include/linux/libcfs/linux/libcfs.h     |    3 -
 .../staging/lustre/include/linux/lnet/lib-dlc.h    |  118 ++
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   60 +-
 .../staging/lustre/include/linux/lnet/lib-types.h  |   38 +-
 drivers/staging/lustre/include/linux/lnet/lnetst.h |   96 +-
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |    4 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd.c    |    7 +-
 drivers/staging/lustre/lnet/lnet/acceptor.c        |   30 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          | 1295 +++++++++++++------
 drivers/staging/lustre/lnet/lnet/config.c          |  113 +-
 drivers/staging/lustre/lnet/lnet/lib-eq.c          |    3 -
 drivers/staging/lustre/lnet/lnet/lib-md.c          |    3 -
 drivers/staging/lustre/lnet/lnet/lib-me.c          |    3 -
 drivers/staging/lustre/lnet/lnet/lib-move.c        |  195 ++-
 drivers/staging/lustre/lnet/lnet/lib-msg.c         |   20 +-
 drivers/staging/lustre/lnet/lnet/lib-ptl.c         |   54 +-
 drivers/staging/lustre/lnet/lnet/lib-socket.c      |    3 -
 drivers/staging/lustre/lnet/lnet/module.c          |   70 +-
 drivers/staging/lustre/lnet/lnet/peer.c            |  197 +++-
 drivers/staging/lustre/lnet/lnet/router.c          |  426 +++++--
 drivers/staging/lustre/lnet/lnet/router_proc.c     |    4 +-
 drivers/staging/lustre/lnet/selftest/conctl.c      |   62 +-
 drivers/staging/lustre/lnet/selftest/conrpc.c      |    4 +-
 drivers/staging/lustre/lnet/selftest/conrpc.h      |    5 +-
 drivers/staging/lustre/lnet/selftest/console.c     |   97 +-
 drivers/staging/lustre/lnet/selftest/console.h     |   55 +-
 drivers/staging/lustre/lnet/selftest/framework.c   |   10 -
 drivers/staging/lustre/lnet/selftest/module.c      |    4 +-
 drivers/staging/lustre/lnet/selftest/rpc.c         |    4 +-
 .../lustre/lustre/libcfs/linux/linux-module.c      |   74 +-
 drivers/staging/lustre/lustre/libcfs/module.c      |  104 +-
 drivers/staging/lustre/lustre/libcfs/tracefile.c   |    6 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    2 +-
 drivers/staging/lustre/lustre/llite/statahead.c    |    8 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   18 +-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    9 +-
 .../lustre/lustre/obdclass/linux/linux-module.c    |   17 +-
 drivers/staging/lustre/lustre/obdclass/llog.c      |   13 +-
 .../staging/lustre/lustre/obdclass/obd_config.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/events.c      |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/pinger.c      |   10 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c     |   29 +-
 44 files changed, 2297 insertions(+), 1075 deletions(-)
 create mode 100644 drivers/staging/lustre/include/linux/lnet/lib-dlc.h



* [PATCH 01/40] staging: lustre: drop *_t from end of struct lnet_text_buf
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 02/40] staging: lustre: fix 'NULL pointer dereference' errors for LNet James Simmons
                   ` (39 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, James Simmons

When the lnet_text_buf data structure was converted from a typedef
to a struct, the *_t suffix typical of typedefs was not dropped.
This patch removes the *_t suffix for consistency.

Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/lnet/config.c |   57 ++++++++++++++---------------
 1 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index 5390ee9..867c96e 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -37,7 +37,7 @@
 #define DEBUG_SUBSYSTEM S_LNET
 #include "../../include/linux/lnet/lib-lnet.h"
 
-struct lnet_text_buf_t {	    /* tmp struct for parsing routes */
+struct lnet_text_buf {	    /* tmp struct for parsing routes */
 	struct list_head ltb_list;	/* stash on lists */
 	int ltb_size;	/* allocated size */
 	char ltb_text[0];     /* text buffer */
@@ -365,14 +365,14 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 	return -EINVAL;
 }
 
-static struct lnet_text_buf_t *
+static struct lnet_text_buf *
 lnet_new_text_buf(int str_len)
 {
-	struct lnet_text_buf_t *ltb;
+	struct lnet_text_buf *ltb;
 	int nob;
 
 	/* NB allocate space for the terminating 0 */
-	nob = offsetof(struct lnet_text_buf_t, ltb_text[str_len + 1]);
+	nob = offsetof(struct lnet_text_buf, ltb_text[str_len + 1]);
 	if (nob > LNET_SINGLE_TEXTBUF_NOB) {
 		/* _way_ conservative for "route net gateway..." */
 		CERROR("text buffer too big\n");
@@ -395,7 +395,7 @@ lnet_new_text_buf(int str_len)
 }
 
 static void
-lnet_free_text_buf(struct lnet_text_buf_t *ltb)
+lnet_free_text_buf(struct lnet_text_buf *ltb)
 {
 	lnet_tbnob -= ltb->ltb_size;
 	LIBCFS_FREE(ltb, ltb->ltb_size);
@@ -404,10 +404,10 @@ lnet_free_text_buf(struct lnet_text_buf_t *ltb)
 static void
 lnet_free_text_bufs(struct list_head *tbs)
 {
-	struct lnet_text_buf_t *ltb;
+	struct lnet_text_buf *ltb;
 
 	while (!list_empty(tbs)) {
-		ltb = list_entry(tbs->next, struct lnet_text_buf_t, ltb_list);
+		ltb = list_entry(tbs->next, struct lnet_text_buf, ltb_list);
 
 		list_del(&ltb->ltb_list);
 		lnet_free_text_buf(ltb);
@@ -421,7 +421,7 @@ lnet_str2tbs_sep(struct list_head *tbs, char *str)
 	char *sep;
 	int nob;
 	int i;
-	struct lnet_text_buf_t *ltb;
+	struct lnet_text_buf *ltb;
 
 	INIT_LIST_HEAD(&pending);
 
@@ -479,7 +479,7 @@ lnet_expand1tb(struct list_head *list,
 {
 	int len1 = (int)(sep1 - str);
 	int len2 = strlen(sep2 + 1);
-	struct lnet_text_buf_t *ltb;
+	struct lnet_text_buf *ltb;
 
 	LASSERT(*sep1 == '[');
 	LASSERT(*sep2 == ']');
@@ -636,7 +636,7 @@ lnet_parse_route(char *str, int *im_a_router)
 	struct list_head *tmp2;
 	__u32 net;
 	lnet_nid_t nid;
-	struct lnet_text_buf_t *ltb;
+	struct lnet_text_buf *ltb;
 	int rc;
 	char *sep;
 	char *token = str;
@@ -692,8 +692,7 @@ lnet_parse_route(char *str, int *im_a_router)
 		list_add_tail(tmp1, tmp2);
 
 		while (tmp1 != tmp2) {
-			ltb = list_entry(tmp1, struct lnet_text_buf_t,
-					 ltb_list);
+			ltb = list_entry(tmp1, struct lnet_text_buf, ltb_list);
 
 			rc = lnet_str2tbs_expand(tmp1->next, ltb->ltb_text);
 			if (rc < 0)
@@ -733,13 +732,12 @@ lnet_parse_route(char *str, int *im_a_router)
 	LASSERT(!list_empty(&gateways));
 
 	list_for_each(tmp1, &nets) {
-		ltb = list_entry(tmp1, struct lnet_text_buf_t, ltb_list);
+		ltb = list_entry(tmp1, struct lnet_text_buf, ltb_list);
 		net = libcfs_str2net(ltb->ltb_text);
 		LASSERT(net != LNET_NIDNET(LNET_NID_ANY));
 
 		list_for_each(tmp2, &gateways) {
-			ltb = list_entry(tmp2, struct lnet_text_buf_t,
-					 ltb_list);
+			ltb = list_entry(tmp2, struct lnet_text_buf, ltb_list);
 			nid = libcfs_str2nid(ltb->ltb_text);
 			LASSERT(nid != LNET_NID_ANY);
 
@@ -772,10 +770,10 @@ lnet_parse_route(char *str, int *im_a_router)
 static int
 lnet_parse_route_tbs(struct list_head *tbs, int *im_a_router)
 {
-	struct lnet_text_buf_t *ltb;
+	struct lnet_text_buf *ltb;
 
 	while (!list_empty(tbs)) {
-		ltb = list_entry(tbs->next, struct lnet_text_buf_t, ltb_list);
+		ltb = list_entry(tbs->next, struct lnet_text_buf, ltb_list);
 
 		if (lnet_parse_route(ltb->ltb_text, im_a_router) < 0) {
 			lnet_free_text_bufs(tbs);
@@ -909,8 +907,8 @@ lnet_splitnets(char *source, struct list_head *nets)
 	int offset = 0;
 	int offset2;
 	int len;
-	struct lnet_text_buf_t *tb;
-	struct lnet_text_buf_t *tb2;
+	struct lnet_text_buf *tb;
+	struct lnet_text_buf *tb2;
 	struct list_head *t;
 	char *sep;
 	char *bracket;
@@ -919,7 +917,7 @@ lnet_splitnets(char *source, struct list_head *nets)
 	LASSERT(!list_empty(nets));
 	LASSERT(nets->next == nets->prev);     /* single entry */
 
-	tb = list_entry(nets->next, struct lnet_text_buf_t, ltb_list);
+	tb = list_entry(nets->next, struct lnet_text_buf, ltb_list);
 
 	for (;;) {
 		sep = strchr(tb->ltb_text, ',');
@@ -955,7 +953,7 @@ lnet_splitnets(char *source, struct list_head *nets)
 		}
 
 		list_for_each(t, nets) {
-			tb2 = list_entry(t, struct lnet_text_buf_t, ltb_list);
+			tb2 = list_entry(t, struct lnet_text_buf, ltb_list);
 
 			if (tb2 == tb)
 				continue;
@@ -996,8 +994,8 @@ lnet_match_networks(char **networksp, char *ip2nets, __u32 *ipaddrs, int nip)
 	struct list_head current_nets;
 	struct list_head *t;
 	struct list_head *t2;
-	struct lnet_text_buf_t *tb;
-	struct lnet_text_buf_t *tb2;
+	struct lnet_text_buf *tb;
+	struct lnet_text_buf *tb2;
 	__u32 net1;
 	__u32 net2;
 	int len;
@@ -1020,9 +1018,8 @@ lnet_match_networks(char **networksp, char *ip2nets, __u32 *ipaddrs, int nip)
 	rc = 0;
 
 	while (!list_empty(&raw_entries)) {
-		tb = list_entry(raw_entries.next, struct lnet_text_buf_t,
-				    ltb_list);
-
+		tb = list_entry(raw_entries.next, struct lnet_text_buf,
+				ltb_list);
 		strncpy(source, tb->ltb_text, sizeof(source));
 		source[sizeof(source)-1] = '\0';
 
@@ -1047,13 +1044,13 @@ lnet_match_networks(char **networksp, char *ip2nets, __u32 *ipaddrs, int nip)
 
 		dup = 0;
 		list_for_each(t, &current_nets) {
-			tb = list_entry(t, struct lnet_text_buf_t, ltb_list);
+			tb = list_entry(t, struct lnet_text_buf, ltb_list);
 			net1 = lnet_netspec2net(tb->ltb_text);
 			LASSERT(net1 != LNET_NIDNET(LNET_NID_ANY));
 
 			list_for_each(t2, &matched_nets) {
-				tb2 = list_entry(t2, struct lnet_text_buf_t,
-						     ltb_list);
+				tb2 = list_entry(t2, struct lnet_text_buf,
+						 ltb_list);
 				net2 = lnet_netspec2net(tb2->ltb_text);
 				LASSERT(net2 != LNET_NIDNET(LNET_NID_ANY));
 
@@ -1073,7 +1070,7 @@ lnet_match_networks(char **networksp, char *ip2nets, __u32 *ipaddrs, int nip)
 		}
 
 		list_for_each_safe(t, t2, &current_nets) {
-			tb = list_entry(t, struct lnet_text_buf_t, ltb_list);
+			tb = list_entry(t, struct lnet_text_buf, ltb_list);
 
 			list_del(&tb->ltb_list);
 			list_add_tail(&tb->ltb_list, &matched_nets);
-- 
1.7.1



* [PATCH 02/40] staging: lustre: fix 'NULL pointer dereference' errors for LNet
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
  2015-11-20 23:35 ` [PATCH 01/40] staging: lustre: drop *_t from end of struct lnet_text_buf James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02  7:46   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 03/40] staging: lustre: reflect down routes in /proc/sys/lnet/routes James Simmons
                   ` (38 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Sebastien Buisson

From: Sebastien Buisson <sebastien.buisson@bull.net>

Fix 'NULL pointer dereference' defects found by Coverity version
6.5.3:
- Dereference after null check (FORWARD_NULL)
  For instance, passing a null pointer to a function which
  dereferences it.
- Dereference before null check (REVERSE_INULL)
  Null-checking a variable suggests that it may be null, but it has
  already been dereferenced on all paths leading to the check.
- Dereference null return value (NULL_RETURNS)

The following fixes for the LNet layer are broken out of patch
http://review.whamcloud.com/4720.
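
The three Coverity defect classes named in this message can be
illustrated with a small user-space sketch. It is not taken from the
patch: struct item and every function name below are hypothetical, and
assert() stands in for the kernel's LASSERT().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, used only for this illustration. */
struct item {
	const char *name;
};

/* FORWARD_NULL: a caller may hand over a pointer it already found to
 * be NULL, so the callee guards before dereferencing. */
static size_t item_name_len(const struct item *it)
{
	if (!it || !it->name)
		return 0;
	return strlen(it->name);
}

/* REVERSE_INULL: the NULL test comes before the first dereference,
 * not after it. */
static int item_is_named(const struct item *it)
{
	if (!it)
		return 0;
	return it->name != NULL && it->name[0] != '\0';
}

/* NULL_RETURNS: this lookup can fail and then returns NULL. */
static const struct item *item_find(const struct item *tbl, size_t n,
				    const char *name)
{
	size_t i;

	/* same shape as the LASSERT(niov == 0 || iov) added by the patch */
	assert(n == 0 || tbl);

	for (i = 0; i < n; i++)
		if (tbl[i].name && strcmp(tbl[i].name, name) == 0)
			return &tbl[i];
	return NULL;
}

/* The caller checks the lookup result before using it. */
static size_t item_find_len(const struct item *tbl, size_t n,
			    const char *name)
{
	const struct item *it = item_find(tbl, n, name);

	return it ? item_name_len(it) : 0;
}
```

Each helper either checks its argument before the first dereference or
checks a possibly-NULL return value before use, which is the general
shape of the fixes in the diff below.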

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2217
Reviewed-on: http://review.whamcloud.com/4720
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |    2 +-
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    2 +
 drivers/staging/lustre/lnet/selftest/conctl.c      |   51 ++++++++++----------
 3 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index de0f85f..0f4154c 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -2829,7 +2829,7 @@ int kiblnd_startup(lnet_ni_t *ni)
 	return 0;
 
 failed:
-	if (net->ibn_dev == NULL && ibdev != NULL)
+	if (net && net->ibn_dev == NULL && ibdev != NULL)
 		kiblnd_destroy_dev(ibdev);
 
 net_failed:
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 5631f60..7a68382 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -162,6 +162,7 @@ lnet_iov_nob(unsigned int niov, struct kvec *iov)
 {
 	unsigned int nob = 0;
 
+	LASSERT(niov == 0 || iov);
 	while (niov-- > 0)
 		nob += (iov++)->iov_len;
 
@@ -280,6 +281,7 @@ lnet_kiov_nob(unsigned int niov, lnet_kiov_t *kiov)
 {
 	unsigned int nob = 0;
 
+	LASSERT(niov == 0 || kiov);
 	while (niov-- > 0)
 		nob += (kiov++)->kiov_len;
 
diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
index 556c837..2ca7d0e 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -679,45 +679,46 @@ static int
 lst_stat_query_ioctl(lstio_stat_args_t *args)
 {
 	int rc;
-	char *name;
+	char *name = NULL;
 
 	/* TODO: not finished */
 	if (args->lstio_sta_key != console_session.ses_key)
 		return -EACCES;
 
-	if (args->lstio_sta_resultp == NULL ||
-	    (args->lstio_sta_namep  == NULL &&
-	     args->lstio_sta_idsp   == NULL) ||
-	    args->lstio_sta_nmlen <= 0 ||
-	    args->lstio_sta_nmlen > LST_NAME_SIZE)
-		return -EINVAL;
-
-	if (args->lstio_sta_idsp != NULL &&
-	    args->lstio_sta_count <= 0)
+	if (!args->lstio_sta_resultp)
 		return -EINVAL;
 
-	LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
-	if (name == NULL)
-		return -ENOMEM;
-
-	if (copy_from_user(name, args->lstio_sta_namep,
-			       args->lstio_sta_nmlen)) {
-		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
-		return -EFAULT;
-	}
+	if (args->lstio_sta_idsp) {
+		if (args->lstio_sta_count <= 0)
+			return -EINVAL;
 
-	if (args->lstio_sta_idsp == NULL) {
-		rc = lstcon_group_stat(name, args->lstio_sta_timeout,
-				       args->lstio_sta_resultp);
-	} else {
 		rc = lstcon_nodes_stat(args->lstio_sta_count,
 				       args->lstio_sta_idsp,
 				       args->lstio_sta_timeout,
 				       args->lstio_sta_resultp);
-	}
+	} else if (args->lstio_sta_namep) {
+		if (args->lstio_sta_nmlen <= 0 ||
+		    args->lstio_sta_nmlen > LST_NAME_SIZE)
+			return -EINVAL;
+
+		LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
+		if (!name)
+			return -ENOMEM;
 
-	LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
+		rc = copy_from_user(name, args->lstio_sta_namep,
+				    args->lstio_sta_nmlen);
+		if (!rc)
+			rc = lstcon_group_stat(name, args->lstio_sta_timeout,
+					       args->lstio_sta_resultp);
+		else
+			rc = -EFAULT;
 
+	} else {
+		rc = -EINVAL;
+	}
+
+	if (name)
+		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
 	return rc;
 }
 
-- 
1.7.1



* [PATCH 03/40] staging: lustre: reflect down routes in /proc/sys/lnet/routes
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
  2015-11-20 23:35 ` [PATCH 01/40] staging: lustre: drop *_t from end of struct lnet_text_buf James Simmons
  2015-11-20 23:35 ` [PATCH 02/40] staging: lustre: fix 'NULL pointer dereference' errors for LNet James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02  7:54   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 04/40] staging: lustre: fix failure handle of create reply James Simmons
                   ` (37 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Chris Horn

From: Chris Horn <hornc@cray.com>

We consider routes "down" if the router is down or the router
NI for the target network is down. This should be reflected
in the output of /proc/sys/lnet/routes.
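
The rule described above can be sketched in user space as follows.
The struct layouts and the PING_FEAT_NI_STATUS value are stand-ins
chosen for illustration, not the LNet definitions; only the order of
the three checks follows the helper added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag bit; the real LNET_PING_FEAT_NI_STATUS is defined
 * in the LNet headers and is not reproduced here. */
#define PING_FEAT_NI_STATUS (1u << 1)

/* Hypothetical, simplified stand-ins for lnet_peer_t / lnet_route_t. */
struct gateway {
	bool alive;		/* is the router itself reachable? */
	unsigned int ping_feats;	/* features reported by the router */
};

struct route {
	const struct gateway *gw;
	unsigned int down_nis;	/* router NIs on the target net that are down */
};

static bool route_is_alive(const struct route *r)
{
	assert(r && r->gw);

	/* gateway is down */
	if (!r->gw->alive)
		return false;
	/* router reports no NI status, so assume the route is alive */
	if (!(r->gw->ping_feats & PING_FEAT_NI_STATUS))
		return true;
	/* router reports NI status: alive only if no NI is down */
	return r->down_nis == 0;
}
```

Keeping the test in one helper means route selection and the /proc
output cannot disagree about whether a route is usable.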

Signed-off-by: Chris Horn <hornc@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3679
Reviewed-on: http://review.whamcloud.com/7857
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   13 ++++++++
 drivers/staging/lustre/lnet/lnet/lib-move.c        |   32 ++++++++++----------
 drivers/staging/lustre/lnet/lnet/router_proc.c     |    2 +-
 3 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index b61d504..09c6bfe 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -64,6 +64,19 @@ extern lnet_t	the_lnet;	/* THE network */
 /** exclusive lock */
 #define LNET_LOCK_EX		CFS_PERCPT_LOCK_EX
 
+static inline int lnet_is_route_alive(lnet_route_t *route)
+{
+	/* gateway is down */
+	if (!route->lr_gateway->lp_alive)
+		return 0;
+	/* no NI status, assume it's alive */
+	if ((route->lr_gateway->lp_ping_feats &
+	     LNET_PING_FEAT_NI_STATUS) == 0)
+		return 1;
+	/* has NI status, check # down NIs */
+	return route->lr_downis == 0;
+}
+
 static inline int lnet_is_wire_handle_none(lnet_handle_wire_t *wh)
 {
 	return (wh->wh_interface_cookie == LNET_WIRE_HANDLE_COOKIE_NONE &&
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 7a68382..c56de44 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1122,9 +1122,9 @@ static lnet_peer_t *
 lnet_find_route_locked(lnet_ni_t *ni, lnet_nid_t target, lnet_nid_t rtr_nid)
 {
 	lnet_remotenet_t *rnet;
-	lnet_route_t *rtr;
-	lnet_route_t *rtr_best;
-	lnet_route_t *rtr_last;
+	lnet_route_t *route;
+	lnet_route_t *best_route;
+	lnet_route_t *last_route;
 	struct lnet_peer *lp_best;
 	struct lnet_peer *lp;
 	int rc;
@@ -1137,13 +1137,12 @@ lnet_find_route_locked(lnet_ni_t *ni, lnet_nid_t target, lnet_nid_t rtr_nid)
 		return NULL;
 
 	lp_best = NULL;
-	rtr_best = rtr_last = NULL;
-	list_for_each_entry(rtr, &rnet->lrn_routes, lr_list) {
-		lp = rtr->lr_gateway;
+	best_route = NULL;
+	last_route = NULL;
+	list_for_each_entry(route, &rnet->lrn_routes, lr_list) {
+		lp = route->lr_gateway;
 
-		if (!lp->lp_alive || /* gateway is down */
-		    ((lp->lp_ping_feats & LNET_PING_FEAT_NI_STATUS) != 0 &&
-		     rtr->lr_downis != 0)) /* NI to target is down */
+		if (!lnet_is_route_alive(route))
 			continue;
 
 		if (ni != NULL && lp->lp_ni != ni)
@@ -1153,28 +1152,29 @@ lnet_find_route_locked(lnet_ni_t *ni, lnet_nid_t target, lnet_nid_t rtr_nid)
 			return lp;
 
 		if (lp_best == NULL) {
-			rtr_best = rtr_last = rtr;
+			best_route = route;
+			last_route = route;
 			lp_best = lp;
 			continue;
 		}
 
 		/* no protection on below fields, but it's harmless */
-		if (rtr_last->lr_seq - rtr->lr_seq < 0)
-			rtr_last = rtr;
+		if (last_route->lr_seq - route->lr_seq < 0)
+			last_route = route;
 
-		rc = lnet_compare_routes(rtr, rtr_best);
+		rc = lnet_compare_routes(route, best_route);
 		if (rc < 0)
 			continue;
 
-		rtr_best = rtr;
+		best_route = route;
 		lp_best = lp;
 	}
 
 	/* set sequence number on the best router to the latest sequence + 1
 	 * so we can round-robin all routers, it's race and inaccurate but
 	 * harmless and functional  */
-	if (rtr_best != NULL)
-		rtr_best->lr_seq = rtr_last->lr_seq + 1;
+	if (best_route)
+		best_route->lr_seq = last_route->lr_seq + 1;
 	return lp_best;
 }
 
diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
index 396c7c4..af7423f 100644
--- a/drivers/staging/lustre/lnet/lnet/router_proc.c
+++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
@@ -240,7 +240,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
 			unsigned int hops = route->lr_hops;
 			unsigned int priority = route->lr_priority;
 			lnet_nid_t nid = route->lr_gateway->lp_nid;
-			int alive = route->lr_gateway->lp_alive;
+			int alive = lnet_is_route_alive(route);
 
 			s += snprintf(s, tmpstr + tmpsiz - s,
 				      "%-8s %4u %8u %7s %s\n",
-- 
1.7.1



* [PATCH 04/40] staging: lustre: fix failure handle of create reply
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (2 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 03/40] staging: lustre: reflect down routes in /proc/sys/lnet/routes James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 05/40] staging: lustre: eliminate obsolete Cray SeaStar support James Simmons
                   ` (36 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Liang Zhen

From: Liang Zhen <liang.zhen@intel.com>

The error handler of lnet_create_reply_msg() didn't release
lnet_res_lock if lnet_msg_alloc() failed.
Fix this by moving the validation check of msg out of the locked
section.
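
A minimal user-space sketch of the same ordering, assuming a toy lock
that only counts how often it is held. The names res_lock(),
res_unlock(), struct msg and create_reply() are hypothetical and are
not the LNet functions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock for illustration: a counter of how often it is held. */
static int res_lock_depth;

static void res_lock(void)
{
	res_lock_depth++;
}

static void res_unlock(void)
{
	assert(res_lock_depth > 0);
	res_lock_depth--;
}

/* Hypothetical message type. */
struct msg {
	int ready;
};

/* Returns 0 on success, -1 if the reply is dropped. */
static int create_reply(struct msg *msg, int md_active)
{
	/* Check the allocation result before taking the lock, so this
	 * error path has nothing to release. */
	if (!msg)
		return -1;

	res_lock();
	if (!md_active) {
		/* error paths after this point must unlock */
		res_unlock();
		return -1;
	}
	msg->ready = 1;
	res_unlock();
	return 0;
}
```

With the check placed first, the lock is balanced on every return
path, including the one taken when the allocation failed.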

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2745
Reviewed-on: http://review.whamcloud.com/5542
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/lib-move.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index c56de44..03fcddc 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -2160,17 +2160,17 @@ lnet_create_reply_msg(lnet_ni_t *ni, lnet_msg_t *getmsg)
 	LASSERT(!getmsg->msg_target_is_router);
 	LASSERT(!getmsg->msg_routing);
 
-	cpt = lnet_cpt_of_cookie(getmd->md_lh.lh_cookie);
-	lnet_res_lock(cpt);
-
-	LASSERT(getmd->md_refcount > 0);
-
 	if (msg == NULL) {
 		CERROR("%s: Dropping REPLY from %s: can't allocate msg\n",
-			libcfs_nid2str(ni->ni_nid), libcfs_id2str(peer_id));
+		       libcfs_nid2str(ni->ni_nid), libcfs_id2str(peer_id));
 		goto drop;
 	}
 
+	cpt = lnet_cpt_of_cookie(getmd->md_lh.lh_cookie);
+	lnet_res_lock(cpt);
+
+	LASSERT(getmd->md_refcount > 0);
+
 	if (getmd->md_threshold == 0) {
 		CERROR("%s: Dropping REPLY from %s for inactive MD %p\n",
 			libcfs_nid2str(ni->ni_nid), libcfs_id2str(peer_id),
-- 
1.7.1



* [PATCH 05/40] staging: lustre: eliminate obsolete Cray SeaStar support
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (3 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 04/40] staging: lustre: fix failure handle of create reply James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 06/40] staging: lustre: remove uses of IS_ERR_VALUE() James Simmons
                   ` (35 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, James Simmons, James Simmons

Remove the bulk of the code for the no-longer-supported
SeaStar interconnect found on older Cray systems.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1422
Reviewed-on: http://review.whamcloud.com/7469
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Chuck Fossen <chuckf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/router.c |   12 ++----------
 1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 4ea651c..fa61ec9 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -631,7 +631,6 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 		return; /* can't carry NI status info */
 
 	list_for_each_entry(rtr, &gw->lp_routes, lr_gwlist) {
-		int ptl_status = LNET_NI_STATUS_INVALID;
 		int down = 0;
 		int up = 0;
 		int i;
@@ -651,10 +650,7 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 				continue;
 
 			if (stat->ns_status == LNET_NI_STATUS_DOWN) {
-				if (LNET_NETTYP(LNET_NIDNET(nid)) != PTLLND)
-					down++;
-				else if (ptl_status != LNET_NI_STATUS_UP)
-					ptl_status = LNET_NI_STATUS_DOWN;
+				down++;
 				continue;
 			}
 
@@ -663,10 +659,6 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 					up = 1;
 					break;
 				}
-				/* ptl NIs are considered down only when
-				 * they're all down */
-				if (LNET_NETTYP(LNET_NIDNET(nid)) == PTLLND)
-					ptl_status = LNET_NI_STATUS_UP;
 				continue;
 			}
 
@@ -680,7 +672,7 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 			rtr->lr_downis = 0;
 			continue;
 		}
-		rtr->lr_downis = down + (ptl_status == LNET_NI_STATUS_DOWN);
+		rtr->lr_downis = down;
 	}
 }
 
-- 
1.7.1



* [PATCH 06/40] staging: lustre: remove uses of IS_ERR_VALUE()
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (4 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 05/40] staging: lustre: eliminate obsolete Cray SeaStar support James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-21 18:45   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 07/40] staging: lustre: return +ve for blocked lnet message James Simmons
                   ` (34 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, John L. Hammond

From: John L. Hammond <john.hammond@intel.com>

Remove most uses of IS_ERR_VALUE(). This macro was often given an int
argument coming from PTR_ERR(). This invokes implementation-defined
behavior, since the long value obtained by applying PTR_ERR() to a
kernel pointer will usually not be representable as an int. Moreover,
it may be just plain wrong to do this, since the expressions IS_ERR(p)
and IS_ERR_VALUE((int) PTR_ERR(p)) are not equivalent for a general
pointer p.
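
The non-equivalence can be sketched in user space. The helpers below
are illustrative re-implementations, not the kernel macros, and the
sketch assumes a 64-bit long plus a compiler that truncates on
long-to-int conversion (as GCC and Clang do).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space copy of the IS_ERR_VALUE() test, for illustration. */
static int is_err_value(unsigned long x)
{
	return x >= (unsigned long)-MAX_ERRNO;
}

/* Equivalent of IS_ERR(p): test the full pointer value. */
static int is_err(const void *p)
{
	return is_err_value((unsigned long)(uintptr_t)p);
}

/* Equivalent of PTR_ERR(p). */
static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* The pattern the patch removes: squeeze PTR_ERR() into an int
 * first, then test the sign-extended result. */
static int is_err_via_int(const void *p)
{
	int rc = (int)ptr_err(p);	/* implementation-defined if out of range */

	return is_err_value((unsigned long)rc);
}
```

Under those assumptions, a valid pointer whose low 32 bits happen to
look like a small negative number (for example 0xffff8800fffffff0) is
reported as an error by is_err_via_int() but not by is_err(). Testing
the task pointer with IS_ERR() directly, as the diff below does, avoids
this.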

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3498
Reviewed-on: http://review.whamcloud.com/6759
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/acceptor.c      |    9 ++++---
 drivers/staging/lustre/lnet/lnet/router.c        |    7 +++--
 drivers/staging/lustre/lustre/libcfs/tracefile.c |    6 +++-
 drivers/staging/lustre/lustre/llite/statahead.c  |    8 ++++--
 drivers/staging/lustre/lustre/mdc/mdc_request.c  |   18 +++++++++----
 drivers/staging/lustre/lustre/mgc/mgc_request.c  |    9 ++++---
 drivers/staging/lustre/lustre/obdclass/llog.c    |   13 ++++++----
 drivers/staging/lustre/lustre/ptlrpc/pinger.c    |   10 ++++---
 drivers/staging/lustre/lustre/ptlrpc/service.c   |   29 +++++++++++++---------
 9 files changed, 66 insertions(+), 43 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index 92ca1dd..d05754d 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -436,6 +436,7 @@ accept2secure(const char *acc, long *sec)
 int
 lnet_acceptor_start(void)
 {
+	struct task_struct *task;
 	int rc;
 	long rc2;
 	long secure;
@@ -454,10 +455,10 @@ lnet_acceptor_start(void)
 	if (lnet_count_acceptor_nis() == 0)  /* not required */
 		return 0;
 
-	rc2 = PTR_ERR(kthread_run(lnet_acceptor,
-				  (void *)(ulong_ptr_t)secure,
-				  "acceptor_%03ld", secure));
-	if (IS_ERR_VALUE(rc2)) {
+	task = kthread_run(lnet_acceptor, (void *)(ulong_ptr_t)secure,
+			   "acceptor_%03ld", secure);
+	if (IS_ERR(task)) {
+		rc2 = PTR_ERR(task);
 		CERROR("Can't start acceptor thread: %ld\n", rc2);
 
 		return -ESRCH;
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index fa61ec9..47f80aa 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -993,6 +993,7 @@ lnet_ping_router_locked(lnet_peer_t *rtr)
 int
 lnet_router_checker_start(void)
 {
+	struct task_struct *task;
 	int rc;
 	int eqsz;
 
@@ -1021,9 +1022,9 @@ lnet_router_checker_start(void)
 	}
 
 	the_lnet.ln_rc_state = LNET_RC_STATE_RUNNING;
-	rc = PTR_ERR(kthread_run(lnet_router_checker,
-				 NULL, "router_checker"));
-	if (IS_ERR_VALUE(rc)) {
+	task = kthread_run(lnet_router_checker, NULL, "router_checker");
+	if (IS_ERR(task)) {
+		rc = PTR_ERR(task);
 		CERROR("Can't start router checker thread: %d\n", rc);
 		/* block until event callback signals exit */
 		down(&the_lnet.ln_rc_signal);
diff --git a/drivers/staging/lustre/lustre/libcfs/tracefile.c b/drivers/staging/lustre/lustre/libcfs/tracefile.c
index 65c4f1a..bc5d0ee 100644
--- a/drivers/staging/lustre/lustre/libcfs/tracefile.c
+++ b/drivers/staging/lustre/lustre/libcfs/tracefile.c
@@ -1056,6 +1056,7 @@ end_loop:
 int cfs_trace_start_thread(void)
 {
 	struct tracefiled_ctl *tctl = &trace_tctl;
+	struct task_struct *task;
 	int rc = 0;
 
 	mutex_lock(&cfs_trace_thread_mutex);
@@ -1067,8 +1068,9 @@ int cfs_trace_start_thread(void)
 	init_waitqueue_head(&tctl->tctl_waitq);
 	atomic_set(&tctl->tctl_shutdown, 0);
 
-	if (IS_ERR(kthread_run(tracefiled, tctl, "ktracefiled"))) {
-		rc = -ECHILD;
+	task = kthread_run(tracefiled, tctl, "ktracefiled");
+	if (IS_ERR(task)) {
+		rc = PTR_ERR(task);
 		goto out;
 	}
 
diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c
index 18f5f2b..e17daf8 100644
--- a/drivers/staging/lustre/lustre/llite/statahead.c
+++ b/drivers/staging/lustre/lustre/llite/statahead.c
@@ -1512,6 +1512,7 @@ int do_statahead_enter(struct inode *dir, struct dentry **dentryp,
 	struct ll_sa_entry       *entry;
 	struct ptlrpc_thread     *thread;
 	struct l_wait_info	lwi   = { 0 };
+	struct task_struct *task;
 	int		       rc    = 0;
 	struct ll_inode_info     *plli;
 
@@ -1670,10 +1671,11 @@ int do_statahead_enter(struct inode *dir, struct dentry **dentryp,
 	lli->lli_sai = sai;
 
 	plli = ll_i2info(d_inode(parent));
-	rc = PTR_ERR(kthread_run(ll_statahead_thread, parent,
-				 "ll_sa_%u", plli->lli_opendir_pid));
+	task = kthread_run(ll_statahead_thread, parent, "ll_sa_%u",
+			   plli->lli_opendir_pid);
 	thread = &sai->sai_thread;
-	if (IS_ERR_VALUE(rc)) {
+	if (IS_ERR(task)) {
+		rc = PTR_ERR(task);
 		CERROR("can't start ll_sa thread, rc: %d\n", rc);
 		dput(parent);
 		lli->lli_opendir_key = NULL;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 294c050..ef25ccd 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1560,6 +1560,7 @@ static int mdc_ioc_changelog_send(struct obd_device *obd,
 				  struct ioc_changelog *icc)
 {
 	struct changelog_show *cs;
+	struct task_struct *task;
 	int rc;
 
 	/* Freed in mdc_changelog_send_thread */
@@ -1577,15 +1578,20 @@ static int mdc_ioc_changelog_send(struct obd_device *obd,
 	 * New thread because we should return to user app before
 	 * writing into our pipe
 	 */
-	rc = PTR_ERR(kthread_run(mdc_changelog_send_thread, cs,
-				 "mdc_clg_send_thread"));
-	if (!IS_ERR_VALUE(rc)) {
-		CDEBUG(D_CHANGELOG, "start changelog thread\n");
-		return 0;
+	task = kthread_run(mdc_changelog_send_thread, cs,
+			   "mdc_clg_send_thread");
+	if (IS_ERR(task)) {
+		rc = PTR_ERR(task);
+		CERROR("%s: can't start changelog thread: rc = %d\n",
+		       obd->obd_name, rc);
+		kfree(cs);
+	} else {
+		rc = 0;
+		CDEBUG(D_CHANGELOG, "%s: started changelog thread\n",
+		       obd->obd_name);
 	}
 
 	CERROR("Failed to start changelog thread: %d\n", rc);
-	kfree(cs);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 2c48847..1395a1a 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -711,6 +711,7 @@ static int mgc_cleanup(struct obd_device *obd)
 static int mgc_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 {
 	struct lprocfs_static_vars lvars = { NULL };
+	struct task_struct *task;
 	int rc;
 
 	ptlrpcd_addref();
@@ -734,10 +735,10 @@ static int mgc_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 		init_waitqueue_head(&rq_waitq);
 
 		/* start requeue thread */
-		rc = PTR_ERR(kthread_run(mgc_requeue_thread, NULL,
-					     "ll_cfg_requeue"));
-		if (IS_ERR_VALUE(rc)) {
-			CERROR("%s: Cannot start requeue thread (%d),no more log updates!\n",
+		task = kthread_run(mgc_requeue_thread, NULL, "ll_cfg_requeue");
+		if (IS_ERR(task)) {
+			rc = PTR_ERR(task);
+			CERROR("%s: cannot start requeue thread: rc = %d; no more log updates\n",
 			       obd->obd_name, rc);
 			goto err_cleanup;
 		}
diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c b/drivers/staging/lustre/lustre/obdclass/llog.c
index 7cb55ef..741c258 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog.c
@@ -376,17 +376,19 @@ int llog_process_or_fork(const struct lu_env *env,
 	lpi->lpi_catdata   = catdata;
 
 	if (fork) {
+		struct task_struct *task;
+
 		/* The new thread can't use parent env,
 		 * init the new one in llog_process_thread_daemonize. */
 		lpi->lpi_env = NULL;
 		init_completion(&lpi->lpi_completion);
-		rc = PTR_ERR(kthread_run(llog_process_thread_daemonize, lpi,
-					     "llog_process_thread"));
-		if (IS_ERR_VALUE(rc)) {
+		task = kthread_run(llog_process_thread_daemonize, lpi,
+				   "llog_process_thread");
+		if (IS_ERR(task)) {
+			rc = PTR_ERR(task);
 			CERROR("%s: cannot start thread: rc = %d\n",
 			       loghandle->lgh_ctxt->loc_obd->obd_name, rc);
-			kfree(lpi);
-			return rc;
+			goto out_lpi;
 		}
 		wait_for_completion(&lpi->lpi_completion);
 	} else {
@@ -394,6 +396,7 @@ int llog_process_or_fork(const struct lu_env *env,
 		llog_process_thread(lpi);
 	}
 	rc = lpi->lpi_rc;
+out_lpi:
 	kfree(lpi);
 	return rc;
 }
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pinger.c b/drivers/staging/lustre/lustre/ptlrpc/pinger.c
index 5c719f1..a94265a 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pinger.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pinger.c
@@ -293,6 +293,7 @@ static struct ptlrpc_thread pinger_thread;
 int ptlrpc_start_pinger(void)
 {
 	struct l_wait_info lwi = { 0 };
+	struct task_struct *task;
 	int rc;
 
 	if (!thread_is_init(&pinger_thread) &&
@@ -303,10 +304,11 @@ int ptlrpc_start_pinger(void)
 
 	strcpy(pinger_thread.t_name, "ll_ping");
 
-	rc = PTR_ERR(kthread_run(ptlrpc_pinger_main, &pinger_thread,
-				 "%s", pinger_thread.t_name));
-	if (IS_ERR_VALUE(rc)) {
-		CERROR("cannot start thread: %d\n", rc);
+	task = kthread_run(ptlrpc_pinger_main, &pinger_thread,
+			   pinger_thread.t_name);
+	if (IS_ERR(task)) {
+		rc = PTR_ERR(task);
+		CERROR("cannot start pinger thread: rc = %d\n", rc);
 		return rc;
 	}
 	l_wait_event(pinger_thread.t_ctl_waitq,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/service.c b/drivers/staging/lustre/lustre/ptlrpc/service.c
index f45898f..5d02055 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/service.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/service.c
@@ -2255,24 +2255,27 @@ static int ptlrpc_start_hr_threads(void)
 
 		for (j = 0; j < hrp->hrp_nthrs; j++) {
 			struct	ptlrpc_hr_thread *hrt = &hrp->hrp_thrs[j];
+			struct task_struct *task;
 
-			rc = PTR_ERR(kthread_run(ptlrpc_hr_main,
+			task = kthread_run(ptlrpc_hr_main,
 						 &hrp->hrp_thrs[j],
 						 "ptlrpc_hr%02d_%03d",
 						 hrp->hrp_cpt,
-						 hrt->hrt_id));
-			if (IS_ERR_VALUE(rc))
+						 hrt->hrt_id);
+			if (IS_ERR(task)) {
+				rc = PTR_ERR(task);
 				break;
+			}
 		}
 		wait_event(ptlrpc_hr.hr_waitq,
 			       atomic_read(&hrp->hrp_nstarted) == j);
-		if (!IS_ERR_VALUE(rc))
-			continue;
 
-		CERROR("Reply handling thread %d:%d Failed on starting: rc = %d\n",
-		       i, j, rc);
-		ptlrpc_stop_hr_threads();
-		return rc;
+		if (rc < 0) {
+			CERROR("cannot start reply handler thread %d:%d: rc = %d\n",
+			       i, j, rc);
+			ptlrpc_stop_hr_threads();
+			return rc;
+		}
 	}
 	return 0;
 }
@@ -2374,6 +2377,7 @@ int ptlrpc_start_thread(struct ptlrpc_service_part *svcpt, int wait)
 	struct l_wait_info lwi = { 0 };
 	struct ptlrpc_thread *thread;
 	struct ptlrpc_service *svc;
+	struct task_struct *task;
 	int rc;
 
 	LASSERT(svcpt != NULL);
@@ -2442,9 +2446,10 @@ int ptlrpc_start_thread(struct ptlrpc_service_part *svcpt, int wait)
 	}
 
 	CDEBUG(D_RPCTRACE, "starting thread '%s'\n", thread->t_name);
-	rc = PTR_ERR(kthread_run(ptlrpc_main, thread, "%s", thread->t_name));
-	if (IS_ERR_VALUE(rc)) {
-		CERROR("cannot start thread '%s': rc %d\n",
+	task = kthread_run(ptlrpc_main, thread, "%s", thread->t_name);
+	if (IS_ERR(task)) {
+		rc = PTR_ERR(task);
+		CERROR("cannot start thread '%s': rc = %d\n",
 		       thread->t_name, rc);
 		spin_lock(&svcpt->scp_lock);
 		--svcpt->scp_nthrs_starting;
-- 
1.7.1



* [PATCH 07/40] staging: lustre: return +ve for blocked lnet message
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (5 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 06/40] staging: lustre: remove uses of IS_ERR_VALUE() James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 08/40] staging: lustre: do not memset after LIBCFS_ALLOC James Simmons
                   ` (33 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Liang Zhen, James Simmons

From: Liang Zhen <liang.zhen@intel.com>

The return values of lnet_post_send_locked() and
lnet_post_routed_recv_locked() were changed to -ve by:
http://review.whamcloud.com/#/c/9369/

This is wrong because callers rely on a +ve value to identify a
blocked message, which is not a failure.

To respect the Linux kernel coding style and avoid positive error
codes, this patch adds two macros as non-error return values of
these functions:
    LNET_CREDIT_OK    has credit for message
    LNET_CREDIT_WAIT  no credit and message is blocked

Both functions now return these two values instead of 0
and EAGAIN.
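
The return convention can be sketched in userspace as follows. This is
a simplified model, not the LNet code: post_send() and send_msg() are
hypothetical stand-ins for lnet_post_send_locked() and lnet_send(),
and the credit check is reduced to a single integer.

```c
#include <errno.h>

#define CREDIT_OK	0	/* mirrors LNET_CREDIT_OK */
#define CREDIT_WAIT	1	/* mirrors LNET_CREDIT_WAIT */

/*
 * Hypothetical stand-in for lnet_post_send_locked(): negative errno on
 * failure, CREDIT_WAIT when the message is queued for credit, and
 * CREDIT_OK when it may be sent now.
 */
static int post_send(int peer_alive, int credits)
{
	if (!peer_alive)
		return -EHOSTUNREACH;
	if (credits <= 0)
		return CREDIT_WAIT;
	return CREDIT_OK;
}

/*
 * Hypothetical caller modelled on lnet_send(): only negative values
 * are errors; a blocked message is reported as success because it
 * will be sent once credit is returned. *sent tells the caller
 * whether the message went out immediately.
 */
static int send_msg(int peer_alive, int credits, int *sent)
{
	int rc = post_send(peer_alive, credits);

	*sent = 0;
	if (rc < 0)
		return rc;
	if (rc == CREDIT_OK)
		*sent = 1;
	return 0;
}
```

A single "rc < 0" test separates failures from the two non-error
outcomes, so the caller no longer has to compare against specific
positive errno values.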

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5151
Reviewed-on: http://review.whamcloud.com/10625
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/lib-move.c |   51 +++++++++++++++++----------
 1 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 03fcddc..021a81d 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -42,6 +42,11 @@
 
 #include "../../include/linux/lnet/lib-lnet.h"
 
+/** lnet message has credit and can be submitted to lnd for send/receive */
+#define LNET_CREDIT_OK		0
+/** lnet message is waiting for credit */
+#define LNET_CREDIT_WAIT	1
+
 static int local_nid_dist_zero = 1;
 module_param(local_nid_dist_zero, int, 0444);
 MODULE_PARM_DESC(local_nid_dist_zero, "Reserved");
@@ -777,10 +782,10 @@ lnet_peer_alive_locked(lnet_peer_t *lp)
  *	  lnet_send() is going to lnet_net_unlock immediately after this, so
  *	  it sets do_send FALSE and I don't do the unlock/send/lock bit.
  *
- * \retval 0 If \a msg sent or OK to send.
- * \retval EAGAIN If \a msg blocked for credit.
- * \retval EHOSTUNREACH If the next hop of the message appears dead.
- * \retval ECANCELED If the MD of the message has been unlinked.
+ * \retval LNET_CREDIT_OK If \a msg sent or OK to send.
+ * \retval LNET_CREDIT_WAIT If \a msg blocked for credit.
+ * \retval -EHOSTUNREACH If the next hop of the message appears dead.
+ * \retval -ECANCELED If the MD of the message has been unlinked.
  */
 static int
 lnet_post_send_locked(lnet_msg_t *msg, int do_send)
@@ -808,7 +813,7 @@ lnet_post_send_locked(lnet_msg_t *msg, int do_send)
 			lnet_finalize(ni, msg, -EHOSTUNREACH);
 
 		lnet_net_lock(cpt);
-		return EHOSTUNREACH;
+		return -EHOSTUNREACH;
 	}
 
 	if (msg->msg_md != NULL &&
@@ -821,7 +826,7 @@ lnet_post_send_locked(lnet_msg_t *msg, int do_send)
 			lnet_finalize(ni, msg, -ECANCELED);
 
 		lnet_net_lock(cpt);
-		return ECANCELED;
+		return -ECANCELED;
 	}
 
 	if (!msg->msg_peertxcredit) {
@@ -838,7 +843,7 @@ lnet_post_send_locked(lnet_msg_t *msg, int do_send)
 		if (lp->lp_txcredits < 0) {
 			msg->msg_tx_delayed = 1;
 			list_add_tail(&msg->msg_list, &lp->lp_txq);
-			return EAGAIN;
+			return LNET_CREDIT_WAIT;
 		}
 	}
 
@@ -855,7 +860,7 @@ lnet_post_send_locked(lnet_msg_t *msg, int do_send)
 		if (tq->tq_credits < 0) {
 			msg->msg_tx_delayed = 1;
 			list_add_tail(&msg->msg_list, &tq->tq_delayed);
-			return EAGAIN;
+			return LNET_CREDIT_WAIT;
 		}
 	}
 
@@ -864,7 +869,7 @@ lnet_post_send_locked(lnet_msg_t *msg, int do_send)
 		lnet_ni_send(ni, msg);
 		lnet_net_lock(cpt);
 	}
-	return 0;
+	return LNET_CREDIT_OK;
 }
 
 static lnet_rtrbufpool_t *
@@ -891,8 +896,10 @@ static int
 lnet_post_routed_recv_locked(lnet_msg_t *msg, int do_recv)
 {
 	/* lnet_parse is going to lnet_net_unlock immediately after this, so it
-	 * sets do_recv FALSE and I don't do the unlock/send/lock bit.  I
-	 * return EAGAIN if msg blocked and 0 if received or OK to receive */
+	 * sets do_recv FALSE and I don't do the unlock/send/lock bit.
+	 * I return LNET_CREDIT_WAIT if msg blocked and LNET_CREDIT_OK if
+	 * received or OK to receive
+	 */
 	lnet_peer_t *lp = msg->msg_rxpeer;
 	lnet_rtrbufpool_t *rbp;
 	lnet_rtrbuf_t *rb;
@@ -921,7 +928,7 @@ lnet_post_routed_recv_locked(lnet_msg_t *msg, int do_recv)
 			LASSERT(msg->msg_rx_ready_delay);
 			msg->msg_rx_delayed = 1;
 			list_add_tail(&msg->msg_list, &lp->lp_rtrq);
-			return EAGAIN;
+			return LNET_CREDIT_WAIT;
 		}
 	}
 
@@ -941,7 +948,7 @@ lnet_post_routed_recv_locked(lnet_msg_t *msg, int do_recv)
 			LASSERT(msg->msg_rx_ready_delay);
 			msg->msg_rx_delayed = 1;
 			list_add_tail(&msg->msg_list, &rbp->rbp_msgs);
-			return EAGAIN;
+			return LNET_CREDIT_WAIT;
 		}
 	}
 
@@ -960,7 +967,7 @@ lnet_post_routed_recv_locked(lnet_msg_t *msg, int do_recv)
 			     0, msg->msg_len, msg->msg_len);
 		lnet_net_lock(cpt);
 	}
-	return 0;
+	return LNET_CREDIT_OK;
 }
 
 void
@@ -1340,13 +1347,13 @@ lnet_send(lnet_nid_t src_nid, lnet_msg_t *msg, lnet_nid_t rtr_nid)
 	rc = lnet_post_send_locked(msg, 0);
 	lnet_net_unlock(cpt);
 
-	if (rc == EHOSTUNREACH || rc == ECANCELED)
-		return -rc;
+	if (rc < 0)
+		return rc;
 
-	if (rc == 0)
+	if (rc == LNET_CREDIT_OK)
 		lnet_ni_send(src_ni, msg);
 
-	return 0; /* rc == 0 or EAGAIN */
+	return 0; /* rc == LNET_CREDIT_OK or LNET_CREDIT_WAIT */
 }
 
 static void
@@ -1608,6 +1615,11 @@ lnet_parse_ack(lnet_ni_t *ni, lnet_msg_t *msg)
 	return 0;
 }
 
+/**
+ * \retval LNET_CREDIT_OK	If \a msg is forwarded
+ * \retval LNET_CREDIT_WAIT	If \a msg is blocked because w/o buffer
+ * \retval -ve			error code
+ */
 static int
 lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg)
 {
@@ -1897,7 +1909,8 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 
 		if (rc < 0)
 			goto free_drop;
-		if (rc == 0) {
+
+		if (rc == LNET_CREDIT_OK) {
 			lnet_ni_recv(ni, msg->msg_private, msg, 0,
 				     0, payload_length, payload_length);
 		}
-- 
1.7.1



* [PATCH 08/40] staging: lustre: do not memset after LIBCFS_ALLOC
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (6 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 07/40] staging: lustre: return +ve for blocked lnet message James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 09/40] staging: lustre: Dynamic LNet Configuration (DLC) James Simmons
                   ` (32 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Frank Zago

From: Frank Zago <fzago@cray.com>

LIBCFS_ALLOC already zeroes the memory it allocates, so there is no
need to zero it again.

Signed-off-by: Frank Zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5304
Reviewed-on: http://review.whamcloud.com/11012
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index 0f4154c..f3cbc3b 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -335,8 +335,6 @@ int kiblnd_create_peer(lnet_ni_t *ni, kib_peer_t **peerp, lnet_nid_t nid)
 		return -ENOMEM;
 	}
 
-	memset(peer, 0, sizeof(*peer));	 /* zero flags etc */
-
 	peer->ibp_ni = ni;
 	peer->ibp_nid = nid;
 	peer->ibp_error = 0;
-- 
1.7.1



* [PATCH 09/40] staging: lustre: Dynamic LNet Configuration (DLC)
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (7 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 08/40] staging: lustre: do not memset after LIBCFS_ALLOC James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 10/40] staging: lustre: Dynamic LNet Configuration (DLC) dynamic routing James Simmons
                   ` (31 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This is the first patch of a set of patches that enables DLC.

This patch adds some cleanup in config.c as well as some
preparatory changes in peer.c to enable dynamic network
configuration.
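
One part of the config.c cleanup is that each interface name is copied
into its own allocation, so the temporary token buffer can be freed
when parsing finishes. A minimal userspace sketch of that idea, using
calloc() in place of LIBCFS_ALLOC and a hypothetical helper name:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate one interface name out of a scratch
 * token buffer. calloc() zero-fills the allocation, so copying only
 * strlen() bytes still leaves the copy NUL-terminated.
 */
static char *copy_iface_name(const char *iface)
{
	size_t len = strlen(iface);
	char *copy = calloc(len + 1, 1);

	if (!copy)
		return NULL;
	memcpy(copy, iface, len);
	return copy;
}
```

Once every name has been copied this way, the caller can release the
token buffer immediately and free each copy when the owning structure
is torn down.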

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
Change-Id: I8c8bbf3b55acf4d76f22a8be587b553a70d31889
Reviewed-on: http://review.whamcloud.com/9830
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    2 +-
 .../staging/lustre/include/linux/lnet/lib-types.h  |    5 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    8 +-
 drivers/staging/lustre/lnet/lnet/config.c          |   29 ++++-
 drivers/staging/lustre/lnet/lnet/peer.c            |  134 ++++++++++++++------
 5 files changed, 124 insertions(+), 54 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 09c6bfe..1e0b236 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -685,7 +685,7 @@ int lnet_parse_networks(struct list_head *nilist, char *networks);
 int lnet_nid2peer_locked(lnet_peer_t **lpp, lnet_nid_t nid, int cpt);
 lnet_peer_t *lnet_find_peer_locked(struct lnet_peer_table *ptable,
 				   lnet_nid_t nid);
-void lnet_peer_tables_cleanup(void);
+void lnet_peer_tables_cleanup(lnet_ni_t *ni);
 void lnet_peer_tables_destroy(void);
 int lnet_peer_tables_create(void);
 void lnet_debug_peer(lnet_nid_t nid);
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index d792c4a..39381d9 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -343,6 +343,8 @@ typedef struct lnet_peer {
 struct lnet_peer_table {
 	int			 pt_version;	/* /proc validity stamp */
 	int			 pt_number;	/* # peers extant */
+	/* # zombies to go to deathrow (and not there yet) */
+	int			 pt_zombies;
 	struct list_head	 pt_deathrow;	/* zombie peers */
 	struct list_head	*pt_hash;	/* NID->peer hash */
 };
@@ -600,9 +602,6 @@ typedef struct {
 	/* registered LNDs */
 	struct list_head		  ln_lnds;
 
-	/* space for network names */
-	char				 *ln_network_tokens;
-	int				  ln_network_tokens_nob;
 	/* test protocol compatibility flags */
 	int				  ln_testprotocompat;
 
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 284150f..f3c9937 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -882,7 +882,7 @@ lnet_shutdown_lndnis(void)
 
 	/* Clear the peer table and wait for all peers to go (they hold refs on
 	 * their NIs) */
-	lnet_peer_tables_cleanup();
+	lnet_peer_tables_cleanup(NULL);
 
 	lnet_net_lock(LNET_LOCK_EX);
 	/* Now wait for the NI's I just nuked to show up on ln_zombie_nis
@@ -939,12 +939,6 @@ lnet_shutdown_lndnis(void)
 
 	the_lnet.ln_shutdown = 0;
 	lnet_net_unlock(LNET_LOCK_EX);
-
-	if (the_lnet.ln_network_tokens != NULL) {
-		LIBCFS_FREE(the_lnet.ln_network_tokens,
-			    the_lnet.ln_network_tokens_nob);
-		the_lnet.ln_network_tokens = NULL;
-	}
 }
 
 static int
diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index 867c96e..7bb140b 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -96,6 +96,8 @@ lnet_net_unique(__u32 net, struct list_head *nilist)
 void
 lnet_ni_free(struct lnet_ni *ni)
 {
+	int i;
+
 	if (ni->ni_refs != NULL)
 		cfs_percpt_free(ni->ni_refs);
 
@@ -105,6 +107,10 @@ lnet_ni_free(struct lnet_ni *ni)
 	if (ni->ni_cpts != NULL)
 		cfs_expr_list_values_free(ni->ni_cpts, ni->ni_ncpts);
 
+	for (i = 0; i < LNET_MAX_INTERFACES && ni->ni_interfaces[i]; i++) {
+		LIBCFS_FREE(ni->ni_interfaces[i],
+			    strlen(ni->ni_interfaces[i]) + 1);
+	}
 	LIBCFS_FREE(ni, sizeof(*ni));
 }
 
@@ -199,8 +205,6 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 		return -ENOMEM;
 	}
 
-	the_lnet.ln_network_tokens = tokens;
-	the_lnet.ln_network_tokens_nob = tokensize;
 	memcpy(tokens, networks, tokensize);
 	str = tmp = tokens;
 
@@ -319,7 +323,23 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 				goto failed;
 			}
 
-			ni->ni_interfaces[niface++] = iface;
+			/*
+			 * Allocate a separate piece of memory and copy
+			 * into it the string, so we don't have
+			 * a depencency on the tokens string.  This way we
+			 * can free the tokens at the end of the function.
+			 * The newly allocated ni_interfaces[] can be
+			 * freed when freeing the NI
+			 */
+			LIBCFS_ALLOC(ni->ni_interfaces[niface],
+				     strlen(iface) + 1);
+			if (!ni->ni_interfaces[niface]) {
+				CERROR("Can't allocate net interface name\n");
+				goto failed;
+			}
+			strncpy(ni->ni_interfaces[niface], iface,
+				strlen(iface));
+			niface++;
 			iface = comma;
 		} while (iface != NULL);
 
@@ -344,6 +364,8 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 	}
 
 	LASSERT(!list_empty(nilist));
+
+	LIBCFS_FREE(tokens, tokensize);
 	return 0;
 
  failed_syntax:
@@ -360,7 +382,6 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 		cfs_expr_list_free(el);
 
 	LIBCFS_FREE(tokens, tokensize);
-	the_lnet.ln_network_tokens = NULL;
 
 	return -EINVAL;
 }
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index 1fceed3..bb5a0bb 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -103,62 +103,116 @@ lnet_peer_tables_destroy(void)
 	the_lnet.ln_peer_tables = NULL;
 }
 
+static void
+lnet_peer_table_cleanup_locked(lnet_ni_t *ni, struct lnet_peer_table *ptable)
+{
+	int i;
+	lnet_peer_t *lp;
+	lnet_peer_t *tmp;
+
+	for (i = 0; i < LNET_PEER_HASH_SIZE; i++) {
+		list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i],
+					 lp_hashlist) {
+			if (ni && ni != lp->lp_ni)
+				continue;
+			list_del_init(&lp->lp_hashlist);
+			/* Lose hash table's ref */
+			ptable->pt_zombies++;
+			lnet_peer_decref_locked(lp);
+		}
+	}
+}
+
+static void
+lnet_peer_table_deathrow_wait_locked(struct lnet_peer_table *ptable,
+				     int cpt_locked)
+{
+	int i;
+
+	for (i = 3; ptable->pt_zombies != 0; i++) {
+		lnet_net_unlock(cpt_locked);
+
+		if (IS_PO2(i)) {
+			CDEBUG(D_WARNING,
+			       "Waiting for %d zombies on peer table\n",
+			       ptable->pt_zombies);
+		}
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		schedule_timeout(cfs_time_seconds(1) >> 1);
+		lnet_net_lock(cpt_locked);
+	}
+}
+
+static void
+lnet_peer_table_del_rtrs_locked(lnet_ni_t *ni, struct lnet_peer_table *ptable,
+				int cpt_locked)
+{
+	lnet_peer_t *lp;
+	lnet_peer_t *tmp;
+	lnet_nid_t lp_nid;
+	int i;
+
+	for (i = 0; i < LNET_PEER_HASH_SIZE; i++) {
+		list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i],
+					 lp_hashlist) {
+			if (ni != lp->lp_ni)
+				continue;
+
+			if (lp->lp_rtr_refcount == 0)
+				continue;
+
+			lp_nid = lp->lp_nid;
+
+			lnet_net_unlock(cpt_locked);
+			lnet_del_route(LNET_NIDNET(LNET_NID_ANY), lp_nid);
+			lnet_net_lock(cpt_locked);
+		}
+	}
+}
+
 void
-lnet_peer_tables_cleanup(void)
+lnet_peer_tables_cleanup(lnet_ni_t *ni)
 {
 	struct lnet_peer_table *ptable;
+	struct list_head deathrow;
+	lnet_peer_t *lp;
 	int i;
-	int j;
 
-	LASSERT(the_lnet.ln_shutdown);	/* i.e. no new peers */
+	INIT_LIST_HEAD(&deathrow);
 
+	LASSERT(the_lnet.ln_shutdown || ni);
+	/*
+	 * If just deleting the peers for a NI, get rid of any routes these
+	 * peers are gateways for.
+	 */
 	cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) {
 		lnet_net_lock(i);
-
-		for (j = 0; j < LNET_PEER_HASH_SIZE; j++) {
-			struct list_head *peers = &ptable->pt_hash[j];
-
-			while (!list_empty(peers)) {
-				lnet_peer_t *lp = list_entry(peers->next,
-								 lnet_peer_t,
-								 lp_hashlist);
-				list_del_init(&lp->lp_hashlist);
-				/* lose hash table's ref */
-				lnet_peer_decref_locked(lp);
-			}
-		}
-
+		lnet_peer_table_del_rtrs_locked(ni, ptable, i);
 		lnet_net_unlock(i);
 	}
 
+	/*
+	 * Start the process of moving the applicable peers to
+	 * deathrow.
+	 */
 	cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) {
-		LIST_HEAD(deathrow);
-		lnet_peer_t *lp;
-
 		lnet_net_lock(i);
+		lnet_peer_table_cleanup_locked(ni, ptable);
+		lnet_net_unlock(i);
+	}
 
-		for (j = 3; ptable->pt_number != 0; j++) {
-			lnet_net_unlock(i);
-
-			if ((j & (j - 1)) == 0) {
-				CDEBUG(D_WARNING,
-				       "Waiting for %d peers on peer table\n",
-				       ptable->pt_number);
-			}
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(cfs_time_seconds(1) / 2);
-			lnet_net_lock(i);
-		}
+	/* Cleanup all entries on deathrow. */
+	cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) {
+		lnet_net_lock(i);
+		lnet_peer_table_deathrow_wait_locked(ptable, i);
 		list_splice_init(&ptable->pt_deathrow, &deathrow);
-
 		lnet_net_unlock(i);
+	}
 
-		while (!list_empty(&deathrow)) {
-			lp = list_entry(deathrow.next,
-					    lnet_peer_t, lp_hashlist);
-			list_del(&lp->lp_hashlist);
-			LIBCFS_FREE(lp, sizeof(*lp));
-		}
+	while (!list_empty(&deathrow)) {
+		lp = list_entry(deathrow.next, lnet_peer_t, lp_hashlist);
+		list_del(&lp->lp_hashlist);
+		LIBCFS_FREE(lp, sizeof(*lp));
 	}
 }
 
@@ -181,6 +235,8 @@ lnet_destroy_peer_locked(lnet_peer_t *lp)
 	lp->lp_ni = NULL;
 
 	list_add(&lp->lp_hashlist, &ptable->pt_deathrow);
+	LASSERT(ptable->pt_zombies > 0);
+	ptable->pt_zombies--;
 }
 
 lnet_peer_t *
-- 
1.7.1



* [PATCH 10/40] staging: lustre: Dynamic LNet Configuration (DLC) dynamic routing
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (8 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 09/40] staging: lustre: Dynamic LNet Configuration (DLC) James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 11/40] staging: lustre: DLC Feature dynamic net config James Simmons
                   ` (30 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This is the second patch of a set of patches that enables DLC.

This patch adds the following features to LNET.  Currently these
features are not driven by user space.
- Enable routing on demand.  The default number of router
  buffers is allocated.
- Disable routing on demand.  Unused router buffers are freed
  immediately, and in-use router buffers are freed once they are
  no longer in use.  The next time routing is enabled, the default
  router buffer values are used.  It has been decided that
  user-set router buffer values should be remembered and re-set
  by user space scripts.
- Increase the number of router buffers on demand, by allocating
  new ones.
- Decrease the number of router buffers.  Excess buffers are freed
  if they are not in use.  Otherwise they are freed once they are
  no longer in use.
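
The grow/shrink behaviour listed above can be sketched with a small
counter-only model. This is an illustration under stated assumptions,
not the router.c implementation: struct pool and the pool_* functions
are hypothetical, and buffers are represented by counts rather than
real allocations.

```c
/* Hypothetical counter-only model of one router buffer pool. */
struct pool {
	int nbuffers;	/* buffers owned by the pool (idle + in use) */
	int idle;	/* buffers currently on the free list */
	int target;	/* number of buffers the pool should own */
};

/*
 * Set a new target. Growing "allocates" immediately; shrinking
 * releases idle buffers right away and leaves in-use buffers to be
 * released when they come back.
 */
static void pool_resize(struct pool *p, int target)
{
	p->target = target;
	while (p->nbuffers < p->target) {
		p->nbuffers++;
		p->idle++;
	}
	while (p->nbuffers > p->target && p->idle > 0) {
		p->nbuffers--;
		p->idle--;
	}
}

/* Take an idle buffer for use; returns -1 if none is available. */
static int pool_take(struct pool *p)
{
	if (p->idle == 0)
		return -1;
	p->idle--;
	return 0;
}

/*
 * Hand a buffer back. If the pool is over its target the buffer is
 * released instead of being put back on the free list.
 */
static void pool_return(struct pool *p)
{
	if (p->nbuffers > p->target)
		p->nbuffers--;
	else
		p->idle++;
}
```

Disabling routing corresponds to a target of 0 in this model: idle
buffers go at once and busy ones go as they are returned.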

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
Change-Id: Id07d4ad424d8f5ba72475d4149380afe2ac54e77
Reviewed-on: http://review.whamcloud.com/9831
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    8 +-
 .../staging/lustre/include/linux/lnet/lib-types.h  |    8 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    4 +-
 drivers/staging/lustre/lnet/lnet/lib-move.c        |   89 +++++--
 drivers/staging/lustre/lnet/lnet/router.c          |  276 +++++++++++++++-----
 5 files changed, 304 insertions(+), 81 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 1e0b236..60accdf 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -459,7 +459,11 @@ int lnet_get_route(int idx, __u32 *net, __u32 *hops,
 void lnet_router_debugfs_init(void);
 void lnet_router_debugfs_fini(void);
 int  lnet_rtrpools_alloc(int im_a_router);
-void lnet_rtrpools_free(void);
+void lnet_destroy_rtrbuf(lnet_rtrbuf_t *rb, int npages);
+int lnet_rtrpools_adjust(int tiny, int small, int large);
+int lnet_rtrpools_enable(void);
+void lnet_rtrpools_disable(void);
+void lnet_rtrpools_free(int keep_pools);
 lnet_remotenet_t *lnet_find_net_locked(__u32 net);
 
 int lnet_islocalnid(lnet_nid_t nid);
@@ -479,6 +483,8 @@ void lnet_prep_send(lnet_msg_t *msg, int type, lnet_process_id_t target,
 int lnet_send(lnet_nid_t nid, lnet_msg_t *msg, lnet_nid_t rtr_nid);
 void lnet_return_tx_credits_locked(lnet_msg_t *msg);
 void lnet_return_rx_credits_locked(lnet_msg_t *msg);
+void lnet_schedule_blocked_locked(lnet_rtrbufpool_t *rbp);
+void lnet_drop_routed_msgs_locked(struct list_head *list, int cpt);
 
 /* portals functions */
 /* portals attributes */
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 39381d9..e7585b9 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -277,6 +277,7 @@ typedef struct lnet_ni {
 #define LNET_PING_FEAT_INVAL		(0)		/* no feature */
 #define LNET_PING_FEAT_BASE		(1 << 0)	/* just a ping */
 #define LNET_PING_FEAT_NI_STATUS	(1 << 1)	/* return NI status */
+#define LNET_PING_FEAT_RTE_DISABLED	(1 << 2)	/* Routing disabled */
 
 #define LNET_PING_FEAT_MASK		(LNET_PING_FEAT_BASE | \
 					 LNET_PING_FEAT_NI_STATUS)
@@ -400,7 +401,12 @@ typedef struct {
 
 #define LNET_PEER_HASHSIZE	503	/* prime! */
 
-#define LNET_NRBPOOLS		3	/* # different router buffer pools */
+#define LNET_TINY_BUF_IDX	0
+#define LNET_SMALL_BUF_IDX	1
+#define LNET_LARGE_BUF_IDX	2
+
+/* # different router buffer pools */
+#define LNET_NRBPOOLS		(LNET_LARGE_BUF_IDX + 1)
 
 enum {
 	/* Didn't match anything */
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index f3c9937..0338537 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -632,7 +632,7 @@ lnet_unprepare(void)
 
 	lnet_msg_containers_destroy();
 	lnet_peer_tables_destroy();
-	lnet_rtrpools_free();
+	lnet_rtrpools_free(0);
 
 	if (the_lnet.ln_counters != NULL) {
 		cfs_percpt_free(the_lnet.ln_counters);
@@ -1515,6 +1515,8 @@ lnet_create_ping_info(void)
 	pinfo->pi_pid     = the_lnet.ln_pid;
 	pinfo->pi_magic   = LNET_PROTO_PING_MAGIC;
 	pinfo->pi_features = LNET_PING_FEAT_NI_STATUS;
+	if (!the_lnet.ln_routing)
+		pinfo->pi_features |= LNET_PING_FEAT_RTE_DISABLED;
 
 	for (i = 0; i < n; i++) {
 		lnet_ni_status_t *ns = &pinfo->pi_ni[i];
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 021a81d..e1461af 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -935,9 +935,6 @@ lnet_post_routed_recv_locked(lnet_msg_t *msg, int do_recv)
 	rbp = lnet_msg2bufpool(msg);
 
 	if (!msg->msg_rtrcredit) {
-		LASSERT((rbp->rbp_credits < 0) ==
-			 !list_empty(&rbp->rbp_msgs));
-
 		msg->msg_rtrcredit = 1;
 		rbp->rbp_credits--;
 		if (rbp->rbp_credits < rbp->rbp_mincredits)
@@ -1029,6 +1026,43 @@ lnet_return_tx_credits_locked(lnet_msg_t *msg)
 }
 
 void
+lnet_schedule_blocked_locked(lnet_rtrbufpool_t *rbp)
+{
+	lnet_msg_t *msg;
+
+	if (list_empty(&rbp->rbp_msgs))
+		return;
+	msg = list_entry(rbp->rbp_msgs.next,
+			 lnet_msg_t, msg_list);
+	list_del(&msg->msg_list);
+
+	(void)lnet_post_routed_recv_locked(msg, 1);
+}
+
+void
+lnet_drop_routed_msgs_locked(struct list_head *list, int cpt)
+{
+	struct list_head drop;
+	lnet_msg_t *msg;
+	lnet_msg_t *tmp;
+
+	INIT_LIST_HEAD(&drop);
+
+	list_splice_init(list, &drop);
+
+	lnet_net_unlock(cpt);
+
+	list_for_each_entry_safe(msg, tmp, &drop, msg_list) {
+		lnet_ni_recv(msg->msg_rxpeer->lp_ni, msg->msg_private, NULL,
+			     0, 0, 0, msg->msg_hdr.payload_length);
+		list_del_init(&msg->msg_list);
+		lnet_finalize(NULL, msg, -ECANCELED);
+	}
+
+	lnet_net_lock(cpt);
+}
+
+void
 lnet_return_rx_credits_locked(lnet_msg_t *msg)
 {
 	lnet_peer_t *rxpeer = msg->msg_rxpeer;
@@ -1046,27 +1080,41 @@ lnet_return_rx_credits_locked(lnet_msg_t *msg)
 
 		rb = list_entry(msg->msg_kiov, lnet_rtrbuf_t, rb_kiov[0]);
 		rbp = rb->rb_pool;
-		LASSERT(rbp == lnet_msg2bufpool(msg));
 
 		msg->msg_kiov = NULL;
 		msg->msg_rtrcredit = 0;
 
-		LASSERT((rbp->rbp_credits < 0) ==
-			!list_empty(&rbp->rbp_msgs));
+		LASSERT(rbp == lnet_msg2bufpool(msg));
+
 		LASSERT((rbp->rbp_credits > 0) ==
 			!list_empty(&rbp->rbp_bufs));
 
-		list_add(&rb->rb_list, &rbp->rbp_bufs);
-		rbp->rbp_credits++;
-		if (rbp->rbp_credits <= 0) {
-			msg2 = list_entry(rbp->rbp_msgs.next,
-					      lnet_msg_t, msg_list);
-			list_del(&msg2->msg_list);
+		/*
+		 * If routing is now turned off, we just drop this buffer and
+		 * don't bother trying to return credits.
+		 */
+		if (!the_lnet.ln_routing) {
+			lnet_destroy_rtrbuf(rb, rbp->rbp_npages);
+			goto routing_off;
+		}
 
-			(void) lnet_post_routed_recv_locked(msg2, 1);
+		/*
+		 * It is possible that a user has lowered the desired number of
+		 * buffers in this pool.  Make sure we never put back
+		 * more buffers than the stated number.
+		 */
+		if (rbp->rbp_credits >= rbp->rbp_nbuffers) {
+			/* Discard this buffer so we don't have too many. */
+			lnet_destroy_rtrbuf(rb, rbp->rbp_npages);
+		} else {
+			list_add(&rb->rb_list, &rbp->rbp_bufs);
+			rbp->rbp_credits++;
+			if (rbp->rbp_credits <= 0)
+				lnet_schedule_blocked_locked(rbp);
 		}
 	}
 
+routing_off:
 	if (msg->msg_peerrtrcredit) {
 		/* give back peer router credits */
 		msg->msg_peerrtrcredit = 0;
@@ -1075,7 +1123,14 @@ lnet_return_rx_credits_locked(lnet_msg_t *msg)
 			!list_empty(&rxpeer->lp_rtrq));
 
 		rxpeer->lp_rtrcredits++;
-		if (rxpeer->lp_rtrcredits <= 0) {
+		/*
+		 * drop all messages which are queued to be routed on that
+		 * peer.
+		 */
+		if (!the_lnet.ln_routing) {
+			lnet_drop_routed_msgs_locked(&rxpeer->lp_rtrq,
+						     msg->msg_rx_cpt);
+		} else if (rxpeer->lp_rtrcredits <= 0) {
 			msg2 = list_entry(rxpeer->lp_rtrq.next,
 					      lnet_msg_t, msg_list);
 			list_del(&msg2->msg_list);
@@ -1625,6 +1680,9 @@ lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg)
 {
 	int rc = 0;
 
+	if (!the_lnet.ln_routing)
+		return -ECANCELED;
+
 	if (msg->msg_rxpeer->lp_rtrcredits <= 0 ||
 	    lnet_msg2bufpool(msg)->rbp_credits <= 0) {
 		if (ni->ni_lnd->lnd_eager_recv == NULL) {
@@ -1780,9 +1838,8 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 
 	if (the_lnet.ln_routing &&
 	    ni->ni_last_alive != ktime_get_real_seconds()) {
-		lnet_ni_lock(ni);
-
 		/* NB: so far here is the only place to set NI status to "up */
+		lnet_ni_lock(ni);
 		ni->ni_last_alive = ktime_get_real_seconds();
 		if (ni->ni_status != NULL &&
 		    ni->ni_status->ns_status == LNET_NI_STATUS_DOWN)
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 47f80aa..749085f 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -28,8 +28,11 @@
 #define LNET_NRB_TINY		(LNET_NRB_TINY_MIN * 4)
 #define LNET_NRB_SMALL_MIN	4096	/* min value for each CPT */
 #define LNET_NRB_SMALL		(LNET_NRB_SMALL_MIN * 4)
+#define LNET_NRB_SMALL_PAGES	1
 #define LNET_NRB_LARGE_MIN	256	/* min value for each CPT */
 #define LNET_NRB_LARGE		(LNET_NRB_LARGE_MIN * 4)
+#define LNET_NRB_LARGE_PAGES   ((LNET_MTU + PAGE_CACHE_SIZE - 1) >> \
+				 PAGE_CACHE_SHIFT)
 
 static char *forwarding = "";
 module_param(forwarding, charp, 0444);
@@ -566,7 +569,8 @@ lnet_get_route(int idx, __u32 *net, __u32 *hops,
 					*hops     = route->lr_hops;
 					*priority = route->lr_priority;
 					*gateway  = route->lr_gateway->lp_nid;
-					*alive    = route->lr_gateway->lp_alive;
+					*alive = route->lr_gateway->lp_alive &&
+						 !route->lr_downis;
 					lnet_net_unlock(cpt);
 					return 0;
 				}
@@ -604,7 +608,7 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 {
 	lnet_ping_info_t *info = rcd->rcd_pinginfo;
 	struct lnet_peer *gw = rcd->rcd_gateway;
-	lnet_route_t *rtr;
+	lnet_route_t *rte;
 
 	if (!gw->lp_alive)
 		return;
@@ -630,11 +634,16 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 	if ((gw->lp_ping_feats & LNET_PING_FEAT_NI_STATUS) == 0)
 		return; /* can't carry NI status info */
 
-	list_for_each_entry(rtr, &gw->lp_routes, lr_gwlist) {
+	list_for_each_entry(rte, &gw->lp_routes, lr_gwlist) {
 		int down = 0;
 		int up = 0;
 		int i;
 
+		if ((gw->lp_ping_feats & LNET_PING_FEAT_RTE_DISABLED) != 0) {
+			rte->lr_downis = 1;
+			continue;
+		}
+
 		for (i = 0; i < info->pi_nnis && i < LNET_MAX_RTR_NIS; i++) {
 			lnet_ni_status_t *stat = &info->pi_ni[i];
 			lnet_nid_t nid = stat->ns_nid;
@@ -655,7 +664,7 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 			}
 
 			if (stat->ns_status == LNET_NI_STATUS_UP) {
-				if (LNET_NIDNET(nid) == rtr->lr_net) {
+				if (LNET_NIDNET(nid) == rte->lr_net) {
 					up = 1;
 					break;
 				}
@@ -669,10 +678,10 @@ lnet_parse_rc_info(lnet_rc_data_t *rcd)
 		}
 
 		if (up) { /* ignore downed NIs if NI for dest network is up */
-			rtr->lr_downis = 0;
+			rte->lr_downis = 0;
 			continue;
 		}
-		rtr->lr_downis = down;
+		rte->lr_downis = down;
 	}
 }
 
@@ -1209,7 +1218,7 @@ rescan:
 	return 0;
 }
 
-static void
+void
 lnet_destroy_rtrbuf(lnet_rtrbuf_t *rb, int npages)
 {
 	int sz = offsetof(lnet_rtrbuf_t, rb_kiov[npages]);
@@ -1256,66 +1265,103 @@ lnet_new_rtrbuf(lnet_rtrbufpool_t *rbp, int cpt)
 }
 
 static void
-lnet_rtrpool_free_bufs(lnet_rtrbufpool_t *rbp)
+lnet_rtrpool_free_bufs(lnet_rtrbufpool_t *rbp, int cpt)
 {
 	int npages = rbp->rbp_npages;
-	int nbuffers = 0;
+	struct list_head tmp;
 	lnet_rtrbuf_t *rb;
 
 	if (rbp->rbp_nbuffers == 0) /* not initialized or already freed */
 		return;
 
-	LASSERT(list_empty(&rbp->rbp_msgs));
-	LASSERT(rbp->rbp_credits == rbp->rbp_nbuffers);
+	INIT_LIST_HEAD(&tmp);
 
-	while (!list_empty(&rbp->rbp_bufs)) {
-		LASSERT(rbp->rbp_credits > 0);
+	lnet_net_lock(cpt);
+	lnet_drop_routed_msgs_locked(&rbp->rbp_msgs, cpt);
+	list_splice_init(&rbp->rbp_bufs, &tmp);
+	rbp->rbp_nbuffers = 0;
+	rbp->rbp_credits = 0;
+	rbp->rbp_mincredits = 0;
+	lnet_net_unlock(cpt);
 
-		rb = list_entry(rbp->rbp_bufs.next,
-				    lnet_rtrbuf_t, rb_list);
+	/* Free buffers on the free list. */
+	while (!list_empty(&tmp)) {
+		rb = list_entry(tmp.next, lnet_rtrbuf_t, rb_list);
 		list_del(&rb->rb_list);
 		lnet_destroy_rtrbuf(rb, npages);
-		nbuffers++;
 	}
-
-	LASSERT(rbp->rbp_nbuffers == nbuffers);
-	LASSERT(rbp->rbp_credits == nbuffers);
-
-	rbp->rbp_nbuffers = rbp->rbp_credits = 0;
 }
 
 static int
-lnet_rtrpool_alloc_bufs(lnet_rtrbufpool_t *rbp, int nbufs, int cpt)
+lnet_rtrpool_adjust_bufs(lnet_rtrbufpool_t *rbp, int nbufs, int cpt)
 {
+	struct list_head rb_list;
 	lnet_rtrbuf_t *rb;
-	int i;
+	int num_rb;
+	int num_buffers = 0;
+	int npages = rbp->rbp_npages;
 
-	if (rbp->rbp_nbuffers != 0) {
-		LASSERT(rbp->rbp_nbuffers == nbufs);
+	/*
+	 * If we are called for fewer buffers than already in the pool, we
+	 * just lower the nbuffers number and excess buffers will be
+	 * thrown away as they are returned to the free list.  Credits
+	 * then get adjusted as well.
+	 */
+	if (nbufs <= rbp->rbp_nbuffers) {
+		lnet_net_lock(cpt);
+		rbp->rbp_nbuffers = nbufs;
+		lnet_net_unlock(cpt);
 		return 0;
 	}
 
-	for (i = 0; i < nbufs; i++) {
-		rb = lnet_new_rtrbuf(rbp, cpt);
+	INIT_LIST_HEAD(&rb_list);
 
+	/*
+	 * allocate the buffers on a local list first.  If all buffers are
+	 * allocated successfully then join this list to the rbp buffer
+	 * list. If not then free all allocated buffers.
+	 */
+	num_rb = rbp->rbp_nbuffers;
+
+	while (num_rb < nbufs) {
+		rb = lnet_new_rtrbuf(rbp, cpt);
 		if (rb == NULL) {
-			CERROR("Failed to allocate %d router bufs of %d pages\n",
-			       nbufs, rbp->rbp_npages);
-			return -ENOMEM;
+			CERROR("Failed to allocate %d route bufs of %d pages\n",
+			       nbufs, npages);
+			goto failed;
 		}
 
-		rbp->rbp_nbuffers++;
-		rbp->rbp_credits++;
-		rbp->rbp_mincredits++;
-		list_add(&rb->rb_list, &rbp->rbp_bufs);
-
-		/* No allocation "under fire" */
-		/* Otherwise we'd need code to schedule blocked msgs etc */
-		LASSERT(!the_lnet.ln_routing);
+		list_add(&rb->rb_list, &rb_list);
+		num_buffers++;
+		num_rb++;
 	}
 
-	LASSERT(rbp->rbp_credits == nbufs);
+	lnet_net_lock(cpt);
+
+	list_splice_tail(&rb_list, &rbp->rbp_bufs);
+	rbp->rbp_nbuffers += num_buffers;
+	rbp->rbp_credits += num_buffers;
+	rbp->rbp_mincredits = rbp->rbp_credits;
+	/*
+	 * We need to schedule blocked msg using the newly
+	 * added buffers.
+	 */
+	while (!list_empty(&rbp->rbp_bufs) &&
+	       !list_empty(&rbp->rbp_msgs))
+		lnet_schedule_blocked_locked(rbp);
+
+	lnet_net_unlock(cpt);
+
 	return 0;
+
+failed:
+	while (!list_empty(&rb_list)) {
+		rb = list_entry(rb_list.next, lnet_rtrbuf_t, rb_list);
+		list_del(&rb->rb_list);
+		lnet_destroy_rtrbuf(rb, npages);
+	}
+
+	return -ENOMEM;
 }
 
 static void
@@ -1330,7 +1376,7 @@ lnet_rtrpool_init(lnet_rtrbufpool_t *rbp, int npages)
 }
 
 void
-lnet_rtrpools_free(void)
+lnet_rtrpools_free(int keep_pools)
 {
 	lnet_rtrbufpool_t *rtrp;
 	int i;
@@ -1339,17 +1385,19 @@ lnet_rtrpools_free(void)
 		return;
 
 	cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
-		lnet_rtrpool_free_bufs(&rtrp[0]);
-		lnet_rtrpool_free_bufs(&rtrp[1]);
-		lnet_rtrpool_free_bufs(&rtrp[2]);
+		lnet_rtrpool_free_bufs(&rtrp[LNET_TINY_BUF_IDX], i);
+		lnet_rtrpool_free_bufs(&rtrp[LNET_SMALL_BUF_IDX], i);
+		lnet_rtrpool_free_bufs(&rtrp[LNET_LARGE_BUF_IDX], i);
 	}
 
-	cfs_percpt_free(the_lnet.ln_rtrpools);
-	the_lnet.ln_rtrpools = NULL;
+	if (!keep_pools) {
+		cfs_percpt_free(the_lnet.ln_rtrpools);
+		the_lnet.ln_rtrpools = NULL;
+	}
 }
 
 static int
-lnet_nrb_tiny_calculate(int npages)
+lnet_nrb_tiny_calculate(void)
 {
 	int nrbs = LNET_NRB_TINY;
 
@@ -1368,7 +1416,7 @@ lnet_nrb_tiny_calculate(int npages)
 }
 
 static int
-lnet_nrb_small_calculate(int npages)
+lnet_nrb_small_calculate(void)
 {
 	int nrbs = LNET_NRB_SMALL;
 
@@ -1387,7 +1435,7 @@ lnet_nrb_small_calculate(int npages)
 }
 
 static int
-lnet_nrb_large_calculate(int npages)
+lnet_nrb_large_calculate(void)
 {
 	int nrbs = LNET_NRB_LARGE;
 
@@ -1409,16 +1457,12 @@ int
 lnet_rtrpools_alloc(int im_a_router)
 {
 	lnet_rtrbufpool_t *rtrp;
-	int large_pages;
-	int small_pages = 1;
 	int nrb_tiny;
 	int nrb_small;
 	int nrb_large;
 	int rc;
 	int i;
 
-	large_pages = (LNET_MTU + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
-
 	if (!strcmp(forwarding, "")) {
 		/* not set either way */
 		if (!im_a_router)
@@ -1433,15 +1477,15 @@ lnet_rtrpools_alloc(int im_a_router)
 		return -EINVAL;
 	}
 
-	nrb_tiny = lnet_nrb_tiny_calculate(0);
+	nrb_tiny = lnet_nrb_tiny_calculate();
 	if (nrb_tiny < 0)
 		return -EINVAL;
 
-	nrb_small = lnet_nrb_small_calculate(small_pages);
+	nrb_small = lnet_nrb_small_calculate();
 	if (nrb_small < 0)
 		return -EINVAL;
 
-	nrb_large = lnet_nrb_large_calculate(large_pages);
+	nrb_large = lnet_nrb_large_calculate();
 	if (nrb_large < 0)
 		return -EINVAL;
 
@@ -1455,18 +1499,23 @@ lnet_rtrpools_alloc(int im_a_router)
 	}
 
 	cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
-		lnet_rtrpool_init(&rtrp[0], 0);
-		rc = lnet_rtrpool_alloc_bufs(&rtrp[0], nrb_tiny, i);
+		lnet_rtrpool_init(&rtrp[LNET_TINY_BUF_IDX], 0);
+		rc = lnet_rtrpool_adjust_bufs(&rtrp[LNET_TINY_BUF_IDX],
+					      nrb_tiny, i);
 		if (rc != 0)
 			goto failed;
 
-		lnet_rtrpool_init(&rtrp[1], small_pages);
-		rc = lnet_rtrpool_alloc_bufs(&rtrp[1], nrb_small, i);
+		lnet_rtrpool_init(&rtrp[LNET_SMALL_BUF_IDX],
+				  LNET_NRB_SMALL_PAGES);
+		rc = lnet_rtrpool_adjust_bufs(&rtrp[LNET_SMALL_BUF_IDX],
+					      nrb_small, i);
 		if (rc != 0)
 			goto failed;
 
-		lnet_rtrpool_init(&rtrp[2], large_pages);
-		rc = lnet_rtrpool_alloc_bufs(&rtrp[2], nrb_large, i);
+		lnet_rtrpool_init(&rtrp[LNET_LARGE_BUF_IDX],
+				  LNET_NRB_LARGE_PAGES);
+		rc = lnet_rtrpool_adjust_bufs(&rtrp[LNET_LARGE_BUF_IDX],
+					      nrb_large, i);
 		if (rc != 0)
 			goto failed;
 	}
@@ -1478,11 +1527,114 @@ lnet_rtrpools_alloc(int im_a_router)
 	return 0;
 
  failed:
-	lnet_rtrpools_free();
+	lnet_rtrpools_free(0);
 	return rc;
 }
 
 int
+lnet_rtrpools_adjust(int tiny, int small, int large)
+{
+	int nrb = 0;
+	int rc = 0;
+	int i;
+	lnet_rtrbufpool_t *rtrp;
+
+	/*
+	 * this function doesn't revert the changes if adding new buffers
+	 * failed.  It's up to the user space caller to revert the
+	 * changes.
+	 */
+
+	if (!the_lnet.ln_routing)
+		return 0;
+
+	/*
+	 * If the provided values for each buffer pool are different than the
+	 * configured values, we need to take action.
+	 */
+	if (tiny >= 0 && tiny != tiny_router_buffers) {
+		tiny_router_buffers = tiny;
+		nrb = lnet_nrb_tiny_calculate();
+		cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
+			rc = lnet_rtrpool_adjust_bufs(&rtrp[LNET_TINY_BUF_IDX],
+						      nrb, i);
+			if (rc != 0)
+				return rc;
+		}
+	}
+	if (small >= 0 && small != small_router_buffers) {
+		small_router_buffers = small;
+		nrb = lnet_nrb_small_calculate();
+		cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
+			rc = lnet_rtrpool_adjust_bufs(&rtrp[LNET_SMALL_BUF_IDX],
+						      nrb, i);
+			if (rc != 0)
+				return rc;
+		}
+	}
+	if (large >= 0 && large != large_router_buffers) {
+		large_router_buffers = large;
+		nrb = lnet_nrb_large_calculate();
+		cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
+			rc = lnet_rtrpool_adjust_bufs(&rtrp[LNET_LARGE_BUF_IDX],
+						      nrb, i);
+			if (rc != 0)
+				return rc;
+		}
+	}
+
+	return 0;
+}
+
+int
+lnet_rtrpools_enable(void)
+{
+	int rc;
+
+	if (the_lnet.ln_routing)
+		return 0;
+
+	if (!the_lnet.ln_rtrpools)
+		/*
+		 * If routing is turned off, and we have never
+		 * initialized the pools before, just call the
+		 * standard buffer pool allocation routine as
+		 * if we are just configuring this for the first
+		 * time.
+		 */
+		return lnet_rtrpools_alloc(1);
+
+	rc = lnet_rtrpools_adjust(0, 0, 0);
+	if (rc != 0)
+		return rc;
+
+	lnet_net_lock(LNET_LOCK_EX);
+	the_lnet.ln_routing = 1;
+
+	the_lnet.ln_ping_info->pi_features &= ~LNET_PING_FEAT_RTE_DISABLED;
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	return 0;
+}
+
+void
+lnet_rtrpools_disable(void)
+{
+	if (!the_lnet.ln_routing)
+		return;
+
+	lnet_net_lock(LNET_LOCK_EX);
+	the_lnet.ln_routing = 0;
+	the_lnet.ln_ping_info->pi_features |= LNET_PING_FEAT_RTE_DISABLED;
+
+	tiny_router_buffers = 0;
+	small_router_buffers = 0;
+	large_router_buffers = 0;
+	lnet_net_unlock(LNET_LOCK_EX);
+	lnet_rtrpools_free(1);
+}
+
+int
 lnet_notify(lnet_ni_t *ni, lnet_nid_t nid, int alive, unsigned long when)
 {
 	struct lnet_peer *lp = NULL;
-- 
1.7.1



* [PATCH 11/40] staging: lustre: DLC Feature dynamic net config
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (9 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 10/40] staging: lustre: Dynamic LNet Configuration (DLC) dynamic routing James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02  9:23   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 12/40] staging: lustre: Dynamic LNet Configuration (DLC) IOCTL changes James Simmons
                   ` (29 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This is the third patch of a set of patches that enables DLC.

This patch adds the following features to LNET.  Currently these
features are not driven by user space.
- Adding/Deleting Networks dynamically
Two new functions were added:
 - lnet_dyn_add_ni()
    Add an NI.  If the NI is already added, fail with the
    appropriate error code.
 - lnet_dyn_del_ni()
    Delete an existing NI.  If the NI doesn't exist, fail with the
    appropriate error code.
These functions shall be called from IOCTL.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
Reviewed-on: http://review.whamcloud.com/9832
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    3 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |  795 +++++++++++++-------
 drivers/staging/lustre/lnet/lnet/config.c          |    2 +-
 3 files changed, 517 insertions(+), 283 deletions(-)
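
The add/delete contract described above (adding a net that already exists
fails, deleting one that does not exist fails) can be modelled in user space
roughly as below. This is only a sketch: the table type, the function names and
the fixed capacity are hypothetical. The error codes are assumptions as well;
-EINVAL mirrors what lnet_shutdown_lndni() returns in this patch when the net
is not found, while -EEXIST and -ENOSPC are this sketch's own choices.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_NETS 8	/* arbitrary capacity for the model */

/* Hypothetical stand-in for the list of configured networks. */
struct sketch_net_table {
	uint32_t nets[SKETCH_MAX_NETS];
	int count;
};

/* Return the index of net in the table, or -1 if it is absent. */
static int sketch_find_net(const struct sketch_net_table *t, uint32_t net)
{
	int i;

	for (i = 0; i < t->count; i++)
		if (t->nets[i] == net)
			return i;
	return -1;
}

/* Add a net; a duplicate is rejected instead of silently accepted. */
static int sketch_add_net(struct sketch_net_table *t, uint32_t net)
{
	if (sketch_find_net(t, net) >= 0)
		return -EEXIST;
	if (t->count == SKETCH_MAX_NETS)
		return -ENOSPC;
	t->nets[t->count++] = net;
	return 0;
}

/* Delete a net; a missing net is an error. */
static int sketch_del_net(struct sketch_net_table *t, uint32_t net)
{
	int idx = sketch_find_net(t, net);

	if (idx < 0)
		return -EINVAL;
	/* Fill the hole with the last entry; order is not preserved. */
	t->nets[idx] = t->nets[--t->count];
	return 0;
}
```

Both operations report the conflict to the caller, so a user space tool driving
them through an ioctl can tell an already-configured net from a real failure.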

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 60accdf..94d0dc5 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -679,14 +679,13 @@ void lnet_router_checker_stop(void);
 void lnet_router_ni_update_locked(lnet_peer_t *gw, __u32 net);
 void lnet_swap_pinginfo(lnet_ping_info_t *info);
 
-int lnet_ping_target_init(void);
-void lnet_ping_target_fini(void);
 int lnet_ping(lnet_process_id_t id, int timeout_ms,
 	      lnet_process_id_t *ids, int n_ids);
 
 int lnet_parse_ip2nets(char **networksp, char *ip2nets);
 int lnet_parse_routes(char *route_str, int *im_a_router);
 int lnet_parse_networks(struct list_head *nilist, char *networks);
+int lnet_net_unique(__u32 net, struct list_head *nilist);
 
 int lnet_nid2peer_locked(lnet_peer_t **lpp, lnet_nid_t nid, int cpt);
 lnet_peer_t *lnet_find_peer_locked(struct lnet_peer_table *ptable,
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 0338537..9661f6a 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -528,6 +528,11 @@ lnet_prepare(lnet_pid_t requested_pid)
 	struct lnet_res_container **recs;
 	int rc = 0;
 
+	if (requested_pid == LNET_PID_ANY) {
+		/* Don't instantiate LNET just for me */
+		return -ENETDOWN;
+	}
+
 	LASSERT(the_lnet.ln_refcount == 0);
 
 	the_lnet.ln_routing = 0;
@@ -813,6 +818,229 @@ lnet_count_acceptor_nis(void)
 	return count;
 }
 
+static lnet_ping_info_t *
+lnet_ping_info_create(int num_ni)
+{
+	lnet_ping_info_t *ping_info;
+	unsigned int infosz;
+
+	infosz = offsetof(lnet_ping_info_t, pi_ni[num_ni]);
+	LIBCFS_ALLOC(ping_info, infosz);
+	if (!ping_info) {
+		CERROR("Can't allocate ping info[%d]\n", num_ni);
+		return NULL;
+	}
+
+	ping_info->pi_nnis = num_ni;
+	ping_info->pi_pid = the_lnet.ln_pid;
+	ping_info->pi_magic = LNET_PROTO_PING_MAGIC;
+	ping_info->pi_features = LNET_PING_FEAT_NI_STATUS;
+
+	return ping_info;
+}
+
+static inline int
+lnet_get_ni_count(void)
+{
+	struct lnet_ni *ni;
+	int count = 0;
+
+	lnet_net_lock(0);
+
+	list_for_each_entry(ni, &the_lnet.ln_nis, ni_list)
+		count++;
+
+	lnet_net_unlock(0);
+
+	return count;
+}
+
+static inline void
+lnet_ping_info_free(lnet_ping_info_t *pinfo)
+{
+	LIBCFS_FREE(pinfo,
+		    offsetof(lnet_ping_info_t,
+			     pi_ni[pinfo->pi_nnis]));
+}
+
+static void
+lnet_ping_info_destroy(void)
+{
+	struct lnet_ni *ni;
+
+	lnet_net_lock(LNET_LOCK_EX);
+
+	list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) {
+		lnet_ni_lock(ni);
+		ni->ni_status = NULL;
+		lnet_ni_unlock(ni);
+	}
+
+	lnet_ping_info_free(the_lnet.ln_ping_info);
+	the_lnet.ln_ping_info = NULL;
+
+	lnet_net_unlock(LNET_LOCK_EX);
+}
+
+static void
+lnet_ping_event_handler(lnet_event_t *event)
+{
+	lnet_ping_info_t *pinfo = event->md.user_ptr;
+
+	if (event->unlinked)
+		pinfo->pi_features = LNET_PING_FEAT_INVAL;
+}
+
+static int
+lnet_ping_info_setup(lnet_ping_info_t **ppinfo, lnet_handle_md_t *md_handle,
+		     int ni_count, bool set_eq)
+{
+	lnet_process_id_t id = {LNET_NID_ANY, LNET_PID_ANY};
+	lnet_handle_me_t me_handle;
+	lnet_md_t md = {0};
+	int rc, rc2;
+
+	if (set_eq) {
+		rc = LNetEQAlloc(0, lnet_ping_event_handler,
+				 &the_lnet.ln_ping_target_eq);
+		if (rc != 0) {
+			CERROR("Can't allocate ping EQ: %d\n", rc);
+			return rc;
+		}
+	}
+
+	*ppinfo = lnet_ping_info_create(ni_count);
+	if (!*ppinfo) {
+		rc = -ENOMEM;
+		goto failed_0;
+	}
+
+	rc = LNetMEAttach(LNET_RESERVED_PORTAL, id,
+			  LNET_PROTO_PING_MATCHBITS, 0,
+			  LNET_UNLINK, LNET_INS_AFTER,
+			  &me_handle);
+	if (rc != 0) {
+		CERROR("Can't create ping ME: %d\n", rc);
+		goto failed_1;
+	}
+
+	/* initialize md content */
+	md.start = *ppinfo;
+	md.length = offsetof(lnet_ping_info_t,
+			     pi_ni[(*ppinfo)->pi_nnis]);
+	md.threshold = LNET_MD_THRESH_INF;
+	md.max_size = 0;
+	md.options = LNET_MD_OP_GET | LNET_MD_TRUNCATE |
+		     LNET_MD_MANAGE_REMOTE;
+	md.user_ptr  = NULL;
+	md.eq_handle = the_lnet.ln_ping_target_eq;
+	md.user_ptr = *ppinfo;
+
+	rc = LNetMDAttach(me_handle, md, LNET_RETAIN, md_handle);
+	if (rc != 0) {
+		CERROR("Can't attach ping MD: %d\n", rc);
+		goto failed_2;
+	}
+
+	return 0;
+
+failed_2:
+	rc2 = LNetMEUnlink(me_handle);
+	LASSERT(rc2 == 0);
+failed_1:
+	lnet_ping_info_free(*ppinfo);
+	*ppinfo = NULL;
+failed_0:
+	if (set_eq)
+		LNetEQFree(the_lnet.ln_ping_target_eq);
+	return rc;
+}
+
+static void
+lnet_ping_md_unlink(lnet_ping_info_t *pinfo, lnet_handle_md_t *md_handle)
+{
+	sigset_t blocked = cfs_block_allsigs();
+
+	LNetMDUnlink(*md_handle);
+	LNetInvalidateHandle(md_handle);
+
+	/* NB md could be busy; this just starts the unlink */
+	while (pinfo->pi_features != LNET_PING_FEAT_INVAL) {
+		CDEBUG(D_NET, "Still waiting for ping MD to unlink\n");
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		schedule_timeout(cfs_time_seconds(1));
+	}
+
+	cfs_restore_sigs(blocked);
+}
+
+static void
+lnet_ping_info_install_locked(lnet_ping_info_t *ping_info)
+{
+	lnet_ni_status_t *ns;
+	lnet_ni_t *ni;
+	int i = 0;
+
+	list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) {
+		LASSERT(i < ping_info->pi_nnis);
+
+		ns = &ping_info->pi_ni[i];
+
+		ns->ns_nid = ni->ni_nid;
+
+		lnet_ni_lock(ni);
+		ns->ns_status = (ni->ni_status) ?
+				 ni->ni_status->ns_status : LNET_NI_STATUS_UP;
+		ni->ni_status = ns;
+		lnet_ni_unlock(ni);
+
+		i++;
+	}
+}
+
+static void
+lnet_ping_target_update(lnet_ping_info_t *pinfo, lnet_handle_md_t md_handle)
+{
+	lnet_ping_info_t *old_pinfo = NULL;
+	lnet_handle_md_t old_md;
+
+	/* switch the NIs to point to the new ping info created */
+	lnet_net_lock(LNET_LOCK_EX);
+
+	if (!the_lnet.ln_routing)
+		pinfo->pi_features |= LNET_PING_FEAT_RTE_DISABLED;
+	lnet_ping_info_install_locked(pinfo);
+
+	if (the_lnet.ln_ping_info) {
+		old_pinfo = the_lnet.ln_ping_info;
+		old_md = the_lnet.ln_ping_target_md;
+	}
+	the_lnet.ln_ping_target_md = md_handle;
+	the_lnet.ln_ping_info = pinfo;
+
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	if (old_pinfo) {
+		/* unlink the old ping info */
+		lnet_ping_md_unlink(old_pinfo, &old_md);
+		lnet_ping_info_free(old_pinfo);
+	}
+}
+
+static void
+lnet_ping_target_fini(void)
+{
+	int rc;
+
+	lnet_ping_md_unlink(the_lnet.ln_ping_info,
+			    &the_lnet.ln_ping_target_md);
+
+	rc = LNetEQFree(the_lnet.ln_ping_target_eq);
+	LASSERT(rc == 0);
+
+	lnet_ping_info_destroy();
+}
+
 static int
 lnet_ni_tq_credits(lnet_ni_t *ni)
 {
@@ -831,12 +1059,74 @@ lnet_ni_tq_credits(lnet_ni_t *ni)
 }
 
 static void
-lnet_shutdown_lndnis(void)
+lnet_clear_zombies_nis_locked(void)
 {
 	int i;
 	int islo;
 	lnet_ni_t *ni;
 
+	/*
+	 * Now wait for the NI's I just nuked to show up on ln_zombie_nis
+	 * and shut them down in guaranteed thread context
+	 */
+	i = 2;
+	while (!list_empty(&the_lnet.ln_nis_zombie)) {
+		int *ref;
+		int j;
+
+		ni = list_entry(the_lnet.ln_nis_zombie.next,
+				lnet_ni_t, ni_list);
+		list_del_init(&ni->ni_list);
+		cfs_percpt_for_each(ref, j, ni->ni_refs) {
+			if (*ref == 0)
+				continue;
+			/* still busy, add it back to zombie list */
+			list_add(&ni->ni_list, &the_lnet.ln_nis_zombie);
+			break;
+		}
+
+		if (!list_empty(&ni->ni_list)) {
+			lnet_net_unlock(LNET_LOCK_EX);
+			++i;
+			if ((i & (-i)) == i) {
+				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
+				       libcfs_nid2str(ni->ni_nid));
+			}
+			set_current_state(TASK_UNINTERRUPTIBLE);
+			schedule_timeout(cfs_time_seconds(1));
+			lnet_net_lock(LNET_LOCK_EX);
+			continue;
+		}
+
+		ni->ni_lnd->lnd_refcount--;
+		lnet_net_unlock(LNET_LOCK_EX);
+
+		islo = ni->ni_lnd->lnd_type == LOLND;
+
+		LASSERT(!in_interrupt());
+		ni->ni_lnd->lnd_shutdown(ni);
+
+		/*
+		 * can't deref lnd anymore now; it might have unregistered
+		 * itself...
+		 */
+		if (!islo)
+			CDEBUG(D_LNI, "Removed LNI %s\n",
+			       libcfs_nid2str(ni->ni_nid));
+
+		lnet_ni_free(ni);
+		i = 2;
+
+		lnet_net_lock(LNET_LOCK_EX);
+	}
+}
+
+static void
+lnet_shutdown_lndnis(void)
+{
+	lnet_ni_t *ni;
+	int i;
+
 	/* NB called holding the global mutex */
 
 	/* All quiet on the API front */
@@ -850,7 +1140,7 @@ lnet_shutdown_lndnis(void)
 	/* Unlink NIs from the global table */
 	while (!list_empty(&the_lnet.ln_nis)) {
 		ni = list_entry(the_lnet.ln_nis.next,
-				    lnet_ni_t, ni_list);
+				lnet_ni_t, ni_list);
 		/* move it to zombie list and nobody can find it anymore */
 		list_move(&ni->ni_list, &the_lnet.ln_nis_zombie);
 		lnet_ni_decref_locked(ni, 0);	/* drop ln_nis' ref */
@@ -885,89 +1175,90 @@ lnet_shutdown_lndnis(void)
 	lnet_peer_tables_cleanup(NULL);
 
 	lnet_net_lock(LNET_LOCK_EX);
-	/* Now wait for the NI's I just nuked to show up on ln_zombie_nis
-	 * and shut them down in guaranteed thread context */
-	i = 2;
-	while (!list_empty(&the_lnet.ln_nis_zombie)) {
-		int *ref;
-		int j;
 
-		ni = list_entry(the_lnet.ln_nis_zombie.next,
-				    lnet_ni_t, ni_list);
-		list_del_init(&ni->ni_list);
-		cfs_percpt_for_each(ref, j, ni->ni_refs) {
-			if (*ref == 0)
-				continue;
-			/* still busy, add it back to zombie list */
-			list_add(&ni->ni_list, &the_lnet.ln_nis_zombie);
-			break;
-		}
+	lnet_clear_zombies_nis_locked();
+	the_lnet.ln_shutdown = 0;
+	lnet_net_unlock(LNET_LOCK_EX);
+}
 
-		if (!list_empty(&ni->ni_list)) {
-			lnet_net_unlock(LNET_LOCK_EX);
-			++i;
-			if ((i & (-i)) == i) {
-				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
-				       libcfs_nid2str(ni->ni_nid));
-			}
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(cfs_time_seconds(1));
-			lnet_net_lock(LNET_LOCK_EX);
-			continue;
-		}
+int
+lnet_shutdown_lndni(__u32 net)
+{
+	lnet_ping_info_t *pinfo;
+	lnet_handle_md_t md_handle;
+	lnet_ni_t *found_ni = NULL;
+	int ni_count;
+	int rc;
 
-		ni->ni_lnd->lnd_refcount--;
-		lnet_net_unlock(LNET_LOCK_EX);
+	if (LNET_NETTYP(net) == LOLND)
+		return -EINVAL;
 
-		islo = ni->ni_lnd->lnd_type == LOLND;
+	ni_count = lnet_get_ni_count();
 
-		LASSERT(!in_interrupt());
-		(ni->ni_lnd->lnd_shutdown)(ni);
+	/* create and link a new ping info, before removing the old one */
+	rc = lnet_ping_info_setup(&pinfo, &md_handle, ni_count - 1, false);
+	if (rc != 0)
+		return rc;
 
-		/* can't deref lnd anymore now; it might have unregistered
-		 * itself...  */
+	/* proceed with shutting down the NI */
+	lnet_net_lock(LNET_LOCK_EX);
 
-		if (!islo)
-			CDEBUG(D_LNI, "Removed LNI %s\n",
-			       libcfs_nid2str(ni->ni_nid));
+	found_ni = lnet_net2ni_locked(net, 0);
+	if (!found_ni) {
+		lnet_net_unlock(LNET_LOCK_EX);
+		lnet_ping_md_unlink(pinfo, &md_handle);
+		lnet_ping_info_free(pinfo);
+		return -EINVAL;
+	}
 
-		lnet_ni_free(ni);
-		i = 2;
+	/*
+	 * decrement the reference counter on found_ni which was
+	 * incremented when we called lnet_net2ni_locked()
+	 */
+	lnet_ni_decref_locked(found_ni, 0);
+	/* Move ni to zombie list so nobody can find it anymore */
+	list_move(&found_ni->ni_list, &the_lnet.ln_nis_zombie);
 
-		lnet_net_lock(LNET_LOCK_EX);
+	/* Drop the lock reference for the ln_nis ref. */
+	lnet_ni_decref_locked(found_ni, 0);
+
+	if (!list_empty(&found_ni->ni_cptlist)) {
+		list_del_init(&found_ni->ni_cptlist);
+		lnet_ni_decref_locked(found_ni, 0);
 	}
 
-	the_lnet.ln_shutdown = 0;
 	lnet_net_unlock(LNET_LOCK_EX);
+
+	/* Do peer table cleanup for this ni */
+	lnet_peer_tables_cleanup(found_ni);
+
+	lnet_net_lock(LNET_LOCK_EX);
+	lnet_clear_zombies_nis_locked();
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	lnet_ping_target_update(pinfo, md_handle);
+
+	return 0;
 }
 
 static int
-lnet_startup_lndnis(void)
+lnet_startup_lndnis(struct list_head *nilist, __s32 peer_timeout,
+		    __s32 peer_cr, __s32 peer_buf_cr, __s32 credits,
+		    int *ni_count)
 {
 	lnd_t *lnd;
 	struct lnet_ni *ni;
 	struct lnet_tx_queue *tq;
-	struct list_head nilist;
 	int i;
 	int rc = 0;
 	__u32 lnd_type;
-	int nicount = 0;
-	char *nets = lnet_get_networks();
-
-	INIT_LIST_HEAD(&nilist);
-
-	if (nets == NULL)
-		goto failed;
-
-	rc = lnet_parse_networks(&nilist, nets);
-	if (rc != 0)
-		goto failed;
 
-	while (!list_empty(&nilist)) {
-		ni = list_entry(nilist.next, lnet_ni_t, ni_list);
+	while (!list_empty(nilist)) {
+		ni = list_entry(nilist->next, lnet_ni_t, ni_list);
 		lnd_type = LNET_NETTYP(LNET_NIDNET(ni->ni_nid));
 
-		LASSERT(libcfs_isknown_lnd(lnd_type));
+		if (!libcfs_isknown_lnd(lnd_type))
+			goto failed;
 
 		if (lnd_type == CIBLND    ||
 		    lnd_type == OPENIBLND ||
@@ -978,6 +1269,24 @@ lnet_startup_lndnis(void)
 			goto failed;
 		}
 
+		/* Make sure this new NI is unique. */
+		lnet_net_lock(LNET_LOCK_EX);
+		if (!lnet_net_unique(LNET_NIDNET(ni->ni_nid),
+				     &the_lnet.ln_nis)) {
+			if (lnd_type == LOLND) {
+				lnet_net_unlock(LNET_LOCK_EX);
+				list_del(&ni->ni_list);
+				lnet_ni_free(ni);
+				continue;
+			}
+
+			CERROR("Net %s is not unique\n",
+			       libcfs_net2str(LNET_NIDNET(ni->ni_nid)));
+			lnet_net_unlock(LNET_LOCK_EX);
+			goto failed;
+		}
+		lnet_net_unlock(LNET_LOCK_EX);
+
 		mutex_lock(&the_lnet.ln_lnd_mutex);
 		lnd = lnet_find_lnd_by_type(lnd_type);
 
@@ -1016,6 +1325,25 @@ lnet_startup_lndnis(void)
 			goto failed;
 		}
 
+		/*
+		 * If given some LND tunable parameters, parse those now to
+		 * override the values in the NI structure.
+		 */
+		if (peer_buf_cr >= 0)
+			ni->ni_peerrtrcredits = peer_buf_cr;
+		if (peer_timeout >= 0)
+			ni->ni_peertimeout = peer_timeout;
+		/*
+		 * TODO
+		 * Note: For now, don't allow the user to change
+		 * peertxcredits as this number is used in the
+		 * IB LND to control queue depth.
+		 * if (peer_cr != -1)
+		 *	ni->ni_peertxcredits = peer_cr;
+		 */
+		if (credits >= 0)
+			ni->ni_maxtxcredits = credits;
+
 		LASSERT(ni->ni_peertimeout <= 0 || lnd->lnd_query != NULL);
 
 		list_del(&ni->ni_list);
@@ -1032,6 +1360,14 @@ lnet_startup_lndnis(void)
 
 		lnet_net_unlock(LNET_LOCK_EX);
 
+		/* increment the ni_count here to account for the LOLND as
+		 * well.  If we increment past this point then the count
+		 * will be missing the LOLND, and then ping will not
+		 * report the LOLND
+		 */
+		if (ni_count)
+			(*ni_count)++;
+
 		if (lnd->lnd_type == LOLND) {
 			lnet_ni_addref(ni);
 			LASSERT(the_lnet.ln_loni == NULL);
@@ -1058,29 +1394,16 @@ lnet_startup_lndnis(void)
 		       libcfs_nid2str(ni->ni_nid), ni->ni_peertxcredits,
 		       lnet_ni_tq_credits(ni) * LNET_CPT_NUMBER,
 		       ni->ni_peerrtrcredits, ni->ni_peertimeout);
-
-		nicount++;
-	}
-
-	if (the_lnet.ln_eq_waitni != NULL && nicount > 1) {
-		lnd_type = the_lnet.ln_eq_waitni->ni_lnd->lnd_type;
-		LCONSOLE_ERROR_MSG(0x109, "LND %s can only run single-network\n",
-				   libcfs_lnd2str(lnd_type));
-		goto failed;
 	}
 
 	return 0;
-
- failed:
-	lnet_shutdown_lndnis();
-
-	while (!list_empty(&nilist)) {
-		ni = list_entry(nilist.next, lnet_ni_t, ni_list);
+failed:
+	while (!list_empty(nilist)) {
+		ni = list_entry(nilist->next, lnet_ni_t, ni_list);
 		list_del(&ni->ni_list);
 		lnet_ni_free(ni);
 	}
-
-	return -ENETDOWN;
+	return -EINVAL;
 }
 
 /**
@@ -1194,6 +1517,15 @@ LNetNIInit(lnet_pid_t requested_pid)
 {
 	int im_a_router = 0;
 	int rc;
+	int ni_count = 0;
+	int lnd_type;
+	struct lnet_ni *ni;
+	lnet_ping_info_t *pinfo;
+	lnet_handle_md_t md_handle;
+	struct list_head net_head;
+	char *nets;
+
+	INIT_LIST_HEAD(&net_head);
 
 	mutex_lock(&the_lnet.ln_api_mutex);
 
@@ -1202,23 +1534,31 @@ LNetNIInit(lnet_pid_t requested_pid)
 
 	if (the_lnet.ln_refcount > 0) {
 		rc = the_lnet.ln_refcount++;
-		goto out;
+		mutex_unlock(&the_lnet.ln_api_mutex);
+		return rc;
 	}
 
-	if (requested_pid == LNET_PID_ANY) {
-		/* Don't instantiate LNET just for me */
-		rc = -ENETDOWN;
-		goto failed0;
-	}
+	nets = lnet_get_networks();
 
 	rc = lnet_prepare(requested_pid);
 	if (rc != 0)
 		goto failed0;
 
-	rc = lnet_startup_lndnis();
+	rc = lnet_parse_networks(&net_head, nets);
+	if (rc < 0)
+		goto failed1;
+
+	rc = lnet_startup_lndnis(&net_head, -1, -1, -1, -1, &ni_count);
 	if (rc != 0)
 		goto failed1;
 
+	if (the_lnet.ln_eq_waitni && ni_count > 1) {
+		lnd_type = the_lnet.ln_eq_waitni->ni_lnd->lnd_type;
+		LCONSOLE_ERROR_MSG(0x109, "LND %s can only run single-network\n",
+				   libcfs_lnd2str(lnd_type));
+		goto failed2;
+	}
+
 	rc = lnet_parse_routes(lnet_get_routes(), &im_a_router);
 	if (rc != 0)
 		goto failed2;
@@ -1238,24 +1578,30 @@ LNetNIInit(lnet_pid_t requested_pid)
 	the_lnet.ln_refcount = 1;
 	/* Now I may use my own API functions... */
 
-	/* NB router checker needs the_lnet.ln_ping_info in
-	 * lnet_router_checker -> lnet_update_ni_status_locked */
-	rc = lnet_ping_target_init();
+	rc = lnet_ping_info_setup(&pinfo, &md_handle, ni_count, true);
 	if (rc != 0)
 		goto failed3;
 
+	lnet_ping_target_update(pinfo, md_handle);
+
 	rc = lnet_router_checker_start();
 	if (rc != 0)
 		goto failed4;
 
 	lnet_router_debugfs_init();
-	goto out;
+
+	mutex_unlock(&the_lnet.ln_api_mutex);
+
+	return 0;
 
  failed4:
-	lnet_ping_target_fini();
- failed3:
 	the_lnet.ln_refcount = 0;
+	lnet_ping_md_unlink(pinfo, &md_handle);
+	lnet_ping_info_free(pinfo);
+ failed3:
 	lnet_acceptor_stop();
+	rc = LNetEQFree(the_lnet.ln_ping_target_eq);
+	LASSERT(rc == 0);
  failed2:
 	lnet_destroy_routes();
 	lnet_shutdown_lndnis();
@@ -1263,8 +1609,12 @@ LNetNIInit(lnet_pid_t requested_pid)
 	lnet_unprepare();
  failed0:
 	LASSERT(rc < 0);
- out:
 	mutex_unlock(&the_lnet.ln_api_mutex);
+	while (!list_empty(&net_head)) {
+		ni = list_entry(net_head.next, struct lnet_ni, ni_list);
+		list_del_init(&ni->ni_list);
+		lnet_ni_free(ni);
+	}
 	return rc;
 }
 EXPORT_SYMBOL(LNetNIInit);
@@ -1309,6 +1659,71 @@ LNetNIFini(void)
 }
 EXPORT_SYMBOL(LNetNIFini);
 
+int
+lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
+		__s32 peer_timeout, __s32 peer_cr, __s32 peer_buf_cr,
+		__s32 credits)
+{
+	lnet_ping_info_t *pinfo;
+	lnet_handle_md_t md_handle;
+	struct lnet_ni *ni;
+	struct list_head net_head;
+	int rc;
+
+	INIT_LIST_HEAD(&net_head);
+
+	/* Create a ni structure for the network string */
+	rc = lnet_parse_networks(&net_head, nets);
+	if (rc < 0)
+		return rc;
+
+	mutex_lock(&the_lnet.ln_api_mutex);
+
+	if (rc > 1) {
+		rc = -EINVAL; /* only add one interface per call */
+		goto failed0;
+	}
+
+	rc = lnet_ping_info_setup(&pinfo, &md_handle, 1 + lnet_get_ni_count(),
+				  false);
+	if (rc != 0)
+		goto failed0;
+
+	rc = lnet_startup_lndnis(&net_head, peer_timeout, peer_cr,
+				 peer_buf_cr, credits, NULL);
+	if (rc != 0)
+		goto failed1;
+
+	lnet_ping_target_update(pinfo, md_handle);
+	mutex_unlock(&the_lnet.ln_api_mutex);
+
+	return 0;
+
+failed1:
+	lnet_ping_md_unlink(pinfo, &md_handle);
+	lnet_ping_info_free(pinfo);
+failed0:
+	mutex_unlock(&the_lnet.ln_api_mutex);
+	while (!list_empty(&net_head)) {
+		ni = list_entry(net_head.next, struct lnet_ni, ni_list);
+		list_del_init(&ni->ni_list);
+		lnet_ni_free(ni);
+	}
+	return rc;
+}
+
+int
+lnet_dyn_del_ni(__u32 net)
+{
+	int rc;
+
+	mutex_lock(&the_lnet.ln_api_mutex);
+	rc = lnet_shutdown_lndni(net);
+	mutex_unlock(&the_lnet.ln_api_mutex);
+
+	return rc;
+}
+
 /**
  * This is an ugly hack to export IOC_LIBCFS_DEBUG_PEER and
  * IOC_LIBCFS_PORTALS_COMPATIBILITY commands to users, by tweaking the LNet
@@ -1332,7 +1747,6 @@ LNetCtl(unsigned int cmd, void *arg)
 	unsigned long secs_passed;
 
 	LASSERT(the_lnet.ln_init);
-	LASSERT(the_lnet.ln_refcount > 0);
 
 	switch (cmd) {
 	case IOC_LIBCFS_GET_NI:
@@ -1344,12 +1758,17 @@ LNetCtl(unsigned int cmd, void *arg)
 		return lnet_fail_nid(data->ioc_nid, data->ioc_count);
 
 	case IOC_LIBCFS_ADD_ROUTE:
+		mutex_lock(&the_lnet.ln_api_mutex);
 		rc = lnet_add_route(data->ioc_net, data->ioc_count,
 				    data->ioc_nid, data->ioc_priority);
+		mutex_unlock(&the_lnet.ln_api_mutex);
 		return (rc != 0) ? rc : lnet_check_routes();
 
 	case IOC_LIBCFS_DEL_ROUTE:
-		return lnet_del_route(data->ioc_net, data->ioc_nid);
+		mutex_lock(&the_lnet.ln_api_mutex);
+		rc = lnet_del_route(data->ioc_net, data->ioc_nid);
+		mutex_unlock(&the_lnet.ln_api_mutex);
+		return rc;
 
 	case IOC_LIBCFS_GET_ROUTE:
 		return lnet_get_route(data->ioc_count,
@@ -1485,192 +1904,6 @@ LNetSnprintHandle(char *str, int len, lnet_handle_any_t h)
 }
 EXPORT_SYMBOL(LNetSnprintHandle);
 
-static int
-lnet_create_ping_info(void)
-{
-	int i;
-	int n;
-	int rc;
-	unsigned int infosz;
-	lnet_ni_t *ni;
-	lnet_process_id_t id;
-	lnet_ping_info_t *pinfo;
-
-	for (n = 0; ; n++) {
-		rc = LNetGetId(n, &id);
-		if (rc == -ENOENT)
-			break;
-
-		LASSERT(rc == 0);
-	}
-
-	infosz = offsetof(lnet_ping_info_t, pi_ni[n]);
-	LIBCFS_ALLOC(pinfo, infosz);
-	if (pinfo == NULL) {
-		CERROR("Can't allocate ping info[%d]\n", n);
-		return -ENOMEM;
-	}
-
-	pinfo->pi_nnis    = n;
-	pinfo->pi_pid     = the_lnet.ln_pid;
-	pinfo->pi_magic   = LNET_PROTO_PING_MAGIC;
-	pinfo->pi_features = LNET_PING_FEAT_NI_STATUS;
-	if (!the_lnet.ln_routing)
-		pinfo->pi_features |= LNET_PING_FEAT_RTE_DISABLED;
-
-	for (i = 0; i < n; i++) {
-		lnet_ni_status_t *ns = &pinfo->pi_ni[i];
-
-		rc = LNetGetId(i, &id);
-		LASSERT(rc == 0);
-
-		ns->ns_nid    = id.nid;
-		ns->ns_status = LNET_NI_STATUS_UP;
-
-		lnet_net_lock(0);
-
-		ni = lnet_nid2ni_locked(id.nid, 0);
-		LASSERT(ni != NULL);
-
-		lnet_ni_lock(ni);
-		LASSERT(ni->ni_status == NULL);
-		ni->ni_status = ns;
-		lnet_ni_unlock(ni);
-
-		lnet_ni_decref_locked(ni, 0);
-		lnet_net_unlock(0);
-	}
-
-	the_lnet.ln_ping_info = pinfo;
-	return 0;
-}
-
-static void
-lnet_destroy_ping_info(void)
-{
-	struct lnet_ni *ni;
-
-	lnet_net_lock(0);
-
-	list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) {
-		lnet_ni_lock(ni);
-		ni->ni_status = NULL;
-		lnet_ni_unlock(ni);
-	}
-
-	lnet_net_unlock(0);
-
-	LIBCFS_FREE(the_lnet.ln_ping_info,
-		    offsetof(lnet_ping_info_t,
-			     pi_ni[the_lnet.ln_ping_info->pi_nnis]));
-	the_lnet.ln_ping_info = NULL;
-}
-
-int
-lnet_ping_target_init(void)
-{
-	lnet_md_t md = { NULL };
-	lnet_handle_me_t meh;
-	lnet_process_id_t id;
-	int rc;
-	int rc2;
-	int infosz;
-
-	rc = lnet_create_ping_info();
-	if (rc != 0)
-		return rc;
-
-	/* We can have a tiny EQ since we only need to see the unlink event on
-	 * teardown, which by definition is the last one! */
-	rc = LNetEQAlloc(2, LNET_EQ_HANDLER_NONE, &the_lnet.ln_ping_target_eq);
-	if (rc != 0) {
-		CERROR("Can't allocate ping EQ: %d\n", rc);
-		goto failed_0;
-	}
-
-	memset(&id, 0, sizeof(lnet_process_id_t));
-	id.nid = LNET_NID_ANY;
-	id.pid = LNET_PID_ANY;
-
-	rc = LNetMEAttach(LNET_RESERVED_PORTAL, id,
-			  LNET_PROTO_PING_MATCHBITS, 0,
-			  LNET_UNLINK, LNET_INS_AFTER,
-			  &meh);
-	if (rc != 0) {
-		CERROR("Can't create ping ME: %d\n", rc);
-		goto failed_1;
-	}
-
-	/* initialize md content */
-	infosz = offsetof(lnet_ping_info_t,
-			  pi_ni[the_lnet.ln_ping_info->pi_nnis]);
-	md.start     = the_lnet.ln_ping_info;
-	md.length    = infosz;
-	md.threshold = LNET_MD_THRESH_INF;
-	md.max_size  = 0;
-	md.options   = LNET_MD_OP_GET | LNET_MD_TRUNCATE |
-		       LNET_MD_MANAGE_REMOTE;
-	md.user_ptr  = NULL;
-	md.eq_handle = the_lnet.ln_ping_target_eq;
-
-	rc = LNetMDAttach(meh, md,
-			  LNET_RETAIN,
-			  &the_lnet.ln_ping_target_md);
-	if (rc != 0) {
-		CERROR("Can't attach ping MD: %d\n", rc);
-		goto failed_2;
-	}
-
-	return 0;
-
- failed_2:
-	rc2 = LNetMEUnlink(meh);
-	LASSERT(rc2 == 0);
- failed_1:
-	rc2 = LNetEQFree(the_lnet.ln_ping_target_eq);
-	LASSERT(rc2 == 0);
- failed_0:
-	lnet_destroy_ping_info();
-	return rc;
-}
-
-void
-lnet_ping_target_fini(void)
-{
-	lnet_event_t event;
-	int rc;
-	int which;
-	int timeout_ms = 1000;
-	sigset_t blocked = cfs_block_allsigs();
-
-	LNetMDUnlink(the_lnet.ln_ping_target_md);
-	/* NB md could be busy; this just starts the unlink */
-
-	for (;;) {
-		rc = LNetEQPoll(&the_lnet.ln_ping_target_eq, 1,
-				timeout_ms, &event, &which);
-
-		/* I expect overflow... */
-		LASSERT(rc >= 0 || rc == -EOVERFLOW);
-
-		if (rc == 0) {
-			/* timed out: provide a diagnostic */
-			CWARN("Still waiting for ping MD to unlink\n");
-			timeout_ms *= 2;
-			continue;
-		}
-
-		/* Got a valid event */
-		if (event.unlinked)
-			break;
-	}
-
-	rc = LNetEQFree(the_lnet.ln_ping_target_eq);
-	LASSERT(rc == 0);
-	lnet_destroy_ping_info();
-	cfs_restore_sigs(blocked);
-}
-
 int
 lnet_ping(lnet_process_id_t id, int timeout_ms, lnet_process_id_t *ids, int n_ids)
 {
@@ -1682,7 +1915,7 @@ lnet_ping(lnet_process_id_t id, int timeout_ms, lnet_process_id_t *ids, int n_id
 	int unlinked = 0;
 	int replied = 0;
 	const int a_long_time = 60000; /* mS */
-	int infosz = offsetof(lnet_ping_info_t, pi_ni[n_ids]);
+	int infosz;
 	lnet_ping_info_t *info;
 	lnet_process_id_t tmpid;
 	int i;
@@ -1691,6 +1924,8 @@ lnet_ping(lnet_process_id_t id, int timeout_ms, lnet_process_id_t *ids, int n_id
 	int rc2;
 	sigset_t blocked;
 
+	infosz = offsetof(lnet_ping_info_t, pi_ni[n_ids]);
+
 	if (n_ids <= 0 ||
 	    id.nid == LNET_NID_ANY ||
 	    timeout_ms > 500000 ||	      /* arbitrary limit! */
diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index 7bb140b..1c7ad7c 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -77,7 +77,7 @@ lnet_issep(char c)
 	}
 }
 
-static int
+int
 lnet_net_unique(__u32 net, struct list_head *nilist)
 {
 	struct list_head *tmp;
-- 
1.7.1
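
A note on the ping-info handling in the hunks above: both lnet_dyn_add_ni()
and lnet_shutdown_lndni() build a replacement ping info sized for the new NI
count before the old one is retired, and the size is derived with offsetof()
on the trailing array.  The following is a minimal user-space sketch of that
pattern, not the LNet implementation.  struct ping_info, ping_info_size(),
ping_info_alloc() and ping_info_swap() are hypothetical stand-ins, and no
locking or MD handling is modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-interface status slot. */
struct ni_status {
	uint64_t ns_nid;
	uint32_t ns_status;
};

/* Hypothetical stand-in for lnet_ping_info_t: header plus trailing array. */
struct ping_info {
	uint32_t pi_nnis;
	struct ni_status pi_ni[];	/* one slot per network interface */
};

/* Bytes needed for a ping_info with n trailing entries. */
static size_t ping_info_size(uint32_t n)
{
	return offsetof(struct ping_info, pi_ni) + n * sizeof(struct ni_status);
}

/* Allocate a zeroed ping_info with room for n interfaces. */
static struct ping_info *ping_info_alloc(uint32_t n)
{
	struct ping_info *pi = calloc(1, ping_info_size(n));

	if (pi)
		pi->pi_nnis = n;
	return pi;
}

/*
 * Install new_pi as the current ping info and return the old one so the
 * caller can free it.  The allocation already succeeded, so the swap
 * itself cannot fail and the old info stays valid until this point.
 */
static struct ping_info *ping_info_swap(struct ping_info **current,
					struct ping_info *new_pi)
{
	struct ping_info *old = *current;

	*current = new_pi;
	return old;
}
```

Allocating first means a failed allocation leaves the existing ping info in
place, which is why the error paths above only have to free the new one.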


* [PATCH 12/40] staging: lustre: Dynamic LNet Configuration (DLC) IOCTL changes
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (10 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 11/40] staging: lustre: DLC Feature dynamic net config James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02  9:48   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command James Simmons
                   ` (28 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This is the fourth patch of a set of patches that enables DLC.

This patch changes the IOCTL infrastructure in preparation for
adding extra IOCTL communication between user and kernel space.
The changes include:
- Adding a common header to be passed to the ioctl infrastructure
  functions instead of an exact structure.  This header is meant
  to be included in all structures passed through that interface.
  The IOCTL handler casts this header to the particular type that
  it expects.
- All sanity testing on the passed-in structure is performed in the
  generic ioctl infrastructure code.
- All ioctl handlers are changed to take the header instead of a
  particular structure type.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
Change-Id: I144706a14293637cd5f381d2c020faa0e9c21f6b
Reviewed-on: http://review.whamcloud.com/8021
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
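
The header-first convention described in the commit message can be sketched
in user space as below.  This is an illustration under assumptions, not the
libcfs code: struct ioctl_hdr mirrors struct libcfs_ioctl_hdr, while
struct ioctl_data and demo_handler() are hypothetical, container_of() is
re-implemented locally, and the length check inside the handler is an
addition made for the sketch (the patch validates in the generic code).

```c
#include <stddef.h>
#include <stdint.h>

/* Common header, laid out like struct libcfs_ioctl_hdr. */
struct ioctl_hdr {
	uint32_t ioc_len;
	uint32_t ioc_version;
};

/* Hypothetical payload type that embeds the header as a member. */
struct ioctl_data {
	struct ioctl_hdr ioc_hdr;
	uint64_t ioc_nid;
	uint32_t ioc_count;
};

/* User-space equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* A handler receives only the header and recovers the type it expects. */
static int demo_handler(struct ioctl_hdr *hdr)
{
	struct ioctl_data *data;

	if (hdr->ioc_len < sizeof(*data))
		return -1;	/* buffer too short for this handler's type */

	data = container_of(hdr, struct ioctl_data, ioc_hdr);
	return (int)data->ioc_count;
}
```

Because every handler shares the same prototype, the dispatcher never needs
to know which concrete structure a given command carries.
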
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |   29 +++++----
 drivers/staging/lustre/lnet/lnet/module.c          |    4 +-
 drivers/staging/lustre/lnet/selftest/conctl.c      |    9 ++-
 drivers/staging/lustre/lnet/selftest/console.c     |    2 +-
 drivers/staging/lustre/lnet/selftest/console.h     |    1 -
 .../lustre/lustre/libcfs/linux/linux-module.c      |   60 +++++++++----------
 drivers/staging/lustre/lustre/libcfs/module.c      |   50 ++++++++++++----
 7 files changed, 91 insertions(+), 64 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
index 485ab26..e14788c 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
@@ -43,9 +43,13 @@
 
 #define LIBCFS_IOCTL_VERSION 0x0001000a
 
-struct libcfs_ioctl_data {
+struct libcfs_ioctl_hdr {
 	__u32 ioc_len;
 	__u32 ioc_version;
+};
+
+struct libcfs_ioctl_data {
+	struct libcfs_ioctl_hdr ioc_hdr;
 
 	__u64 ioc_nid;
 	__u64 ioc_u64[1];
@@ -70,11 +74,6 @@ struct libcfs_ioctl_data {
 
 #define ioc_priority ioc_u32[0]
 
-struct libcfs_ioctl_hdr {
-	__u32 ioc_len;
-	__u32 ioc_version;
-};
-
 struct libcfs_debug_ioctl_data {
 	struct libcfs_ioctl_hdr hdr;
 	unsigned int subs;
@@ -84,13 +83,13 @@ struct libcfs_debug_ioctl_data {
 #define LIBCFS_IOC_INIT(data)			   \
 do {						    \
 	memset(&data, 0, sizeof(data));		 \
-	data.ioc_version = LIBCFS_IOCTL_VERSION;	\
-	data.ioc_len = sizeof(data);		    \
+	data.ioc_hdr.ioc_version = LIBCFS_IOCTL_VERSION;	\
+	data.ioc_hdr.ioc_len = sizeof(data);			\
 } while (0)
 
 struct libcfs_ioctl_handler {
 	struct list_head item;
-	int (*handle_ioctl)(unsigned int cmd, struct libcfs_ioctl_data *data);
+	int (*handle_ioctl)(unsigned int cmd, struct libcfs_ioctl_hdr *hdr);
 };
 
 #define DECLARE_IOCTL_HANDLER(ident, func)		      \
@@ -149,9 +148,9 @@ static inline int libcfs_ioctl_packlen(struct libcfs_ioctl_data *data)
 	return len;
 }
 
-static inline int libcfs_ioctl_is_invalid(struct libcfs_ioctl_data *data)
+static inline bool libcfs_ioctl_is_invalid(struct libcfs_ioctl_data *data)
 {
-	if (data->ioc_len > (1<<30)) {
+	if (data->ioc_hdr.ioc_len > (1 << 30)) {
 		CERROR("LIBCFS ioctl: ioc_len larger than 1<<30\n");
 		return 1;
 	}
@@ -187,7 +186,7 @@ static inline int libcfs_ioctl_is_invalid(struct libcfs_ioctl_data *data)
 		CERROR("LIBCFS ioctl: plen2 nonzero but no pbuf2 pointer\n");
 		return 1;
 	}
-	if ((__u32)libcfs_ioctl_packlen(data) != data->ioc_len) {
+	if ((__u32)libcfs_ioctl_packlen(data) != data->ioc_hdr.ioc_len) {
 		CERROR("LIBCFS ioctl: packlen != ioc_len\n");
 		return 1;
 	}
@@ -207,7 +206,11 @@ static inline int libcfs_ioctl_is_invalid(struct libcfs_ioctl_data *data)
 
 int libcfs_register_ioctl(struct libcfs_ioctl_handler *hand);
 int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand);
-int libcfs_ioctl_getdata(char *buf, char *end, void *arg);
+int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr *buf, __u32 buf_len,
+			 const void __user *arg);
+int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
+			     __u32 *buf_len);
 int libcfs_ioctl_popdata(void *arg, void *buf, int size);
+int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data);
 
 #endif /* __LIBCFS_IOCTL_H__ */
diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c
index ac2fdf0..0afdad0 100644
--- a/drivers/staging/lustre/lnet/lnet/module.c
+++ b/drivers/staging/lustre/lnet/lnet/module.c
@@ -84,7 +84,7 @@ lnet_unconfigure(void)
 }
 
 static int
-lnet_ioctl(unsigned int cmd, struct libcfs_ioctl_data *data)
+lnet_ioctl(unsigned int cmd, struct libcfs_ioctl_hdr *hdr)
 {
 	int rc;
 
@@ -101,7 +101,7 @@ lnet_ioctl(unsigned int cmd, struct libcfs_ioctl_data *data)
 		 * I'm called into it */
 		rc = LNetNIInit(LNET_PID_ANY);
 		if (rc >= 0) {
-			rc = LNetCtl(cmd, data);
+			rc = LNetCtl(cmd, hdr);
 			LNetNIFini();
 		}
 		return rc;
diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
index 2ca7d0e..3dc8ea7 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -814,15 +814,20 @@ out:
 }
 
 int
-lstcon_ioctl_entry(unsigned int cmd, struct libcfs_ioctl_data *data)
+lstcon_ioctl_entry(unsigned int cmd, struct libcfs_ioctl_hdr *hdr)
 {
 	char   *buf;
-	int     opc = data->ioc_u32[0];
+	struct libcfs_ioctl_data *data;
+	int     opc;
 	int     rc;
 
 	if (cmd != IOC_LIBCFS_LNETST)
 		return -EINVAL;
 
+	data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
+
+	opc = data->ioc_u32[0];
+
 	if (data->ioc_plen1 > PAGE_CACHE_SIZE)
 		return -EINVAL;
 
diff --git a/drivers/staging/lustre/lnet/selftest/console.c b/drivers/staging/lustre/lnet/selftest/console.c
index 5619fc4..898912b 100644
--- a/drivers/staging/lustre/lnet/selftest/console.c
+++ b/drivers/staging/lustre/lnet/selftest/console.c
@@ -1977,7 +1977,7 @@ static void lstcon_init_acceptor_service(void)
 	lstcon_acceptor_service.sv_wi_total = SFW_FRWK_WI_MAX;
 }
 
-extern int lstcon_ioctl_entry(unsigned int cmd, struct libcfs_ioctl_data *data);
+extern int lstcon_ioctl_entry(unsigned int cmd, struct libcfs_ioctl_hdr *hdr);
 
 static DECLARE_IOCTL_HANDLER(lstcon_ioctl_handler, lstcon_ioctl_entry);
 
diff --git a/drivers/staging/lustre/lnet/selftest/console.h b/drivers/staging/lustre/lnet/selftest/console.h
index 3f3286c..7af3540 100644
--- a/drivers/staging/lustre/lnet/selftest/console.h
+++ b/drivers/staging/lustre/lnet/selftest/console.h
@@ -184,7 +184,6 @@ lstcon_id2hash (lnet_process_id_t id, struct list_head *hash)
 }
 
 int lstcon_console_init(void);
-int lstcon_ioctl_entry(unsigned int cmd, struct libcfs_ioctl_data *data);
 int lstcon_console_fini(void);
 int lstcon_session_match(lst_sid_t sid);
 int lstcon_session_new(char *name, int key, unsigned version,
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
index 70a99cf..ef1c247 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
@@ -40,51 +40,47 @@
 
 #define LNET_MINOR 240
 
-int libcfs_ioctl_getdata(char *buf, char *end, void *arg)
+int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data)
 {
-	struct libcfs_ioctl_hdr   *hdr;
-	struct libcfs_ioctl_data  *data;
-	int orig_len;
-
-	hdr = (struct libcfs_ioctl_hdr *)buf;
-	data = (struct libcfs_ioctl_data *)buf;
-
-	if (copy_from_user(buf, arg, sizeof(*hdr)))
-		return -EFAULT;
-
-	if (hdr->ioc_version != LIBCFS_IOCTL_VERSION) {
-		CERROR("PORTALS: version mismatch kernel vs application\n");
+	if (libcfs_ioctl_is_invalid(data)) {
+		CERROR("LNET: ioctl not correctly formatted\n");
 		return -EINVAL;
 	}
 
-	if (hdr->ioc_len >= end - buf) {
-		CERROR("PORTALS: user buffer exceeds kernel buffer\n");
-		return -EINVAL;
-	}
+	if (data->ioc_inllen1 != 0)
+		data->ioc_inlbuf1 = &data->ioc_bulk[0];
 
-	if (hdr->ioc_len < sizeof(struct libcfs_ioctl_data)) {
-		CERROR("PORTALS: user buffer too small for ioctl\n");
-		return -EINVAL;
-	}
+	if (data->ioc_inllen2 != 0)
+		data->ioc_inlbuf2 = &data->ioc_bulk[0] +
+			cfs_size_round(data->ioc_inllen1);
+
+	return 0;
+}
+
+int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
+			     __u32 *len)
+{
+	struct libcfs_ioctl_hdr hdr;
 
-	orig_len = hdr->ioc_len;
-	if (copy_from_user(buf, arg, hdr->ioc_len))
+	if (copy_from_user(&hdr, arg, sizeof(hdr)))
 		return -EFAULT;
-	if (orig_len != data->ioc_len)
-		return -EINVAL;
 
-	if (libcfs_ioctl_is_invalid(data)) {
-		CERROR("PORTALS: ioctl not correctly formatted\n");
+	if (hdr.ioc_version != LIBCFS_IOCTL_VERSION) {
+		CERROR("LNET: version mismatch expected %#x, got %#x\n",
+		       LIBCFS_IOCTL_VERSION, hdr.ioc_version);
 		return -EINVAL;
 	}
 
-	if (data->ioc_inllen1)
-		data->ioc_inlbuf1 = &data->ioc_bulk[0];
+	*len = hdr.ioc_len;
 
-	if (data->ioc_inllen2)
-		data->ioc_inlbuf2 = &data->ioc_bulk[0] +
-			cfs_size_round(data->ioc_inllen1);
+	return 0;
+}
 
+int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr *buf, __u32 buf_len,
+			  const void __user *arg)
+{
+	if (copy_from_user(buf, arg, buf_len))
+		return -EFAULT;
 	return 0;
 }
 
diff --git a/drivers/staging/lustre/lustre/libcfs/module.c b/drivers/staging/lustre/lustre/libcfs/module.c
index 75247e9..5348699 100644
--- a/drivers/staging/lustre/lustre/libcfs/module.c
+++ b/drivers/staging/lustre/lustre/libcfs/module.c
@@ -54,6 +54,8 @@
 
 # define DEBUG_SUBSYSTEM S_LNET
 
+#define LIBCFS_MAX_IOCTL_BUF_LEN 2048
+
 #include "../../include/linux/libcfs/libcfs.h"
 #include <asm/div64.h>
 
@@ -241,11 +243,21 @@ int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand)
 }
 EXPORT_SYMBOL(libcfs_deregister_ioctl);
 
-static int libcfs_ioctl_int(struct cfs_psdev_file *pfile, unsigned long cmd,
-			    void *arg, struct libcfs_ioctl_data *data)
+static int libcfs_ioctl_handle(struct cfs_psdev_file *pfile, unsigned long cmd,
+			       void *arg, struct libcfs_ioctl_hdr *hdr)
 {
+	struct libcfs_ioctl_data *data = NULL;
 	int err = -EINVAL;
 
+	if ((cmd <= IOC_LIBCFS_LNETST) ||
+	    (cmd >= IOC_LIBCFS_REGISTER_MYNID)) {
+		data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
+		err = libcfs_ioctl_data_adjust(data);
+		if (err != 0) {
+			return err;
+		}
+	}
+
 	switch (cmd) {
 	case IOC_LIBCFS_CLEAR_DEBUG:
 		libcfs_debug_clear_buffer();
@@ -280,11 +292,11 @@ static int libcfs_ioctl_int(struct cfs_psdev_file *pfile, unsigned long cmd,
 		err = -EINVAL;
 		down_read(&ioctl_list_sem);
 		list_for_each_entry(hand, &ioctl_list, item) {
-			err = hand->handle_ioctl(cmd, data);
+			err = hand->handle_ioctl(cmd, hdr);
 			if (err != -EINVAL) {
 				if (err == 0)
 					err = libcfs_ioctl_popdata(arg,
-							data, sizeof(*data));
+							hdr, hdr->ioc_len);
 				break;
 			}
 		}
@@ -298,26 +310,38 @@ static int libcfs_ioctl_int(struct cfs_psdev_file *pfile, unsigned long cmd,
 
 static int libcfs_ioctl(struct cfs_psdev_file *pfile, unsigned long cmd, void *arg)
 {
-	char    *buf;
-	struct libcfs_ioctl_data *data;
+	struct libcfs_ioctl_hdr *hdr;
 	int err = 0;
+	__u32 buf_len;
+
+	err = libcfs_ioctl_getdata_len(arg, &buf_len);
+	if (err != 0)
+		return err;
+
+	/*
+	 * do a check here to restrict the size of the memory
+	 * to allocate to guard against DoS attacks.
+	 */
+	if (buf_len > LIBCFS_MAX_IOCTL_BUF_LEN) {
+		CERROR("LNET: user buffer exceeds kernel buffer\n");
+		return -EINVAL;
+	}
 
-	LIBCFS_ALLOC_GFP(buf, 1024, GFP_KERNEL);
-	if (buf == NULL)
+	LIBCFS_ALLOC_GFP(hdr, buf_len, GFP_KERNEL);
+	if (!hdr)
 		return -ENOMEM;
 
 	/* 'cmd' and permissions get checked in our arch-specific caller */
-	if (libcfs_ioctl_getdata(buf, buf + 800, arg)) {
-		CERROR("PORTALS ioctl: data error\n");
+	if (libcfs_ioctl_getdata(hdr, buf_len, arg)) {
+		CERROR("LNET ioctl: data error\n");
 		err = -EINVAL;
 		goto out;
 	}
-	data = (struct libcfs_ioctl_data *)buf;
 
-	err = libcfs_ioctl_int(pfile, cmd, arg, data);
+	err = libcfs_ioctl_handle(pfile, cmd, arg, hdr);
 
 out:
-	LIBCFS_FREE(buf, 1024);
+	LIBCFS_FREE(hdr, buf_len);
 	return err;
 }
 
-- 
1.7.1



* [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (11 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 12/40] staging: lustre: Dynamic LNet Configuration (DLC) IOCTL changes James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02 11:20   ` Dan Carpenter
  2015-12-02 12:00   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 14/40] staging: lustre: fix crash due to NULL networks string James Simmons
                   ` (27 subsequent siblings)
  40 siblings, 2 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This is the fifth patch of a set of patches that enables DLC.

This patch adds the new structures which will be used
in the IOCTL communication.  It also adds a set of
show operations to show buffers, networks, statistics
and peer information.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
Change-Id: I96e5cb3dcf07289c6cd1deb46f4acb3c263ae21e
Reviewed-on: http://review.whamcloud.com/8022
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
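
As a usage illustration for the LIBCFS_IOC_INIT_V2() macro added by this
patch, here is a small user-space sketch.  The macro body is copied from the
hunk below; struct demo_show_req and demo_make_req() are hypothetical and
only show how the second macro argument names the embedded header member.
The real request layouts are the ones defined in lib-dlc.h.

```c
#include <stdint.h>
#include <string.h>

#define LIBCFS_IOCTL_VERSION2	0x0001000b

struct libcfs_ioctl_hdr {
	uint32_t ioc_len;
	uint32_t ioc_version;
};

/* Same shape as the macro added by this patch. */
#define LIBCFS_IOC_INIT_V2(data, hdr)			\
do {							\
	memset(&(data), 0, sizeof(data));		\
	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
	(data).hdr.ioc_len = sizeof(data);		\
} while (0)

/* Hypothetical request type, for illustration only. */
struct demo_show_req {
	struct libcfs_ioctl_hdr req_hdr;
	uint32_t req_index;	/* which entry to show */
};

/* Build a zeroed request with version and length filled in. */
static struct demo_show_req demo_make_req(uint32_t index)
{
	struct demo_show_req req;

	LIBCFS_IOC_INIT_V2(req, req_hdr);
	req.req_index = index;
	return req;
}
```

Passing the member name lets one macro serve structures whose header field
is named differently, unlike LIBCFS_IOC_INIT(), which assumes ioc_hdr.
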
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |   44 +++++++-
 .../staging/lustre/include/linux/lnet/lib-dlc.h    |  118 ++++++++++++++++++++
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    5 +
 drivers/staging/lustre/lnet/lnet/api-ni.c          |   47 +++++++-
 drivers/staging/lustre/lnet/lnet/module.c          |    4 +
 drivers/staging/lustre/lnet/lnet/peer.c            |   61 ++++++++++
 .../lustre/lustre/libcfs/linux/linux-module.c      |    3 +-
 drivers/staging/lustre/lustre/libcfs/module.c      |   15 ++-
 8 files changed, 282 insertions(+), 15 deletions(-)
 create mode 100644 drivers/staging/lustre/include/linux/lnet/lib-dlc.h

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
index e14788c..f24330d 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
@@ -41,7 +41,8 @@
 #ifndef __LIBCFS_IOCTL_H__
 #define __LIBCFS_IOCTL_H__
 
-#define LIBCFS_IOCTL_VERSION 0x0001000a
+#define LIBCFS_IOCTL_VERSION	0x0001000a
+#define LIBCFS_IOCTL_VERSION2	0x0001000b
 
 struct libcfs_ioctl_hdr {
 	__u32 ioc_len;
@@ -87,6 +88,13 @@ do {						    \
 	data.ioc_hdr.ioc_len = sizeof(data);			\
 } while (0)
 
+#define LIBCFS_IOC_INIT_V2(data, hdr)			\
+do {							\
+	memset(&(data), 0, sizeof(data));		\
+	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
+	(data).hdr.ioc_len = sizeof(data);		\
+} while (0)
+
 struct libcfs_ioctl_handler {
 	struct list_head item;
 	int (*handle_ioctl)(unsigned int cmd, struct libcfs_ioctl_hdr *hdr);
@@ -112,9 +120,6 @@ struct libcfs_ioctl_handler {
 /* lnet ioctls */
 #define IOC_LIBCFS_GET_NI		  _IOWR('e', 50, long)
 #define IOC_LIBCFS_FAIL_NID		_IOWR('e', 51, long)
-#define IOC_LIBCFS_ADD_ROUTE	       _IOWR('e', 52, long)
-#define IOC_LIBCFS_DEL_ROUTE	       _IOWR('e', 53, long)
-#define IOC_LIBCFS_GET_ROUTE	       _IOWR('e', 54, long)
 #define IOC_LIBCFS_NOTIFY_ROUTER	   _IOWR('e', 55, long)
 #define IOC_LIBCFS_UNCONFIGURE	     _IOWR('e', 56, long)
 #define IOC_LIBCFS_PORTALS_COMPATIBILITY   _IOWR('e', 57, long)
@@ -137,7 +142,36 @@ struct libcfs_ioctl_handler {
 #define IOC_LIBCFS_DEL_INTERFACE	   _IOWR('e', 79, long)
 #define IOC_LIBCFS_GET_INTERFACE	   _IOWR('e', 80, long)
 
-#define IOC_LIBCFS_MAX_NR			     80
+/*
+ * DLC Specific IOCTL numbers.
+ * In order to maintain backward compatibility with any possible external
+ * tools which might be accessing the IOCTL numbers, a new group of IOCTL
+ * numbers has been allocated.
+ */
+#define IOCTL_CONFIG_SIZE		struct lnet_ioctl_config_data
+#define IOC_LIBCFS_ADD_ROUTE		_IOWR(IOC_LIBCFS_TYPE, 81, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_DEL_ROUTE		_IOWR(IOC_LIBCFS_TYPE, 82, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_GET_ROUTE		_IOWR(IOC_LIBCFS_TYPE, 83, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_ADD_NET		_IOWR(IOC_LIBCFS_TYPE, 84, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_DEL_NET		_IOWR(IOC_LIBCFS_TYPE, 85, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_GET_NET		_IOWR(IOC_LIBCFS_TYPE, 86, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_CONFIG_RTR		_IOWR(IOC_LIBCFS_TYPE, 87, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_ADD_BUF		_IOWR(IOC_LIBCFS_TYPE, 88, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_GET_BUF		_IOWR(IOC_LIBCFS_TYPE, 89, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_GET_PEER_INFO	_IOWR(IOC_LIBCFS_TYPE, 90, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_GET_LNET_STATS	_IOWR(IOC_LIBCFS_TYPE, 91, \
+					      IOCTL_CONFIG_SIZE)
+#define IOC_LIBCFS_MAX_NR		91
 
 static inline int libcfs_ioctl_packlen(struct libcfs_ioctl_data *data)
 {
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-dlc.h b/drivers/staging/lustre/include/linux/lnet/lib-dlc.h
new file mode 100644
index 0000000..b6a2e91
--- /dev/null
+++ b/drivers/staging/lustre/include/linux/lnet/lib-dlc.h
@@ -0,0 +1,118 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ *
+ * Contributers:
+ *   Amir Shehata
+ */
+
+#ifndef LNET_DLC_H
+#define LNET_DLC_H
+
+#include "../libcfs/libcfs_ioctl.h"
+#include "types.h"
+
+#define MAX_NUM_SHOW_ENTRIES	32
+#define LNET_MAX_STR_LEN	128
+#define LNET_MAX_SHOW_NUM_CPT	128
+
+struct lnet_ioctl_net_config {
+	char ni_interfaces[LNET_MAX_INTERFACES][LNET_MAX_STR_LEN];
+	__u32 ni_status;
+	__u32 ni_cpts[LNET_MAX_SHOW_NUM_CPT];
+};
+
+#define LNET_TINY_BUF_IDX	0
+#define LNET_SMALL_BUF_IDX	1
+#define LNET_LARGE_BUF_IDX	2
+
+/* # different router buffer pools */
+#define LNET_NRBPOOLS		(LNET_LARGE_BUF_IDX + 1)
+
+struct lnet_ioctl_pool_cfg {
+	struct {
+		__u32 pl_npages;
+		__u32 pl_nbuffers;
+		__u32 pl_credits;
+		__u32 pl_mincredits;
+	} pl_pools[LNET_NRBPOOLS];
+	__u32 pl_routing;
+};
+
+struct lnet_ioctl_config_data {
+	struct libcfs_ioctl_hdr cfg_hdr;
+
+	__u32 cfg_net;
+	__u32 cfg_count;
+	__u64 cfg_nid;
+	__u32 cfg_ncpts;
+
+	union {
+		struct {
+			__u32 rtr_hop;
+			__u32 rtr_priority;
+			__u32 rtr_flags;
+		} cfg_route;
+		struct {
+			char net_intf[LNET_MAX_STR_LEN];
+			__s32 net_peer_timeout;
+			__s32 net_peer_tx_credits;
+			__s32 net_peer_rtr_credits;
+			__s32 net_max_tx_credits;
+			__u32 net_cksum_algo;
+			__u32 net_pad;
+		} cfg_net;
+		struct {
+			__u32 buf_enable;
+			__s32 buf_tiny;
+			__s32 buf_small;
+			__s32 buf_large;
+		} cfg_buffers;
+	} cfg_config_u;
+
+	char cfg_bulk[0];
+};
+
+struct lnet_ioctl_peer {
+	struct libcfs_ioctl_hdr pr_hdr;
+	__u32 pr_count;
+	__u32 pr_pad;
+	__u64 pr_nid;
+
+	union {
+		struct {
+			char cr_aliveness[LNET_MAX_STR_LEN];
+			__u32 cr_refcount;
+			__u32 cr_ni_peer_tx_credits;
+			__u32 cr_peer_tx_credits;
+			__u32 cr_peer_rtr_credits;
+			__u32 cr_peer_min_rtr_credits;
+			__u32 cr_peer_tx_qnob;
+			__u32 cr_ncpt;
+		} pr_peer_credits;
+	} pr_lnd_u;
+};
+
+struct lnet_ioctl_lnet_stats {
+	struct libcfs_ioctl_hdr st_hdr;
+	struct lnet_counters st_cntrs;
+};
+
+#endif /* LNET_DLC_H */
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 94d0dc5..f2874e0 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -694,6 +694,11 @@ void lnet_peer_tables_cleanup(lnet_ni_t *ni);
 void lnet_peer_tables_destroy(void);
 int lnet_peer_tables_create(void);
 void lnet_debug_peer(lnet_nid_t nid);
+int lnet_get_peers(int count, __u64 *nid, char *alivness,
+		   int *ncpt, int *refcount,
+		   int *ni_peer_tx_credits, int *peer_tx_credits,
+		   int *peer_rtr_credits, int *peer_min_rtr_credtis,
+		   int *peer_tx_qnob);
 
 static inline void
 lnet_peer_set_alive(lnet_peer_t *lp)
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 9661f6a..165345c 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -39,6 +39,7 @@
 #include <linux/ktime.h>
 
 #include "../../include/linux/lnet/lib-lnet.h"
+#include "../../include/linux/lnet/lib-dlc.h"
 
 #define D_LNI D_CONSOLE
 
@@ -1741,6 +1742,7 @@ int
 LNetCtl(unsigned int cmd, void *arg)
 {
 	struct libcfs_ioctl_data *data = arg;
+	struct lnet_ioctl_config_data *config;
 	lnet_process_id_t id = {0};
 	lnet_ni_t *ni;
 	int rc;
@@ -1765,16 +1767,51 @@ LNetCtl(unsigned int cmd, void *arg)
 		return (rc != 0) ? rc : lnet_check_routes();
 
 	case IOC_LIBCFS_DEL_ROUTE:
+		config = arg;
 		mutex_lock(&the_lnet.ln_api_mutex);
-		rc = lnet_del_route(data->ioc_net, data->ioc_nid);
+		rc = lnet_del_route(config->cfg_net, config->cfg_nid);
 		mutex_unlock(&the_lnet.ln_api_mutex);
 		return rc;
 
 	case IOC_LIBCFS_GET_ROUTE:
-		return lnet_get_route(data->ioc_count,
-				      &data->ioc_net, &data->ioc_count,
-				      &data->ioc_nid, &data->ioc_flags,
-				      &data->ioc_priority);
+		config = arg;
+		return lnet_get_route(config->cfg_count,
+				      &config->cfg_net,
+				      &config->cfg_config_u.cfg_route.rtr_hop,
+				      &config->cfg_nid,
+				      &config->cfg_config_u.cfg_route.rtr_flags,
+				      &config->cfg_config_u.cfg_route.
+					rtr_priority);
+
+	case IOC_LIBCFS_ADD_NET:
+		return 0;
+
+	case IOC_LIBCFS_DEL_NET:
+		return 0;
+
+	case IOC_LIBCFS_GET_NET:
+		return 0;
+
+	case IOC_LIBCFS_GET_LNET_STATS:
+	{
+		struct lnet_ioctl_lnet_stats *lnet_stats = arg;
+
+		lnet_counters_get(&lnet_stats->st_cntrs);
+		return 0;
+	}
+
+	case IOC_LIBCFS_CONFIG_RTR:
+		return 0;
+
+	case IOC_LIBCFS_ADD_BUF:
+		return 0;
+
+	case IOC_LIBCFS_GET_BUF:
+		return 0;
+
+	case IOC_LIBCFS_GET_PEER_INFO:
+		return 0;
+
 	case IOC_LIBCFS_NOTIFY_ROUTER:
 		secs_passed = (ktime_get_real_seconds() - data->ioc_u64[0]);
 		return lnet_notify(NULL, data->ioc_nid, data->ioc_flags,
diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c
index 0afdad0..ffc5700 100644
--- a/drivers/staging/lustre/lnet/lnet/module.c
+++ b/drivers/staging/lustre/lnet/lnet/module.c
@@ -36,6 +36,7 @@
 
 #define DEBUG_SUBSYSTEM S_LNET
 #include "../../include/linux/lnet/lib-lnet.h"
+#include "../../include/linux/lnet/lib-dlc.h"
 
 static int config_on_load;
 module_param(config_on_load, int, 0444);
@@ -95,6 +96,9 @@ lnet_ioctl(unsigned int cmd, struct libcfs_ioctl_hdr *hdr)
 	case IOC_LIBCFS_UNCONFIGURE:
 		return lnet_unconfigure();
 
+	case IOC_LIBCFS_ADD_NET:
+		return LNetCtl(cmd, hdr);
+
 	default:
 		/* Passing LNET_PID_ANY only gives me a ref if the net is up
 		 * already; I'll need it to ensure the net can't go down while
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index bb5a0bb..1402e27 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -39,6 +39,7 @@
 #define DEBUG_SUBSYSTEM S_LNET
 
 #include "../../include/linux/lnet/lib-lnet.h"
+#include "../../include/linux/lnet/lib-dlc.h"
 
 int
 lnet_peer_tables_create(void)
@@ -392,3 +393,63 @@ lnet_debug_peer(lnet_nid_t nid)
 
 	lnet_net_unlock(cpt);
 }
+
+int lnet_get_peers(int count, __u64 *nid, char *aliveness,
+		   int *ncpt, int *refcount,
+		   int *ni_peer_tx_credits, int *peer_tx_credits,
+		   int *peer_rtr_credits, int *peer_min_rtr_credits,
+		   int *peer_tx_qnob)
+{
+	struct lnet_peer_table *peer_table;
+	lnet_peer_t *lp;
+	int j;
+	int lncpt, found = 0;
+
+	/* get the number of CPTs */
+	lncpt = cfs_percpt_number(the_lnet.ln_peer_tables);
+
+	/*
+	 * if the cpt number to be examined is >= the number of cpts in
+	 * the system then indicate that there are no more cpts to examin
+	 */
+	if (*ncpt > lncpt)
+		return -1;
+
+	/* get the current table */
+	peer_table = the_lnet.ln_peer_tables[*ncpt];
+	/* if the ptable is NULL then there are no more cpts to examine */
+	if (!peer_table)
+		return -1;
+
+	lnet_net_lock(*ncpt);
+
+	for (j = 0; j < LNET_PEER_HASH_SIZE && !found; j++) {
+		struct list_head *peers = &peer_table->pt_hash[j];
+
+		list_for_each_entry(lp, peers, lp_hashlist) {
+			if (count-- > 0)
+				continue;
+
+			snprintf(aliveness, LNET_MAX_STR_LEN, "NA");
+			if (lnet_isrouter(lp) ||
+			    lnet_peer_aliveness_enabled(lp))
+				snprintf(aliveness, LNET_MAX_STR_LEN,
+					 lp->lp_alive ? "up" : "down");
+
+			*nid = lp->lp_nid;
+			*refcount = lp->lp_refcount;
+			*ni_peer_tx_credits = lp->lp_ni->ni_peertxcredits;
+			*peer_tx_credits = lp->lp_txcredits;
+			*peer_rtr_credits = lp->lp_rtrcredits;
+			*peer_min_rtr_credits = lp->lp_mintxcredits;
+			*peer_tx_qnob = lp->lp_txqnob;
+
+			found = 1;
+		}
+	}
+	lnet_net_unlock(*ncpt);
+
+	*ncpt = lncpt;
+
+	return found ? 0 : -ENOENT;
+}
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
index ef1c247..1c31e2e 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
@@ -65,7 +65,8 @@ int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
 	if (copy_from_user(&hdr, arg, sizeof(hdr)))
 		return -EFAULT;
 
-	if (hdr.ioc_version != LIBCFS_IOCTL_VERSION) {
+	if (hdr.ioc_version != LIBCFS_IOCTL_VERSION &&
+	    hdr.ioc_version != LIBCFS_IOCTL_VERSION2) {
 		CERROR("LNET: version mismatch expected %#x, got %#x\n",
 		       LIBCFS_IOCTL_VERSION, hdr.ioc_version);
 		return -EINVAL;
diff --git a/drivers/staging/lustre/lustre/libcfs/module.c b/drivers/staging/lustre/lustre/libcfs/module.c
index 5348699..992ff3c 100644
--- a/drivers/staging/lustre/lustre/libcfs/module.c
+++ b/drivers/staging/lustre/lustre/libcfs/module.c
@@ -54,13 +54,15 @@
 
 # define DEBUG_SUBSYSTEM S_LNET
 
-#define LIBCFS_MAX_IOCTL_BUF_LEN 2048
+#define LNET_MAX_IOCTL_BUF_LEN (sizeof(struct lnet_ioctl_net_config) + \
+				sizeof(struct lnet_ioctl_config_data))
 
 #include "../../include/linux/libcfs/libcfs.h"
 #include <asm/div64.h>
 
 #include "../../include/linux/libcfs/libcfs_crypto.h"
 #include "../../include/linux/lnet/lib-lnet.h"
+#include "../../include/linux/lnet/lib-dlc.h"
 #include "../../include/linux/lnet/lnet.h"
 #include "tracefile.h"
 
@@ -249,8 +251,13 @@ static int libcfs_ioctl_handle(struct cfs_psdev_file *pfile, unsigned long cmd,
 	struct libcfs_ioctl_data *data = NULL;
 	int err = -EINVAL;
 
-	if ((cmd <= IOC_LIBCFS_LNETST) ||
-	    (cmd >= IOC_LIBCFS_REGISTER_MYNID)) {
+	/*
+	 * The libcfs_ioctl_data_adjust() function performs adjustment
+	 * operations on the libcfs_ioctl_data structure to make
+	 * it usable by the code.  This doesn't need to be called
+	 * for new data structures added.
+	 */
+	if (hdr->ioc_version == LIBCFS_IOCTL_VERSION) {
 		data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
 		err = libcfs_ioctl_data_adjust(data);
 		if (err != 0) {
@@ -322,7 +329,7 @@ static int libcfs_ioctl(struct cfs_psdev_file *pfile, unsigned long cmd, void *a
 	 * do a check here to restrict the size of the memory
 	 * to allocate to guard against DoS attacks.
 	 */
-	if (buf_len > LIBCFS_MAX_IOCTL_BUF_LEN) {
+	if (buf_len > LNET_MAX_IOCTL_BUF_LEN) {
 		CERROR("LNET: user buffer exceeds kernel buffer\n");
 		return -EINVAL;
 	}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 14/40] staging: lustre: fix crash due to NULL networks string
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (12 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02 11:27   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 15/40] staging: lustre: DLC user/kernel space glue code James Simmons
                   ` (26 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

If the networks or ip2nets module parameter is invalid,
lnet_parse_networks() gets called with a NULL 'networks' string
parameter and crashes in strlen().

lnet_parse_networks() needs to sanitize its input string now that
it's being called from multiple places.  Rather than relying on each
caller, check for a NULL string every time the function is called,
which reduces the probability of errors with other code modifications.
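
The guard described above can be sketched in isolation as follows. This
is only an illustration of the pattern, not the code in the patch: the
function name and the SKETCH_MAX_NETWORKS_LEN limit (standing in for
LNET_SINGLE_TEXTBUF_NOB) are assumptions made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit standing in for LNET_SINGLE_TEXTBUF_NOB. */
#define SKETCH_MAX_NETWORKS_LEN 4096

/*
 * Validate a networks string before calling strlen() on it.
 * Returns the token buffer size (length + 1) on success, or
 * -EINVAL if the string is NULL or longer than the limit.
 */
static int sketch_networks_tokensize(const char *networks)
{
	size_t len;

	/* The NULL check must come before any use of the pointer. */
	if (!networks)
		return -EINVAL;

	len = strlen(networks);
	if (len > SKETCH_MAX_NETWORKS_LEN)
		return -EINVAL;

	return (int)len + 1;
}
```

Doing the check inside the callee means a new caller cannot forget it,
which appears to be the motivation given in the message above.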

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5540
Reviewed-on: http://review.whamcloud.com/11626
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |    5 +----
 drivers/staging/lustre/lnet/lnet/config.c |    9 ++++++++-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 165345c..cc87900 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1524,7 +1524,6 @@ LNetNIInit(lnet_pid_t requested_pid)
 	lnet_ping_info_t *pinfo;
 	lnet_handle_md_t md_handle;
 	struct list_head net_head;
-	char *nets;
 
 	INIT_LIST_HEAD(&net_head);
 
@@ -1539,13 +1538,11 @@ LNetNIInit(lnet_pid_t requested_pid)
 		return rc;
 	}
 
-	nets = lnet_get_networks();
-
 	rc = lnet_prepare(requested_pid);
 	if (rc != 0)
 		goto failed0;
 
-	rc = lnet_parse_networks(&net_head, nets);
+	rc = lnet_parse_networks(&net_head, lnet_get_networks());
 	if (rc < 0)
 		goto failed1;
 
diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index 1c7ad7c..d1e0217 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -184,7 +184,7 @@ int
 lnet_parse_networks(struct list_head *nilist, char *networks)
 {
 	struct cfs_expr_list *el = NULL;
-	int tokensize = strlen(networks) + 1;
+	int tokensize;
 	char *tokens;
 	char *str;
 	char *tmp;
@@ -192,6 +192,11 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 	__u32 net;
 	int nnets = 0;
 
+	if (!networks) {
+		CERROR("networks string is undefined\n");
+		return -EINVAL;
+	}
+
 	if (strlen(networks) > LNET_SINGLE_TEXTBUF_NOB) {
 		/* _WAY_ conservative */
 		LCONSOLE_ERROR_MSG(0x112,
@@ -199,6 +204,8 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 		return -EINVAL;
 	}
 
+	tokensize = strlen(networks) + 1;
+
 	LIBCFS_ALLOC(tokens, tokensize);
 	if (tokens == NULL) {
 		CERROR("Can't allocate net tokens\n");
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 15/40] staging: lustre: DLC user/kernel space glue code
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (13 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 14/40] staging: lustre: fix crash due to NULL networks string James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02 12:11   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 16/40] staging: lustre: make local functions static for LNet ni James Simmons
                   ` (25 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This is the sixth patch of a set of patches that enables DLC.

This patch enables user space to call into the kernel space DLC
code.  It adds handlers to the LNetCtl function that call the new
functions added for Dynamic LNet Configuration.
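
As a rough sketch of the glue pattern described above: a control
function casts the opaque argument to a config structure and forwards
the command-specific union members to a back end.  Everything named
sketch_* below is hypothetical; the structure is a trimmed-down
stand-in for struct lnet_ioctl_config_data, the back ends are stubs,
and rejecting unknown commands with -EINVAL is an assumption of this
example rather than what LNetCtl does.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical command codes; the real ones are IOC_LIBCFS_* values. */
enum sketch_cmd {
	SKETCH_ADD_ROUTE,
	SKETCH_CONFIG_RTR,
};

/* Trimmed-down stand-in for struct lnet_ioctl_config_data. */
struct sketch_config {
	uint32_t cfg_net;
	uint64_t cfg_nid;
	union {
		struct {
			uint32_t rtr_hop;
			uint32_t rtr_priority;
		} cfg_route;
		struct {
			uint32_t buf_enable;
		} cfg_buffers;
	} cfg_config_u;
};

/* Stub back-end state that only records what was requested. */
static uint32_t sketch_last_hop;
static int sketch_routing_enabled;

static int sketch_add_route(uint32_t net, uint32_t hop, uint64_t nid,
			    uint32_t priority)
{
	(void)net;
	(void)nid;
	(void)priority;
	sketch_last_hop = hop;
	return 0;
}

/* Dispatch a command to its back end using the matching union member. */
static int sketch_ctl(unsigned int cmd, void *arg)
{
	struct sketch_config *config = arg;

	if (!config)
		return -EINVAL;

	switch (cmd) {
	case SKETCH_ADD_ROUTE:
		return sketch_add_route(config->cfg_net,
					config->cfg_config_u.cfg_route.rtr_hop,
					config->cfg_nid,
					config->cfg_config_u.cfg_route.rtr_priority);
	case SKETCH_CONFIG_RTR:
		sketch_routing_enabled =
			!!config->cfg_config_u.cfg_buffers.buf_enable;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Each command reads only the union member that belongs to it, so the
caller is responsible for filling in the member matching the command
it sends.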

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
Reviewed-on: http://review.whamcloud.com/8023
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   22 ++-
 .../staging/lustre/include/linux/lnet/lib-types.h  |    8 +
 drivers/staging/lustre/lnet/lnet/api-ni.c          |  172 ++++++++++++++++++--
 drivers/staging/lustre/lnet/lnet/module.c          |   53 ++++++-
 drivers/staging/lustre/lnet/lnet/peer.c            |   34 ++--
 drivers/staging/lustre/lnet/lnet/router.c          |   71 +++++++--
 6 files changed, 307 insertions(+), 53 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index f2874e0..63919dd 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -39,6 +39,7 @@
 #include "api.h"
 #include "lnet.h"
 #include "lib-types.h"
+#include "lib-dlc.h"
 
 extern lnet_t	the_lnet;	/* THE network */
 
@@ -456,6 +457,12 @@ int lnet_del_route(__u32 net, lnet_nid_t gw_nid);
 void lnet_destroy_routes(void);
 int lnet_get_route(int idx, __u32 *net, __u32 *hops,
 		   lnet_nid_t *gateway, __u32 *alive, __u32 *priority);
+int lnet_get_net_config(int idx, __u32 *cpt_count, __u64 *nid,
+			int *peer_timeout, int *peer_tx_credits,
+			int *peer_rtr_cr, int *max_tx_credits,
+			struct lnet_ioctl_net_config *net_config);
+int lnet_get_rtr_pool_cfg(int idx, struct lnet_ioctl_pool_cfg *pool_cfg);
+
 void lnet_router_debugfs_init(void);
 void lnet_router_debugfs_fini(void);
 int  lnet_rtrpools_alloc(int im_a_router);
@@ -465,6 +472,10 @@ int lnet_rtrpools_enable(void);
 void lnet_rtrpools_disable(void);
 void lnet_rtrpools_free(int keep_pools);
 lnet_remotenet_t *lnet_find_net_locked(__u32 net);
+int lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
+		    __s32 peer_timeout, __s32 peer_cr, __s32 peer_buf_cr,
+		    __s32 credits);
+int lnet_dyn_del_ni(__u32 net);
 
 int lnet_islocalnid(lnet_nid_t nid);
 int lnet_islocalnet(__u32 net);
@@ -694,11 +705,12 @@ void lnet_peer_tables_cleanup(lnet_ni_t *ni);
 void lnet_peer_tables_destroy(void);
 int lnet_peer_tables_create(void);
 void lnet_debug_peer(lnet_nid_t nid);
-int lnet_get_peers(int count, __u64 *nid, char *alivness,
-		   int *ncpt, int *refcount,
-		   int *ni_peer_tx_credits, int *peer_tx_credits,
-		   int *peer_rtr_credits, int *peer_min_rtr_credtis,
-		   int *peer_tx_qnob);
+int lnet_get_peer_info(__u32 peer_index, __u64 *nid,
+		       char alivness[LNET_MAX_STR_LEN],
+		       __u32 *cpt_iter, __u32 *refcount,
+		       __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits,
+		       __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credtis,
+		       __u32 *peer_tx_qnob);
 
 static inline void
 lnet_peer_set_alive(lnet_peer_t *lp)
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index e7585b9..3282782 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -611,6 +611,14 @@ typedef struct {
 	/* test protocol compatibility flags */
 	int				  ln_testprotocompat;
 
+	/*
+	 * 0 - load the NIs from the mod params
+	 * 1 - do not load the NIs from the mod params
+	 * Reverse logic to ensure that other calls to LNetNIInit
+	 * need no change
+	 */
+	bool				  ln_nis_from_mod_params;
+
 } lnet_t;
 
 #endif
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index cc87900..125d018 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1542,7 +1542,9 @@ LNetNIInit(lnet_pid_t requested_pid)
 	if (rc != 0)
 		goto failed0;
 
-	rc = lnet_parse_networks(&net_head, lnet_get_networks());
+	rc = lnet_parse_networks(&net_head,
+				 !the_lnet.ln_nis_from_mod_params ?
+				 lnet_get_networks() : "");
 	if (rc < 0)
 		goto failed1;
 
@@ -1657,6 +1659,93 @@ LNetNIFini(void)
 }
 EXPORT_SYMBOL(LNetNIFini);
 
+/**
+ * Grabs the ni data from the ni structure and fills the out
+ * parameters
+ *
+ * \param[in] ni network       interface structure
+ * \param[out] cpt_count       the number of cpts the ni is on
+ * \param[out] nid             Network Interface ID
+ * \param[out] peer_timeout    NI peer timeout
+ * \param[out] peer_tx_crdits  NI peer transmit credits
+ * \param[out] peer_rtr_credits NI peer router credits
+ * \param[out] max_tx_credits  NI max transmit credit
+ * \param[out] net_config      Network configuration
+ */
+static void
+lnet_fill_ni_info(struct lnet_ni *ni, __u32 *cpt_count, __u64 *nid,
+		  int *peer_timeout, int *peer_tx_credits,
+		  int *peer_rtr_credits, int *max_tx_credits,
+		  struct lnet_ioctl_net_config *net_config)
+{
+	int i;
+
+	if (!ni)
+		return;
+
+	if (!net_config)
+		return;
+
+	CLASSERT(ARRAY_SIZE(ni->ni_interfaces) ==
+		 ARRAY_SIZE(net_config->ni_interfaces));
+
+	if (ni->ni_interfaces[0]) {
+		for (i = 0; i < ARRAY_SIZE(ni->ni_interfaces); i++) {
+			if (ni->ni_interfaces[i]) {
+				strncpy(net_config->ni_interfaces[i],
+					ni->ni_interfaces[i],
+					sizeof(net_config->ni_interfaces[i]));
+			}
+		}
+	}
+
+	*nid = ni->ni_nid;
+	*peer_timeout = ni->ni_peertimeout;
+	*peer_tx_credits = ni->ni_peertxcredits;
+	*peer_rtr_credits = ni->ni_peerrtrcredits;
+	*max_tx_credits = ni->ni_maxtxcredits;
+
+	net_config->ni_status = ni->ni_status->ns_status;
+
+	for (i = 0;
+	     ni->ni_cpts && i < ni->ni_ncpts &&
+	     i < LNET_MAX_SHOW_NUM_CPT;
+	     i++)
+		net_config->ni_cpts[i] = ni->ni_cpts[i];
+
+	*cpt_count = ni->ni_ncpts;
+}
+
+int
+lnet_get_net_config(int idx, __u32 *cpt_count, __u64 *nid, int *peer_timeout,
+		    int *peer_tx_credits, int *peer_rtr_credits,
+		    int *max_tx_credits,
+		    struct lnet_ioctl_net_config *net_config)
+{
+	struct lnet_ni *ni;
+	struct list_head *tmp;
+	int cpt;
+	int rc = -ENOENT;
+
+	cpt = lnet_net_lock_current();
+
+	list_for_each(tmp, &the_lnet.ln_nis) {
+		ni = list_entry(tmp, lnet_ni_t, ni_list);
+		if (idx-- == 0) {
+			rc = 0;
+			lnet_ni_lock(ni);
+			lnet_fill_ni_info(ni, cpt_count, nid, peer_timeout,
+					  peer_tx_credits, peer_rtr_credits,
+					  max_tx_credits, net_config);
+			lnet_ni_unlock(ni);
+			break;
+		}
+	}
+
+	lnet_net_unlock(cpt);
+	return rc;
+}
+
 int
 lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
 		__s32 peer_timeout, __s32 peer_cr, __s32 peer_buf_cr,
@@ -1757,9 +1846,13 @@ LNetCtl(unsigned int cmd, void *arg)
 		return lnet_fail_nid(data->ioc_nid, data->ioc_count);
 
 	case IOC_LIBCFS_ADD_ROUTE:
+		config = arg;
 		mutex_lock(&the_lnet.ln_api_mutex);
-		rc = lnet_add_route(data->ioc_net, data->ioc_count,
-				    data->ioc_nid, data->ioc_priority);
+		rc = lnet_add_route(config->cfg_net,
+				    config->cfg_config_u.cfg_route.rtr_hop,
+				    config->cfg_nid,
+				    config->cfg_config_u.cfg_route.
+					rtr_priority);
 		mutex_unlock(&the_lnet.ln_api_mutex);
 		return (rc != 0) ? rc : lnet_check_routes();
 
@@ -1780,14 +1873,28 @@ LNetCtl(unsigned int cmd, void *arg)
 				      &config->cfg_config_u.cfg_route.
 					rtr_priority);
 
-	case IOC_LIBCFS_ADD_NET:
-		return 0;
-
-	case IOC_LIBCFS_DEL_NET:
-		return 0;
+	case IOC_LIBCFS_GET_NET: {
+		struct lnet_ioctl_net_config *net_config;
 
-	case IOC_LIBCFS_GET_NET:
-		return 0;
+		config = arg;
+		net_config = (struct lnet_ioctl_net_config *)
+				config->cfg_bulk;
+		if (!config || !net_config)
+			return -1;
+
+		return lnet_get_net_config(config->cfg_count,
+					   &config->cfg_ncpts,
+					   &config->cfg_nid,
+					   &config->cfg_config_u.cfg_net.
+						net_peer_timeout,
+					   &config->cfg_config_u.cfg_net.
+						net_peer_tx_credits,
+					   &config->cfg_config_u.cfg_net.
+						net_peer_rtr_credits,
+					   &config->cfg_config_u.cfg_net.
+						net_max_tx_credits,
+					   net_config);
+	}
 
 	case IOC_LIBCFS_GET_LNET_STATS:
 	{
@@ -1798,16 +1905,51 @@ LNetCtl(unsigned int cmd, void *arg)
 	}
 
 	case IOC_LIBCFS_CONFIG_RTR:
+		config = arg;
+		mutex_lock(&the_lnet.ln_api_mutex);
+		if (config->cfg_config_u.cfg_buffers.buf_enable) {
+			rc = lnet_rtrpools_enable();
+			mutex_unlock(&the_lnet.ln_api_mutex);
+			return rc;
+		}
+		lnet_rtrpools_disable();
+		mutex_unlock(&the_lnet.ln_api_mutex);
 		return 0;
 
 	case IOC_LIBCFS_ADD_BUF:
-		return 0;
+		config = arg;
+		mutex_lock(&the_lnet.ln_api_mutex);
+		rc = lnet_rtrpools_adjust(config->cfg_config_u.cfg_buffers.
+						buf_tiny,
+					  config->cfg_config_u.cfg_buffers.
+						buf_small,
+					  config->cfg_config_u.cfg_buffers.
+						buf_large);
+		mutex_unlock(&the_lnet.ln_api_mutex);
+		return rc;
 
-	case IOC_LIBCFS_GET_BUF:
-		return 0;
+	case IOC_LIBCFS_GET_BUF: {
+		struct lnet_ioctl_pool_cfg *pool_cfg;
 
-	case IOC_LIBCFS_GET_PEER_INFO:
-		return 0;
+		config = arg;
+		pool_cfg = (struct lnet_ioctl_pool_cfg *)config->cfg_bulk;
+		return lnet_get_rtr_pool_cfg(config->cfg_count, pool_cfg);
+	}
+
+	case IOC_LIBCFS_GET_PEER_INFO: {
+		struct lnet_ioctl_peer *peer_info = arg;
+
+		return lnet_get_peer_info(peer_info->pr_count,
+			&peer_info->pr_nid,
+			peer_info->pr_lnd_u.pr_peer_credits.cr_aliveness,
+			&peer_info->pr_lnd_u.pr_peer_credits.cr_ncpt,
+			&peer_info->pr_lnd_u.pr_peer_credits.cr_refcount,
+			&peer_info->pr_lnd_u.pr_peer_credits.cr_ni_peer_tx_credits,
+			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_credits,
+			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_rtr_credits,
+			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_min_rtr_credits,
+			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_qnob);
+	}
 
 	case IOC_LIBCFS_NOTIFY_ROUTER:
 		secs_passed = (ktime_get_real_seconds() - data->ioc_u64[0]);
diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c
index ffc5700..281315c 100644
--- a/drivers/staging/lustre/lnet/lnet/module.c
+++ b/drivers/staging/lustre/lnet/lnet/module.c
@@ -84,20 +84,69 @@ lnet_unconfigure(void)
 	return (refcount == 0) ? 0 : -EBUSY;
 }
 
+int
+lnet_dyn_configure(struct libcfs_ioctl_hdr *hdr)
+{
+	struct lnet_ioctl_config_data *conf =
+		(struct lnet_ioctl_config_data *)hdr;
+	int rc;
+
+	mutex_lock(&lnet_config_mutex);
+	if (the_lnet.ln_niinit_self)
+		rc = lnet_dyn_add_ni(LUSTRE_SRV_LNET_PID,
+				     conf->cfg_config_u.cfg_net.net_intf,
+				     conf->cfg_config_u.cfg_net.
+					net_peer_timeout,
+				     conf->cfg_config_u.cfg_net.
+					net_peer_tx_credits,
+				     conf->cfg_config_u.cfg_net.
+					net_peer_rtr_credits,
+				     conf->cfg_config_u.cfg_net.
+					net_max_tx_credits);
+	else
+		rc = -EINVAL;
+	mutex_unlock(&lnet_config_mutex);
+	return rc;
+}
+
+int
+lnet_dyn_unconfigure(struct libcfs_ioctl_hdr *hdr)
+{
+	struct lnet_ioctl_config_data *conf =
+		(struct lnet_ioctl_config_data *)hdr;
+	int rc;
+
+	mutex_lock(&lnet_config_mutex);
+	if (the_lnet.ln_niinit_self)
+		rc = lnet_dyn_del_ni(conf->cfg_net);
+	else
+		rc = -EINVAL;
+	mutex_unlock(&lnet_config_mutex);
+
+	return rc;
+}
+
 static int
 lnet_ioctl(unsigned int cmd, struct libcfs_ioctl_hdr *hdr)
 {
 	int rc;
 
 	switch (cmd) {
-	case IOC_LIBCFS_CONFIGURE:
+	case IOC_LIBCFS_CONFIGURE: {
+		struct libcfs_ioctl_data *data =
+			(struct libcfs_ioctl_data *)hdr;
+		the_lnet.ln_nis_from_mod_params = data->ioc_flags;
 		return lnet_configure(NULL);
+	}
 
 	case IOC_LIBCFS_UNCONFIGURE:
 		return lnet_unconfigure();
 
 	case IOC_LIBCFS_ADD_NET:
-		return LNetCtl(cmd, hdr);
+		return lnet_dyn_configure(hdr);
+
+	case IOC_LIBCFS_DEL_NET:
+		return lnet_dyn_unconfigure(hdr);
 
 	default:
 		/* Passing LNET_PID_ANY only gives me a ref if the net is up
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index 1402e27..3b71812 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -394,16 +394,18 @@ lnet_debug_peer(lnet_nid_t nid)
 	lnet_net_unlock(cpt);
 }
 
-int lnet_get_peers(int count, __u64 *nid, char *aliveness,
-		   int *ncpt, int *refcount,
-		   int *ni_peer_tx_credits, int *peer_tx_credits,
-		   int *peer_rtr_credits, int *peer_min_rtr_credits,
-		   int *peer_tx_qnob)
+int
+lnet_get_peer_info(__u32 peer_index, __u64 *nid,
+		   char aliveness[LNET_MAX_STR_LEN],
+		   __u32 *cpt_iter, __u32 *refcount,
+		   __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits,
+		   __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credits,
+		   __u32 *peer_tx_qnob)
 {
 	struct lnet_peer_table *peer_table;
 	lnet_peer_t *lp;
-	int j;
-	int lncpt, found = 0;
+	bool found = false;
+	int lncpt, j;
 
 	/* get the number of CPTs */
 	lncpt = cfs_percpt_number(the_lnet.ln_peer_tables);
@@ -412,22 +414,22 @@ int lnet_get_peers(int count, __u64 *nid, char *aliveness,
 	 * if the cpt number to be examined is >= the number of cpts in
 	 * the system then indicate that there are no more cpts to examin
 	 */
-	if (*ncpt > lncpt)
-		return -1;
+	if (*cpt_iter > lncpt)
+		return -ENOENT;
 
 	/* get the current table */
-	peer_table = the_lnet.ln_peer_tables[*ncpt];
+	peer_table = the_lnet.ln_peer_tables[*cpt_iter];
 	/* if the ptable is NULL then there are no more cpts to examine */
 	if (!peer_table)
-		return -1;
+		return -ENOENT;
 
-	lnet_net_lock(*ncpt);
+	lnet_net_lock(*cpt_iter);
 
 	for (j = 0; j < LNET_PEER_HASH_SIZE && !found; j++) {
 		struct list_head *peers = &peer_table->pt_hash[j];
 
 		list_for_each_entry(lp, peers, lp_hashlist) {
-			if (count-- > 0)
+			if (peer_index-- > 0)
 				continue;
 
 			snprintf(aliveness, LNET_MAX_STR_LEN, "NA");
@@ -444,12 +446,12 @@ int lnet_get_peers(int count, __u64 *nid, char *aliveness,
 			*peer_min_rtr_credits = lp->lp_mintxcredits;
 			*peer_tx_qnob = lp->lp_txqnob;
 
-			found = 1;
+			found = true;
 		}
 	}
-	lnet_net_unlock(*ncpt);
+	lnet_net_unlock(*cpt_iter);
 
-	*ncpt = lncpt;
+	*cpt_iter = lncpt;
 
 	return found ? 0 : -ENOENT;
 }
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 749085f..17e6795 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -541,6 +541,42 @@ lnet_destroy_routes(void)
 	lnet_del_route(LNET_NIDNET(LNET_NID_ANY), LNET_NID_ANY);
 }
 
+int lnet_get_rtr_pool_cfg(int idx, struct lnet_ioctl_pool_cfg *pool_cfg)
+{
+	int i, rc = -ENOENT, lidx, j;
+
+	if (!the_lnet.ln_rtrpools)
+		return rc;
+
+	for (i = 0; i < LNET_NRBPOOLS; i++) {
+		lnet_rtrbufpool_t *rbp;
+
+		lnet_net_lock(LNET_LOCK_EX);
+		lidx = idx;
+		cfs_percpt_for_each(rbp, j, the_lnet.ln_rtrpools) {
+			if (lidx-- == 0) {
+				rc = 0;
+				pool_cfg->pl_pools[i].pl_npages =
+					rbp[i].rbp_npages;
+				pool_cfg->pl_pools[i].pl_nbuffers =
+					rbp[i].rbp_nbuffers;
+				pool_cfg->pl_pools[i].pl_credits =
+					rbp[i].rbp_credits;
+				pool_cfg->pl_pools[i].pl_mincredits =
+					rbp[i].rbp_mincredits;
+				break;
+			}
+		}
+		lnet_net_unlock(LNET_LOCK_EX);
+	}
+
+	lnet_net_lock(LNET_LOCK_EX);
+	pool_cfg->pl_routing = the_lnet.ln_routing;
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	return rc;
+}
+
 int
 lnet_get_route(int idx, __u32 *net, __u32 *hops,
 	       lnet_nid_t *gateway, __u32 *alive, __u32 *priority)
@@ -1531,8 +1567,8 @@ lnet_rtrpools_alloc(int im_a_router)
 	return rc;
 }
 
-int
-lnet_rtrpools_adjust(int tiny, int small, int large)
+static int
+lnet_rtrpools_adjust_helper(int tiny, int small, int large)
 {
 	int nrb = 0;
 	int rc = 0;
@@ -1540,19 +1576,10 @@ lnet_rtrpools_adjust(int tiny, int small, int large)
 	lnet_rtrbufpool_t *rtrp;
 
 	/*
-	 * this function doesn't revert the changes if adding new buffers
-	 * failed.  It's up to the user space caller to revert the
-	 * changes.
-	 */
-
-	if (!the_lnet.ln_routing)
-		return 0;
-
-	/*
 	 * If the provided values for each buffer pool are different than the
 	 * configured values, we need to take action.
 	 */
-	if (tiny >= 0 && tiny != tiny_router_buffers) {
+	if (tiny >= 0) {
 		tiny_router_buffers = tiny;
 		nrb = lnet_nrb_tiny_calculate();
 		cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
@@ -1562,7 +1589,7 @@ lnet_rtrpools_adjust(int tiny, int small, int large)
 				return rc;
 		}
 	}
-	if (small >= 0 && small != small_router_buffers) {
+	if (small >= 0) {
 		small_router_buffers = small;
 		nrb = lnet_nrb_small_calculate();
 		cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
@@ -1572,7 +1599,7 @@ lnet_rtrpools_adjust(int tiny, int small, int large)
 				return rc;
 		}
 	}
-	if (large >= 0 && large != large_router_buffers) {
+	if (large >= 0) {
 		large_router_buffers = large;
 		nrb = lnet_nrb_large_calculate();
 		cfs_percpt_for_each(rtrp, i, the_lnet.ln_rtrpools) {
@@ -1587,6 +1614,20 @@ lnet_rtrpools_adjust(int tiny, int small, int large)
 }
 
 int
+lnet_rtrpools_adjust(int tiny, int small, int large)
+{
+	/*
+	 * this function doesn't revert the changes if adding new buffers
+	 * failed.  It's up to the user space caller to revert the
+	 * changes.
+	 */
+	if (!the_lnet.ln_routing)
+		return 0;
+
+	return lnet_rtrpools_adjust_helper(tiny, small, large);
+}
+
+int
 lnet_rtrpools_enable(void)
 {
 	int rc;
@@ -1604,7 +1645,7 @@ lnet_rtrpools_enable(void)
 		 */
 		return lnet_rtrpools_alloc(1);
 
-	rc = lnet_rtrpools_adjust(0, 0, 0);
+	rc = lnet_rtrpools_adjust_helper(0, 0, 0);
 	if (rc != 0)
 		return rc;
 
-- 
1.7.1



* [PATCH 16/40] staging: lustre: make local functions static for LNet ni
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (14 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 15/40] staging: lustre: DLC user/kernel space glue code James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 17/40] staging: lustre: add sparse annotation __user wherever needed for lnet James Simmons
                   ` (24 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, frank zago

From: frank zago <fzago@cray.com>

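Mark lnet_ping() and lnet_unprepare() static, since both are used only
within api-ni.c, and drop the lnet_ping() prototype from lib-lnet.h.
This reduces the code size by about 400 bytes.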
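
As an illustration of the pattern (a stand-alone sketch, not code from
this patch; every demo_* name is hypothetical): a function that is only
called from its own file can be declared static, with a static forward
declaration where a caller precedes the definition. The compiler is then
free to inline it or to drop an out-of-line copy, which is one likely
source of savings of this kind.

```c
#include <assert.h>

/*
 * Forward declaration with internal linkage. The definition comes later
 * in this file and no other translation unit can reference it.
 */
static int demo_unprepare(void);

/* Hypothetical file-local state. */
static int demo_refcount;

/*
 * Hypothetical "prepare" step that undoes its own work on failure,
 * which is why it needs the forward declaration above.
 */
static int demo_prepare(int fail)
{
	demo_refcount++;
	if (fail)
		return demo_unprepare();
	return 0;
}

static int demo_unprepare(void)
{
	assert(demo_refcount > 0);
	demo_refcount--;
	return -1;
}

/* The only symbol this file exports. */
int demo_run(int fail)
{
	return demo_prepare(fail);
}
```

Only demo_run() stays visible to the linker here; a header for this
file would therefore carry no prototype for the two static functions.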

Signed-off-by: frank zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5396
Reviewed-on: http://review.whamcloud.com/11306
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    3 ---
 drivers/staging/lustre/lnet/lnet/api-ni.c          |   10 +++++++---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 63919dd..874af17 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -690,9 +690,6 @@ void lnet_router_checker_stop(void);
 void lnet_router_ni_update_locked(lnet_peer_t *gw, __u32 net);
 void lnet_swap_pinginfo(lnet_ping_info_t *info);
 
-int lnet_ping(lnet_process_id_t id, int timeout_ms,
-	      lnet_process_id_t *ids, int n_ids);
-
 int lnet_parse_ip2nets(char **networksp, char *ip2nets);
 int lnet_parse_routes(char *route_str, int *im_a_router);
 int lnet_parse_networks(struct list_head *nilist, char *networks);
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 125d018..3b2bfd5 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -62,6 +62,10 @@ static int rnet_htable_size = LNET_REMOTE_NETS_HASH_DEFAULT;
 module_param(rnet_htable_size, int, 0444);
 MODULE_PARM_DESC(rnet_htable_size, "size of remote network hash table");
 
+static void lnet_ping_target_fini(void);
+static int lnet_ping(lnet_process_id_t id, int timeout_ms,
+		     lnet_process_id_t *ids, int n_ids);
+
 static char *
 lnet_get_routes(void)
 {
@@ -520,7 +524,7 @@ lnet_res_lh_initialize(struct lnet_res_container *rec, lnet_libhandle_t *lh)
 	list_add(&lh->lh_hash_chain, &rec->rec_lh_hash[hash]);
 }
 
-int lnet_unprepare(void);
+static int lnet_unprepare(void);
 
 static int
 lnet_prepare(lnet_pid_t requested_pid)
@@ -606,7 +610,7 @@ lnet_prepare(lnet_pid_t requested_pid)
 	return rc;
 }
 
-int
+static int
 lnet_unprepare(void)
 {
 	/* NB no LNET_LOCK since this is the last reference.  All LND instances
@@ -2080,7 +2084,7 @@ LNetSnprintHandle(char *str, int len, lnet_handle_any_t h)
 }
 EXPORT_SYMBOL(LNetSnprintHandle);
 
-int
+static int
 lnet_ping(lnet_process_id_t id, int timeout_ms, lnet_process_id_t *ids, int n_ids)
 {
 	lnet_handle_eq_t eqh;
-- 
1.7.1



* [PATCH 17/40] staging: lustre: add sparse annotation __user wherever needed for lnet
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (15 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 16/40] staging: lustre: make local functions static for LNet ni James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 18/40] staging: lustre: remove LUSTRE_{,SRV_}LNET_PID James Simmons
                   ` (23 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, frank zago

From: frank zago <fzago@cray.com>

This fixes sparse warnings such as:

   .../api-ni.c:1639:33: warning: incorrect type in argument 3
                                 (different address spaces)
   .../api-ni.c:1639:33: expected struct lnet_process_id_t
                                 [noderef] [usertype] <asn:1>*ids
   .../api-ni.c:1639:33: got struct lnet_process_id_t
                                 [usertype] *<noident>

There is no functional change; only __user annotations are added.
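
To illustrate what the annotation checks (a user-space sketch under
stated assumptions, not code from this patch): the __user and __force
definitions below are modelled on the ones the kernel uses when sparse
defines __CHECKER__, and they expand to nothing in a normal build.
struct demo_id, demo_copy_to_user() and demo_fill_ids() are hypothetical
stand-ins for a process id type, copy_to_user() and a caller such as
lnet_ping().

```c
#include <string.h>

/*
 * Under sparse, __user puts the pointer in a separate address space and
 * forbids dereferencing it. In a normal build both macros are empty.
 */
#ifdef __CHECKER__
# define __user  __attribute__((noderef, address_space(1)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

/* Hypothetical record copied out to the caller's buffer. */
struct demo_id {
	unsigned long nid;
	unsigned int pid;
};

/*
 * Stand-in for copy_to_user(): returns the number of bytes that could
 * NOT be copied. This user-space version always copies everything.
 */
static unsigned long demo_copy_to_user(void __user *to, const void *from,
				       unsigned long n)
{
	memcpy((void __force *)to, from, n);
	return 0;
}

/* Fill n_ids entries of a user buffer without dereferencing it directly. */
int demo_fill_ids(struct demo_id __user *ids, int n_ids)
{
	int i;

	for (i = 0; i < n_ids; i++) {
		struct demo_id tmp = { .nid = 100 + i, .pid = 12345 };

		if (demo_copy_to_user(&ids[i], &tmp, sizeof(tmp)))
			return -1;
	}
	return n_ids;
}
```

With the annotation in place, passing an unannotated pointer where a
__user pointer is expected (or the reverse) produces an address-space
warning of the kind quoted above, while the generated code is the same.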

Signed-off-by: frank zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5396
Reviewed-on: http://review.whamcloud.com/11819
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    9 +-
 drivers/staging/lustre/include/linux/lnet/lnetst.h |   96 ++++++++++----------
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    5 +-
 drivers/staging/lustre/lnet/selftest/conrpc.c      |    4 +-
 drivers/staging/lustre/lnet/selftest/conrpc.h      |    5 +-
 drivers/staging/lustre/lnet/selftest/console.c     |   91 ++++++++++---------
 drivers/staging/lustre/lnet/selftest/console.h     |   54 ++++++-----
 7 files changed, 140 insertions(+), 124 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 874af17..a1f94db 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -602,7 +602,7 @@ void lnet_copy_kiov2kiov(unsigned int ndkiov, lnet_kiov_t *dkiov,
 			  unsigned int soffset, unsigned int nob);
 
 static inline void
-lnet_copy_iov2flat(int dlen, void *dest, unsigned int doffset,
+lnet_copy_iov2flat(int dlen, void __user *dest, unsigned int doffset,
 		   unsigned int nsiov, struct kvec *siov, unsigned int soffset,
 		   unsigned int nob)
 {
@@ -613,7 +613,7 @@ lnet_copy_iov2flat(int dlen, void *dest, unsigned int doffset,
 }
 
 static inline void
-lnet_copy_kiov2flat(int dlen, void *dest, unsigned int doffset,
+lnet_copy_kiov2flat(int dlen, void __user *dest, unsigned int doffset,
 		    unsigned int nsiov, lnet_kiov_t *skiov,
 		    unsigned int soffset, unsigned int nob)
 {
@@ -625,7 +625,8 @@ lnet_copy_kiov2flat(int dlen, void *dest, unsigned int doffset,
 
 static inline void
 lnet_copy_flat2iov(unsigned int ndiov, struct kvec *diov, unsigned int doffset,
-		   int slen, void *src, unsigned int soffset, unsigned int nob)
+		   int slen, void __user *src, unsigned int soffset,
+		   unsigned int nob)
 {
 	struct kvec siov = {/*.iov_base = */ src, /*.iov_len = */slen};
 
@@ -635,7 +636,7 @@ lnet_copy_flat2iov(unsigned int ndiov, struct kvec *diov, unsigned int doffset,
 
 static inline void
 lnet_copy_flat2kiov(unsigned int ndiov, lnet_kiov_t *dkiov,
-		    unsigned int doffset, int slen, void *src,
+		    unsigned int doffset, int slen, void __user *src,
 		    unsigned int soffset, unsigned int nob)
 {
 	struct kvec siov = {/* .iov_base = */ src, /* .iov_len = */ slen};
diff --git a/drivers/staging/lustre/include/linux/lnet/lnetst.h b/drivers/staging/lustre/include/linux/lnet/lnetst.h
index fd1e0fd..fc183f8 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetst.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetst.h
@@ -245,20 +245,20 @@ typedef struct {
 	int		 lstio_ses_force;	/* IN: force create ? */
 	/** IN: session features */
 	unsigned	 lstio_ses_feats;
-	lst_sid_t	*lstio_ses_idp;		/* OUT: session id */
+	lst_sid_t __user	*lstio_ses_idp;		/* OUT: session id */
 	int		 lstio_ses_nmlen;	/* IN: name length */
-	char		*lstio_ses_namep;	/* IN: session name */
+	char __user		*lstio_ses_namep;	/* IN: session name */
 } lstio_session_new_args_t;
 
 /* query current session */
 typedef struct {
-	lst_sid_t		*lstio_ses_idp;		/* OUT: session id */
-	int			*lstio_ses_keyp;	/* OUT: local key */
+	lst_sid_t __user		*lstio_ses_idp;		/* OUT: session id */
+	int __user			*lstio_ses_keyp;	/* OUT: local key */
 	/** OUT: session features */
-	unsigned		*lstio_ses_featp;
-	lstcon_ndlist_ent_t	*lstio_ses_ndinfo;	/* OUT: */
+	unsigned __user			*lstio_ses_featp;
+	lstcon_ndlist_ent_t __user	*lstio_ses_ndinfo;	/* OUT: */
 	int			 lstio_ses_nmlen;	/* IN: name length */
-	char			*lstio_ses_namep;	/* OUT: session name */
+	char __user			*lstio_ses_namep;	/* OUT: session name */
 } lstio_session_info_args_t;
 
 /* delete a session */
@@ -283,26 +283,26 @@ typedef struct {
 	int			 lstio_dbg_timeout;	/* IN: timeout of
 							       debug */
 	int			 lstio_dbg_nmlen;	/* IN: len of name */
-	char			*lstio_dbg_namep;	/* IN: name of
+	char __user			*lstio_dbg_namep;	/* IN: name of
 							       group|batch */
 	int			 lstio_dbg_count;	/* IN: # of test nodes
 							       to debug */
-	lnet_process_id_t	*lstio_dbg_idsp;	/* IN: id of test
+	lnet_process_id_t __user	*lstio_dbg_idsp;	/* IN: id of test
 							       nodes */
-	struct list_head	*lstio_dbg_resultp;	/* OUT: list head of
+	struct list_head __user		*lstio_dbg_resultp;	/* OUT: list head of
 								result buffer */
 } lstio_debug_args_t;
 
 typedef struct {
 	int	 lstio_grp_key;		/* IN: session key */
 	int	 lstio_grp_nmlen;	/* IN: name length */
-	char	*lstio_grp_namep;	/* IN: group name */
+	char __user			*lstio_grp_namep;	/* IN: group name */
 } lstio_group_add_args_t;
 
 typedef struct {
 	int	 lstio_grp_key;		/* IN: session key */
 	int	 lstio_grp_nmlen;	/* IN: name length */
-	char	*lstio_grp_namep;	/* IN: group name */
+	char __user			*lstio_grp_namep;	/* IN: group name */
 } lstio_group_del_args_t;
 
 #define LST_GROUP_CLEAN		1	/* remove inactive nodes in the group */
@@ -315,22 +315,22 @@ typedef struct {
 	int			 lstio_grp_opc;		/* IN: OPC */
 	int			 lstio_grp_args;	/* IN: arguments */
 	int			 lstio_grp_nmlen;	/* IN: name length */
-	char			*lstio_grp_namep;	/* IN: group name */
+	char __user			*lstio_grp_namep;	/* IN: group name */
 	int			 lstio_grp_count;	/* IN: # of nodes id */
-	lnet_process_id_t	*lstio_grp_idsp;	/* IN: array of nodes */
-	struct list_head	*lstio_grp_resultp;	/* OUT: list head of
+	lnet_process_id_t __user	*lstio_grp_idsp;	/* IN: array of nodes */
+	struct list_head __user		*lstio_grp_resultp;	/* OUT: list head of
 								result buffer */
 } lstio_group_update_args_t;
 
 typedef struct {
 	int			 lstio_grp_key;		/* IN: session key */
 	int			 lstio_grp_nmlen;	/* IN: name length */
-	char			*lstio_grp_namep;	/* IN: group name */
+	char __user			*lstio_grp_namep;	/* IN: group name */
 	int			 lstio_grp_count;	/* IN: # of nodes */
 	/** OUT: session features */
-	unsigned		*lstio_grp_featp;
-	lnet_process_id_t	*lstio_grp_idsp;	/* IN: nodes */
-	struct list_head	*lstio_grp_resultp;	/* OUT: list head of
+	unsigned __user			*lstio_grp_featp;
+	lnet_process_id_t __user	*lstio_grp_idsp;	/* IN: nodes */
+	struct list_head __user		*lstio_grp_resultp;	/* OUT: list head of
 								result buffer */
 } lstio_group_nodes_args_t;
 
@@ -338,18 +338,18 @@ typedef struct {
 	int	 lstio_grp_key;		/* IN: session key */
 	int	 lstio_grp_idx;		/* IN: group idx */
 	int	 lstio_grp_nmlen;	/* IN: name len */
-	char	*lstio_grp_namep;	/* OUT: name */
+	char __user			*lstio_grp_namep;	/* OUT: name */
 } lstio_group_list_args_t;
 
 typedef struct {
 	int			 lstio_grp_key;		/* IN: session key */
 	int			 lstio_grp_nmlen;	/* IN: name len */
-	char			*lstio_grp_namep;	/* IN: name */
-	lstcon_ndlist_ent_t	*lstio_grp_entp;	/* OUT: description of
+	char __user			*lstio_grp_namep;	/* IN: name */
+	lstcon_ndlist_ent_t __user	*lstio_grp_entp;	/* OUT: description of
 								group */
-	int			*lstio_grp_idxp;	/* IN/OUT: node index */
-	int			*lstio_grp_ndentp;	/* IN/OUT: # of nodent */
-	lstcon_node_ent_t	*lstio_grp_dentsp;	/* OUT: nodent array */
+	int __user			*lstio_grp_idxp;	/* IN/OUT: node index */
+	int __user			*lstio_grp_ndentp;	/* IN/OUT: # of nodent */
+	lstcon_node_ent_t __user	*lstio_grp_dentsp;	/* OUT: nodent array */
 } lstio_group_info_args_t;
 
 #define LST_DEFAULT_BATCH	"batch"			/* default batch name */
@@ -357,13 +357,13 @@ typedef struct {
 typedef struct {
 	int	 lstio_bat_key;		/* IN: session key */
 	int	 lstio_bat_nmlen;	/* IN: name length */
-	char	*lstio_bat_namep;	/* IN: batch name */
+	char __user			*lstio_bat_namep;	/* IN: batch name */
 } lstio_batch_add_args_t;
 
 typedef struct {
 	int	 lstio_bat_key;		/* IN: session key */
 	int	 lstio_bat_nmlen;	/* IN: name length */
-	char	*lstio_bat_namep;	/* IN: batch name */
+	char __user		*lstio_bat_namep;	/* IN: batch name */
 } lstio_batch_del_args_t;
 
 typedef struct {
@@ -371,8 +371,8 @@ typedef struct {
 	int			 lstio_bat_timeout;	/* IN: timeout for
 							       the batch */
 	int			 lstio_bat_nmlen;	/* IN: name length */
-	char			*lstio_bat_namep;	/* IN: batch name */
-	struct list_head	*lstio_bat_resultp;	/* OUT: list head of
+	char __user		*lstio_bat_namep;	/* IN: batch name */
+	struct list_head __user	*lstio_bat_resultp;	/* OUT: list head of
 								result buffer */
 } lstio_batch_run_args_t;
 
@@ -381,8 +381,8 @@ typedef struct {
 	int			 lstio_bat_force;	/* IN: abort unfinished
 							       test RPC */
 	int			 lstio_bat_nmlen;	/* IN: name length */
-	char			*lstio_bat_namep;	/* IN: batch name */
-	struct list_head	*lstio_bat_resultp;	/* OUT: list head of
+	char __user		*lstio_bat_namep;	/* IN: batch name */
+	struct list_head __user	*lstio_bat_resultp;	/* OUT: list head of
 								result buffer */
 } lstio_batch_stop_args_t;
 
@@ -394,8 +394,8 @@ typedef struct {
 	int			 lstio_bat_timeout;	/* IN: timeout for
 							       waiting */
 	int			 lstio_bat_nmlen;	/* IN: name length */
-	char			*lstio_bat_namep;	/* IN: batch name */
-	struct list_head	*lstio_bat_resultp;	/* OUT: list head of
+	char __user		*lstio_bat_namep;	/* IN: batch name */
+	struct list_head __user	*lstio_bat_resultp;	/* OUT: list head of
 								result buffer */
 } lstio_batch_query_args_t;
 
@@ -403,21 +403,21 @@ typedef struct {
 	int	 lstio_bat_key;		/* IN: session key */
 	int	 lstio_bat_idx;		/* IN: index */
 	int	 lstio_bat_nmlen;	/* IN: name length */
-	char	*lstio_bat_namep;	/* IN: batch name */
+	char __user		*lstio_bat_namep;	/* IN: batch name */
 } lstio_batch_list_args_t;
 
 typedef struct {
 	int			 lstio_bat_key;		/* IN: session key */
 	int			 lstio_bat_nmlen;	/* IN: name length */
-	char			*lstio_bat_namep;	/* IN: name */
+	char __user			*lstio_bat_namep;	/* IN: name */
 	int			 lstio_bat_server;	/* IN: query server
 							       or not */
 	int			 lstio_bat_testidx;	/* IN: test index */
-	lstcon_test_batch_ent_t	*lstio_bat_entp;	/* OUT: batch ent */
+	lstcon_test_batch_ent_t	__user	*lstio_bat_entp;	/* OUT: batch ent */
 
-	int			*lstio_bat_idxp;	/* IN/OUT: index of node */
-	int			*lstio_bat_ndentp;	/* IN/OUT: # of nodent */
-	lstcon_node_ent_t	*lstio_bat_dentsp;	/* array of nodent */
+	int __user			*lstio_bat_idxp;	/* IN/OUT: index of node */
+	int __user			*lstio_bat_ndentp;	/* IN/OUT: # of nodent */
+	lstcon_node_ent_t __user	*lstio_bat_dentsp;	/* array of nodent */
 } lstio_batch_info_args_t;
 
 /* add stat in session */
@@ -427,10 +427,10 @@ typedef struct {
 							       stat request */
 	int			 lstio_sta_nmlen;	/* IN: group name
 							       length */
-	char			*lstio_sta_namep;	/* IN: group name */
+	char __user			*lstio_sta_namep;	/* IN: group name */
 	int			 lstio_sta_count;	/* IN: # of pid */
-	lnet_process_id_t	*lstio_sta_idsp;	/* IN: pid */
-	struct list_head	*lstio_sta_resultp;	/* OUT: list head of
+	lnet_process_id_t __user	*lstio_sta_idsp;	/* IN: pid */
+	struct list_head __user		*lstio_sta_resultp;	/* OUT: list head of
 								result buffer */
 } lstio_stat_args_t;
 
@@ -445,7 +445,7 @@ typedef enum {
 typedef struct {
 	int		  lstio_tes_key;	/* IN: session key */
 	int		  lstio_tes_bat_nmlen;	/* IN: batch name len */
-	char		 *lstio_tes_bat_name;	/* IN: batch name */
+	char __user		 *lstio_tes_bat_name;	/* IN: batch name */
 	int		  lstio_tes_type;	/* IN: test type */
 	int		  lstio_tes_oneside;	/* IN: one sided test */
 	int		  lstio_tes_loop;	/* IN: loop count */
@@ -457,20 +457,20 @@ typedef struct {
 						       destination groups */
 	int		  lstio_tes_sgrp_nmlen;	/* IN: source group
 						       name length */
-	char		 *lstio_tes_sgrp_name;	/* IN: group name */
+	char __user		 *lstio_tes_sgrp_name;	/* IN: group name */
 	int		  lstio_tes_dgrp_nmlen;	/* IN: destination group
 						       name length */
-	char		 *lstio_tes_dgrp_name;	/* IN: group name */
+	char __user		 *lstio_tes_dgrp_name;	/* IN: group name */
 
 	int		  lstio_tes_param_len;	/* IN: param buffer len */
-	void		 *lstio_tes_param;	/* IN: parameter for specified
+	void __user		 *lstio_tes_param;	/* IN: parameter for specified
 						       test:
 						       lstio_bulk_param_t,
 						       lstio_ping_param_t,
 						       ... more */
-	int		 *lstio_tes_retp;	/* OUT: private returned
+	int __user		*lstio_tes_retp;	/* OUT: private returned
 							value */
-	struct list_head *lstio_tes_resultp;	/* OUT: list head of
+	struct list_head __user	*lstio_tes_resultp;	/* OUT: list head of
 							result buffer */
 } lstio_test_args_t;
 
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 3b2bfd5..acc216e 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1982,7 +1982,7 @@ LNetCtl(unsigned int cmd, void *arg)
 		id.nid = data->ioc_nid;
 		id.pid = data->ioc_u32[0];
 		rc = lnet_ping(id, data->ioc_u32[1], /* timeout */
-			       (lnet_process_id_t *)data->ioc_pbuf1,
+			       (lnet_process_id_t __user *)data->ioc_pbuf1,
 			       data->ioc_plen1/sizeof(lnet_process_id_t));
 		if (rc < 0)
 			return rc;
@@ -2085,7 +2085,8 @@ LNetSnprintHandle(char *str, int len, lnet_handle_any_t h)
 EXPORT_SYMBOL(LNetSnprintHandle);
 
 static int
-lnet_ping(lnet_process_id_t id, int timeout_ms, lnet_process_id_t *ids, int n_ids)
+lnet_ping(lnet_process_id_t id, int timeout_ms, lnet_process_id_t __user *ids,
+	  int n_ids)
 {
 	lnet_handle_eq_t eqh;
 	lnet_handle_md_t mdh;
diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c b/drivers/staging/lustre/lnet/selftest/conrpc.c
index 1066c70..15a61de 100644
--- a/drivers/staging/lustre/lnet/selftest/conrpc.c
+++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
@@ -466,11 +466,11 @@ lstcon_rpc_trans_stat(lstcon_rpc_trans_t *trans, lstcon_trans_stat_t *stat)
 
 int
 lstcon_rpc_trans_interpreter(lstcon_rpc_trans_t *trans,
-			     struct list_head *head_up,
+			     struct list_head __user *head_up,
 			     lstcon_rpc_readent_func_t readent)
 {
 	struct list_head tmp;
-	struct list_head *next;
+	struct list_head __user *next;
 	lstcon_rpc_ent_t *ent;
 	srpc_generic_reply_t *rep;
 	lstcon_rpc_t *crpc;
diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.h b/drivers/staging/lustre/lnet/selftest/conrpc.h
index 95c832f..d2133bc 100644
--- a/drivers/staging/lustre/lnet/selftest/conrpc.h
+++ b/drivers/staging/lustre/lnet/selftest/conrpc.h
@@ -106,7 +106,8 @@ typedef struct lstcon_rpc_trans {
 #define LST_TRANS_STATQRY       0x21
 
 typedef int (*lstcon_rpc_cond_func_t)(int, struct lstcon_node *, void *);
-typedef int (*lstcon_rpc_readent_func_t)(int, srpc_msg_t *, lstcon_rpc_ent_t *);
+typedef int (*lstcon_rpc_readent_func_t)(int, srpc_msg_t *,
+					 lstcon_rpc_ent_t __user *);
 
 int  lstcon_sesrpc_prep(struct lstcon_node *nd, int transop,
 			unsigned version, lstcon_rpc_t **crpc);
@@ -128,7 +129,7 @@ int  lstcon_rpc_trans_ndlist(struct list_head *ndlist,
 void lstcon_rpc_trans_stat(lstcon_rpc_trans_t *trans,
 			   lstcon_trans_stat_t *stat);
 int  lstcon_rpc_trans_interpreter(lstcon_rpc_trans_t *trans,
-				  struct list_head *head_up,
+				  struct list_head __user *head_up,
 				  lstcon_rpc_readent_func_t readent);
 void lstcon_rpc_trans_abort(lstcon_rpc_trans_t *trans, int error);
 void lstcon_rpc_trans_destroy(lstcon_rpc_trans_t *trans);
diff --git a/drivers/staging/lustre/lnet/selftest/console.c b/drivers/staging/lustre/lnet/selftest/console.c
index 898912b..f8d6dfe 100644
--- a/drivers/staging/lustre/lnet/selftest/console.c
+++ b/drivers/staging/lustre/lnet/selftest/console.c
@@ -363,7 +363,7 @@ lstcon_sesrpc_condition(int transop, lstcon_node_t *nd, void *arg)
 
 static int
 lstcon_sesrpc_readent(int transop, srpc_msg_t *msg,
-		      lstcon_rpc_ent_t *ent_up)
+		      lstcon_rpc_ent_t __user *ent_up)
 {
 	srpc_debug_reply_t *rep;
 
@@ -392,8 +392,8 @@ lstcon_sesrpc_readent(int transop, srpc_msg_t *msg,
 
 static int
 lstcon_group_nodes_add(lstcon_group_t *grp,
-		       int count, lnet_process_id_t *ids_up,
-		       unsigned *featp, struct list_head *result_up)
+		       int count, lnet_process_id_t __user *ids_up,
+		       unsigned *featp, struct list_head __user *result_up)
 {
 	lstcon_rpc_trans_t *trans;
 	lstcon_ndlink_t *ndl;
@@ -459,8 +459,8 @@ lstcon_group_nodes_add(lstcon_group_t *grp,
 
 static int
 lstcon_group_nodes_remove(lstcon_group_t *grp,
-			  int count, lnet_process_id_t *ids_up,
-			  struct list_head *result_up)
+			  int count, lnet_process_id_t __user *ids_up,
+			  struct list_head __user *result_up)
 {
 	lstcon_rpc_trans_t *trans;
 	lstcon_ndlink_t *ndl;
@@ -537,8 +537,8 @@ lstcon_group_add(char *name)
 }
 
 int
-lstcon_nodes_add(char *name, int count, lnet_process_id_t *ids_up,
-		 unsigned *featp, struct list_head *result_up)
+lstcon_nodes_add(char *name, int count, lnet_process_id_t __user *ids_up,
+		 unsigned *featp, struct list_head __user *result_up)
 {
 	lstcon_group_t *grp;
 	int rc;
@@ -642,7 +642,8 @@ lstcon_group_clean(char *name, int args)
 
 int
 lstcon_nodes_remove(char *name, int count,
-		    lnet_process_id_t *ids_up, struct list_head *result_up)
+		    lnet_process_id_t __user *ids_up,
+		    struct list_head __user *result_up)
 {
 	lstcon_group_t *grp = NULL;
 	int rc;
@@ -671,7 +672,7 @@ lstcon_nodes_remove(char *name, int count,
 }
 
 int
-lstcon_group_refresh(char *name, struct list_head *result_up)
+lstcon_group_refresh(char *name, struct list_head __user *result_up)
 {
 	lstcon_rpc_trans_t *trans;
 	lstcon_group_t *grp;
@@ -732,7 +733,7 @@ lstcon_group_list(int index, int len, char *name_up)
 
 static int
 lstcon_nodes_getent(struct list_head *head, int *index_p,
-		    int *count_p, lstcon_node_ent_t *dents_up)
+		    int *count_p, lstcon_node_ent_t __user *dents_up)
 {
 	lstcon_ndlink_t *ndl;
 	lstcon_node_t *nd;
@@ -771,8 +772,9 @@ lstcon_nodes_getent(struct list_head *head, int *index_p,
 }
 
 int
-lstcon_group_info(char *name, lstcon_ndlist_ent_t *gents_p,
-		  int *index_p, int *count_p, lstcon_node_ent_t *dents_up)
+lstcon_group_info(char *name, lstcon_ndlist_ent_t __user *gents_p,
+		  int *index_p, int *count_p,
+		  lstcon_node_ent_t __user *dents_up)
 {
 	lstcon_ndlist_ent_t *gentp;
 	lstcon_group_t *grp;
@@ -910,9 +912,9 @@ lstcon_batch_list(int index, int len, char *name_up)
 }
 
 int
-lstcon_batch_info(char *name, lstcon_test_batch_ent_t *ent_up, int server,
-		  int testidx, int *index_p, int *ndent_p,
-		  lstcon_node_ent_t *dents_up)
+lstcon_batch_info(char *name, lstcon_test_batch_ent_t __user *ent_up,
+		  int server, int testidx, int *index_p, int *ndent_p,
+		  lstcon_node_ent_t __user *dents_up)
 {
 	lstcon_test_batch_ent_t *entp;
 	struct list_head *clilst;
@@ -1006,7 +1008,7 @@ lstcon_batrpc_condition(int transop, lstcon_node_t *nd, void *arg)
 
 static int
 lstcon_batch_op(lstcon_batch_t *bat, int transop,
-		struct list_head *result_up)
+		struct list_head __user *result_up)
 {
 	lstcon_rpc_trans_t *trans;
 	int rc;
@@ -1029,7 +1031,7 @@ lstcon_batch_op(lstcon_batch_t *bat, int transop,
 }
 
 int
-lstcon_batch_run(char *name, int timeout, struct list_head *result_up)
+lstcon_batch_run(char *name, int timeout, struct list_head __user *result_up)
 {
 	lstcon_batch_t *bat;
 	int rc;
@@ -1051,7 +1053,7 @@ lstcon_batch_run(char *name, int timeout, struct list_head *result_up)
 }
 
 int
-lstcon_batch_stop(char *name, int force, struct list_head *result_up)
+lstcon_batch_stop(char *name, int force, struct list_head __user *result_up)
 {
 	lstcon_batch_t *bat;
 	int rc;
@@ -1170,7 +1172,7 @@ lstcon_testrpc_condition(int transop, lstcon_node_t *nd, void *arg)
 }
 
 static int
-lstcon_test_nodes_add(lstcon_test_t *test, struct list_head *result_up)
+lstcon_test_nodes_add(lstcon_test_t *test, struct list_head __user *result_up)
 {
 	lstcon_rpc_trans_t *trans;
 	lstcon_group_t *grp;
@@ -1266,7 +1268,7 @@ lstcon_test_add(char *batch_name, int type, int loop,
 		int concur, int dist, int span,
 		char *src_name, char *dst_name,
 		void *param, int paramlen, int *retp,
-		struct list_head *result_up)
+		struct list_head __user *result_up)
 {
 	lstcon_test_t	 *test	 = NULL;
 	int		 rc;
@@ -1369,7 +1371,7 @@ lstcon_test_find(lstcon_batch_t *batch, int idx, lstcon_test_t **testpp)
 
 static int
 lstcon_tsbrpc_readent(int transop, srpc_msg_t *msg,
-		      lstcon_rpc_ent_t *ent_up)
+		      lstcon_rpc_ent_t __user *ent_up)
 {
 	srpc_batch_reply_t *rep = &msg->msg_body.bat_reply;
 
@@ -1386,7 +1388,7 @@ lstcon_tsbrpc_readent(int transop, srpc_msg_t *msg,
 
 int
 lstcon_test_batch_query(char *name, int testidx, int client,
-			int timeout, struct list_head *result_up)
+			int timeout, struct list_head __user *result_up)
 {
 	lstcon_rpc_trans_t *trans;
 	struct list_head *translist;
@@ -1448,19 +1450,21 @@ lstcon_test_batch_query(char *name, int testidx, int client,
 
 static int
 lstcon_statrpc_readent(int transop, srpc_msg_t *msg,
-		       lstcon_rpc_ent_t *ent_up)
+		       lstcon_rpc_ent_t __user *ent_up)
 {
 	srpc_stat_reply_t *rep = &msg->msg_body.stat_reply;
-	sfw_counters_t *sfwk_stat;
-	srpc_counters_t *srpc_stat;
-	lnet_counters_t *lnet_stat;
+	sfw_counters_t __user *sfwk_stat;
+	srpc_counters_t __user *srpc_stat;
+	lnet_counters_t __user *lnet_stat;
 
 	if (rep->str_status != 0)
 		return 0;
 
-	sfwk_stat = (sfw_counters_t *)&ent_up->rpe_payload[0];
-	srpc_stat = (srpc_counters_t *)((char *)sfwk_stat + sizeof(*sfwk_stat));
-	lnet_stat = (lnet_counters_t *)((char *)srpc_stat + sizeof(*srpc_stat));
+	sfwk_stat = (sfw_counters_t __user *)&ent_up->rpe_payload[0];
+	srpc_stat = (srpc_counters_t __user *)((char __user *)sfwk_stat +
+						sizeof(*sfwk_stat));
+	lnet_stat = (lnet_counters_t __user *)((char __user *)srpc_stat +
+						sizeof(*srpc_stat));
 
 	if (copy_to_user(sfwk_stat, &rep->str_fw, sizeof(*sfwk_stat)) ||
 	    copy_to_user(srpc_stat, &rep->str_rpc, sizeof(*srpc_stat)) ||
@@ -1472,7 +1476,7 @@ lstcon_statrpc_readent(int transop, srpc_msg_t *msg,
 
 static int
 lstcon_ndlist_stat(struct list_head *ndlist,
-		   int timeout, struct list_head *result_up)
+		   int timeout, struct list_head __user *result_up)
 {
 	struct list_head head;
 	lstcon_rpc_trans_t *trans;
@@ -1497,7 +1501,8 @@ lstcon_ndlist_stat(struct list_head *ndlist,
 }
 
 int
-lstcon_group_stat(char *grp_name, int timeout, struct list_head *result_up)
+lstcon_group_stat(char *grp_name, int timeout,
+		  struct list_head __user *result_up)
 {
 	lstcon_group_t *grp;
 	int rc;
@@ -1516,8 +1521,8 @@ lstcon_group_stat(char *grp_name, int timeout, struct list_head *result_up)
 }
 
 int
-lstcon_nodes_stat(int count, lnet_process_id_t *ids_up,
-		  int timeout, struct list_head *result_up)
+lstcon_nodes_stat(int count, lnet_process_id_t __user *ids_up,
+		  int timeout, struct list_head __user *result_up)
 {
 	lstcon_ndlink_t *ndl;
 	lstcon_group_t *tmp;
@@ -1562,7 +1567,7 @@ lstcon_nodes_stat(int count, lnet_process_id_t *ids_up,
 static int
 lstcon_debug_ndlist(struct list_head *ndlist,
 		    struct list_head *translist,
-		    int timeout, struct list_head *result_up)
+		    int timeout, struct list_head __user *result_up)
 {
 	lstcon_rpc_trans_t *trans;
 	int		 rc;
@@ -1584,7 +1589,7 @@ lstcon_debug_ndlist(struct list_head *ndlist,
 }
 
 int
-lstcon_session_debug(int timeout, struct list_head *result_up)
+lstcon_session_debug(int timeout, struct list_head __user *result_up)
 {
 	return lstcon_debug_ndlist(&console_session.ses_ndl_list,
 				   NULL, timeout, result_up);
@@ -1592,7 +1597,7 @@ lstcon_session_debug(int timeout, struct list_head *result_up)
 
 int
 lstcon_batch_debug(int timeout, char *name,
-		   int client, struct list_head *result_up)
+		   int client, struct list_head __user *result_up)
 {
 	lstcon_batch_t *bat;
 	int rc;
@@ -1610,7 +1615,7 @@ lstcon_batch_debug(int timeout, char *name,
 
 int
 lstcon_group_debug(int timeout, char *name,
-		   struct list_head *result_up)
+		   struct list_head __user *result_up)
 {
 	lstcon_group_t *grp;
 	int rc;
@@ -1628,8 +1633,8 @@ lstcon_group_debug(int timeout, char *name,
 
 int
 lstcon_nodes_debug(int timeout,
-		   int count, lnet_process_id_t *ids_up,
-		   struct list_head *result_up)
+		   int count, lnet_process_id_t __user *ids_up,
+		   struct list_head __user *result_up)
 {
 	lnet_process_id_t id;
 	lstcon_ndlink_t *ndl;
@@ -1693,7 +1698,7 @@ extern srpc_service_t lstcon_acceptor_service;
 
 int
 lstcon_session_new(char *name, int key, unsigned feats,
-		   int timeout, int force, lst_sid_t *sid_up)
+		   int timeout, int force, lst_sid_t __user *sid_up)
 {
 	int rc = 0;
 	int i;
@@ -1758,8 +1763,10 @@ lstcon_session_new(char *name, int key, unsigned feats,
 }
 
 int
-lstcon_session_info(lst_sid_t *sid_up, int *key_up, unsigned *featp,
-		    lstcon_ndlist_ent_t *ndinfo_up, char *name_up, int len)
+lstcon_session_info(lst_sid_t __user *sid_up, int __user *key_up,
+		    unsigned __user *featp,
+		    lstcon_ndlist_ent_t __user *ndinfo_up,
+		    char __user *name_up, int len)
 {
 	lstcon_ndlist_ent_t *entp;
 	lstcon_ndlink_t *ndl;
diff --git a/drivers/staging/lustre/lnet/selftest/console.h b/drivers/staging/lustre/lnet/selftest/console.h
index 7af3540..bab6557 100644
--- a/drivers/staging/lustre/lnet/selftest/console.h
+++ b/drivers/staging/lustre/lnet/selftest/console.h
@@ -187,47 +187,53 @@ int lstcon_console_init(void);
 int lstcon_console_fini(void);
 int lstcon_session_match(lst_sid_t sid);
 int lstcon_session_new(char *name, int key, unsigned version,
-		       int timeout, int flags, lst_sid_t *sid_up);
-int lstcon_session_info(lst_sid_t *sid_up, int *key, unsigned *verp,
-			lstcon_ndlist_ent_t *entp, char *name_up, int len);
+		       int timeout, int flags, lst_sid_t __user *sid_up);
+int lstcon_session_info(lst_sid_t __user *sid_up, int __user *key,
+			unsigned __user *verp,
+			lstcon_ndlist_ent_t __user *entp,
+			char __user *name_up, int len);
 int lstcon_session_end(void);
-int lstcon_session_debug(int timeout, struct list_head *result_up);
+int lstcon_session_debug(int timeout, struct list_head __user *result_up);
 int lstcon_session_feats_check(unsigned feats);
 int lstcon_batch_debug(int timeout, char *name,
-		       int client, struct list_head *result_up);
+		       int client, struct list_head __user *result_up);
 int lstcon_group_debug(int timeout, char *name,
-		       struct list_head *result_up);
-int lstcon_nodes_debug(int timeout, int nnd, lnet_process_id_t *nds_up,
-		       struct list_head *result_up);
+		       struct list_head __user *result_up);
+int lstcon_nodes_debug(int timeout, int nnd,
+		       lnet_process_id_t __user *nds_up,
+		       struct list_head __user *result_up);
 int lstcon_group_add(char *name);
 int lstcon_group_del(char *name);
 int lstcon_group_clean(char *name, int args);
-int lstcon_group_refresh(char *name, struct list_head *result_up);
-int lstcon_nodes_add(char *name, int nnd, lnet_process_id_t *nds_up,
-		     unsigned *featp, struct list_head *result_up);
-int lstcon_nodes_remove(char *name, int nnd, lnet_process_id_t *nds_up,
-			struct list_head *result_up);
-int lstcon_group_info(char *name, lstcon_ndlist_ent_t *gent_up,
-		      int *index_p, int *ndent_p, lstcon_node_ent_t *ndents_up);
+int lstcon_group_refresh(char *name, struct list_head __user *result_up);
+int lstcon_nodes_add(char *name, int nnd, lnet_process_id_t __user *nds_up,
+		     unsigned *featp, struct list_head __user *result_up);
+int lstcon_nodes_remove(char *name, int nnd, lnet_process_id_t __user *nds_up,
+			struct list_head __user *result_up);
+int lstcon_group_info(char *name, lstcon_ndlist_ent_t __user *gent_up,
+		      int *index_p, int *ndent_p,
+		      lstcon_node_ent_t __user *ndents_up);
 int lstcon_group_list(int idx, int len, char *name_up);
 int lstcon_batch_add(char *name);
-int lstcon_batch_run(char *name, int timeout, struct list_head *result_up);
-int lstcon_batch_stop(char *name, int force, struct list_head *result_up);
+int lstcon_batch_run(char *name, int timeout,
+		     struct list_head __user *result_up);
+int lstcon_batch_stop(char *name, int force,
+		      struct list_head __user *result_up);
 int lstcon_test_batch_query(char *name, int testidx,
 			    int client, int timeout,
-			    struct list_head *result_up);
+			    struct list_head __user *result_up);
 int lstcon_batch_del(char *name);
 int lstcon_batch_list(int idx, int namelen, char *name_up);
-int lstcon_batch_info(char *name, lstcon_test_batch_ent_t *ent_up,
+int lstcon_batch_info(char *name, lstcon_test_batch_ent_t __user *ent_up,
 		      int server, int testidx, int *index_p,
-		      int *ndent_p, lstcon_node_ent_t *dents_up);
+		      int *ndent_p, lstcon_node_ent_t __user *dents_up);
 int lstcon_group_stat(char *grp_name, int timeout,
-		      struct list_head *result_up);
-int lstcon_nodes_stat(int count, lnet_process_id_t *ids_up,
-		      int timeout, struct list_head *result_up);
+		      struct list_head __user *result_up);
+int lstcon_nodes_stat(int count, lnet_process_id_t __user *ids_up,
+		      int timeout, struct list_head __user *result_up);
 int lstcon_test_add(char *batch_name, int type, int loop,
 		    int concur, int dist, int span,
 		    char *src_name, char *dst_name,
 		    void *param, int paramlen, int *retp,
-		    struct list_head *result_up);
+		    struct list_head __user *result_up);
 #endif
-- 
1.7.1



* [PATCH 18/40] staging: lustre: remove LUSTRE_{,SRV_}LNET_PID
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (16 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 17/40] staging: lustre: add sparse annotation __user wherever needed for lnet James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 19/40] staging: lustre: copy out libcfs ioctl inline buffer James Simmons
                   ` (22 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, John L. Hammond

From: John L. Hammond <john.hammond@intel.com>

Remove LUSTRE_LNET_PID (12345) and LUSTRE_SRV_LNET_PID (12345) from
the libcfs headers and replace their uses with a new macro
LNET_PID_LUSTRE (also 12345) in lnet/types.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675
Reviewed-on: http://review.whamcloud.com/11985
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
---
 .../staging/lustre/include/linux/libcfs/libcfs.h   |    2 --
 .../lustre/include/linux/libcfs/linux/libcfs.h     |    3 ---
 .../staging/lustre/lnet/klnds/socklnd/socklnd.c    |    7 +++++--
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    2 +-
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    2 +-
 drivers/staging/lustre/lnet/lnet/module.c          |    4 ++--
 drivers/staging/lustre/lnet/lnet/router.c          |    2 +-
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 +-
 drivers/staging/lustre/lustre/ptlrpc/events.c      |    4 ++--
 9 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
index 4d74e8a..1cca6c7 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
@@ -58,8 +58,6 @@ static inline int __is_po2(unsigned long long val)
 #define LERRCHKSUM(hexnum) (((hexnum) & 0xf) ^ ((hexnum) >> 4 & 0xf) ^ \
 			   ((hexnum) >> 8 & 0xf))
 
-#define LUSTRE_SRV_LNET_PID      LUSTRE_LNET_PID
-
 #include <linux/list.h>
 
 /* need both kernel and user-land acceptor */
diff --git a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
index aac5900..d94b266 100644
--- a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
@@ -118,9 +118,6 @@ do {								    \
 #define CDEBUG_STACK() (0L)
 #endif /* __x86_64__ */
 
-/* initial pid  */
-#define LUSTRE_LNET_PID	  12345
-
 #define __current_nesting_level() (0)
 
 /**
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index ebde036..6b88902 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -1788,7 +1788,10 @@ ksocknal_query(lnet_ni_t *ni, lnet_nid_t nid, unsigned long *when)
 	unsigned long now = cfs_time_current();
 	ksock_peer_t *peer = NULL;
 	rwlock_t *glock = &ksocknal_data.ksnd_global_lock;
-	lnet_process_id_t id = {.nid = nid, .pid = LUSTRE_SRV_LNET_PID};
+	lnet_process_id_t id = {
+		.nid = nid,
+		.pid = LNET_PID_LUSTRE,
+	};
 
 	read_lock(glock);
 
@@ -2136,7 +2139,7 @@ ksocknal_ctl(lnet_ni_t *ni, unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_ADD_PEER:
 		id.nid = data->ioc_nid;
-		id.pid = LUSTRE_SRV_LNET_PID;
+		id.pid = LNET_PID_LUSTRE;
 		return ksocknal_add_peer(ni, id,
 					  data->ioc_u32[0], /* IP */
 					  data->ioc_u32[1]); /* port */
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index acc216e..949fa2f 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -2114,7 +2114,7 @@ lnet_ping(lnet_process_id_t id, int timeout_ms, lnet_process_id_t __user *ids,
 		return -EINVAL;
 
 	if (id.pid == LNET_PID_ANY)
-		id.pid = LUSTRE_SRV_LNET_PID;
+		id.pid = LNET_PID_LUSTRE;
 
 	LIBCFS_ALLOC(info, infosz);
 	if (info == NULL)
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index e1461af..b9388ed 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1388,7 +1388,7 @@ lnet_send(lnet_nid_t src_nid, lnet_msg_t *msg, lnet_nid_t rtr_nid)
 
 		msg->msg_target_is_router = 1;
 		msg->msg_target.nid = lp->lp_nid;
-		msg->msg_target.pid = LUSTRE_SRV_LNET_PID;
+		msg->msg_target.pid = LNET_PID_LUSTRE;
 	}
 
 	/* 'lp' is our best choice of peer */
diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c
index 281315c..48eb085 100644
--- a/drivers/staging/lustre/lnet/lnet/module.c
+++ b/drivers/staging/lustre/lnet/lnet/module.c
@@ -53,7 +53,7 @@ lnet_configure(void *arg)
 	mutex_lock(&lnet_config_mutex);
 
 	if (!the_lnet.ln_niinit_self) {
-		rc = LNetNIInit(LUSTRE_SRV_LNET_PID);
+		rc = LNetNIInit(LNET_PID_LUSTRE);
 		if (rc >= 0) {
 			the_lnet.ln_niinit_self = 1;
 			rc = 0;
@@ -93,7 +93,7 @@ lnet_dyn_configure(struct libcfs_ioctl_hdr *hdr)
 
 	mutex_lock(&lnet_config_mutex);
 	if (the_lnet.ln_niinit_self)
-		rc = lnet_dyn_add_ni(LUSTRE_SRV_LNET_PID,
+		rc = lnet_dyn_add_ni(LNET_PID_LUSTRE,
 				     conf->cfg_config_u.cfg_net.net_intf,
 				     conf->cfg_config_u.cfg_net.
 					net_peer_timeout,
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 17e6795..1f5a4b1 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -1009,7 +1009,7 @@ lnet_ping_router_locked(lnet_peer_t *rtr)
 		lnet_handle_md_t mdh;
 
 		id.nid = rtr->lp_nid;
-		id.pid = LUSTRE_SRV_LNET_PID;
+		id.pid = LNET_PID_LUSTRE;
 		CDEBUG(D_NET, "Check: %s\n", libcfs_id2str(id));
 
 		rtr->lp_ping_notsent   = 1;
diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index 86de680..2212199 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -1587,7 +1587,7 @@ srpc_startup(void)
 
 	srpc_data.rpc_state = SRPC_STATE_NONE;
 
-	rc = LNetNIInit(LUSTRE_SRV_LNET_PID);
+	rc = LNetNIInit(LNET_PID_LUSTRE);
 	if (rc < 0) {
 		CERROR("LNetNIInit() has failed: %d\n", rc);
 		return rc;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c b/drivers/staging/lustre/lustre/ptlrpc/events.c
index 9c2fd34..2a0dfa5 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/events.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/events.c
@@ -446,7 +446,7 @@ int ptlrpc_uuid_to_peer(struct obd_uuid *uuid,
 
 	portals_compatibility = LNetCtl(IOC_LIBCFS_PORTALS_COMPATIBILITY, NULL);
 
-	peer->pid = LUSTRE_SRV_LNET_PID;
+	peer->pid = LNET_PID_LUSTRE;
 
 	/* Choose the matching UUID that's closest */
 	while (lustre_uuid_to_peer(uuid->uuid, &dst_nid, count++) == 0) {
@@ -524,7 +524,7 @@ static lnet_pid_t ptl_get_pid(void)
 {
 	lnet_pid_t pid;
 
-	pid = LUSTRE_SRV_LNET_PID;
+	pid = LNET_PID_LUSTRE;
 	return pid;
 }
 
-- 
1.7.1



* [PATCH 19/40] staging: lustre: copy out libcfs ioctl inline buffer
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (17 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 18/40] staging: lustre: remove LUSTRE_{,SRV_}LNET_PID James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02 12:34   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 20/40] staging: lustre: fix kernel crash when network failed to start James Simmons
                   ` (21 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Liang Zhen

From: Liang Zhen <liang.zhen@intel.com>

  - libcfs_ioctl_popdata should copy out inline buffers.
  - code cleanup for libcfs ioctl handler
  - error number fix for obd_ioctl_getdata
  - add new function libcfs_ioctl_unpack for upcoming patches
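
The get/pop pattern listed above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: struct ioc_hdr,
ioc_getdata() and ioc_popdata() are hypothetical stand-ins, and memcpy()
takes the place of copy_from_user()/copy_to_user(), which in the kernel
can fail with -EFAULT.  Re-storing the validated length after the second
copy is an extra precaution added in this sketch, not something the
patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define IOC_DATA_MAX (128 * 1024)	/* same cap as LIBCFS_IOC_DATA_MAX */

/* Hypothetical header; only the length field matters for this sketch. */
struct ioc_hdr {
	uint32_t ioc_len;	/* total size: header plus inline payload */
	uint32_t ioc_version;
};

/* Copy a caller-supplied, variable-length request into a fresh buffer. */
static int ioc_getdata(struct ioc_hdr **out, const void *ubuf)
{
	struct ioc_hdr hdr;
	struct ioc_hdr *buf;

	/* Step 1: fetch the fixed-size header only. */
	memcpy(&hdr, ubuf, sizeof(hdr));

	/* Step 2: bound the length before trusting it. */
	if (hdr.ioc_len < sizeof(hdr) || hdr.ioc_len > IOC_DATA_MAX)
		return -EINVAL;

	/* Step 3: allocate exactly what the header asked for. */
	buf = malloc(hdr.ioc_len);
	if (!buf)
		return -ENOMEM;

	/* Step 4: copy the whole request, inline payload included. */
	memcpy(buf, ubuf, hdr.ioc_len);

	/* Assumption of this sketch: keep the length checked in step 2. */
	buf->ioc_len = hdr.ioc_len;

	*out = buf;
	return 0;
}

/* Copy the whole buffer, inline payload included, back to the caller. */
static void ioc_popdata(void *ubuf, const struct ioc_hdr *buf)
{
	assert(buf->ioc_len >= sizeof(*buf));
	memcpy(ubuf, buf, buf->ioc_len);
}
```

A caller would run ioc_getdata(), let a handler modify the buffer, call
ioc_popdata() on success and free the buffer in every case.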

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5435
Reviewed-on: http://review.whamcloud.com/11313
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |   24 +++-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    2 +
 .../lustre/lustre/libcfs/linux/linux-module.c      |   45 +++++---
 drivers/staging/lustre/lustre/libcfs/module.c      |  119 ++++++++------------
 .../lustre/lustre/obdclass/linux/linux-module.c    |   17 +--
 5 files changed, 97 insertions(+), 110 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
index f24330d..3468933 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
@@ -49,6 +49,9 @@ struct libcfs_ioctl_hdr {
 	__u32 ioc_version;
 };
 
+/** max size to copy from userspace */
+#define LIBCFS_IOC_DATA_MAX	(128 * 1024)
+
 struct libcfs_ioctl_data {
 	struct libcfs_ioctl_hdr ioc_hdr;
 
@@ -240,11 +243,22 @@ static inline bool libcfs_ioctl_is_invalid(struct libcfs_ioctl_data *data)
 
 int libcfs_register_ioctl(struct libcfs_ioctl_handler *hand);
 int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand);
-int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr *buf, __u32 buf_len,
-			 const void __user *arg);
-int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
-			     __u32 *buf_len);
-int libcfs_ioctl_popdata(void *arg, void *buf, int size);
+int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
+			 struct libcfs_ioctl_hdr __user *uparam);
+
+static inline int libcfs_ioctl_popdata(struct libcfs_ioctl_hdr *hdr,
+				       struct libcfs_ioctl_hdr __user *uparam)
+{
+	if (copy_to_user(uparam, hdr, hdr->ioc_len))
+		return -EFAULT;
+	return 0;
+}
+
+static inline void libcfs_ioctl_freedata(struct libcfs_ioctl_hdr *hdr)
+{
+	LIBCFS_FREE(hdr, hdr->ioc_len);
+}
+
 int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data);
 
 #endif /* __LIBCFS_IOCTL_H__ */
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 949fa2f..4c4e6d3 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1838,6 +1838,8 @@ LNetCtl(unsigned int cmd, void *arg)
 	int rc;
 	unsigned long secs_passed;
 
+	CLASSERT(sizeof(struct lnet_ioctl_net_config) +
+		 sizeof(struct lnet_ioctl_config_data) < LIBCFS_IOC_DATA_MAX);
 	LASSERT(the_lnet.ln_init);
 
 	switch (cmd) {
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
index 1c31e2e..50a5464 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
@@ -43,7 +43,7 @@
 int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data)
 {
 	if (libcfs_ioctl_is_invalid(data)) {
-		CERROR("LNET: ioctl not correctly formatted\n");
+		CERROR("libcfs ioctl: parameter not correctly formatted\n");
 		return -EINVAL;
 	}
 
@@ -57,39 +57,46 @@ int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data)
 	return 0;
 }
 
-int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
-			     __u32 *len)
+int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
+			 struct libcfs_ioctl_hdr __user *uhdr)
 {
 	struct libcfs_ioctl_hdr hdr;
+	int err = 0;
 
-	if (copy_from_user(&hdr, arg, sizeof(hdr)))
+	if (copy_from_user(&hdr, uhdr, sizeof(hdr)))
 		return -EFAULT;
 
 	if (hdr.ioc_version != LIBCFS_IOCTL_VERSION &&
 	    hdr.ioc_version != LIBCFS_IOCTL_VERSION2) {
-		CERROR("LNET: version mismatch expected %#x, got %#x\n",
+		CERROR("libcfs ioctl: version mismatch expected %#x, got %#x\n",
 		       LIBCFS_IOCTL_VERSION, hdr.ioc_version);
 		return -EINVAL;
 	}
 
-	*len = hdr.ioc_len;
+	if (hdr.ioc_len < sizeof(struct libcfs_ioctl_data)) {
+		CERROR("libcfs ioctl: user buffer too small for ioctl\n");
+		return -EINVAL;
+	}
 
-	return 0;
-}
+	if (hdr.ioc_len > LIBCFS_IOC_DATA_MAX) {
+		CERROR("libcfs ioctl: user buffer is too large %d/%d\n",
+		       hdr.ioc_len, LIBCFS_IOC_DATA_MAX);
+		return -EINVAL;
+	}
 
-int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr *buf, __u32 buf_len,
-			  const void __user *arg)
-{
-	if (copy_from_user(buf, arg, buf_len))
-		return -EFAULT;
-	return 0;
-}
+	LIBCFS_ALLOC(*hdr_pp, hdr.ioc_len);
+	if (!*hdr_pp)
+		return -ENOMEM;
+
+	if (copy_from_user(*hdr_pp, uhdr, hdr.ioc_len)) {
+		err = -EFAULT;
+		goto failed;
+	}
 
-int libcfs_ioctl_popdata(void *arg, void *data, int size)
-{
-	if (copy_to_user((char *)arg, data, size))
-		return -EFAULT;
 	return 0;
+failed:
+	libcfs_ioctl_freedata(*hdr_pp);
+	return err;
 }
 
 static int
diff --git a/drivers/staging/lustre/lustre/libcfs/module.c b/drivers/staging/lustre/lustre/libcfs/module.c
index 992ff3c..1be814d 100644
--- a/drivers/staging/lustre/lustre/libcfs/module.c
+++ b/drivers/staging/lustre/lustre/libcfs/module.c
@@ -54,9 +54,6 @@
 
 # define DEBUG_SUBSYSTEM S_LNET
 
-#define LNET_MAX_IOCTL_BUF_LEN (sizeof(struct lnet_ioctl_net_config) + \
-				sizeof(struct lnet_ioctl_config_data))
-
 #include "../../include/linux/libcfs/libcfs.h"
 #include <asm/div64.h>
 
@@ -245,52 +242,63 @@ int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand)
 }
 EXPORT_SYMBOL(libcfs_deregister_ioctl);
 
-static int libcfs_ioctl_handle(struct cfs_psdev_file *pfile, unsigned long cmd,
-			       void *arg, struct libcfs_ioctl_hdr *hdr)
+static int libcfs_ioctl(struct cfs_psdev_file *pfile,
+			unsigned long cmd, void __user *uparam)
 {
 	struct libcfs_ioctl_data *data = NULL;
-	int err = -EINVAL;
+	struct libcfs_ioctl_hdr *hdr;
+	int err;
+
+	/* 'cmd' and permissions get checked in our arch-specific caller */
+	err = libcfs_ioctl_getdata(&hdr, uparam);
+	if (err != 0) {
+		CDEBUG_LIMIT(D_ERROR,
+			     "libcfs ioctl: data header error %d\n", err);
+		return err;
+	}
 
-	/*
-	 * The libcfs_ioctl_data_adjust() function performs adjustment
-	 * operations on the libcfs_ioctl_data structure to make
-	 * it usable by the code.  This doesn't need to be called
-	 * for new data structures added.
-	 */
 	if (hdr->ioc_version == LIBCFS_IOCTL_VERSION) {
+		/*
+		 * The libcfs_ioctl_data_adjust() function performs adjustment
+		 * operations on the libcfs_ioctl_data structure to make
+		 * it usable by the code.  This doesn't need to be called
+		 * for new data structures added.
+		 */
 		data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
 		err = libcfs_ioctl_data_adjust(data);
-		if (err != 0) {
-			return err;
-		}
+		if (err != 0)
+			goto out;
 	}
 
+	CDEBUG(D_IOCTL, "libcfs ioctl cmd %lu\n", cmd);
 	switch (cmd) {
 	case IOC_LIBCFS_CLEAR_DEBUG:
 		libcfs_debug_clear_buffer();
-		return 0;
+		break;
 	/*
 	 * case IOC_LIBCFS_PANIC:
 	 * Handled in arch/cfs_module.c
 	 */
 	case IOC_LIBCFS_MARK_DEBUG:
-		if (data->ioc_inlbuf1 == NULL ||
-		    data->ioc_inlbuf1[data->ioc_inllen1 - 1] != '\0')
-			return -EINVAL;
+		if (!data || !data->ioc_inlbuf1 ||
+		    data->ioc_inlbuf1[data->ioc_inllen1 - 1] != '\0') {
+			err = -EINVAL;
+			goto out;
+		}
 		libcfs_debug_mark_buffer(data->ioc_inlbuf1);
-		return 0;
+		break;
+
 	case IOC_LIBCFS_MEMHOG:
-		if (pfile->private_data == NULL) {
+		if (!data || !pfile->private_data) {
 			err = -EINVAL;
-		} else {
-			kportal_memhog_free(pfile->private_data);
-			/* XXX The ioc_flags is not GFP flags now, need to be fixed */
-			err = kportal_memhog_alloc(pfile->private_data,
-						   data->ioc_count,
-						   data->ioc_flags);
-			if (err != 0)
-				kportal_memhog_free(pfile->private_data);
+			goto out;
 		}
+
+		kportal_memhog_free(pfile->private_data);
+		err = kportal_memhog_alloc(pfile->private_data,
+					   data->ioc_count, data->ioc_flags);
+		if (err != 0)
+			kportal_memhog_free(pfile->private_data);
 		break;
 
 	default: {
@@ -300,55 +308,18 @@ static int libcfs_ioctl_handle(struct cfs_psdev_file *pfile, unsigned long cmd,
 		down_read(&ioctl_list_sem);
 		list_for_each_entry(hand, &ioctl_list, item) {
 			err = hand->handle_ioctl(cmd, hdr);
-			if (err != -EINVAL) {
-				if (err == 0)
-					err = libcfs_ioctl_popdata(arg,
-							hdr, hdr->ioc_len);
-				break;
-			}
+			if (err == -EINVAL)
+				continue;
+
+			if (!err)
+				err = libcfs_ioctl_popdata(hdr, uparam);
+			break;
 		}
 		up_read(&ioctl_list_sem);
-		break;
+		break; }
 	}
-	}
-
-	return err;
-}
-
-static int libcfs_ioctl(struct cfs_psdev_file *pfile, unsigned long cmd, void *arg)
-{
-	struct libcfs_ioctl_hdr *hdr;
-	int err = 0;
-	__u32 buf_len;
-
-	err = libcfs_ioctl_getdata_len(arg, &buf_len);
-	if (err != 0)
-		return err;
-
-	/*
-	 * do a check here to restrict the size of the memory
-	 * to allocate to guard against DoS attacks.
-	 */
-	if (buf_len > LNET_MAX_IOCTL_BUF_LEN) {
-		CERROR("LNET: user buffer exceeds kernel buffer\n");
-		return -EINVAL;
-	}
-
-	LIBCFS_ALLOC_GFP(hdr, buf_len, GFP_KERNEL);
-	if (!hdr)
-		return -ENOMEM;
-
-	/* 'cmd' and permissions get checked in our arch-specific caller */
-	if (libcfs_ioctl_getdata(hdr, buf_len, arg)) {
-		CERROR("LNET ioctl: data error\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-	err = libcfs_ioctl_handle(pfile, cmd, arg, hdr);
-
 out:
-	LIBCFS_FREE(hdr, buf_len);
+	libcfs_ioctl_freedata(hdr);
 	return err;
 }
 
diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
index a055cbb..4e9b0c4 100644
--- a/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
+++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
@@ -74,14 +74,14 @@
 #include "../../include/lustre/lustre_build_version.h"
 
 /* buffer MUST be at least the size of obd_ioctl_hdr */
-int obd_ioctl_getdata(char **buf, int *len, void *arg)
+int obd_ioctl_getdata(char **buf, int *len, void __user *arg)
 {
 	struct obd_ioctl_hdr hdr;
 	struct obd_ioctl_data *data;
 	int err;
 	int offset = 0;
 
-	if (copy_from_user(&hdr, (void *)arg, sizeof(hdr)))
+	if (copy_from_user(&hdr, arg, sizeof(hdr)))
 		return -EFAULT;
 
 	if (hdr.ioc_version != OBD_IOCTL_VERSION) {
@@ -114,14 +114,10 @@ int obd_ioctl_getdata(char **buf, int *len, void *arg)
 	*len = hdr.ioc_len;
 	data = (struct obd_ioctl_data *)*buf;
 
-	if (copy_from_user(*buf, (void *)arg, hdr.ioc_len)) {
+	if (copy_from_user(*buf, arg, hdr.ioc_len)) {
 		err = -EFAULT;
 		goto free_buf;
 	}
-	if (hdr.ioc_len != data->ioc_len) {
-		err = -EINVAL;
-		goto free_buf;
-	}
 
 	if (obd_ioctl_is_invalid(data)) {
 		CERROR("ioctl not correctly formatted\n");
@@ -144,9 +140,8 @@ int obd_ioctl_getdata(char **buf, int *len, void *arg)
 		offset += cfs_size_round(data->ioc_inllen3);
 	}
 
-	if (data->ioc_inllen4) {
+	if (data->ioc_inllen4)
 		data->ioc_inlbuf4 = &data->ioc_bulk[0] + offset;
-	}
 
 	return 0;
 
@@ -160,9 +155,7 @@ int obd_ioctl_popdata(void *arg, void *data, int len)
 {
 	int err;
 
-	err = copy_to_user(arg, data, len);
-	if (err)
-		err = -EFAULT;
+	err = copy_to_user(arg, data, len) ? -EFAULT : 0;
 	return err;
 }
 EXPORT_SYMBOL(obd_ioctl_popdata);
-- 
1.7.1



* [PATCH 20/40] staging: lustre: fix kernel crash when network failed to start
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (18 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 19/40] staging: lustre: copy out libcfs ioctl inline buffer James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02 12:44   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 21/40] staging: lustre: improve LNet clean up code and API James Simmons
                   ` (20 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

When loading Lustre modules without proper network configuration,
modprobe always hits the following kernel panic:
LNetError: 105-4: Error -100 starting up LNI tcp
LNetError: 2145:0:(api-ni.c:823:lnet_unprepare())
ASSERTION( list_empty(&the_lnet.ln_nis) ) failed:
LNetError: 2145:0:(api-ni.c:823:lnet_unprepare()) LBUG
Pid: 2145, comm: modprobe
Call Trace:
[<ffffffffa044f853>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[<ffffffffa044fdf5>] lbug_with_loc+0x45/0xc0 [libcfs]
[<ffffffffa04f3267>] lnet_unprepare+0x297/0x340 [lnet]
[<ffffffffa04f3b5c>] LNetNIInit+0x25c/0x3e0 [lnet]
[<ffffffff81061bc6>] ? put_online_cpus+0x56/0x80
[<ffffffffa0983000>] ? init_module+0x0/0x1000 [ptlrpc]
[<ffffffffa081310c>] ptlrpc_ni_init+0x2c/0x1a0 [ptlrpc]
[<ffffffffa0983000>] ? init_module+0x0/0x1000 [ptlrpc]
[<ffffffffa0813291>] ptlrpc_init_portals+0x11/0xf0 [ptlrpc]
[<ffffffffa0983000>] ? init_module+0x0/0x1000 [ptlrpc]
[<ffffffffa09831c4>] init_module+0x1c4/0x1000 [ptlrpc]
[<ffffffff810020e2>] do_one_initcall+0xe2/0x190
[<ffffffff810ca7fb>] load_module+0x129b/0x1a90
[<ffffffff812da590>] ? ddebug_dyndbg_module_param_cb+0x0/0x60
[<ffffffff810c7133>] ? copy_module_from_fd.isra.43+0x53/0x150
[<ffffffff810cb1a6>] SyS_finit_module+0xa6/0xd0
[<ffffffff815f2119>] system_call_fastpath+0x16/0x1b
...
This is because lnet_startup_lndnis() may add items to
@the_lnet.ln_nis and @the_lnet.ln_nis_cpt before it fails. The
lnet_startup_lndnis() failure path did not clean up these lists,
which triggered the assertion in lnet_unprepare().

Fix the assertion by cleaning up using lnet_shutdown_lndnis()
if the startup fails.

In a future enhancement the ni startup API will be modified to
clean up after itself in case of failure.
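
The effect of retargeting the goto can be sketched as below.  This is a
minimal userspace model, not the LNet code: nis_linked stands in for the
ln_nis list, all demo_* names are hypothetical, and it assumes that the
failed2 label is the one that runs the shutdown step before falling
through to the unprepare step.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the_lnet.ln_nis: number of linked NIs. */
static int nis_linked;

/* Link NIs one by one; may fail part-way and leave some linked. */
static int demo_startup_nis(int want, int fail_at)
{
	for (int i = 0; i < want; i++) {
		if (i == fail_at)
			return -ENETDOWN;
		nis_linked++;
	}
	return 0;
}

/* Unlink everything that startup managed to link. */
static void demo_shutdown_nis(void)
{
	nis_linked = 0;
}

/* Mirrors the list_empty() assertion that fired in the panic. */
static void demo_unprepare(void)
{
	assert(nis_linked == 0);
}

static int demo_ni_init(int want, int fail_at)
{
	int rc;

	rc = demo_startup_nis(want, fail_at);
	if (rc != 0)
		goto failed2;	/* jumping past the shutdown would assert */

	return 0;

failed2:
	demo_shutdown_nis();
	/* the old target, failed1, would have started here */
	demo_unprepare();
	return rc;
}
```

With the shutdown step skipped, demo_unprepare() would trip its
assertion whenever startup failed after linking at least one NI.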

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5568
Reviewed-on: http://review.whamcloud.com/12512
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 4c4e6d3..bfc1f13 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1246,6 +1246,10 @@ lnet_shutdown_lndni(__u32 net)
 	return 0;
 }
 
+/*
+ * Callers of lnet_startup_lndnis need to clean up using
+ * lnet_shutdown_lndnis if startup fails
+ */
 static int
 lnet_startup_lndnis(struct list_head *nilist, __s32 peer_timeout,
 		    __s32 peer_cr, __s32 peer_buf_cr, __s32 credits,
@@ -1554,7 +1558,7 @@ LNetNIInit(lnet_pid_t requested_pid)
 
 	rc = lnet_startup_lndnis(&net_head, -1, -1, -1, -1, &ni_count);
 	if (rc != 0)
-		goto failed1;
+		goto failed2;
 
 	if (the_lnet.ln_eq_waitni && ni_count > 1) {
 		lnd_type = the_lnet.ln_eq_waitni->ni_lnd->lnd_type;
-- 
1.7.1



* [PATCH 21/40] staging: lustre: improve LNet clean up code and API
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (19 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 20/40] staging: lustre: fix kernel crash when network failed to start James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-12-02 12:59   ` Dan Carpenter
  2015-11-20 23:35 ` [PATCH 22/40] staging: lustre: Fixes to make lnetctl function as expected James Simmons
                   ` (19 subsequent siblings)
  40 siblings, 1 reply; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This patch addresses a set of related issues: LU-5734, LU-5839,
LU-5849, LU-5850.

Create the local lnet_startup_lndni() API.  This function starts
up one LND.  lnet_startup_lndnis() calls this function in a loop
on every ni in the list passed in.  lnet_startup_lndni() is
responsible for cleaning up after itself in case of failure.
It calls lnet_ni_free() if the ni fails to start.  It calls
lnet_shutdown_lndni() if it successfully called the
lnd startup function, but fails later on.

lnet_startup_lndnis() also cleans up after itself.
If lnet_startup_lndni() fails then lnet_shutdown_lndnis() is
called to clean up all nis that might have been
started, and then free the rest of the nis on the list
which have not been started yet.
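
The self-cleaning startup loop described above can be sketched as
follows.  This is a simplified model under stated assumptions: struct ni,
its state field, the broken flag, start_one() and start_all() are all
hypothetical, and "freeing" is modelled as a state change rather than a
real deallocation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum ni_state { NI_PENDING, NI_STARTED, NI_FREED };

struct ni {
	enum ni_state state;
	int broken;		/* hypothetical: non-zero makes startup fail */
};

/* Start one NI; on failure it cleans up after itself. */
static int start_one(struct ni *ni)
{
	if (ni->broken) {
		ni->state = NI_FREED;
		return -ENETDOWN;
	}
	ni->state = NI_STARTED;
	return 0;
}

/*
 * Start every NI in the array.  Returns the number started, or a
 * negative errno with nothing left started or pending.
 */
static int start_all(struct ni *nis, size_t n)
{
	size_t i, j;
	int rc = 0;

	for (i = 0; i < n; i++) {
		rc = start_one(&nis[i]);
		if (rc != 0)
			goto failed;
	}
	return (int)n;

failed:
	/* Shut down the NIs that were already started. */
	for (j = 0; j < i; j++) {
		assert(nis[j].state == NI_STARTED);
		nis[j].state = NI_FREED;
	}
	/* Free the NIs that never got a chance to start. */
	for (j = i + 1; j < n; j++)
		nis[j].state = NI_FREED;
	return rc;
}
```

The caller then has a single rule: a negative return means there is
nothing left to clean up.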

To facilitate the above changes lnet_dyn_del_ni() now
manages the ping info.  It calls lnet_shutdown_lndni()
to shut down the NI.  lnet_shutdown_lndni() is no longer
an exposed API and doesn't manage the ping info, making
it callable from lnet_startup_lndni() as well.

There are two scenarios for calling lnet_startup_lndni():

1. from lnet_startup_lndnis()
If lnet_startup_lndni() fails it needs to shut down the ni
without touching the ping information, as it hasn't been
created yet.

2. from lnet_dyn_add_ni()
As above, it will shut down the ni, and then lnet_dyn_add_ni()
will take care of managing the ping info.

The second part of this change is to ensure that the LOLND is not
added by lnet_parse_networks(), but by the caller which needs it
(i.e. LNetNIInit).

With this change lnet_dyn_add_ni() need only check that exactly one
net is being added.  If not, it frees everything; otherwise it
proceeds to start up the requested net.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5734
Reviewed-on: http://review.whamcloud.com/12658
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    2 +
 drivers/staging/lustre/lnet/lnet/api-ni.c          |  461 ++++++++++----------
 drivers/staging/lustre/lnet/lnet/config.c          |   14 +-
 3 files changed, 245 insertions(+), 232 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index a1f94db..4c2d824 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -420,6 +420,8 @@ lnet_ni_decref(lnet_ni_t *ni)
 }
 
 void lnet_ni_free(lnet_ni_t *ni);
+lnet_ni_t *
+lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist);
 
 static inline int
 lnet_nid2peerhash(lnet_nid_t nid)
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index bfc1f13..e40c657 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1064,6 +1064,20 @@ lnet_ni_tq_credits(lnet_ni_t *ni)
 }
 
 static void
+lnet_ni_unlink_locked(lnet_ni_t *ni)
+{
+	if (!list_empty(&ni->ni_cptlist)) {
+		list_del_init(&ni->ni_cptlist);
+		lnet_ni_decref_locked(ni, 0);
+	}
+
+	/* move it to zombie list and nobody can find it anymore */
+	LASSERT(!list_empty(&ni->ni_list));
+	list_move(&ni->ni_list, &the_lnet.ln_nis_zombie);
+	lnet_ni_decref_locked(ni, 0);	/* drop ln_nis' ref */
+}
+
+static void
 lnet_clear_zombies_nis_locked(void)
 {
 	int i;
@@ -1146,14 +1160,7 @@ lnet_shutdown_lndnis(void)
 	while (!list_empty(&the_lnet.ln_nis)) {
 		ni = list_entry(the_lnet.ln_nis.next,
 				lnet_ni_t, ni_list);
-		/* move it to zombie list and nobody can find it anymore */
-		list_move(&ni->ni_list, &the_lnet.ln_nis_zombie);
-		lnet_ni_decref_locked(ni, 0);	/* drop ln_nis' ref */
-
-		if (!list_empty(&ni->ni_cptlist)) {
-			list_del_init(&ni->ni_cptlist);
-			lnet_ni_decref_locked(ni, 0);
-		}
+		lnet_ni_unlink_locked(ni);
 	}
 
 	/* Drop the cached eqwait NI. */
@@ -1186,233 +1193,196 @@ lnet_shutdown_lndnis(void)
 	lnet_net_unlock(LNET_LOCK_EX);
 }
 
-int
-lnet_shutdown_lndni(__u32 net)
+/* shut down the NI and release refcount */
+static void
+lnet_shutdown_lndni(struct lnet_ni *ni)
 {
-	lnet_ping_info_t *pinfo;
-	lnet_handle_md_t md_handle;
-	lnet_ni_t *found_ni = NULL;
-	int ni_count;
-	int rc;
-
-	if (LNET_NETTYP(net) == LOLND)
-		return -EINVAL;
-
-	ni_count = lnet_get_ni_count();
-
-	/* create and link a new ping info, before removing the old one */
-	rc = lnet_ping_info_setup(&pinfo, &md_handle, ni_count - 1, false);
-	if (rc != 0)
-		return rc;
-
-	/* proceed with shutting down the NI */
 	lnet_net_lock(LNET_LOCK_EX);
-
-	found_ni = lnet_net2ni_locked(net, 0);
-	if (!found_ni) {
-		lnet_net_unlock(LNET_LOCK_EX);
-		lnet_ping_md_unlink(pinfo, &md_handle);
-		lnet_ping_info_free(pinfo);
-		return -EINVAL;
-	}
-
-	/*
-	 * decrement the reference counter on found_ni which was
-	 * incremented when we called lnet_net2ni_locked()
-	 */
-	lnet_ni_decref_locked(found_ni, 0);
-	/* Move ni to zombie list so nobody can find it anymore */
-	list_move(&found_ni->ni_list, &the_lnet.ln_nis_zombie);
-
-	/* Drop the lock reference for the ln_nis ref. */
-	lnet_ni_decref_locked(found_ni, 0);
-
-	if (!list_empty(&found_ni->ni_cptlist)) {
-		list_del_init(&found_ni->ni_cptlist);
-		lnet_ni_decref_locked(found_ni, 0);
-	}
-
+	lnet_ni_unlink_locked(ni);
 	lnet_net_unlock(LNET_LOCK_EX);
 
 	/* Do peer table cleanup for this ni */
-	lnet_peer_tables_cleanup(found_ni);
+	lnet_peer_tables_cleanup(ni);
 
 	lnet_net_lock(LNET_LOCK_EX);
 	lnet_clear_zombies_nis_locked();
 	lnet_net_unlock(LNET_LOCK_EX);
-
-	lnet_ping_target_update(pinfo, md_handle);
-
-	return 0;
 }
 
-/*
- * Callers of lnet_startup_lndnis need to clean up using
- * lnet_shutdown_lndnis if startup fails
- */
 static int
-lnet_startup_lndnis(struct list_head *nilist, __s32 peer_timeout,
-		    __s32 peer_cr, __s32 peer_buf_cr, __s32 credits,
-		    int *ni_count)
+lnet_startup_lndni(struct lnet_ni *ni, __s32 peer_timeout,
+		   __s32 peer_cr, __s32 peer_buf_cr, __s32 credits)
 {
+	int rc = 0;
+	__u32 lnd_type;
 	lnd_t *lnd;
-	struct lnet_ni *ni;
 	struct lnet_tx_queue *tq;
 	int i;
-	int rc = 0;
-	__u32 lnd_type;
 
-	while (!list_empty(nilist)) {
-		ni = list_entry(nilist->next, lnet_ni_t, ni_list);
-		lnd_type = LNET_NETTYP(LNET_NIDNET(ni->ni_nid));
+	lnd_type = LNET_NETTYP(LNET_NIDNET(ni->ni_nid));
 
-		if (!libcfs_isknown_lnd(lnd_type))
-			goto failed;
-
-		if (lnd_type == CIBLND    ||
-		    lnd_type == OPENIBLND ||
-		    lnd_type == IIBLND    ||
-		    lnd_type == VIBLND) {
-			CERROR("LND %s obsoleted\n",
-			       libcfs_lnd2str(lnd_type));
-			goto failed;
-		}
+	LASSERT(libcfs_isknown_lnd(lnd_type));
 
-		/* Make sure this new NI is unique. */
-		lnet_net_lock(LNET_LOCK_EX);
-		if (!lnet_net_unique(LNET_NIDNET(ni->ni_nid),
-				     &the_lnet.ln_nis)) {
-			if (lnd_type == LOLND) {
-				lnet_net_unlock(LNET_LOCK_EX);
-				list_del(&ni->ni_list);
-				lnet_ni_free(ni);
-				continue;
-			}
+	if (lnd_type == CIBLND || lnd_type == OPENIBLND ||
+	    lnd_type == IIBLND || lnd_type == VIBLND) {
+		CERROR("LND %s obsoleted\n", libcfs_lnd2str(lnd_type));
+		goto failed0;
+	}
 
-			CERROR("Net %s is not unique\n",
-			       libcfs_net2str(LNET_NIDNET(ni->ni_nid)));
+	/* Make sure this new NI is unique. */
+	lnet_net_lock(LNET_LOCK_EX);
+	if (!lnet_net_unique(LNET_NIDNET(ni->ni_nid), &the_lnet.ln_nis)) {
+		if (lnd_type == LOLND) {
 			lnet_net_unlock(LNET_LOCK_EX);
-			goto failed;
+			lnet_ni_free(ni);
+			return 0;
 		}
 		lnet_net_unlock(LNET_LOCK_EX);
 
+		CERROR("Net %s is not unique\n",
+		       libcfs_net2str(LNET_NIDNET(ni->ni_nid)));
+		goto failed0;
+	}
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	mutex_lock(&the_lnet.ln_lnd_mutex);
+	lnd = lnet_find_lnd_by_type(lnd_type);
+
+	if (!lnd) {
+		mutex_unlock(&the_lnet.ln_lnd_mutex);
+		rc = request_module("%s", libcfs_lnd2modname(lnd_type));
 		mutex_lock(&the_lnet.ln_lnd_mutex);
-		lnd = lnet_find_lnd_by_type(lnd_type);
 
+		lnd = lnet_find_lnd_by_type(lnd_type);
 		if (lnd == NULL) {
 			mutex_unlock(&the_lnet.ln_lnd_mutex);
-			rc = request_module("%s",
-						libcfs_lnd2modname(lnd_type));
-			mutex_lock(&the_lnet.ln_lnd_mutex);
-
-			lnd = lnet_find_lnd_by_type(lnd_type);
-			if (lnd == NULL) {
-				mutex_unlock(&the_lnet.ln_lnd_mutex);
-				CERROR("Can't load LND %s, module %s, rc=%d\n",
-				       libcfs_lnd2str(lnd_type),
-				       libcfs_lnd2modname(lnd_type), rc);
-				goto failed;
-			}
+			CERROR("Can't load LND %s, module %s, rc=%d\n",
+			       libcfs_lnd2str(lnd_type),
+			       libcfs_lnd2modname(lnd_type), rc);
+			goto failed0;
 		}
+	}
+
+	lnet_net_lock(LNET_LOCK_EX);
+	lnd->lnd_refcount++;
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	ni->ni_lnd = lnd;
 
+	rc = lnd->lnd_startup(ni);
+
+	mutex_unlock(&the_lnet.ln_lnd_mutex);
+
+	if (rc != 0) {
+		LCONSOLE_ERROR_MSG(0x105, "Error %d starting up LNI %s\n",
+				   rc, libcfs_lnd2str(lnd->lnd_type));
 		lnet_net_lock(LNET_LOCK_EX);
-		lnd->lnd_refcount++;
+		lnd->lnd_refcount--;
 		lnet_net_unlock(LNET_LOCK_EX);
+		goto failed0;
+	}
 
-		ni->ni_lnd = lnd;
+	/*
+	 * If given some LND tunable parameters, parse those now to
+	 * override the values in the NI structure.
+	 */
+	if (peer_buf_cr >= 0)
+		ni->ni_peerrtrcredits = peer_buf_cr;
+	if (peer_timeout >= 0)
+		ni->ni_peertimeout = peer_timeout;
+	/*
+	 * TODO
+	 * Note: For now, don't allow the user to change
+	 * peertxcredits as this number is used in the
+	 * IB LND to control queue depth.
+	 * if (peer_cr != -1)
+	 *	ni->ni_peertxcredits = peer_cr;
+	 */
+	if (credits >= 0)
+		ni->ni_maxtxcredits = credits;
 
-		rc = (lnd->lnd_startup)(ni);
+	LASSERT(ni->ni_peertimeout <= 0 || lnd->lnd_query);
 
-		mutex_unlock(&the_lnet.ln_lnd_mutex);
+	lnet_net_lock(LNET_LOCK_EX);
+	/* refcount for ln_nis */
+	lnet_ni_addref_locked(ni, 0);
+	list_add_tail(&ni->ni_list, &the_lnet.ln_nis);
+	if (ni->ni_cpts) {
+		lnet_ni_addref_locked(ni, 0);
+		list_add_tail(&ni->ni_cptlist, &the_lnet.ln_nis_cpt);
+	}
 
-		if (rc != 0) {
-			LCONSOLE_ERROR_MSG(0x105, "Error %d starting up LNI %s\n",
-					   rc, libcfs_lnd2str(lnd->lnd_type));
-			lnet_net_lock(LNET_LOCK_EX);
-			lnd->lnd_refcount--;
-			lnet_net_unlock(LNET_LOCK_EX);
-			goto failed;
-		}
+	lnet_net_unlock(LNET_LOCK_EX);
 
+	if (lnd->lnd_type == LOLND) {
+		lnet_ni_addref(ni);
+		LASSERT(!the_lnet.ln_loni);
+		the_lnet.ln_loni = ni;
+		return 0;
+	}
+
+	if (ni->ni_peertxcredits == 0 || ni->ni_maxtxcredits == 0) {
+		LCONSOLE_ERROR_MSG(0x107, "LNI %s has no %scredits\n",
+				   libcfs_lnd2str(lnd->lnd_type),
+				   ni->ni_peertxcredits == 0 ?
+				   "" : "per-peer ");
 		/*
-		 * If given some LND tunable parameters, parse those now to
-		 * override the values in the NI structure.
-		 */
-		if (peer_buf_cr >= 0)
-			ni->ni_peerrtrcredits = peer_buf_cr;
-		if (peer_timeout >= 0)
-			ni->ni_peertimeout = peer_timeout;
-		/*
-		 * TODO
-		 * Note: For now, don't allow the user to change
-		 * peertxcredits as this number is used in the
-		 * IB LND to control queue depth.
-		 * if (peer_cr != -1)
-		 *	ni->ni_peertxcredits = peer_cr;
+		 * shutdown the NI since if we get here then it must've already
+		 * been started
 		 */
-		if (credits >= 0)
-			ni->ni_maxtxcredits = credits;
+		lnet_shutdown_lndni(ni);
+		return -EINVAL;
+	}
 
-		LASSERT(ni->ni_peertimeout <= 0 || lnd->lnd_query != NULL);
+	cfs_percpt_for_each(tq, i, ni->ni_tx_queues) {
+		tq->tq_credits_min =
+		tq->tq_credits_max =
+		tq->tq_credits = lnet_ni_tq_credits(ni);
+	}
 
-		list_del(&ni->ni_list);
+	CDEBUG(D_LNI, "Added LNI %s [%d/%d/%d/%d]\n",
+	       libcfs_nid2str(ni->ni_nid), ni->ni_peertxcredits,
+	       lnet_ni_tq_credits(ni) * LNET_CPT_NUMBER,
+	       ni->ni_peerrtrcredits, ni->ni_peertimeout);
 
-		lnet_net_lock(LNET_LOCK_EX);
-		/* refcount for ln_nis */
-		lnet_ni_addref_locked(ni, 0);
-		list_add_tail(&ni->ni_list, &the_lnet.ln_nis);
-		if (ni->ni_cpts != NULL) {
-			list_add_tail(&ni->ni_cptlist,
-					  &the_lnet.ln_nis_cpt);
-			lnet_ni_addref_locked(ni, 0);
-		}
-
-		lnet_net_unlock(LNET_LOCK_EX);
+	return 0;
+failed0:
+	lnet_ni_free(ni);
+	return -EINVAL;
+}
 
-		/* increment the ni_count here to account for the LOLND as
-		 * well.  If we increment past this point then the number
-		 * of count will be missing the LOLND, and then ping and
-		 * will not report the LOLND
-		 */
-		if (ni_count)
-			(*ni_count)++;
+static int
+lnet_startup_lndnis(struct list_head *nilist)
+{
+	struct lnet_ni *ni;
+	int rc;
+	int lnd_type;
+	int ni_count = 0;
 
-		if (lnd->lnd_type == LOLND) {
-			lnet_ni_addref(ni);
-			LASSERT(the_lnet.ln_loni == NULL);
-			the_lnet.ln_loni = ni;
-			continue;
-		}
+	while (!list_empty(nilist)) {
+		ni = list_entry(nilist->next, lnet_ni_t, ni_list);
+		list_del(&ni->ni_list);
+		rc = lnet_startup_lndni(ni, -1, -1, -1, -1);
 
-		if (ni->ni_peertxcredits == 0 ||
-		    ni->ni_maxtxcredits == 0) {
-			LCONSOLE_ERROR_MSG(0x107, "LNI %s has no %scredits\n",
-					   libcfs_lnd2str(lnd->lnd_type),
-					   ni->ni_peertxcredits == 0 ?
-					   "" : "per-peer ");
+		if (rc < 0)
 			goto failed;
-		}
 
-		cfs_percpt_for_each(tq, i, ni->ni_tx_queues) {
-			tq->tq_credits_min =
-			tq->tq_credits_max =
-			tq->tq_credits = lnet_ni_tq_credits(ni);
-		}
+		ni_count++;
+	}
 
-		CDEBUG(D_LNI, "Added LNI %s [%d/%d/%d/%d]\n",
-		       libcfs_nid2str(ni->ni_nid), ni->ni_peertxcredits,
-		       lnet_ni_tq_credits(ni) * LNET_CPT_NUMBER,
-		       ni->ni_peerrtrcredits, ni->ni_peertimeout);
+	if (the_lnet.ln_eq_waitni && ni_count > 1) {
+		lnd_type = the_lnet.ln_eq_waitni->ni_lnd->lnd_type;
+		LCONSOLE_ERROR_MSG(0x109, "LND %s can only run single-network\n",
+				   libcfs_lnd2str(lnd_type));
+		rc = -EINVAL;
+		goto failed;
 	}
 
-	return 0;
+	return ni_count;
 failed:
-	while (!list_empty(nilist)) {
-		ni = list_entry(nilist->next, lnet_ni_t, ni_list);
-		list_del(&ni->ni_list);
-		lnet_ni_free(ni);
-	}
-	return -EINVAL;
+	lnet_shutdown_lndnis();
+
+	return rc;
 }
 
 /**
@@ -1525,10 +1495,8 @@ int
 LNetNIInit(lnet_pid_t requested_pid)
 {
 	int im_a_router = 0;
-	int rc;
-	int ni_count = 0;
-	int lnd_type;
-	struct lnet_ni *ni;
+	int rc, rc2;
+	int ni_count;
 	lnet_ping_info_t *pinfo;
 	lnet_handle_md_t md_handle;
 	struct list_head net_head;
@@ -1547,37 +1515,50 @@ LNetNIInit(lnet_pid_t requested_pid)
 	}
 
 	rc = lnet_prepare(requested_pid);
-	if (rc != 0)
-		goto failed0;
+	if (rc != 0) {
+		mutex_unlock(&the_lnet.ln_api_mutex);
+		return rc;
+	}
 
-	rc = lnet_parse_networks(&net_head,
-				 !the_lnet.ln_nis_from_mod_params ?
-				 lnet_get_networks() : "");
-	if (rc < 0)
-		goto failed1;
+	/* Add in the loopback network */
+	if (!lnet_ni_alloc(LNET_MKNET(LOLND, 0), NULL, &net_head)) {
+		rc = -ENOMEM;
+		goto failed0;
+	}
 
-	rc = lnet_startup_lndnis(&net_head, -1, -1, -1, -1, &ni_count);
-	if (rc != 0)
-		goto failed2;
+	/*
+	 * If LNet is being initialized via DLC it is possible
+	 * that the user requests not to load module parameters (ones which
+	 * are supported by DLC) on initialization.  Therefore, make sure not
+	 * to load networks, routes and forwarding from module parameters
+	 * in this case. On cleanup in case of failure only clean up
+	 * routes if it has been loaded
+	 */
+	if (!the_lnet.ln_nis_from_mod_params) {
+		rc = lnet_parse_networks(&net_head, lnet_get_networks());
+		if (rc < 0)
+			goto failed0;
+	}
 
-	if (the_lnet.ln_eq_waitni && ni_count > 1) {
-		lnd_type = the_lnet.ln_eq_waitni->ni_lnd->lnd_type;
-		LCONSOLE_ERROR_MSG(0x109, "LND %s can only run single-network\n",
-				   libcfs_lnd2str(lnd_type));
-		goto failed2;
+	ni_count = lnet_startup_lndnis(&net_head);
+	if (ni_count < 0) {
+		rc = ni_count;
+		goto failed0;
 	}
 
-	rc = lnet_parse_routes(lnet_get_routes(), &im_a_router);
-	if (rc != 0)
-		goto failed2;
+	if (!the_lnet.ln_nis_from_mod_params) {
+		rc = lnet_parse_routes(lnet_get_routes(), &im_a_router);
+		if (rc != 0)
+			goto failed1;
 
-	rc = lnet_check_routes();
-	if (rc != 0)
-		goto failed2;
+		rc = lnet_check_routes();
+		if (rc != 0)
+			goto failed2;
 
-	rc = lnet_rtrpools_alloc(im_a_router);
-	if (rc != 0)
-		goto failed2;
+		rc = lnet_rtrpools_alloc(im_a_router);
+		if (rc != 0)
+			goto failed2;
+	}
 
 	rc = lnet_acceptor_start();
 	if (rc != 0)
@@ -1603,22 +1584,25 @@ LNetNIInit(lnet_pid_t requested_pid)
 	return 0;
 
  failed4:
-	the_lnet.ln_refcount = 0;
 	lnet_ping_md_unlink(pinfo, &md_handle);
 	lnet_ping_info_free(pinfo);
+	rc2 = LNetEQFree(the_lnet.ln_ping_target_eq);
+	LASSERT(rc2 == 0);
  failed3:
+	the_lnet.ln_refcount = 0;
 	lnet_acceptor_stop();
-	rc = LNetEQFree(the_lnet.ln_ping_target_eq);
-	LASSERT(rc == 0);
  failed2:
-	lnet_destroy_routes();
-	lnet_shutdown_lndnis();
+	if (!the_lnet.ln_nis_from_mod_params)
+		lnet_destroy_routes();
  failed1:
-	lnet_unprepare();
+	lnet_shutdown_lndnis();
  failed0:
+	lnet_unprepare();
 	LASSERT(rc < 0);
 	mutex_unlock(&the_lnet.ln_api_mutex);
 	while (!list_empty(&net_head)) {
+		struct lnet_ni *ni;
+
 		ni = list_entry(net_head.next, struct lnet_ni, ni_list);
 		list_del_init(&ni->ni_list);
 		lnet_ni_free(ni);
@@ -1769,8 +1753,8 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
 
 	/* Create a ni structure for the network string */
 	rc = lnet_parse_networks(&net_head, nets);
-	if (rc < 0)
-		return rc;
+	if (rc <= 0)
+		return rc == 0 ? -EINVAL : rc;
 
 	mutex_lock(&the_lnet.ln_api_mutex);
 
@@ -1784,8 +1768,11 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
 	if (rc != 0)
 		goto failed0;
 
-	rc = lnet_startup_lndnis(&net_head, peer_timeout, peer_cr,
-				 peer_buf_cr, credits, NULL);
+	ni = list_entry(net_head.next, struct lnet_ni, ni_list);
+	list_del_init(&ni->ni_list);
+
+	rc = lnet_startup_lndni(ni, peer_timeout, peer_cr,
+				peer_buf_cr, credits);
 	if (rc != 0)
 		goto failed1;
 
@@ -1810,10 +1797,38 @@ failed0:
 int
 lnet_dyn_del_ni(__u32 net)
 {
+	lnet_ni_t *ni;
+	lnet_ping_info_t *pinfo;
+	lnet_handle_md_t md_handle;
 	int rc;
 
+	/* don't allow userspace to shutdown the LOLND */
+	if (LNET_NETTYP(net) == LOLND)
+		return -EINVAL;
+
 	mutex_lock(&the_lnet.ln_api_mutex);
-	rc = lnet_shutdown_lndni(net);
+	/* create and link a new ping info, before removing the old one */
+	rc = lnet_ping_info_setup(&pinfo, &md_handle,
+				  lnet_get_ni_count() - 1, false);
+	if (rc != 0)
+		goto out;
+
+	ni = lnet_net2ni(net);
+	if (!ni) {
+		rc = -EINVAL;
+		goto failed;
+	}
+
+	/* decrement the reference counter taken by lnet_net2ni() */
+	lnet_ni_decref_locked(ni, 0);
+
+	lnet_shutdown_lndni(ni);
+	lnet_ping_target_update(pinfo, md_handle);
+	goto out;
+failed:
+	lnet_ping_md_unlink(pinfo, &md_handle);
+	lnet_ping_info_free(pinfo);
+out:
 	mutex_unlock(&the_lnet.ln_api_mutex);
 
 	return rc;
diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index d1e0217..7b7412b 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -114,7 +114,7 @@ lnet_ni_free(struct lnet_ni *ni)
 	LIBCFS_FREE(ni, sizeof(*ni));
 }
 
-static lnet_ni_t *
+lnet_ni_t *
 lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist)
 {
 	struct lnet_tx_queue *tq;
@@ -191,6 +191,7 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 	struct lnet_ni *ni;
 	__u32 net;
 	int nnets = 0;
+	struct list_head *temp_node;
 
 	if (!networks) {
 		CERROR("networks string is undefined\n");
@@ -215,11 +216,6 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 	memcpy(tokens, networks, tokensize);
 	str = tmp = tokens;
 
-	/* Add in the loopback network */
-	ni = lnet_ni_alloc(LNET_MKNET(LOLND, 0), NULL, nilist);
-	if (ni == NULL)
-		goto failed;
-
 	while (str != NULL && *str != 0) {
 		char *comma = strchr(str, ',');
 		char *bracket = strchr(str, '(');
@@ -292,7 +288,6 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 			goto failed_syntax;
 		}
 
-		nnets++;
 		ni = lnet_ni_alloc(net, el, nilist);
 		if (ni == NULL)
 			goto failed;
@@ -370,10 +365,11 @@ lnet_parse_networks(struct list_head *nilist, char *networks)
 		}
 	}
 
-	LASSERT(!list_empty(nilist));
+	list_for_each(temp_node, nilist)
+		nnets++;
 
 	LIBCFS_FREE(tokens, tokensize);
-	return 0;
+	return nnets;
 
  failed_syntax:
 	lnet_syntax("networks", networks, (int)(tmp - tokens), strlen(tmp));
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 22/40] staging: lustre: Fixes to make lnetctl function as expected.
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (20 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 21/40] staging: lustre: improve LNet clean up code and API James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:35 ` [PATCH 23/40] staging: lustre: return appropriate errno when adding route James Simmons
                   ` (18 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, James Simmons, James Simmons

While testing the lnetctl utility I ran into several
issues. The first is that asking for help on peer_credits
prints the help for stats instead. With this patch
the help option for peer_credits returns the
proper help message.

The second problem was with grabbing stats data. No data
was returned; an error was reported instead. The reason
is that libcfs_ioctl_getdata() tests whether the size of
the data passed in is smaller than struct
libcfs_ioctl_data. The stats ioctl uses struct
lnet_ioctl_lnet_stats, which is smaller than that, so
the request was rejected. Instead of comparing against
libcfs_ioctl_data we now only ensure that the data is not
smaller than the ioctl hdr data, which is common to all
ioctls.

The bug in libcfs_ioctl_getdata() exposed a number of new
ioctls that don't check that the size reported in the
ioctl hdr data is large enough for the data structure
imported from userland. We address those cases in this
patch as well.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5935
Reviewed-on: http://review.whamcloud.com/12782
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
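Note for readers: the two levels of size checking described in the
commit message above can be sketched in plain userspace C as follows.
The demo_* structures and functions are hypothetical stand-ins, not the
libcfs definitions, and their field layouts are illustrative only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Header shared by every request (hypothetical layout). */
struct demo_ioctl_hdr {
	uint32_t ioc_len;	/* total size the caller claims to pass */
	uint32_t ioc_version;
};

/* Legacy "large" request. */
struct demo_ioctl_data {
	struct demo_ioctl_hdr ioc_hdr;
	char ioc_payload[128];
};

/* Newer, smaller request, e.g. for statistics. */
struct demo_ioctl_stats {
	struct demo_ioctl_hdr st_hdr;
	uint32_t st_counters[4];
};

/* Old generic check: rejects anything smaller than the legacy struct. */
static int demo_old_getdata_check(const struct demo_ioctl_hdr *hdr)
{
	assert(hdr != NULL);
	if (hdr->ioc_len < sizeof(struct demo_ioctl_data))
		return -EINVAL;
	return 0;
}

/* New generic check: only the header is common to all requests. */
static int demo_getdata_check(const struct demo_ioctl_hdr *hdr)
{
	assert(hdr != NULL);
	if (hdr->ioc_len < sizeof(struct demo_ioctl_hdr))
		return -EINVAL;
	return 0;
}

/* Per-command check: the handler verifies the struct it will read. */
static int demo_stats_check(const struct demo_ioctl_stats *stats)
{
	assert(stats != NULL);
	if (stats->st_hdr.ioc_len < sizeof(*stats))
		return -EINVAL;
	return 0;
}
```

Once the generic check only guarantees a header, each handler has to
validate its own structure size, which is what the per-case checks
added to LNetCtl() below do.
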
 drivers/staging/lustre/lnet/lnet/api-ni.c          |   36 ++++++++++++++++++++
 .../lustre/lustre/libcfs/linux/linux-module.c      |    2 +-
 2 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index e40c657..7657f88 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1872,6 +1872,10 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_ADD_ROUTE:
 		config = arg;
+
+		if (config->cfg_hdr.ioc_len < sizeof(*config))
+			return -EINVAL;
+
 		mutex_lock(&the_lnet.ln_api_mutex);
 		rc = lnet_add_route(config->cfg_net,
 				    config->cfg_config_u.cfg_route.rtr_hop,
@@ -1883,6 +1887,10 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_DEL_ROUTE:
 		config = arg;
+
+		if (config->cfg_hdr.ioc_len < sizeof(*config))
+			return -EINVAL;
+
 		mutex_lock(&the_lnet.ln_api_mutex);
 		rc = lnet_del_route(config->cfg_net, config->cfg_nid);
 		mutex_unlock(&the_lnet.ln_api_mutex);
@@ -1890,6 +1898,10 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_GET_ROUTE:
 		config = arg;
+
+		if (config->cfg_hdr.ioc_len < sizeof(*config))
+			return -EINVAL;
+
 		return lnet_get_route(config->cfg_count,
 				      &config->cfg_net,
 				      &config->cfg_config_u.cfg_route.rtr_hop,
@@ -1900,8 +1912,13 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_GET_NET: {
 		struct lnet_ioctl_net_config *net_config;
+		size_t total = sizeof(*config) + sizeof(*net_config);
 
 		config = arg;
+
+		if (config->cfg_hdr.ioc_len < total)
+			return -EINVAL;
+
 		net_config = (struct lnet_ioctl_net_config *)
 				config->cfg_bulk;
 		if (!config || !net_config)
@@ -1925,12 +1942,19 @@ LNetCtl(unsigned int cmd, void *arg)
 	{
 		struct lnet_ioctl_lnet_stats *lnet_stats = arg;
 
+		if (lnet_stats->st_hdr.ioc_len < sizeof(*lnet_stats))
+			return -EINVAL;
+
 		lnet_counters_get(&lnet_stats->st_cntrs);
 		return 0;
 	}
 
 	case IOC_LIBCFS_CONFIG_RTR:
 		config = arg;
+
+		if (config->cfg_hdr.ioc_len < sizeof(*config))
+			return -EINVAL;
+
 		mutex_lock(&the_lnet.ln_api_mutex);
 		if (config->cfg_config_u.cfg_buffers.buf_enable) {
 			rc = lnet_rtrpools_enable();
@@ -1943,6 +1967,10 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_ADD_BUF:
 		config = arg;
+
+		if (config->cfg_hdr.ioc_len < sizeof(*config))
+			return -EINVAL;
+
 		mutex_lock(&the_lnet.ln_api_mutex);
 		rc = lnet_rtrpools_adjust(config->cfg_config_u.cfg_buffers.
 						buf_tiny,
@@ -1955,8 +1983,13 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	case IOC_LIBCFS_GET_BUF: {
 		struct lnet_ioctl_pool_cfg *pool_cfg;
+		size_t total = sizeof(*config) + sizeof(*pool_cfg);
 
 		config = arg;
+
+		if (config->cfg_hdr.ioc_len < total)
+			return -EINVAL;
+
 		pool_cfg = (struct lnet_ioctl_pool_cfg *)config->cfg_bulk;
 		return lnet_get_rtr_pool_cfg(config->cfg_count, pool_cfg);
 	}
@@ -1964,6 +1997,9 @@ LNetCtl(unsigned int cmd, void *arg)
 	case IOC_LIBCFS_GET_PEER_INFO: {
 		struct lnet_ioctl_peer *peer_info = arg;
 
+		if (peer_info->pr_hdr.ioc_len < sizeof(*peer_info))
+			return -EINVAL;
+
 		return lnet_get_peer_info(peer_info->pr_count,
 			&peer_info->pr_nid,
 			peer_info->pr_lnd_u.pr_peer_credits.cr_aliveness,
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
index 50a5464..9414746 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
@@ -73,7 +73,7 @@ int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
 		return -EINVAL;
 	}
 
-	if (hdr.ioc_len < sizeof(struct libcfs_ioctl_data)) {
+	if (hdr.ioc_len < sizeof(struct libcfs_ioctl_hdr)) {
 		CERROR("libcfs ioctl: user buffer too small for ioctl\n");
 		return -EINVAL;
 	}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 23/40] staging: lustre: return appropriate errno when adding route
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (21 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 22/40] staging: lustre: Fixes to make lnetctl function as expected James Simmons
@ 2015-11-20 23:35 ` James Simmons
  2015-11-20 23:36 ` [PATCH 24/40] staging: lustre: make some lnet functions static James Simmons
                   ` (17 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

When adding a route, the following scenarios were silently ignored:
1. route already exists
2. route is on a local net
3. route is unreachable

This patch returns the appropriate error codes from the lower-level
function lnet_add_route() and then ignores the above cases in the
calling function, lnet_parse_route().  This is needed so we don't
halt processing of routes in the module parameters.

However, routes can now be added dynamically, and the user should be
told whether adding the requested route succeeded or failed.

Userspace decides whether to continue adding routes or to halt
processing.  Currently "lnetctl import < config" continues
adding the rest of the configuration and reports at the end which
operations passed and which ones failed.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6045
Reviewed-on: http://review.whamcloud.com/13116
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
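Note for readers: the split described above, where the low-level
function reports every outcome and the module-parameter parser
tolerates the benign ones, can be sketched as follows. This is a
userspace model under stated assumptions: struct demo_route and the
demo_* functions are hypothetical and only mirror the return codes
visible in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one route request. */
struct demo_route {
	bool bad_args;			/* malformed request */
	bool net_is_local;		/* target net is a local net */
	bool gateway_unreachable;	/* gateway not on a local net */
	bool already_exists;		/* duplicate of an existing route */
};

/* Low-level add: always reports what happened. */
static int demo_add_route(const struct demo_route *r)
{
	assert(r != NULL);
	if (r->bad_args)
		return -EINVAL;
	if (r->net_is_local)
		return -EEXIST;
	if (r->gateway_unreachable)
		return -EHOSTUNREACH;
	if (r->already_exists)
		return -EEXIST;
	return 0;
}

/*
 * Parser for a static route list: benign errors are skipped so one
 * redundant entry does not stop the rest from being processed.
 * Returns the number of routes added, or a negative errno.
 */
static int demo_parse_routes(const struct demo_route *routes, size_t n)
{
	int added = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int rc = demo_add_route(&routes[i]);

		if (rc != 0 && rc != -EEXIST && rc != -EHOSTUNREACH)
			return rc;
		if (rc == 0)
			added++;
	}
	return added;
}
```

A dynamic caller would use demo_add_route() directly and pass the
return code back to the user instead of filtering it.
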
 drivers/staging/lustre/lnet/lnet/config.c |    2 +-
 drivers/staging/lustre/lnet/lnet/router.c |   11 +++++++----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index 7b7412b..1028195 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -771,7 +771,7 @@ lnet_parse_route(char *str, int *im_a_router)
 			}
 
 			rc = lnet_add_route(net, hops, nid, priority);
-			if (rc != 0) {
+			if (rc != 0 && rc != -EEXIST && rc != -EHOSTUNREACH) {
 				CERROR("Can't create route to %s via %s\n",
 				       libcfs_net2str(net),
 				       libcfs_nid2str(nid));
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 1f5a4b1..9271be6 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -315,7 +315,7 @@ lnet_add_route(__u32 net, unsigned int hops, lnet_nid_t gateway,
 		return -EINVAL;
 
 	if (lnet_islocalnet(net))	       /* it's a local network */
-		return 0;		       /* ignore the route entry */
+		return -EEXIST;
 
 	/* Assume net, route, all new */
 	LIBCFS_ALLOC(route, sizeof(*route));
@@ -346,7 +346,7 @@ lnet_add_route(__u32 net, unsigned int hops, lnet_nid_t gateway,
 		LIBCFS_FREE(rnet, sizeof(*rnet));
 
 		if (rc == -EHOSTUNREACH) /* gateway is not on a local net */
-			return 0;	/* ignore the route entry */
+			return rc;	/* ignore the route entry */
 		CERROR("Error %d creating route %s %d %s\n", rc,
 		       libcfs_net2str(net), hops,
 		       libcfs_nid2str(gateway));
@@ -394,14 +394,17 @@ lnet_add_route(__u32 net, unsigned int hops, lnet_nid_t gateway,
 	/* -1 for notify or !add_route */
 	lnet_peer_decref_locked(route->lr_gateway);
 	lnet_net_unlock(LNET_LOCK_EX);
+	rc = 0;
 
-	if (!add_route)
+	if (!add_route) {
+		rc = -EEXIST;
 		LIBCFS_FREE(route, sizeof(*route));
+	}
 
 	if (rnet != rnet2)
 		LIBCFS_FREE(rnet, sizeof(*rnet));
 
-	return 0;
+	return rc;
 }
 
 int
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 24/40] staging: lustre: make some lnet functions static
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (22 preceding siblings ...)
  2015-11-20 23:35 ` [PATCH 23/40] staging: lustre: return appropriate errno when adding route James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 25/40] staging: lustre: missed a few cases of using NULL instead of 0 James Simmons
                   ` (16 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, frank zago

From: frank zago <fzago@cray.com>

Some functions and variables are only used in their C file, so reduce
their scope. This reduces the code size, and fixes sparse warnings
such as:

warning: symbol 'proc_lnet_routes' was not declared.
        Should it be static?
warning: symbol 'proc_lnet_routers' was not declared.
        Should it be static?

Some prototypes were removed from C files and added to the proper
header.

Signed-off-by: Frank Zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5396
Reviewed-on: http://review.whamcloud.com/12206
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
---
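Note for readers: a minimal sketch of the scoping rule this patch
applies, using hypothetical demo_* names. Symbols shared between files
get one declaration in a header; everything else becomes static. The
"header" part is inlined here so that the example is a single file.

```c
#include <assert.h>

/*
 * In a real tree these two declarations would sit in a shared header,
 * so every user sees the same prototype.
 */
extern int demo_rotor;
int demo_next_slot(int nslots);

/* Exactly one C file defines the shared variable. */
int demo_rotor = 2;

/* Used only in this file: internal linkage, invisible to other objects. */
static int demo_call_count;

static int demo_clamp(int value, int max)
{
	assert(max > 0);
	return value % max;
}

/* Exported function; relies on the file-local helpers above. */
int demo_next_slot(int nslots)
{
	demo_call_count++;
	return demo_clamp(demo_rotor + demo_call_count, nslots);
}
```

With the declarations in one header, sparse no longer reports the
exported symbols as undeclared, and the static helpers cannot clash
with names in other files.
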
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    2 ++
 drivers/staging/lustre/lnet/lnet/router_proc.c     |    2 --
 drivers/staging/lustre/lnet/selftest/console.c     |    4 +---
 drivers/staging/lustre/lnet/selftest/framework.c   |   10 ----------
 drivers/staging/lustre/lnet/selftest/module.c      |    4 +---
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 +-
 6 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 4c2d824..00ef4d0 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -446,6 +446,8 @@ lnet_ni_t *lnet_nid2ni_locked(lnet_nid_t nid, int cpt);
 lnet_ni_t *lnet_net2ni_locked(__u32 net, int cpt);
 lnet_ni_t *lnet_net2ni(__u32 net);
 
+extern int portal_rotor;
+
 int lnet_init(void);
 void lnet_fini(void);
 
diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
index af7423f..73183f1 100644
--- a/drivers/staging/lustre/lnet/lnet/router_proc.c
+++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
@@ -795,8 +795,6 @@ static struct lnet_portal_rotors	portal_rotors[] = {
 	},
 };
 
-extern int portal_rotor;
-
 static int __proc_lnet_portal_rotor(void *data, int write,
 				    loff_t pos, void __user *buffer, int nob)
 {
diff --git a/drivers/staging/lustre/lnet/selftest/console.c b/drivers/staging/lustre/lnet/selftest/console.c
index f8d6dfe..551d664 100644
--- a/drivers/staging/lustre/lnet/selftest/console.c
+++ b/drivers/staging/lustre/lnet/selftest/console.c
@@ -1694,8 +1694,6 @@ lstcon_new_session_id(lst_sid_t *sid)
 	sid->ses_stamp = cfs_time_current();
 }
 
-extern srpc_service_t lstcon_acceptor_service;
-
 int
 lstcon_session_new(char *name, int key, unsigned feats,
 		   int timeout, int force, lst_sid_t __user *sid_up)
@@ -1974,7 +1972,7 @@ out:
 	return rc;
 }
 
-srpc_service_t lstcon_acceptor_service;
+static srpc_service_t lstcon_acceptor_service;
 static void lstcon_init_acceptor_service(void)
 {
 	/* initialize selftest console acceptor service table */
diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c
index 1a2da74..b04c147 100644
--- a/drivers/staging/lustre/lnet/selftest/framework.c
+++ b/drivers/staging/lustre/lnet/selftest/framework.c
@@ -1622,16 +1622,6 @@ static srpc_service_t sfw_services[] = {
 	}
 };
 
-extern sfw_test_client_ops_t ping_test_client;
-extern srpc_service_t	ping_test_service;
-extern void ping_init_test_client(void);
-extern void ping_init_test_service(void);
-
-extern sfw_test_client_ops_t brw_test_client;
-extern srpc_service_t	brw_test_service;
-extern void brw_init_test_client(void);
-extern void brw_init_test_service(void);
-
 int
 sfw_startup(void)
 {
diff --git a/drivers/staging/lustre/lnet/selftest/module.c b/drivers/staging/lustre/lnet/selftest/module.c
index 46cbdf0..91564e5 100644
--- a/drivers/staging/lustre/lnet/selftest/module.c
+++ b/drivers/staging/lustre/lnet/selftest/module.c
@@ -37,6 +37,7 @@
 #define DEBUG_SUBSYSTEM S_LNET
 
 #include "selftest.h"
+#include "console.h"
 
 enum {
 	LST_INIT_NONE = 0,
@@ -47,9 +48,6 @@ enum {
 	LST_INIT_CONSOLE
 };
 
-extern int lstcon_console_init(void);
-extern int lstcon_console_fini(void);
-
 static int lst_init_step = LST_INIT_NONE;
 
 struct cfs_wi_sched *lst_sched_serial;
diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index 2212199..9b823be 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -1082,7 +1082,7 @@ srpc_client_rpc_expired(void *data)
 	spin_unlock(&srpc_data.rpc_glock);
 }
 
-inline void
+static void
 srpc_add_client_rpc_timer(srpc_client_rpc_t *rpc)
 {
 	stt_timer_t *timer = &rpc->crpc_timer;
-- 
1.7.1


* [PATCH 25/40] staging: lustre: missed a few cases of using NULL instead of 0
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (23 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 24/40] staging: lustre: make some lnet functions static James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 26/40] staging: lustre: startup lnet acceptor thread dynamically James Simmons
                   ` (15 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, frank zago

From: frank zago <fzago@cray.com>

It is preferable to use NULL instead of 0 for pointers. This fixes sparse
warnings such as:

lustre/fld/fld_request.c:126:17: warning: Using plain integer as NULL pointer

The second parameter of class_match_param() was changed to const so
that a cast could be removed in one caller, which avoids splitting a
long line. No other code change.

Signed-off-by: frank zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5396
Reviewed-on: http://review.whamcloud.com/12567
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    2 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    2 +-
 .../staging/lustre/lustre/obdclass/obd_config.c    |    4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 7657f88..89b390a 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -902,7 +902,7 @@ lnet_ping_info_setup(lnet_ping_info_t **ppinfo, lnet_handle_md_t *md_handle,
 {
 	lnet_process_id_t id = {LNET_NID_ANY, LNET_PID_ANY};
 	lnet_handle_me_t me_handle;
-	lnet_md_t md = {0};
+	lnet_md_t md = { NULL };
 	int rc, rc2;
 
 	if (set_eq) {
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 4a8c759..2109297 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1941,7 +1941,7 @@ int ll_prep_inode(struct inode **inode, struct ptlrpc_request *req,
 		  struct super_block *sb, struct lookup_intent *it)
 {
 	struct ll_sb_info *sbi = NULL;
-	struct lustre_md md;
+	struct lustre_md md = { NULL };
 	int rc;
 
 	LASSERT(*inode || sb);
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c b/drivers/staging/lustre/lustre/obdclass/obd_config.c
index c231e0d..db8e12d 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
@@ -72,7 +72,7 @@ EXPORT_SYMBOL(class_find_param);
 
 /* returns 0 if this is the first key in the buffer, else 1.
    valp points to first char after key. */
-static int class_match_param(char *buf, char *key, char **valp)
+static int class_match_param(char *buf, const char *key, char **valp)
 {
 	if (!buf)
 		return 1;
@@ -1008,7 +1008,7 @@ int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
 		/* Search proc entries */
 		while (lvars[j].name) {
 			var = &lvars[j];
-			if (class_match_param(key, (char *)var->name, NULL) == 0
+			if (class_match_param(key, var->name, NULL) == 0
 			    && keylen == strlen(var->name)) {
 				matched++;
 				rc = -EROFS;
-- 
1.7.1
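
The "{ NULL }" initializer used in this patch can be illustrated outside
the kernel. The following is only a minimal sketch: struct demo_md and
demo_md_is_zeroed() are hypothetical names made up for illustration, not
LNet types. It assumes a struct whose first member is a pointer, which is
the case where sparse prefers NULL over a plain 0.

```c
#include <stddef.h>

/* Hypothetical stand-in for a descriptor whose first member is a pointer. */
struct demo_md {
	void *start;
	unsigned int length;
	int threshold;
};

/*
 * "{ NULL }" initializes the first (pointer) member explicitly; the
 * remaining members are zero-initialized by the language rules.
 * Returns 1 when every member ended up zeroed.
 */
int demo_md_is_zeroed(void)
{
	struct demo_md md = { NULL };

	return md.start == NULL && md.length == 0 && md.threshold == 0;
}
```

Writing "{ 0 }" would produce the same object; the difference is only
that the pointer member is then initialized from a plain integer, which
is what the sparse warning quoted above complains about.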


* [PATCH 26/40] staging: lustre: startup lnet acceptor thread dynamically
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (24 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 25/40] staging: lustre: missed a few cases of using NULL instead of 0 James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 27/40] staging: lustre: reject invalid net configuration for lnet James Simmons
                   ` (14 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

With DLC it's possible to start up a system with no NIs that require
the acceptor thread, in which case the thread is not started.  If the
user later adds an NI that requires the acceptor thread, the thread
must be started at that point.

If the user removes an NI and as a result no remaining NIs require
the acceptor thread, then the thread should be stopped.  This patch
adds logic to the code that dynamically adds and removes NIs to
handle both cases.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6002
Reviewed-on: http://review.whamcloud.com/13010
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/acceptor.c |   10 ++++++++--
 drivers/staging/lustre/lnet/lnet/api-ni.c   |   14 ++++++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index d05754d..112b166 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -46,7 +46,9 @@ static struct {
 	int			pta_shutdown;
 	struct socket		*pta_sock;
 	struct completion	pta_signal;
-} lnet_acceptor_state;
+} lnet_acceptor_state = {
+	.pta_shutdown = 1
+};
 
 int
 lnet_acceptor_port(void)
@@ -441,6 +443,10 @@ lnet_acceptor_start(void)
 	long rc2;
 	long secure;
 
+	/* if acceptor is already running return immediately */
+	if (!lnet_acceptor_state.pta_shutdown)
+		return 0;
+
 	LASSERT(lnet_acceptor_state.pta_sock == NULL);
 
 	rc = lnet_acceptor_get_tunables();
@@ -481,7 +487,7 @@ lnet_acceptor_start(void)
 void
 lnet_acceptor_stop(void)
 {
-	if (lnet_acceptor_state.pta_sock == NULL) /* not running */
+	if (lnet_acceptor_state.pta_shutdown) /* not running */
 		return;
 
 	lnet_acceptor_state.pta_shutdown = 1;
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 89b390a..7d6e59f 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1776,6 +1776,16 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
 	if (rc != 0)
 		goto failed1;
 
+	if (ni->ni_lnd->lnd_accept) {
+		rc = lnet_acceptor_start();
+		if (rc < 0) {
+			/* shutdown the ni that we just started */
+			CERROR("Failed to start up acceptor thread\n");
+			lnet_shutdown_lndni(ni);
+			goto failed1;
+		}
+	}
+
 	lnet_ping_target_update(pinfo, md_handle);
 	mutex_unlock(&the_lnet.ln_api_mutex);
 
@@ -1823,6 +1833,10 @@ lnet_dyn_del_ni(__u32 net)
 	lnet_ni_decref_locked(ni, 0);
 
 	lnet_shutdown_lndni(ni);
+
+	if (lnet_count_acceptor_nis() == 0)
+		lnet_acceptor_stop();
+
 	lnet_ping_target_update(pinfo, md_handle);
 	goto out;
 failed:
-- 
1.7.1


* [PATCH 27/40] staging: lustre: reject invalid net configuration for lnet
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (25 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 26/40] staging: lustre: startup lnet acceptor thread dynamically James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 28/40] staging: lustre: return -EEXIST if NI is not unique James Simmons
                   ` (13 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

Currently, if a route exists that goes over a remote net and that
net is then added dynamically as a local net, traffic stops.  The
code in lnet_send() determines that the destination nid can be
reached from another local_ni, but the src_nid is still stuck on
the earlier NI, because the src_nid is stored in the ptlrpc layer
and is not updated when a local NI is configured.

Reject such a configuration: lnet_dyn_add_ni() now returns -EUSERS
when the net being added is already known as a remote net.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5874
Reviewed-on: http://review.whamcloud.com/12912
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 7d6e59f..9b00bc1 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1747,6 +1747,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
 	lnet_handle_md_t md_handle;
 	struct lnet_ni *ni;
 	struct list_head net_head;
+	lnet_remotenet_t *rnet;
 	int rc;
 
 	INIT_LIST_HEAD(&net_head);
@@ -1763,12 +1764,27 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
 		goto failed0;
 	}
 
+	ni = list_entry(net_head.next, struct lnet_ni, ni_list);
+
+	lnet_net_lock(LNET_LOCK_EX);
+	rnet = lnet_find_net_locked(LNET_NIDNET(ni->ni_nid));
+	lnet_net_unlock(LNET_LOCK_EX);
+	/*
+	 * make sure that the net added doesn't invalidate the current
+	 * configuration LNet is keeping
+	 */
+	if (rnet) {
+		CERROR("Adding net %s will invalidate routing configuration\n",
+		       nets);
+		rc = -EUSERS;
+		goto failed0;
+	}
+
 	rc = lnet_ping_info_setup(&pinfo, &md_handle, 1 + lnet_get_ni_count(),
 				  false);
 	if (rc != 0)
 		goto failed0;
 
-	ni = list_entry(net_head.next, struct lnet_ni, ni_list);
 	list_del_init(&ni->ni_list);
 
 	rc = lnet_startup_lndni(ni, peer_timeout, peer_cr,
-- 
1.7.1


* [PATCH 28/40] staging: lustre: return -EEXIST if NI is not unique
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (26 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 27/40] staging: lustre: reject invalid net configuration for lnet James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 29/40] staging: lustre: handle lnet_check_routes() errors James Simmons
                   ` (12 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

Return -EEXIST and not -EINVAL when trying to add a
network interface which is not unique.

Some minor cleanup in api-ni.c

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5875
Reviewed-on: http://review.whamcloud.com/13056
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |   20 +++++++++-----------
 1 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 9b00bc1..ed167c8 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1213,7 +1213,7 @@ static int
 lnet_startup_lndni(struct lnet_ni *ni, __s32 peer_timeout,
 		   __s32 peer_cr, __s32 peer_buf_cr, __s32 credits)
 {
-	int rc = 0;
+	int rc = -EINVAL;
 	__u32 lnd_type;
 	lnd_t *lnd;
 	struct lnet_tx_queue *tq;
@@ -1231,19 +1231,19 @@ lnet_startup_lndni(struct lnet_ni *ni, __s32 peer_timeout,
 
 	/* Make sure this new NI is unique. */
 	lnet_net_lock(LNET_LOCK_EX);
-	if (!lnet_net_unique(LNET_NIDNET(ni->ni_nid), &the_lnet.ln_nis)) {
+	rc = lnet_net_unique(LNET_NIDNET(ni->ni_nid), &the_lnet.ln_nis);
+	lnet_net_unlock(LNET_LOCK_EX);
+	if (!rc) {
 		if (lnd_type == LOLND) {
-			lnet_net_unlock(LNET_LOCK_EX);
 			lnet_ni_free(ni);
 			return 0;
 		}
-		lnet_net_unlock(LNET_LOCK_EX);
 
 		CERROR("Net %s is not unique\n",
 		       libcfs_net2str(LNET_NIDNET(ni->ni_nid)));
+		rc = -EEXIST;
 		goto failed0;
 	}
-	lnet_net_unlock(LNET_LOCK_EX);
 
 	mutex_lock(&the_lnet.ln_lnd_mutex);
 	lnd = lnet_find_lnd_by_type(lnd_type);
@@ -1259,6 +1259,7 @@ lnet_startup_lndni(struct lnet_ni *ni, __s32 peer_timeout,
 			CERROR("Can't load LND %s, module %s, rc=%d\n",
 			       libcfs_lnd2str(lnd_type),
 			       libcfs_lnd2modname(lnd_type), rc);
+			rc = -EINVAL;
 			goto failed0;
 		}
 	}
@@ -1348,7 +1349,7 @@ lnet_startup_lndni(struct lnet_ni *ni, __s32 peer_timeout,
 	return 0;
 failed0:
 	lnet_ni_free(ni);
-	return -EINVAL;
+	return rc;
 }
 
 static int
@@ -1495,7 +1496,7 @@ int
 LNetNIInit(lnet_pid_t requested_pid)
 {
 	int im_a_router = 0;
-	int rc, rc2;
+	int rc;
 	int ni_count;
 	lnet_ping_info_t *pinfo;
 	lnet_handle_md_t md_handle;
@@ -1584,10 +1585,7 @@ LNetNIInit(lnet_pid_t requested_pid)
 	return 0;
 
  failed4:
-	lnet_ping_md_unlink(pinfo, &md_handle);
-	lnet_ping_info_free(pinfo);
-	rc2 = LNetEQFree(the_lnet.ln_ping_target_eq);
-	LASSERT(rc2 == 0);
+	lnet_ping_target_fini();
  failed3:
 	the_lnet.ln_refcount = 0;
 	lnet_acceptor_stop();
-- 
1.7.1


* [PATCH 29/40] staging: lustre: handle lnet_check_routes() errors
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (27 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 28/40] staging: lustre: return -EEXIST if NI is not unique James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 30/40] staging: lustre: improvement to router checker James Simmons
                   ` (11 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

After adding a route, lnet_check_routes() is called to ensure that
the route added doesn't invalidate the routing configuration.  If
lnet_check_routes() fails, then the route just added, which caused
the current configuration to be invalidated, is deleted and an error
is returned to the user.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6218
Reviewed-on: http://review.whamcloud.com/13445
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index ed167c8..b119c6c 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1910,8 +1910,14 @@ LNetCtl(unsigned int cmd, void *arg)
 				    config->cfg_nid,
 				    config->cfg_config_u.cfg_route.
 					rtr_priority);
+		if (rc == 0) {
+			rc = lnet_check_routes();
+			if (rc != 0)
+				lnet_del_route(config->cfg_net,
+					       config->cfg_nid);
+		}
 		mutex_unlock(&the_lnet.ln_api_mutex);
-		return (rc != 0) ? rc : lnet_check_routes();
+		return rc;
 
 	case IOC_LIBCFS_DEL_ROUTE:
 		config = arg;
-- 
1.7.1


* [PATCH 30/40] staging: lustre: improvement to router checker
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (28 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 29/40] staging: lustre: handle lnet_check_routes() errors James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 31/40] staging: lustre: assume a kernel build James Simmons
                   ` (10 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This patch always starts the router checker thread.

The router checker only checks routes by ping if
live_router_check_interval or dead_router_check_interval is set
to a value greater than 0, and there are routes configured.

If these conditions are not met the router checker sleeps until woken
up when a route is added.  It is also woken up whenever the RC is
being stopped to ensure the thread doesn't hang.

In the future, when DLC starts configuring the live and dead
router_check_interval parameters, the user will be able to turn
the router checker on and off by manipulating them.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6003
Reviewed-on: http://review.whamcloud.com/13035
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-types.h  |    7 +++
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    1 +
 drivers/staging/lustre/lnet/lnet/router.c          |   51 +++++++++++++++++---
 3 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 3282782..574de55 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -619,6 +619,13 @@ typedef struct {
 	 */
 	bool				  ln_nis_from_mod_params;
 
+	/*
+	 * waitq for router checker.  As long as there are no routes in
+	 * the list, the router checker will sleep on this queue.  when
+	 * routes are added the thread will wake up
+	 */
+	wait_queue_head_t		  ln_rc_waitq;
+
 } lnet_t;
 
 #endif
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index b119c6c..09656a1 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -99,6 +99,7 @@ lnet_init_locks(void)
 {
 	spin_lock_init(&the_lnet.ln_eq_wait_lock);
 	init_waitqueue_head(&the_lnet.ln_eq_waitq);
+	init_waitqueue_head(&the_lnet.ln_rc_waitq);
 	mutex_init(&the_lnet.ln_lnd_mutex);
 	mutex_init(&the_lnet.ln_api_mutex);
 }
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 9271be6..b4ac670 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -404,6 +404,9 @@ lnet_add_route(__u32 net, unsigned int hops, lnet_nid_t gateway,
 	if (rnet != rnet2)
 		LIBCFS_FREE(rnet, sizeof(*rnet));
 
+	/* indicate to startup the router checker if configured */
+	wake_up(&the_lnet.ln_rc_waitq);
+
 	return rc;
 }
 
@@ -1053,11 +1056,6 @@ lnet_router_checker_start(void)
 		return -EINVAL;
 	}
 
-	if (!the_lnet.ln_routing &&
-	    live_router_check_interval <= 0 &&
-	    dead_router_check_interval <= 0)
-		return 0;
-
 	sema_init(&the_lnet.ln_rc_signal, 0);
 	/* EQ size doesn't matter; the callback is guaranteed to get every
 	 * event */
@@ -1102,6 +1100,8 @@ lnet_router_checker_stop(void)
 
 	LASSERT(the_lnet.ln_rc_state == LNET_RC_STATE_RUNNING);
 	the_lnet.ln_rc_state = LNET_RC_STATE_STOPPING;
+	/* wakeup the RC thread if it's sleeping */
+	wake_up(&the_lnet.ln_rc_waitq);
 
 	/* block until event callback signals exit */
 	down(&the_lnet.ln_rc_signal);
@@ -1192,6 +1192,33 @@ lnet_prune_rc_data(int wait_unlink)
 	lnet_net_unlock(LNET_LOCK_EX);
 }
 
+/*
+ * This function is called to check if the RC should block indefinitely.
+ * It's called from lnet_router_checker() as well as being passed to
+ * wait_event_interruptible() to avoid the lost wake_up problem.
+ *
+ * When it's called from wait_event_interruptible() it is necessary to
+ * also not sleep if the rc state is not running to avoid a deadlock
+ * when the system is shutting down
+ */
+static inline bool
+lnet_router_checker_active(void)
+{
+	if (the_lnet.ln_rc_state != LNET_RC_STATE_RUNNING)
+		return true;
+
+	/*
+	 * Router Checker thread needs to run when routing is enabled in
+	 * order to call lnet_update_ni_status_locked()
+	 */
+	if (the_lnet.ln_routing)
+		return true;
+
+	return !list_empty(&the_lnet.ln_routers) &&
+		(live_router_check_interval > 0 ||
+		 dead_router_check_interval > 0);
+}
+
 static int
 lnet_router_checker(void *arg)
 {
@@ -1243,8 +1270,18 @@ rescan:
 		/* Call schedule_timeout() here always adds 1 to load average
 		 * because kernel counts # active tasks as nr_running
 		 * + nr_uninterruptible. */
-		set_current_state(TASK_INTERRUPTIBLE);
-		schedule_timeout(cfs_time_seconds(1));
+		/*
+		 * if there are any routes then wakeup every second.  If
+		 * there are no routes then sleep indefinitely until woken
+		 * up by a user adding a route
+		 */
+		if (!lnet_router_checker_active())
+			wait_event_interruptible(the_lnet.ln_rc_waitq,
+						 lnet_router_checker_active());
+		else
+			wait_event_interruptible_timeout(the_lnet.ln_rc_waitq,
+							 false,
+							 cfs_time_seconds(1));
 	}
 
 	LASSERT(the_lnet.ln_rc_state == LNET_RC_STATE_STOPPING);
-- 
1.7.1


* [PATCH 31/40] staging: lustre: assume a kernel build
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (29 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 30/40] staging: lustre: improvement to router checker James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 32/40] staging: lustre: prevent assert on LNet module unload James Simmons
                   ` (9 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, John L. Hammond

From: John L. Hammond <john.hammond@intel.com>

In lnet/lnet/ and lnet/selftest/ assume a kernel build (assume that
__KERNEL__ is defined). Remove some common code only needed for user
space LNet.

Only part of the work of this patch got merged. These are the final
bits.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675
Reviewed-on: http://review.whamcloud.com/13121
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-types.h  |    4 --
 drivers/staging/lustre/lnet/lnet/acceptor.c        |    2 -
 drivers/staging/lustre/lnet/lnet/api-ni.c          |   38 ++------------------
 drivers/staging/lustre/lnet/lnet/lib-eq.c          |    3 --
 drivers/staging/lustre/lnet/lnet/lib-md.c          |    3 --
 drivers/staging/lustre/lnet/lnet/lib-me.c          |    3 --
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    5 ---
 drivers/staging/lustre/lnet/lnet/lib-msg.c         |   20 +----------
 drivers/staging/lustre/lnet/lnet/router.c          |    9 ++---
 9 files changed, 7 insertions(+), 80 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 574de55..dbfb069 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -561,8 +561,6 @@ typedef struct {
 	/* dying LND instances */
 	struct list_head		  ln_nis_zombie;
 	lnet_ni_t			 *ln_loni;	/* the loopback NI */
-	/* NI to wait for events in */
-	lnet_ni_t			 *ln_eq_waitni;
 
 	/* remote networks with routes to them */
 	struct list_head		 *ln_remote_nets_hash;
@@ -592,8 +590,6 @@ typedef struct {
 
 	struct mutex			  ln_api_mutex;
 	struct mutex			  ln_lnd_mutex;
-	int				  ln_init;	/* lnet_init()
-							   called? */
 	/* Have I called LNetNIInit myself? */
 	int				  ln_niinit_self;
 	/* LNetNIInit/LNetNIFini counter */
diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index 112b166..ffa2b19 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -204,8 +204,6 @@ lnet_connect(struct socket **sockp, lnet_nid_t peer_nid,
 }
 EXPORT_SYMBOL(lnet_connect);
 
-/* Below is the code common for both kernel and MT user-space */
-
 static int
 lnet_accept(struct socket *sock, __u32 magic)
 {
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 09656a1..e78b079 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -290,7 +290,6 @@ lnet_register_lnd(lnd_t *lnd)
 {
 	mutex_lock(&the_lnet.ln_lnd_mutex);
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(libcfs_isknown_lnd(lnd->lnd_type));
 	LASSERT(lnet_find_lnd_by_type(lnd->lnd_type) == NULL);
 
@@ -308,7 +307,6 @@ lnet_unregister_lnd(lnd_t *lnd)
 {
 	mutex_lock(&the_lnet.ln_lnd_mutex);
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(lnet_find_lnd_by_type(lnd->lnd_type) == lnd);
 	LASSERT(lnd->lnd_refcount == 0);
 
@@ -1164,12 +1162,6 @@ lnet_shutdown_lndnis(void)
 		lnet_ni_unlink_locked(ni);
 	}
 
-	/* Drop the cached eqwait NI. */
-	if (the_lnet.ln_eq_waitni != NULL) {
-		lnet_ni_decref_locked(the_lnet.ln_eq_waitni, 0);
-		the_lnet.ln_eq_waitni = NULL;
-	}
-
 	/* Drop the cached loopback NI. */
 	if (the_lnet.ln_loni != NULL) {
 		lnet_ni_decref_locked(the_lnet.ln_loni, 0);
@@ -1358,7 +1350,6 @@ lnet_startup_lndnis(struct list_head *nilist)
 {
 	struct lnet_ni *ni;
 	int rc;
-	int lnd_type;
 	int ni_count = 0;
 
 	while (!list_empty(nilist)) {
@@ -1372,14 +1363,6 @@ lnet_startup_lndnis(struct list_head *nilist)
 		ni_count++;
 	}
 
-	if (the_lnet.ln_eq_waitni && ni_count > 1) {
-		lnd_type = the_lnet.ln_eq_waitni->ni_lnd->lnd_type;
-		LCONSOLE_ERROR_MSG(0x109, "LND %s can only run single-network\n",
-				   libcfs_lnd2str(lnd_type));
-		rc = -EINVAL;
-		goto failed;
-	}
-
 	return ni_count;
 failed:
 	lnet_shutdown_lndnis();
@@ -1390,10 +1373,9 @@ failed:
 /**
  * Initialize LNet library.
  *
- * Only userspace program needs to call this function - it's automatically
- * called in the kernel at module loading time. Caller has to call lnet_fini()
- * after a call to lnet_init(), if and only if the latter returned 0. It must
- * be called exactly once.
+ * Automatically called at module loading time. Caller has to call
+ * lnet_exit() after a call to lnet_init(), if and only if the
+ * latter returned 0. It must be called exactly once.
  *
  * \return 0 on success, and -ve on failures.
  */
@@ -1403,7 +1385,6 @@ lnet_init(void)
 	int rc;
 
 	lnet_assert_wire_constants();
-	LASSERT(!the_lnet.ln_init);
 
 	memset(&the_lnet, 0, sizeof(the_lnet));
 
@@ -1429,7 +1410,6 @@ lnet_init(void)
 	}
 
 	the_lnet.ln_refcount = 0;
-	the_lnet.ln_init = 1;
 	LNetInvalidateHandle(&the_lnet.ln_rc_eqh);
 	INIT_LIST_HEAD(&the_lnet.ln_lnds);
 	INIT_LIST_HEAD(&the_lnet.ln_rcd_zombie);
@@ -1456,31 +1436,24 @@ EXPORT_SYMBOL(lnet_init);
 /**
  * Finalize LNet library.
  *
- * Only userspace program needs to call this function. It can be called
- * at most once.
- *
  * \pre lnet_init() called with success.
  * \pre All LNet users called LNetNIFini() for matching LNetNIInit() calls.
  */
 void
 lnet_fini(void)
 {
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount == 0);
 
 	while (!list_empty(&the_lnet.ln_lnds))
 		lnet_unregister_lnd(list_entry(the_lnet.ln_lnds.next,
 						   lnd_t, lnd_list));
 	lnet_destroy_locks();
-
-	the_lnet.ln_init = 0;
 }
 EXPORT_SYMBOL(lnet_fini);
 
 /**
  * Set LNet PID and start LNet interfaces, routing, and forwarding.
  *
- * Userspace program should call this after a successful call to lnet_init().
  * Users must call this function at least once before any other functions.
  * For each successful call there must be a corresponding call to
  * LNetNIFini(). For subsequent calls to LNetNIInit(), \a requested_pid is
@@ -1507,7 +1480,6 @@ LNetNIInit(lnet_pid_t requested_pid)
 
 	mutex_lock(&the_lnet.ln_api_mutex);
 
-	LASSERT(the_lnet.ln_init);
 	CDEBUG(D_OTHER, "refs %d\n", the_lnet.ln_refcount);
 
 	if (the_lnet.ln_refcount > 0) {
@@ -1624,7 +1596,6 @@ LNetNIFini(void)
 {
 	mutex_lock(&the_lnet.ln_api_mutex);
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if (the_lnet.ln_refcount != 1) {
@@ -1888,7 +1859,6 @@ LNetCtl(unsigned int cmd, void *arg)
 
 	CLASSERT(sizeof(struct lnet_ioctl_net_config) +
 		 sizeof(struct lnet_ioctl_config_data) < LIBCFS_IOC_DATA_MAX);
-	LASSERT(the_lnet.ln_init);
 
 	switch (cmd) {
 	case IOC_LIBCFS_GET_NI:
@@ -2140,8 +2110,6 @@ LNetGetId(unsigned int index, lnet_process_id_t *id)
 	int cpt;
 	int rc = -ENOENT;
 
-	LASSERT(the_lnet.ln_init);
-
 	/* LNetNI initilization failed? */
 	if (the_lnet.ln_refcount == 0)
 		return rc;
diff --git a/drivers/staging/lustre/lnet/lnet/lib-eq.c b/drivers/staging/lustre/lnet/lnet/lib-eq.c
index 60889eb..5deefaf 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-eq.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-eq.c
@@ -72,7 +72,6 @@ LNetEQAlloc(unsigned int count, lnet_eq_handler_t callback,
 {
 	lnet_eq_t *eq;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	/* We need count to be a power of 2 so that when eq_{enq,deq}_seq
@@ -159,7 +158,6 @@ LNetEQFree(lnet_handle_eq_t eqh)
 	int size = 0;
 	int i;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	lnet_res_lock(LNET_LOCK_EX);
@@ -390,7 +388,6 @@ LNetEQPoll(lnet_handle_eq_t *eventqs, int neq, int timeout_ms,
 	int rc;
 	int i;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if (neq < 1)
diff --git a/drivers/staging/lustre/lnet/lnet/lib-md.c b/drivers/staging/lustre/lnet/lnet/lib-md.c
index 758f5be..dc35f2d 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-md.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-md.c
@@ -273,7 +273,6 @@ LNetMDAttach(lnet_handle_me_t meh, lnet_md_t umd,
 	int cpt;
 	int rc;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if (lnet_md_validate(&umd) != 0)
@@ -350,7 +349,6 @@ LNetMDBind(lnet_md_t umd, lnet_unlink_t unlink, lnet_handle_md_t *handle)
 	int cpt;
 	int rc;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if (lnet_md_validate(&umd) != 0)
@@ -425,7 +423,6 @@ LNetMDUnlink(lnet_handle_md_t mdh)
 	lnet_libmd_t *md;
 	int cpt;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	cpt = lnet_cpt_of_cookie(mdh.cookie);
diff --git a/drivers/staging/lustre/lnet/lnet/lib-me.c b/drivers/staging/lustre/lnet/lnet/lib-me.c
index 42fc99e..69a9314 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-me.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-me.c
@@ -83,7 +83,6 @@ LNetMEAttach(unsigned int portal,
 	struct lnet_me *me;
 	struct list_head *head;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if ((int)portal >= the_lnet.ln_nportals)
@@ -156,7 +155,6 @@ LNetMEInsert(lnet_handle_me_t current_meh,
 	struct lnet_portal *ptl;
 	int cpt;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if (pos == LNET_INS_LOCAL)
@@ -233,7 +231,6 @@ LNetMEUnlink(lnet_handle_me_t meh)
 	lnet_event_t ev;
 	int cpt;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	cpt = lnet_cpt_of_cookie(meh.cookie);
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index b9388ed..c2cc8c8 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -59,8 +59,6 @@ lnet_fail_nid(lnet_nid_t nid, unsigned int threshold)
 	struct list_head *next;
 	struct list_head cull;
 
-	LASSERT(the_lnet.ln_init);
-
 	/* NB: use lnet_net_lock(0) to serialize operations on test peers */
 	if (threshold != 0) {
 		/* Adding a new entry */
@@ -2137,7 +2135,6 @@ LNetPut(lnet_nid_t self, lnet_handle_md_t mdh, lnet_ack_req_t ack,
 	int cpt;
 	int rc;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if (!list_empty(&the_lnet.ln_test_peers) && /* normally we don't */
@@ -2337,7 +2334,6 @@ LNetGet(lnet_nid_t self, lnet_handle_md_t mdh,
 	int cpt;
 	int rc;
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	if (!list_empty(&the_lnet.ln_test_peers) && /* normally we don't */
@@ -2436,7 +2432,6 @@ LNetDist(lnet_nid_t dstnid, lnet_nid_t *srcnidp, __u32 *orderp)
 	 * keep order 0 free for 0@lo and order 1 free for a local NID
 	 * match */
 
-	LASSERT(the_lnet.ln_init);
 	LASSERT(the_lnet.ln_refcount > 0);
 
 	cpt = lnet_net_lock_current();
diff --git a/drivers/staging/lustre/lnet/lnet/lib-msg.c b/drivers/staging/lustre/lnet/lnet/lib-msg.c
index 43977e8..5751dc4 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-msg.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-msg.c
@@ -559,35 +559,17 @@ lnet_msg_container_cleanup(struct lnet_msg_container *container)
 			    sizeof(*container->msc_finalizers));
 		container->msc_finalizers = NULL;
 	}
-#ifdef LNET_USE_LIB_FREELIST
-	lnet_freelist_fini(&container->msc_freelist);
-#endif
 	container->msc_init = 0;
 }
 
 int
 lnet_msg_container_setup(struct lnet_msg_container *container, int cpt)
 {
-	int rc;
-
 	container->msc_init = 1;
 
 	INIT_LIST_HEAD(&container->msc_active);
 	INIT_LIST_HEAD(&container->msc_finalizing);
 
-#ifdef LNET_USE_LIB_FREELIST
-	memset(&container->msc_freelist, 0, sizeof(lnet_freelist_t));
-
-	rc = lnet_freelist_init(&container->msc_freelist,
-				LNET_FL_MAX_MSGS, sizeof(lnet_msg_t));
-	if (rc != 0) {
-		CERROR("Failed to init freelist for message container\n");
-		lnet_msg_container_cleanup(container);
-		return rc;
-	}
-#else
-	rc = 0;
-#endif
 	/* number of CPUs */
 	container->msc_nfinalizers = cfs_cpt_weight(lnet_cpt_table(), cpt);
 
@@ -601,7 +583,7 @@ lnet_msg_container_setup(struct lnet_msg_container *container, int cpt)
 		return -ENOMEM;
 	}
 
-	return rc;
+	return 0;
 }
 
 void
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index b4ac670..91f3f09 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -1046,7 +1046,7 @@ lnet_router_checker_start(void)
 {
 	struct task_struct *task;
 	int rc;
-	int eqsz;
+	int eqsz = 0;
 
 	LASSERT(the_lnet.ln_rc_state == LNET_RC_STATE_SHUTDOWN);
 
@@ -1057,11 +1057,8 @@ lnet_router_checker_start(void)
 	}
 
 	sema_init(&the_lnet.ln_rc_signal, 0);
-	/* EQ size doesn't matter; the callback is guaranteed to get every
-	 * event */
-	eqsz = 0;
-	rc = LNetEQAlloc(eqsz, lnet_router_checker_event,
-			 &the_lnet.ln_rc_eqh);
+
+	rc = LNetEQAlloc(0, lnet_router_checker_event, &the_lnet.ln_rc_eqh);
 	if (rc != 0) {
 		CERROR("Can't allocate EQ(%d): %d\n", eqsz, rc);
 		return -ENOMEM;
-- 
1.7.1


* [PATCH 32/40] staging: lustre: prevent assert on LNet module unload
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (30 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 31/40] staging: lustre: assume a kernel build James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 33/40] staging: lustre: remove messages from lazy portal on NI shutdown James Simmons
                   ` (8 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

There is a use case where lnet can be unloaded while there are
no NIs configured.  Removing lnet in this case will cause
LNetFini() to be called without a prior call to LNetNIFini().
This will cause the LASSERT(the_lnet.ln_refcount == 0) to be
triggered.

To deal with this use case, a reference on the module is taken
with try_module_get() when LNet is configured.  This way LNet must
be unconfigured before it can be removed, which avoids the above
case.  When LNet is unconfigured, module_put() is called to drop
the reference.
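
The pairing described above can be sketched in userspace as follows.
This is only a model under stated assumptions: every model_* name is
hypothetical and is not a kernel API, and the ni_init_fails knob
stands in for a failing LNetNIInit().

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
struct model_module {
	int refs;	/* references currently held on the module */
	bool dying;	/* set once removal has started */
};

static struct model_module lnet_mod;
static bool niinit_self;	/* plays the role of the_lnet.ln_niinit_self */
static bool ni_init_fails;	/* knob: pretend NI initialization fails */

/* Modelled on try_module_get(): refuses once removal has begun. */
static bool model_module_get(struct model_module *m)
{
	if (m->dying)
		return false;
	m->refs++;
	return true;
}

static void model_module_put(struct model_module *m)
{
	m->refs--;
}

/* Removal is refused while a reference is held. */
static int model_module_remove(struct model_module *m)
{
	if (m->refs > 0)
		return -1;
	m->dying = true;
	return 0;
}

static int model_configure(void)
{
	if (niinit_self)
		return 0;
	if (!model_module_get(&lnet_mod))
		return -1;
	if (ni_init_fails) {
		model_module_put(&lnet_mod);	/* undo the get on failure */
		return -1;
	}
	niinit_self = true;
	return 0;
}

static void model_unconfigure(void)
{
	if (niinit_self) {
		niinit_self = false;
		model_module_put(&lnet_mod);	/* pairs with model_configure() */
	}
}
```

In this model, model_module_remove() fails between a successful
model_configure() and the matching model_unconfigure(), which is the
ordering the patch is meant to enforce.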

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6010
Reviewed-on: http://review.whamcloud.com/13110
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/module.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c
index 48eb085..b1f32a6 100644
--- a/drivers/staging/lustre/lnet/lnet/module.c
+++ b/drivers/staging/lustre/lnet/lnet/module.c
@@ -53,13 +53,21 @@ lnet_configure(void *arg)
 	mutex_lock(&lnet_config_mutex);
 
 	if (!the_lnet.ln_niinit_self) {
+		rc = try_module_get(THIS_MODULE);
+
+		if (rc != 1)
+			goto out;
+
 		rc = LNetNIInit(LNET_PID_LUSTRE);
 		if (rc >= 0) {
 			the_lnet.ln_niinit_self = 1;
 			rc = 0;
+		} else {
+			module_put(THIS_MODULE);
 		}
 	}
 
+out:
 	mutex_unlock(&lnet_config_mutex);
 	return rc;
 }
@@ -74,6 +82,7 @@ lnet_unconfigure(void)
 	if (the_lnet.ln_niinit_self) {
 		the_lnet.ln_niinit_self = 0;
 		LNetNIFini();
+		module_put(THIS_MODULE);
 	}
 
 	mutex_lock(&the_lnet.ln_api_mutex);
-- 
1.7.1


* [PATCH 33/40] staging: lustre: remove messages from lazy portal on NI shutdown
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (31 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 32/40] staging: lustre: prevent assert on LNet module unload James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 34/40] staging: lustre: remove unnecessary EXPORT_SYMBOL from lnet layer James Simmons
                   ` (7 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

When shutting down an NI in a busy system, some messages received
on this NI might be on the lazy portal.  They will have grabbed
a ref count on the NI, so the NI will not be removed until those
messages are processed.

To avoid this scenario, when an NI is shut down go through all
messages queued on the lazy portal and drop the messages for the
NI being shut down.
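
The selection step can be sketched with a simplified list model.
The types and names below are hypothetical: the patch itself works
on struct list_head under the portal lock, and here a negative
ni_id stands in for passing ni == NULL (take every delayed message).

```c
#include <stddef.h>

/* Hypothetical, simplified message type; not the LNet structure. */
struct model_msg {
	int ni_id;			/* NI the message arrived on */
	struct model_msg *next;
};

/*
 * Detach messages from *delayed and return them as a separate list.
 * ni_id < 0 means "take everything"; otherwise only messages that
 * arrived on the given NI are taken.  Messages left behind keep
 * their relative order.
 */
static struct model_msg *model_collect(struct model_msg **delayed, int ni_id)
{
	struct model_msg *zombies = NULL;
	struct model_msg **ztail = &zombies;
	struct model_msg **pp = delayed;

	while (*pp) {
		struct model_msg *msg = *pp;

		if (ni_id < 0 || msg->ni_id == ni_id) {
			*pp = msg->next;	/* unlink from the delayed list */
			msg->next = NULL;
			*ztail = msg;		/* append to the zombie list */
			ztail = &msg->next;
		} else {
			pp = &msg->next;	/* keep this message queued */
		}
	}
	return zombies;
}

/* Count the messages on a list. */
static int model_count(const struct model_msg *m)
{
	int n = 0;

	for (; m; m = m->next)
		n++;
	return n;
}
```

The returned list corresponds to the zombies list that is dropped
after the locks are released.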

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6040
Reviewed-on: http://review.whamcloud.com/13836
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |    1 +
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    6 ++
 drivers/staging/lustre/lnet/lnet/lib-ptl.c         |   54 +++++++++++++-------
 3 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 00ef4d0..6dce2c9 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -480,6 +480,7 @@ int lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
 		    __s32 peer_timeout, __s32 peer_cr, __s32 peer_buf_cr,
 		    __s32 credits);
 int lnet_dyn_del_ni(__u32 net);
+int lnet_clear_lazy_portal(struct lnet_ni *ni, int portal, char *reason);
 
 int lnet_islocalnid(lnet_nid_t nid);
 int lnet_islocalnet(__u32 net);
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index e78b079..34f8c1b 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1190,10 +1190,16 @@ lnet_shutdown_lndnis(void)
 static void
 lnet_shutdown_lndni(struct lnet_ni *ni)
 {
+	int i;
+
 	lnet_net_lock(LNET_LOCK_EX);
 	lnet_ni_unlink_locked(ni);
 	lnet_net_unlock(LNET_LOCK_EX);
 
+	/* clear messages for this NI on the lazy portal */
+	for (i = 0; i < the_lnet.ln_nportals; i++)
+		lnet_clear_lazy_portal(ni, i, "Shutting down NI");
+
 	/* Do peer table cleanup for this ni */
 	lnet_peer_tables_cleanup(ni);
 
diff --git a/drivers/staging/lustre/lnet/lnet/lib-ptl.c b/drivers/staging/lustre/lnet/lnet/lib-ptl.c
index b4f573a..93bc3dc 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-ptl.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-ptl.c
@@ -886,17 +886,8 @@ LNetSetLazyPortal(int portal)
 }
 EXPORT_SYMBOL(LNetSetLazyPortal);
 
-/**
- * Turn off the lazy portal attribute. Delayed requests on the portal,
- * if any, will be all dropped when this function returns.
- *
- * \param portal Index of the portal to disable the lazy attribute on.
- *
- * \retval 0       On success.
- * \retval -EINVAL If \a portal is not a valid index.
- */
 int
-LNetClearLazyPortal(int portal)
+lnet_clear_lazy_portal(struct lnet_ni *ni, int portal, char *reason)
 {
 	struct lnet_portal *ptl;
 	LIST_HEAD(zombies);
@@ -915,21 +906,48 @@ LNetClearLazyPortal(int portal)
 		return 0;
 	}
 
-	if (the_lnet.ln_shutdown)
-		CWARN("Active lazy portal %d on exit\n", portal);
-	else
-		CDEBUG(D_NET, "clearing portal %d lazy\n", portal);
+	if (ni) {
+		struct lnet_msg *msg, *tmp;
 
-	/* grab all the blocked messages atomically */
-	list_splice_init(&ptl->ptl_msg_delayed, &zombies);
+		/* grab all messages which are on the NI passed in */
+		list_for_each_entry_safe(msg, tmp, &ptl->ptl_msg_delayed,
+					 msg_list) {
+			if (msg->msg_rxpeer->lp_ni == ni)
+				list_move(&msg->msg_list, &zombies);
+		}
+	} else {
+		if (the_lnet.ln_shutdown)
+			CWARN("Active lazy portal %d on exit\n", portal);
+		else
+			CDEBUG(D_NET, "clearing portal %d lazy\n", portal);
+
+		/* grab all the blocked messages atomically */
+		list_splice_init(&ptl->ptl_msg_delayed, &zombies);
 
-	lnet_ptl_unsetopt(ptl, LNET_PTL_LAZY);
+		lnet_ptl_unsetopt(ptl, LNET_PTL_LAZY);
+	}
 
 	lnet_ptl_unlock(ptl);
 	lnet_res_unlock(LNET_LOCK_EX);
 
-	lnet_drop_delayed_msg_list(&zombies, "Clearing lazy portal attr");
+	lnet_drop_delayed_msg_list(&zombies, reason);
 
 	return 0;
 }
+
+/**
+ * Turn off the lazy portal attribute. Delayed requests on the portal,
+ * if any, will be all dropped when this function returns.
+ *
+ * \param portal Index of the portal to disable the lazy attribute on.
+ *
+ * \retval 0       On success.
+ * \retval -EINVAL If \a portal is not a valid index.
+ */
+int
+LNetClearLazyPortal(int portal)
+{
+	return lnet_clear_lazy_portal(NULL, portal,
+				      "Clearing lazy portal attr");
+}
 EXPORT_SYMBOL(LNetClearLazyPortal);
-- 
1.7.1


* [PATCH 34/40] staging: lustre: remove unnecessary EXPORT_SYMBOL from lnet layer
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (32 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 33/40] staging: lustre: remove messages from lazy portal on NI shutdown James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 35/40] staging: lustre: avoid race during lnet acceptor thread termination James Simmons
                   ` (6 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, frank zago

From: frank zago <fzago@cray.com>

A lot of symbols don't need to be exported at all because they are
only used in the module they belong to.

Signed-off-by: frank zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5829
Reviewed-on: http://review.whamcloud.com/13320
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c     |    1 -
 drivers/staging/lustre/lnet/lnet/lib-move.c   |    1 -
 drivers/staging/lustre/lnet/lnet/lib-socket.c |    3 ---
 drivers/staging/lustre/lnet/selftest/conctl.c |    2 --
 4 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 34f8c1b..7f5e0e8 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -358,7 +358,6 @@ lnet_counters_reset(void)
 
 	lnet_net_unlock(LNET_LOCK_EX);
 }
-EXPORT_SYMBOL(lnet_counters_reset);
 
 static char *
 lnet_res_type2str(int type)
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index c2cc8c8..430cb9a 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1715,7 +1715,6 @@ lnet_msgtyp2str(int type)
 		return "<UNKNOWN>";
 	}
 }
-EXPORT_SYMBOL(lnet_msgtyp2str);
 
 void
 lnet_print_hdr(lnet_hdr_t *hdr)
diff --git a/drivers/staging/lustre/lnet/lnet/lib-socket.c b/drivers/staging/lustre/lnet/lnet/lib-socket.c
index 6f7ef4c..ccb425d 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-socket.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-socket.c
@@ -513,7 +513,6 @@ lnet_sock_listen(struct socket **sockp, __u32 local_ip, int local_port,
 	sock_release(*sockp);
 	return rc;
 }
-EXPORT_SYMBOL(lnet_sock_listen);
 
 int
 lnet_sock_accept(struct socket **newsockp, struct socket *sock)
@@ -555,7 +554,6 @@ failed:
 	sock_release(newsock);
 	return rc;
 }
-EXPORT_SYMBOL(lnet_sock_accept);
 
 int
 lnet_sock_connect(struct socket **sockp, int *fatal, __u32 local_ip,
@@ -591,4 +589,3 @@ lnet_sock_connect(struct socket **sockp, int *fatal, __u32 local_ip,
 	sock_release(*sockp);
 	return rc;
 }
-EXPORT_SYMBOL(lnet_sock_connect);
diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
index 3dc8ea7..73e71cd 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -931,5 +931,3 @@ out:
 
 	return rc;
 }
-
-EXPORT_SYMBOL(lstcon_ioctl_entry);
-- 
1.7.1


* [PATCH 35/40] staging: lustre: avoid race during lnet acceptor thread termination
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (33 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 34/40] staging: lustre: remove unnecessary EXPORT_SYMBOL from lnet layer James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 36/40] staging: lustre: test for sk_sleep presence in compact-2.6.h James Simmons
                   ` (5 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Bruno Faccini

From: Bruno Faccini <bruno.faccini@intel.com>

This patch avoids a potential race on the socket sleepers' wait
list during acceptor thread termination by using sk_callback_lock
RW-lock protection.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6476
Reviewed-on: http://review.whamcloud.com/14503
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/acceptor.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index ffa2b19..61806ce 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -485,11 +485,17 @@ lnet_acceptor_start(void)
 void
 lnet_acceptor_stop(void)
 {
+	struct sock *sk;
+
 	if (lnet_acceptor_state.pta_shutdown) /* not running */
 		return;
 
 	lnet_acceptor_state.pta_shutdown = 1;
-	wake_up_all(sk_sleep(lnet_acceptor_state.pta_sock->sk));
+
+	sk = lnet_acceptor_state.pta_sock->sk;
+
+	/* awake any sleepers using safe method */
+	sk->sk_state_change(sk);
 
 	/* block until acceptor signals exit */
 	wait_for_completion(&lnet_acceptor_state.pta_signal);
-- 
1.7.1


* [PATCH 36/40] staging: lustre: test for sk_sleep presence in compact-2.6.h
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (34 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 35/40] staging: lustre: avoid race during lnet acceptor thread termination James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 37/40] staging: lustre: remove unnecessary NULL check in IOC_LIBCFS_GET_NET James Simmons
                   ` (4 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, James Simmons

Like Lustre, external infiniband stacks create a compatibility
layer to handle various distributions and kernel versions.
Because of this, sk_sleep can be defined by the external
infiniband stack as well as by the Linux kernel.  We need to
examine the infiniband stack's headers to see if sk_sleep is
available there as well.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6763
Reviewed-on: http://review.whamcloud.com/15386
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-types.h  |    1 -
 drivers/staging/lustre/lnet/lnet/acceptor.c        |    1 +
 2 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index dbfb069..695b5be 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -38,7 +38,6 @@
 #include <linux/kthread.h>
 #include <linux/uio.h>
 #include <linux/types.h>
-#include <net/sock.h>
 
 #include "types.h"
 
diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index 61806ce..05eb5b2 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -36,6 +36,7 @@
 
 #define DEBUG_SUBSYSTEM S_LNET
 #include <linux/completion.h>
+#include <net/sock.h>
 #include "../../include/linux/lnet/lib-lnet.h"
 
 static int   accept_port    = 988;
-- 
1.7.1


* [PATCH 37/40] staging: lustre: remove unnecessary NULL check in IOC_LIBCFS_GET_NET
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (35 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 36/40] staging: lustre: test for sk_sleep presence in compact-2.6.h James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 38/40] staging: lustre: Allocate the correct number of rtr buffers James Simmons
                   ` (3 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

In LNetCtl():IOC_LIBCFS_GET_NET there is a check for config == NULL.
This is not necessary as it will never be NULL; that is ensured
before the call to LNetCtl().

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6502
Reviewed-on: http://review.whamcloud.com/15779
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 7f5e0e8..6373de0 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1931,7 +1931,7 @@ LNetCtl(unsigned int cmd, void *arg)
 
 		net_config = (struct lnet_ioctl_net_config *)
 				config->cfg_bulk;
-		if (!config || !net_config)
+		if (!net_config)
 			return -1;
 
 		return lnet_get_net_config(config->cfg_count,
-- 
1.7.1


* [PATCH 38/40] staging: lustre: Allocate the correct number of rtr buffers
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (36 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 37/40] staging: lustre: remove unnecessary NULL check in IOC_LIBCFS_GET_NET James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 39/40] staging: lustre: Use lnet_is_route_alive for router aliveness James Simmons
                   ` (2 subsequent siblings)
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Amir Shehata

From: Amir Shehata <amir.shehata@intel.com>

This patch ensures that the correct number of router buffers is
allocated.  One count tracks the number of buffers allocated;
another tracks the number of buffers requested.  The number of
buffers allocated is set when creating new buffers and reduced
when buffers are freed.

The number of requested buffers is set when the buffers are
allocated and is checked when credits are returned to determine
whether the buffer should be freed or kept.

In lnet_rtrpool_adjust_bufs() grab lnet_net_lock() before using
rbp_nbuffers to ensure that it isn't changed by
lnet_return_rx_credits_locked() during the process of allocating
new buffers.  All other accesses to rbp_nbuffers are already
protected by lnet_net_lock().

This avoids the case where we allocate fewer than the desired
number of buffers.
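
The two-counter accounting can be sketched as a single-threaded
model.  The names are hypothetical, locking is omitted, allocation
is assumed to always succeed, and the update of the allocated count
and credits after a successful allocation is an assumption, since
that part of lnet_rtrpool_adjust_bufs() is not shown in this diff.

```c
/* Hypothetical model; the real code runs under lnet_net_lock(). */
struct model_pool {
	int req_nbuffers;	/* requested number of buffers */
	int nbuffers;		/* buffers actually allocated */
	int credits;		/* free buffers */
};

/* A buffer comes back from a message: keep it or discard it. */
static void model_return_buffer(struct model_pool *p)
{
	if (p->credits >= p->req_nbuffers)
		p->nbuffers--;		/* surplus: discard the buffer */
	else
		p->credits++;		/* still wanted: back on the free list */
}

/* Resize the pool; returns how many new buffers were allocated. */
static int model_adjust(struct model_pool *p, int nbufs)
{
	int num_rb = nbufs - p->nbuffers;	/* shortfall vs. allocated */

	/* Shrinking, or enough buffers already exist: only record it. */
	if (nbufs <= p->req_nbuffers || num_rb <= 0) {
		p->req_nbuffers = nbufs;
		return 0;
	}

	/* Record the new request first so returns stop discarding. */
	p->req_nbuffers = nbufs;

	/* Assumed: allocation succeeds and all new buffers are free. */
	p->nbuffers += num_rb;
	p->credits += num_rb;
	return num_rb;
}
```

Only the shortfall against the allocated count is allocated, so a
grow request that follows a shrink does not over-allocate in this
model.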

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6122
Reviewed-on: http://review.whamcloud.com/13519
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-types.h  |    5 ++-
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    3 +-
 drivers/staging/lustre/lnet/lnet/router.c          |   32 +++++++++++++++-----
 3 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 695b5be..24b3c1a 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -386,7 +386,10 @@ typedef struct {
 	struct list_head	rbp_msgs;	/* messages blocking
 						   for a buffer */
 	int			rbp_npages;	/* # pages in each buffer */
-	int			rbp_nbuffers;	/* # buffers */
+	/* requested number of buffers */
+	int			rbp_req_nbuffers;
+	/* # buffers actually allocated */
+	int			rbp_nbuffers;
 	int			rbp_credits;	/* # free buffers /
 						     blocked messages */
 	int			rbp_mincredits;	/* low water mark */
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 430cb9a..21a7c6f 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1101,9 +1101,10 @@ lnet_return_rx_credits_locked(lnet_msg_t *msg)
 		 * buffers in this pool.  Make sure we never put back
 		 * more buffers than the stated number.
 		 */
-		if (rbp->rbp_credits >= rbp->rbp_nbuffers) {
+		if (unlikely(rbp->rbp_credits >= rbp->rbp_req_nbuffers)) {
 			/* Discard this buffer so we don't have too many. */
 			lnet_destroy_rtrbuf(rb, rbp->rbp_npages);
+			rbp->rbp_nbuffers--;
 		} else {
 			list_add(&rb->rb_list, &rbp->rbp_bufs);
 			rbp->rbp_credits++;
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 91f3f09..35cfced 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -1352,6 +1352,7 @@ lnet_rtrpool_free_bufs(lnet_rtrbufpool_t *rbp, int cpt)
 	lnet_net_lock(cpt);
 	lnet_drop_routed_msgs_locked(&rbp->rbp_msgs, cpt);
 	list_splice_init(&rbp->rbp_bufs, &tmp);
+	rbp->rbp_req_nbuffers = 0;
 	rbp->rbp_nbuffers = 0;
 	rbp->rbp_credits = 0;
 	rbp->rbp_mincredits = 0;
@@ -1372,20 +1373,33 @@ lnet_rtrpool_adjust_bufs(lnet_rtrbufpool_t *rbp, int nbufs, int cpt)
 	lnet_rtrbuf_t *rb;
 	int num_rb;
 	int num_buffers = 0;
+	int old_req_nbufs;
 	int npages = rbp->rbp_npages;
 
+	lnet_net_lock(cpt);
 	/*
 	 * If we are called for less buffers than already in the pool, we
-	 * just lower the nbuffers number and excess buffers will be
+	 * just lower the req_nbuffers number and excess buffers will be
 	 * thrown away as they are returned to the free list.  Credits
 	 * then get adjusted as well.
+	 * If we already have enough buffers allocated to serve the
+	 * increase requested, then we can treat that the same way as we
+	 * do the decrease.
 	 */
-	if (nbufs <= rbp->rbp_nbuffers) {
-		lnet_net_lock(cpt);
-		rbp->rbp_nbuffers = nbufs;
+	num_rb = nbufs - rbp->rbp_nbuffers;
+	if (nbufs <= rbp->rbp_req_nbuffers || num_rb <= 0) {
+		rbp->rbp_req_nbuffers = nbufs;
 		lnet_net_unlock(cpt);
 		return 0;
 	}
+	/*
+	 * store the older value of rbp_req_nbuffers and then set it to
+	 * the new request to prevent lnet_return_rx_credits_locked() from
+	 * freeing buffers that we need to keep around
+	 */
+	old_req_nbufs = rbp->rbp_req_nbuffers;
+	rbp->rbp_req_nbuffers = nbufs;
+	lnet_net_unlock(cpt);
 
 	INIT_LIST_HEAD(&rb_list);
 
@@ -1394,19 +1408,21 @@ lnet_rtrpool_adjust_bufs(lnet_rtrbufpool_t *rbp, int nbufs, int cpt)
 	 * allocated successfully then join this list to the rbp buffer
 	 * list. If not then free all allocated buffers.
 	 */
-	num_rb = rbp->rbp_nbuffers;
-
-	while (num_rb < nbufs) {
+	while (num_rb-- > 0) {
 		rb = lnet_new_rtrbuf(rbp, cpt);
 		if (rb == NULL) {
 			CERROR("Failed to allocate %d route bufs of %d pages\n",
 			       nbufs, npages);
+
+			lnet_net_lock(cpt);
+			rbp->rbp_req_nbuffers = old_req_nbufs;
+			lnet_net_unlock(cpt);
+
 			goto failed;
 		}
 
 		list_add(&rb->rb_list, &rb_list);
 		num_buffers++;
-		num_rb++;
 	}
 
 	lnet_net_lock(cpt);
-- 
1.7.1


* [PATCH 39/40] staging: lustre: Use lnet_is_route_alive for router aliveness
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (37 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 38/40] staging: lustre: Allocate the correct number of rtr buffers James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-11-20 23:36 ` [PATCH 40/40] staging: lustre: Remove LASSERTS from router checker James Simmons
  2015-12-21 23:41 ` [PATCH 00/40] Sync upstream lustre client LNet core Greg Kroah-Hartman
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Chris Horn

From: Chris Horn <hornc@cray.com>

lctl show_route and lctl route_list will output router aliveness
information via lnet_get_route(). lnet_get_route() should use the
lnet_is_route_alive() function, introduced in e8a1124
http://review.whamcloud.com/7857, to determine route aliveness.

Signed-off-by: Chris Horn <hornc@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5733
Reviewed-on: http://review.whamcloud.com/14055
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/router.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 35cfced..83d233b 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -611,8 +611,7 @@ lnet_get_route(int idx, __u32 *net, __u32 *hops,
 					*hops     = route->lr_hops;
 					*priority = route->lr_priority;
 					*gateway  = route->lr_gateway->lp_nid;
-					*alive = route->lr_gateway->lp_alive &&
-						 !route->lr_downis;
+					*alive = lnet_is_route_alive(route);
 					lnet_net_unlock(cpt);
 					return 0;
 				}
-- 
1.7.1


* [PATCH 40/40] staging: lustre: Remove LASSERTS from router checker
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (38 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 39/40] staging: lustre: Use lnet_is_route_alive for router aliveness James Simmons
@ 2015-11-20 23:36 ` James Simmons
  2015-12-21 23:41 ` [PATCH 00/40] Sync upstream lustre client LNet core Greg Kroah-Hartman
  40 siblings, 0 replies; 65+ messages in thread
From: James Simmons @ 2015-11-20 23:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger
  Cc: Linux Kernel Mailing List, lustre-devel, Doug Oucharek

From: Doug Oucharek <doug.s.oucharek@intel.com>

In lnet_router_checker(), there are two LASSERTs.  Neither protects
us from anything, and one of them triggered for a customer, crashing
the system unnecessarily.  This patch removes them.
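
One way to read this is that the loop condition already copes with
any state the thread finds.  The sketch below is a hypothetical
userspace model of that shape, not the router checker itself; the
max_passes argument stands in for an external stop request.

```c
/* Hypothetical states, loosely mirroring LNET_RC_STATE_*. */
enum model_rc_state { MODEL_SHUTDOWN, MODEL_RUNNING, MODEL_STOPPING };

/*
 * Run the checker loop and return how many passes were made.
 * No assertion is made about the state on entry or on exit:
 * the loop simply does not run unless the state is RUNNING.
 */
static int model_checker(enum model_rc_state *state, int max_passes)
{
	int passes = 0;

	while (*state == MODEL_RUNNING) {
		passes++;
		if (passes >= max_passes)
			*state = MODEL_STOPPING;	/* stand-in for a stop request */
	}

	*state = MODEL_SHUTDOWN;
	return passes;
}
```

If the thread starts after a stop has already been requested, the
model makes zero passes and proceeds straight to shutdown instead
of asserting.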

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7362
Reviewed-on: http://review.whamcloud.com/17003
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Matt Ezell <ezellma@ornl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/router.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 83d233b..476b4dd 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -1223,8 +1223,6 @@ lnet_router_checker(void *arg)
 
 	cfs_block_allsigs();
 
-	LASSERT(the_lnet.ln_rc_state == LNET_RC_STATE_RUNNING);
-
 	while (the_lnet.ln_rc_state == LNET_RC_STATE_RUNNING) {
 		__u64 version;
 		int cpt;
@@ -1280,8 +1278,6 @@ rescan:
 							 cfs_time_seconds(1));
 	}
 
-	LASSERT(the_lnet.ln_rc_state == LNET_RC_STATE_STOPPING);
-
 	lnet_prune_rc_data(1); /* wait for UNLINK */
 
 	the_lnet.ln_rc_state = LNET_RC_STATE_SHUTDOWN;
-- 
1.7.1



* Re: [PATCH 06/40] staging: lustre: remove uses of IS_ERR_VALUE()
  2015-11-20 23:35 ` [PATCH 06/40] staging: lustre: remove uses of IS_ERR_VALUE() James Simmons
@ 2015-11-21 18:45   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-11-21 18:45 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	John L. Hammond, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:42PM -0500, James Simmons wrote:
> @@ -1577,15 +1578,20 @@ static int mdc_ioc_changelog_send(struct obd_device *obd,
>  	 * New thread because we should return to user app before
>  	 * writing into our pipe
>  	 */
> -	rc = PTR_ERR(kthread_run(mdc_changelog_send_thread, cs,
> -				 "mdc_clg_send_thread"));
> -	if (!IS_ERR_VALUE(rc)) {
> -		CDEBUG(D_CHANGELOG, "start changelog thread\n");
> -		return 0;
> +	task = kthread_run(mdc_changelog_send_thread, cs,
> +			   "mdc_clg_send_thread");
> +	if (IS_ERR(task)) {
> +		rc = PTR_ERR(task);
> +		CERROR("%s: can't start changelog thread: rc = %d\n",
> +		       obd->obd_name, rc);
> +		kfree(cs);
> +	} else {
> +		rc = 0;
> +		CDEBUG(D_CHANGELOG, "%s: started changelog thread\n",
> +		       obd->obd_name);
>  	}
>  
>  	CERROR("Failed to start changelog thread: %d\n", rc);
> -	kfree(cs);
>  	return rc;
>  }
>  

This will print an error when it succeeds.

It is better to keep the error path and the success path as separate as
possible.  For the normal case, the success path is at indent level 1
and the fail path is at indent level 2.  Like this:

	ret = one();
	if (ret)
		return ret;
	ret = two();
	if (ret)
		return ret;

	return 0;

When it's written that way it is:

	success;
	if (ret)
		fail_path;

	success;
	if (ret)
		fail_path;
	success;


The current code looks like:

	success;
	if (ret) {
		fail;
	} else {
		success;
	}
	mixed;

You see what I mean?  Ideally the function should look like a list of
directives in a row at indent level 1 with only minimal indenting for
errors.
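
As a rough userspace sketch of that shape (not the actual lustre code):
start_worker() and struct show_ctx below are hypothetical stand-ins for
kthread_run() and struct changelog_show, and the stand-in returns an int
rather than a task pointer.  The only point is that the failure branch
ends in a return, so nothing after it can be "mixed".

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct changelog_show. */
struct show_ctx {
	int startrec;
};

/*
 * Hypothetical stand-in for kthread_run(): returns 0 on success or a
 * negative errno.  On success this sketch simply frees ctx where a real
 * worker thread would take ownership of it.
 */
static int start_worker(struct show_ctx *ctx, int simulate_failure)
{
	if (simulate_failure)
		return -EAGAIN;
	free(ctx);
	return 0;
}

static int changelog_send(int simulate_failure)
{
	struct show_ctx *ctx;
	int rc;

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;

	rc = start_worker(ctx, simulate_failure);
	if (rc) {
		/* fail path: indent level 2, and it ends in a return */
		free(ctx);
		return rc;
	}

	/* success path: stays at indent level 1 */
	return 0;
}
```

With this layout an error message placed after the if block could only
ever run on success, which makes that kind of mistake easy to spot.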

regards,
dan carpenter


* Re: [PATCH 02/40] staging: lustre: fix 'NULL pointer dereference' errors for LNet
  2015-11-20 23:35 ` [PATCH 02/40] staging: lustre: fix 'NULL pointer dereference' errors for LNet James Simmons
@ 2015-12-02  7:46   ` Dan Carpenter
  2015-12-15 18:08     ` Simmons, James A.
  0 siblings, 1 reply; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02  7:46 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Sebastien Buisson, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:38PM -0500, James Simmons wrote:
> From: Sebastien Buisson <sebastien.buisson@bull.net>
> 
> Fix 'NULL pointer dereference' defects found by Coverity version
> 6.5.3:
> Dereference after null check (FORWARD_NULL)
> For instance, Passing null pointer to a function which dereferences
> it.
> Dereference before null check (REVERSE_INULL)
> Null-checking variable suggests that it may be null, but it has
> already been dereferenced on all paths leading to the check.
> Dereference null return value (NULL_RETURNS)
> 
> The following fixes for the LNet layer are broken out of patch
> http://review.whamcloud.com/4720.
> 
> Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2217
> Reviewed-on: http://review.whamcloud.com/4720
> Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> ---
>  .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |    2 +-
>  drivers/staging/lustre/lnet/lnet/lib-move.c        |    2 +
>  drivers/staging/lustre/lnet/selftest/conctl.c      |   51 ++++++++++----------
>  3 files changed, 29 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> index de0f85f..0f4154c 100644
> --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> @@ -2829,7 +2829,7 @@ int kiblnd_startup(lnet_ni_t *ni)
>  	return 0;
>  
>  failed:
> -	if (net->ibn_dev == NULL && ibdev != NULL)
> +	if (net && net->ibn_dev == NULL && ibdev != NULL)
>  		kiblnd_destroy_dev(ibdev);
>  
>  net_failed:

I think the warning must be for a really really old version.  This was
fixed in: 3247c4e5ef5d ('staging: lustre: lnet: klnds: o2iblnd: fix null
dereference on failed path in o2iblnd.c').

The new NULL check is superfluous.

> diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
> index 5631f60..7a68382 100644
> --- a/drivers/staging/lustre/lnet/lnet/lib-move.c
> +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
> @@ -162,6 +162,7 @@ lnet_iov_nob(unsigned int niov, struct kvec *iov)
>  {
>  	unsigned int nob = 0;
>  
> +	LASSERT(niov == 0 || iov);
>  	while (niov-- > 0)
>  		nob += (iov++)->iov_len;
>  
> @@ -280,6 +281,7 @@ lnet_kiov_nob(unsigned int niov, lnet_kiov_t *kiov)
>  {
>  	unsigned int nob = 0;
>  
> +	LASSERT(niov == 0 || kiov);
>  	while (niov-- > 0)
>  		nob += (kiov++)->kiov_len;
>  

Fine, I suppose.

> diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
> index 556c837..2ca7d0e 100644
> --- a/drivers/staging/lustre/lnet/selftest/conctl.c
> +++ b/drivers/staging/lustre/lnet/selftest/conctl.c
> @@ -679,45 +679,46 @@ static int
>  lst_stat_query_ioctl(lstio_stat_args_t *args)
>  {
>  	int rc;
> -	char *name;
> +	char *name = NULL;
>  
>  	/* TODO: not finished */
>  	if (args->lstio_sta_key != console_session.ses_key)
>  		return -EACCES;
>  
> -	if (args->lstio_sta_resultp == NULL ||
> -	    (args->lstio_sta_namep  == NULL &&
> -	     args->lstio_sta_idsp   == NULL) ||
> -	    args->lstio_sta_nmlen <= 0 ||
> -	    args->lstio_sta_nmlen > LST_NAME_SIZE)
> -		return -EINVAL;
> -
> -	if (args->lstio_sta_idsp != NULL &&
> -	    args->lstio_sta_count <= 0)
> +	if (!args->lstio_sta_resultp)
>  		return -EINVAL;
>  
> -	LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
> -	if (name == NULL)
> -		return -ENOMEM;
> -
> -	if (copy_from_user(name, args->lstio_sta_namep,
> -			       args->lstio_sta_nmlen)) {
> -		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
> -		return -EFAULT;
> -	}
> +	if (args->lstio_sta_idsp) {
> +		if (args->lstio_sta_count <= 0)
> +			return -EINVAL;
>  
> -	if (args->lstio_sta_idsp == NULL) {
> -		rc = lstcon_group_stat(name, args->lstio_sta_timeout,
> -				       args->lstio_sta_resultp);
> -	} else {
>  		rc = lstcon_nodes_stat(args->lstio_sta_count,
>  				       args->lstio_sta_idsp,
>  				       args->lstio_sta_timeout,
>  				       args->lstio_sta_resultp);
> -	}
> +	} else if (args->lstio_sta_namep) {
> +		if (args->lstio_sta_nmlen <= 0 ||
> +		    args->lstio_sta_nmlen > LST_NAME_SIZE)
> +			return -EINVAL;
> +
> +		LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
> +		if (!name)
> +			return -ENOMEM;
>  
> -	LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
> +		rc = copy_from_user(name, args->lstio_sta_namep,
> +				    args->lstio_sta_nmlen);
> +		if (!rc)
> +			rc = lstcon_group_stat(name, args->lstio_sta_timeout,
> +					       args->lstio_sta_resultp);
> +		else
> +			rc = -EFAULT;
>  
> +	} else {
> +		rc = -EINVAL;
> +	}
> +
> +	if (name)
> +		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);


There is no bug fix here.  This code was fine when it was merged into
the kernel in 2013 so I have no idea how out of date the static checker
warning is...  The new code doesn't do unnecessary allocations so that's
good but "name" should be declared in the block where it is used instead
of at the start of the function.  Btw, we assume that the user gives us
a NUL terminated string for "name" so we should fix that bug as well.

TODO: lustre: don't assume "name" is NUL terminated
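
One possible shape for that TODO, as a userspace sketch only: memcpy()
stands in for copy_from_user(), and copy_name() and NAME_MAX_LEN are
made-up names playing the roles of the ioctl code and LST_NAME_SIZE.
The terminator is written explicitly after the copy, so the result does
not depend on what the caller sent or on whether the allocator zeroes
the buffer.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32	/* hypothetical limit, plays the role of LST_NAME_SIZE */

/*
 * Copy nmlen bytes of an untrusted name into a fresh buffer and
 * terminate it explicitly.  Returns 0 and stores the buffer in *out,
 * or returns a negative errno.
 */
static int copy_name(const char *src, int nmlen, char **out)
{
	char *name;

	if (nmlen <= 0 || nmlen > NAME_MAX_LEN)
		return -EINVAL;

	name = malloc((size_t)nmlen + 1);
	if (!name)
		return -ENOMEM;

	memcpy(name, src, (size_t)nmlen);
	name[nmlen] = '\0';	/* never rely on the caller's terminator */

	*out = name;
	return 0;
}
```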

regards,
dan carpenter



* Re: [PATCH 03/40] staging: lustre: reflect down routes in /proc/sys/lnet/routes
  2015-11-20 23:35 ` [PATCH 03/40] staging: lustre: reflect down routes in /proc/sys/lnet/routes James Simmons
@ 2015-12-02  7:54   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02  7:54 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Chris Horn, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:39PM -0500, James Simmons wrote:
> From: Chris Horn <hornc@cray.com>
> 
> We consider routes "down" if the router is down or the router
> NI for the target network is down. This should be reflected
> in the output of /proc/sys/lnet/routes
> 
> Signed-off-by: Chris Horn <hornc@cray.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3679
> Reviewed-on: http://review.whamcloud.com/7857
> Reviewed-by: Cory Spitz <spitzcor@cray.com>
> Reviewed-by: Isaac Huang <he.huang@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> ---
>  .../staging/lustre/include/linux/lnet/lib-lnet.h   |   13 ++++++++
>  drivers/staging/lustre/lnet/lnet/lib-move.c        |   32 ++++++++++----------
>  drivers/staging/lustre/lnet/lnet/router_proc.c     |    2 +-
>  3 files changed, 30 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> index b61d504..09c6bfe 100644
> --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> @@ -64,6 +64,19 @@ extern lnet_t	the_lnet;	/* THE network */
>  /** exclusive lock */
>  #define LNET_LOCK_EX		CFS_PERCPT_LOCK_EX
>  
> +static inline int lnet_is_route_alive(lnet_route_t *route)
> +{
> +	/* gateway is down */
> +	if (!route->lr_gateway->lp_alive)
> +		return 0;
> +	/* no NI status, assume it's alive */
> +	if ((route->lr_gateway->lp_ping_feats &
> +	     LNET_PING_FEAT_NI_STATUS) == 0)
> +		return 1;
> +	/* has NI status, check # down NIs */
> +	return route->lr_downis == 0;
> +}
> +
>  static inline int lnet_is_wire_handle_none(lnet_handle_wire_t *wh)
>  {
>  	return (wh->wh_interface_cookie == LNET_WIRE_HANDLE_COOKIE_NONE &&
> diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
> index 7a68382..c56de44 100644
> --- a/drivers/staging/lustre/lnet/lnet/lib-move.c
> +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
> @@ -1122,9 +1122,9 @@ static lnet_peer_t *
>  lnet_find_route_locked(lnet_ni_t *ni, lnet_nid_t target, lnet_nid_t rtr_nid)
>  {
>  	lnet_remotenet_t *rnet;
> -	lnet_route_t *rtr;
> -	lnet_route_t *rtr_best;
> -	lnet_route_t *rtr_last;
> +	lnet_route_t *route;
> +	lnet_route_t *best_route;
> +	lnet_route_t *last_route;

Unrelated variable renaming.

>  	struct lnet_peer *lp_best;
>  	struct lnet_peer *lp;
>  	int rc;
> @@ -1137,13 +1137,12 @@ lnet_find_route_locked(lnet_ni_t *ni, lnet_nid_t target, lnet_nid_t rtr_nid)
>  		return NULL;
>  
>  	lp_best = NULL;
> -	rtr_best = rtr_last = NULL;
> -	list_for_each_entry(rtr, &rnet->lrn_routes, lr_list) {
> -		lp = rtr->lr_gateway;
> +	best_route = NULL;
> +	last_route = NULL;

Unrelated checkpatch fixes.

> +	list_for_each_entry(route, &rnet->lrn_routes, lr_list) {
> +		lp = route->lr_gateway;
>  
> -		if (!lp->lp_alive || /* gateway is down */
> -		    ((lp->lp_ping_feats & LNET_PING_FEAT_NI_STATUS) != 0 &&
> -		     rtr->lr_downis != 0)) /* NI to target is down */
> +		if (!lnet_is_route_alive(route))

This section is related to the patch, we moved the check out into its
own function.

>  			continue;
>  
>  		if (ni != NULL && lp->lp_ni != ni)
> @@ -1153,28 +1152,29 @@ lnet_find_route_locked(lnet_ni_t *ni, lnet_nid_t target, lnet_nid_t rtr_nid)
>  			return lp;
>  
>  		if (lp_best == NULL) {
> -			rtr_best = rtr_last = rtr;
> +			best_route = route;
> +			last_route = route;

More unrelated checkpatch fixes.

>  			lp_best = lp;
>  			continue;
>  		}
>  
>  		/* no protection on below fields, but it's harmless */
> -		if (rtr_last->lr_seq - rtr->lr_seq < 0)
> -			rtr_last = rtr;
> +		if (last_route->lr_seq - route->lr_seq < 0)
> +			last_route = route;
>  
> -		rc = lnet_compare_routes(rtr, rtr_best);
> +		rc = lnet_compare_routes(route, best_route);
>  		if (rc < 0)
>  			continue;
>  
> -		rtr_best = rtr;
> +		best_route = route;
>  		lp_best = lp;
>  	}
>  
>  	/* set sequence number on the best router to the latest sequence + 1
>  	 * so we can round-robin all routers, it's race and inaccurate but
>  	 * harmless and functional  */
> -	if (rtr_best != NULL)
> -		rtr_best->lr_seq = rtr_last->lr_seq + 1;
> +	if (best_route)

More checkpatch fixes.

> +		best_route->lr_seq = last_route->lr_seq + 1;
>  	return lp_best;
>  }
>  
> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
> index 396c7c4..af7423f 100644
> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
> @@ -240,7 +240,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
>  			unsigned int hops = route->lr_hops;
>  			unsigned int priority = route->lr_priority;
>  			lnet_nid_t nid = route->lr_gateway->lp_nid;
> -			int alive = route->lr_gateway->lp_alive;
> +			int alive = lnet_is_route_alive(route);

This line is the bugfix.

I know that people hate breaking patches up into reviewable patches but
this is a one line fix which is hidden behind 30 lines of unrelated
changes.  It makes it very hard to follow what is going on.

I have scripts to review checkpatch fixes basically automatically so it
really really helps me when people do one thing per patch.
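
To make the behaviour change concrete, here is a small standalone model
of the old and new checks.  The struct layouts are trimmed down to the
fields involved, and the value of PING_FEAT_NI_STATUS is an assumption
for illustration; the real definitions live in the LNet headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only. */
#define PING_FEAT_NI_STATUS (1u << 1)

struct peer {
	bool alive;
	uint32_t ping_feats;
};

struct route {
	struct peer *gateway;
	unsigned int downis;	/* number of down NIs towards the target */
};

/* Mirrors the logic of lnet_is_route_alive() from the quoted patch. */
static bool route_is_alive(const struct route *r)
{
	if (!r->gateway->alive)			/* gateway is down */
		return false;
	if (!(r->gateway->ping_feats & PING_FEAT_NI_STATUS))
		return true;			/* no NI status, assume alive */
	return r->downis == 0;			/* has NI status, check down NIs */
}

/* What the proc file reported before the fix: gateway state only. */
static bool route_is_alive_old(const struct route *r)
{
	return r->gateway->alive;
}
```

A route whose gateway is up but whose NI towards the target is down is
the case where the two functions disagree, and that is the case the one
line fix in router_proc.c addresses.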

regards,
dan carpenter



* Re: [PATCH 11/40] staging: lustre: DLC Feature dynamic net config
  2015-11-20 23:35 ` [PATCH 11/40] staging: lustre: DLC Feature dynamic net config James Simmons
@ 2015-12-02  9:23   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02  9:23 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:47PM -0500, James Simmons wrote:
> +
> +	return 0;
>  
>   failed4:
> -	lnet_ping_target_fini();
> - failed3:
>  	the_lnet.ln_refcount = 0;
> +	lnet_ping_md_unlink(pinfo, &md_handle);
> +	lnet_ping_info_free(pinfo);
> + failed3:
>  	lnet_acceptor_stop();
> +	rc = LNetEQFree(the_lnet.ln_ping_target_eq);
> +	LASSERT(rc == 0);
        ^^^^^^^^^^^^^^^^

>   failed2:
>  	lnet_destroy_routes();
>  	lnet_shutdown_lndnis();
> @@ -1263,8 +1609,12 @@ LNetNIInit(lnet_pid_t requested_pid)
>  	lnet_unprepare();
>   failed0:
>  	LASSERT(rc < 0);
        ^^^^^^^^^^^^^^^

These asserts contradict each other.


But mostly please remove all the GW-BASIC style numbered label names
from this patch.  You wouldn't name your variables "int var1, var2,
var3" so for label names you should give them meaningful names as well.
Don't name them after the goto location, name them after the label
location to say the first thing that the label does.

err_fini:
err_acceptor_stop:
err_destroy_routes:
err_empty_list:
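
A minimal userspace sketch of the naming scheme, with hypothetical
step()/undo() helpers in place of the real LNet setup and teardown
calls.  Each label says what it undoes first, and the teardown helpers
do not touch rc, so the error code that caused the unwind is what gets
returned.

```c
#include <errno.h>

/* Counts resources currently held; for illustration only. */
static int held;

/* Hypothetical setup step; fail_at selects which step number fails. */
static int step(int n, int fail_at)
{
	if (n == fail_at)
		return -EIO;
	held++;
	return 0;
}

/* Hypothetical teardown step; releases one resource. */
static void undo(void)
{
	held--;
}

static int ni_init(int fail_at)
{
	int rc;

	rc = step(1, fail_at);		/* prepare */
	if (rc)
		return rc;

	rc = step(2, fail_at);		/* add routes */
	if (rc)
		goto err_unprepare;

	rc = step(3, fail_at);		/* start acceptor */
	if (rc)
		goto err_destroy_routes;

	return 0;

err_destroy_routes:
	undo();
err_unprepare:
	undo();
	return rc;
}
```

If a teardown call does return a status, keeping it in a separate
variable avoids overwriting rc, which is how the two asserts quoted
above end up contradicting each other.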

regards,
dan carpenter


* Re: [PATCH 12/40] staging: lustre: Dynamic LNet Configuration (DLC) IOCTL changes
  2015-11-20 23:35 ` [PATCH 12/40] staging: lustre: Dynamic LNet Configuration (DLC) IOCTL changes James Simmons
@ 2015-12-02  9:48   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02  9:48 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:48PM -0500, James Simmons wrote:
> +int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
> +			     __u32 *len)
> +{
> +	struct libcfs_ioctl_hdr hdr;
>  
> -	orig_len = hdr->ioc_len;
> -	if (copy_from_user(buf, arg, hdr->ioc_len))
> +	if (copy_from_user(&hdr, arg, sizeof(hdr)))
>  		return -EFAULT;
> -	if (orig_len != data->ioc_len)
> -		return -EINVAL;


This check was actually important.  I don't see where it was moved to so
it looks like this patch introduces a serious information leak.


>  
> -	if (libcfs_ioctl_is_invalid(data)) {
> -		CERROR("PORTALS: ioctl not correctly formatted\n");
> +	if (hdr.ioc_version != LIBCFS_IOCTL_VERSION) {
> +		CERROR("LNET: version mismatch expected %#x, got %#x\n",
> +		       LIBCFS_IOCTL_VERSION, hdr.ioc_version);
>  		return -EINVAL;
>  	}
>  
> -	if (data->ioc_inllen1)
> -		data->ioc_inlbuf1 = &data->ioc_bulk[0];
> +	*len = hdr.ioc_len;
>  
> -	if (data->ioc_inllen2)
> -		data->ioc_inlbuf2 = &data->ioc_bulk[0] +
> -			cfs_size_round(data->ioc_inllen1);
> +	return 0;
> +}
>  
> +int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr *buf, __u32 buf_len,
> +			  const void __user *arg)
> +{
> +	if (copy_from_user(buf, arg, buf_len))
> +		return -EFAULT;
>  	return 0;
>  }

Don't introduce this wrapper.  Abstraction layers just make the code
harder to read and obscure bugs.  Also the caller changes -EFAULT to
-EINVAL so right away it starts to be buggy.

>  
> diff --git a/drivers/staging/lustre/lustre/libcfs/module.c b/drivers/staging/lustre/lustre/libcfs/module.c
> index 75247e9..5348699 100644
> --- a/drivers/staging/lustre/lustre/libcfs/module.c
> +++ b/drivers/staging/lustre/lustre/libcfs/module.c
> @@ -54,6 +54,8 @@
>  
>  # define DEBUG_SUBSYSTEM S_LNET
>  
> +#define LIBCFS_MAX_IOCTL_BUF_LEN 2048
> +
>  #include "../../include/linux/libcfs/libcfs.h"
>  #include <asm/div64.h>
>  
> @@ -241,11 +243,21 @@ int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand)
>  }
>  EXPORT_SYMBOL(libcfs_deregister_ioctl);
>  
> -static int libcfs_ioctl_int(struct cfs_psdev_file *pfile, unsigned long cmd,
> -			    void *arg, struct libcfs_ioctl_data *data)
> +static int libcfs_ioctl_handle(struct cfs_psdev_file *pfile, unsigned long cmd,
> +			       void *arg, struct libcfs_ioctl_hdr *hdr)
>  {
> +	struct libcfs_ioctl_data *data = NULL;
>  	int err = -EINVAL;
>  
> +	if ((cmd <= IOC_LIBCFS_LNETST) ||
> +	    (cmd >= IOC_LIBCFS_REGISTER_MYNID)) {
> +		data = container_of(hdr, struct libcfs_ioctl_data, ioc_hdr);
> +		err = libcfs_ioctl_data_adjust(data);
> +		if (err != 0) {

Generally, remove pointless double negatives like this.  It should be
just "if (err) " instead of "if (err != 0 != 0 != 0 != 0) " or whatever.

> +			return err;
> +		}
> +	}
> +
>  	switch (cmd) {
>  	case IOC_LIBCFS_CLEAR_DEBUG:
>  		libcfs_debug_clear_buffer();
> @@ -280,11 +292,11 @@ static int libcfs_ioctl_int(struct cfs_psdev_file *pfile, unsigned long cmd,
>  		err = -EINVAL;
>  		down_read(&ioctl_list_sem);
>  		list_for_each_entry(hand, &ioctl_list, item) {
> -			err = hand->handle_ioctl(cmd, data);
> +			err = hand->handle_ioctl(cmd, hdr);
>  			if (err != -EINVAL) {
>  				if (err == 0)
>  					err = libcfs_ioctl_popdata(arg,
> -							data, sizeof(*data));
> +							hdr, hdr->ioc_len);

This variable has not been verified since the user wrote to it last so
here is the information leak.
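
A userspace model of the problem, as a sketch under stated assumptions:
memcpy() stands in for copy_from_user(), and fetch_ioctl(),
between_fetches and grow_len() are invented for the example.  The
length is read once to size the buffer and then the whole request is
read again, so the ioc_len inside the buffer can differ from the length
that was allocated.  If that unverified ioc_len is later used as the
size of the copy back to the user, the copy can run past the end of the
allocation.  Re-checking after the second fetch closes the gap.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IOCTL_BUF_LEN 2048	/* mirrors LIBCFS_MAX_IOCTL_BUF_LEN in the patch */

struct ioctl_hdr {
	uint32_t ioc_len;
	uint32_t ioc_version;
};

/*
 * Hypothetical hook that lets the example change "user" memory between
 * the two fetches, the way a racing thread could.
 */
static void (*between_fetches)(void *user);

static int fetch_ioctl(void *user, struct ioctl_hdr **out)
{
	struct ioctl_hdr hdr, *buf;

	memcpy(&hdr, user, sizeof(hdr));	/* first fetch: header only */
	if (hdr.ioc_len < sizeof(hdr) || hdr.ioc_len > MAX_IOCTL_BUF_LEN)
		return -EINVAL;

	buf = malloc(hdr.ioc_len);
	if (!buf)
		return -ENOMEM;

	if (between_fetches)
		between_fetches(user);

	memcpy(buf, user, hdr.ioc_len);		/* second fetch: whole request */
	if (buf->ioc_len != hdr.ioc_len) {	/* length changed under us */
		free(buf);
		return -EINVAL;
	}

	*out = buf;
	return 0;
}

/* Simulated attacker: enlarge ioc_len after the kernel has sized its buffer. */
static void grow_len(void *user)
{
	((struct ioctl_hdr *)user)->ioc_len = MAX_IOCTL_BUF_LEN;
}
```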

regards,
dan carpenter


* Re: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-11-20 23:35 ` [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command James Simmons
@ 2015-12-02 11:20   ` Dan Carpenter
  2015-12-15 18:14     ` Simmons, James A.
  2015-12-02 12:00   ` Dan Carpenter
  1 sibling, 1 reply; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 11:20 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:49PM -0500, James Simmons wrote:
> From: Amir Shehata <amir.shehata@intel.com>
> 
> This is the fifth patch of a set of patches that enables DLC.
> 
> This patch adds the new structures which will be used
> in the IOCTL communication.  It also added a set of
> show operations to show buffers, networks, statistics
> and peer information.
> 
> Signed-off-by: Amir Shehata <amir.shehata@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
> Change-Id: I96e5cb3dcf07289c6cd1deb46f4acb3c263ae21e
> Reviewed-on: http://review.whamcloud.com/8022
> Reviewed-by: John L. Hammond <john.hammond@intel.com>
> Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
> Reviewed-by: James Simmons <uja.ornl@gmail.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> ---
>  .../lustre/include/linux/libcfs/libcfs_ioctl.h     |   44 +++++++-
>  .../staging/lustre/include/linux/lnet/lib-dlc.h    |  118 ++++++++++++++++++++
>  .../staging/lustre/include/linux/lnet/lib-lnet.h   |    5 +
>  drivers/staging/lustre/lnet/lnet/api-ni.c          |   47 +++++++-
>  drivers/staging/lustre/lnet/lnet/module.c          |    4 +
>  drivers/staging/lustre/lnet/lnet/peer.c            |   61 ++++++++++
>  .../lustre/lustre/libcfs/linux/linux-module.c      |    3 +-
>  drivers/staging/lustre/lustre/libcfs/module.c      |   15 ++-
>  8 files changed, 282 insertions(+), 15 deletions(-)
>  create mode 100644 drivers/staging/lustre/include/linux/lnet/lib-dlc.h
> 
> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
> index e14788c..f24330d 100644
> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
> @@ -41,7 +41,8 @@
>  #ifndef __LIBCFS_IOCTL_H__
>  #define __LIBCFS_IOCTL_H__
>  
> -#define LIBCFS_IOCTL_VERSION 0x0001000a
> +#define LIBCFS_IOCTL_VERSION	0x0001000a
> +#define LIBCFS_IOCTL_VERSION2	0x0001000b
>  
>  struct libcfs_ioctl_hdr {
>  	__u32 ioc_len;
> @@ -87,6 +88,13 @@ do {						    \
>  	data.ioc_hdr.ioc_len = sizeof(data);			\
>  } while (0)
>  
> +#define LIBCFS_IOC_INIT_V2(data, hdr)			\
> +do {							\
> +	memset(&(data), 0, sizeof(data));		\
> +	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
> +	(data).hdr.ioc_len = sizeof(data);		\
> +} while (0)
> +

Do we really need this?

>  struct libcfs_ioctl_handler {
>  	struct list_head item;
>  	int (*handle_ioctl)(unsigned int cmd, struct libcfs_ioctl_hdr *hdr);
> @@ -112,9 +120,6 @@ struct libcfs_ioctl_handler {
>  /* lnet ioctls */
>  #define IOC_LIBCFS_GET_NI		  _IOWR('e', 50, long)
>  #define IOC_LIBCFS_FAIL_NID		_IOWR('e', 51, long)
> -#define IOC_LIBCFS_ADD_ROUTE	       _IOWR('e', 52, long)
> -#define IOC_LIBCFS_DEL_ROUTE	       _IOWR('e', 53, long)
> -#define IOC_LIBCFS_GET_ROUTE	       _IOWR('e', 54, long)
>  #define IOC_LIBCFS_NOTIFY_ROUTER	   _IOWR('e', 55, long)
>  #define IOC_LIBCFS_UNCONFIGURE	     _IOWR('e', 56, long)
>  #define IOC_LIBCFS_PORTALS_COMPATIBILITY   _IOWR('e', 57, long)
> @@ -137,7 +142,36 @@ struct libcfs_ioctl_handler {
>  #define IOC_LIBCFS_DEL_INTERFACE	   _IOWR('e', 79, long)
>  #define IOC_LIBCFS_GET_INTERFACE	   _IOWR('e', 80, long)
>  
> -#define IOC_LIBCFS_MAX_NR			     80
> +/*
> + * DLC Specific IOCTL numbers.
> + * In order to maintain backward compatibility with any possible external
> + * tools which might be accessing the IOCTL numbers, a new group of IOCTL
> + * number have been allocated.
> + */
> +#define IOCTL_CONFIG_SIZE		struct lnet_ioctl_config_data
> +#define IOC_LIBCFS_ADD_ROUTE		_IOWR(IOC_LIBCFS_TYPE, 81, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_DEL_ROUTE		_IOWR(IOC_LIBCFS_TYPE, 82, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_GET_ROUTE		_IOWR(IOC_LIBCFS_TYPE, 83, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_ADD_NET		_IOWR(IOC_LIBCFS_TYPE, 84, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_DEL_NET		_IOWR(IOC_LIBCFS_TYPE, 85, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_GET_NET		_IOWR(IOC_LIBCFS_TYPE, 86, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_CONFIG_RTR		_IOWR(IOC_LIBCFS_TYPE, 87, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_ADD_BUF		_IOWR(IOC_LIBCFS_TYPE, 88, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_GET_BUF		_IOWR(IOC_LIBCFS_TYPE, 89, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_GET_PEER_INFO	_IOWR(IOC_LIBCFS_TYPE, 90, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_GET_LNET_STATS	_IOWR(IOC_LIBCFS_TYPE, 91, \
> +					      IOCTL_CONFIG_SIZE)
> +#define IOC_LIBCFS_MAX_NR		91


Do it like this:

#define IOC_LIBCFS_DEL_ROUTE    _IOWR(IOC_LIBCFS_TYPE, 82, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_GET_ROUTE    _IOWR(IOC_LIBCFS_TYPE, 83, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_ADD_NET      _IOWR(IOC_LIBCFS_TYPE, 84, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_DEL_NET      _IOWR(IOC_LIBCFS_TYPE, 85, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_GET_NET      _IOWR(IOC_LIBCFS_TYPE, 86, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_CONFIG_RTR   _IOWR(IOC_LIBCFS_TYPE, 87, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_ADD_BUF      _IOWR(IOC_LIBCFS_TYPE, 88, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_GET_BUF      _IOWR(IOC_LIBCFS_TYPE, 89, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_GET_PEER_INFO  _IOWR(IOC_LIBCFS_TYPE, 90, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_GET_LNET_STATS _IOWR(IOC_LIBCFS_TYPE, 91, IOCTL_CONFIG_SIZE)
#define IOC_LIBCFS_MAX_NR               91

>  
>  static inline int libcfs_ioctl_packlen(struct libcfs_ioctl_data *data)
>  {
> diff --git a/drivers/staging/lustre/include/linux/lnet/lib-dlc.h b/drivers/staging/lustre/include/linux/lnet/lib-dlc.h
> new file mode 100644
> index 0000000..b6a2e91
> --- /dev/null
> +++ b/drivers/staging/lustre/include/linux/lnet/lib-dlc.h
> @@ -0,0 +1,118 @@
> +/*
> + * GPL HEADER START
> + *
> + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 only,
> + * as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 for more details (a copy is included
> + * in the LICENSE file that accompanied this code).
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 along with this program; If not, see
> + * http://www.gnu.org/licenses/gpl-2.0.html
> + *
> + * GPL HEADER END
> + *
> + * Contributers:
> + *   Amir Shehata
> + */
> +
> +#ifndef LNET_DLC_H
> +#define LNET_DLC_H
> +
> +#include "../libcfs/libcfs_ioctl.h"
> +#include "types.h"
> +
> +#define MAX_NUM_SHOW_ENTRIES	32
> +#define LNET_MAX_STR_LEN	128
> +#define LNET_MAX_SHOW_NUM_CPT	128
> +
> +struct lnet_ioctl_net_config {
> +	char ni_interfaces[LNET_MAX_INTERFACES][LNET_MAX_STR_LEN];
> +	__u32 ni_status;
> +	__u32 ni_cpts[LNET_MAX_SHOW_NUM_CPT];
> +};
> +
> +#define LNET_TINY_BUF_IDX	0
> +#define LNET_SMALL_BUF_IDX	1
> +#define LNET_LARGE_BUF_IDX	2
> +
> +/* # different router buffer pools */
> +#define LNET_NRBPOOLS		(LNET_LARGE_BUF_IDX + 1)
> +
> +struct lnet_ioctl_pool_cfg {
> +	struct {
> +		__u32 pl_npages;
> +		__u32 pl_nbuffers;
> +		__u32 pl_credits;
> +		__u32 pl_mincredits;
> +	} pl_pools[LNET_NRBPOOLS];
> +	__u32 pl_routing;
> +};
> +
> +struct lnet_ioctl_config_data {
> +	struct libcfs_ioctl_hdr cfg_hdr;
> +
> +	__u32 cfg_net;
> +	__u32 cfg_count;
> +	__u64 cfg_nid;
> +	__u32 cfg_ncpts;
> +
> +	union {
> +		struct {
> +			__u32 rtr_hop;
> +			__u32 rtr_priority;
> +			__u32 rtr_flags;
> +		} cfg_route;
> +		struct {
> +			char net_intf[LNET_MAX_STR_LEN];
> +			__s32 net_peer_timeout;
> +			__s32 net_peer_tx_credits;
> +			__s32 net_peer_rtr_credits;
> +			__s32 net_max_tx_credits;
> +			__u32 net_cksum_algo;
> +			__u32 net_pad;
> +		} cfg_net;
> +		struct {
> +			__u32 buf_enable;
> +			__s32 buf_tiny;
> +			__s32 buf_small;
> +			__s32 buf_large;
> +		} cfg_buffers;
> +	} cfg_config_u;
> +
> +	char cfg_bulk[0];
> +};
> +
> +struct lnet_ioctl_peer {
> +	struct libcfs_ioctl_hdr pr_hdr;
> +	__u32 pr_count;
> +	__u32 pr_pad;
> +	__u64 pr_nid;
> +
> +	union {
> +		struct {
> +			char cr_aliveness[LNET_MAX_STR_LEN];
> +			__u32 cr_refcount;
> +			__u32 cr_ni_peer_tx_credits;
> +			__u32 cr_peer_tx_credits;
> +			__u32 cr_peer_rtr_credits;
> +			__u32 cr_peer_min_rtr_credits;
> +			__u32 cr_peer_tx_qnob;
> +			__u32 cr_ncpt;
> +		} pr_peer_credits;
> +	} pr_lnd_u;
> +};
> +
> +struct lnet_ioctl_lnet_stats {
> +	struct libcfs_ioctl_hdr st_hdr;
> +	struct lnet_counters st_cntrs;
> +};
> +
> +#endif /* LNET_DLC_H */
> diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> index 94d0dc5..f2874e0 100644
> --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> @@ -694,6 +694,11 @@ void lnet_peer_tables_cleanup(lnet_ni_t *ni);
>  void lnet_peer_tables_destroy(void);
>  int lnet_peer_tables_create(void);
>  void lnet_debug_peer(lnet_nid_t nid);
> +int lnet_get_peers(int count, __u64 *nid, char *alivness,
> +		   int *ncpt, int *refcount,
> +		   int *ni_peer_tx_credits, int *peer_tx_credits,
> +		   int *peer_rtr_credits, int *peer_min_rtr_credtis,
> +		   int *peer_tx_qnob);
>  
>  static inline void
>  lnet_peer_set_alive(lnet_peer_t *lp)
> diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
> index 9661f6a..165345c 100644
> --- a/drivers/staging/lustre/lnet/lnet/api-ni.c
> +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
> @@ -39,6 +39,7 @@
>  #include <linux/ktime.h>
>  
>  #include "../../include/linux/lnet/lib-lnet.h"
> +#include "../../include/linux/lnet/lib-dlc.h"
>  
>  #define D_LNI D_CONSOLE
>  
> @@ -1741,6 +1742,7 @@ int
>  LNetCtl(unsigned int cmd, void *arg)
>  {
>  	struct libcfs_ioctl_data *data = arg;
> +	struct lnet_ioctl_config_data *config;
>  	lnet_process_id_t id = {0};
>  	lnet_ni_t *ni;
>  	int rc;
> @@ -1765,16 +1767,51 @@ LNetCtl(unsigned int cmd, void *arg)
>  		return (rc != 0) ? rc : lnet_check_routes();
>  
>  	case IOC_LIBCFS_DEL_ROUTE:
> +		config = arg;

I think you need to verify:

		if (config->cfg_hdr.ioc_len < sizeof(*config))
			return -EINVAL;


>  		mutex_lock(&the_lnet.ln_api_mutex);
> -		rc = lnet_del_route(data->ioc_net, data->ioc_nid);
> +		rc = lnet_del_route(config->cfg_net, config->cfg_nid);
>  		mutex_unlock(&the_lnet.ln_api_mutex);
>  		return rc;
>  
>  	case IOC_LIBCFS_GET_ROUTE:
> -		return lnet_get_route(data->ioc_count,
> -				      &data->ioc_net, &data->ioc_count,
> -				      &data->ioc_nid, &data->ioc_flags,
> -				      &data->ioc_priority);
> +		config = arg;

Verify ioc_len.

> +		return lnet_get_route(config->cfg_count,
> +				      &config->cfg_net,
> +				      &config->cfg_config_u.cfg_route.rtr_hop,
> +				      &config->cfg_nid,
> +				      &config->cfg_config_u.cfg_route.rtr_flags,
> +				      &config->cfg_config_u.cfg_route.
> +					rtr_priority);
> +
> +	case IOC_LIBCFS_ADD_NET:
> +		return 0;
> +
> +	case IOC_LIBCFS_DEL_NET:
> +		return 0;
> +
> +	case IOC_LIBCFS_GET_NET:
> +		return 0;
> +
> +	case IOC_LIBCFS_GET_LNET_STATS:
> +	{

Put this curly brace on the line before.

> +		struct lnet_ioctl_lnet_stats *lnet_stats = arg;
> +

Verify ioc_len is large enough.

> +		lnet_counters_get(&lnet_stats->st_cntrs);
> +		return 0;
> +	}
> +
> +	case IOC_LIBCFS_CONFIG_RTR:
> +		return 0;
> +
> +	case IOC_LIBCFS_ADD_BUF:
> +		return 0;
> +
> +	case IOC_LIBCFS_GET_BUF:
> +		return 0;
> +
> +	case IOC_LIBCFS_GET_PEER_INFO:
> +		return 0;
> +
>  	case IOC_LIBCFS_NOTIFY_ROUTER:
>  		secs_passed = (ktime_get_real_seconds() - data->ioc_u64[0]);
>  		return lnet_notify(NULL, data->ioc_nid, data->ioc_flags,
> diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c
> index 0afdad0..ffc5700 100644
> --- a/drivers/staging/lustre/lnet/lnet/module.c
> +++ b/drivers/staging/lustre/lnet/lnet/module.c
> @@ -36,6 +36,7 @@
>  
>  #define DEBUG_SUBSYSTEM S_LNET
>  #include "../../include/linux/lnet/lib-lnet.h"
> +#include "../../include/linux/lnet/lib-dlc.h"
>  
>  static int config_on_load;
>  module_param(config_on_load, int, 0444);
> @@ -95,6 +96,9 @@ lnet_ioctl(unsigned int cmd, struct libcfs_ioctl_hdr *hdr)
>  	case IOC_LIBCFS_UNCONFIGURE:
>  		return lnet_unconfigure();
>  
> +	case IOC_LIBCFS_ADD_NET:
> +		return LNetCtl(cmd, hdr);
> +
>  	default:
>  		/* Passing LNET_PID_ANY only gives me a ref if the net is up
>  		 * already; I'll need it to ensure the net can't go down while
> diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
> index bb5a0bb..1402e27 100644
> --- a/drivers/staging/lustre/lnet/lnet/peer.c
> +++ b/drivers/staging/lustre/lnet/lnet/peer.c
> @@ -39,6 +39,7 @@
>  #define DEBUG_SUBSYSTEM S_LNET
>  
>  #include "../../include/linux/lnet/lib-lnet.h"
> +#include "../../include/linux/lnet/lib-dlc.h"
>  
>  int
>  lnet_peer_tables_create(void)
> @@ -392,3 +393,63 @@ lnet_debug_peer(lnet_nid_t nid)
>  
>  	lnet_net_unlock(cpt);
>  }
> +
> +int lnet_get_peers(int count, __u64 *nid, char *aliveness,
> +		   int *ncpt, int *refcount,
> +		   int *ni_peer_tx_credits, int *peer_tx_credits,
> +		   int *peer_rtr_credits, int *peer_min_rtr_credits,
> +		   int *peer_tx_qnob)
> +{
> +	struct lnet_peer_table *peer_table;
> +	lnet_peer_t *lp;
> +	int j;
> +	int lncpt, found = 0;
> +
> +	/* get the number of CPTs */
> +	lncpt = cfs_percpt_number(the_lnet.ln_peer_tables);
> +
> +	/*
> +	 * if the cpt number to be examined is >= the number of cpts in
> +	 * the system then indicate that there are no more cpts to examin
> +	 */
> +	if (*ncpt > lncpt)
> +		return -1;

Add some documentation at the start of the function to say what -1 means
here.  Or, reading below, it looks like normal error codes were intended.

> +
> +	/* get the current table */
> +	peer_table = the_lnet.ln_peer_tables[*ncpt];
> +	/* if the ptable is NULL then there are no more cpts to examine */
> +	if (!peer_table)
> +		return -1;
> +
> +	lnet_net_lock(*ncpt);
> +
> +	for (j = 0; j < LNET_PEER_HASH_SIZE && !found; j++) {
> +		struct list_head *peers = &peer_table->pt_hash[j];
> +
> +		list_for_each_entry(lp, peers, lp_hashlist) {
> +			if (count-- > 0)
> +				continue;
> +
> +			snprintf(aliveness, LNET_MAX_STR_LEN, "NA");
> +			if (lnet_isrouter(lp) ||
> +			    lnet_peer_aliveness_enabled(lp))
> +				snprintf(aliveness, LNET_MAX_STR_LEN,
> +					 lp->lp_alive ? "up" : "down");
> +
> +			*nid = lp->lp_nid;
> +			*refcount = lp->lp_refcount;
> +			*ni_peer_tx_credits = lp->lp_ni->ni_peertxcredits;
> +			*peer_tx_credits = lp->lp_txcredits;
> +			*peer_rtr_credits = lp->lp_rtrcredits;
> +			*peer_min_rtr_credits = lp->lp_mintxcredits;
> +			*peer_tx_qnob = lp->lp_txqnob;
> +
> +			found = 1;
> +		}
> +	}
> +	lnet_net_unlock(*ncpt);
> +
> +	*ncpt = lncpt;
> +
> +	return found ? 0 : -ENOENT;
> +}

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 14/40] staging: lustre: fix crash due to NULL networks string
  2015-11-20 23:35 ` [PATCH 14/40] staging: lustre: fix crash due to NULL networks string James Simmons
@ 2015-12-02 11:27   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 11:27 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

This feels like we are fixing a bug introduced in PATCH 11 when we
removed a NULL check.  Don't introduce bugs and then fix them in the
same patchset; the fix has to be folded into the original patch.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-11-20 23:35 ` [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command James Simmons
  2015-12-02 11:20   ` Dan Carpenter
@ 2015-12-02 12:00   ` Dan Carpenter
  1 sibling, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 12:00 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:49PM -0500, James Simmons wrote:
> +int lnet_get_peers(int count, __u64 *nid, char *aliveness,
> +		   int *ncpt, int *refcount,
> +		   int *ni_peer_tx_credits, int *peer_tx_credits,
> +		   int *peer_rtr_credits, int *peer_min_rtr_credits,
> +		   int *peer_tx_qnob)
> +{
> +	struct lnet_peer_table *peer_table;
> +	lnet_peer_t *lp;
> +	int j;
> +	int lncpt, found = 0;
> +
> +	/* get the number of CPTs */
> +	lncpt = cfs_percpt_number(the_lnet.ln_peer_tables);
> +
> +	/*
> +	 * if the cpt number to be examined is >= the number of cpts in
> +	 * the system then indicate that there are no more cpts to examin
> +	 */
> +	if (*ncpt > lncpt)
> +		return -1;


The comment is correct but the code is off by one.

	if (*ncpt >= lncpt)
		return -EINVAL;

Also, -1 is not a correct error code.  I assume that you will review the
whole patchset again and fix all the -1 returns, of course.
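
To spell out the off-by-one, here is a minimal sketch with a hypothetical
helper (check_table_index() is not a function in the patch):

```c
#include <errno.h>

/*
 * Hypothetical helper: validate a table index against the number of
 * tables.  Valid indices are 0 .. ntables - 1, so ntables itself is
 * already out of range.
 */
static int check_table_index(unsigned int idx, unsigned int ntables)
{
	if (idx >= ntables)
		return -EINVAL;
	return 0;
}
```

With ">" instead of ">=", idx == ntables would pass the check and the
caller would index one element past the end of the table array.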

regards,
dan carpenter



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 15/40] staging: lustre: DLC user/kernel space glue code
  2015-11-20 23:35 ` [PATCH 15/40] staging: lustre: DLC user/kernel space glue code James Simmons
@ 2015-12-02 12:11   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 12:11 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:51PM -0500, James Simmons wrote:
> From: Amir Shehata <amir.shehata@intel.com>
> 
> This is the sixth patch of a set of patches that enables DLC.
> 
> This patch enables the user space to call into the kernel space
> DLC code.  Added handlers in the LNetCtl function to call
> the new functions added for Dynamic Lnet Configuration
> 
> Signed-off-by: Amir Shehata <amir.shehata@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2456
> Reviewed-on: http://review.whamcloud.com/8023
> Reviewed-by: James Simmons <uja.ornl@gmail.com>
> Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
> Reviewed-by: John L. Hammond <john.hammond@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> ---
>  .../staging/lustre/include/linux/lnet/lib-lnet.h   |   22 ++-
>  .../staging/lustre/include/linux/lnet/lib-types.h  |    8 +
>  drivers/staging/lustre/lnet/lnet/api-ni.c          |  172 ++++++++++++++++++--
>  drivers/staging/lustre/lnet/lnet/module.c          |   53 ++++++-
>  drivers/staging/lustre/lnet/lnet/peer.c            |   34 ++--
>  drivers/staging/lustre/lnet/lnet/router.c          |   71 +++++++--
>  6 files changed, 307 insertions(+), 53 deletions(-)
> 
> diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> index f2874e0..63919dd 100644
> --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> @@ -39,6 +39,7 @@
>  #include "api.h"
>  #include "lnet.h"
>  #include "lib-types.h"
> +#include "lib-dlc.h"
>  
>  extern lnet_t	the_lnet;	/* THE network */
>  
> @@ -456,6 +457,12 @@ int lnet_del_route(__u32 net, lnet_nid_t gw_nid);
>  void lnet_destroy_routes(void);
>  int lnet_get_route(int idx, __u32 *net, __u32 *hops,
>  		   lnet_nid_t *gateway, __u32 *alive, __u32 *priority);
> +int lnet_get_net_config(int idx, __u32 *cpt_count, __u64 *nid,
> +			int *peer_timeout, int *peer_tx_credits,
> +			int *peer_rtr_cr, int *max_tx_credits,
> +			struct lnet_ioctl_net_config *net_config);
> +int lnet_get_rtr_pool_cfg(int idx, struct lnet_ioctl_pool_cfg *pool_cfg);
> +
>  void lnet_router_debugfs_init(void);
>  void lnet_router_debugfs_fini(void);
>  int  lnet_rtrpools_alloc(int im_a_router);
> @@ -465,6 +472,10 @@ int lnet_rtrpools_enable(void);
>  void lnet_rtrpools_disable(void);
>  void lnet_rtrpools_free(int keep_pools);
>  lnet_remotenet_t *lnet_find_net_locked(__u32 net);
> +int lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
> +		    __s32 peer_timeout, __s32 peer_cr, __s32 peer_buf_cr,
> +		    __s32 credits);
> +int lnet_dyn_del_ni(__u32 net);
>  
>  int lnet_islocalnid(lnet_nid_t nid);
>  int lnet_islocalnet(__u32 net);
> @@ -694,11 +705,12 @@ void lnet_peer_tables_cleanup(lnet_ni_t *ni);
>  void lnet_peer_tables_destroy(void);
>  int lnet_peer_tables_create(void);
>  void lnet_debug_peer(lnet_nid_t nid);
> -int lnet_get_peers(int count, __u64 *nid, char *alivness,
> -		   int *ncpt, int *refcount,
> -		   int *ni_peer_tx_credits, int *peer_tx_credits,
> -		   int *peer_rtr_credits, int *peer_min_rtr_credtis,
> -		   int *peer_tx_qnob);
> +int lnet_get_peer_info(__u32 peer_index, __u64 *nid,
> +		       char alivness[LNET_MAX_STR_LEN],
> +		       __u32 *cpt_iter, __u32 *refcount,
> +		       __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits,
> +		       __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credtis,
> +		       __u32 *peer_tx_qnob);
>  
>  static inline void
>  lnet_peer_set_alive(lnet_peer_t *lp)
> diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
> index e7585b9..3282782 100644
> --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
> +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
> @@ -611,6 +611,14 @@ typedef struct {
>  	/* test protocol compatibility flags */
>  	int				  ln_testprotocompat;
>  
> +	/*
> +	 * 0 - load the NIs from the mod params
> +	 * 1 - do not load the NIs from the mod params
> +	 * Reverse logic to ensure that other calls to LNetNIInit
> +	 * need no change
> +	 */
> +	bool				  ln_nis_from_mod_params;
> +
>  } lnet_t;
>  
>  #endif
> diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
> index cc87900..125d018 100644
> --- a/drivers/staging/lustre/lnet/lnet/api-ni.c
> +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
> @@ -1542,7 +1542,9 @@ LNetNIInit(lnet_pid_t requested_pid)
>  	if (rc != 0)
>  		goto failed0;
>  
> -	rc = lnet_parse_networks(&net_head, lnet_get_networks());
> +	rc = lnet_parse_networks(&net_head,
> +				 !the_lnet.ln_nis_from_mod_params ?
> +				 lnet_get_networks() : "");
>  	if (rc < 0)
>  		goto failed1;
>  
> @@ -1657,6 +1659,93 @@ LNetNIFini(void)
>  }
>  EXPORT_SYMBOL(LNetNIFini);
>  
> +/**
> + * Grabs the ni data from the ni structure and fills the out
> + * parameters
> + *
> + * \param[in] ni network       interface structure
> + * \param[out] cpt_count       the number of cpts the ni is on
> + * \param[out] nid             Network Interface ID
> + * \param[out] peer_timeout    NI peer timeout
> + * \param[out] peer_tx_crdits  NI peer transmit credits
> + * \param[out] peer_rtr_credits NI peer router credits
> + * \param[out] max_tx_credits  NI max transmit credit
> + * \param[out] net_config      Network configuration
> + */
> +static void
> +lnet_fill_ni_info(struct lnet_ni *ni, __u32 *cpt_count, __u64 *nid,
> +		  int *peer_timeout, int *peer_tx_credits,
> +		  int *peer_rtr_credits, int *max_tx_credits,
> +		  struct lnet_ioctl_net_config *net_config)
> +{
> +	int i;
> +
> +	if (!ni)
> +		return;
> +
> +	if (!net_config)
> +		return;
> +
> +	CLASSERT(ARRAY_SIZE(ni->ni_interfaces) ==
> +		 ARRAY_SIZE(net_config->ni_interfaces));

The kernel has a macro for this: BUILD_BUG_ON().
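
Note that BUILD_BUG_ON() breaks the build when its condition is true, so
the CLASSERT() condition has to be negated.  A userspace sketch, where
MY_BUILD_BUG_ON(), MY_ARRAY_SIZE() and both structs are stand-ins I made
up for illustration:

```c
#include <stddef.h>

/* Userspace stand-in for the kernel macro: fail the build if cond is true. */
#define MY_BUILD_BUG_ON(cond) _Static_assert(!(cond), "BUILD_BUG_ON: " #cond)
#define MY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical structures whose interface arrays must match in size. */
struct ni { char *ni_interfaces[16]; };
struct net_config { char ni_interfaces[16][16]; };

static size_t interface_slots(void)
{
	struct ni *ni = NULL;
	struct net_config *cfg = NULL;

	/* sizeof does not evaluate its operand, so NULL is safe here. */
	MY_BUILD_BUG_ON(MY_ARRAY_SIZE(ni->ni_interfaces) !=
			MY_ARRAY_SIZE(cfg->ni_interfaces));

	return MY_ARRAY_SIZE(ni->ni_interfaces);
}
```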

> +
> +	if (ni->ni_interfaces[0]) {

Couldn't we just break at the first NULL?

> +		for (i = 0; i < ARRAY_SIZE(ni->ni_interfaces); i++) {
> +			if (ni->ni_interfaces[i]) {
> +				strncpy(net_config->ni_interfaces[i],
> +					ni->ni_interfaces[i],
> +					sizeof(net_config->ni_interfaces[i]));
> +			}
> +		}
> +	}
> +


	for (i = 0; i < ARRAY_SIZE(ni->ni_interfaces); i++) {
		if (!ni->ni_interfaces[i])
			break;
		strncpy(net_config->ni_interfaces[i],
			ni->ni_interfaces[i],
			sizeof(net_config->ni_interfaces[i]));
	}
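
The same loop as a self-contained userspace sketch.  The array sizes and
the name copy_interfaces() are hypothetical, and I am assuming the
destination is zeroed by the caller so that copying at most NAME_LEN - 1
bytes leaves every name NUL-terminated:

```c
#include <stddef.h>
#include <string.h>

#define NUM_INTERFACES 4	/* hypothetical array size */
#define NAME_LEN 16		/* hypothetical name buffer size */

/*
 * Copy interface names until the first NULL slot and return how many
 * were copied.  dst is assumed to be zeroed by the caller.
 */
static size_t copy_interfaces(char dst[NUM_INTERFACES][NAME_LEN],
			      char *const src[NUM_INTERFACES])
{
	size_t i;

	for (i = 0; i < NUM_INTERFACES; i++) {
		if (!src[i])
			break;
		strncpy(dst[i], src[i], NAME_LEN - 1);
	}
	return i;
}
```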


> +	*nid = ni->ni_nid;
> +	*peer_timeout = ni->ni_peertimeout;
> +	*peer_tx_credits = ni->ni_peertxcredits;
> +	*peer_rtr_credits = ni->ni_peerrtrcredits;
> +	*max_tx_credits = ni->ni_maxtxcredits;
> +
> +	net_config->ni_status = ni->ni_status->ns_status;
> +
> +	for (i = 0;
> +	     ni->ni_cpts && i < ni->ni_ncpts &&
> +	     i < LNET_MAX_SHOW_NUM_CPT;

This is really sort of a crap condition and either you are doing it to
please an inferior static checker (don't write nonsense code to please
the static checker) or there is something very confused about your code
and you should fix it elsewhere.  The loop here should be:

	for (i = 0; i < ni->ni_ncpts; i++)
		net_config->ni_cpts[i] = ni->ni_cpts[i];


> +	     i++)
> +		net_config->ni_cpts[i] = ni->ni_cpts[i];
> +
> +	*cpt_count = ni->ni_ncpts;
> +}
> +
> +int
> +lnet_get_net_config(int idx, __u32 *cpt_count, __u64 *nid, int *peer_timeout,
> +		    int *peer_tx_credits, int *peer_rtr_credits,
> +		    int *max_tx_credits,
> +		    struct lnet_ioctl_net_config *net_config)
> +{
> +	struct lnet_ni *ni;
> +	struct list_head *tmp;
> +	int cpt;
> +	int rc = -ENOENT;
> +
> +	cpt = lnet_net_lock_current();
> +
> +	list_for_each(tmp, &the_lnet.ln_nis) {
> +		ni = list_entry(tmp, lnet_ni_t, ni_list);
> +		if (idx-- == 0) {
> +			rc = 0;
> +			lnet_ni_lock(ni);
> +			lnet_fill_ni_info(ni, cpt_count, nid, peer_timeout,
> +					  peer_tx_credits, peer_rtr_credits,
> +					  max_tx_credits, net_config);
> +			lnet_ni_unlock(ni);
> +			break;
> +		}
> +	}

Write it something like this:

	i = 0;
	list_for_each(tmp, &the_lnet.ln_nis) {
		if (i++ != idx)
			continue;

		ni = list_entry(tmp, lnet_ni_t, ni_list);
		lnet_ni_lock(ni);
		lnet_fill_ni_info(ni, cpt_count, nid, peer_timeout,
				  peer_tx_credits, peer_rtr_credits,
				  max_tx_credits, net_config);
		lnet_ni_unlock(ni);
		rc = 0;
		break;
	}
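
Stripped of the locking, the pattern looks like this on a plain singly
linked list (struct node and get_nth() are made up for the example):

```c
#include <errno.h>
#include <stddef.h>

struct node {			/* hypothetical list element */
	int value;
	struct node *next;
};

/* Store the value of the idx-th element in *out, or return -ENOENT. */
static int get_nth(const struct node *head, int idx, int *out)
{
	const struct node *n;
	int rc = -ENOENT;
	int i = 0;

	for (n = head; n; n = n->next) {
		if (i++ != idx)
			continue;

		*out = n->value;
		rc = 0;
		break;
	}
	return rc;
}
```

The idx parameter is never modified, the interesting work sits at one
indent level, and rc = 0 is right next to the break.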

> +
> +	lnet_net_unlock(cpt);
> +	return rc;
> +}
> +
>  int
>  lnet_dyn_add_ni(lnet_pid_t requested_pid, char *nets,
>  		__s32 peer_timeout, __s32 peer_cr, __s32 peer_buf_cr,
> @@ -1757,9 +1846,13 @@ LNetCtl(unsigned int cmd, void *arg)
>  		return lnet_fail_nid(data->ioc_nid, data->ioc_count);
>  
>  	case IOC_LIBCFS_ADD_ROUTE:
> +		config = arg;

Check ioc_len.

>  		mutex_lock(&the_lnet.ln_api_mutex);
> -		rc = lnet_add_route(data->ioc_net, data->ioc_count,
> -				    data->ioc_nid, data->ioc_priority);
> +		rc = lnet_add_route(config->cfg_net,
> +				    config->cfg_config_u.cfg_route.rtr_hop,
> +				    config->cfg_nid,
> +				    config->cfg_config_u.cfg_route.
> +					rtr_priority);
>  		mutex_unlock(&the_lnet.ln_api_mutex);
>  		return (rc != 0) ? rc : lnet_check_routes();
>  
> @@ -1780,14 +1873,28 @@ LNetCtl(unsigned int cmd, void *arg)
>  				      &config->cfg_config_u.cfg_route.
>  					rtr_priority);
>  
> -	case IOC_LIBCFS_ADD_NET:
> -		return 0;
> -
> -	case IOC_LIBCFS_DEL_NET:
> -		return 0;
> +	case IOC_LIBCFS_GET_NET: {
> +		struct lnet_ioctl_net_config *net_config;
>  
> -	case IOC_LIBCFS_GET_NET:
> -		return 0;
> +		config = arg;

Check ioc_len.

> +		net_config = (struct lnet_ioctl_net_config *)
> +				config->cfg_bulk;
> +		if (!config || !net_config)

We already dereferenced "config" so it's a bit late to check for NULL.
But that's an interesting question: do we need to add NULL checks
everywhere?  I forget.  Please check this and reply to the email thread.

> +			return -1;

This is not a correct error code.  return -EINVAL.

> +
> +		return lnet_get_net_config(config->cfg_count,
> +					   &config->cfg_ncpts,
> +					   &config->cfg_nid,
> +					   &config->cfg_config_u.cfg_net.
> +						net_peer_timeout,
> +					   &config->cfg_config_u.cfg_net.
> +						net_peer_tx_credits,
> +					   &config->cfg_config_u.cfg_net.
> +						net_peer_rtr_credits,
> +					   &config->cfg_config_u.cfg_net.
> +						net_max_tx_credits,
> +					   net_config);

Breaking it up like this is nasty.

		return lnet_get_net_config(
				config->cfg_count,
				&config->cfg_ncpts,
				&config->cfg_nid,
				&config->cfg_config_u.cfg_net.net_peer_timeout,
				&config->cfg_config_u.cfg_net.net_peer_tx_credits,
				&config->cfg_config_u.cfg_net.net_peer_rtr_credits,
				&config->cfg_config_u.cfg_net.net_max_tx_credits,
				net_config);

It violates checkpatch.pl rules, but don't write ugly code just to make
a tool happy.


> +	}
>  
>  	case IOC_LIBCFS_GET_LNET_STATS:
>  	{
> @@ -1798,16 +1905,51 @@ LNetCtl(unsigned int cmd, void *arg)
>  	}
>  
>  	case IOC_LIBCFS_CONFIG_RTR:
> +		config = arg;

ioc_len.

> +		mutex_lock(&the_lnet.ln_api_mutex);
> +		if (config->cfg_config_u.cfg_buffers.buf_enable) {
> +			rc = lnet_rtrpools_enable();
> +			mutex_unlock(&the_lnet.ln_api_mutex);
> +			return rc;
> +		}
> +		lnet_rtrpools_disable();
> +		mutex_unlock(&the_lnet.ln_api_mutex);
>  		return 0;
>  
>  	case IOC_LIBCFS_ADD_BUF:
> -		return 0;
> +		config = arg;

ioc_len.

> +		mutex_lock(&the_lnet.ln_api_mutex);
> +		rc = lnet_rtrpools_adjust(config->cfg_config_u.cfg_buffers.
> +						buf_tiny,
> +					  config->cfg_config_u.cfg_buffers.
> +						buf_small,
> +					  config->cfg_config_u.cfg_buffers.
> +						buf_large);

Ugh.

> +		mutex_unlock(&the_lnet.ln_api_mutex);
> +		return rc;
>  
> -	case IOC_LIBCFS_GET_BUF:
> -		return 0;
> +	case IOC_LIBCFS_GET_BUF: {
> +		struct lnet_ioctl_pool_cfg *pool_cfg;
>  
> -	case IOC_LIBCFS_GET_PEER_INFO:
> -		return 0;
> +		config = arg;

ioc_len.

> +		pool_cfg = (struct lnet_ioctl_pool_cfg *)config->cfg_bulk;
> +		return lnet_get_rtr_pool_cfg(config->cfg_count, pool_cfg);
> +	}
> +
> +	case IOC_LIBCFS_GET_PEER_INFO: {
> +		struct lnet_ioctl_peer *peer_info = arg;


ioc_len.


> +
> +		return lnet_get_peer_info(peer_info->pr_count,
> +			&peer_info->pr_nid,
> +			peer_info->pr_lnd_u.pr_peer_credits.cr_aliveness,
> +			&peer_info->pr_lnd_u.pr_peer_credits.cr_ncpt,
> +			&peer_info->pr_lnd_u.pr_peer_credits.cr_refcount,
> +			&peer_info->pr_lnd_u.pr_peer_credits.cr_ni_peer_tx_credits,
> +			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_credits,
> +			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_rtr_credits,
> +			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_min_rtr_credits,
> +			&peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_qnob);
> +	}
>  
>  	case IOC_LIBCFS_NOTIFY_ROUTER:
>  		secs_passed = (ktime_get_real_seconds() - data->ioc_u64[0]);
> diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c
> index ffc5700..281315c 100644
> --- a/drivers/staging/lustre/lnet/lnet/module.c
> +++ b/drivers/staging/lustre/lnet/lnet/module.c
> @@ -84,20 +84,69 @@ lnet_unconfigure(void)
>  	return (refcount == 0) ? 0 : -EBUSY;
>  }
>  
> +int
> +lnet_dyn_configure(struct libcfs_ioctl_hdr *hdr)
> +{
> +	struct lnet_ioctl_config_data *conf =
> +		(struct lnet_ioctl_config_data *)hdr;
> +	int rc;
> +

We never checked ioc_len before calling this function so it could oops
if we read unmapped memory.

> +	mutex_lock(&lnet_config_mutex);
> +	if (the_lnet.ln_niinit_self)

Flip this around:

	if (!the_lnet.ln_niinit_self) {
		rc = -EINVAL;
		goto unlock;
	}

	rc = lnet_dyn_add_ni(LUSTRE_SRV_LNET_PID, ...
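
The overall shape, as a userspace sketch.  The lock is faked with a
counter and add_ni() is a hypothetical worker that always succeeds; only
the control flow is the point:

```c
#include <errno.h>
#include <stdbool.h>

static int lock_depth;		/* stand-in for a real mutex */
static bool niinit_self;	/* stand-in for the_lnet.ln_niinit_self */

static void fake_lock(void)   { lock_depth++; }
static void fake_unlock(void) { lock_depth--; }

/* Hypothetical worker; pretend adding the NI always succeeds. */
static int add_ni(int net) { (void)net; return 0; }

static int dyn_configure(int net)
{
	int rc;

	fake_lock();
	if (!niinit_self) {
		rc = -EINVAL;
		goto unlock;
	}

	rc = add_ni(net);
unlock:
	fake_unlock();
	return rc;
}
```

The failure case is handled immediately and the success path stays at
one indent level, with a single unlock on the way out.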

> +		rc = lnet_dyn_add_ni(LUSTRE_SRV_LNET_PID,
> +				     conf->cfg_config_u.cfg_net.net_intf,
> +				     conf->cfg_config_u.cfg_net.
> +					net_peer_timeout,
> +				     conf->cfg_config_u.cfg_net.
> +					net_peer_tx_credits,
> +				     conf->cfg_config_u.cfg_net.
> +					net_peer_rtr_credits,
> +				     conf->cfg_config_u.cfg_net.
> +					net_max_tx_credits);
> +	else
> +		rc = -EINVAL;
> +	mutex_unlock(&lnet_config_mutex);
> +	return rc;
> +}
> +
> +int
> +lnet_dyn_unconfigure(struct libcfs_ioctl_hdr *hdr)
> +{
> +	struct lnet_ioctl_config_data *conf =
> +		(struct lnet_ioctl_config_data *)hdr;
> +	int rc;
> +
> +	mutex_lock(&lnet_config_mutex);
> +	if (the_lnet.ln_niinit_self)

Always consistently do error handling instead of success handling.

	if (!the_lnet.ln_niinit_self) {
		rc = -EINVAL;
		goto unlock;
	}

> +		rc = lnet_dyn_del_ni(conf->cfg_net);
> +	else
> +		rc = -EINVAL;
> +	mutex_unlock(&lnet_config_mutex);
> +
> +	return rc;
> +}
> +
>  static int
>  lnet_ioctl(unsigned int cmd, struct libcfs_ioctl_hdr *hdr)
>  {
>  	int rc;
>  
>  	switch (cmd) {
> -	case IOC_LIBCFS_CONFIGURE:
> +	case IOC_LIBCFS_CONFIGURE: {
> +		struct libcfs_ioctl_data *data =
> +			(struct libcfs_ioctl_data *)hdr;

ioc_len.

> +		the_lnet.ln_nis_from_mod_params = data->ioc_flags;
>  		return lnet_configure(NULL);
> +	}
>  
>  	case IOC_LIBCFS_UNCONFIGURE:
>  		return lnet_unconfigure();
>  
>  	case IOC_LIBCFS_ADD_NET:
> -		return LNetCtl(cmd, hdr);
> +		return lnet_dyn_configure(hdr);
> +
> +	case IOC_LIBCFS_DEL_NET:
> +		return lnet_dyn_unconfigure(hdr);

ioc_len etc.

>  
>  	default:
>  		/* Passing LNET_PID_ANY only gives me a ref if the net is up
> diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
> index 1402e27..3b71812 100644
> --- a/drivers/staging/lustre/lnet/lnet/peer.c
> +++ b/drivers/staging/lustre/lnet/lnet/peer.c
> @@ -394,16 +394,18 @@ lnet_debug_peer(lnet_nid_t nid)
>  	lnet_net_unlock(cpt);
>  }
>  
> -int lnet_get_peers(int count, __u64 *nid, char *aliveness,
> -		   int *ncpt, int *refcount,
> -		   int *ni_peer_tx_credits, int *peer_tx_credits,
> -		   int *peer_rtr_credits, int *peer_min_rtr_credits,
> -		   int *peer_tx_qnob)
> +int
> +lnet_get_peer_info(__u32 peer_index, __u64 *nid,
> +		   char aliveness[LNET_MAX_STR_LEN],
> +		   __u32 *cpt_iter, __u32 *refcount,
> +		   __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits,
> +		   __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credits,
> +		   __u32 *peer_tx_qnob)
>  {
>  	struct lnet_peer_table *peer_table;
>  	lnet_peer_t *lp;
> -	int j;
> -	int lncpt, found = 0;
> +	bool found = false;

Unrelated.

> +	int lncpt, j;
>  
>  	/* get the number of CPTs */
>  	lncpt = cfs_percpt_number(the_lnet.ln_peer_tables);
> @@ -412,22 +414,22 @@ int lnet_get_peers(int count, __u64 *nid, char *aliveness,
>  	 * if the cpt number to be examined is >= the number of cpts in
>  	 * the system then indicate that there are no more cpts to examin
>  	 */
> -	if (*ncpt > lncpt)
> -		return -1;
> +	if (*cpt_iter > lncpt)
> +		return -ENOENT;

Don't mix unrelated changes.

>  
>  	/* get the current table */
> -	peer_table = the_lnet.ln_peer_tables[*ncpt];
> +	peer_table = the_lnet.ln_peer_tables[*cpt_iter];
>  	/* if the ptable is NULL then there are no more cpts to examine */
>  	if (!peer_table)
> -		return -1;
> +		return -ENOENT;
>  
> -	lnet_net_lock(*ncpt);
> +	lnet_net_lock(*cpt_iter);
>  
>  	for (j = 0; j < LNET_PEER_HASH_SIZE && !found; j++) {
>  		struct list_head *peers = &peer_table->pt_hash[j];
>  
>  		list_for_each_entry(lp, peers, lp_hashlist) {
> -			if (count-- > 0)
> +			if (peer_index-- > 0)
>  				continue;
>  
>  			snprintf(aliveness, LNET_MAX_STR_LEN, "NA");
> @@ -444,12 +446,12 @@ int lnet_get_peers(int count, __u64 *nid, char *aliveness,
>  			*peer_min_rtr_credits = lp->lp_mintxcredits;
>  			*peer_tx_qnob = lp->lp_txqnob;
>  
> -			found = 1;
> +			found = true;
>  		}
>  	}
> -	lnet_net_unlock(*ncpt);
> +	lnet_net_unlock(*cpt_iter);
>  
> -	*ncpt = lncpt;
> +	*cpt_iter = lncpt;
>  
>  	return found ? 0 : -ENOENT;
>  }
> diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
> index 749085f..17e6795 100644
> --- a/drivers/staging/lustre/lnet/lnet/router.c
> +++ b/drivers/staging/lustre/lnet/lnet/router.c
> @@ -541,6 +541,42 @@ lnet_destroy_routes(void)
>  	lnet_del_route(LNET_NIDNET(LNET_NID_ANY), LNET_NID_ANY);
>  }
>  
> +int lnet_get_rtr_pool_cfg(int idx, struct lnet_ioctl_pool_cfg *pool_cfg)
> +{
> +	int i, rc = -ENOENT, lidx, j;
> +
> +	if (!the_lnet.ln_rtrpools)
> +		return rc;

Use literals wherever possible.

		return -ENOENT;


> +
> +	for (i = 0; i < LNET_NRBPOOLS; i++) {
> +		lnet_rtrbufpool_t *rbp;
> +
> +		lnet_net_lock(LNET_LOCK_EX);
> +		lidx = idx;
> +		cfs_percpt_for_each(rbp, j, the_lnet.ln_rtrpools) {
> +			if (lidx-- == 0) {

Flip this around so you're not smooshed against the right side of the
screen.

			if (i++ != idx)
				continue;

Don't modify the original parameter.  That way you can use it again and
also it makes the code easier to understand.

> +				rc = 0;

Put the rc = 0 at the end next to the break.  rc is used by the return,
so keep them close together so that the code is easier to understand.

> +				pool_cfg->pl_pools[i].pl_npages =
> +					rbp[i].rbp_npages;
> +				pool_cfg->pl_pools[i].pl_nbuffers =
> +					rbp[i].rbp_nbuffers;
> +				pool_cfg->pl_pools[i].pl_credits =
> +					rbp[i].rbp_credits;
> +				pool_cfg->pl_pools[i].pl_mincredits =
> +					rbp[i].rbp_mincredits;
> +				break;
> +			}
> +		}
> +		lnet_net_unlock(LNET_LOCK_EX);
> +	}
> +
> +	lnet_net_lock(LNET_LOCK_EX);
> +	pool_cfg->pl_routing = the_lnet.ln_routing;
> +	lnet_net_unlock(LNET_LOCK_EX);
> +
> +	return rc;
> +}
> +
>  int
>  lnet_get_route(int idx, __u32 *net, __u32 *hops,
>  	       lnet_nid_t *gateway, __u32 *alive, __u32 *priority)
> @@ -1531,8 +1567,8 @@ lnet_rtrpools_alloc(int im_a_router)
>  	return rc;
>  }
>  
> -int
> -lnet_rtrpools_adjust(int tiny, int small, int large)
> +static int
> +lnet_rtrpools_adjust_helper(int tiny, int small, int large)
>  {
>  	int nrb = 0;
>  	int rc = 0;
> @@ -1540,19 +1576,10 @@ lnet_rtrpools_adjust(int tiny, int small, int large)
>  	lnet_rtrbufpool_t *rtrp;
>  
>  	/*
> -	 * this function doesn't revert the changes if adding new buffers
> -	 * failed.  It's up to the user space caller to revert the
> -	 * changes.
> -	 */
> -
> -	if (!the_lnet.ln_routing)
> -		return 0;
> -
> -	/*
>  	 * If the provided values for each buffer pool are different than the
>  	 * configured values, we need to take action.
>  	 */
> -	if (tiny >= 0 && tiny != tiny_router_buffers) {
> +	if (tiny >= 0) {


Unrelated cleanups.  And below.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 19/40] staging: lustre: copy out libcfs ioctl inline buffer
  2015-11-20 23:35 ` [PATCH 19/40] staging: lustre: copy out libcfs ioctl inline buffer James Simmons
@ 2015-12-02 12:34   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 12:34 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Liang Zhen, Linux Kernel Mailing List, lustre-devel

On Fri, Nov 20, 2015 at 06:35:55PM -0500, James Simmons wrote:
> From: Liang Zhen <liang.zhen@intel.com>
> 
>   - libcfs_ioctl_popdata should copy out inline buffers.
>   - code cleanup for libcfs ioctl handler
>   - error number fix for obd_ioctl_getdata
>   - add new function libcfs_ioctl_unpack for upcoming patches
> 

Without looking at the patch, I can already tell you it should be four
separate patches instead of one.  Don't mix bug fixes and cleanups.
This should not be a new rule for anyone.

Guys, what the actual heck is going on???  This patchset is making me
really depressed and I'm not even half way through.

> Signed-off-by: Liang Zhen <liang.zhen@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5435
> Reviewed-on: http://review.whamcloud.com/11313
> Reviewed-by: Bobi Jam <bobijam@gmail.com>
> Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> ---
>  .../lustre/include/linux/libcfs/libcfs_ioctl.h     |   24 +++-
>  drivers/staging/lustre/lnet/lnet/api-ni.c          |    2 +
>  .../lustre/lustre/libcfs/linux/linux-module.c      |   45 +++++---
>  drivers/staging/lustre/lustre/libcfs/module.c      |  119 ++++++++------------
>  .../lustre/lustre/obdclass/linux/linux-module.c    |   17 +--
>  5 files changed, 97 insertions(+), 110 deletions(-)
> 
> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
> index f24330d..3468933 100644
> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
> @@ -49,6 +49,9 @@ struct libcfs_ioctl_hdr {
>  	__u32 ioc_version;
>  };
>  
> +/** max size to copy from userspace */
> +#define LIBCFS_IOC_DATA_MAX	(128 * 1024)
> +
>  struct libcfs_ioctl_data {
>  	struct libcfs_ioctl_hdr ioc_hdr;
>  
> @@ -240,11 +243,22 @@ static inline bool libcfs_ioctl_is_invalid(struct libcfs_ioctl_data *data)
>  
>  int libcfs_register_ioctl(struct libcfs_ioctl_handler *hand);
>  int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand);
> -int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr *buf, __u32 buf_len,
> -			 const void __user *arg);
> -int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
> -			     __u32 *buf_len);
> -int libcfs_ioctl_popdata(void *arg, void *buf, int size);
> +int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
> +			 struct libcfs_ioctl_hdr __user *uparam);
> +
> +static inline int libcfs_ioctl_popdata(struct libcfs_ioctl_hdr *hdr,
> +				       struct libcfs_ioctl_hdr __user *uparam)
> +{
> +	if (copy_to_user(uparam, hdr, hdr->ioc_len))
> +		return -EFAULT;
> +	return 0;
> +}

No.  Don't do this.

> +
> +static inline void libcfs_ioctl_freedata(struct libcfs_ioctl_hdr *hdr)
> +{
> +	LIBCFS_FREE(hdr, hdr->ioc_len);
> +}

No.  We need to transition to kmalloc() and kfree() instead of adding
even more abstraction layers.  In this patchset we add new calls to
ALLOC() when we should be using normal kernel functions like kstrdup()
but we can't because then we wouldn't have a size parameter to pass to
LIBCFS_FREE().

> +
>  int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data);
>  
>  #endif /* __LIBCFS_IOCTL_H__ */
> diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
> index 949fa2f..4c4e6d3 100644
> --- a/drivers/staging/lustre/lnet/lnet/api-ni.c
> +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
> @@ -1838,6 +1838,8 @@ LNetCtl(unsigned int cmd, void *arg)
>  	int rc;
>  	unsigned long secs_passed;
>  
> +	CLASSERT(sizeof(struct lnet_ioctl_net_config) +
> +		 sizeof(struct lnet_ioctl_config_data) < LIBCFS_IOC_DATA_MAX);


BUILD_BUG_ON().

>  	LASSERT(the_lnet.ln_init);
>  
>  	switch (cmd) {
> diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
> index 1c31e2e..50a5464 100644
> --- a/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
> +++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-module.c
> @@ -43,7 +43,7 @@
>  int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data)
>  {
>  	if (libcfs_ioctl_is_invalid(data)) {
> -		CERROR("LNET: ioctl not correctly formatted\n");
> +		CERROR("libcfs ioctl: parameter not correctly formatted\n");
>  		return -EINVAL;
>  	}
>  
> @@ -57,39 +57,46 @@ int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data)
>  	return 0;
>  }
>  
> -int libcfs_ioctl_getdata_len(const struct libcfs_ioctl_hdr __user *arg,
> -			     __u32 *len)
> +int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
> +			 struct libcfs_ioctl_hdr __user *uhdr)
>  {
>  	struct libcfs_ioctl_hdr hdr;
> +	int err = 0;
>  
> -	if (copy_from_user(&hdr, arg, sizeof(hdr)))
> +	if (copy_from_user(&hdr, uhdr, sizeof(hdr)))
>  		return -EFAULT;
>  
>  	if (hdr.ioc_version != LIBCFS_IOCTL_VERSION &&
>  	    hdr.ioc_version != LIBCFS_IOCTL_VERSION2) {
> -		CERROR("LNET: version mismatch expected %#x, got %#x\n",
> +		CERROR("libcfs ioctl: version mismatch expected %#x, got %#x\n",
>  		       LIBCFS_IOCTL_VERSION, hdr.ioc_version);
>  		return -EINVAL;
>  	}
>  
> -	*len = hdr.ioc_len;
> +	if (hdr.ioc_len < sizeof(struct libcfs_ioctl_data)) {
> +		CERROR("libcfs ioctl: user buffer too small for ioctl\n");
> +		return -EINVAL;
> +	}

It's sort of good to check here, but we will need to add all the other
checks I mentioned as well.  Also, don't introduce bugs and fix them in
the same patchset; the fix has to be folded into the buggy patch.

>  
> -	return 0;
> -}
> +	if (hdr.ioc_len > LIBCFS_IOC_DATA_MAX) {
> +		CERROR("libcfs ioctl: user buffer is too large %d/%d\n",
> +		       hdr.ioc_len, LIBCFS_IOC_DATA_MAX);
> +		return -EINVAL;
> +	}
>  
> -int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr *buf, __u32 buf_len,
> -			  const void __user *arg)
> -{
> -	if (copy_from_user(buf, arg, buf_len))
> -		return -EFAULT;
> -	return 0;
> -}
> +	LIBCFS_ALLOC(*hdr_pp, hdr.ioc_len);
> +	if (!*hdr_pp)
> +		return -ENOMEM;
> +
> +	if (copy_from_user(*hdr_pp, uhdr, hdr.ioc_len)) {
> +		err = -EFAULT;
> +		goto failed;
> +	}

We still need to re-check hdr.ioc_len after re-reading it from the user.
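
The problem is a double fetch: the length is validated from the first
copy, but the second copy reads it from user memory again, and userspace
may have changed it in between.  A rough userspace simulation follows.
Everything in it is hypothetical (struct ioc_hdr, IOC_DATA_MAX,
ioc_getdata(), fake_copy_from_user() and the race_target/race_value test
hook); fake_copy_from_user() is just memcpy() plus a hook that lets a
test play the part of the racing thread.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define IOC_DATA_MAX 8192	/* hypothetical upper bound */

struct ioc_hdr {
	uint32_t ioc_len;
	uint32_t ioc_version;
};

/* Test hook: when set, "userspace" rewrites ioc_len after the next copy. */
static void *race_target;
static uint32_t race_value;

/* Stand-in for copy_from_user(); returns the number of bytes NOT copied. */
static unsigned long fake_copy_from_user(void *to, const void *from,
					 unsigned long n)
{
	memcpy(to, from, n);
	if (race_target) {
		memcpy(race_target, &race_value, sizeof(race_value));
		race_target = NULL;
	}
	return 0;
}

static int ioc_getdata(struct ioc_hdr **hdr_pp, const void *uhdr)
{
	struct ioc_hdr hdr;
	struct ioc_hdr *buf;
	int err;

	if (fake_copy_from_user(&hdr, uhdr, sizeof(hdr)))
		return -EFAULT;

	if (hdr.ioc_len < sizeof(hdr) || hdr.ioc_len > IOC_DATA_MAX)
		return -EINVAL;

	buf = malloc(hdr.ioc_len);
	if (!buf)
		return -ENOMEM;

	if (fake_copy_from_user(buf, uhdr, hdr.ioc_len)) {
		err = -EFAULT;
		goto free_buf;
	}

	/* The second fetch may not match the length that was validated. */
	if (buf->ioc_len != hdr.ioc_len) {
		err = -EINVAL;
		goto free_buf;
	}

	*hdr_pp = buf;
	return 0;

free_buf:
	free(buf);
	return err;
}
```

An alternative that avoids the comparison is to overwrite buf->ioc_len
with the already validated hdr.ioc_len after the second copy; either way
later code must never see a length that was not checked.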

Anyway, break this patch up and I will review it properly.

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 20/40] staging: lustre: fix kernel crash when network failed to start
  2015-11-20 23:35 ` [PATCH 20/40] staging: lustre: fix kernel crash when network failed to start James Simmons
@ 2015-12-02 12:44   ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 12:44 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

Fold this into the original patch.  (Which you're going to have to redo
anyway to fix the other bug and because of the GW-BASIC label names).

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 21/40] staging: lustre: improve LNet clean up code and API
  2015-11-20 23:35 ` [PATCH 21/40] staging: lustre: improve LNet clean up code and API James Simmons
@ 2015-12-02 12:59   ` Dan Carpenter
  2015-12-02 13:20     ` [lustre-devel] " Alexander Zarochentsev
  2015-12-15 17:10     ` Simmons, James A.
  0 siblings, 2 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 12:59 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, devel, Oleg Drokin, Andreas Dilger,
	Amir Shehata, Linux Kernel Mailing List, lustre-devel

Actually we're going to have to redo so much code that it's not worth it
for me to review the rest of these patches.  Please just look over
everything again:

 BAD:	return -1;
GOOD:	return -EINVAL;

 BAD:  failed0:
GOOD:  free_something:

 BAD:	if (rc != 0)
GOOD:	if (rc)

Do one thing per patch.
Do not introduce a bug and then fix it in a later patch.
Check ioc_len more carefully.
Don't make the code look ugly just to please checkpatch.pl.
Do error handling not success handling.
Try to avoid indenting too far to the right.
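
Several of the points above can be shown in one small function.  This is
a userspace sketch, not code from the patchset; struct ctx,
MAX_ENTRIES, check_args() and ctx_init() are hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_ENTRIES 1024	/* hypothetical limit */

struct ctx {
	int *table;
	char *name;
};

static int check_args(size_t entries, size_t name_len)
{
	if (!entries || entries > MAX_ENTRIES || !name_len)
		return -EINVAL;		/* a real errno, not -1 */
	return 0;
}

static int ctx_init(struct ctx *c, size_t entries, size_t name_len)
{
	int rc;

	rc = check_args(entries, name_len);
	if (rc)				/* not "if (rc != 0)" */
		return rc;

	c->table = calloc(entries, sizeof(*c->table));
	if (!c->table)
		return -ENOMEM;

	c->name = calloc(name_len, 1);
	if (!c->name) {
		rc = -ENOMEM;
		goto free_table;	/* the label says what it does */
	}

	/* Errors bail out early, so the success path is not indented. */
	return 0;

free_table:
	free(c->table);
	c->table = NULL;
	return rc;
}
```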


regards,
dan carpenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [lustre-devel] [PATCH 21/40] staging: lustre: improve LNet clean up code and API
  2015-12-02 12:59   ` Dan Carpenter
@ 2015-12-02 13:20     ` Alexander Zarochentsev
  2015-12-02 13:59       ` Dan Carpenter
  2015-12-15 17:10     ` Simmons, James A.
  1 sibling, 1 reply; 65+ messages in thread
From: Alexander Zarochentsev @ 2015-12-02 13:20 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: James Simmons, devel, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, Amir Shehata,
	lustre-devel

Hello,

On Wed, Dec 2, 2015 at 3:59 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> Actually we're going to have to redo so much code that it's not worth it
> for me to review the rest of these patches.  Please just look over
> everything again:
>
>  BAD:   return -1;
> GOOD:   return -EINVAL;
>
>  BAD:  failed0:
> GOOD:  free_something:
>
>  BAD:   if (rc != 0)
> GOOD:   if (rc)

The last suggestion is not correct;
from http://wiki.lustre.org/Lustre_Coding_Guidelines :
Conditional boolean (if (expr)), scalar (if (val != 0)) and pointer
(if (ptr != NULL)) expressions should be written consistently.

Thanks,

-- 
Alexander Zarochentsev
Seagate Technology, LLC
www.seagate.com

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [lustre-devel] [PATCH 21/40] staging: lustre: improve LNet clean up code and API
  2015-12-02 13:20     ` [lustre-devel] " Alexander Zarochentsev
@ 2015-12-02 13:59       ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-02 13:59 UTC (permalink / raw)
  To: Alexander Zarochentsev
  Cc: James Simmons, devel, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, Amir Shehata,
	lustre-devel

On Wed, Dec 02, 2015 at 04:20:59PM +0300, Alexander Zarochentsev wrote:
> >  BAD:   if (rc != 0)
> > GOOD:   if (rc)
> 
> The latest suggestion is not correct,
> from http://wiki.lustre.org/Lustre_Coding_Guidelines :
> Conditional boolean (if (expr)), scalar (if (val != 0)) and pointer
> (if (ptr != NULL)) expressions should be written consistently.

Kernel style trumps Lustre style.  Double negative don't not hurt
readability.  != NULL is a checkpatch.pl warning.  Also comparisons like
== false or == true are checkpatch warnings.

I can think of two places where comparing with zero is appropriate and
those are:

1)  If you are talking about the number zero.

	if (x == 0 || x == 2)

2) strcmp() and other *cmp() functions.

	if (strcmp(foo, bar) == 0)   /* foo and bar are the same */
	if (strcmp(foo, bar) < 0)    /* foo less than bar */
	if (strcmp(foo, bar) != 0)   /* foo not the same as bar */

For "if (rc) {" a zero return doesn't mean zero, it means success so
comparing against zero is bad style.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [PATCH 21/40] staging: lustre: improve LNet clean up code and API
  2015-12-02 12:59   ` Dan Carpenter
  2015-12-02 13:20     ` [lustre-devel] " Alexander Zarochentsev
@ 2015-12-15 17:10     ` Simmons, James A.
  2015-12-15 17:41       ` Dan Carpenter
  1 sibling, 1 reply; 65+ messages in thread
From: Simmons, James A. @ 2015-12-15 17:10 UTC (permalink / raw)
  To: 'Dan Carpenter', James Simmons
  Cc: devel, Andreas Dilger, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, Amir Shehata,
	lustre-devel

>Actually we're going to have to redo so much code that it's not worth it
>for me to review the rest of these patches.  

Sorry I didn't get back to you sooner, but I was on vacation.  Thanks for
reviewing this work, especially since this is the first major bug-fixing merge
for the lustre client, which means a lot of pain is involved in ironing out how
to do this.  I have been pondering whether pushing bug fixes before style
cleanups is the right thing to do.  I pushed a bunch of bug fixes earlier and
none got merged, which means either that Greg is just backed up and hasn't had
the time to merge them, or that style issues are higher priority.  Assuming
these bug fixes are in scope for the staging tree, should I continue to push
this work first?  Either way, I should update this patch series so it is ready
to merge at some point.

>Please just look over everything again:
>
> BAD:	return -1;
>GOOD:	return -EINVAL;
>
> BAD:  failed0:
>GOOD:  free_something:
>
> BAD:	if (rc != 0)
>GOOD:	if (rc)
>
>Do one thing per patch.
>Do not introduce a bug and then fix it in a later patch.
>Check ioc_len more carefully.
>Don't make the code look ugly just to please checkpatch.pl.
>Do error handling not success handling.
>Try to avoid indenting a far to the right.

Okay. Will start to do the patch cleanup.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 21/40] staging: lustre: improve LNet clean up code and API
  2015-12-15 17:10     ` Simmons, James A.
@ 2015-12-15 17:41       ` Dan Carpenter
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Carpenter @ 2015-12-15 17:41 UTC (permalink / raw)
  To: Simmons, James A.
  Cc: James Simmons, devel, Andreas Dilger, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, Amir Shehata,
	lustre-devel

On Tue, Dec 15, 2015 at 05:10:39PM +0000, Simmons, James A. wrote:
>I have been pondering if pushing bug fixes before style cleanups
> is the right thing to do.

Generally push the least controversial patches first so that if you
have to redo one patch, then the rest are already applied and don't need
to be changed.

> I pushed a bunch of bug fixes earlier and none got
> merged which either means Greg is just backed up and hasn't the time to
> merge them or  style issues are higher priority

I have no idea which patchset you are talking about so I can't comment.
Greg always (except if there is a mistake) applies things in first come
first serve order.  He doesn't sort them.

> Assuming these bug fixes are in scope of the staging tree. Should I
> continue to push this work first?

You've pushed a bunch of stuff.  I don't know which stuff has been
applied and which has not.  If no one replied to it and there isn't a
dire issue such as a compile failure or it doesn't apply then Greg is
likely to apply it.  He doesn't silently drop patches, so you will get an
email either way.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [PATCH 02/40] staging: lustre: fix 'NULL pointer dereference' errors for LNet
  2015-12-02  7:46   ` Dan Carpenter
@ 2015-12-15 18:08     ` Simmons, James A.
  0 siblings, 0 replies; 65+ messages in thread
From: Simmons, James A. @ 2015-12-15 18:08 UTC (permalink / raw)
  To: 'Dan Carpenter', James Simmons
  Cc: devel, Sebastien Buisson, Andreas Dilger, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, lustre-devel

>> diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
>> index 556c837..2ca7d0e 100644
>> --- a/drivers/staging/lustre/lnet/selftest/conctl.c
>> +++ b/drivers/staging/lustre/lnet/selftest/conctl.c
>> @@ -679,45 +679,46 @@ static int
>>  lst_stat_query_ioctl(lstio_stat_args_t *args)
>>  {
>>  	int rc;
>> -	char *name;
>> +	char *name = NULL;
>>  
>>  	/* TODO: not finished */
>>  	if (args->lstio_sta_key != console_session.ses_key)
>>  		return -EACCES;
>>  
>> -	if (args->lstio_sta_resultp == NULL ||
>> -	    (args->lstio_sta_namep  == NULL &&
>> -	     args->lstio_sta_idsp   == NULL) ||
>> -	    args->lstio_sta_nmlen <= 0 ||
>> -	    args->lstio_sta_nmlen > LST_NAME_SIZE)
>> -		return -EINVAL;
>> -
>> -	if (args->lstio_sta_idsp != NULL &&
>> -	    args->lstio_sta_count <= 0)
>> +	if (!args->lstio_sta_resultp)
>>  		return -EINVAL;
>>  
>> -	LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
>> -	if (name == NULL)
>> -		return -ENOMEM;
>> -
>> -	if (copy_from_user(name, args->lstio_sta_namep,
>> -			       args->lstio_sta_nmlen)) {
>> -		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
>> -		return -EFAULT;
>> -	}
>> +	if (args->lstio_sta_idsp) {
>> +		if (args->lstio_sta_count <= 0)
>> +			return -EINVAL;
>>  
>> -	if (args->lstio_sta_idsp == NULL) {
>> -		rc = lstcon_group_stat(name, args->lstio_sta_timeout,
>> -				       args->lstio_sta_resultp);
>> -	} else {
>>  		rc = lstcon_nodes_stat(args->lstio_sta_count,
>>  				       args->lstio_sta_idsp,
>>  				       args->lstio_sta_timeout,
>>  				       args->lstio_sta_resultp);
>> -	}
>> +	} else if (args->lstio_sta_namep) {
>> +		if (args->lstio_sta_nmlen <= 0 ||
>> +		    args->lstio_sta_nmlen > LST_NAME_SIZE)
>> +			return -EINVAL;
>> +
>> +		LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
>> +		if (!name)
>> +			return -ENOMEM;
>>  
>> -	LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
>> +		rc = copy_from_user(name, args->lstio_sta_namep,
>> +				    args->lstio_sta_nmlen);
>> +		if (!rc)
>> +			rc = lstcon_group_stat(name, args->lstio_sta_timeout,
>> +					       args->lstio_sta_resultp);
>> +		else
>> +			rc = -EFAULT;
>>  
>> +	} else {
>> +		rc = -EINVAL;
>> +	}
>> +
>> +	if (name)
>> +		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
>
>There is no bug fix here.  This code was fine when it was merged into
>the kernel in 2013 so I have no idea how out of date the static checker
>warning is...  The new code doesn't do unnecessary allocations so that's
>good but "name" should be declared in the block where it is used instead
>of at the start of the function.  Btw, we assume that the user gives us
>a NUL terminated string for "name" so we should fix that bug as well.
>
>TODO: lustre: don't assume "name" is NUL terminated

Ugh.  I see breakage everywhere in this code :-(  This needs to be addressed.
I think we should convert that to strncpy_from_user() as well.
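
Whatever helper ends up being used, the point is that the kernel side
terminates the string itself instead of trusting the caller.  A minimal
userspace sketch, assuming a length-limited name as in the code above;
NAME_SIZE and copy_name() are hypothetical, and memcpy() stands in for
copy_from_user():

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NAME_SIZE 32	/* hypothetical limit, in the role of LST_NAME_SIZE */

static int copy_name(char **namep, const char *uname, int nmlen)
{
	char *name;

	if (nmlen <= 0 || nmlen > NAME_SIZE)
		return -EINVAL;

	name = malloc((size_t)nmlen + 1);
	if (!name)
		return -ENOMEM;

	memcpy(name, uname, (size_t)nmlen);	/* stand-in for copy_from_user() */
	name[nmlen] = '\0';			/* terminate it ourselves */

	*namep = name;
	return 0;
}
```

The result is a valid C string even when the source buffer has no NUL
byte within the first nmlen bytes.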



^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-12-02 11:20   ` Dan Carpenter
@ 2015-12-15 18:14     ` Simmons, James A.
  2015-12-15 18:19       ` Dan Carpenter
  2015-12-15 18:48       ` Greg Kroah-Hartman
  0 siblings, 2 replies; 65+ messages in thread
From: Simmons, James A. @ 2015-12-15 18:14 UTC (permalink / raw)
  To: 'Dan Carpenter', James Simmons
  Cc: devel, Andreas Dilger, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, Amir Shehata,
	lustre-devel

  
>>  struct libcfs_ioctl_hdr {
>>  	__u32 ioc_len;
>> @@ -87,6 +88,13 @@ do {						    \
>>  	data.ioc_hdr.ioc_len = sizeof(data);			\
>>  } while (0)
>>  
>> +#define LIBCFS_IOC_INIT_V2(data, hdr)			\
>> +do {							\
>> +	memset(&(data), 0, sizeof(data));		\
>> +	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
>> +	(data).hdr.ioc_len = sizeof(data);		\
>> +} while (0)
>> +
>
>Do we really need this?

Would you be okay if this was an inline function?  This is used by userland and kernel space code.



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-12-15 18:14     ` Simmons, James A.
@ 2015-12-15 18:19       ` Dan Carpenter
  2015-12-15 18:39         ` Simmons, James A.
  2015-12-15 18:48       ` Greg Kroah-Hartman
  1 sibling, 1 reply; 65+ messages in thread
From: Dan Carpenter @ 2015-12-15 18:19 UTC (permalink / raw)
  To: Simmons, James A.
  Cc: James Simmons, devel, Andreas Dilger, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, Amir Shehata,
	lustre-devel

On Tue, Dec 15, 2015 at 06:14:19PM +0000, Simmons, James A. wrote:
>   
> >>  struct libcfs_ioctl_hdr {
> >>  	__u32 ioc_len;
> >> @@ -87,6 +88,13 @@ do {						    \
> >>  	data.ioc_hdr.ioc_len = sizeof(data);			\
> >>  } while (0)
> >>  
> >> +#define LIBCFS_IOC_INIT_V2(data, hdr)			\
> >> +do {							\
> >> +	memset(&(data), 0, sizeof(data));		\
> >> +	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
> >> +	(data).hdr.ioc_len = sizeof(data);		\
> >> +} while (0)
> >> +
> >
> >Do we really need this?
> 
> Would you be okay if this was a inline function? This is used by user land and kernel space code.
> 

I try (not very hard) not to sound like a broken record but this business of
sharing code with userland is a pain in the butt.  It's not used in the
kernel or in any patches you have sent.

It would look better as an inline function though so I wouldn't have
even noticed it.
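
For comparison, an inline-function version might look roughly like the
sketch below.  It is only an illustration: struct ioc_hdr, struct
ioc_config, IOCTL_VERSION2 (including its value) and
ioc_config_init_v2() are hypothetical names, not the real libcfs
definitions.

```c
#include <stdint.h>
#include <string.h>

#define IOCTL_VERSION2 0x2U	/* hypothetical value */

struct ioc_hdr {
	uint32_t ioc_len;
	uint32_t ioc_version;
};

/* Hypothetical ioctl payload with the header embedded as first member. */
struct ioc_config {
	struct ioc_hdr hdr;
	char payload[56];
};

static inline void ioc_config_init_v2(struct ioc_config *data)
{
	memset(data, 0, sizeof(*data));
	data->hdr.ioc_version = IOCTL_VERSION2;
	data->hdr.ioc_len = sizeof(*data);
}
```

The trade-off is that the macro works on any structure that has a member
with the given name, while the function is tied to one type but gets
normal type checking and evaluates its argument exactly once.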

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-12-15 18:19       ` Dan Carpenter
@ 2015-12-15 18:39         ` Simmons, James A.
  0 siblings, 0 replies; 65+ messages in thread
From: Simmons, James A. @ 2015-12-15 18:39 UTC (permalink / raw)
  To: 'Dan Carpenter'
  Cc: James Simmons, devel, Andreas Dilger, Greg Kroah-Hartman,
	Linux Kernel Mailing List, Oleg Drokin, Amir Shehata,
	lustre-devel

>On Tue, Dec 15, 2015 at 06:14:19PM +0000, Simmons, James A. wrote:
>>   
>> >>  struct libcfs_ioctl_hdr {
>> >>  	__u32 ioc_len;
>> >> @@ -87,6 +88,13 @@ do {						    \
>> >>  	data.ioc_hdr.ioc_len = sizeof(data);			\
>> >>  } while (0)
>> >>  
>> >> +#define LIBCFS_IOC_INIT_V2(data, hdr)			\
>> >> +do {							\
>> >> +	memset(&(data), 0, sizeof(data));		\
>> >> +	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
>> >> +	(data).hdr.ioc_len = sizeof(data);		\
>> >> +} while (0)
>> >> +
>> >
>> >Do we really need this?
>> 
>> Would you be okay if this was a inline function? This is used by user land and kernel space code.
>> 
>
>I try (not very hard) to sound like a broken record but this business of
>sharing code with userland is a pain in the butt.  It's not used in the
>kernel or in any patches you have sent.
>
>It would look better as an inline function though so I wouldn't have
>even noticed it.

I'm glad you noticed.  I just looked at the production source code and yep it is only used
in the userland tools code. I need to update our tools so they don't break. Then we can
remove these macros.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-12-15 18:14     ` Simmons, James A.
  2015-12-15 18:19       ` Dan Carpenter
@ 2015-12-15 18:48       ` Greg Kroah-Hartman
  2015-12-15 19:48         ` Simmons, James A.
  1 sibling, 1 reply; 65+ messages in thread
From: Greg Kroah-Hartman @ 2015-12-15 18:48 UTC (permalink / raw)
  To: Simmons, James A.
  Cc: 'Dan Carpenter',
	James Simmons, devel, Andreas Dilger, Linux Kernel Mailing List,
	Oleg Drokin, Amir Shehata, lustre-devel

On Tue, Dec 15, 2015 at 06:14:19PM +0000, Simmons, James A. wrote:
>   
> >>  struct libcfs_ioctl_hdr {
> >>  	__u32 ioc_len;
> >> @@ -87,6 +88,13 @@ do {						    \
> >>  	data.ioc_hdr.ioc_len = sizeof(data);			\
> >>  } while (0)
> >>  
> >> +#define LIBCFS_IOC_INIT_V2(data, hdr)			\
> >> +do {							\
> >> +	memset(&(data), 0, sizeof(data));		\
> >> +	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
> >> +	(data).hdr.ioc_len = sizeof(data);		\
> >> +} while (0)
> >> +
> >
> >Do we really need this?
> 
> Would you be okay if this was a inline function? This is used by user
> land and kernel space code.

Then your code is broken, please never do that.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* RE: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-12-15 18:48       ` Greg Kroah-Hartman
@ 2015-12-15 19:48         ` Simmons, James A.
  2015-12-15 19:55           ` 'Greg Kroah-Hartman'
  0 siblings, 1 reply; 65+ messages in thread
From: Simmons, James A. @ 2015-12-15 19:48 UTC (permalink / raw)
  To: 'Greg Kroah-Hartman'
  Cc: 'Dan Carpenter',
	James Simmons, devel, Andreas Dilger, Linux Kernel Mailing List,
	Oleg Drokin, Amir Shehata, lustre-devel

>On Tue, Dec 15, 2015 at 06:14:19PM +0000, Simmons, James A. wrote:
>>   
>> >>  struct libcfs_ioctl_hdr {
>> >>  	__u32 ioc_len;
>> >> @@ -87,6 +88,13 @@ do {						    \
>> >>  	data.ioc_hdr.ioc_len = sizeof(data);			\
>> >>  } while (0)
>> >>  
>> >> +#define LIBCFS_IOC_INIT_V2(data, hdr)			\
>> >> +do {							\
>> >> +	memset(&(data), 0, sizeof(data));		\
>> >> +	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
>> >> +	(data).hdr.ioc_len = sizeof(data);		\
>> >> +} while (0)
>> >> +
>> >
>> >Do we really need this?
>> 
>> Would you be okay if this was a inline function? This is used by user
>> land and kernel space code.
>
>Then your code is broken, please never do that.

This brings up a good point.  This header does contain structures for userland, so it is a uapi
type header.  Should such headers only contain data structures?


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command
  2015-12-15 19:48         ` Simmons, James A.
@ 2015-12-15 19:55           ` 'Greg Kroah-Hartman'
  0 siblings, 0 replies; 65+ messages in thread
From: 'Greg Kroah-Hartman' @ 2015-12-15 19:55 UTC (permalink / raw)
  To: Simmons, James A.
  Cc: 'Dan Carpenter',
	James Simmons, devel, Andreas Dilger, Linux Kernel Mailing List,
	Oleg Drokin, Amir Shehata, lustre-devel

On Tue, Dec 15, 2015 at 07:48:22PM +0000, Simmons, James A. wrote:
> >On Tue, Dec 15, 2015 at 06:14:19PM +0000, Simmons, James A. wrote:
> >>   
> >> >>  struct libcfs_ioctl_hdr {
> >> >>  	__u32 ioc_len;
> >> >> @@ -87,6 +88,13 @@ do {						    \
> >> >>  	data.ioc_hdr.ioc_len = sizeof(data);			\
> >> >>  } while (0)
> >> >>  
> >> >> +#define LIBCFS_IOC_INIT_V2(data, hdr)			\
> >> >> +do {							\
> >> >> +	memset(&(data), 0, sizeof(data));		\
> >> >> +	(data).hdr.ioc_version = LIBCFS_IOCTL_VERSION2;	\
> >> >> +	(data).hdr.ioc_len = sizeof(data);		\
> >> >> +} while (0)
> >> >> +
> >> >
> >> >Do we really need this?
> >> 
> >> Would you be okay if this was a inline function? This is used by user
> >> land and kernel space code.
> >
> >Then your code is broken, please never do that.
> 
> This brings up a good point. This header doesn't contain structures for userland so it is a uapi
> type header.  Should such headers only contain data structures?

Yes, that would make more sense.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 00/40] Sync upstream lustre client LNet core
  2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
                   ` (39 preceding siblings ...)
  2015-11-20 23:36 ` [PATCH 40/40] staging: lustre: Remove LASSERTS from router checker James Simmons
@ 2015-12-21 23:41 ` Greg Kroah-Hartman
  40 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2015-12-21 23:41 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Oleg Drokin, Andreas Dilger, Linux Kernel Mailing List,
	lustre-devel

On Fri, Nov 20, 2015 at 06:35:36PM -0500, James Simmons wrote:
> This is the majority of the fixes that have gone into the LNet layer.
> Outside a few remaining patches this brings LNet close to what is
> running in production world wide.
> 
> This patch series needs the remove IOC_LIBCFS_PING_TEST ioctl patch
> landed first.

Please resend an updated version of this series based on the review it
had.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2015-12-21 23:42 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-20 23:35 [PATCH 00/40] Sync upstream lustre client LNet core James Simmons
2015-11-20 23:35 ` [PATCH 01/40] staging: lustre: drop *_t from end of struct lnet_text_buf James Simmons
2015-11-20 23:35 ` [PATCH 02/40] staging: lustre: fix 'NULL pointer dereference' errors for LNet James Simmons
2015-12-02  7:46   ` Dan Carpenter
2015-12-15 18:08     ` Simmons, James A.
2015-11-20 23:35 ` [PATCH 03/40] staging: lustre: reflect down routes in /proc/sys/lnet/routes James Simmons
2015-12-02  7:54   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 04/40] staging: lustre: fix failure handle of create reply James Simmons
2015-11-20 23:35 ` [PATCH 05/40] staging: lustre: eliminate obsolete Cray SeaStar support James Simmons
2015-11-20 23:35 ` [PATCH 06/40] staging: lustre: remove uses of IS_ERR_VALUE() James Simmons
2015-11-21 18:45   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 07/40] staging: lustre: return +ve for blocked lnet message James Simmons
2015-11-20 23:35 ` [PATCH 08/40] staging: lustre: do not memset after LIBCFS_ALLOC James Simmons
2015-11-20 23:35 ` [PATCH 09/40] staging: lustre: Dynamic LNet Configuration (DLC) James Simmons
2015-11-20 23:35 ` [PATCH 10/40] staging: lustre: Dynamic LNet Configuration (DLC) dynamic routing James Simmons
2015-11-20 23:35 ` [PATCH 11/40] staging: lustre: DLC Feature dynamic net config James Simmons
2015-12-02  9:23   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 12/40] staging: lustre: Dynamic LNet Configuration (DLC) IOCTL changes James Simmons
2015-12-02  9:48   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 13/40] staging: lustre: Dynamic LNet Configuration (DLC) show command James Simmons
2015-12-02 11:20   ` Dan Carpenter
2015-12-15 18:14     ` Simmons, James A.
2015-12-15 18:19       ` Dan Carpenter
2015-12-15 18:39         ` Simmons, James A.
2015-12-15 18:48       ` Greg Kroah-Hartman
2015-12-15 19:48         ` Simmons, James A.
2015-12-15 19:55           ` 'Greg Kroah-Hartman'
2015-12-02 12:00   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 14/40] staging: lustre: fix crash due to NULL networks string James Simmons
2015-12-02 11:27   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 15/40] staging: lustre: DLC user/kernel space glue code James Simmons
2015-12-02 12:11   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 16/40] staging: lustre: make local functions static for LNet ni James Simmons
2015-11-20 23:35 ` [PATCH 17/40] staging: lustre: add sparse annotation __user wherever needed for lnet James Simmons
2015-11-20 23:35 ` [PATCH 18/40] staging: lustre: remove LUSTRE_{,SRV_}LNET_PID James Simmons
2015-11-20 23:35 ` [PATCH 19/40] staging: lustre: copy out libcfs ioctl inline buffer James Simmons
2015-12-02 12:34   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 20/40] staging: lustre: fix kernel crash when network failed to start James Simmons
2015-12-02 12:44   ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 21/40] staging: lustre: improve LNet clean up code and API James Simmons
2015-12-02 12:59   ` Dan Carpenter
2015-12-02 13:20     ` [lustre-devel] " Alexander Zarochentsev
2015-12-02 13:59       ` Dan Carpenter
2015-12-15 17:10     ` Simmons, James A.
2015-12-15 17:41       ` Dan Carpenter
2015-11-20 23:35 ` [PATCH 22/40] staging: lustre: Fixes to make lnetctl function as expected James Simmons
2015-11-20 23:35 ` [PATCH 23/40] staging: lustre: return appropriate errno when adding route James Simmons
2015-11-20 23:36 ` [PATCH 24/40] staging: lustre: make some lnet functions static James Simmons
2015-11-20 23:36 ` [PATCH 25/40] staging: lustre: missed a few cases of using NULL instead of 0 James Simmons
2015-11-20 23:36 ` [PATCH 26/40] staging: lustre: startup lnet acceptor thread dynamically James Simmons
2015-11-20 23:36 ` [PATCH 27/40] staging: lustre: reject invalid net configuration for lnet James Simmons
2015-11-20 23:36 ` [PATCH 28/40] staging: lustre: return -EEXIST if NI is not unique James Simmons
2015-11-20 23:36 ` [PATCH 29/40] staging: lustre: handle lnet_check_routes() errors James Simmons
2015-11-20 23:36 ` [PATCH 30/40] staging: lustre: improvement to router checker James Simmons
2015-11-20 23:36 ` [PATCH 31/40] staging: lustre: assume a kernel build James Simmons
2015-11-20 23:36 ` [PATCH 32/40] staging: lustre: prevent assert on LNet module unload James Simmons
2015-11-20 23:36 ` [PATCH 33/40] staging: lustre: remove messages from lazy portal on NI shutdown James Simmons
2015-11-20 23:36 ` [PATCH 34/40] staging: lustre: remove unnecessary EXPORT_SYMBOL from lnet layer James Simmons
2015-11-20 23:36 ` [PATCH 35/40] staging: lustre: avoid race during lnet acceptor thread termination James Simmons
2015-11-20 23:36 ` [PATCH 36/40] staging: lustre: test for sk_sleep presence in compact-2.6.h James Simmons
2015-11-20 23:36 ` [PATCH 37/40] staging: lustre: remove unnecessary NULL check in IOC_LIBCFS_GET_NET James Simmons
2015-11-20 23:36 ` [PATCH 38/40] staging: lustre: Allocate the correct number of rtr buffers James Simmons
2015-11-20 23:36 ` [PATCH 39/40] staging: lustre: Use lnet_is_route_alive for router aliveness James Simmons
2015-11-20 23:36 ` [PATCH 40/40] staging: lustre: Remove LASSERTS from router checker James Simmons
2015-12-21 23:41 ` [PATCH 00/40] Sync upstream lustre client LNet core Greg Kroah-Hartman
