* [PATCH 00/10] Last batch of fixes for LNet
@ 2016-03-05  2:09 ` James Simmons
  0 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

This batch merges the remaining LNet patches from the OpenSFS
branch into the upstream client. Once it is merged, the LNet code
will be up to date with the latest production code; only style
issues remain. Future LNet patches still under development will
land in the upstream client as soon as they are ready and have
been extensively tested.

Frank Zago (1):
  staging: lustre: add last missing sparse annotation __user

James Nunez (1):
  staging: lustre: Correct missing newline

James Simmons (3):
  staging: lustre: change test to assert in LNetGetId
  staging: lustre: rename proc_call_handler to lprocfs_call_handler
  staging: lustre: make LNet use lprocfs_call_handler

Liang Zhen (2):
  staging: lustre: LNet drop rule implementation
  staging: lustre: LNet network latency simulation

Sebastien Buisson (3):
  staging: lustre: fix 'data race condition' issue in conrpc.c
  staging: lustre: fix 'NULL pointer dereference' errors
  staging: lustre: fix 'data race condition' issue in framework.c

 .../staging/lustre/include/linux/libcfs/libcfs.h   |    4 +
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |    1 +
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   27 +
 .../staging/lustre/include/linux/lnet/lib-types.h  |    5 +
 .../staging/lustre/include/linux/lnet/lnetctl.h    |  100 ++
 drivers/staging/lustre/lnet/lnet/Makefile          |    2 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |   13 +-
 drivers/staging/lustre/lnet/lnet/lib-move.c        |   83 +-
 drivers/staging/lustre/lnet/lnet/lib-msg.c         |    6 +
 drivers/staging/lustre/lnet/lnet/net_fault.c       | 1025 ++++++++++++++++++++
 drivers/staging/lustre/lnet/lnet/router_proc.c     |   32 +-
 drivers/staging/lustre/lnet/selftest/conctl.c      |   49 +-
 drivers/staging/lustre/lnet/selftest/conrpc.c      |    8 +-
 drivers/staging/lustre/lnet/selftest/framework.c   |    9 +-
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |    3 +
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    9 +-
 .../staging/lustre/lustre/libcfs/libcfs_string.c   |   27 +-
 drivers/staging/lustre/lustre/libcfs/module.c      |   25 +-
 drivers/staging/lustre/lustre/llite/file.c         |    8 +-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |    4 +-
 drivers/staging/lustre/lustre/lov/lov_lock.c       |    2 +-
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    2 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c       |    4 +-
 drivers/staging/lustre/lustre/lov/lov_request.c    |    2 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    3 +-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |   11 +-
 drivers/staging/lustre/lustre/obdclass/cl_lock.c   |    2 +-
 .../lustre/lustre/obdclass/lprocfs_status.c        |   26 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    2 +-
 .../staging/lustre/lustre/obdecho/echo_client.c    |    8 +-
 drivers/staging/lustre/lustre/osc/osc_cache.c      |    9 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c       |    2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |    3 +-
 drivers/staging/lustre/lustre/ptlrpc/client.c      |    6 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    2 +-
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |    4 +-
 37 files changed, 1380 insertions(+), 150 deletions(-)
 create mode 100644 drivers/staging/lustre/lnet/lnet/net_fault.c

* [PATCH 01/10] staging: lustre: LNet drop rule implementation
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Liang Zhen

From: Liang Zhen <liang.zhen@intel.com>

This is an implementation of the LNet Drop Rule, which randomly drops
LNet messages at a specified rate.

An LNet Drop Rule can only be applied to the receive side of a message.
Users can add a drop rule either on a cluster end point (client/server)
or on LNet routers.

The following lctl commands control LNet Drop Rules:
 - net_drop_add -s SRC_NID -d DEST_NID --rate VALUE
   drop 1/@VALUE of messages from @SRC_NID to @DEST_NID

 - net_drop_del -s SRC_NID -d DEST_NID
   remove all drop rules from @SRC_NID to @DEST_NID

 - net_drop_list
   list all drop rules on current node

 Examples:
 - lctl net_drop_add -s *@o2ib0 -d 192.168.1.102@tcp --rate 1000
   add a new drop rule that drops 1/1000 of messages from network o2ib0
   to 192.168.1.102@tcp

 - lctl net_drop_add -s 10.8.6.123@o2ib1 -d * --rate 500
   add a new drop rule that drops 1/500 of messages from 10.8.6.123@o2ib1
   to all nodes
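
The rate semantics described above (one message dropped per @VALUE matched
messages, at a random position inside each window) can be sketched in
userspace C. This is only an illustration under stated assumptions:
struct drop_sim and its helpers are hypothetical names, rand() stands in
for the kernel's cfs_rand(), and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model of a rate-based drop rule. */
struct drop_sim {
	unsigned long rate;	/* drop one message per `rate` matches */
	unsigned long count;	/* matched messages seen so far */
	unsigned long drop_at;	/* sequence number of the next drop */
};

static void drop_sim_init(struct drop_sim *s, unsigned long rate,
			  unsigned int seed)
{
	assert(rate > 0);
	srand(seed);
	s->rate = rate;
	s->count = 0;
	/* pick a random slot inside the first window [0, rate) */
	s->drop_at = (unsigned long)rand() % rate;
}

/* Returns true if the current matched message should be dropped. */
static bool drop_sim_should_drop(struct drop_sim *s)
{
	bool drop = (s->count++ == s->drop_at);

	/* at the end of a window, pick the slot for the next window */
	if (s->count % s->rate == 0)
		s->drop_at = s->count + (unsigned long)rand() % s->rate;

	return drop;
}
```

In this model a rate of 1000 drops exactly one message in every window of
1000 matched messages, so 100000 matched messages yield 100 drops
regardless of which slots the random generator picks.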

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5435
Reviewed-on: http://review.whamcloud.com/11314
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |    1 +
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   10 +
 .../staging/lustre/include/linux/lnet/lib-types.h  |    2 +
 .../staging/lustre/include/linux/lnet/lnetctl.h    |   83 ++++
 drivers/staging/lustre/lnet/lnet/Makefile          |    2 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    6 +
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    8 +
 drivers/staging/lustre/lnet/lnet/net_fault.c       |  436 ++++++++++++++++++++
 8 files changed, 547 insertions(+), 1 deletions(-)
 create mode 100644 drivers/staging/lustre/lnet/lnet/net_fault.c

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
index f788631..5ca99bd 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
@@ -121,6 +121,7 @@ struct libcfs_ioctl_handler {
 #define IOC_LIBCFS_PING		    _IOWR('e', 61, long)
 /*	#define IOC_LIBCFS_DEBUG_PEER	      _IOWR('e', 62, long) */
 #define IOC_LIBCFS_LNETST		  _IOWR('e', 63, long)
+#define	IOC_LIBCFS_LNET_FAULT		_IOWR('e', 64, long)
 /* lnd ioctls */
 #define IOC_LIBCFS_REGISTER_MYNID	  _IOWR('e', 70, long)
 #define IOC_LIBCFS_CLOSE_CONNECTION	_IOWR('e', 71, long)
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 84642dc..7b3f858 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -578,6 +578,16 @@ char *lnet_msgtyp2str(int type);
 void lnet_print_hdr(lnet_hdr_t *hdr);
 int lnet_fail_nid(lnet_nid_t nid, unsigned int threshold);
 
+/** \addtogroup lnet_fault_simulation @{ */
+
+int lnet_fault_ctl(int cmd, struct libcfs_ioctl_data *data);
+int lnet_fault_init(void);
+void lnet_fault_fini(void);
+
+bool lnet_drop_rule_match(lnet_hdr_t *hdr);
+
+/** @} lnet_fault_simulation */
+
 void lnet_counters_get(lnet_counters_t *counters);
 void lnet_counters_reset(void);
 
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index d2513db..cb09a8a 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -40,6 +40,7 @@
 #include <linux/types.h>
 
 #include "types.h"
+#include "lnetctl.h"
 
 /* Max payload size */
 #define LNET_MAX_PAYLOAD      CONFIG_LNET_MAX_PAYLOAD
@@ -572,6 +573,7 @@ typedef struct {
 	struct lnet_peer_table		**ln_peer_tables;
 	/* failure simulation */
 	struct list_head		  ln_test_peers;
+	struct list_head		  ln_drop_rules;
 
 	struct list_head		  ln_nis;	/* LND instances */
 	/* NIs bond on specific CPT(s) */
diff --git a/drivers/staging/lustre/include/linux/lnet/lnetctl.h b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
index 4b64f62..ec33bf8 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetctl.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
@@ -17,6 +17,89 @@
 
 #include "types.h"
 
+/** \addtogroup lnet_fault_simulation
+ * @{
+ */
+
+enum {
+	LNET_CTL_DROP_ADD,
+	LNET_CTL_DROP_DEL,
+	LNET_CTL_DROP_RESET,
+	LNET_CTL_DROP_LIST,
+};
+
+#define LNET_ACK_BIT		BIT(0)
+#define LNET_PUT_BIT		BIT(1)
+#define LNET_GET_BIT		BIT(2)
+#define LNET_REPLY_BIT		BIT(3)
+
+/** ioctl parameter for LNet fault simulation */
+struct lnet_fault_attr {
+	/**
+	 * source NID of drop rule
+	 * LNET_NID_ANY is wildcard for all sources
+	 * 255.255.255.255@net is wildcard for all addresses from @net
+	 */
+	lnet_nid_t			fa_src;
+	/** destination NID of drop rule, see \a fa_src for details */
+	lnet_nid_t			fa_dst;
+	/**
+	 * Portal mask to drop, -1 means all portals, for example:
+	 * fa_ptl_mask = (1 << LDLM_CB_REQUEST_PORTAL) |
+	 *		 (1 << LDLM_CANCEL_REQUEST_PORTAL)
+	 *
+	 * If it is non-zero then only PUT and GET will be filtered, otherwise
+	 * there is no portal filter, all matched messages will be checked.
+	 */
+	__u64				fa_ptl_mask;
+	/**
+	 * message types to drop, for example:
+	 * fa_msg_mask = LNET_ACK_BIT | LNET_PUT_BIT
+	 *
+	 * If it is non-zero then only specified message types are filtered,
+	 * otherwise all message types will be checked.
+	 */
+	__u32				fa_msg_mask;
+	union {
+		/** message drop simulation */
+		struct {
+			/** drop rate of this rule */
+			__u32			da_rate;
+			/**
+			 * time interval of message drop, it is exclusive
+			 * with da_rate
+			 */
+			__u32			da_interval;
+		} drop;
+		/** TODO: add more */
+		__u64			space[8];
+	} u;
+};
+
+/** fault simulation stats */
+struct lnet_fault_stat {
+	/** total # matched messages */
+	__u64				fs_count;
+	/** # dropped LNET_MSG_PUT by this rule */
+	__u64				fs_put;
+	/** # dropped LNET_MSG_ACK by this rule */
+	__u64				fs_ack;
+	/** # dropped LNET_MSG_GET by this rule */
+	__u64				fs_get;
+	/** # dropped LNET_MSG_REPLY by this rule */
+	__u64				fs_reply;
+	union {
+		struct {
+			/** total # dropped messages */
+			__u64			ds_dropped;
+		} drop;
+		/** TODO: add more */
+		__u64			space[8];
+	} u;
+};
+
+/** @} lnet_fault_simulation */
+
 #define LNET_DEV_ID 0
 #define LNET_DEV_PATH "/dev/lnet"
 #define LNET_DEV_MAJOR 10
diff --git a/drivers/staging/lustre/lnet/lnet/Makefile b/drivers/staging/lustre/lnet/lnet/Makefile
index e276fe2..4c81fa1 100644
--- a/drivers/staging/lustre/lnet/lnet/Makefile
+++ b/drivers/staging/lustre/lnet/lnet/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_LNET) += lnet.o
 
-lnet-y := api-ni.o config.o nidstrings.o			\
+lnet-y := api-ni.o config.o nidstrings.o net_fault.o		\
 	  lib-me.o lib-msg.o lib-eq.o lib-md.o lib-ptl.o	\
 	  lib-socket.o lib-move.o module.o lo.o			\
 	  router.o router_proc.o acceptor.o peer.o
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 99cdf9e..4d77ca3 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -550,6 +550,7 @@ lnet_prepare(lnet_pid_t requested_pid)
 	INIT_LIST_HEAD(&the_lnet.ln_nis_cpt);
 	INIT_LIST_HEAD(&the_lnet.ln_nis_zombie);
 	INIT_LIST_HEAD(&the_lnet.ln_routers);
+	INIT_LIST_HEAD(&the_lnet.ln_drop_rules);
 
 	rc = lnet_create_remote_nets_table();
 	if (rc)
@@ -1564,6 +1565,7 @@ LNetNIInit(lnet_pid_t requested_pid)
 	if (rc)
 		goto err_stop_ping;
 
+	lnet_fault_init();
 	lnet_router_debugfs_init();
 
 	mutex_unlock(&the_lnet.ln_api_mutex);
@@ -1616,6 +1618,7 @@ LNetNIFini(void)
 	} else {
 		LASSERT(!the_lnet.ln_niinit_self);
 
+		lnet_fault_fini();
 		lnet_router_debugfs_fini();
 		lnet_router_checker_stop();
 		lnet_ping_target_fini();
@@ -2030,6 +2033,9 @@ LNetCtl(unsigned int cmd, void *arg)
 		lnet_net_unlock(LNET_LOCK_EX);
 		return 0;
 
+	case IOC_LIBCFS_LNET_FAULT:
+		return lnet_fault_ctl(data->ioc_flags, data);
+
 	case IOC_LIBCFS_PING:
 		id.nid = data->ioc_nid;
 		id.pid = data->ioc_u32[0];
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 2d187e4..7a0f185 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1931,6 +1931,14 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 		goto drop;
 	}
 
+	if (!list_empty(&the_lnet.ln_drop_rules) &&
+		lnet_drop_rule_match(hdr)) {
+		CDEBUG(D_NET, "%s, src %s, dst %s: Dropping %s to simulate silent message loss\n",
+		       libcfs_nid2str(from_nid), libcfs_nid2str(src_nid),
+		       libcfs_nid2str(dest_nid), lnet_msgtyp2str(type));
+		goto drop;
+	}
+
 	msg = lnet_msg_alloc();
 	if (!msg) {
 		CERROR("%s, src %s: Dropping %s (out of memory)\n",
diff --git a/drivers/staging/lustre/lnet/lnet/net_fault.c b/drivers/staging/lustre/lnet/lnet/net_fault.c
new file mode 100644
index 0000000..8ed05b6
--- /dev/null
+++ b/drivers/staging/lustre/lnet/lnet/net_fault.c
@@ -0,0 +1,436 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2014, Intel Corporation.
+ */
+/*
+ * This file is part of Lustre, http://www.lustre.org/
+ * Lustre is a trademark of Seagate, Inc.
+ *
+ * lnet/lnet/net_fault.c
+ *
+ * Lustre network fault simulation
+ *
+ * Author: liang.zhen@intel.com
+ */
+
+#define DEBUG_SUBSYSTEM S_LNET
+
+#include "../../include/linux/lnet/lib-lnet.h"
+#include "../../include/linux/lnet/lnetctl.h"
+
+#define LNET_MSG_MASK		(LNET_PUT_BIT | LNET_ACK_BIT | \
+				 LNET_GET_BIT | LNET_REPLY_BIT)
+
+struct lnet_drop_rule {
+	/** link chain on the_lnet.ln_drop_rules */
+	struct list_head	dr_link;
+	/** attributes of this rule */
+	struct lnet_fault_attr	dr_attr;
+	/** lock to protect \a dr_drop_at and \a dr_stat */
+	spinlock_t		dr_lock;
+	/**
+	 * the message sequence to drop, which means message is dropped when
+	 * dr_stat.fs_count == dr_drop_at
+	 */
+	unsigned long		dr_drop_at;
+	/**
+	 * seconds to drop the next message, it's exclusive with dr_drop_at
+	 */
+	unsigned long		dr_drop_time;
+	/** baseline to calculate dr_drop_time */
+	unsigned long		dr_time_base;
+	/** statistic of dropped messages */
+	struct lnet_fault_stat	dr_stat;
+};
+
+static bool
+lnet_fault_nid_match(lnet_nid_t nid, lnet_nid_t msg_nid)
+{
+	if (nid == msg_nid || nid == LNET_NID_ANY)
+		return true;
+
+	if (LNET_NIDNET(nid) != LNET_NIDNET(msg_nid))
+		return false;
+
+	/* 255.255.255.255@net is wildcard for all addresses in a network */
+	return LNET_NIDADDR(nid) == LNET_NIDADDR(LNET_NID_ANY);
+}
+
+static bool
+lnet_fault_attr_match(struct lnet_fault_attr *attr, lnet_nid_t src,
+		      lnet_nid_t dst, unsigned int type, unsigned int portal)
+{
+	if (!lnet_fault_nid_match(attr->fa_src, src) ||
+	    !lnet_fault_nid_match(attr->fa_dst, dst))
+		return false;
+
+	if (!(attr->fa_msg_mask & (1 << type)))
+		return false;
+
+	/**
+	 * NB: ACK and REPLY have no portal, but they should have been
+	 * rejected by message mask
+	 */
+	if (attr->fa_ptl_mask && /* has portal filter */
+	    !(attr->fa_ptl_mask & (1ULL << portal)))
+		return false;
+
+	return true;
+}
+
+static int
+lnet_fault_attr_validate(struct lnet_fault_attr *attr)
+{
+	if (!attr->fa_msg_mask)
+		attr->fa_msg_mask = LNET_MSG_MASK; /* all message types */
+
+	if (!attr->fa_ptl_mask) /* no portal filter */
+		return 0;
+
+	/* NB: only PUT and GET can be filtered if portal filter has been set */
+	attr->fa_msg_mask &= LNET_GET_BIT | LNET_PUT_BIT;
+	if (!attr->fa_msg_mask) {
+		CDEBUG(D_NET, "can't find valid message type bits %x\n",
+		       attr->fa_msg_mask);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static void
+lnet_fault_stat_inc(struct lnet_fault_stat *stat, unsigned int type)
+{
+	/* NB: fs_count is NOT updated by this function */
+	switch (type) {
+	case LNET_MSG_PUT:
+		stat->fs_put++;
+		return;
+	case LNET_MSG_ACK:
+		stat->fs_ack++;
+		return;
+	case LNET_MSG_GET:
+		stat->fs_get++;
+		return;
+	case LNET_MSG_REPLY:
+		stat->fs_reply++;
+		return;
+	}
+}
+
+/**
+ * Add a new drop rule to LNet
+ * There is no check for duplicated drop rule, all rules will be checked for
+ * incoming message.
+ */
+static int
+lnet_drop_rule_add(struct lnet_fault_attr *attr)
+{
+	struct lnet_drop_rule *rule;
+
+	if (!attr->u.drop.da_rate == !attr->u.drop.da_interval) {
+		CDEBUG(D_NET, "invalid rate %d or interval %d\n",
+		       attr->u.drop.da_rate, attr->u.drop.da_interval);
+		return -EINVAL;
+	}
+
+	if (lnet_fault_attr_validate(attr))
+		return -EINVAL;
+
+	CFS_ALLOC_PTR(rule);
+	if (!rule)
+		return -ENOMEM;
+
+	spin_lock_init(&rule->dr_lock);
+
+	rule->dr_attr = *attr;
+	if (attr->u.drop.da_interval) {
+		rule->dr_time_base = cfs_time_shift(attr->u.drop.da_interval);
+		rule->dr_drop_time = cfs_time_shift(cfs_rand() %
+						    attr->u.drop.da_interval);
+	} else {
+		rule->dr_drop_at = cfs_rand() % attr->u.drop.da_rate;
+	}
+
+	lnet_net_lock(LNET_LOCK_EX);
+	list_add(&rule->dr_link, &the_lnet.ln_drop_rules);
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	CDEBUG(D_NET, "Added drop rule: src %s, dst %s, rate %d, interval %d\n",
+	       libcfs_nid2str(attr->fa_src), libcfs_nid2str(attr->fa_dst),
+	       attr->u.drop.da_rate, attr->u.drop.da_interval);
+	return 0;
+}
+
+/**
+ * Remove matched drop rules from lnet, all rules that can match \a src and
+ * \a dst will be removed.
+ * If \a src is zero, then all rules have \a dst as destination will be removed
+ * If \a dst is zero, then all rules have \a src as source will be removed
+ * If both of them are zero, all rules will be removed
+ */
+static int
+lnet_drop_rule_del(lnet_nid_t src, lnet_nid_t dst)
+{
+	struct lnet_drop_rule *rule;
+	struct lnet_drop_rule *tmp;
+	struct list_head zombies;
+	int n = 0;
+
+	INIT_LIST_HEAD(&zombies);
+
+	lnet_net_lock(LNET_LOCK_EX);
+	list_for_each_entry_safe(rule, tmp, &the_lnet.ln_drop_rules, dr_link) {
+		if (rule->dr_attr.fa_src != src && src)
+			continue;
+
+		if (rule->dr_attr.fa_dst != dst && dst)
+			continue;
+
+		list_move(&rule->dr_link, &zombies);
+	}
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	list_for_each_entry_safe(rule, tmp, &zombies, dr_link) {
+		CDEBUG(D_NET, "Remove drop rule: src %s->dst: %s (1/%d, %d)\n",
+		       libcfs_nid2str(rule->dr_attr.fa_src),
+		       libcfs_nid2str(rule->dr_attr.fa_dst),
+		       rule->dr_attr.u.drop.da_rate,
+		       rule->dr_attr.u.drop.da_interval);
+
+		list_del(&rule->dr_link);
+		CFS_FREE_PTR(rule);
+		n++;
+	}
+
+	return n;
+}
+
+/**
+ * List drop rule at position of \a pos
+ */
+static int
+lnet_drop_rule_list(int pos, struct lnet_fault_attr *attr,
+		    struct lnet_fault_stat *stat)
+{
+	struct lnet_drop_rule *rule;
+	int cpt;
+	int i = 0;
+	int rc = -ENOENT;
+
+	cpt = lnet_net_lock_current();
+	list_for_each_entry(rule, &the_lnet.ln_drop_rules, dr_link) {
+		if (i++ < pos)
+			continue;
+
+		spin_lock(&rule->dr_lock);
+		*attr = rule->dr_attr;
+		*stat = rule->dr_stat;
+		spin_unlock(&rule->dr_lock);
+		rc = 0;
+		break;
+	}
+
+	lnet_net_unlock(cpt);
+	return rc;
+}
+
+/**
+ * reset counters for all drop rules
+ */
+static void
+lnet_drop_rule_reset(void)
+{
+	struct lnet_drop_rule *rule;
+	int cpt;
+
+	cpt = lnet_net_lock_current();
+
+	list_for_each_entry(rule, &the_lnet.ln_drop_rules, dr_link) {
+		struct lnet_fault_attr *attr = &rule->dr_attr;
+
+		spin_lock(&rule->dr_lock);
+
+		memset(&rule->dr_stat, 0, sizeof(rule->dr_stat));
+		if (attr->u.drop.da_rate) {
+			rule->dr_drop_at = cfs_rand() % attr->u.drop.da_rate;
+		} else {
+			rule->dr_drop_time = cfs_time_shift(cfs_rand() %
+						attr->u.drop.da_interval);
+			rule->dr_time_base = cfs_time_shift(attr->u.drop.
+								  da_interval);
+		}
+		spin_unlock(&rule->dr_lock);
+	}
+
+	lnet_net_unlock(cpt);
+}
+
+/**
+ * check source/destination NID, portal, message type and drop rate,
+ * decide whether should drop this message or not
+ */
+static bool
+drop_rule_match(struct lnet_drop_rule *rule, lnet_nid_t src,
+		lnet_nid_t dst, unsigned int type, unsigned int portal)
+{
+	struct lnet_fault_attr *attr = &rule->dr_attr;
+	bool drop;
+
+	if (!lnet_fault_attr_match(attr, src, dst, type, portal))
+		return false;
+
+	/* match this rule, check drop rate now */
+	spin_lock(&rule->dr_lock);
+	if (rule->dr_drop_time) { /* time based drop */
+		unsigned long now = cfs_time_current();
+
+		rule->dr_stat.fs_count++;
+		drop = cfs_time_aftereq(now, rule->dr_drop_time);
+		if (drop) {
+			if (cfs_time_after(now, rule->dr_time_base))
+				rule->dr_time_base = now;
+
+			rule->dr_drop_time = rule->dr_time_base +
+					     cfs_time_seconds(cfs_rand() %
+						attr->u.drop.da_interval);
+			rule->dr_time_base += cfs_time_seconds(attr->u.drop.
+							       da_interval);
+
+			CDEBUG(D_NET, "Drop Rule %s->%s: next drop : %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst),
+			       rule->dr_drop_time);
+		}
+
+	} else { /* rate based drop */
+		drop = rule->dr_stat.fs_count++ == rule->dr_drop_at;
+
+		if (!(rule->dr_stat.fs_count % attr->u.drop.da_rate)) {
+			rule->dr_drop_at = rule->dr_stat.fs_count +
+					   cfs_rand() % attr->u.drop.da_rate;
+			CDEBUG(D_NET, "Drop Rule %s->%s: next drop: %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst), rule->dr_drop_at);
+		}
+	}
+
+	if (drop) { /* drop this message, update counters */
+		lnet_fault_stat_inc(&rule->dr_stat, type);
+		rule->dr_stat.u.drop.ds_dropped++;
+	}
+
+	spin_unlock(&rule->dr_lock);
+	return drop;
+}
+
+/**
+ * Check if message from \a src to \a dst can match any existing drop rule
+ */
+bool
+lnet_drop_rule_match(lnet_hdr_t *hdr)
+{
+	struct lnet_drop_rule *rule;
+	lnet_nid_t src = le64_to_cpu(hdr->src_nid);
+	lnet_nid_t dst = le64_to_cpu(hdr->dest_nid);
+	unsigned int typ = le32_to_cpu(hdr->type);
+	unsigned int ptl = -1;
+	bool drop = false;
+	int cpt;
+
+	/**
+	 * NB: if Portal is specified, then only PUT and GET will be
+	 * filtered by drop rule
+	 */
+	if (typ == LNET_MSG_PUT)
+		ptl = le32_to_cpu(hdr->msg.put.ptl_index);
+	else if (typ == LNET_MSG_GET)
+		ptl = le32_to_cpu(hdr->msg.get.ptl_index);
+
+	cpt = lnet_net_lock_current();
+	list_for_each_entry(rule, &the_lnet.ln_drop_rules, dr_link) {
+		drop = drop_rule_match(rule, src, dst, typ, ptl);
+		if (drop)
+			break;
+	}
+
+	lnet_net_unlock(cpt);
+	return drop;
+}
+
+int
+lnet_fault_ctl(int opc, struct libcfs_ioctl_data *data)
+{
+	struct lnet_fault_attr *attr;
+	struct lnet_fault_stat *stat;
+
+	attr = (struct lnet_fault_attr *)data->ioc_inlbuf1;
+
+	switch (opc) {
+	default:
+		return -EINVAL;
+
+	case LNET_CTL_DROP_ADD:
+		if (!attr)
+			return -EINVAL;
+
+		return lnet_drop_rule_add(attr);
+
+	case LNET_CTL_DROP_DEL:
+		if (!attr)
+			return -EINVAL;
+
+		data->ioc_count = lnet_drop_rule_del(attr->fa_src,
+						     attr->fa_dst);
+		return 0;
+
+	case LNET_CTL_DROP_RESET:
+		lnet_drop_rule_reset();
+		return 0;
+
+	case LNET_CTL_DROP_LIST:
+		stat = (struct lnet_fault_stat *)data->ioc_inlbuf2;
+		if (!attr || !stat)
+			return -EINVAL;
+
+		return lnet_drop_rule_list(data->ioc_count, attr, stat);
+	}
+}
+
+int
+lnet_fault_init(void)
+{
+	CLASSERT(LNET_PUT_BIT == 1 << LNET_MSG_PUT);
+	CLASSERT(LNET_ACK_BIT == 1 << LNET_MSG_ACK);
+	CLASSERT(LNET_GET_BIT == 1 << LNET_MSG_GET);
+	CLASSERT(LNET_REPLY_BIT == 1 << LNET_MSG_REPLY);
+
+	return 0;
+}
+
+void
+lnet_fault_fini(void)
+{
+	lnet_drop_rule_del(0, 0);
+
+	LASSERT(list_empty(&the_lnet.ln_drop_rules));
+}
-- 
1.7.1
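
The matching rules in the patch above (NID wildcards, message-type mask,
optional portal mask) can be tried outside the kernel with a small sketch.
Everything named here is a hypothetical stand-in rather than the LNet API:
nid_t, NID_NET(), NID_ADDR() and MK_NID() assume a 64-bit NID with the
network in the high 32 bits and the address in the low 32 bits, and the
"portal < 64" guard is an addition of this sketch that the patch does not
have.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for lnet_nid_t and its helper macros. */
typedef uint64_t nid_t;

#define NID_ANY			((nid_t)-1)
#define NID_NET(nid)		((uint32_t)((nid) >> 32))
#define NID_ADDR(nid)		((uint32_t)((nid) & 0xffffffffu))
#define MK_NID(net, addr)	(((nid_t)(net) << 32) | (nid_t)(addr))

/* Message types, numbered to agree with the BIT(0)..BIT(3) masks. */
enum { MSG_ACK = 0, MSG_PUT = 1, MSG_GET = 2, MSG_REPLY = 3 };

#define ACK_BIT		(1u << MSG_ACK)
#define PUT_BIT		(1u << MSG_PUT)
#define GET_BIT		(1u << MSG_GET)
#define REPLY_BIT	(1u << MSG_REPLY)
#define MSG_MASK_ALL	(ACK_BIT | PUT_BIT | GET_BIT | REPLY_BIT)

struct fault_attr {
	nid_t		src;		/* source NID or wildcard */
	nid_t		dst;		/* destination NID or wildcard */
	uint64_t	ptl_mask;	/* 0 means no portal filter */
	uint32_t	msg_mask;	/* 0 means all message types */
};

/* Exact match, "any NID", or "any address on the same network". */
static bool nid_match(nid_t rule_nid, nid_t msg_nid)
{
	if (rule_nid == msg_nid || rule_nid == NID_ANY)
		return true;
	if (NID_NET(rule_nid) != NID_NET(msg_nid))
		return false;
	return NID_ADDR(rule_nid) == NID_ADDR(NID_ANY);
}

/* Normalise the masks; a portal filter restricts the rule to PUT/GET. */
static int attr_validate(struct fault_attr *a)
{
	if (!a->msg_mask)
		a->msg_mask = MSG_MASK_ALL;
	if (!a->ptl_mask)
		return 0;
	a->msg_mask &= GET_BIT | PUT_BIT;
	return a->msg_mask ? 0 : -EINVAL;
}

static bool attr_match(const struct fault_attr *a, nid_t src, nid_t dst,
		       unsigned int type, unsigned int portal)
{
	if (!nid_match(a->src, src) || !nid_match(a->dst, dst))
		return false;
	if (!(a->msg_mask & (1u << type)))
		return false;
	/* sketch-only guard: never shift by 64 or more */
	if (a->ptl_mask &&
	    (portal >= 64 || !(a->ptl_mask & (1ULL << portal))))
		return false;
	return true;
}
```

A rule with source MK_NID(5, 0xffffffff), destination NID_ANY and portal
mask 1ULL << 6 is narrowed by attr_validate() to PUT and GET, and then
matches only PUT or GET messages from network 5 that target portal 6.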


* [lustre-devel] [PATCH 01/10] staging: lustre: LNet drop rule implementation
@ 2016-03-05  2:09   ` James Simmons
  0 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Liang Zhen

From: Liang Zhen <liang.zhen@intel.com>

This is an implementation of the LNet Drop Rule, which randomly drops
LNet messages at a specified rate.

An LNet Drop Rule can only be applied to the receive side of a message.
Users can add a drop rule either on a cluster end point (client/server)
or on LNet routers.

The following lctl commands control LNet Drop Rules:
 - net_drop_add -s SRC_NID -d DEST_NID --rate VALUE
   drop 1/@VALUE of messages from @SRC_NID to @DEST_NID

 - net_drop_del -s SRC_NID -d DEST_NID
   remove all drop rules from @SRC_NID to @DEST_NID

 - net_drop_list
   list all drop rules on current node

 Examples:
 - lctl net_drop_add -s *@o2ib0 -d 192.168.1.102@tcp --rate 1000
   add a new drop rule that drops 1/1000 of messages from network o2ib0
   to 192.168.1.102@tcp

 - lctl net_drop_add -s 10.8.6.123@o2ib1 -d * --rate 500
   add a new drop rule that drops 1/500 of messages from 10.8.6.123@o2ib1
   to all nodes

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5435
Reviewed-on: http://review.whamcloud.com/11314
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../lustre/include/linux/libcfs/libcfs_ioctl.h     |    1 +
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   10 +
 .../staging/lustre/include/linux/lnet/lib-types.h  |    2 +
 .../staging/lustre/include/linux/lnet/lnetctl.h    |   83 ++++
 drivers/staging/lustre/lnet/lnet/Makefile          |    2 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    6 +
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    8 +
 drivers/staging/lustre/lnet/lnet/net_fault.c       |  436 ++++++++++++++++++++
 8 files changed, 547 insertions(+), 1 deletions(-)
 create mode 100644 drivers/staging/lustre/lnet/lnet/net_fault.c

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
index f788631..5ca99bd 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h
@@ -121,6 +121,7 @@ struct libcfs_ioctl_handler {
 #define IOC_LIBCFS_PING		    _IOWR('e', 61, long)
 /*	#define IOC_LIBCFS_DEBUG_PEER	      _IOWR('e', 62, long) */
 #define IOC_LIBCFS_LNETST		  _IOWR('e', 63, long)
+#define	IOC_LIBCFS_LNET_FAULT		_IOWR('e', 64, long)
 /* lnd ioctls */
 #define IOC_LIBCFS_REGISTER_MYNID	  _IOWR('e', 70, long)
 #define IOC_LIBCFS_CLOSE_CONNECTION	_IOWR('e', 71, long)
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 84642dc..7b3f858 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -578,6 +578,16 @@ char *lnet_msgtyp2str(int type);
 void lnet_print_hdr(lnet_hdr_t *hdr);
 int lnet_fail_nid(lnet_nid_t nid, unsigned int threshold);
 
+/** \addtogroup lnet_fault_simulation @{ */
+
+int lnet_fault_ctl(int cmd, struct libcfs_ioctl_data *data);
+int lnet_fault_init(void);
+void lnet_fault_fini(void);
+
+bool lnet_drop_rule_match(lnet_hdr_t *hdr);
+
+/** @} lnet_fault_simulation */
+
 void lnet_counters_get(lnet_counters_t *counters);
 void lnet_counters_reset(void);
 
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index d2513db..cb09a8a 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -40,6 +40,7 @@
 #include <linux/types.h>
 
 #include "types.h"
+#include "lnetctl.h"
 
 /* Max payload size */
 #define LNET_MAX_PAYLOAD      CONFIG_LNET_MAX_PAYLOAD
@@ -572,6 +573,7 @@ typedef struct {
 	struct lnet_peer_table		**ln_peer_tables;
 	/* failure simulation */
 	struct list_head		  ln_test_peers;
+	struct list_head		  ln_drop_rules;
 
 	struct list_head		  ln_nis;	/* LND instances */
 	/* NIs bond on specific CPT(s) */
diff --git a/drivers/staging/lustre/include/linux/lnet/lnetctl.h b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
index 4b64f62..ec33bf8 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetctl.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
@@ -17,6 +17,89 @@
 
 #include "types.h"
 
+/** \addtogroup lnet_fault_simulation
+ * @{
+ */
+
+enum {
+	LNET_CTL_DROP_ADD,
+	LNET_CTL_DROP_DEL,
+	LNET_CTL_DROP_RESET,
+	LNET_CTL_DROP_LIST,
+};
+
+#define LNET_ACK_BIT		BIT(0)
+#define LNET_PUT_BIT		BIT(1)
+#define LNET_GET_BIT		BIT(2)
+#define LNET_REPLY_BIT		BIT(3)
+
+/** ioctl parameter for LNet fault simulation */
+struct lnet_fault_attr {
+	/**
+	 * source NID of drop rule
+	 * LNET_NID_ANY is wildcard for all sources
+	 * 255.255.255.255@net is wildcard for all addresses from @net
+	 */
+	lnet_nid_t			fa_src;
+	/** destination NID of drop rule, see \a fa_src for details */
+	lnet_nid_t			fa_dst;
+	/**
+	 * Portal mask to drop, -1 means all portals, for example:
+	 * fa_ptl_mask = (1 << _LDLM_CB_REQUEST_PORTAL ) |
+	 *		 (1 << LDLM_CANCEL_REQUEST_PORTAL)
+	 *
+	 * If it is non-zero then only PUT and GET will be filtered, otherwise
+	 * there is no portal filter, all matched messages will be checked.
+	 */
+	__u64				fa_ptl_mask;
+	/**
+	 * message types to drop, for example:
+	 * fa_msg_mask = LNET_ACK_BIT | LNET_PUT_BIT
+	 *
+	 * If it is non-zero then only specified message types are filtered,
+	 * otherwise all message types will be checked.
+	 */
+	__u32				fa_msg_mask;
+	union {
+		/** message drop simulation */
+		struct {
+			/** drop rate of this rule */
+			__u32			da_rate;
+			/**
+			 * time interval of message drop, it is exclusive
+			 * with da_rate
+			 */
+			__u32			da_interval;
+		} drop;
+		/** TODO: add more */
+		__u64			space[8];
+	} u;
+};
+
+/** fault simulation stats */
+struct lnet_fault_stat {
+	/** total # matched messages */
+	__u64				fs_count;
+	/** # dropped LNET_MSG_PUT by this rule */
+	__u64				fs_put;
+	/** # dropped LNET_MSG_ACK by this rule */
+	__u64				fs_ack;
+	/** # dropped LNET_MSG_GET by this rule */
+	__u64				fs_get;
+	/** # dropped LNET_MSG_REPLY by this rule */
+	__u64				fs_reply;
+	union {
+		struct {
+			/** total # dropped messages */
+			__u64			ds_dropped;
+		} drop;
+		/** TODO: add more */
+		__u64			space[8];
+	} u;
+};
+
+/** @} lnet_fault_simulation */
+
 #define LNET_DEV_ID 0
 #define LNET_DEV_PATH "/dev/lnet"
 #define LNET_DEV_MAJOR 10
diff --git a/drivers/staging/lustre/lnet/lnet/Makefile b/drivers/staging/lustre/lnet/lnet/Makefile
index e276fe2..4c81fa1 100644
--- a/drivers/staging/lustre/lnet/lnet/Makefile
+++ b/drivers/staging/lustre/lnet/lnet/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_LNET) += lnet.o
 
-lnet-y := api-ni.o config.o nidstrings.o			\
+lnet-y := api-ni.o config.o nidstrings.o net_fault.o		\
 	  lib-me.o lib-msg.o lib-eq.o lib-md.o lib-ptl.o	\
 	  lib-socket.o lib-move.o module.o lo.o			\
 	  router.o router_proc.o acceptor.o peer.o
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 99cdf9e..4d77ca3 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -550,6 +550,7 @@ lnet_prepare(lnet_pid_t requested_pid)
 	INIT_LIST_HEAD(&the_lnet.ln_nis_cpt);
 	INIT_LIST_HEAD(&the_lnet.ln_nis_zombie);
 	INIT_LIST_HEAD(&the_lnet.ln_routers);
+	INIT_LIST_HEAD(&the_lnet.ln_drop_rules);
 
 	rc = lnet_create_remote_nets_table();
 	if (rc)
@@ -1564,6 +1565,7 @@ LNetNIInit(lnet_pid_t requested_pid)
 	if (rc)
 		goto err_stop_ping;
 
+	lnet_fault_init();
 	lnet_router_debugfs_init();
 
 	mutex_unlock(&the_lnet.ln_api_mutex);
@@ -1616,6 +1618,7 @@ LNetNIFini(void)
 	} else {
 		LASSERT(!the_lnet.ln_niinit_self);
 
+		lnet_fault_fini();
 		lnet_router_debugfs_fini();
 		lnet_router_checker_stop();
 		lnet_ping_target_fini();
@@ -2030,6 +2033,9 @@ LNetCtl(unsigned int cmd, void *arg)
 		lnet_net_unlock(LNET_LOCK_EX);
 		return 0;
 
+	case IOC_LIBCFS_LNET_FAULT:
+		return lnet_fault_ctl(data->ioc_flags, data);
+
 	case IOC_LIBCFS_PING:
 		id.nid = data->ioc_nid;
 		id.pid = data->ioc_u32[0];
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 2d187e4..7a0f185 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1931,6 +1931,14 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 		goto drop;
 	}
 
+	if (!list_empty(&the_lnet.ln_drop_rules) &&
+		lnet_drop_rule_match(hdr)) {
+		CDEBUG(D_NET, "%s, src %s, dst %s: Dropping %s to simulate silent message loss\n",
+		       libcfs_nid2str(from_nid), libcfs_nid2str(src_nid),
+		       libcfs_nid2str(dest_nid), lnet_msgtyp2str(type));
+		goto drop;
+	}
+
 	msg = lnet_msg_alloc();
 	if (!msg) {
 		CERROR("%s, src %s: Dropping %s (out of memory)\n",
diff --git a/drivers/staging/lustre/lnet/lnet/net_fault.c b/drivers/staging/lustre/lnet/lnet/net_fault.c
new file mode 100644
index 0000000..8ed05b6
--- /dev/null
+++ b/drivers/staging/lustre/lnet/lnet/net_fault.c
@@ -0,0 +1,436 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2014, Intel Corporation.
+ */
+/*
+ * This file is part of Lustre, http://www.lustre.org/
+ * Lustre is a trademark of Seagate, Inc.
+ *
+ * lnet/lnet/net_fault.c
+ *
+ * Lustre network fault simulation
+ *
+ * Author: liang.zhen@intel.com
+ */
+
+#define DEBUG_SUBSYSTEM S_LNET
+
+#include "../../include/linux/lnet/lib-lnet.h"
+#include "../../include/linux/lnet/lnetctl.h"
+
+#define LNET_MSG_MASK		(LNET_PUT_BIT | LNET_ACK_BIT | \
+				 LNET_GET_BIT | LNET_REPLY_BIT)
+
+struct lnet_drop_rule {
+	/** link chain on the_lnet.ln_drop_rules */
+	struct list_head	dr_link;
+	/** attributes of this rule */
+	struct lnet_fault_attr	dr_attr;
+	/** lock to protect \a dr_drop_at and \a dr_stat */
+	spinlock_t		dr_lock;
+	/**
+	 * the message sequence to drop, which means message is dropped when
+	 * dr_stat.fs_count == dr_drop_at
+	 */
+	unsigned long		dr_drop_at;
+	/**
+	 * seconds to drop the next message, it's exclusive with dr_drop_at
+	 */
+	unsigned long		dr_drop_time;
+	/** baseline to calculate dr_drop_time */
+	unsigned long		dr_time_base;
+	/** statistic of dropped messages */
+	struct lnet_fault_stat	dr_stat;
+};
+
+static bool
+lnet_fault_nid_match(lnet_nid_t nid, lnet_nid_t msg_nid)
+{
+	if (nid == msg_nid || nid == LNET_NID_ANY)
+		return true;
+
+	if (LNET_NIDNET(nid) != LNET_NIDNET(msg_nid))
+		return false;
+
+	/* 255.255.255.255@net is wildcard for all addresses in a network */
+	return LNET_NIDADDR(nid) == LNET_NIDADDR(LNET_NID_ANY);
+}
+
+static bool
+lnet_fault_attr_match(struct lnet_fault_attr *attr, lnet_nid_t src,
+		      lnet_nid_t dst, unsigned int type, unsigned int portal)
+{
+	if (!lnet_fault_nid_match(attr->fa_src, src) ||
+	    !lnet_fault_nid_match(attr->fa_dst, dst))
+		return false;
+
+	if (!(attr->fa_msg_mask & (1 << type)))
+		return false;
+
+	/**
+	 * NB: ACK and REPLY have no portal, but they should have been
+	 * rejected by message mask
+	 */
+	if (attr->fa_ptl_mask && /* has portal filter */
+	    !(attr->fa_ptl_mask & (1ULL << portal)))
+		return false;
+
+	return true;
+}
+
+static int
+lnet_fault_attr_validate(struct lnet_fault_attr *attr)
+{
+	if (!attr->fa_msg_mask)
+		attr->fa_msg_mask = LNET_MSG_MASK; /* all message types */
+
+	if (!attr->fa_ptl_mask) /* no portal filter */
+		return 0;
+
+	/* NB: only PUT and GET can be filtered if portal filter has been set */
+	attr->fa_msg_mask &= LNET_GET_BIT | LNET_PUT_BIT;
+	if (!attr->fa_msg_mask) {
+		CDEBUG(D_NET, "can't find valid message type bits %x\n",
+		       attr->fa_msg_mask);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static void
+lnet_fault_stat_inc(struct lnet_fault_stat *stat, unsigned int type)
+{
+	/* NB: fs_count is NOT updated by this function */
+	switch (type) {
+	case LNET_MSG_PUT:
+		stat->fs_put++;
+		return;
+	case LNET_MSG_ACK:
+		stat->fs_ack++;
+		return;
+	case LNET_MSG_GET:
+		stat->fs_get++;
+		return;
+	case LNET_MSG_REPLY:
+		stat->fs_reply++;
+		return;
+	}
+}
+
+/**
+ * Add a new drop rule to LNet
+ * There is no check for duplicated drop rule, all rules will be checked for
+ * incoming message.
+ */
+static int
+lnet_drop_rule_add(struct lnet_fault_attr *attr)
+{
+	struct lnet_drop_rule *rule;
+
+	if (!attr->u.drop.da_rate == !attr->u.drop.da_interval) {
+		CDEBUG(D_NET, "invalid rate %d or interval %d\n",
+		       attr->u.drop.da_rate, attr->u.drop.da_interval);
+		return -EINVAL;
+	}
+
+	if (lnet_fault_attr_validate(attr))
+		return -EINVAL;
+
+	CFS_ALLOC_PTR(rule);
+	if (!rule)
+		return -ENOMEM;
+
+	spin_lock_init(&rule->dr_lock);
+
+	rule->dr_attr = *attr;
+	if (attr->u.drop.da_interval) {
+		rule->dr_time_base = cfs_time_shift(attr->u.drop.da_interval);
+		rule->dr_drop_time = cfs_time_shift(cfs_rand() %
+						    attr->u.drop.da_interval);
+	} else {
+		rule->dr_drop_at = cfs_rand() % attr->u.drop.da_rate;
+	}
+
+	lnet_net_lock(LNET_LOCK_EX);
+	list_add(&rule->dr_link, &the_lnet.ln_drop_rules);
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	CDEBUG(D_NET, "Added drop rule: src %s, dst %s, rate %d, interval %d\n",
+	       libcfs_nid2str(attr->fa_src), libcfs_nid2str(attr->fa_dst),
+	       attr->u.drop.da_rate, attr->u.drop.da_interval);
+	return 0;
+}
+
+/**
+ * Remove matched drop rules from lnet, all rules that can match \a src and
+ * \a dst will be removed.
+ * If \a src is zero, all rules with \a dst as destination will be removed.
+ * If \a dst is zero, all rules with \a src as source will be removed.
+ * If both of them are zero, all rules will be removed.
+ */
+static int
+lnet_drop_rule_del(lnet_nid_t src, lnet_nid_t dst)
+{
+	struct lnet_drop_rule *rule;
+	struct lnet_drop_rule *tmp;
+	struct list_head zombies;
+	int n = 0;
+
+	INIT_LIST_HEAD(&zombies);
+
+	lnet_net_lock(LNET_LOCK_EX);
+	list_for_each_entry_safe(rule, tmp, &the_lnet.ln_drop_rules, dr_link) {
+		if (rule->dr_attr.fa_src != src && src)
+			continue;
+
+		if (rule->dr_attr.fa_dst != dst && dst)
+			continue;
+
+		list_move(&rule->dr_link, &zombies);
+	}
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	list_for_each_entry_safe(rule, tmp, &zombies, dr_link) {
+		CDEBUG(D_NET, "Remove drop rule: src %s->dst: %s (1/%d, %d)\n",
+		       libcfs_nid2str(rule->dr_attr.fa_src),
+		       libcfs_nid2str(rule->dr_attr.fa_dst),
+		       rule->dr_attr.u.drop.da_rate,
+		       rule->dr_attr.u.drop.da_interval);
+
+		list_del(&rule->dr_link);
+		CFS_FREE_PTR(rule);
+		n++;
+	}
+
+	return n;
+}
+
+/**
+ * List drop rule at position of \a pos
+ */
+static int
+lnet_drop_rule_list(int pos, struct lnet_fault_attr *attr,
+		    struct lnet_fault_stat *stat)
+{
+	struct lnet_drop_rule *rule;
+	int cpt;
+	int i = 0;
+	int rc = -ENOENT;
+
+	cpt = lnet_net_lock_current();
+	list_for_each_entry(rule, &the_lnet.ln_drop_rules, dr_link) {
+		if (i++ < pos)
+			continue;
+
+		spin_lock(&rule->dr_lock);
+		*attr = rule->dr_attr;
+		*stat = rule->dr_stat;
+		spin_unlock(&rule->dr_lock);
+		rc = 0;
+		break;
+	}
+
+	lnet_net_unlock(cpt);
+	return rc;
+}
+
+/**
+ * reset counters for all drop rules
+ */
+static void
+lnet_drop_rule_reset(void)
+{
+	struct lnet_drop_rule *rule;
+	int cpt;
+
+	cpt = lnet_net_lock_current();
+
+	list_for_each_entry(rule, &the_lnet.ln_drop_rules, dr_link) {
+		struct lnet_fault_attr *attr = &rule->dr_attr;
+
+		spin_lock(&rule->dr_lock);
+
+		memset(&rule->dr_stat, 0, sizeof(rule->dr_stat));
+		if (attr->u.drop.da_rate) {
+			rule->dr_drop_at = cfs_rand() % attr->u.drop.da_rate;
+		} else {
+			rule->dr_drop_time = cfs_time_shift(cfs_rand() %
+						attr->u.drop.da_interval);
+			rule->dr_time_base = cfs_time_shift(attr->u.drop.
+								  da_interval);
+		}
+		spin_unlock(&rule->dr_lock);
+	}
+
+	lnet_net_unlock(cpt);
+}
+
+/**
+ * check source/destination NID, portal, message type and drop rate,
+ * and decide whether this message should be dropped
+ */
+static bool
+drop_rule_match(struct lnet_drop_rule *rule, lnet_nid_t src,
+		lnet_nid_t dst, unsigned int type, unsigned int portal)
+{
+	struct lnet_fault_attr *attr = &rule->dr_attr;
+	bool drop;
+
+	if (!lnet_fault_attr_match(attr, src, dst, type, portal))
+		return false;
+
+	/* match this rule, check drop rate now */
+	spin_lock(&rule->dr_lock);
+	if (rule->dr_drop_time) { /* time based drop */
+		unsigned long now = cfs_time_current();
+
+		rule->dr_stat.fs_count++;
+		drop = cfs_time_aftereq(now, rule->dr_drop_time);
+		if (drop) {
+			if (cfs_time_after(now, rule->dr_time_base))
+				rule->dr_time_base = now;
+
+			rule->dr_drop_time = rule->dr_time_base +
+					     cfs_time_seconds(cfs_rand() %
+						attr->u.drop.da_interval);
+			rule->dr_time_base += cfs_time_seconds(attr->u.drop.
+							       da_interval);
+
+			CDEBUG(D_NET, "Drop Rule %s->%s: next drop : %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst),
+			       rule->dr_drop_time);
+		}
+
+	} else { /* rate based drop */
+		drop = rule->dr_stat.fs_count++ == rule->dr_drop_at;
+
+		if (!(rule->dr_stat.fs_count % attr->u.drop.da_rate)) {
+			rule->dr_drop_at = rule->dr_stat.fs_count +
+					   cfs_rand() % attr->u.drop.da_rate;
+			CDEBUG(D_NET, "Drop Rule %s->%s: next drop: %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst), rule->dr_drop_at);
+		}
+	}
+
+	if (drop) { /* drop this message, update counters */
+		lnet_fault_stat_inc(&rule->dr_stat, type);
+		rule->dr_stat.u.drop.ds_dropped++;
+	}
+
+	spin_unlock(&rule->dr_lock);
+	return drop;
+}
+
+/**
+ * Check if message from \a src to \a dst matches any existing drop rule
+ */
+bool
+lnet_drop_rule_match(lnet_hdr_t *hdr)
+{
+	struct lnet_drop_rule *rule;
+	lnet_nid_t src = le64_to_cpu(hdr->src_nid);
+	lnet_nid_t dst = le64_to_cpu(hdr->dest_nid);
+	unsigned int typ = le32_to_cpu(hdr->type);
+	unsigned int ptl = -1;
+	bool drop = false;
+	int cpt;
+
+	/**
+	 * NB: if Portal is specified, then only PUT and GET will be
+	 * filtered by drop rule
+	 */
+	if (typ == LNET_MSG_PUT)
+		ptl = le32_to_cpu(hdr->msg.put.ptl_index);
+	else if (typ == LNET_MSG_GET)
+		ptl = le32_to_cpu(hdr->msg.get.ptl_index);
+
+	cpt = lnet_net_lock_current();
+	list_for_each_entry(rule, &the_lnet.ln_drop_rules, dr_link) {
+		drop = drop_rule_match(rule, src, dst, typ, ptl);
+		if (drop)
+			break;
+	}
+
+	lnet_net_unlock(cpt);
+	return drop;
+}
+
+int
+lnet_fault_ctl(int opc, struct libcfs_ioctl_data *data)
+{
+	struct lnet_fault_attr *attr;
+	struct lnet_fault_stat *stat;
+
+	attr = (struct lnet_fault_attr *)data->ioc_inlbuf1;
+
+	switch (opc) {
+	default:
+		return -EINVAL;
+
+	case LNET_CTL_DROP_ADD:
+		if (!attr)
+			return -EINVAL;
+
+		return lnet_drop_rule_add(attr);
+
+	case LNET_CTL_DROP_DEL:
+		if (!attr)
+			return -EINVAL;
+
+		data->ioc_count = lnet_drop_rule_del(attr->fa_src,
+						     attr->fa_dst);
+		return 0;
+
+	case LNET_CTL_DROP_RESET:
+		lnet_drop_rule_reset();
+		return 0;
+
+	case LNET_CTL_DROP_LIST:
+		stat = (struct lnet_fault_stat *)data->ioc_inlbuf2;
+		if (!attr || !stat)
+			return -EINVAL;
+
+		return lnet_drop_rule_list(data->ioc_count, attr, stat);
+	}
+}
+
+int
+lnet_fault_init(void)
+{
+	CLASSERT(LNET_PUT_BIT == 1 << LNET_MSG_PUT);
+	CLASSERT(LNET_ACK_BIT == 1 << LNET_MSG_ACK);
+	CLASSERT(LNET_GET_BIT == 1 << LNET_MSG_GET);
+	CLASSERT(LNET_REPLY_BIT == 1 << LNET_MSG_REPLY);
+
+	return 0;
+}
+
+void
+lnet_fault_fini(void)
+{
+	lnet_drop_rule_del(0, 0);
+
+	LASSERT(list_empty(&the_lnet.ln_drop_rules));
+}
-- 
1.7.1
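
The rate-based branch of drop_rule_match() above can be modelled in
userspace as follows. This is a sketch under stated assumptions, not the
kernel implementation: struct rate_rule, rate_rule_init(), rate_rule_hit()
and the toy_rand() generator are hypothetical names introduced here, with
toy_rand() standing in for cfs_rand(). The point it illustrates is that
picking one random sequence number inside each window of `rate` messages
yields exactly one hit per window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the rate-based fields of a drop rule. */
struct rate_rule {
	unsigned long count;    /* messages matched so far */
	unsigned long drop_at;  /* sequence number of the next drop */
	unsigned int  rate;     /* drop one message out of every `rate` */
};

/* Small deterministic generator standing in for cfs_rand(). */
static unsigned int toy_rand(void)
{
	static unsigned int seed = 12345;

	seed = seed * 1103515245u + 12345u;
	return (seed >> 16) & 0x7fff;
}

static void rate_rule_init(struct rate_rule *rule, unsigned int rate)
{
	assert(rate > 0);  /* a zero rate would divide by zero below */

	rule->count = 0;
	rule->rate = rate;
	rule->drop_at = toy_rand() % rate;
}

/* Returns true if the current message is the one chosen for this window. */
static bool rate_rule_hit(struct rate_rule *rule)
{
	bool drop = rule->count++ == rule->drop_at;

	/* At a window boundary, choose the victim for the next window. */
	if (rule->count % rule->rate == 0)
		rule->drop_at = rule->count + toy_rand() % rule->rate;

	return drop;
}
```

Feeding `rate` messages at a time through rate_rule_hit() should report a
single drop in every batch, whatever values the generator produces.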

* [PATCH 02/10] staging: lustre: LNet network latency simulation
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Liang Zhen

From: Liang Zhen <liang.zhen@intel.com>

An incoming LNet message can be delayed for a number of seconds if it
matches any LNet Delay Rule.

Users can add, remove, list and reset Delay Rules with lctl commands:
- lctl net_delay_add
  Add a new Delay Rule to LNet, options
  <-s | --source SRC_NID>
  <-d | --dest DST_NID>
  <<-r | --rate RATE_NUMBER>
  <-i | --interval SECONDS>>
  <-l | --latency DELAY_LATENCY>

- lctl net_delay_del
  Remove matched Delay Rule from LNet, options:
  <[-a | --all] |
  <-s | --source SRC_NID>
  <-d | --dest DST_NID>>

- lctl net_delay_list
  List all Delay Rules in LNet

- lctl net_delay_reset
  Reset statistic counters for all Delay Rules

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5435
Reviewed-on: http://review.whamcloud.com/11409
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   17 +
 .../staging/lustre/include/linux/lnet/lib-types.h  |    3 +
 .../staging/lustre/include/linux/lnet/lnetctl.h    |   21 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    1 +
 drivers/staging/lustre/lnet/lnet/lib-move.c        |   73 ++-
 drivers/staging/lustre/lnet/lnet/lib-msg.c         |    6 +
 drivers/staging/lustre/lnet/lnet/net_fault.c       |  601 +++++++++++++++++++-
 7 files changed, 683 insertions(+), 39 deletions(-)
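
As a reading aid for the delay path in net_fault.c, the sketch below models
how a delayed message gets its send deadline and when the per-rule timer
would need arming. It is a simplified userspace illustration: TICKS_PER_SEC,
struct delay_queue, round_up_second() and delay_enqueue() are hypothetical
names, with TICKS_PER_SEC standing in for the kernel tick rate. The rounding
follows round_timeout() in the patch (truncate to whole seconds, then add
one second), and the "no deadline" sentinel mirrors the dl_msg_send == -1
check.

```c
#include <assert.h>
#include <stdbool.h>

#define TICKS_PER_SEC 100UL                  /* assumed tick rate */
#define NO_DEADLINE   ((unsigned long)-1)    /* no timer armed yet */
#define QUEUE_MAX     8

/* Hypothetical model of one delay rule's pending messages. */
struct delay_queue {
	unsigned long next_send;            /* deadline the timer is armed for */
	unsigned int  count;                /* queued messages */
	unsigned long deadline[QUEUE_MAX];  /* per-message send deadlines */
};

/* Truncate to whole seconds and add one second, so even an exact
 * multiple of TICKS_PER_SEC moves forward by a full second. */
static unsigned long round_up_second(unsigned long ticks)
{
	return (ticks / TICKS_PER_SEC + 1) * TICKS_PER_SEC;
}

/* Queue a message; returns true when the caller would arm the timer. */
static bool delay_enqueue(struct delay_queue *q, unsigned long now,
			  unsigned int latency_sec)
{
	unsigned long when;

	assert(q->count < QUEUE_MAX);

	when = round_up_second(now + latency_sec * TICKS_PER_SEC);
	q->deadline[q->count++] = when;

	/* Messages are queued in FIFO order with one latency per rule, so
	 * only the first pending message needs to arm the timer. */
	if (q->next_send != NO_DEADLINE)
		return false;

	q->next_send = when;
	return true;
}
```

In this model a second message queued while the timer is already armed
simply joins the list and relies on the deadline set by the first one.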

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 7b3f858..dfc0208 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -559,13 +559,22 @@ void lnet_portals_destroy(void);
 /* message functions */
 int lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr,
 	       lnet_nid_t fromnid, void *private, int rdma_req);
+int lnet_parse_local(lnet_ni_t *ni, lnet_msg_t *msg);
+int lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg);
+
 void lnet_recv(lnet_ni_t *ni, void *private, lnet_msg_t *msg, int delayed,
 	       unsigned int offset, unsigned int mlen, unsigned int rlen);
+void lnet_ni_recv(lnet_ni_t *ni, void *private, lnet_msg_t *msg,
+		  int delayed, unsigned int offset,
+		  unsigned int mlen, unsigned int rlen);
+
 lnet_msg_t *lnet_create_reply_msg(lnet_ni_t *ni, lnet_msg_t *get_msg);
 void lnet_set_reply_msg_len(lnet_ni_t *ni, lnet_msg_t *msg, unsigned int len);
 
 void lnet_finalize(lnet_ni_t *ni, lnet_msg_t *msg, int rc);
 
+void lnet_drop_message(lnet_ni_t *ni, int cpt, void *private,
+		       unsigned int nob);
 void lnet_drop_delayed_msg_list(struct list_head *head, char *reason);
 void lnet_recv_delayed_msg_list(struct list_head *head);
 
@@ -586,6 +595,14 @@ void lnet_fault_fini(void);
 
 bool lnet_drop_rule_match(lnet_hdr_t *hdr);
 
+int lnet_delay_rule_add(struct lnet_fault_attr *attr);
+int lnet_delay_rule_del(lnet_nid_t src, lnet_nid_t dst, bool shutdown);
+int lnet_delay_rule_list(int pos, struct lnet_fault_attr *attr,
+			 struct lnet_fault_stat *stat);
+void lnet_delay_rule_reset(void);
+void lnet_delay_rule_check(void);
+bool lnet_delay_rule_match_locked(lnet_hdr_t *hdr, struct lnet_msg *msg);
+
 /** @} lnet_fault_simulation */
 
 void lnet_counters_get(lnet_counters_t *counters);
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index cb09a8a..29c72f8 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -88,6 +88,7 @@ typedef struct lnet_msg {
 	unsigned int	msg_rtrcredit:1;	/* taken a global router credit */
 	unsigned int	msg_peerrtrcredit:1;	/* taken a peer router credit */
 	unsigned int	msg_onactivelist:1;	/* on the activelist */
+	unsigned int	msg_rdma_get:1;
 
 	struct lnet_peer	*msg_txpeer;	 /* peer I'm sending to */
 	struct lnet_peer	*msg_rxpeer;	 /* peer I received from */
@@ -574,6 +575,7 @@ typedef struct {
 	/* failure simulation */
 	struct list_head		  ln_test_peers;
 	struct list_head		  ln_drop_rules;
+	struct list_head		  ln_delay_rules;
 
 	struct list_head		  ln_nis;	/* LND instances */
 	/* NIs bond on specific CPT(s) */
@@ -610,6 +612,7 @@ typedef struct {
 
 	struct mutex			  ln_api_mutex;
 	struct mutex			  ln_lnd_mutex;
+	struct mutex			  ln_delay_mutex;
 	/* Have I called LNetNIInit myself? */
 	int				  ln_niinit_self;
 	/* LNetNIInit/LNetNIFini counter */
diff --git a/drivers/staging/lustre/include/linux/lnet/lnetctl.h b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
index ec33bf8..3957507 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetctl.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
@@ -26,6 +26,10 @@ enum {
 	LNET_CTL_DROP_DEL,
 	LNET_CTL_DROP_RESET,
 	LNET_CTL_DROP_LIST,
+	LNET_CTL_DELAY_ADD,
+	LNET_CTL_DELAY_DEL,
+	LNET_CTL_DELAY_RESET,
+	LNET_CTL_DELAY_LIST,
 };
 
 #define LNET_ACK_BIT		BIT(0)
@@ -71,7 +75,17 @@ struct lnet_fault_attr {
 			 */
 			__u32			da_interval;
 		} drop;
-		/** TODO: add more */
+		/** message latency simulation */
+		struct {
+			__u32			la_rate;
+			/**
+			 * time interval of message delay, it is exclusive
+			 * with la_rate
+			 */
+			__u32			la_interval;
+			/** latency to delay */
+			__u32			la_latency;
+		} delay;
 		__u64			space[8];
 	} u;
 };
@@ -93,7 +107,10 @@ struct lnet_fault_stat {
 			/** total # dropped messages */
 			__u64			ds_dropped;
 		} drop;
-		/** TODO: add more */
+		struct {
+			/** total # delayed messages */
+			__u64			ls_delayed;
+		} delay;
 		__u64			space[8];
 	} u;
 };
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 4d77ca3..a666d49 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -551,6 +551,7 @@ lnet_prepare(lnet_pid_t requested_pid)
 	INIT_LIST_HEAD(&the_lnet.ln_nis_zombie);
 	INIT_LIST_HEAD(&the_lnet.ln_routers);
 	INIT_LIST_HEAD(&the_lnet.ln_drop_rules);
+	INIT_LIST_HEAD(&the_lnet.ln_delay_rules);
 
 	rc = lnet_create_remote_nets_table();
 	if (rc)
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 7a0f185..a5e90e7 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -42,11 +42,6 @@
 
 #include "../../include/linux/lnet/lib-lnet.h"
 
-/** lnet message has credit and can be submitted to lnd for send/receive */
-#define LNET_CREDIT_OK		0
-/** lnet message is waiting for credit */
-#define LNET_CREDIT_WAIT	1
-
 static int local_nid_dist_zero = 1;
 module_param(local_nid_dist_zero, int, 0444);
 MODULE_PARM_DESC(local_nid_dist_zero, "Reserved");
@@ -570,7 +565,7 @@ lnet_extract_kiov(int dst_niov, lnet_kiov_t *dst,
 }
 EXPORT_SYMBOL(lnet_extract_kiov);
 
-static void
+void
 lnet_ni_recv(lnet_ni_t *ni, void *private, lnet_msg_t *msg, int delayed,
 	     unsigned int offset, unsigned int mlen, unsigned int rlen)
 {
@@ -1431,7 +1426,7 @@ lnet_send(lnet_nid_t src_nid, lnet_msg_t *msg, lnet_nid_t rtr_nid)
 	return 0; /* rc == LNET_CREDIT_OK or LNET_CREDIT_WAIT */
 }
 
-static void
+void
 lnet_drop_message(lnet_ni_t *ni, int cpt, void *private, unsigned int nob)
 {
 	lnet_net_lock(cpt);
@@ -1705,7 +1700,7 @@ lnet_parse_ack(lnet_ni_t *ni, lnet_msg_t *msg)
  * \retval LNET_CREDIT_WAIT	If \a msg is blocked because w/o buffer
  * \retval -ve			error code
  */
-static int
+int
 lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg)
 {
 	int rc = 0;
@@ -1729,6 +1724,33 @@ lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg)
 	return rc;
 }
 
+int
+lnet_parse_local(lnet_ni_t *ni, lnet_msg_t *msg)
+{
+	int rc;
+
+	switch (msg->msg_type) {
+	case LNET_MSG_ACK:
+		rc = lnet_parse_ack(ni, msg);
+		break;
+	case LNET_MSG_PUT:
+		rc = lnet_parse_put(ni, msg);
+		break;
+	case LNET_MSG_GET:
+		rc = lnet_parse_get(ni, msg, msg->msg_rdma_get);
+		break;
+	case LNET_MSG_REPLY:
+		rc = lnet_parse_reply(ni, msg);
+		break;
+	default: /* prevent an unused label if !kernel */
+		LASSERT(0);
+		return -EPROTO;
+	}
+
+	LASSERT(!rc || rc == -ENOENT);
+	return rc;
+}
+
 char *
 lnet_msgtyp2str(int type)
 {
@@ -1953,6 +1975,7 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 	msg->msg_type = type;
 	msg->msg_private = private;
 	msg->msg_receiving = 1;
+	msg->msg_rdma_get = rdma_req;
 	msg->msg_wanted = payload_length;
 	msg->msg_len = payload_length;
 	msg->msg_offset = 0;
@@ -2000,6 +2023,13 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 
 	lnet_msg_commit(msg, cpt);
 
+	/* message delay simulation */
+	if (unlikely(!list_empty(&the_lnet.ln_delay_rules) &&
+		     lnet_delay_rule_match_locked(hdr, msg))) {
+		lnet_net_unlock(cpt);
+		return 0;
+	}
+
 	if (!for_me) {
 		rc = lnet_parse_forward_locked(ni, msg);
 		lnet_net_unlock(cpt);
@@ -2016,29 +2046,10 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 
 	lnet_net_unlock(cpt);
 
-	switch (type) {
-	case LNET_MSG_ACK:
-		rc = lnet_parse_ack(ni, msg);
-		break;
-	case LNET_MSG_PUT:
-		rc = lnet_parse_put(ni, msg);
-		break;
-	case LNET_MSG_GET:
-		rc = lnet_parse_get(ni, msg, rdma_req);
-		break;
-	case LNET_MSG_REPLY:
-		rc = lnet_parse_reply(ni, msg);
-		break;
-	default:
-		LASSERT(0);
-		rc = -EPROTO;
-		goto free_drop;  /* prevent an unused label if !kernel */
-	}
-
-	if (!rc)
-		return 0;
-
-	LASSERT(rc == -ENOENT);
+	rc = lnet_parse_local(ni, msg);
+	if (rc)
+		goto free_drop;
+	return 0;
 
  free_drop:
 	LASSERT(!msg->msg_md);
diff --git a/drivers/staging/lustre/lnet/lnet/lib-msg.c b/drivers/staging/lustre/lnet/lnet/lib-msg.c
index c372390..f879d7f 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-msg.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-msg.c
@@ -535,6 +535,12 @@ lnet_finalize(lnet_ni_t *ni, lnet_msg_t *msg, int status)
 			break;
 	}
 
+	if (unlikely(!list_empty(&the_lnet.ln_delay_rules))) {
+		lnet_net_unlock(cpt);
+		lnet_delay_rule_check();
+		lnet_net_lock(cpt);
+	}
+
 	container->msc_finalizers[my_slot] = NULL;
 	lnet_net_unlock(cpt);
 
diff --git a/drivers/staging/lustre/lnet/lnet/net_fault.c b/drivers/staging/lustre/lnet/lnet/net_fault.c
index 8ed05b6..91f44a7 100644
--- a/drivers/staging/lustre/lnet/lnet/net_fault.c
+++ b/drivers/staging/lustre/lnet/lnet/net_fault.c
@@ -138,6 +138,10 @@ lnet_fault_stat_inc(struct lnet_fault_stat *stat, unsigned int type)
 }
 
 /**
+ * LNet message drop simulation
+ */
+
+/**
  * Add a new drop rule to LNet
  * There is no check for duplicated drop rule, all rules will be checked for
  * incoming message.
@@ -147,8 +151,8 @@ lnet_drop_rule_add(struct lnet_fault_attr *attr)
 {
 	struct lnet_drop_rule *rule;
 
-	if (!attr->u.drop.da_rate == !attr->u.drop.da_interval) {
-		CDEBUG(D_NET, "invalid rate %d or interval %d\n",
+	if (!attr->u.drop.da_rate == !attr->u.drop.da_interval) {
+		CDEBUG(D_NET, "please provide either drop rate or drop interval, but not both at the same time %d/%d\n",
 		       attr->u.drop.da_rate, attr->u.drop.da_interval);
 		return -EINVAL;
 	}
@@ -276,8 +280,7 @@ lnet_drop_rule_reset(void)
 		} else {
 			rule->dr_drop_time = cfs_time_shift(cfs_rand() %
 						attr->u.drop.da_interval);
-			rule->dr_time_base = cfs_time_shift(attr->u.drop.
-								  da_interval);
+			rule->dr_time_base = cfs_time_shift(attr->u.drop.da_interval);
 		}
 		spin_unlock(&rule->dr_lock);
 	}
@@ -313,8 +316,7 @@ drop_rule_match(struct lnet_drop_rule *rule, lnet_nid_t src,
 			rule->dr_drop_time = rule->dr_time_base +
 					     cfs_time_seconds(cfs_rand() %
 						attr->u.drop.da_interval);
-			rule->dr_time_base += cfs_time_seconds(attr->u.drop.
-							       da_interval);
+			rule->dr_time_base += cfs_time_seconds(attr->u.drop.da_interval);
 
 			CDEBUG(D_NET, "Drop Rule %s->%s: next drop : %lu\n",
 			       libcfs_nid2str(attr->fa_src),
@@ -377,6 +379,559 @@ lnet_drop_rule_match(lnet_hdr_t *hdr)
 	return drop;
 }
 
+/**
+ * LNet Delay Simulation
+ */
+/** timestamp (second) to send delayed message */
+#define msg_delay_send		 msg_ev.hdr_data
+
+struct lnet_delay_rule {
+	/** link chain on the_lnet.ln_delay_rules */
+	struct list_head	dl_link;
+	/** link chain on delay_dd.dd_sched_rules */
+	struct list_head	dl_sched_link;
+	/** attributes of this rule */
+	struct lnet_fault_attr	dl_attr;
+	/** lock to protect \a below members */
+	spinlock_t		dl_lock;
+	/** refcount of delay rule */
+	atomic_t		dl_refcount;
+	/**
+	 * the message sequence to delay, which means message is delayed when
+	 * dl_stat.fs_count == dl_delay_at
+	 */
+	unsigned long		dl_delay_at;
+	/**
+	 * seconds to delay the next message, it's exclusive with dl_delay_at
+	 */
+	unsigned long		dl_delay_time;
+	/** baseline to calculate dl_delay_time */
+	unsigned long		dl_time_base;
+	/** jiffies to send the next delayed message */
+	unsigned long		dl_msg_send;
+	/** delayed message list */
+	struct list_head	dl_msg_list;
+	/** statistic of delayed messages */
+	struct lnet_fault_stat	dl_stat;
+	/** timer to wakeup delay_daemon */
+	struct timer_list	dl_timer;
+};
+
+struct delay_daemon_data {
+	/** serialise rule add/remove */
+	struct mutex		dd_mutex;
+	/** protect rules on \a dd_sched_rules */
+	spinlock_t		dd_lock;
+	/** scheduled delay rules (by timer) */
+	struct list_head	dd_sched_rules;
+	/** daemon thread sleeps here */
+	wait_queue_head_t	dd_waitq;
+	/** controller (lctl command) waits here */
+	wait_queue_head_t	dd_ctl_waitq;
+	/** daemon is running */
+	unsigned int		dd_running;
+	/** daemon stopped */
+	unsigned int		dd_stopped;
+};
+
+static struct delay_daemon_data	delay_dd;
+
+static unsigned long
+round_timeout(unsigned long timeout)
+{
+	return cfs_time_seconds((unsigned int)
+			cfs_duration_sec(cfs_time_sub(timeout, 0)) + 1);
+}
+
+static void
+delay_rule_decref(struct lnet_delay_rule *rule)
+{
+	if (atomic_dec_and_test(&rule->dl_refcount)) {
+		LASSERT(list_empty(&rule->dl_sched_link));
+		LASSERT(list_empty(&rule->dl_msg_list));
+		LASSERT(list_empty(&rule->dl_link));
+
+		CFS_FREE_PTR(rule);
+	}
+}
+
+/**
+ * check source/destination NID, portal, message type and delay rate,
+ * and decide whether this message should be delayed
+ */
+static bool
+delay_rule_match(struct lnet_delay_rule *rule, lnet_nid_t src,
+		 lnet_nid_t dst, unsigned int type, unsigned int portal,
+		 struct lnet_msg *msg)
+{
+	struct lnet_fault_attr *attr = &rule->dl_attr;
+	bool delay;
+
+	if (!lnet_fault_attr_match(attr, src, dst, type, portal))
+		return false;
+
+	/* match this rule, check delay rate now */
+	spin_lock(&rule->dl_lock);
+	if (rule->dl_delay_time) { /* time based delay */
+		unsigned long now = cfs_time_current();
+
+		rule->dl_stat.fs_count++;
+		delay = cfs_time_aftereq(now, rule->dl_delay_time);
+		if (delay) {
+			if (cfs_time_after(now, rule->dl_time_base))
+				rule->dl_time_base = now;
+
+			rule->dl_delay_time = rule->dl_time_base +
+					     cfs_time_seconds(cfs_rand() %
+						attr->u.delay.la_interval);
+			rule->dl_time_base += cfs_time_seconds(attr->u.delay.la_interval);
+
+			CDEBUG(D_NET, "Delay Rule %s->%s: next delay : %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst),
+			       rule->dl_delay_time);
+		}
+
+	} else { /* rate based delay */
+		delay = rule->dl_stat.fs_count++ == rule->dl_delay_at;
+		/* generate the next random rate sequence */
+		if (!(rule->dl_stat.fs_count % attr->u.delay.la_rate)) {
+			rule->dl_delay_at = rule->dl_stat.fs_count +
+					    cfs_rand() % attr->u.delay.la_rate;
+			CDEBUG(D_NET, "Delay Rule %s->%s: next delay: %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst), rule->dl_delay_at);
+		}
+	}
+
+	if (!delay) {
+		spin_unlock(&rule->dl_lock);
+		return false;
+	}
+
+	/* delay this message, update counters */
+	lnet_fault_stat_inc(&rule->dl_stat, type);
+	rule->dl_stat.u.delay.ls_delayed++;
+
+	list_add_tail(&msg->msg_list, &rule->dl_msg_list);
+	msg->msg_delay_send = round_timeout(
+			cfs_time_shift(attr->u.delay.la_latency));
+	if (rule->dl_msg_send == -1) {
+		rule->dl_msg_send = msg->msg_delay_send;
+		mod_timer(&rule->dl_timer, rule->dl_msg_send);
+	}
+
+	spin_unlock(&rule->dl_lock);
+	return true;
+}
+
+/**
+ * check if \a msg can match any Delay Rule, receiving of this message
+ * will be delayed if there is a match.
+ */
+bool
+lnet_delay_rule_match_locked(lnet_hdr_t *hdr, struct lnet_msg *msg)
+{
+	struct lnet_delay_rule *rule;
+	lnet_nid_t src = le64_to_cpu(hdr->src_nid);
+	lnet_nid_t dst = le64_to_cpu(hdr->dest_nid);
+	unsigned int typ = le32_to_cpu(hdr->type);
+	unsigned int ptl = -1;
+
+	/* NB: called with lnet_net_lock held */
+
+	/**
+	 * NB: if Portal is specified, then only PUT and GET will be
+	 * filtered by delay rule
+	 */
+	if (typ == LNET_MSG_PUT)
+		ptl = le32_to_cpu(hdr->msg.put.ptl_index);
+	else if (typ == LNET_MSG_GET)
+		ptl = le32_to_cpu(hdr->msg.get.ptl_index);
+
+	list_for_each_entry(rule, &the_lnet.ln_delay_rules, dl_link) {
+		if (delay_rule_match(rule, src, dst, typ, ptl, msg))
+			return true;
+	}
+
+	return false;
+}
+
+/** check out delayed messages for send */
+static void
+delayed_msg_check(struct lnet_delay_rule *rule, bool all,
+		  struct list_head *msg_list)
+{
+	struct lnet_msg *msg;
+	struct lnet_msg *tmp;
+	unsigned long now = cfs_time_current();
+
+	if (!all && rule->dl_msg_send > now)
+		return;
+
+	spin_lock(&rule->dl_lock);
+	list_for_each_entry_safe(msg, tmp, &rule->dl_msg_list, msg_list) {
+		if (!all && msg->msg_delay_send > now)
+			break;
+
+		msg->msg_delay_send = 0;
+		list_move_tail(&msg->msg_list, msg_list);
+	}
+
+	if (list_empty(&rule->dl_msg_list)) {
+		del_timer(&rule->dl_timer);
+		rule->dl_msg_send = -1;
+
+	} else if (!list_empty(msg_list)) {
+		/*
+		 * dequeued some timed-out messages, update the timer for
+		 * the next delayed message on this rule
+		 */
+		msg = list_entry(rule->dl_msg_list.next,
+				 struct lnet_msg, msg_list);
+		rule->dl_msg_send = msg->msg_delay_send;
+		mod_timer(&rule->dl_timer, rule->dl_msg_send);
+	}
+	spin_unlock(&rule->dl_lock);
+}
+
+static void
+delayed_msg_process(struct list_head *msg_list, bool drop)
+{
+	struct lnet_msg	*msg;
+
+	while (!list_empty(msg_list)) {
+		struct lnet_ni *ni;
+		int cpt;
+		int rc;
+
+		msg = list_entry(msg_list->next, struct lnet_msg, msg_list);
+		LASSERT(msg->msg_rxpeer);
+
+		ni = msg->msg_rxpeer->lp_ni;
+		cpt = msg->msg_rx_cpt;
+
+		list_del_init(&msg->msg_list);
+		if (drop) {
+			rc = -ECANCELED;
+
+		} else if (!msg->msg_routing) {
+			rc = lnet_parse_local(ni, msg);
+			if (!rc)
+				continue;
+
+		} else {
+			lnet_net_lock(cpt);
+			rc = lnet_parse_forward_locked(ni, msg);
+			lnet_net_unlock(cpt);
+
+			switch (rc) {
+			case LNET_CREDIT_OK:
+				lnet_ni_recv(ni, msg->msg_private, msg, 0,
+					     0, msg->msg_len, msg->msg_len);
+			case LNET_CREDIT_WAIT:
+				continue;
+			default: /* failures */
+				break;
+			}
+		}
+
+		lnet_drop_message(ni, cpt, msg->msg_private, msg->msg_len);
+		lnet_finalize(ni, msg, rc);
+	}
+}
+
+/**
+ * Process delayed messages for scheduled rules
+ * This function can be called by lnet_delay_rule_daemon or by lnet_finalize
+ */
+void
+lnet_delay_rule_check(void)
+{
+	struct lnet_delay_rule *rule;
+	struct list_head msgs;
+
+	INIT_LIST_HEAD(&msgs);
+	while (1) {
+		if (list_empty(&delay_dd.dd_sched_rules))
+			break;
+
+		spin_lock_bh(&delay_dd.dd_lock);
+		if (list_empty(&delay_dd.dd_sched_rules)) {
+			spin_unlock_bh(&delay_dd.dd_lock);
+			break;
+		}
+
+		rule = list_entry(delay_dd.dd_sched_rules.next,
+				  struct lnet_delay_rule, dl_sched_link);
+		list_del_init(&rule->dl_sched_link);
+		spin_unlock_bh(&delay_dd.dd_lock);
+
+		delayed_msg_check(rule, false, &msgs);
+		delay_rule_decref(rule); /* -1 for delay_dd.dd_sched_rules */
+	}
+
+	if (!list_empty(&msgs))
+		delayed_msg_process(&msgs, false);
+}
+
+/** daemon thread to handle delayed messages */
+static int
+lnet_delay_rule_daemon(void *arg)
+{
+	delay_dd.dd_running = 1;
+	wake_up(&delay_dd.dd_ctl_waitq);
+
+	while (delay_dd.dd_running) {
+		wait_event_interruptible(delay_dd.dd_waitq,
+					 !delay_dd.dd_running ||
+					 !list_empty(&delay_dd.dd_sched_rules));
+		lnet_delay_rule_check();
+	}
+
+	/* in case more rules have been enqueued after my last check */
+	lnet_delay_rule_check();
+	delay_dd.dd_stopped = 1;
+	wake_up(&delay_dd.dd_ctl_waitq);
+
+	return 0;
+}
+
+static void
+delay_timer_cb(unsigned long arg)
+{
+	struct lnet_delay_rule *rule = (struct lnet_delay_rule *)arg;
+
+	spin_lock_bh(&delay_dd.dd_lock);
+	if (list_empty(&rule->dl_sched_link) && delay_dd.dd_running) {
+		atomic_inc(&rule->dl_refcount);
+		list_add_tail(&rule->dl_sched_link, &delay_dd.dd_sched_rules);
+		wake_up(&delay_dd.dd_waitq);
+	}
+	spin_unlock_bh(&delay_dd.dd_lock);
+}
+
+/**
+ * Add a new delay rule to LNet
+ * There is no check for duplicate delay rules; every rule is checked
+ * against each incoming message.
+ */
+int
+lnet_delay_rule_add(struct lnet_fault_attr *attr)
+{
+	struct lnet_delay_rule *rule;
+	int rc = 0;
+
+	if (!attr->u.delay.la_rate == !attr->u.delay.la_interval) {
+		CDEBUG(D_NET, "please provide either delay rate or delay interval, but not both at the same time %d/%d\n",
+		       attr->u.delay.la_rate, attr->u.delay.la_interval);
+		return -EINVAL;
+	}
+
+	if (!attr->u.delay.la_latency) {
+		CDEBUG(D_NET, "delay latency cannot be zero\n");
+		return -EINVAL;
+	}
+
+	if (lnet_fault_attr_validate(attr))
+		return -EINVAL;
+
+	CFS_ALLOC_PTR(rule);
+	if (!rule)
+		return -ENOMEM;
+
+	mutex_lock(&delay_dd.dd_mutex);
+	if (!delay_dd.dd_running) {
+		struct task_struct *task;
+
+		/**
+		 * NB: although LND threads process delayed messages in
+		 * lnet_finalize, there is no guarantee that an LND
+		 * thread will be woken up if no other message needs to
+		 * be handled.
+		 * There is only one daemon thread; performance is not a
+		 * concern for this simulation module.
+		 */
+		task = kthread_run(lnet_delay_rule_daemon, NULL, "lnet_dd");
+		if (IS_ERR(task)) {
+			rc = PTR_ERR(task);
+			goto failed;
+		}
+		wait_event(delay_dd.dd_ctl_waitq, delay_dd.dd_running);
+	}
+
+	init_timer(&rule->dl_timer);
+	rule->dl_timer.function = delay_timer_cb;
+	rule->dl_timer.data = (unsigned long)rule;
+
+	spin_lock_init(&rule->dl_lock);
+	INIT_LIST_HEAD(&rule->dl_msg_list);
+	INIT_LIST_HEAD(&rule->dl_sched_link);
+
+	rule->dl_attr = *attr;
+	if (attr->u.delay.la_interval) {
+		rule->dl_time_base = cfs_time_shift(attr->u.delay.la_interval);
+		rule->dl_delay_time = cfs_time_shift(cfs_rand() %
+						     attr->u.delay.la_interval);
+	} else {
+		rule->dl_delay_at = cfs_rand() % attr->u.delay.la_rate;
+	}
+
+	rule->dl_msg_send = -1;
+
+	lnet_net_lock(LNET_LOCK_EX);
+	atomic_set(&rule->dl_refcount, 1);
+	list_add(&rule->dl_link, &the_lnet.ln_delay_rules);
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	CDEBUG(D_NET, "Added delay rule: src %s, dst %s, rate %d\n",
+	       libcfs_nid2str(attr->fa_src), libcfs_nid2str(attr->fa_dst),
+	       attr->u.delay.la_rate);
+
+	mutex_unlock(&delay_dd.dd_mutex);
+	return 0;
+failed:
+	mutex_unlock(&delay_dd.dd_mutex);
+	CFS_FREE_PTR(rule);
+	return rc;
+}
+
+/**
+ * Remove matched Delay Rules from LNet. If \a shutdown is true or both \a src
+ * and \a dst are zero, all rules are removed; otherwise only matched rules
+ * are removed.
+ * If \a src is zero, all rules with \a dst as destination are removed.
+ * If \a dst is zero, all rules with \a src as source are removed.
+ *
+ * When a delay rule is removed, all delayed messages of this rule will be
+ * processed immediately.
+ */
+int
+lnet_delay_rule_del(lnet_nid_t src, lnet_nid_t dst, bool shutdown)
+{
+	struct lnet_delay_rule *rule;
+	struct lnet_delay_rule *tmp;
+	struct list_head rule_list;
+	struct list_head msg_list;
+	int n = 0;
+	bool cleanup;
+
+	INIT_LIST_HEAD(&rule_list);
+	INIT_LIST_HEAD(&msg_list);
+
+	if (shutdown) {
+		src = 0;
+		dst = 0;
+	}
+
+	mutex_lock(&delay_dd.dd_mutex);
+	lnet_net_lock(LNET_LOCK_EX);
+
+	list_for_each_entry_safe(rule, tmp, &the_lnet.ln_delay_rules, dl_link) {
+		if (rule->dl_attr.fa_src != src && src)
+			continue;
+
+		if (rule->dl_attr.fa_dst != dst && dst)
+			continue;
+
+		CDEBUG(D_NET, "Remove delay rule: src %s->dst: %s (1/%d, %d)\n",
+		       libcfs_nid2str(rule->dl_attr.fa_src),
+		       libcfs_nid2str(rule->dl_attr.fa_dst),
+		       rule->dl_attr.u.delay.la_rate,
+		       rule->dl_attr.u.delay.la_interval);
+		/* refcount is taken over by rule_list */
+		list_move(&rule->dl_link, &rule_list);
+	}
+
+	/* check if we need to shutdown delay_daemon */
+	cleanup = list_empty(&the_lnet.ln_delay_rules) &&
+		  !list_empty(&rule_list);
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	list_for_each_entry_safe(rule, tmp, &rule_list, dl_link) {
+		list_del_init(&rule->dl_link);
+
+		del_timer_sync(&rule->dl_timer);
+		delayed_msg_check(rule, true, &msg_list);
+		delay_rule_decref(rule); /* -1 for the_lnet.ln_delay_rules */
+		n++;
+	}
+
+	if (cleanup) { /* no more delay rule, shutdown delay_daemon */
+		LASSERT(delay_dd.dd_running);
+		delay_dd.dd_running = 0;
+		wake_up(&delay_dd.dd_waitq);
+
+		while (!delay_dd.dd_stopped)
+			wait_event(delay_dd.dd_ctl_waitq, delay_dd.dd_stopped);
+	}
+	mutex_unlock(&delay_dd.dd_mutex);
+
+	if (!list_empty(&msg_list))
+		delayed_msg_process(&msg_list, shutdown);
+
+	return n;
+}
+
+/**
+ * List the Delay Rule at position \a pos
+ */
+int
+lnet_delay_rule_list(int pos, struct lnet_fault_attr *attr,
+		     struct lnet_fault_stat *stat)
+{
+	struct lnet_delay_rule *rule;
+	int cpt;
+	int i = 0;
+	int rc = -ENOENT;
+
+	cpt = lnet_net_lock_current();
+	list_for_each_entry(rule, &the_lnet.ln_delay_rules, dl_link) {
+		if (i++ < pos)
+			continue;
+
+		spin_lock(&rule->dl_lock);
+		*attr = rule->dl_attr;
+		*stat = rule->dl_stat;
+		spin_unlock(&rule->dl_lock);
+		rc = 0;
+		break;
+	}
+
+	lnet_net_unlock(cpt);
+	return rc;
+}
+
+/**
+ * reset counters for all Delay Rules
+ */
+void
+lnet_delay_rule_reset(void)
+{
+	struct lnet_delay_rule *rule;
+	int cpt;
+
+	cpt = lnet_net_lock_current();
+
+	list_for_each_entry(rule, &the_lnet.ln_delay_rules, dl_link) {
+		struct lnet_fault_attr *attr = &rule->dl_attr;
+
+		spin_lock(&rule->dl_lock);
+
+		memset(&rule->dl_stat, 0, sizeof(rule->dl_stat));
+		if (attr->u.delay.la_rate) {
+			rule->dl_delay_at = cfs_rand() % attr->u.delay.la_rate;
+		} else {
+			rule->dl_delay_time = cfs_time_shift(cfs_rand() %
+						attr->u.delay.la_interval);
+			rule->dl_time_base = cfs_time_shift(attr->u.delay.la_interval);
+		}
+		spin_unlock(&rule->dl_lock);
+	}
+
+	lnet_net_unlock(cpt);
+}
+
 int
 lnet_fault_ctl(int opc, struct libcfs_ioctl_data *data)
 {
@@ -413,6 +968,31 @@ lnet_fault_ctl(int opc, struct libcfs_ioctl_data *data)
 			return -EINVAL;
 
 		return lnet_drop_rule_list(data->ioc_count, attr, stat);
+
+	case LNET_CTL_DELAY_ADD:
+		if (!attr)
+			return -EINVAL;
+
+		return lnet_delay_rule_add(attr);
+
+	case LNET_CTL_DELAY_DEL:
+		if (!attr)
+			return -EINVAL;
+
+		data->ioc_count = lnet_delay_rule_del(attr->fa_src,
+						      attr->fa_dst, false);
+		return 0;
+
+	case LNET_CTL_DELAY_RESET:
+		lnet_delay_rule_reset();
+		return 0;
+
+	case LNET_CTL_DELAY_LIST:
+		stat = (struct lnet_fault_stat *)data->ioc_inlbuf2;
+		if (!attr || !stat)
+			return -EINVAL;
+
+		return lnet_delay_rule_list(data->ioc_count, attr, stat);
 	}
 }
 
@@ -424,6 +1004,12 @@ lnet_fault_init(void)
 	CLASSERT(LNET_GET_BIT == 1 << LNET_MSG_GET);
 	CLASSERT(LNET_REPLY_BIT == 1 << LNET_MSG_REPLY);
 
+	mutex_init(&delay_dd.dd_mutex);
+	spin_lock_init(&delay_dd.dd_lock);
+	init_waitqueue_head(&delay_dd.dd_waitq);
+	init_waitqueue_head(&delay_dd.dd_ctl_waitq);
+	INIT_LIST_HEAD(&delay_dd.dd_sched_rules);
+
 	return 0;
 }
 
@@ -431,6 +1017,9 @@ void
 lnet_fault_fini(void)
 {
 	lnet_drop_rule_del(0, 0);
+	lnet_delay_rule_del(0, 0, true);
 
 	LASSERT(list_empty(&the_lnet.ln_drop_rules));
+	LASSERT(list_empty(&the_lnet.ln_delay_rules));
+	LASSERT(list_empty(&delay_dd.dd_sched_rules));
 }
-- 
1.7.1


* [lustre-devel] [PATCH 02/10] staging: lustre: LNet network latency simulation
@ 2016-03-05  2:09   ` James Simmons
  0 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Liang Zhen

From: Liang Zhen <liang.zhen@intel.com>

An incoming LNet message can be delayed for a number of seconds if it
matches any LNet Delay Rule.
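
The delay is applied at one-second granularity: round_timeout() in
net_fault.c converts each send deadline to whole seconds, adds one, and
converts back. A minimal userspace sketch of that rounding follows.
TICKS_PER_SEC and round_up_to_next_second() are hypothetical stand-ins
for HZ and round_timeout(); they are not names from this patch.

```c
#include <assert.h>

/* Hypothetical tick rate standing in for the kernel's HZ. */
#define TICKS_PER_SEC 100UL

/*
 * Round a deadline (in ticks) up to the next whole-second boundary:
 * truncate to whole seconds, add one, convert back to ticks.
 * A deadline already on a boundary still moves to the following second.
 */
static unsigned long round_up_to_next_second(unsigned long deadline_ticks)
{
	unsigned long whole_seconds = deadline_ticks / TICKS_PER_SEC;

	/* guard against overflow when converting back to ticks */
	assert(whole_seconds < (unsigned long)-1 / TICKS_PER_SEC);
	return (whole_seconds + 1) * TICKS_PER_SEC;
}
```

Under these assumptions a message can be held for up to one second
longer than the requested latency, which seems acceptable for a
simulation facility.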

Users can add, remove, list and reset Delay Rules with lctl commands:
- lctl net_delay_add
  Add a new Delay Rule to LNet, options:
  <-s | --source SRC_NID>
  <-d | --dest DST_NID>
  <<-r | --rate RATE_NUMBER>
  <-i | --interval SECONDS>>
  <-l | --latency DELAY_LATENCY>

- lctl net_delay_del
  Remove matching Delay Rules from LNet, options:
  <[-a | --all] |
  <-s | --source SRC_NID>
  <-d | --dest DST_NID>>

- lctl net_delay_list
  List all Delay Rules in LNet

- lctl net_delay_reset
  Reset statistics counters for all Delay Rules
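
The rate and interval options are meant to be mutually exclusive, and a
zero latency is rejected. The sketch below illustrates that rule and
the rate-based selection, in which one message out of every RATE
messages is picked for delay. It assumes simplified, hypothetical types
(struct delay_opts, struct rate_state) rather than the kernel's
struct lnet_fault_attr and struct lnet_delay_rule, and the random
number is passed in by the caller so the behaviour is reproducible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the delay attributes. */
struct delay_opts {
	unsigned int rate;     /* delay about 1 in 'rate' messages, 0 if unset */
	unsigned int interval; /* delay about once per 'interval' seconds, 0 if unset */
	unsigned int latency;  /* seconds to hold a delayed message */
};

/* Valid only if exactly one of rate/interval is set and latency is non-zero. */
static bool delay_opts_valid(const struct delay_opts *o)
{
	bool has_rate = o->rate != 0;
	bool has_interval = o->interval != 0;

	return (has_rate != has_interval) && o->latency != 0;
}

/* Hypothetical per-rule state for rate-based selection. */
struct rate_state {
	unsigned long count;    /* messages seen so far */
	unsigned long delay_at; /* sequence number of the next message to delay */
};

/*
 * Return true if the current message should be delayed. At the end of
 * each window of 'rate' messages, pick the victim of the next window
 * using the caller-supplied random number 'rnd'.
 */
static bool rate_should_delay(struct rate_state *s, unsigned int rate,
			      unsigned int rnd)
{
	bool delay;

	assert(rate != 0);
	delay = (s->count++ == s->delay_at);
	if (s->count % rate == 0)
		s->delay_at = s->count + rnd % rate;

	return delay;
}
```

With rate = 4, an initial delay_at of 2 and rnd fixed at 1, the
messages with sequence numbers 2 and 5 (counting from 0) are the ones
delayed among the first eight.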

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5435
Reviewed-on: http://review.whamcloud.com/11409
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   17 +
 .../staging/lustre/include/linux/lnet/lib-types.h  |    3 +
 .../staging/lustre/include/linux/lnet/lnetctl.h    |   21 +-
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    1 +
 drivers/staging/lustre/lnet/lnet/lib-move.c        |   73 ++-
 drivers/staging/lustre/lnet/lnet/lib-msg.c         |    6 +
 drivers/staging/lustre/lnet/lnet/net_fault.c       |  601 +++++++++++++++++++-
 7 files changed, 683 insertions(+), 39 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 7b3f858..dfc0208 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -559,13 +559,22 @@ void lnet_portals_destroy(void);
 /* message functions */
 int lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr,
 	       lnet_nid_t fromnid, void *private, int rdma_req);
+int lnet_parse_local(lnet_ni_t *ni, lnet_msg_t *msg);
+int lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg);
+
 void lnet_recv(lnet_ni_t *ni, void *private, lnet_msg_t *msg, int delayed,
 	       unsigned int offset, unsigned int mlen, unsigned int rlen);
+void lnet_ni_recv(lnet_ni_t *ni, void *private, lnet_msg_t *msg,
+		  int delayed, unsigned int offset,
+		  unsigned int mlen, unsigned int rlen);
+
 lnet_msg_t *lnet_create_reply_msg(lnet_ni_t *ni, lnet_msg_t *get_msg);
 void lnet_set_reply_msg_len(lnet_ni_t *ni, lnet_msg_t *msg, unsigned int len);
 
 void lnet_finalize(lnet_ni_t *ni, lnet_msg_t *msg, int rc);
 
+void lnet_drop_message(lnet_ni_t *ni, int cpt, void *private,
+		       unsigned int nob);
 void lnet_drop_delayed_msg_list(struct list_head *head, char *reason);
 void lnet_recv_delayed_msg_list(struct list_head *head);
 
@@ -586,6 +595,14 @@ void lnet_fault_fini(void);
 
 bool lnet_drop_rule_match(lnet_hdr_t *hdr);
 
+int lnet_delay_rule_add(struct lnet_fault_attr *attr);
+int lnet_delay_rule_del(lnet_nid_t src, lnet_nid_t dst, bool shutdown);
+int lnet_delay_rule_list(int pos, struct lnet_fault_attr *attr,
+			 struct lnet_fault_stat *stat);
+void lnet_delay_rule_reset(void);
+void lnet_delay_rule_check(void);
+bool lnet_delay_rule_match_locked(lnet_hdr_t *hdr, struct lnet_msg *msg);
+
 /** @} lnet_fault_simulation */
 
 void lnet_counters_get(lnet_counters_t *counters);
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index cb09a8a..29c72f8 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -88,6 +88,7 @@ typedef struct lnet_msg {
 	unsigned int	msg_rtrcredit:1;	/* taken a global router credit */
 	unsigned int	msg_peerrtrcredit:1;	/* taken a peer router credit */
 	unsigned int	msg_onactivelist:1;	/* on the activelist */
+	unsigned int	msg_rdma_get:1;
 
 	struct lnet_peer	*msg_txpeer;	 /* peer I'm sending to */
 	struct lnet_peer	*msg_rxpeer;	 /* peer I received from */
@@ -574,6 +575,7 @@ typedef struct {
 	/* failure simulation */
 	struct list_head		  ln_test_peers;
 	struct list_head		  ln_drop_rules;
+	struct list_head		  ln_delay_rules;
 
 	struct list_head		  ln_nis;	/* LND instances */
 	/* NIs bond on specific CPT(s) */
@@ -610,6 +612,7 @@ typedef struct {
 
 	struct mutex			  ln_api_mutex;
 	struct mutex			  ln_lnd_mutex;
+	struct mutex			  ln_delay_mutex;
 	/* Have I called LNetNIInit myself? */
 	int				  ln_niinit_self;
 	/* LNetNIInit/LNetNIFini counter */
diff --git a/drivers/staging/lustre/include/linux/lnet/lnetctl.h b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
index ec33bf8..3957507 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetctl.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetctl.h
@@ -26,6 +26,10 @@ enum {
 	LNET_CTL_DROP_DEL,
 	LNET_CTL_DROP_RESET,
 	LNET_CTL_DROP_LIST,
+	LNET_CTL_DELAY_ADD,
+	LNET_CTL_DELAY_DEL,
+	LNET_CTL_DELAY_RESET,
+	LNET_CTL_DELAY_LIST,
 };
 
 #define LNET_ACK_BIT		BIT(0)
@@ -71,7 +75,17 @@ struct lnet_fault_attr {
 			 */
 			__u32			da_interval;
 		} drop;
-		/** TODO: add more */
+		/** message latency simulation */
+		struct {
+			__u32			la_rate;
+			/**
+			 * time interval of message delay, it is exclusive
+			 * with la_rate
+			 */
+			__u32			la_interval;
+			/** latency to delay */
+			__u32			la_latency;
+		} delay;
 		__u64			space[8];
 	} u;
 };
@@ -93,7 +107,10 @@ struct lnet_fault_stat {
 			/** total # dropped messages */
 			__u64			ds_dropped;
 		} drop;
-		/** TODO: add more */
+		struct {
+			/** total # delayed messages */
+			__u64			ls_delayed;
+		} delay;
 		__u64			space[8];
 	} u;
 };
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 4d77ca3..a666d49 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -551,6 +551,7 @@ lnet_prepare(lnet_pid_t requested_pid)
 	INIT_LIST_HEAD(&the_lnet.ln_nis_zombie);
 	INIT_LIST_HEAD(&the_lnet.ln_routers);
 	INIT_LIST_HEAD(&the_lnet.ln_drop_rules);
+	INIT_LIST_HEAD(&the_lnet.ln_delay_rules);
 
 	rc = lnet_create_remote_nets_table();
 	if (rc)
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 7a0f185..a5e90e7 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -42,11 +42,6 @@
 
 #include "../../include/linux/lnet/lib-lnet.h"
 
-/** lnet message has credit and can be submitted to lnd for send/receive */
-#define LNET_CREDIT_OK		0
-/** lnet message is waiting for credit */
-#define LNET_CREDIT_WAIT	1
-
 static int local_nid_dist_zero = 1;
 module_param(local_nid_dist_zero, int, 0444);
 MODULE_PARM_DESC(local_nid_dist_zero, "Reserved");
@@ -570,7 +565,7 @@ lnet_extract_kiov(int dst_niov, lnet_kiov_t *dst,
 }
 EXPORT_SYMBOL(lnet_extract_kiov);
 
-static void
+void
 lnet_ni_recv(lnet_ni_t *ni, void *private, lnet_msg_t *msg, int delayed,
 	     unsigned int offset, unsigned int mlen, unsigned int rlen)
 {
@@ -1431,7 +1426,7 @@ lnet_send(lnet_nid_t src_nid, lnet_msg_t *msg, lnet_nid_t rtr_nid)
 	return 0; /* rc == LNET_CREDIT_OK or LNET_CREDIT_WAIT */
 }
 
-static void
+void
 lnet_drop_message(lnet_ni_t *ni, int cpt, void *private, unsigned int nob)
 {
 	lnet_net_lock(cpt);
@@ -1705,7 +1700,7 @@ lnet_parse_ack(lnet_ni_t *ni, lnet_msg_t *msg)
  * \retval LNET_CREDIT_WAIT	If \a msg is blocked because w/o buffer
  * \retval -ve			error code
  */
-static int
+int
 lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg)
 {
 	int rc = 0;
@@ -1729,6 +1724,33 @@ lnet_parse_forward_locked(lnet_ni_t *ni, lnet_msg_t *msg)
 	return rc;
 }
 
+int
+lnet_parse_local(lnet_ni_t *ni, lnet_msg_t *msg)
+{
+	int rc;
+
+	switch (msg->msg_type) {
+	case LNET_MSG_ACK:
+		rc = lnet_parse_ack(ni, msg);
+		break;
+	case LNET_MSG_PUT:
+		rc = lnet_parse_put(ni, msg);
+		break;
+	case LNET_MSG_GET:
+		rc = lnet_parse_get(ni, msg, msg->msg_rdma_get);
+		break;
+	case LNET_MSG_REPLY:
+		rc = lnet_parse_reply(ni, msg);
+		break;
+	default: /* prevent an unused label if !kernel */
+		LASSERT(0);
+		return -EPROTO;
+	}
+
+	LASSERT(!rc || rc == -ENOENT);
+	return rc;
+}
+
 char *
 lnet_msgtyp2str(int type)
 {
@@ -1953,6 +1975,7 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 	msg->msg_type = type;
 	msg->msg_private = private;
 	msg->msg_receiving = 1;
+	msg->msg_rdma_get = rdma_req;
 	msg->msg_wanted = payload_length;
 	msg->msg_len = payload_length;
 	msg->msg_offset = 0;
@@ -2000,6 +2023,13 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 
 	lnet_msg_commit(msg, cpt);
 
+	/* message delay simulation */
+	if (unlikely(!list_empty(&the_lnet.ln_delay_rules) &&
+		     lnet_delay_rule_match_locked(hdr, msg))) {
+		lnet_net_unlock(cpt);
+		return 0;
+	}
+
 	if (!for_me) {
 		rc = lnet_parse_forward_locked(ni, msg);
 		lnet_net_unlock(cpt);
@@ -2016,29 +2046,10 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
 
 	lnet_net_unlock(cpt);
 
-	switch (type) {
-	case LNET_MSG_ACK:
-		rc = lnet_parse_ack(ni, msg);
-		break;
-	case LNET_MSG_PUT:
-		rc = lnet_parse_put(ni, msg);
-		break;
-	case LNET_MSG_GET:
-		rc = lnet_parse_get(ni, msg, rdma_req);
-		break;
-	case LNET_MSG_REPLY:
-		rc = lnet_parse_reply(ni, msg);
-		break;
-	default:
-		LASSERT(0);
-		rc = -EPROTO;
-		goto free_drop;  /* prevent an unused label if !kernel */
-	}
-
-	if (!rc)
-		return 0;
-
-	LASSERT(rc == -ENOENT);
+	rc = lnet_parse_local(ni, msg);
+	if (rc)
+		goto free_drop;
+	return 0;
 
  free_drop:
 	LASSERT(!msg->msg_md);
diff --git a/drivers/staging/lustre/lnet/lnet/lib-msg.c b/drivers/staging/lustre/lnet/lnet/lib-msg.c
index c372390..f879d7f 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-msg.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-msg.c
@@ -535,6 +535,12 @@ lnet_finalize(lnet_ni_t *ni, lnet_msg_t *msg, int status)
 			break;
 	}
 
+	if (unlikely(!list_empty(&the_lnet.ln_delay_rules))) {
+		lnet_net_unlock(cpt);
+		lnet_delay_rule_check();
+		lnet_net_lock(cpt);
+	}
+
 	container->msc_finalizers[my_slot] = NULL;
 	lnet_net_unlock(cpt);
 
diff --git a/drivers/staging/lustre/lnet/lnet/net_fault.c b/drivers/staging/lustre/lnet/lnet/net_fault.c
index 8ed05b6..91f44a7 100644
--- a/drivers/staging/lustre/lnet/lnet/net_fault.c
+++ b/drivers/staging/lustre/lnet/lnet/net_fault.c
@@ -138,6 +138,10 @@ lnet_fault_stat_inc(struct lnet_fault_stat *stat, unsigned int type)
 }
 
 /**
+ * LNet message drop simulation
+ */
+
+/**
  * Add a new drop rule to LNet
  * There is no check for duplicated drop rule, all rules will be checked for
  * incoming message.
@@ -147,8 +151,8 @@ lnet_drop_rule_add(struct lnet_fault_attr *attr)
 {
 	struct lnet_drop_rule *rule;
 
-	if (!attr->u.drop.da_rate == !attr->u.drop.da_interval) {
-		CDEBUG(D_NET, "invalid rate %d or interval %d\n",
+	if (attr->u.drop.da_rate & attr->u.drop.da_interval) {
+		CDEBUG(D_NET, "please provide either drop rate or drop interval, but not both at the same time %d/%d\n",
 		       attr->u.drop.da_rate, attr->u.drop.da_interval);
 		return -EINVAL;
 	}
@@ -276,8 +280,7 @@ lnet_drop_rule_reset(void)
 		} else {
 			rule->dr_drop_time = cfs_time_shift(cfs_rand() %
 						attr->u.drop.da_interval);
-			rule->dr_time_base = cfs_time_shift(attr->u.drop.
-								  da_interval);
+			rule->dr_time_base = cfs_time_shift(attr->u.drop.da_interval);
 		}
 		spin_unlock(&rule->dr_lock);
 	}
@@ -313,8 +316,7 @@ drop_rule_match(struct lnet_drop_rule *rule, lnet_nid_t src,
 			rule->dr_drop_time = rule->dr_time_base +
 					     cfs_time_seconds(cfs_rand() %
 						attr->u.drop.da_interval);
-			rule->dr_time_base += cfs_time_seconds(attr->u.drop.
-							       da_interval);
+			rule->dr_time_base += cfs_time_seconds(attr->u.drop.da_interval);
 
 			CDEBUG(D_NET, "Drop Rule %s->%s: next drop : %lu\n",
 			       libcfs_nid2str(attr->fa_src),
@@ -377,6 +379,559 @@ lnet_drop_rule_match(lnet_hdr_t *hdr)
 	return drop;
 }
 
+/**
+ * LNet Delay Simulation
+ */
+/** timestamp (second) to send delayed message */
+#define msg_delay_send		 msg_ev.hdr_data
+
+struct lnet_delay_rule {
+	/** link chain on the_lnet.ln_delay_rules */
+	struct list_head	dl_link;
+	/** link chain on delay_dd.dd_sched_rules */
+	struct list_head	dl_sched_link;
+	/** attributes of this rule */
+	struct lnet_fault_attr	dl_attr;
+	/** lock to protect \a below members */
+	spinlock_t		dl_lock;
+	/** refcount of delay rule */
+	atomic_t		dl_refcount;
+	/**
+	 * the message sequence to delay, which means message is delayed when
+	 * dl_stat.fs_count == dl_delay_at
+	 */
+	unsigned long		dl_delay_at;
+	/**
+	 * seconds to delay the next message, it's exclusive with dl_delay_at
+	 */
+	unsigned long		dl_delay_time;
+	/** baseline to calculate dl_delay_time */
+	unsigned long		dl_time_base;
+	/** jiffies to send the next delayed message */
+	unsigned long		dl_msg_send;
+	/** delayed message list */
+	struct list_head	dl_msg_list;
+	/** statistic of delayed messages */
+	struct lnet_fault_stat	dl_stat;
+	/** timer to wakeup delay_daemon */
+	struct timer_list	dl_timer;
+};
+
+struct delay_daemon_data {
+	/** serialise rule add/remove */
+	struct mutex		dd_mutex;
+	/** protect rules on \a dd_sched_rules */
+	spinlock_t		dd_lock;
+	/** scheduled delay rules (by timer) */
+	struct list_head	dd_sched_rules;
+	/** daemon thread sleeps here */
+	wait_queue_head_t	dd_waitq;
+	/** controller (lctl command) waits here */
+	wait_queue_head_t	dd_ctl_waitq;
+	/** daemon is running */
+	unsigned int		dd_running;
+	/** daemon stopped */
+	unsigned int		dd_stopped;
+};
+
+static struct delay_daemon_data	delay_dd;
+
+static unsigned long
+round_timeout(unsigned long timeout)
+{
+	return cfs_time_seconds((unsigned int)
+			cfs_duration_sec(cfs_time_sub(timeout, 0)) + 1);
+}
+
+static void
+delay_rule_decref(struct lnet_delay_rule *rule)
+{
+	if (atomic_dec_and_test(&rule->dl_refcount)) {
+		LASSERT(list_empty(&rule->dl_sched_link));
+		LASSERT(list_empty(&rule->dl_msg_list));
+		LASSERT(list_empty(&rule->dl_link));
+
+		CFS_FREE_PTR(rule);
+	}
+}
+
+/**
+ * check source/destination NID, portal, message type and delay rate,
+ * then decide whether this message should be delayed
+ */
+static bool
+delay_rule_match(struct lnet_delay_rule *rule, lnet_nid_t src,
+		 lnet_nid_t dst, unsigned int type, unsigned int portal,
+		 struct lnet_msg *msg)
+{
+	struct lnet_fault_attr *attr = &rule->dl_attr;
+	bool delay;
+
+	if (!lnet_fault_attr_match(attr, src, dst, type, portal))
+		return false;
+
+	/* match this rule, check delay rate now */
+	spin_lock(&rule->dl_lock);
+	if (rule->dl_delay_time) { /* time based delay */
+		unsigned long now = cfs_time_current();
+
+		rule->dl_stat.fs_count++;
+		delay = cfs_time_aftereq(now, rule->dl_delay_time);
+		if (delay) {
+			if (cfs_time_after(now, rule->dl_time_base))
+				rule->dl_time_base = now;
+
+			rule->dl_delay_time = rule->dl_time_base +
+					     cfs_time_seconds(cfs_rand() %
+						attr->u.delay.la_interval);
+			rule->dl_time_base += cfs_time_seconds(attr->u.delay.la_interval);
+
+			CDEBUG(D_NET, "Delay Rule %s->%s: next delay : %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst),
+			       rule->dl_delay_time);
+		}
+
+	} else { /* rate based delay */
+		delay = rule->dl_stat.fs_count++ == rule->dl_delay_at;
+		/* generate the next random rate sequence */
+		if (!(rule->dl_stat.fs_count % attr->u.delay.la_rate)) {
+			rule->dl_delay_at = rule->dl_stat.fs_count +
+					    cfs_rand() % attr->u.delay.la_rate;
+			CDEBUG(D_NET, "Delay Rule %s->%s: next delay: %lu\n",
+			       libcfs_nid2str(attr->fa_src),
+			       libcfs_nid2str(attr->fa_dst), rule->dl_delay_at);
+		}
+	}
+
+	if (!delay) {
+		spin_unlock(&rule->dl_lock);
+		return false;
+	}
+
+	/* delay this message, update counters */
+	lnet_fault_stat_inc(&rule->dl_stat, type);
+	rule->dl_stat.u.delay.ls_delayed++;
+
+	list_add_tail(&msg->msg_list, &rule->dl_msg_list);
+	msg->msg_delay_send = round_timeout(
+			cfs_time_shift(attr->u.delay.la_latency));
+	if (rule->dl_msg_send == -1) {
+		rule->dl_msg_send = msg->msg_delay_send;
+		mod_timer(&rule->dl_timer, rule->dl_msg_send);
+	}
+
+	spin_unlock(&rule->dl_lock);
+	return true;
+}
+
+/**
+ * check if \a msg can match any Delay Rule, receiving of this message
+ * will be delayed if there is a match.
+ */
+bool
+lnet_delay_rule_match_locked(lnet_hdr_t *hdr, struct lnet_msg *msg)
+{
+	struct lnet_delay_rule *rule;
+	lnet_nid_t src = le64_to_cpu(hdr->src_nid);
+	lnet_nid_t dst = le64_to_cpu(hdr->dest_nid);
+	unsigned int typ = le32_to_cpu(hdr->type);
+	unsigned int ptl = -1;
+
+	/* NB: called with lnet_net_lock held */
+
+	/**
+	 * NB: if Portal is specified, then only PUT and GET will be
+	 * filtered by delay rule
+	 */
+	if (typ == LNET_MSG_PUT)
+		ptl = le32_to_cpu(hdr->msg.put.ptl_index);
+	else if (typ == LNET_MSG_GET)
+		ptl = le32_to_cpu(hdr->msg.get.ptl_index);
+
+	list_for_each_entry(rule, &the_lnet.ln_delay_rules, dl_link) {
+		if (delay_rule_match(rule, src, dst, typ, ptl, msg))
+			return true;
+	}
+
+	return false;
+}
+
+/** check out delayed messages for send */
+static void
+delayed_msg_check(struct lnet_delay_rule *rule, bool all,
+		  struct list_head *msg_list)
+{
+	struct lnet_msg *msg;
+	struct lnet_msg *tmp;
+	unsigned long now = cfs_time_current();
+
+	if (!all && rule->dl_msg_send > now)
+		return;
+
+	spin_lock(&rule->dl_lock);
+	list_for_each_entry_safe(msg, tmp, &rule->dl_msg_list, msg_list) {
+		if (!all && msg->msg_delay_send > now)
+			break;
+
+		msg->msg_delay_send = 0;
+		list_move_tail(&msg->msg_list, msg_list);
+	}
+
+	if (list_empty(&rule->dl_msg_list)) {
+		del_timer(&rule->dl_timer);
+		rule->dl_msg_send = -1;
+
+	} else if (!list_empty(msg_list)) {
+		/*
+		 * dequeued some timed-out messages, update the timer for
+		 * the next delayed message on this rule
+		 */
+		msg = list_entry(rule->dl_msg_list.next,
+				 struct lnet_msg, msg_list);
+		rule->dl_msg_send = msg->msg_delay_send;
+		mod_timer(&rule->dl_timer, rule->dl_msg_send);
+	}
+	spin_unlock(&rule->dl_lock);
+}
+
+static void
+delayed_msg_process(struct list_head *msg_list, bool drop)
+{
+	struct lnet_msg	*msg;
+
+	while (!list_empty(msg_list)) {
+		struct lnet_ni *ni;
+		int cpt;
+		int rc;
+
+		msg = list_entry(msg_list->next, struct lnet_msg, msg_list);
+		LASSERT(msg->msg_rxpeer);
+
+		ni = msg->msg_rxpeer->lp_ni;
+		cpt = msg->msg_rx_cpt;
+
+		list_del_init(&msg->msg_list);
+		if (drop) {
+			rc = -ECANCELED;
+
+		} else if (!msg->msg_routing) {
+			rc = lnet_parse_local(ni, msg);
+			if (!rc)
+				continue;
+
+		} else {
+			lnet_net_lock(cpt);
+			rc = lnet_parse_forward_locked(ni, msg);
+			lnet_net_unlock(cpt);
+
+			switch (rc) {
+			case LNET_CREDIT_OK:
+				lnet_ni_recv(ni, msg->msg_private, msg, 0,
+					     0, msg->msg_len, msg->msg_len);
+			case LNET_CREDIT_WAIT:
+				continue;
+			default: /* failures */
+				break;
+			}
+		}
+
+		lnet_drop_message(ni, cpt, msg->msg_private, msg->msg_len);
+		lnet_finalize(ni, msg, rc);
+	}
+}
+
+/**
+ * Process delayed messages for scheduled rules.
+ * This function can be called by lnet_delay_rule_daemon or by lnet_finalize.
+ */
+void
+lnet_delay_rule_check(void)
+{
+	struct lnet_delay_rule *rule;
+	struct list_head msgs;
+
+	INIT_LIST_HEAD(&msgs);
+	while (1) {
+		if (list_empty(&delay_dd.dd_sched_rules))
+			break;
+
+		spin_lock_bh(&delay_dd.dd_lock);
+		if (list_empty(&delay_dd.dd_sched_rules)) {
+			spin_unlock_bh(&delay_dd.dd_lock);
+			break;
+		}
+
+		rule = list_entry(delay_dd.dd_sched_rules.next,
+				  struct lnet_delay_rule, dl_sched_link);
+		list_del_init(&rule->dl_sched_link);
+		spin_unlock_bh(&delay_dd.dd_lock);
+
+		delayed_msg_check(rule, false, &msgs);
+		delay_rule_decref(rule); /* -1 for delay_dd.dd_sched_rules */
+	}
+
+	if (!list_empty(&msgs))
+		delayed_msg_process(&msgs, false);
+}
+
+/** daemon thread to handle delayed messages */
+static int
+lnet_delay_rule_daemon(void *arg)
+{
+	delay_dd.dd_running = 1;
+	wake_up(&delay_dd.dd_ctl_waitq);
+
+	while (delay_dd.dd_running) {
+		wait_event_interruptible(delay_dd.dd_waitq,
+					 !delay_dd.dd_running ||
+					 !list_empty(&delay_dd.dd_sched_rules));
+		lnet_delay_rule_check();
+	}
+
+	/* in case more rules have been enqueued after my last check */
+	lnet_delay_rule_check();
+	delay_dd.dd_stopped = 1;
+	wake_up(&delay_dd.dd_ctl_waitq);
+
+	return 0;
+}
+
+static void
+delay_timer_cb(unsigned long arg)
+{
+	struct lnet_delay_rule *rule = (struct lnet_delay_rule *)arg;
+
+	spin_lock_bh(&delay_dd.dd_lock);
+	if (list_empty(&rule->dl_sched_link) && delay_dd.dd_running) {
+		atomic_inc(&rule->dl_refcount);
+		list_add_tail(&rule->dl_sched_link, &delay_dd.dd_sched_rules);
+		wake_up(&delay_dd.dd_waitq);
+	}
+	spin_unlock_bh(&delay_dd.dd_lock);
+}
+
+/**
+ * Add a new delay rule to LNet.
+ * Duplicate rules are not detected; every rule is checked against each
+ * incoming message.
+ */
+int
+lnet_delay_rule_add(struct lnet_fault_attr *attr)
+{
+	struct lnet_delay_rule *rule;
+	int rc = 0;
+
+	if (attr->u.delay.la_rate && attr->u.delay.la_interval) {
+		CDEBUG(D_NET, "please provide either delay rate or delay interval, but not both at the same time %d/%d\n",
+		       attr->u.delay.la_rate, attr->u.delay.la_interval);
+		return -EINVAL;
+	}
+
+	if (!attr->u.delay.la_latency) {
+		CDEBUG(D_NET, "delay latency cannot be zero\n");
+		return -EINVAL;
+	}
+
+	if (lnet_fault_attr_validate(attr))
+		return -EINVAL;
+
+	CFS_ALLOC_PTR(rule);
+	if (!rule)
+		return -ENOMEM;
+
+	mutex_lock(&delay_dd.dd_mutex);
+	if (!delay_dd.dd_running) {
+		struct task_struct *task;
+
+		/**
+		 * NB: although LND threads process delayed messages in
+		 * lnet_finalize, there is no guarantee that an LND thread
+		 * will be woken up if no other message needs to be
+		 * handled.
+		 * Only one daemon thread is started; performance is not a
+		 * concern for this simulation module.
+		 */
+		task = kthread_run(lnet_delay_rule_daemon, NULL, "lnet_dd");
+		if (IS_ERR(task)) {
+			rc = PTR_ERR(task);
+			goto failed;
+		}
+		wait_event(delay_dd.dd_ctl_waitq, delay_dd.dd_running);
+	}
+
+	init_timer(&rule->dl_timer);
+	rule->dl_timer.function = delay_timer_cb;
+	rule->dl_timer.data = (unsigned long)rule;
+
+	spin_lock_init(&rule->dl_lock);
+	INIT_LIST_HEAD(&rule->dl_msg_list);
+	INIT_LIST_HEAD(&rule->dl_sched_link);
+
+	rule->dl_attr = *attr;
+	if (attr->u.delay.la_interval) {
+		rule->dl_time_base = cfs_time_shift(attr->u.delay.la_interval);
+		rule->dl_delay_time = cfs_time_shift(cfs_rand() %
+						     attr->u.delay.la_interval);
+	} else {
+		rule->dl_delay_at = cfs_rand() % attr->u.delay.la_rate;
+	}
+
+	rule->dl_msg_send = -1;
+
+	lnet_net_lock(LNET_LOCK_EX);
+	atomic_set(&rule->dl_refcount, 1);
+	list_add(&rule->dl_link, &the_lnet.ln_delay_rules);
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	CDEBUG(D_NET, "Added delay rule: src %s, dst %s, rate %d\n",
+	       libcfs_nid2str(attr->fa_src), libcfs_nid2str(attr->fa_dst),
+	       attr->u.delay.la_rate);
+
+	mutex_unlock(&delay_dd.dd_mutex);
+	return 0;
+failed:
+	mutex_unlock(&delay_dd.dd_mutex);
+	CFS_FREE_PTR(rule);
+	return rc;
+}
+
+/**
+ * Remove matching delay rules from LNet. If \a shutdown is true, or both
+ * \a src and \a dst are zero, all rules are removed; otherwise only the
+ * matching rules are removed.
+ * If \a src is zero, all rules with \a dst as destination are removed.
+ * If \a dst is zero, all rules with \a src as source are removed.
+ *
+ * When a delay rule is removed, all of its delayed messages are processed
+ * immediately, or dropped if \a shutdown is true.
+ */
+int
+lnet_delay_rule_del(lnet_nid_t src, lnet_nid_t dst, bool shutdown)
+{
+	struct lnet_delay_rule *rule;
+	struct lnet_delay_rule *tmp;
+	struct list_head rule_list;
+	struct list_head msg_list;
+	int n = 0;
+	bool cleanup;
+
+	INIT_LIST_HEAD(&rule_list);
+	INIT_LIST_HEAD(&msg_list);
+
+	if (shutdown) {
+		src = 0;
+		dst = 0;
+	}
+
+	mutex_lock(&delay_dd.dd_mutex);
+	lnet_net_lock(LNET_LOCK_EX);
+
+	list_for_each_entry_safe(rule, tmp, &the_lnet.ln_delay_rules, dl_link) {
+		if (rule->dl_attr.fa_src != src && src)
+			continue;
+
+		if (rule->dl_attr.fa_dst != dst && dst)
+			continue;
+
+		CDEBUG(D_NET, "Remove delay rule: src %s->dst: %s (1/%d, %d)\n",
+		       libcfs_nid2str(rule->dl_attr.fa_src),
+		       libcfs_nid2str(rule->dl_attr.fa_dst),
+		       rule->dl_attr.u.delay.la_rate,
+		       rule->dl_attr.u.delay.la_interval);
+		/* refcount is taken over by rule_list */
+		list_move(&rule->dl_link, &rule_list);
+	}
+
+	/* check if we need to shutdown delay_daemon */
+	cleanup = list_empty(&the_lnet.ln_delay_rules) &&
+		  !list_empty(&rule_list);
+	lnet_net_unlock(LNET_LOCK_EX);
+
+	list_for_each_entry_safe(rule, tmp, &rule_list, dl_link) {
+		list_del_init(&rule->dl_link);
+
+		del_timer_sync(&rule->dl_timer);
+		delayed_msg_check(rule, true, &msg_list);
+		delay_rule_decref(rule); /* -1 for the_lnet.ln_delay_rules */
+		n++;
+	}
+
+	if (cleanup) { /* no more delay rule, shutdown delay_daemon */
+		LASSERT(delay_dd.dd_running);
+		delay_dd.dd_running = 0;
+		wake_up(&delay_dd.dd_waitq);
+
+		while (!delay_dd.dd_stopped)
+			wait_event(delay_dd.dd_ctl_waitq, delay_dd.dd_stopped);
+	}
+	mutex_unlock(&delay_dd.dd_mutex);
+
+	if (!list_empty(&msg_list))
+		delayed_msg_process(&msg_list, shutdown);
+
+	return n;
+}
+
+/**
+ * List the delay rule at position \a pos
+ */
+int
+lnet_delay_rule_list(int pos, struct lnet_fault_attr *attr,
+		     struct lnet_fault_stat *stat)
+{
+	struct lnet_delay_rule *rule;
+	int cpt;
+	int i = 0;
+	int rc = -ENOENT;
+
+	cpt = lnet_net_lock_current();
+	list_for_each_entry(rule, &the_lnet.ln_delay_rules, dl_link) {
+		if (i++ < pos)
+			continue;
+
+		spin_lock(&rule->dl_lock);
+		*attr = rule->dl_attr;
+		*stat = rule->dl_stat;
+		spin_unlock(&rule->dl_lock);
+		rc = 0;
+		break;
+	}
+
+	lnet_net_unlock(cpt);
+	return rc;
+}
+
+/**
+ * reset counters for all Delay Rules
+ */
+void
+lnet_delay_rule_reset(void)
+{
+	struct lnet_delay_rule *rule;
+	int cpt;
+
+	cpt = lnet_net_lock_current();
+
+	list_for_each_entry(rule, &the_lnet.ln_delay_rules, dl_link) {
+		struct lnet_fault_attr *attr = &rule->dl_attr;
+
+		spin_lock(&rule->dl_lock);
+
+		memset(&rule->dl_stat, 0, sizeof(rule->dl_stat));
+		if (attr->u.delay.la_rate) {
+			rule->dl_delay_at = cfs_rand() % attr->u.delay.la_rate;
+		} else {
+			rule->dl_delay_time = cfs_time_shift(cfs_rand() %
+						attr->u.delay.la_interval);
+			rule->dl_time_base = cfs_time_shift(attr->u.delay.la_interval);
+		}
+		spin_unlock(&rule->dl_lock);
+	}
+
+	lnet_net_unlock(cpt);
+}
+
 int
 lnet_fault_ctl(int opc, struct libcfs_ioctl_data *data)
 {
@@ -413,6 +968,31 @@ lnet_fault_ctl(int opc, struct libcfs_ioctl_data *data)
 			return -EINVAL;
 
 		return lnet_drop_rule_list(data->ioc_count, attr, stat);
+
+	case LNET_CTL_DELAY_ADD:
+		if (!attr)
+			return -EINVAL;
+
+		return lnet_delay_rule_add(attr);
+
+	case LNET_CTL_DELAY_DEL:
+		if (!attr)
+			return -EINVAL;
+
+		data->ioc_count = lnet_delay_rule_del(attr->fa_src,
+						      attr->fa_dst, false);
+		return 0;
+
+	case LNET_CTL_DELAY_RESET:
+		lnet_delay_rule_reset();
+		return 0;
+
+	case LNET_CTL_DELAY_LIST:
+		stat = (struct lnet_fault_stat *)data->ioc_inlbuf2;
+		if (!attr || !stat)
+			return -EINVAL;
+
+		return lnet_delay_rule_list(data->ioc_count, attr, stat);
 	}
 }
 
@@ -424,6 +1004,12 @@ lnet_fault_init(void)
 	CLASSERT(LNET_GET_BIT == 1 << LNET_MSG_GET);
 	CLASSERT(LNET_REPLY_BIT == 1 << LNET_MSG_REPLY);
 
+	mutex_init(&delay_dd.dd_mutex);
+	spin_lock_init(&delay_dd.dd_lock);
+	init_waitqueue_head(&delay_dd.dd_waitq);
+	init_waitqueue_head(&delay_dd.dd_ctl_waitq);
+	INIT_LIST_HEAD(&delay_dd.dd_sched_rules);
+
 	return 0;
 }
 
@@ -431,6 +1017,9 @@ void
 lnet_fault_fini(void)
 {
 	lnet_drop_rule_del(0, 0);
+	lnet_delay_rule_del(0, 0, true);
 
 	LASSERT(list_empty(&the_lnet.ln_drop_rules));
+	LASSERT(list_empty(&the_lnet.ln_delay_rules));
+	LASSERT(list_empty(&delay_dd.dd_sched_rules));
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread
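
The wildcard semantics documented for lnet_delay_rule_del() in the patch above can be sketched in user space as follows. This is only an illustration under stated assumptions: `nid_t`, `struct rule` and `rule_selected()` are hypothetical stand-ins, not the kernel types or helpers, and a zero NID is treated as "match anything".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for lnet_nid_t; 0 acts as a wildcard. */
typedef uint64_t nid_t;

/* Hypothetical, simplified rule carrying only the two NIDs. */
struct rule {
	nid_t src;
	nid_t dst;
};

/*
 * Return true if the rule would be selected for removal.
 * Mirrors the two "continue" tests in the deletion loop: a
 * non-zero src or dst must equal the rule's own value.
 */
static bool rule_selected(const struct rule *r, nid_t src, nid_t dst)
{
	if (src && r->src != src)
		return false;
	if (dst && r->dst != dst)
		return false;
	return true;
}
```

With both arguments zero every rule is selected, which is how the shutdown path ends up removing all rules.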

* [PATCH 03/10] staging: lustre: fix 'data race condition' issue in conrpc.c
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Sebastien Buisson

From: Sebastien Buisson <sbuisson@ddn.com>

Fix 'data race condition' defects found by Coverity version 6.5.0:
Data race condition (MISSING_LOCK)
Accessing variable without holding lock. Elsewhere,
this variable is accessed with lock held.
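
A minimal user-space sketch of the idea, assuming a pthread mutex in place of the kernel spinlock. `struct trans` and `trans_check_features()` are hypothetical names for illustration, not the selftest code. Unlike the patch, which keeps an unlocked pre-check and rechecks under the lock, this sketch takes the lock unconditionally so that no access to the shared fields is unprotected.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the transaction state in conrpc.c. */
struct trans {
	pthread_mutex_t lock;
	bool feats_updated;
	unsigned int features;
};

/*
 * Record the first reply's feature mask exactly once.
 * Returns true if this reply agrees with the recorded mask.
 */
static bool trans_check_features(struct trans *t, unsigned int reply_feats)
{
	bool match;

	pthread_mutex_lock(&t->lock);
	if (!t->feats_updated) {
		/* first reply seen: remember its features */
		t->feats_updated = true;
		t->features = reply_feats;
	}
	match = (t->features == reply_feats);
	pthread_mutex_unlock(&t->lock);

	return match;
}
```

The first caller sets the mask; later callers only compare against it, so two concurrent replies cannot both believe they were first.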

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2744
Reviewed-on: http://review.whamcloud.com/6567
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/selftest/conrpc.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c b/drivers/staging/lustre/lnet/selftest/conrpc.c
index e6376a0..8a67f89 100644
--- a/drivers/staging/lustre/lnet/selftest/conrpc.c
+++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
@@ -945,8 +945,12 @@ lstcon_sesnew_stat_reply(lstcon_rpc_trans_t *trans,
 		return status;
 
 	if (!trans->tas_feats_updated) {
-		trans->tas_feats_updated = 1;
-		trans->tas_features = reply->msg_ses_feats;
+		spin_lock(&console_session.ses_rpc_lock);
+		if (!trans->tas_feats_updated) { /* recheck with lock */
+			trans->tas_feats_updated = 1;
+			trans->tas_features = reply->msg_ses_feats;
+		}
+		spin_unlock(&console_session.ses_rpc_lock);
 	}
 
 	if (reply->msg_ses_feats != trans->tas_features) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 04/10] staging: lustre: fix 'NULL pointer dereference' errors
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Sebastien Buisson

From: Sebastien Buisson <sbuisson@ddn.com>

Fix 'NULL pointer dereference' defects found by Coverity version
6.5.3:
Dereference after null check (FORWARD_NULL)
For instance, passing a null pointer to a function which dereferences
it.
Dereference before null check (REVERSE_INULL)
Null-checking a variable suggests that it may be null, but it has
already been dereferenced on all paths leading to the check.
Dereference null return value (NULL_RETURNS)

The following fixes for the LNet layer are broken out of patch
http://review.whamcloud.com/4720.
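
As an illustration of the NULL_RETURNS class, the following sketch guards the result of strrchr() before writing through it, in the spirit of the mgc_request.c hunk below. `truncate_at_last_dash()` is a hypothetical helper introduced here for illustration; the patch open-codes the check.

```c
#include <errno.h>
#include <string.h>

/*
 * Truncate name in place at its last '-'.
 * Returns 0 on success, or -EINVAL if name is NULL or has no '-'.
 * (Hypothetical helper, not part of the patch.)
 */
static int truncate_at_last_dash(char *name)
{
	char *ptr;

	if (!name)
		return -EINVAL;

	ptr = strrchr(name, '-');	/* returns NULL if '-' is absent */
	if (!ptr)
		return -EINVAL;

	*ptr = '\0';
	return 0;
}
```

Returning an error leaves the string untouched when no '-' is present, rather than writing through a null pointer.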

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2217
Reviewed-on: http://review.whamcloud.com/4720
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    2 +
 drivers/staging/lustre/lnet/selftest/conctl.c      |   49 ++++++++++----------
 .../lustre/lustre/include/lustre/lustre_user.h     |    3 +
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    7 ++-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |    2 +-
 drivers/staging/lustre/lustre/lov/lov_request.c    |    2 +-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |   10 ++++-
 .../lustre/lustre/obdclass/lprocfs_status.c        |   24 +++++----
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    2 +-
 9 files changed, 61 insertions(+), 40 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index a5e90e7..f323b8b 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -162,6 +162,7 @@ lnet_iov_nob(unsigned int niov, struct kvec *iov)
 {
 	unsigned int nob = 0;
 
+	LASSERT(!niov || iov);
 	while (niov-- > 0)
 		nob += (iov++)->iov_len;
 
@@ -282,6 +283,7 @@ lnet_kiov_nob(unsigned int niov, lnet_kiov_t *kiov)
 {
 	unsigned int nob = 0;
 
+	LASSERT(!niov || kiov);
 	while (niov-- > 0)
 		nob += (kiov++)->kiov_len;
 
diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
index 714d14b..62cacb6 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -670,44 +670,45 @@ static int
 lst_stat_query_ioctl(lstio_stat_args_t *args)
 {
 	int rc;
-	char *name;
+	char *name = NULL;
 
 	/* TODO: not finished */
 	if (args->lstio_sta_key != console_session.ses_key)
 		return -EACCES;
 
-	if (!args->lstio_sta_resultp ||
-	    (!args->lstio_sta_namep && !args->lstio_sta_idsp) ||
-	    args->lstio_sta_nmlen <= 0 ||
-	    args->lstio_sta_nmlen > LST_NAME_SIZE)
-		return -EINVAL;
-
-	if (args->lstio_sta_idsp &&
-	    args->lstio_sta_count <= 0)
+	if (!args->lstio_sta_resultp)
 		return -EINVAL;
 
-	LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
-	if (!name)
-		return -ENOMEM;
-
-	if (copy_from_user(name, args->lstio_sta_namep,
-			   args->lstio_sta_nmlen)) {
-		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
-		return -EFAULT;
-	}
+	if (args->lstio_sta_idsp) {
+		if (args->lstio_sta_count <= 0)
+			return -EINVAL;
 
-	if (!args->lstio_sta_idsp) {
-		rc = lstcon_group_stat(name, args->lstio_sta_timeout,
-				       args->lstio_sta_resultp);
-	} else {
 		rc = lstcon_nodes_stat(args->lstio_sta_count,
 				       args->lstio_sta_idsp,
 				       args->lstio_sta_timeout,
 				       args->lstio_sta_resultp);
-	}
+	} else if (args->lstio_sta_namep) {
+		if (args->lstio_sta_nmlen <= 0 ||
+		    args->lstio_sta_nmlen > LST_NAME_SIZE)
+			return -EINVAL;
 
-	LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
+		LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
+		if (!name)
+			return -ENOMEM;
 
+		rc = copy_from_user(name, args->lstio_sta_namep,
+				    args->lstio_sta_nmlen);
+		if (!rc)
+			rc = lstcon_group_stat(name, args->lstio_sta_timeout,
+					       args->lstio_sta_resultp);
+		else
+			rc = -EFAULT;
+	} else {
+		rc = -EINVAL;
+	}
+
+	if (name)
+		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 9f026bd..276906e 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -448,6 +448,9 @@ static inline void obd_str2uuid(struct obd_uuid *uuid, const char *tmp)
 /* For printf's only, make sure uuid is terminated */
 static inline char *obd_uuid2str(const struct obd_uuid *uuid)
 {
+	if (!uuid)
+		return NULL;
+
 	if (uuid->uuid[sizeof(*uuid) - 1] != '\0') {
 		/* Obviously not safe, but for printfs, no real harm done...
 		 * we're always null-terminated, even in a race.
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 2d501a6..6f0761c 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -708,8 +708,13 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp,
 		if (policy)
 			lock->l_policy_data = *policy;
 
-		if (einfo->ei_type == LDLM_EXTENT)
+		if (einfo->ei_type == LDLM_EXTENT) {
+			/* extent lock without policy is a bug */
+			if (!policy)
+				LBUG();
+
 			lock->l_req_extent = policy->l_extent;
+		}
 		LDLM_DEBUG(lock, "client-side enqueue START, flags %llx\n",
 			   *flags);
 	}
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 5c055a0..267f001 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -238,7 +238,7 @@ static int lmv_connect(const struct lu_env *env,
 	 * and MDC stuff will be called directly, for instance while reading
 	 * ../mdc/../kbytesfree procfs file, etc.
 	 */
-	if (data->ocd_connect_flags & OBD_CONNECT_REAL)
+	if (data && data->ocd_connect_flags & OBD_CONNECT_REAL)
 		rc = lmv_check_connect(obd);
 
 	if (rc && lmv->lmv_tgts_kobj)
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 4f568f0..7178a02 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -178,7 +178,7 @@ static int lov_check_and_wait_active(struct lov_obd *lov, int ost_idx)
 				   cfs_time_seconds(1), NULL, NULL);
 
 	rc = l_wait_event(waitq, lov_check_set(lov, ost_idx), &lwi);
-	if (tgt && tgt->ltd_active)
+	if (tgt->ltd_active)
 		return 1;
 
 	return 0;
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index f5a85bb..bc49633 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -344,7 +344,15 @@ static int config_log_add(struct obd_device *obd, char *logname,
 	LASSERT(lsi->lsi_lmd);
 	if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR)) {
 		struct config_llog_data *recover_cld;
-		*strrchr(seclogname, '-') = 0;
+
+		ptr = strrchr(seclogname, '-');
+		if (ptr != NULL) {
+			*ptr = 0;
+		} else {
+			CERROR("sptlrpc log name not correct: %s\n", seclogname);
+			config_log_put(cld);
+			return -EINVAL;
+		}
 		recover_cld = config_recover_log_add(obd, seclogname, cfg, sb);
 		if (IS_ERR(recover_cld)) {
 			rc = PTR_ERR(recover_cld);
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 7c28755..1ea1578 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -1359,17 +1359,19 @@ int lprocfs_write_frac_u64_helper(const char __user *buffer,
 	}
 
 	units = 1;
-	switch (tolower(*end)) {
-	case 'p':
-		units <<= 10;
-	case 't':
-		units <<= 10;
-	case 'g':
-		units <<= 10;
-	case 'm':
-		units <<= 10;
-	case 'k':
-		units <<= 10;
+	if (end != NULL) {
+		switch (tolower(*end)) {
+		case 'p':
+			units <<= 10;
+		case 't':
+			units <<= 10;
+		case 'g':
+			units <<= 10;
+		case 'm':
+			units <<= 10;
+		case 'k':
+			units <<= 10;
+		}
 	}
 	/* Specified units override the multiplier */
 	if (units > 1)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index bdd9053..5b06901 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -1798,7 +1798,7 @@ swabber_dumper_helper(struct req_capsule *pill,
 			return;
 		swabber(value);
 		ptlrpc_buf_set_swabbed(pill->rc_req, inout, offset);
-		if (dump) {
+		if (dump && field->rmf_dumper) {
 			CDEBUG(D_RPCTRACE, "Dump of swabbed field %s follows\n",
 			       field->rmf_name);
 			field->rmf_dumper(value);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread
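
The unit-suffix switch that the lprocfs_status.c hunk above wraps in a null check relies on deliberate case fall-through: each suffix shifts by 10 bits and then continues into the smaller units. A stand-alone sketch, assuming a hypothetical helper `unit_multiplier()` that is not part of the patch:

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Map a unit suffix (k, m, g, t, p; case-insensitive) to its
 * power-of-1024 multiplier. A NULL pointer or an unknown suffix
 * yields 1. (Hypothetical helper for illustration only.)
 */
static uint64_t unit_multiplier(const char *end)
{
	uint64_t units = 1;

	if (!end)
		return units;

	switch (tolower((unsigned char)*end)) {
	case 'p':
		units <<= 10;
		/* fall through */
	case 't':
		units <<= 10;
		/* fall through */
	case 'g':
		units <<= 10;
		/* fall through */
	case 'm':
		units <<= 10;
		/* fall through */
	case 'k':
		units <<= 10;
		break;
	default:
		break;
	}

	return units;
}
```

The explicit fall-through comments make the intent visible to readers and to static analyzers.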

* [lustre-devel] [PATCH 04/10] staging: lustre: fix 'NULL pointer dereference' errors
@ 2016-03-05  2:09   ` James Simmons
  0 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Sebastien Buisson

From: Sebastien Buisson <sbuisson@ddn.com>

Fix 'NULL pointer dereference' defects found by Coverity version
6.5.3:
Dereference after null check (FORWARD_NULL)
For instance, Passing null pointer to a function which dereferences
it.
Dereference before null check (REVERSE_INULL)
Null-checking variable suggests that it may be null, but it has
already been dereferenced on all paths leading to the check.
Dereference null return value (NULL_RETURNS)

The following fixes for the LNet layer are broken out of patch
http://review.whamcloud.com/4720.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2217
Reviewed-on: http://review.whamcloud.com/4720
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/lib-move.c        |    2 +
 drivers/staging/lustre/lnet/selftest/conctl.c      |   49 ++++++++++----------
 .../lustre/lustre/include/lustre/lustre_user.h     |    3 +
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    7 ++-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |    2 +-
 drivers/staging/lustre/lustre/lov/lov_request.c    |    2 +-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |   10 ++++-
 .../lustre/lustre/obdclass/lprocfs_status.c        |   24 +++++----
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    2 +-
 9 files changed, 61 insertions(+), 40 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index a5e90e7..f323b8b 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -162,6 +162,7 @@ lnet_iov_nob(unsigned int niov, struct kvec *iov)
 {
 	unsigned int nob = 0;
 
+	LASSERT(!niov || iov);
 	while (niov-- > 0)
 		nob += (iov++)->iov_len;
 
@@ -282,6 +283,7 @@ lnet_kiov_nob(unsigned int niov, lnet_kiov_t *kiov)
 {
 	unsigned int nob = 0;
 
+	LASSERT(!niov || kiov);
 	while (niov-- > 0)
 		nob += (kiov++)->kiov_len;
 
diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c b/drivers/staging/lustre/lnet/selftest/conctl.c
index 714d14b..62cacb6 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -670,44 +670,45 @@ static int
 lst_stat_query_ioctl(lstio_stat_args_t *args)
 {
 	int rc;
-	char *name;
+	char *name = NULL;
 
 	/* TODO: not finished */
 	if (args->lstio_sta_key != console_session.ses_key)
 		return -EACCES;
 
-	if (!args->lstio_sta_resultp ||
-	    (!args->lstio_sta_namep && !args->lstio_sta_idsp) ||
-	    args->lstio_sta_nmlen <= 0 ||
-	    args->lstio_sta_nmlen > LST_NAME_SIZE)
-		return -EINVAL;
-
-	if (args->lstio_sta_idsp &&
-	    args->lstio_sta_count <= 0)
+	if (!args->lstio_sta_resultp)
 		return -EINVAL;
 
-	LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
-	if (!name)
-		return -ENOMEM;
-
-	if (copy_from_user(name, args->lstio_sta_namep,
-			   args->lstio_sta_nmlen)) {
-		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
-		return -EFAULT;
-	}
+	if (args->lstio_sta_idsp) {
+		if (args->lstio_sta_count <= 0)
+			return -EINVAL;
 
-	if (!args->lstio_sta_idsp) {
-		rc = lstcon_group_stat(name, args->lstio_sta_timeout,
-				       args->lstio_sta_resultp);
-	} else {
 		rc = lstcon_nodes_stat(args->lstio_sta_count,
 				       args->lstio_sta_idsp,
 				       args->lstio_sta_timeout,
 				       args->lstio_sta_resultp);
-	}
+	} else if (args->lstio_sta_namep) {
+		if (args->lstio_sta_nmlen <= 0 ||
+		    args->lstio_sta_nmlen > LST_NAME_SIZE)
+			return -EINVAL;
 
-	LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
+		LIBCFS_ALLOC(name, args->lstio_sta_nmlen + 1);
+		if (!name)
+			return -ENOMEM;
 
+		rc = copy_from_user(name, args->lstio_sta_namep,
+				    args->lstio_sta_nmlen);
+		if (!rc)
+			rc = lstcon_group_stat(name, args->lstio_sta_timeout,
+					       args->lstio_sta_resultp);
+		else
+			rc = -EFAULT;
+	} else {
+		rc = -EINVAL;
+	}
+
+	if (name)
+		LIBCFS_FREE(name, args->lstio_sta_nmlen + 1);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 9f026bd..276906e 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -448,6 +448,9 @@ static inline void obd_str2uuid(struct obd_uuid *uuid, const char *tmp)
 /* For printf's only, make sure uuid is terminated */
 static inline char *obd_uuid2str(const struct obd_uuid *uuid)
 {
+	if (!uuid)
+		return NULL;
+
 	if (uuid->uuid[sizeof(*uuid) - 1] != '\0') {
 		/* Obviously not safe, but for printfs, no real harm done...
 		 * we're always null-terminated, even in a race.
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 2d501a6..6f0761c 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -708,8 +708,13 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp,
 		if (policy)
 			lock->l_policy_data = *policy;
 
-		if (einfo->ei_type == LDLM_EXTENT)
+		if (einfo->ei_type == LDLM_EXTENT) {
+			/* extent lock without policy is a bug */
+			if (!policy)
+				LBUG();
+
 			lock->l_req_extent = policy->l_extent;
+		}
 		LDLM_DEBUG(lock, "client-side enqueue START, flags %llx\n",
 			   *flags);
 	}
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 5c055a0..267f001 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -238,7 +238,7 @@ static int lmv_connect(const struct lu_env *env,
 	 * and MDC stuff will be called directly, for instance while reading
 	 * ../mdc/../kbytesfree procfs file, etc.
 	 */
-	if (data->ocd_connect_flags & OBD_CONNECT_REAL)
+	if (data && data->ocd_connect_flags & OBD_CONNECT_REAL)
 		rc = lmv_check_connect(obd);
 
 	if (rc && lmv->lmv_tgts_kobj)
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 4f568f0..7178a02 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -178,7 +178,7 @@ static int lov_check_and_wait_active(struct lov_obd *lov, int ost_idx)
 				   cfs_time_seconds(1), NULL, NULL);
 
 	rc = l_wait_event(waitq, lov_check_set(lov, ost_idx), &lwi);
-	if (tgt && tgt->ltd_active)
+	if (tgt->ltd_active)
 		return 1;
 
 	return 0;
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index f5a85bb..bc49633 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -344,7 +344,15 @@ static int config_log_add(struct obd_device *obd, char *logname,
 	LASSERT(lsi->lsi_lmd);
 	if (!(lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR)) {
 		struct config_llog_data *recover_cld;
-		*strrchr(seclogname, '-') = 0;
+
+		ptr = strrchr(seclogname, '-');
+		if (ptr != NULL) {
+			*ptr = 0;
+		} else {
+			CERROR("sptlrpc log name not correct: %s", seclogname);
+			config_log_put(cld);
+			return -EINVAL;
+		}
 		recover_cld = config_recover_log_add(obd, seclogname, cfg, sb);
 		if (IS_ERR(recover_cld)) {
 			rc = PTR_ERR(recover_cld);
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 7c28755..1ea1578 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -1359,17 +1359,19 @@ int lprocfs_write_frac_u64_helper(const char __user *buffer,
 	}
 
 	units = 1;
-	switch (tolower(*end)) {
-	case 'p':
-		units <<= 10;
-	case 't':
-		units <<= 10;
-	case 'g':
-		units <<= 10;
-	case 'm':
-		units <<= 10;
-	case 'k':
-		units <<= 10;
+	if (end != NULL) {
+		switch (tolower(*end)) {
+		case 'p':
+			units <<= 10;
+		case 't':
+			units <<= 10;
+		case 'g':
+			units <<= 10;
+		case 'm':
+			units <<= 10;
+		case 'k':
+			units <<= 10;
+		}
 	}
 	/* Specified units override the multiplier */
 	if (units > 1)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index bdd9053..5b06901 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -1798,7 +1798,7 @@ swabber_dumper_helper(struct req_capsule *pill,
 			return;
 		swabber(value);
 		ptlrpc_buf_set_swabbed(pill->rc_req, inout, offset);
-		if (dump) {
+		if (dump && field->rmf_dumper) {
 			CDEBUG(D_RPCTRACE, "Dump of swabbed field %s follows\n",
 			       field->rmf_name);
 			field->rmf_dumper(value);
-- 
1.7.1


* [PATCH 05/10] staging: lustre: fix 'data race condition' issue in framework.c
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Sebastien Buisson

From: Sebastien Buisson <sbuisson@ddn.com>

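Fix 'data race condition' defects found by Coverity version 6.5.0:
Data race condition (MISSING_LOCK)
Accessing variable without holding lock. Elsewhere,
this variable is accessed with lock held.

In sfw_run_test(), move the assignment of rpc->crpc_timeout
so that it is made with rpc->crpc_lock held.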
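
For illustration only, the MISSING_LOCK pattern can be sketched in
userspace C. Everything here is an assumption made for the example:
struct demo_rpc, its fields and the demo_* helpers are hypothetical,
and a C11 atomic_flag spinlock stands in for the kernel spinlock.
The point is only that the write to the timeout happens inside the
same critical section that readers use.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an RPC descriptor; not the Lustre struct. */
struct demo_rpc {
	atomic_flag lock;	/* protects timeout and posted */
	int timeout;
	int posted;
};

/* Spin until the flag is acquired (stand-in for spin_lock()). */
static void demo_lock(struct demo_rpc *rpc)
{
	while (atomic_flag_test_and_set_explicit(&rpc->lock,
						 memory_order_acquire))
		;	/* busy-wait */
}

/* Release the flag (stand-in for spin_unlock()). */
static void demo_unlock(struct demo_rpc *rpc)
{
	atomic_flag_clear_explicit(&rpc->lock, memory_order_release);
}

/* Put the descriptor into a known, unlocked state. */
void demo_rpc_init(struct demo_rpc *rpc)
{
	atomic_flag_clear(&rpc->lock);
	rpc->timeout = 0;
	rpc->posted = 0;
}

/*
 * Set the timeout and mark the RPC as posted in one critical section,
 * so no other thread can see the RPC posted with a stale timeout.
 */
void demo_rpc_post(struct demo_rpc *rpc, int timeout)
{
	demo_lock(rpc);
	rpc->timeout = timeout;
	rpc->posted = 1;
	demo_unlock(rpc);
}

/* Readers take the same lock before looking at the field. */
int demo_rpc_timeout(struct demo_rpc *rpc)
{
	int t;

	demo_lock(rpc);
	t = rpc->timeout;
	demo_unlock(rpc);
	return t;
}
```

Writing the field before taking the lock, as the old code did, would
let a reader that holds the lock observe a half-updated descriptor.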

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2744
Reviewed-on: http://review.whamcloud.com/6568
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
---
 drivers/staging/lustre/lnet/selftest/framework.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c
index dbd2c61..5c7cafa 100644
--- a/drivers/staging/lustre/lnet/selftest/framework.c
+++ b/drivers/staging/lustre/lnet/selftest/framework.c
@@ -981,9 +981,8 @@ sfw_run_test(swi_workitem_t *wi)
 	list_add_tail(&rpc->crpc_list, &tsi->tsi_active_rpcs);
 	spin_unlock(&tsi->tsi_lock);
 
-	rpc->crpc_timeout = rpc_timeout;
-
 	spin_lock(&rpc->crpc_lock);
+	rpc->crpc_timeout = rpc_timeout;
 	srpc_post_rpc(rpc);
 	spin_unlock(&rpc->crpc_lock);
 	return 0;
-- 
1.7.1


* [PATCH 06/10] staging: lustre: Correct missing newline
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Nunez

From: James Nunez <james.a.nunez@intel.com>

Several error messages are missing a newline character
at the end of the message. Add newlines where needed
and make other minor corrections: drop punctuation at
the end of error messages, append the return code to
error messages, put the device name at the beginning,
etc.
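
As a rough userspace sketch of the message layout described above
(device name first, return code last, a terminating newline and no
trailing punctuation), assuming a hypothetical helper called
demo_format_error() that is not part of Lustre:

```c
#include <stdio.h>

/*
 * Format an error message as "<device>: <what failed>: rc = <code>\n".
 * Returns the snprintf() result, i.e. the length the full message
 * needs, which may exceed len if the buffer is too small.
 */
int demo_format_error(char *buf, size_t len, const char *dev,
		      const char *what, int rc)
{
	return snprintf(buf, len, "%s: %s: rc = %d\n", dev, what, rc);
}
```

A caller would pass the device name, a short description and the
negative errno value, then hand the buffer to whatever logging call
is in use.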

Newlines are removed in only a few places: the
LDLM_DEBUG_NOLOCK calls and one LDLM_DEBUG call. The definition
of LDLM_DEBUG_NOLOCK already ends with a newline, so the
extra one resulted in double newlines being printed.
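
The double newline can be reproduced with a small userspace sketch.
DEMO_DEBUG_NOLOCK() and demo_trailing_newlines() are hypothetical
names invented for this example; the macro only imitates a logging
macro that appends "\n" to the caller's format string and is not the
real LDLM_DEBUG_NOLOCK definition.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical logging macro that appends the newline itself.
 * fmt must be a string literal so it can be concatenated with "\n".
 */
#define DEMO_DEBUG_NOLOCK(buf, len, fmt, ...) \
	snprintf((buf), (len), fmt "\n", __VA_ARGS__)

/* Count how many '\n' characters end the string. */
int demo_trailing_newlines(const char *s)
{
	size_t len = strlen(s);
	int count = 0;

	while (len > 0 && s[len - 1] == '\n') {
		count++;
		len--;
	}
	return count;
}
```

Passing "flags %x\n" to such a macro leaves two newlines at the end
of the message, while "flags %x" leaves exactly one, which is why
the callers drop theirs.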

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4871
Reviewed-on: http://review.whamcloud.com/10000
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
---
 drivers/staging/lustre/lnet/selftest/framework.c   |    6 +++---
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |    8 ++++----
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |    2 +-
 drivers/staging/lustre/lustre/lov/lov_lock.c       |    2 +-
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    2 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c       |    4 ++--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    3 ++-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    3 ++-
 drivers/staging/lustre/lustre/obdclass/cl_lock.c   |    2 +-
 .../lustre/lustre/obdclass/lprocfs_status.c        |    2 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    2 +-
 .../staging/lustre/lustre/obdecho/echo_client.c    |    8 ++++++--
 drivers/staging/lustre/lustre/osc/osc_cache.c      |    9 +++++----
 drivers/staging/lustre/lustre/osc/osc_lock.c       |    2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |    3 ++-
 drivers/staging/lustre/lustre/ptlrpc/client.c      |    6 +++---
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |    4 ++--
 19 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c
index 5c7cafa..a2f94fa 100644
--- a/drivers/staging/lustre/lnet/selftest/framework.c
+++ b/drivers/staging/lustre/lnet/selftest/framework.c
@@ -453,7 +453,7 @@ sfw_make_session(srpc_mksn_reqst_t *request, srpc_mksn_reply_t *reply)
 	/* brand new or create by force */
 	LIBCFS_ALLOC(sn, sizeof(sfw_session_t));
 	if (!sn) {
-		CERROR("Dropping RPC (mksn) under memory pressure.\n");
+		CERROR("dropping RPC mksn under memory pressure\n");
 		return -ENOMEM;
 	}
 
@@ -1155,7 +1155,7 @@ sfw_add_test(struct srpc_server_rpc *rpc)
 
 	bat = sfw_bid2batch(request->tsr_bid);
 	if (!bat) {
-		CERROR("Dropping RPC (%s) from %s under memory pressure.\n",
+		CERROR("dropping RPC %s from %s under memory pressure\n",
 		       rpc->srpc_scd->scd_svc->sv_name,
 		       libcfs_id2str(rpc->srpc_peer));
 		return -ENOMEM;
@@ -1367,7 +1367,7 @@ sfw_bulk_ready(struct srpc_server_rpc *rpc, int status)
 	}
 
 	if (sfw_del_session_timer()) {
-		CERROR("Dropping RPC (%s) from %s: racing with expiry timer",
+		CERROR("dropping RPC %s from %s: racing with expiry timer\n",
 		       sv->sv_name, libcfs_id2str(rpc->srpc_peer));
 		spin_unlock(&sfw_data.fw_lock);
 		return -EAGAIN;
diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index 129fa02..5eb4232 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -667,7 +667,7 @@ srpc_finish_service(struct srpc_service *sv)
 		}
 
 		if (scd->scd_buf_nposted > 0) {
-			CDEBUG(D_NET, "waiting for %d posted buffers to unlink",
+			CDEBUG(D_NET, "waiting for %d posted buffers to unlink\n",
 			       scd->scd_buf_nposted);
 			spin_unlock(&scd->scd_lock);
 			return 0;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 6f0761c..c7904a9 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -1041,7 +1041,7 @@ int ldlm_cli_cancel(struct lustre_handle *lockh,
 	/* concurrent cancels on the same handle can happen */
 	lock = ldlm_handle2lock_long(lockh, LDLM_FL_CANCELING);
 	if (!lock) {
-		LDLM_DEBUG_NOLOCK("lock is already being destroyed\n");
+		LDLM_DEBUG_NOLOCK("lock is already being destroyed");
 		return 0;
 	}
 
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index d3ed905..cf619af 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -952,7 +952,7 @@ static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
 
 	set = ptlrpc_prep_set();
 	if (!set) {
-		CERROR("can't allocate ptlrpc set\n");
+		CERROR("cannot allocate ptlrpc set: rc = %d\n", -ENOMEM);
 		rc = -ENOMEM;
 	} else {
 		rc = obd_getattr_async(exp, &oinfo, set);
@@ -1180,7 +1180,7 @@ out:
 		CDEBUG(D_VFSTRACE, "Restart %s on %pD from %lld, count:%zd\n",
 		       iot == CIT_READ ? "read" : "write",
 		       file, *ppos, count);
-		LASSERTF(io->ci_nob == 0, "%zd", io->ci_nob);
+		LASSERTF(io->ci_nob == 0, "%zd\n", io->ci_nob);
 		goto restart;
 	}
 
@@ -3415,7 +3415,7 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	LASSERT(lock);
 	LASSERT(ldlm_has_layout(lock));
 
-	LDLM_DEBUG(lock, "File %p/"DFID" being reconfigured: %d.\n",
+	LDLM_DEBUG(lock, "File %p/"DFID" being reconfigured: %d",
 		   inode, PFID(&lli->lli_fid), reconf);
 
 	/* in case this is a caching lock and reinstate with new inode */
@@ -3571,7 +3571,7 @@ again:
 	it.it_op = IT_LAYOUT;
 	lockh.cookie = 0ULL;
 
-	LDLM_DEBUG_NOLOCK("%s: requeue layout lock for file %p/" DFID ".\n",
+	LDLM_DEBUG_NOLOCK("%s: requeue layout lock for file %p/" DFID "",
 			  ll_get_fsname(inode->i_sb, NULL, 0), inode,
 			PFID(&lli->lli_fid));
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 267f001..0f776cf 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -311,7 +311,7 @@ static int lmv_init_ea_size(struct obd_export *exp, int easize,
 		rc = md_init_ea_size(lmv->tgts[i]->ltd_exp, easize, def_easize,
 				     cookiesize, def_cookiesize);
 		if (rc) {
-			CERROR("%s: obd_init_ea_size() failed on MDT target %d: rc = %d.\n",
+			CERROR("%s: obd_init_ea_size() failed on MDT target %d: rc = %d\n",
 			       obd->obd_name, i, rc);
 			break;
 		}
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index 1169a80..ae854bc 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -264,7 +264,7 @@ static int lov_subresult(int result, int rc)
 	int rc_rank;
 
 	LASSERTF(result <= 0 || result == CLO_REPEAT || result == CLO_WAIT,
-		 "result = %d", result);
+		 "result = %d\n", result);
 	LASSERTF(rc <= 0 || rc == CLO_REPEAT || rc == CLO_WAIT,
 		 "rc = %d\n", rc);
 	CLASSERT(CLO_WAIT < CLO_REPEAT);
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index a86c1c4..5daa7fa 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -2199,7 +2199,7 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 	    oqctl->qc_cmd != Q_INITQUOTA &&
 	    oqctl->qc_cmd != LUSTRE_Q_SETQUOTA &&
 	    oqctl->qc_cmd != Q_FINVALIDATE) {
-		CERROR("bad quota opc %x for lov obd", oqctl->qc_cmd);
+		CERROR("bad quota opc %x for lov obd\n", oqctl->qc_cmd);
 		return -EFAULT;
 	}
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_pool.c b/drivers/staging/lustre/lustre/lov/lov_pool.c
index be95dd7..9ae1d6f 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pool.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pool.c
@@ -173,7 +173,7 @@ static void *pool_proc_next(struct seq_file *s, void *v, loff_t *pos)
 	struct pool_iterator *iter = (struct pool_iterator *)s->private;
 	int prev_idx;
 
-	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X", iter->magic);
+	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X\n", iter->magic);
 
 	/* test if end of file */
 	if (*pos >= pool_tgt_count(iter->pool))
@@ -257,7 +257,7 @@ static int pool_proc_show(struct seq_file *s, void *v)
 	struct pool_iterator *iter = (struct pool_iterator *)v;
 	struct lov_tgt_desc *tgt;
 
-	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X", iter->magic);
+	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X\n", iter->magic);
 	LASSERT(iter->pool);
 	LASSERT(iter->idx <= pool_tgt_count(iter->pool));
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index a4f3e70..55dd8ef 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1739,7 +1739,8 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	int rc;
 
 	if (!try_module_get(THIS_MODULE)) {
-		CERROR("Can't get module. Is it alive?");
+		CERROR("%s: cannot get module '%s'\n", obd->obd_name,
+		       module_name(THIS_MODULE));
 		return -EINVAL;
 	}
 	switch (cmd) {
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index bc49633..2dc5e57 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -349,7 +349,8 @@ static int config_log_add(struct obd_device *obd, char *logname,
 		if (ptr != NULL) {
 			*ptr = 0;
 		} else {
-			CERROR("sptlrpc log name not correct: %s", seclogname);
+			CERROR("%s: sptlrpc log name not correct, %s: rc = %d\n",
+			       obd->obd_name, seclogname, -EINVAL);
 			config_log_put(cld);
 			return -EINVAL;
 		}
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_lock.c b/drivers/staging/lustre/lustre/obdclass/cl_lock.c
index f40a2ec..aec644e 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_lock.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_lock.c
@@ -97,7 +97,7 @@ static int cl_lock_invariant(const struct lu_env *env,
 	result = atomic_read(&lock->cll_ref) > 0 &&
 		cl_lock_invariant_trusted(env, lock);
 	if (!result && env)
-		CL_LOCK_DEBUG(D_ERROR, env, lock, "invariant broken");
+		CL_LOCK_DEBUG(D_ERROR, env, lock, "invariant broken\n");
 	return result;
 }
 
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 1ea1578..5f52eab 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -256,7 +256,7 @@ struct dentry *ldebugfs_add_simple(struct dentry *root,
 		mode |= 0200;
 	entry = debugfs_create_file(name, mode, root, data, fops);
 	if (IS_ERR_OR_NULL(entry)) {
-		CERROR("LprocFS: No memory to create <debugfs> entry %s", name);
+		CERROR("LprocFS: No memory to create <debugfs> entry %s\n", name);
 		return entry ?: ERR_PTR(-ENOMEM);
 	}
 	return entry;
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index cefd39e..65a4746 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -459,7 +459,7 @@ int lu_cdebug_printer(const struct lu_env *env,
 		  ARRAY_SIZE(key->lck_area) - used, format, args);
 	if (complete) {
 		if (cfs_cdebug_show(msgdata->msg_mask, msgdata->msg_subsys))
-			libcfs_debug_msg(msgdata, "%s", key->lck_area);
+			libcfs_debug_msg(msgdata, "%s\n", key->lck_area);
 		key->lck_area[0] = 0;
 	}
 	va_end(args);
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index 3edd7c8..64ffe24 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -1293,8 +1293,12 @@ static int echo_get_object(struct echo_object **ecop, struct echo_device *ed,
 
 static void echo_put_object(struct echo_object *eco)
 {
-	if (cl_echo_object_put(eco))
-		CERROR("echo client: drop an object failed");
+	int rc;
+
+	rc = cl_echo_object_put(eco);
+	if (rc)
+		CERROR("%s: echo client drop an object failed: rc = %d\n",
+		       eco->eo_dev->ed_ec->ec_exp->exp_obd->obd_name, rc);
 }
 
 static void
diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 6243aac..2e45255 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -464,7 +464,7 @@ static void osc_extent_insert(struct osc_object *obj, struct osc_extent *ext)
 		else if (ext->oe_start > tmp->oe_end)
 			n = &(*n)->rb_right;
 		else
-			EASSERTF(0, tmp, EXTSTR, EXTPARA(ext));
+			EASSERTF(0, tmp, EXTSTR"\n", EXTPARA(ext));
 	}
 	rb_link_node(&ext->oe_node, parent, n);
 	rb_insert_color(&ext->oe_node, &obj->oo_root);
@@ -674,7 +674,8 @@ static struct osc_extent *osc_extent_find(const struct lu_env *env,
 	/* grants has been allocated by caller */
 	LASSERTF(*grants >= chunksize + cli->cl_extent_tax,
 		 "%u/%u/%u.\n", *grants, chunksize, cli->cl_extent_tax);
-	LASSERTF((max_end - cur->oe_start) < max_pages, EXTSTR, EXTPARA(cur));
+	LASSERTF((max_end - cur->oe_start) < max_pages, EXTSTR"\n",
+		 EXTPARA(cur));
 
 restart:
 	osc_object_lock(obj);
@@ -692,7 +693,7 @@ restart:
 		/* if covering by different locks, no chance to match */
 		if (lock != ext->oe_osclock) {
 			EASSERTF(!overlapped(ext, cur), ext,
-				 EXTSTR, EXTPARA(cur));
+				 EXTSTR"\n", EXTPARA(cur));
 
 			ext = next_extent(ext);
 			continue;
@@ -715,7 +716,7 @@ restart:
 			 */
 			EASSERTF((ext->oe_start <= cur->oe_start &&
 				  ext->oe_end >= cur->oe_end),
-				 ext, EXTSTR, EXTPARA(cur));
+				 ext, EXTSTR"\n", EXTPARA(cur));
 
 			if (ext->oe_state > OES_CACHE || ext->oe_fsync_wait) {
 				/* for simplicity, we wait for this extent to
diff --git a/drivers/staging/lustre/lustre/osc/osc_lock.c b/drivers/staging/lustre/lustre/osc/osc_lock.c
index 8a3e872..013df97 100644
--- a/drivers/staging/lustre/lustre/osc/osc_lock.c
+++ b/drivers/staging/lustre/lustre/osc/osc_lock.c
@@ -1590,7 +1590,7 @@ int osc_lock_init(const struct lu_env *env,
 		if (clk->ols_locklessable && !(enqflags & CEF_DISCARD_DATA))
 			clk->ols_flags |= LDLM_FL_DENY_ON_CONTENTION;
 
-		LDLM_DEBUG_NOLOCK("lock %p, osc lock %p, flags %llx\n",
+		LDLM_DEBUG_NOLOCK("lock %p, osc lock %p, flags %llx",
 				  lock, clk, clk->ols_flags);
 
 		result = 0;
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 2238f92..74805f1 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2639,7 +2639,8 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	int err = 0;
 
 	if (!try_module_get(THIS_MODULE)) {
-		CERROR("Can't get module. Is it alive?");
+		CERROR("%s: cannot get module '%s'\n", obd->obd_name,
+		       module_name(THIS_MODULE));
 		return -EINVAL;
 	}
 	switch (cmd) {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 9f65a10..1b7673e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -722,9 +722,9 @@ struct ptlrpc_request *__ptlrpc_request_alloc(struct obd_import *imp,
 		request = ptlrpc_prep_req_from_pool(pool);
 
 	if (request) {
-		LASSERTF((unsigned long)imp > 0x1000, "%p", imp);
+		LASSERTF((unsigned long)imp > 0x1000, "%p\n", imp);
 		LASSERT(imp != LP_POISON);
-		LASSERTF((unsigned long)imp->imp_client > 0x1000, "%p",
+		LASSERTF((unsigned long)imp->imp_client > 0x1000, "%p\n",
 			 imp->imp_client);
 		LASSERT(imp->imp_client != LP_POISON);
 
@@ -2602,7 +2602,7 @@ int ptlrpc_queue_wait(struct ptlrpc_request *req)
 
 	set = ptlrpc_prep_set();
 	if (!set) {
-		CERROR("Unable to allocate ptlrpc set.");
+		CERROR("cannot allocate ptlrpc set: rc = %d\n", -ENOMEM);
 		return -ENOMEM;
 	}
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
index 14d0fc7..187fd1d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
@@ -1995,7 +1995,7 @@ int sptlrpc_svc_alloc_rs(struct ptlrpc_request *req, int msglen)
 		if (svcpt->scp_service->srv_max_reply_size <
 		   msglen + sizeof(struct ptlrpc_reply_state)) {
 			/* Just return failure if the size is too big */
-			CERROR("size of message is too big (%zd), %d allowed",
+			CERROR("size of message is too big (%zd), %d allowed\n",
 			       msglen + sizeof(struct ptlrpc_reply_state),
 			       svcpt->scp_service->srv_max_reply_size);
 			return -ENOMEM;
@@ -2165,7 +2165,7 @@ int sptlrpc_cli_unwrap_bulk_write(struct ptlrpc_request *req,
 	 * in case of privacy mode, nob_transferred needs to be adjusted.
 	 */
 	if (desc->bd_nob != desc->bd_nob_transferred) {
-		CERROR("nob %d doesn't match transferred nob %d",
+		CERROR("nob %d doesn't match transferred nob %d\n",
 		       desc->bd_nob, desc->bd_nob_transferred);
 		return -EPROTO;
 	}
-- 
1.7.1


* [lustre-devel] [PATCH 06/10] staging: lustre: Correct missing newline
@ 2016-03-05  2:09   ` James Simmons
  0 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Nunez

From: James Nunez <james.a.nunez@intel.com>

Several error messages are missing a newline character
at the end of the message. Add newlines where needed
and make other minor corrections: drop punctuation at
the end of error messages, append the return code to
error messages, put the device name at the beginning,
etc.

Newlines are removed in only a few places: the
LDLM_DEBUG_NOLOCK calls and one LDLM_DEBUG call. The definition
of LDLM_DEBUG_NOLOCK already ends with a newline, so the
extra one resulted in double newlines being printed.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4871
Reviewed-on: http://review.whamcloud.com/10000
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
---
 drivers/staging/lustre/lnet/selftest/framework.c   |    6 +++---
 drivers/staging/lustre/lnet/selftest/rpc.c         |    2 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |    8 ++++----
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |    2 +-
 drivers/staging/lustre/lustre/lov/lov_lock.c       |    2 +-
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    2 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c       |    4 ++--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    3 ++-
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    3 ++-
 drivers/staging/lustre/lustre/obdclass/cl_lock.c   |    2 +-
 .../lustre/lustre/obdclass/lprocfs_status.c        |    2 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    2 +-
 .../staging/lustre/lustre/obdecho/echo_client.c    |    8 ++++++--
 drivers/staging/lustre/lustre/osc/osc_cache.c      |    9 +++++----
 drivers/staging/lustre/lustre/osc/osc_lock.c       |    2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |    3 ++-
 drivers/staging/lustre/lustre/ptlrpc/client.c      |    6 +++---
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |    4 ++--
 19 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c
index 5c7cafa..a2f94fa 100644
--- a/drivers/staging/lustre/lnet/selftest/framework.c
+++ b/drivers/staging/lustre/lnet/selftest/framework.c
@@ -453,7 +453,7 @@ sfw_make_session(srpc_mksn_reqst_t *request, srpc_mksn_reply_t *reply)
 	/* brand new or create by force */
 	LIBCFS_ALLOC(sn, sizeof(sfw_session_t));
 	if (!sn) {
-		CERROR("Dropping RPC (mksn) under memory pressure.\n");
+		CERROR("dropping RPC mksn under memory pressure\n");
 		return -ENOMEM;
 	}
 
@@ -1155,7 +1155,7 @@ sfw_add_test(struct srpc_server_rpc *rpc)
 
 	bat = sfw_bid2batch(request->tsr_bid);
 	if (!bat) {
-		CERROR("Dropping RPC (%s) from %s under memory pressure.\n",
+		CERROR("dropping RPC %s from %s under memory pressure\n",
 		       rpc->srpc_scd->scd_svc->sv_name,
 		       libcfs_id2str(rpc->srpc_peer));
 		return -ENOMEM;
@@ -1367,7 +1367,7 @@ sfw_bulk_ready(struct srpc_server_rpc *rpc, int status)
 	}
 
 	if (sfw_del_session_timer()) {
-		CERROR("Dropping RPC (%s) from %s: racing with expiry timer",
+		CERROR("dropping RPC %s from %s: racing with expiry timer\n",
 		       sv->sv_name, libcfs_id2str(rpc->srpc_peer));
 		spin_unlock(&sfw_data.fw_lock);
 		return -EAGAIN;
diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index 129fa02..5eb4232 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -667,7 +667,7 @@ srpc_finish_service(struct srpc_service *sv)
 		}
 
 		if (scd->scd_buf_nposted > 0) {
-			CDEBUG(D_NET, "waiting for %d posted buffers to unlink",
+			CDEBUG(D_NET, "waiting for %d posted buffers to unlink\n",
 			       scd->scd_buf_nposted);
 			spin_unlock(&scd->scd_lock);
 			return 0;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 6f0761c..c7904a9 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -1041,7 +1041,7 @@ int ldlm_cli_cancel(struct lustre_handle *lockh,
 	/* concurrent cancels on the same handle can happen */
 	lock = ldlm_handle2lock_long(lockh, LDLM_FL_CANCELING);
 	if (!lock) {
-		LDLM_DEBUG_NOLOCK("lock is already being destroyed\n");
+		LDLM_DEBUG_NOLOCK("lock is already being destroyed");
 		return 0;
 	}
 
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index d3ed905..cf619af 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -952,7 +952,7 @@ static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
 
 	set = ptlrpc_prep_set();
 	if (!set) {
-		CERROR("can't allocate ptlrpc set\n");
+		CERROR("cannot allocate ptlrpc set: rc = %d\n", -ENOMEM);
 		rc = -ENOMEM;
 	} else {
 		rc = obd_getattr_async(exp, &oinfo, set);
@@ -1180,7 +1180,7 @@ out:
 		CDEBUG(D_VFSTRACE, "Restart %s on %pD from %lld, count:%zd\n",
 		       iot == CIT_READ ? "read" : "write",
 		       file, *ppos, count);
-		LASSERTF(io->ci_nob == 0, "%zd", io->ci_nob);
+		LASSERTF(io->ci_nob == 0, "%zd\n", io->ci_nob);
 		goto restart;
 	}
 
@@ -3415,7 +3415,7 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	LASSERT(lock);
 	LASSERT(ldlm_has_layout(lock));
 
-	LDLM_DEBUG(lock, "File %p/"DFID" being reconfigured: %d.\n",
+	LDLM_DEBUG(lock, "File %p/"DFID" being reconfigured: %d",
 		   inode, PFID(&lli->lli_fid), reconf);
 
 	/* in case this is a caching lock and reinstate with new inode */
@@ -3571,7 +3571,7 @@ again:
 	it.it_op = IT_LAYOUT;
 	lockh.cookie = 0ULL;
 
-	LDLM_DEBUG_NOLOCK("%s: requeue layout lock for file %p/" DFID ".\n",
+	LDLM_DEBUG_NOLOCK("%s: requeue layout lock for file %p/" DFID "",
 			  ll_get_fsname(inode->i_sb, NULL, 0), inode,
 			PFID(&lli->lli_fid));
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 267f001..0f776cf 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -311,7 +311,7 @@ static int lmv_init_ea_size(struct obd_export *exp, int easize,
 		rc = md_init_ea_size(lmv->tgts[i]->ltd_exp, easize, def_easize,
 				     cookiesize, def_cookiesize);
 		if (rc) {
-			CERROR("%s: obd_init_ea_size() failed on MDT target %d: rc = %d.\n",
+			CERROR("%s: obd_init_ea_size() failed on MDT target %d: rc = %d\n",
 			       obd->obd_name, i, rc);
 			break;
 		}
diff --git a/drivers/staging/lustre/lustre/lov/lov_lock.c b/drivers/staging/lustre/lustre/lov/lov_lock.c
index 1169a80..ae854bc 100644
--- a/drivers/staging/lustre/lustre/lov/lov_lock.c
+++ b/drivers/staging/lustre/lustre/lov/lov_lock.c
@@ -264,7 +264,7 @@ static int lov_subresult(int result, int rc)
 	int rc_rank;
 
 	LASSERTF(result <= 0 || result == CLO_REPEAT || result == CLO_WAIT,
-		 "result = %d", result);
+		 "result = %d\n", result);
 	LASSERTF(rc <= 0 || rc == CLO_REPEAT || rc == CLO_WAIT,
 		 "rc = %d\n", rc);
 	CLASSERT(CLO_WAIT < CLO_REPEAT);
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index a86c1c4..5daa7fa 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -2199,7 +2199,7 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 	    oqctl->qc_cmd != Q_INITQUOTA &&
 	    oqctl->qc_cmd != LUSTRE_Q_SETQUOTA &&
 	    oqctl->qc_cmd != Q_FINVALIDATE) {
-		CERROR("bad quota opc %x for lov obd", oqctl->qc_cmd);
+		CERROR("bad quota opc %x for lov obd\n", oqctl->qc_cmd);
 		return -EFAULT;
 	}
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_pool.c b/drivers/staging/lustre/lustre/lov/lov_pool.c
index be95dd7..9ae1d6f 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pool.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pool.c
@@ -173,7 +173,7 @@ static void *pool_proc_next(struct seq_file *s, void *v, loff_t *pos)
 	struct pool_iterator *iter = (struct pool_iterator *)s->private;
 	int prev_idx;
 
-	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X", iter->magic);
+	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X\n", iter->magic);
 
 	/* test if end of file */
 	if (*pos >= pool_tgt_count(iter->pool))
@@ -257,7 +257,7 @@ static int pool_proc_show(struct seq_file *s, void *v)
 	struct pool_iterator *iter = (struct pool_iterator *)v;
 	struct lov_tgt_desc *tgt;
 
-	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X", iter->magic);
+	LASSERTF(iter->magic == POOL_IT_MAGIC, "%08X\n", iter->magic);
 	LASSERT(iter->pool);
 	LASSERT(iter->idx <= pool_tgt_count(iter->pool));
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index a4f3e70..55dd8ef 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1739,7 +1739,8 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	int rc;
 
 	if (!try_module_get(THIS_MODULE)) {
-		CERROR("Can't get module. Is it alive?");
+		CERROR("%s: cannot get module '%s'\n", obd->obd_name,
+		       module_name(THIS_MODULE));
 		return -EINVAL;
 	}
 	switch (cmd) {
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index bc49633..2dc5e57 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -349,7 +349,8 @@ static int config_log_add(struct obd_device *obd, char *logname,
 		if (ptr != NULL) {
 			*ptr = 0;
 		} else {
-			CERROR("sptlrpc log name not correct: %s", seclogname);
+			CERROR("%s: sptlrpc log name not correct, %s: rc = %d\n",
+			       obd->obd_name, seclogname, -EINVAL);
 			config_log_put(cld);
 			return -EINVAL;
 		}
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_lock.c b/drivers/staging/lustre/lustre/obdclass/cl_lock.c
index f40a2ec..aec644e 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_lock.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_lock.c
@@ -97,7 +97,7 @@ static int cl_lock_invariant(const struct lu_env *env,
 	result = atomic_read(&lock->cll_ref) > 0 &&
 		cl_lock_invariant_trusted(env, lock);
 	if (!result && env)
-		CL_LOCK_DEBUG(D_ERROR, env, lock, "invariant broken");
+		CL_LOCK_DEBUG(D_ERROR, env, lock, "invariant broken\n");
 	return result;
 }
 
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 1ea1578..5f52eab 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -256,7 +256,7 @@ struct dentry *ldebugfs_add_simple(struct dentry *root,
 		mode |= 0200;
 	entry = debugfs_create_file(name, mode, root, data, fops);
 	if (IS_ERR_OR_NULL(entry)) {
-		CERROR("LprocFS: No memory to create <debugfs> entry %s", name);
+		CERROR("LprocFS: No memory to create <debugfs> entry %s\n", name);
 		return entry ?: ERR_PTR(-ENOMEM);
 	}
 	return entry;
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index cefd39e..65a4746 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -459,7 +459,7 @@ int lu_cdebug_printer(const struct lu_env *env,
 		  ARRAY_SIZE(key->lck_area) - used, format, args);
 	if (complete) {
 		if (cfs_cdebug_show(msgdata->msg_mask, msgdata->msg_subsys))
-			libcfs_debug_msg(msgdata, "%s", key->lck_area);
+			libcfs_debug_msg(msgdata, "%s\n", key->lck_area);
 		key->lck_area[0] = 0;
 	}
 	va_end(args);
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index 3edd7c8..64ffe24 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -1293,8 +1293,12 @@ static int echo_get_object(struct echo_object **ecop, struct echo_device *ed,
 
 static void echo_put_object(struct echo_object *eco)
 {
-	if (cl_echo_object_put(eco))
-		CERROR("echo client: drop an object failed");
+	int rc;
+
+	rc = cl_echo_object_put(eco);
+	if (rc)
+		CERROR("%s: echo client drop an object failed: rc = %d\n",
+		       eco->eo_dev->ed_ec->ec_exp->exp_obd->obd_name, rc);
 }
 
 static void
diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 6243aac..2e45255 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -464,7 +464,7 @@ static void osc_extent_insert(struct osc_object *obj, struct osc_extent *ext)
 		else if (ext->oe_start > tmp->oe_end)
 			n = &(*n)->rb_right;
 		else
-			EASSERTF(0, tmp, EXTSTR, EXTPARA(ext));
+			EASSERTF(0, tmp, EXTSTR"\n", EXTPARA(ext));
 	}
 	rb_link_node(&ext->oe_node, parent, n);
 	rb_insert_color(&ext->oe_node, &obj->oo_root);
@@ -674,7 +674,8 @@ static struct osc_extent *osc_extent_find(const struct lu_env *env,
 	/* grants has been allocated by caller */
 	LASSERTF(*grants >= chunksize + cli->cl_extent_tax,
 		 "%u/%u/%u.\n", *grants, chunksize, cli->cl_extent_tax);
-	LASSERTF((max_end - cur->oe_start) < max_pages, EXTSTR, EXTPARA(cur));
+	LASSERTF((max_end - cur->oe_start) < max_pages, EXTSTR"\n",
+		 EXTPARA(cur));
 
 restart:
 	osc_object_lock(obj);
@@ -692,7 +693,7 @@ restart:
 		/* if covering by different locks, no chance to match */
 		if (lock != ext->oe_osclock) {
 			EASSERTF(!overlapped(ext, cur), ext,
-				 EXTSTR, EXTPARA(cur));
+				 EXTSTR"\n", EXTPARA(cur));
 
 			ext = next_extent(ext);
 			continue;
@@ -715,7 +716,7 @@ restart:
 			 */
 			EASSERTF((ext->oe_start <= cur->oe_start &&
 				  ext->oe_end >= cur->oe_end),
-				 ext, EXTSTR, EXTPARA(cur));
+				 ext, EXTSTR"\n", EXTPARA(cur));
 
 			if (ext->oe_state > OES_CACHE || ext->oe_fsync_wait) {
 				/* for simplicity, we wait for this extent to
diff --git a/drivers/staging/lustre/lustre/osc/osc_lock.c b/drivers/staging/lustre/lustre/osc/osc_lock.c
index 8a3e872..013df97 100644
--- a/drivers/staging/lustre/lustre/osc/osc_lock.c
+++ b/drivers/staging/lustre/lustre/osc/osc_lock.c
@@ -1590,7 +1590,7 @@ int osc_lock_init(const struct lu_env *env,
 		if (clk->ols_locklessable && !(enqflags & CEF_DISCARD_DATA))
 			clk->ols_flags |= LDLM_FL_DENY_ON_CONTENTION;
 
-		LDLM_DEBUG_NOLOCK("lock %p, osc lock %p, flags %llx\n",
+		LDLM_DEBUG_NOLOCK("lock %p, osc lock %p, flags %llx",
 				  lock, clk, clk->ols_flags);
 
 		result = 0;
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 2238f92..74805f1 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2639,7 +2639,8 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	int err = 0;
 
 	if (!try_module_get(THIS_MODULE)) {
-		CERROR("Can't get module. Is it alive?");
+		CERROR("%s: cannot get module '%s'\n", obd->obd_name,
+		       module_name(THIS_MODULE));
 		return -EINVAL;
 	}
 	switch (cmd) {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 9f65a10..1b7673e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -722,9 +722,9 @@ struct ptlrpc_request *__ptlrpc_request_alloc(struct obd_import *imp,
 		request = ptlrpc_prep_req_from_pool(pool);
 
 	if (request) {
-		LASSERTF((unsigned long)imp > 0x1000, "%p", imp);
+		LASSERTF((unsigned long)imp > 0x1000, "%p\n", imp);
 		LASSERT(imp != LP_POISON);
-		LASSERTF((unsigned long)imp->imp_client > 0x1000, "%p",
+		LASSERTF((unsigned long)imp->imp_client > 0x1000, "%p\n",
 			 imp->imp_client);
 		LASSERT(imp->imp_client != LP_POISON);
 
@@ -2602,7 +2602,7 @@ int ptlrpc_queue_wait(struct ptlrpc_request *req)
 
 	set = ptlrpc_prep_set();
 	if (!set) {
-		CERROR("Unable to allocate ptlrpc set.");
+		CERROR("cannot allocate ptlrpc set: rc = %d\n", -ENOMEM);
 		return -ENOMEM;
 	}
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
index 14d0fc7..187fd1d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
@@ -1995,7 +1995,7 @@ int sptlrpc_svc_alloc_rs(struct ptlrpc_request *req, int msglen)
 		if (svcpt->scp_service->srv_max_reply_size <
 		   msglen + sizeof(struct ptlrpc_reply_state)) {
 			/* Just return failure if the size is too big */
-			CERROR("size of message is too big (%zd), %d allowed",
+			CERROR("size of message is too big (%zd), %d allowed\n",
 			       msglen + sizeof(struct ptlrpc_reply_state),
 			       svcpt->scp_service->srv_max_reply_size);
 			return -ENOMEM;
@@ -2165,7 +2165,7 @@ int sptlrpc_cli_unwrap_bulk_write(struct ptlrpc_request *req,
 	 * in case of privacy mode, nob_transferred needs to be adjusted.
 	 */
 	if (desc->bd_nob != desc->bd_nob_transferred) {
-		CERROR("nob %d doesn't match transferred nob %d",
+		CERROR("nob %d doesn't match transferred nob %d\n",
 		       desc->bd_nob, desc->bd_nob_transferred);
 		return -EPROTO;
 	}
-- 
1.7.1

* [PATCH 07/10] staging: lustre: add last missing sparse annotation __user
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Frank Zago

From: Frank Zago <fzago@cray.com>

One of the __user annotations was missed when the changes were
applied to the upstream client. This fix is broken out of patch 11819.

Signed-off-by: Frank Zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5396
Reviewed-on: http://review.whamcloud.com/11819
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index a666d49..7395985 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -2041,7 +2041,7 @@ LNetCtl(unsigned int cmd, void *arg)
 		id.nid = data->ioc_nid;
 		id.pid = data->ioc_u32[0];
 		rc = lnet_ping(id, data->ioc_u32[1], /* timeout */
-			       data->ioc_pbuf1,
+			       (lnet_process_id_t __user *)data->ioc_pbuf1,
 			       data->ioc_plen1 / sizeof(lnet_process_id_t));
 		if (rc < 0)
 			return rc;
-- 
1.7.1

* [PATCH 08/10] staging: lustre: change test to assert in LNetGetId
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

The ln_refcount test was changed into an assert.
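
To illustrate the behavioural difference, here is a minimal userspace
sketch. It assumes a hypothetical struct lnet_sketch in place of the_lnet
and uses the C library assert() as a stand-in for LASSERT(); neither
helper below exists in the kernel tree.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the_lnet; only the fields needed here. */
struct lnet_sketch {
	int ln_refcount;
	int n_ids;
};

/* Before: an uninitialised LNet is reported to the caller as -ENOENT. */
static int get_id_checked(const struct lnet_sketch *ln, unsigned int index)
{
	if (!ln->ln_refcount)
		return -ENOENT;
	return index < (unsigned int)ln->n_ids ? 0 : -ENOENT;
}

/* After: calling without a reference is treated as a caller bug. */
static int get_id_asserting(const struct lnet_sketch *ln, unsigned int index)
{
	assert(ln->ln_refcount > 0);	/* stand-in for LASSERT() */
	return index < (unsigned int)ln->n_ids ? 0 : -ENOENT;
}
```

With the assert, a caller that reaches this path without holding a
reference stops immediately instead of seeing the same -ENOENT that an
out-of-range index produces.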

Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lnet/lnet/api-ni.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 7395985..4843980 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -2090,9 +2090,7 @@ LNetGetId(unsigned int index, lnet_process_id_t *id)
 	int cpt;
 	int rc = -ENOENT;
 
-	/* LNetNI initilization failed? */
-	if (!the_lnet.ln_refcount)
-		return rc;
+	LASSERT(the_lnet.ln_refcount > 0);
 
 	cpt = lnet_net_lock_current();
 
-- 
1.7.1

* [PATCH 09/10] staging: lustre: rename proc_call_handler to lprocfs_call_handler
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

Using proc_call_handler as a function name is way too generic.
Rename to lprocfs_call_handler to avoid possible collisions.

Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/libcfs/module.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/libcfs/module.c b/drivers/staging/lustre/lustre/libcfs/module.c
index 2a62331..a7e06ec 100644
--- a/drivers/staging/lustre/lustre/libcfs/module.c
+++ b/drivers/staging/lustre/lustre/libcfs/module.c
@@ -217,7 +217,7 @@ struct cfs_psdev_ops libcfs_psdev_ops = {
 	libcfs_ioctl
 };
 
-static int proc_call_handler(void *data, int write, loff_t *ppos,
+static int lprocfs_call_handler(void *data, int write, loff_t *ppos,
 			     void __user *buffer, size_t *lenp,
 			     int (*handler)(void *data, int write, loff_t pos,
 					    void __user *buffer, int len))
@@ -280,8 +280,8 @@ static int __proc_dobitmasks(void *data, int write,
 static int proc_dobitmasks(struct ctl_table *table, int write,
 			   void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-	return proc_call_handler(table->data, write, ppos, buffer, lenp,
-				 __proc_dobitmasks);
+	return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+				    __proc_dobitmasks);
 }
 
 static int __proc_dump_kernel(void *data, int write,
@@ -296,8 +296,8 @@ static int __proc_dump_kernel(void *data, int write,
 static int proc_dump_kernel(struct ctl_table *table, int write,
 			    void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-	return proc_call_handler(table->data, write, ppos, buffer, lenp,
-				 __proc_dump_kernel);
+	return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+				    __proc_dump_kernel);
 }
 
 static int __proc_daemon_file(void *data, int write,
@@ -319,8 +319,8 @@ static int __proc_daemon_file(void *data, int write,
 static int proc_daemon_file(struct ctl_table *table, int write,
 			    void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-	return proc_call_handler(table->data, write, ppos, buffer, lenp,
-				 __proc_daemon_file);
+	return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+				    __proc_daemon_file);
 }
 
 static int libcfs_force_lbug(struct ctl_table *table, int write,
@@ -389,8 +389,8 @@ static int __proc_cpt_table(void *data, int write,
 static int proc_cpt_table(struct ctl_table *table, int write,
 			  void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-	return proc_call_handler(table->data, write, ppos, buffer, lenp,
-				 __proc_cpt_table);
+	return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+				    __proc_cpt_table);
 }
 
 static struct ctl_table lnet_table[] = {
-- 
1.7.1

* [PATCH 10/10] staging: lustre: make LNet use lprocfs_call_handler
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05  2:09   ` James Simmons
  -1 siblings, 0 replies; 30+ messages in thread
From: James Simmons @ 2016-03-05  2:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

Some time ago a patch was submitted that duplicated the
proc_call_handler code in the LNet layer, on the assumption
that libcfs was not used by the LNet layer. That assumption
was wrong, so make LNet use lprocfs_call_handler from the
libcfs layer.
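
For readers unfamiliar with the helper, a minimal userspace sketch of
its position/length bookkeeping follows. The bookkeeping mirrors the
function body shown in the diff below; the loff_sketch_t typedef, the
empty __user define and demo_handler are assumptions made only so the
example compiles outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins (assumptions): the kernel provides __user and loff_t. */
#define __user
typedef long long loff_sketch_t;

typedef int (*handler_fn)(void *data, int write, loff_sketch_t pos,
			  void __user *buffer, int len);

/* Same bookkeeping as the shared helper: advance *ppos, report bytes read. */
static int call_handler_sketch(void *data, int write, loff_sketch_t *ppos,
			       void __user *buffer, size_t *lenp,
			       handler_fn handler)
{
	int rc = handler(data, write, *ppos, buffer, (int)*lenp);

	if (rc < 0)
		return rc;

	if (write) {
		*ppos += *lenp;
	} else {
		*lenp = rc;
		*ppos += rc;
	}
	return 0;
}

/* Hypothetical handler: reads copy out a fixed string, writes are accepted. */
static int demo_handler(void *data, int write, loff_sketch_t pos,
			void __user *buffer, int len)
{
	const char *msg = data;
	size_t total = strlen(msg);
	size_t n;

	if (write)
		return len;
	if (pos < 0)
		return -1;
	if ((size_t)pos >= total)
		return 0;	/* nothing left to read */
	n = total - (size_t)pos;
	if (n > (size_t)len)
		n = (size_t)len;
	memcpy(buffer, msg + pos, n);
	return (int)n;
}
```

Each proc_* entry point then reduces to a single call that passes its
own __proc_* handler, which is why one exported copy is enough.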

Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/include/linux/libcfs/libcfs.h   |    4 ++
 drivers/staging/lustre/lnet/lnet/router_proc.c     |   32 ++++----------------
 drivers/staging/lustre/lustre/libcfs/module.c      |    9 +++--
 3 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
index 1eab0eb..7d63620 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
@@ -154,5 +154,9 @@ struct lnet_debugfs_symlink_def {
 
 void lustre_insert_debugfs(struct ctl_table *table,
 			   const struct lnet_debugfs_symlink_def *symlinks);
+int lprocfs_call_handler(void *data, int write, loff_t *ppos,
+			 void __user *buffer, size_t *lenp,
+			 int (*handler)(void *data, int write,
+			 loff_t pos, void __user *buffer, int len));
 
 #endif /* _LIBCFS_H */
diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
index ce4331e..65f65a3 100644
--- a/drivers/staging/lustre/lnet/lnet/router_proc.c
+++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
@@ -73,26 +73,6 @@
 
 #define LNET_PROC_VERSION(v)	((unsigned int)((v) & LNET_PROC_VER_MASK))
 
-static int proc_call_handler(void *data, int write, loff_t *ppos,
-			     void __user *buffer, size_t *lenp,
-			     int (*handler)(void *data, int write,
-					    loff_t pos, void __user *buffer,
-					    int len))
-{
-	int rc = handler(data, write, *ppos, buffer, *lenp);
-
-	if (rc < 0)
-		return rc;
-
-	if (write) {
-		*ppos += *lenp;
-	} else {
-		*lenp = rc;
-		*ppos += rc;
-	}
-	return 0;
-}
-
 static int __proc_lnet_stats(void *data, int write,
 			     loff_t pos, void __user *buffer, int nob)
 {
@@ -144,8 +124,8 @@ static int __proc_lnet_stats(void *data, int write,
 static int proc_lnet_stats(struct ctl_table *table, int write,
 			   void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-	return proc_call_handler(table->data, write, ppos, buffer, lenp,
-				 __proc_lnet_stats);
+	return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+				    __proc_lnet_stats);
 }
 
 static int proc_lnet_routes(struct ctl_table *table, int write,
@@ -640,8 +620,8 @@ static int __proc_lnet_buffers(void *data, int write,
 static int proc_lnet_buffers(struct ctl_table *table, int write,
 			     void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-	return proc_call_handler(table->data, write, ppos, buffer, lenp,
-				 __proc_lnet_buffers);
+	return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+				    __proc_lnet_buffers);
 }
 
 static int proc_lnet_nis(struct ctl_table *table, int write,
@@ -865,8 +845,8 @@ static int proc_lnet_portal_rotor(struct ctl_table *table, int write,
 				  void __user *buffer, size_t *lenp,
 				  loff_t *ppos)
 {
-	return proc_call_handler(table->data, write, ppos, buffer, lenp,
-				 __proc_lnet_portal_rotor);
+	return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+				    __proc_lnet_portal_rotor);
 }
 
 static struct ctl_table lnet_table[] = {
diff --git a/drivers/staging/lustre/lustre/libcfs/module.c b/drivers/staging/lustre/lustre/libcfs/module.c
index a7e06ec..cdc640b 100644
--- a/drivers/staging/lustre/lustre/libcfs/module.c
+++ b/drivers/staging/lustre/lustre/libcfs/module.c
@@ -217,10 +217,10 @@ struct cfs_psdev_ops libcfs_psdev_ops = {
 	libcfs_ioctl
 };
 
-static int lprocfs_call_handler(void *data, int write, loff_t *ppos,
-			     void __user *buffer, size_t *lenp,
-			     int (*handler)(void *data, int write, loff_t pos,
-					    void __user *buffer, int len))
+int lprocfs_call_handler(void *data, int write, loff_t *ppos,
+			 void __user *buffer, size_t *lenp,
+			 int (*handler)(void *data, int write, loff_t pos,
+					void __user *buffer, int len))
 {
 	int rc = handler(data, write, *ppos, buffer, *lenp);
 
@@ -235,6 +235,7 @@ static int lprocfs_call_handler(void *data, int write, loff_t *ppos,
 	}
 	return 0;
 }
+EXPORT_SYMBOL(lprocfs_call_handler);
 
 static int __proc_dobitmasks(void *data, int write,
 			     loff_t pos, void __user *buffer, int nob)
-- 
1.7.1

* Re: [PATCH 07/10] staging: lustre: add last missing sparse annotation __user
  2016-03-05  2:09   ` [lustre-devel] " James Simmons
@ 2016-03-05  2:55     ` Drokin, Oleg
  -1 siblings, 0 replies; 30+ messages in thread
From: Drokin, Oleg @ 2016-03-05  2:55 UTC (permalink / raw)
  To: James Simmons
  Cc: Greg Kroah-Hartman, <devel@driverdev.osuosl.org>,
	Dilger, Andreas, Linux Kernel Mailing List,
	Lustre Development List, Frank Zago


On Mar 4, 2016, at 9:09 PM, James Simmons wrote:

> From: Frank Zago <fzago@cray.com>
> 
> One of the __user was missed in being applied to upstream
> client. This is broken out of patch 11819.

It was not; the bug was fixed in another way.

> Signed-off-by: Frank Zago <fzago@cray.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5396
> Reviewed-on: http://review.whamcloud.com/11819
> Reviewed-by: James Simmons <uja.ornl@gmail.com>
> Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> ---
> drivers/staging/lustre/lnet/lnet/api-ni.c |    2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
> index a666d49..7395985 100644
> --- a/drivers/staging/lustre/lnet/lnet/api-ni.c
> +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
> @@ -2041,7 +2041,7 @@ LNetCtl(unsigned int cmd, void *arg)
> 		id.nid = data->ioc_nid;
> 		id.pid = data->ioc_u32[0];
> 		rc = lnet_ping(id, data->ioc_u32[1], /* timeout */
> -			       data->ioc_pbuf1,
> +			       (lnet_process_id_t __user *)data->ioc_pbuf1,

We do not need this one anymore, since ioc_pbuf1 is defined as __user now:
drivers/staging/lustre/include/linux/libcfs/libcfs_ioctl.h:     void __user *ioc_pbuf1;
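
A rough sketch of why the cast becomes redundant, assuming hypothetical
stand-in types (id_sketch, ioc_sketch) and helper names that are not in
the tree. The __user define follows the usual sparse convention as far
as I recall it: an address-space attribute under __CHECKER__, and
nothing in a normal build.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Under sparse (__CHECKER__) __user marks a pointer as a non-dereferenceable
 * user-space address; in a normal compile it expands to nothing.
 */
#ifdef __CHECKER__
# define __user __attribute__((noderef, address_space(1)))
#else
# define __user
#endif

/* Hypothetical stand-ins for lnet_process_id_t and the ioctl data struct. */
struct id_sketch {
	unsigned long long nid;
	unsigned int pid;
};

struct ioc_sketch {
	void __user *ioc_pbuf1;	/* already annotated at the declaration */
	unsigned int ioc_plen1;
};

/* Takes a typed user pointer; echoes the id count for a non-NULL buffer. */
static int ping_sketch(struct id_sketch __user *ids, int n_ids)
{
	return ids ? n_ids : 0;
}

static int ioctl_ping_sketch(struct ioc_sketch *data)
{
	/* void __user * converts to a typed __user pointer without a cast. */
	return ping_sketch(data->ioc_pbuf1,
			   (int)(data->ioc_plen1 / sizeof(struct id_sketch)));
}
```

Because the annotation lives on the struct member, every user of
ioc_pbuf1 inherits it, and no call site needs its own cast.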

> 			       data->ioc_plen1 / sizeof(lnet_process_id_t));
> 		if (rc < 0)
> 			return rc;
> -- 
> 1.7.1
> 

* Re: [PATCH 01/10] staging: lustre: LNet drop rule implementation
  2016-03-05  2:09   ` [lustre-devel] " James Simmons
@ 2016-03-05  7:53     ` kbuild test robot
  -1 siblings, 0 replies; 30+ messages in thread
From: kbuild test robot @ 2016-03-05  7:53 UTC (permalink / raw)
  To: James Simmons
  Cc: kbuild-all, Greg Kroah-Hartman, devel, Andreas Dilger,
	Oleg Drokin, Liang Zhen, Linux Kernel Mailing List,
	Lustre Development List

[-- Attachment #1: Type: text/plain, Size: 896 bytes --]

Hi Liang,

[auto build test ERROR on staging/staging-testing]
[also build test ERROR on v4.5-rc6 next-20160304]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/James-Simmons/Last-batch-of-fixes-for-LNet/20160305-101431
config: m68k-allyesconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=m68k 

All errors (new ones prefixed by >>):

>> ERROR: "__umoddi3" [drivers/staging/lustre/lnet/lnet/lnet.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 35803 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/10] Last batch of fixes for LNet
  2016-03-05  2:09 ` [lustre-devel] " James Simmons
@ 2016-03-05 19:52   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 30+ messages in thread
From: Greg Kroah-Hartman @ 2016-03-05 19:52 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List

On Fri, Mar 04, 2016 at 09:09:40PM -0500, James Simmons wrote:
> This batch merges the remaining LNet patches from the OpenSFS
> branch for the upstream client. Once merged the LNet code
> will be up to date with the latest production code. Only style
> issues are remaining. Still future patches being developed
> for LNet will be landed to the upstream client as soon as they
> are ready after extensive testing.

Please fix up the build issue, and figure out what went wrong with the
__user patch that you sent and resend this series after reworking them.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [lustre-devel] [PATCH 00/10] Last batch of fixes for LNet
  2016-03-05 19:52   ` [lustre-devel] " Greg Kroah-Hartman
@ 2016-03-07 19:34     ` Simmons, James A.
  -1 siblings, 0 replies; 30+ messages in thread
From: Simmons, James A. @ 2016-03-07 19:34 UTC (permalink / raw)
  To: 'Greg Kroah-Hartman', James Simmons
  Cc: devel, Oleg Drokin, Linux Kernel Mailing List, Lustre Development List

>On Fri, Mar 04, 2016 at 09:09:40PM -0500, James Simmons wrote:
>> This batch merges the remaining LNet patches from the OpenSFS
>> branch for the upstream client. Once merged the LNet code
>> will be up to date with the latest production code. Only style
>> issues are remaining. Still future patches being developed
>> for LNet will be landed to the upstream client as soon as they
>> are ready after extensive testing.
>
>Please fix up the build issue, and figure out what went wrong with the
>__user patch that you sent and resend this series after reworking them.

I had a discussion with Oleg about the __user patch. It appears that is
a bug in the production branch, so that patch can be dropped. As for the
build issues, this has been a known issue for a while, but nobody has
gotten around to fixing all the 32-bit issues. I guess it is time to fix that up.
I will send out new patches later, after I'm done testing them.
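
For context, a hedged sketch of the kind of 32-bit issue behind the `__umoddi3` link error reported by the kbuild robot. On 32-bit targets such as m68k, gcc lowers a 64-bit `%` into a call to the libgcc helper `__umoddi3`, which the kernel does not provide. Kernel code usually avoids this with `do_div()` or `div_u64_rem()` from `<linux/math64.h>`. The offending expression is not shown in this thread, so the user-space code below only illustrates the idea: `sketch_div_u64_rem()` is a hypothetical stand-in with the same shape as `div_u64_rem()`, written with shift-and-subtract so that no 64-bit `/` or `%` operator appears.

```c
#include <stdint.h>

/*
 * Hypothetical user-space stand-in shaped like the kernel's
 * div_u64_rem(): returns the quotient and stores the remainder.
 * The caller must pass a non-zero divisor.
 */
static uint64_t sketch_div_u64_rem(uint64_t dividend, uint32_t divisor,
				   uint32_t *remainder)
{
	uint64_t quot = 0;
	uint64_t rem = 0;	/* always < 2 * divisor, so it fits in 64 bits */
	int i;

	/* Classic binary long division, most significant bit first. */
	for (i = 63; i >= 0; i--) {
		rem = (rem << 1) | ((dividend >> i) & 1);
		if (rem >= divisor) {
			rem -= divisor;
			quot |= (uint64_t)1 << i;
		}
	}

	*remainder = (uint32_t)rem;
	return quot;
}
```

In kernel code the equivalent change would be to replace an expression of the form `a % b` (with a 64-bit `a`) by a call such as `div_u64_rem(a, b, &rem)`, rather than open-coding the division as done here.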

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2016-03-07 19:35 UTC | newest]

Thread overview: 30+ messages
2016-03-05  2:09 [PATCH 00/10] Last batch of fixes for LNet James Simmons
2016-03-05  2:09 ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 01/10] staging: lustre: LNet drop rule implementation James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  7:53   ` kbuild test robot
2016-03-05  7:53     ` [lustre-devel] " kbuild test robot
2016-03-05  2:09 ` [PATCH 02/10] staging: lustre: LNet network latency simulation James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 03/10] staging: lustre: fix 'data race condition' issue in conrpc.c James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 04/10] staging: lustre: fix 'NULL pointer dereference' errors James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 05/10] staging: lustre: fix 'data race condition' issue in framework.c James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 06/10] staging: lustre: Correct missing newline James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 07/10] staging: lustre: add last missing sparse annotation __user James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:55   ` Drokin, Oleg
2016-03-05  2:55     ` [lustre-devel] " Drokin, Oleg
2016-03-05  2:09 ` [PATCH 08/10] staging: lustre: change test to asser in LNetGetId James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 09/10] staging: lustre: rename proc_call_handler to lprocfs_call_handler James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05  2:09 ` [PATCH 10/10] staging: lustre: make LNet use lprocfs_call_handler James Simmons
2016-03-05  2:09   ` [lustre-devel] " James Simmons
2016-03-05 19:52 ` [PATCH 00/10] Last batch of fixes for LNet Greg Kroah-Hartman
2016-03-05 19:52   ` [lustre-devel] " Greg Kroah-Hartman
2016-03-07 19:34   ` Simmons, James A.
2016-03-07 19:34     ` Simmons, James A.
