* [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes
@ 2018-07-30  3:49 NeilBrown
  2018-07-30  3:49 ` [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer() NeilBrown
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

There is no real pattern here, just minor tidy-ups and minor bug fixes.

Thanks,
NeilBrown


---

NeilBrown (7):
      lustre: use schedule_timeout_$state().
      lustre/libcfs: fix freeing after kmalloc failure.
      lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init
      lustre: give different tcd_lock types different classes.
      lustre/libcfs: discard cfs_trace_allocate_string_buffer()
      lustre: lnet: convert ni_refs to percpu_refcount.
      lustre: change TASK_NOLOAD to TASK_IDLE.


 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   11 +---
 .../staging/lustre/include/linux/lnet/lib-types.h  |    2 -
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |   12 ++---
 .../staging/lustre/lnet/klnds/socklnd/socklnd.c    |    6 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd_cb.c |   14 ++----
 drivers/staging/lustre/lnet/libcfs/fail.c          |    3 -
 drivers/staging/lustre/lnet/libcfs/hash.c          |    2 -
 drivers/staging/lustre/lnet/libcfs/module.c        |   14 +++---
 drivers/staging/lustre/lnet/libcfs/tracefile.c     |   49 +++++++++++---------
 drivers/staging/lustre/lnet/libcfs/tracefile.h     |    1 
 drivers/staging/lustre/lnet/lnet/acceptor.c        |    3 -
 drivers/staging/lustre/lnet/lnet/api-ni.c          |   40 +++++++---------
 drivers/staging/lustre/lnet/lnet/config.c          |   13 +++--
 drivers/staging/lustre/lnet/lnet/lib-eq.c          |    4 +-
 drivers/staging/lustre/lnet/lnet/peer.c            |    3 -
 drivers/staging/lustre/lnet/lnet/router.c          |   19 ++------
 drivers/staging/lustre/lnet/lnet/router_proc.c     |    2 -
 drivers/staging/lustre/lnet/selftest/conrpc.c      |    3 -
 drivers/staging/lustre/lnet/selftest/rpc.c         |    3 -
 drivers/staging/lustre/lnet/selftest/selftest.h    |    3 -
 drivers/staging/lustre/lustre/include/lustre_mdc.h |    2 -
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |    3 -
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    8 +--
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    6 +-
 .../staging/lustre/lustre/obdecho/echo_client.c    |    3 -
 drivers/staging/lustre/lustre/ptlrpc/client.c      |    4 --
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |    3 -
 27 files changed, 101 insertions(+), 135 deletions(-)



* [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state().
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
  2018-07-30  3:49 ` [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer() NeilBrown
  2018-07-30  3:49 ` [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init NeilBrown
@ 2018-07-30  3:49 ` NeilBrown
  2018-07-30 21:30   ` Andreas Dilger
  2018-08-02  3:45   ` James Simmons
  2018-07-30  3:49 ` [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes NeilBrown
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

Lustre has many calls to
   set_current_state(STATE);
   schedule_timeout(time);


These can more easily be done as
    schedule_timeout_STATE(time);

Also clean up some oddities, such as setting the state
to TASK_RUNNING after the timeout, and simplify
some time calculations.

Some schedule_timeout() calls remain as the state was set earlier,
before an 'add_wait_queue()'.  It would be incorrect to convert these.
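
As a rough illustration of the conversion, here is a user-space mock that
models the task state as a plain variable. All mock_ and MOCK_ names are
hypothetical stand-ins, not kernel APIs. The mock assumes only that the
sleep helper returns with the task back in the running state, which is
why resetting the state to TASK_RUNNING afterwards is redundant.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins for kernel task states. */
enum mock_state { MOCK_RUNNING, MOCK_INTERRUPTIBLE, MOCK_UNINTERRUPTIBLE };

static enum mock_state mock_current = MOCK_RUNNING;  /* models current->state */
static enum mock_state mock_slept_in = MOCK_RUNNING; /* state seen by last sleep */

static void mock_set_current_state(enum mock_state s)
{
	mock_current = s;
}

/* Models schedule_timeout(): sleep in the state already set, return RUNNING. */
static long mock_schedule_timeout(long timeout)
{
	assert(timeout >= 0);
	mock_slept_in = mock_current;
	mock_current = MOCK_RUNNING;
	return 0; /* pretend the full timeout elapsed */
}

/* Models schedule_timeout_uninterruptible(): set the state, then sleep. */
static long mock_schedule_timeout_uninterruptible(long timeout)
{
	mock_set_current_state(MOCK_UNINTERRUPTIBLE);
	return mock_schedule_timeout(timeout);
}

/* Old open-coded pattern: two calls at every site. */
static enum mock_state old_style_sleep(long timeout)
{
	mock_set_current_state(MOCK_UNINTERRUPTIBLE);
	mock_schedule_timeout(timeout);
	return mock_slept_in;
}

/* New pattern: one call, same sleeping state. */
static enum mock_state new_style_sleep(long timeout)
{
	mock_schedule_timeout_uninterruptible(timeout);
	return mock_slept_in;
}
```

Both helpers sleep in the same state and leave the mock task running
afterwards, so the one-call form loses nothing in this model.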

Signed-off-by: NeilBrown <neilb@suse.com>
---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |   12 ++++--------
 .../staging/lustre/lnet/klnds/socklnd/socklnd.c    |    6 ++----
 .../staging/lustre/lnet/klnds/socklnd/socklnd_cb.c |   14 ++++++--------
 drivers/staging/lustre/lnet/libcfs/fail.c          |    3 +--
 drivers/staging/lustre/lnet/libcfs/tracefile.c     |    3 +--
 drivers/staging/lustre/lnet/lnet/acceptor.c        |    3 +--
 drivers/staging/lustre/lnet/lnet/api-ni.c          |    6 ++----
 drivers/staging/lustre/lnet/lnet/peer.c            |    3 +--
 drivers/staging/lustre/lnet/lnet/router.c          |   19 +++++--------------
 drivers/staging/lustre/lnet/selftest/conrpc.c      |    3 +--
 drivers/staging/lustre/lnet/selftest/rpc.c         |    3 +--
 drivers/staging/lustre/lnet/selftest/selftest.h    |    3 +--
 drivers/staging/lustre/lustre/include/lustre_mdc.h |    2 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |    3 +--
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    8 +++-----
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    6 ++----
 .../staging/lustre/lustre/obdecho/echo_client.c    |    3 +--
 drivers/staging/lustre/lustre/ptlrpc/client.c      |    4 +---
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |    3 +--
 19 files changed, 36 insertions(+), 71 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index e15ad94151bd..f496e6fcc416 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -1201,8 +1201,7 @@ static struct kib_hca_dev *kiblnd_current_hdev(struct kib_dev *dev)
 		if (!(i++ % 50))
 			CDEBUG(D_NET, "%s: Wait for failover\n",
 			       dev->ibd_ifname);
-		set_current_state(TASK_INTERRUPTIBLE);
-		schedule_timeout(HZ / 100);
+		schedule_timeout_interruptible(HZ / 100);
 
 		read_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
 	}
@@ -1916,8 +1915,7 @@ struct list_head *kiblnd_pool_alloc_node(struct kib_poolset *ps)
 		CDEBUG(D_NET, "Another thread is allocating new %s pool, waiting %d HZs for her to complete. trips = %d\n",
 		       ps->ps_name, interval, trips);
 
-		set_current_state(TASK_INTERRUPTIBLE);
-		schedule_timeout(interval);
+		schedule_timeout_interruptible(interval);
 		if (interval < HZ)
 			interval *= 2;
 
@@ -2548,8 +2546,7 @@ static void kiblnd_base_shutdown(void)
 			CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET,
 			       "Waiting for %d threads to terminate\n",
 			       atomic_read(&kiblnd_data.kib_nthreads));
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(HZ);
+			schedule_timeout_uninterruptible(HZ);
 		}
 
 		/* fall through */
@@ -2599,8 +2596,7 @@ static void kiblnd_shutdown(struct lnet_ni *ni)
 			       "%s: waiting for %d peers to disconnect\n",
 			       libcfs_nid2str(ni->ni_nid),
 			       atomic_read(&net->ibn_npeers));
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(HZ);
+			schedule_timeout_uninterruptible(HZ);
 		}
 
 		kiblnd_net_fini_pools(net);
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index f0b0480686dc..4dde158451ea 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -2307,8 +2307,7 @@ ksocknal_base_shutdown(void)
 			       "waiting for %d threads to terminate\n",
 				ksocknal_data.ksnd_nthreads);
 			read_unlock(&ksocknal_data.ksnd_global_lock);
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(HZ);
+			schedule_timeout_uninterruptible(HZ);
 			read_lock(&ksocknal_data.ksnd_global_lock);
 		}
 		read_unlock(&ksocknal_data.ksnd_global_lock);
@@ -2533,8 +2532,7 @@ ksocknal_shutdown(struct lnet_ni *ni)
 		CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET, /* power of 2? */
 		       "waiting for %d peers to disconnect\n",
 		       net->ksnn_npeers);
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ);
+		schedule_timeout_uninterruptible(HZ);
 
 		ksocknal_debug_peerhash(ni);
 
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index a5c0e8a9bc40..32b76727f400 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -188,10 +188,9 @@ ksocknal_transmit(struct ksock_conn *conn, struct ksock_tx *tx)
 	int rc;
 	int bufnob;
 
-	if (ksocknal_data.ksnd_stall_tx) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(ksocknal_data.ksnd_stall_tx * HZ);
-	}
+	if (ksocknal_data.ksnd_stall_tx)
+		schedule_timeout_uninterruptible(
+			ksocknal_data.ksnd_stall_tx * HZ);
 
 	LASSERT(tx->tx_resid);
 
@@ -293,10 +292,9 @@ ksocknal_receive(struct ksock_conn *conn)
 	 */
 	int rc;
 
-	if (ksocknal_data.ksnd_stall_rx) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(ksocknal_data.ksnd_stall_rx * HZ);
-	}
+	if (ksocknal_data.ksnd_stall_rx)
+		schedule_timeout_uninterruptible(
+			ksocknal_data.ksnd_stall_rx * HZ);
 
 	rc = ksocknal_connsock_addref(conn);
 	if (rc) {
diff --git a/drivers/staging/lustre/lnet/libcfs/fail.c b/drivers/staging/lustre/lnet/libcfs/fail.c
index bd86b3b5bc34..6ee4de2178ce 100644
--- a/drivers/staging/lustre/lnet/libcfs/fail.c
+++ b/drivers/staging/lustre/lnet/libcfs/fail.c
@@ -137,8 +137,7 @@ int __cfs_fail_timeout_set(u32 id, u32 value, int ms, int set)
 	if (ret && likely(ms > 0)) {
 		CERROR("cfs_fail_timeout id %x sleeping for %dms\n",
 		       id, ms);
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(ms * HZ / 1000);
+		schedule_timeout_uninterruptible(ms * HZ / 1000);
 		CERROR("cfs_fail_timeout id %x awake\n", id);
 	}
 	return ret;
diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
index a4768e930021..d4c80cf254e4 100644
--- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
+++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
@@ -1208,8 +1208,7 @@ static int tracefiled(void *arg)
 		}
 		init_waitqueue_entry(&__wait, current);
 		add_wait_queue(&tctl->tctl_waitq, &__wait);
-		set_current_state(TASK_INTERRUPTIBLE);
-		schedule_timeout(HZ);
+		schedule_timeout_interruptible(HZ);
 		remove_wait_queue(&tctl->tctl_waitq, &__wait);
 	}
 	complete(&tctl->tctl_stop);
diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index 5648f17eddc0..3ae3ca1311a1 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -362,8 +362,7 @@ lnet_acceptor(void *arg)
 		if (rc) {
 			if (rc != -EAGAIN) {
 				CWARN("Accept error %d: pausing...\n", rc);
-				set_current_state(TASK_UNINTERRUPTIBLE);
-				schedule_timeout(HZ);
+				schedule_timeout_uninterruptible(HZ);
 			}
 			continue;
 		}
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 14b797802a85..cdbbe9cc8d95 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -955,8 +955,7 @@ lnet_ping_md_unlink(struct lnet_ping_info *pinfo,
 	/* NB md could be busy; this just starts the unlink */
 	while (pinfo->pi_features != LNET_PING_FEAT_INVAL) {
 		CDEBUG(D_NET, "Still waiting for ping MD to unlink\n");
-		set_current_state(TASK_NOLOAD);
-		schedule_timeout(HZ);
+		schedule_timeout_idle(HZ);
 	}
 }
 
@@ -1093,8 +1092,7 @@ lnet_clear_zombies_nis_locked(void)
 				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
 				       libcfs_nid2str(ni->ni_nid));
 			}
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(HZ);
+			schedule_timeout_uninterruptible(HZ);
 			lnet_net_lock(LNET_LOCK_EX);
 			continue;
 		}
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index 7c303ef6bb34..d9452c322e4d 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -136,8 +136,7 @@ lnet_peer_table_deathrow_wait_locked(struct lnet_peer_table *ptable,
 			       "Waiting for %d zombies on peer table\n",
 			       ptable->pt_zombies);
 		}
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ >> 1);
+		schedule_timeout_uninterruptible(HZ >> 1);
 		lnet_net_lock(cpt_locked);
 	}
 }
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 53373372b526..02241fbc9eaa 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -783,8 +783,7 @@ lnet_wait_known_routerstate(void)
 		if (all_known)
 			return;
 
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ);
+		schedule_timeout_uninterruptible(HZ);
 	}
 }
 
@@ -1159,8 +1158,7 @@ lnet_prune_rc_data(int wait_unlink)
 		i++;
 		CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET,
 		       "Waiting for rc buffers to unlink\n");
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ / 4);
+		schedule_timeout_uninterruptible(HZ / 4);
 
 		lnet_net_lock(LNET_LOCK_EX);
 	}
@@ -1236,23 +1234,16 @@ lnet_router_checker(void *arg)
 
 		lnet_prune_rc_data(0); /* don't wait for UNLINK */
 
-		/*
-		 * Call schedule_timeout() here always adds 1 to load average
-		 * because kernel counts # active tasks as nr_running
-		 * + nr_uninterruptible.
-		 */
 		/*
 		 * if there are any routes then wakeup every second.  If
 		 * there are no routes then sleep indefinitely until woken
 		 * up by a user adding a route
 		 */
 		if (!lnet_router_checker_active())
-			wait_event_interruptible(the_lnet.ln_rc_waitq,
-						 lnet_router_checker_active());
+			wait_event_idle(the_lnet.ln_rc_waitq,
+					lnet_router_checker_active());
 		else
-			wait_event_interruptible_timeout(the_lnet.ln_rc_waitq,
-							 false,
-							 HZ);
+			schedule_timeout_idle(HZ);
 	}
 
 	lnet_prune_rc_data(1); /* wait for UNLINK */
diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c b/drivers/staging/lustre/lnet/selftest/conrpc.c
index e73b956d15e4..7809c1fc6f73 100644
--- a/drivers/staging/lustre/lnet/selftest/conrpc.c
+++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
@@ -1345,8 +1345,7 @@ lstcon_rpc_cleanup_wait(void)
 		mutex_unlock(&console_session.ses_mutex);
 
 		CWARN("Session is shutting down, waiting for termination of transactions\n");
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ);
+		schedule_timeout_uninterruptible(HZ);
 
 		mutex_lock(&console_session.ses_mutex);
 	}
diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index e097ef8414a6..298de41444b3 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -1603,8 +1603,7 @@ srpc_startup(void)
 	spin_lock_init(&srpc_data.rpc_glock);
 
 	/* 1 second pause to avoid timestamp reuse */
-	set_current_state(TASK_UNINTERRUPTIBLE);
-	schedule_timeout(HZ);
+	schedule_timeout_uninterruptible(HZ);
 	srpc_data.rpc_matchbits = ((__u64)ktime_get_real_seconds()) << 48;
 
 	srpc_data.rpc_state = SRPC_STATE_NONE;
diff --git a/drivers/staging/lustre/lnet/selftest/selftest.h b/drivers/staging/lustre/lnet/selftest/selftest.h
index ad9be095c4ea..9dbb0a51d430 100644
--- a/drivers/staging/lustre/lnet/selftest/selftest.h
+++ b/drivers/staging/lustre/lnet/selftest/selftest.h
@@ -573,8 +573,7 @@ swi_state2str(int state)
 
 #define selftest_wait_events()					\
 	do {							\
-		set_current_state(TASK_UNINTERRUPTIBLE);	\
-		schedule_timeout(HZ / 10);	\
+		schedule_timeout_uninterruptible(HZ / 10);	\
 	} while (0)
 
 #define lst_wait_until(cond, lock, fmt, ...)				\
diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h b/drivers/staging/lustre/lustre/include/lustre_mdc.h
index a9c9992a2502..6ac7fc4fa8c6 100644
--- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
+++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
@@ -124,7 +124,7 @@ static inline void mdc_get_rpc_lock(struct mdc_rpc_lock *lck,
 	 */
 	while (unlikely(lck->rpcl_it == MDC_FAKE_RPCL_IT)) {
 		mutex_unlock(&lck->rpcl_mutex);
-		schedule_timeout(HZ / 4);
+		schedule_timeout_uninterruptible(HZ / 4);
 		goto again;
 	}
 
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
index 5b125fdc7321..0ee4798f1bb9 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
@@ -167,8 +167,7 @@ static void ldlm_handle_cp_callback(struct ptlrpc_request *req,
 		int to = HZ;
 
 		while (to > 0) {
-			set_current_state(TASK_INTERRUPTIBLE);
-			schedule_timeout(to);
+			schedule_timeout_interruptible(to);
 			if (lock->l_granted_mode == lock->l_req_mode ||
 			    ldlm_is_destroyed(lock))
 				break;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index f06cbd8b6d13..33d73fa8e9d5 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -749,11 +749,9 @@ static void cleanup_resource(struct ldlm_resource *res, struct list_head *q,
 			 */
 			unlock_res(res);
 			LDLM_DEBUG(lock, "setting FL_LOCAL_ONLY");
-			if (lock->l_flags & LDLM_FL_FAIL_LOC) {
-				set_current_state(TASK_UNINTERRUPTIBLE);
-				schedule_timeout(4 * HZ);
-				set_current_state(TASK_RUNNING);
-			}
+			if (lock->l_flags & LDLM_FL_FAIL_LOC)
+				schedule_timeout_uninterruptible(4 * HZ);
+
 			if (lock->l_completion_ast)
 				lock->l_completion_ast(lock, LDLM_FL_FAILED,
 						       NULL);
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 5c8d0fe7217e..3dedc61d2257 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -706,10 +706,8 @@ void ll_kill_super(struct super_block *sb)
 		sbi->ll_umounting = 1;
 
 		/* wait running statahead threads to quit */
-		while (atomic_read(&sbi->ll_sa_running) > 0) {
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC >> 3));
-		}
+		while (atomic_read(&sbi->ll_sa_running) > 0)
+			schedule_timeout_uninterruptible(HZ >> 3);
 	}
 }
 
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index 3022706c6985..1ddb4a6dd8f3 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -751,8 +751,7 @@ static struct lu_device *echo_device_free(const struct lu_env *env,
 	while (!list_empty(&ec->ec_objects)) {
 		spin_unlock(&ec->ec_lock);
 		CERROR("echo_client still has objects at cleanup time, wait for 1 second\n");
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ);
+		schedule_timeout_uninterruptible(HZ);
 		lu_site_purge(env, ed->ed_site, -1);
 		spin_lock(&ec->ec_lock);
 	}
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 7a3d83c0e50b..91dd09867260 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -761,9 +761,7 @@ int ptlrpc_request_bufs_pack(struct ptlrpc_request *request,
 			/* The RPC is infected, let the test change the
 			 * fail_loc
 			 */
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			schedule_timeout(2 * HZ);
-			set_current_state(TASK_RUNNING);
+			schedule_timeout_uninterruptible(2 * HZ);
 		}
 	}
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
index 9b60292370a7..9c598710b576 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
@@ -514,8 +514,7 @@ static int sptlrpc_req_replace_dead_ctx(struct ptlrpc_request *req)
 		       "ctx (%p, fl %lx) doesn't switch, relax a little bit\n",
 		       newctx, newctx->cc_flags);
 
-		set_current_state(TASK_INTERRUPTIBLE);
-		schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC));
+		schedule_timeout_interruptible(HZ);
 	} else if (unlikely(!test_bit(PTLRPC_CTX_UPTODATE_BIT, &newctx->cc_flags))) {
 		/*
 		 * new ctx not up to date yet


* [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure.
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
                   ` (3 preceding siblings ...)
  2018-07-30  3:49 ` [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes NeilBrown
@ 2018-07-30  3:49 ` NeilBrown
  2018-07-30 21:31   ` Andreas Dilger
  2018-08-02  3:46   ` James Simmons
  2018-07-30  3:49 ` [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE NeilBrown
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

The new_bkts array is *not* zeroed (any more), so when
freeing recently allocated buckets on failure, we
must not free beyond the last bucket successfully
allocated.
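
A minimal user-space sketch of this kind of partial-failure cleanup
follows. The helper names and the fail_at parameter (which simulates an
allocation failure at a given index) are hypothetical and only loosely
modelled on the hash code; malloc()/calloc() stand in for the kernel
allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Free buckets in [from, to) only; slots outside that range are never read. */
static void buckets_free_range(void **bkts, size_t from, size_t to)
{
	for (size_t i = from; i < to; i++)
		free(bkts[i]);
}

/*
 * Hypothetical grow helper.  The array itself comes from malloc(), so slots
 * that have not been assigned yet hold indeterminate values.  fail_at
 * simulates an allocation failure at that index (pass new_size for none).
 * Returns the new array, or NULL after undoing the partial work.
 */
static void **buckets_grow(void **old, size_t old_size, size_t new_size,
			   size_t bkt_size, size_t fail_at)
{
	void **new_bkts;

	assert(new_size >= old_size && new_size > 0);
	new_bkts = malloc(new_size * sizeof(*new_bkts));
	if (!new_bkts)
		return NULL;

	for (size_t i = 0; i < old_size; i++)
		new_bkts[i] = old[i];	/* existing buckets are carried over */

	for (size_t i = old_size; i < new_size; i++) {
		new_bkts[i] = (i == fail_at) ? NULL : calloc(1, bkt_size);
		if (!new_bkts[i]) {
			/* Stop at i: later slots were never initialised. */
			buckets_free_range(new_bkts, old_size, i);
			free(new_bkts);
			return NULL;
		}
	}
	return new_bkts;
}
```

Passing new_size instead of i as the upper bound of the cleanup would
hand uninitialised pointers to free(), which is the analogue of the
problem described above.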

Fixes: 12e46c461cb9 ("staging: lustre: change some LIBCFS_ALLOC calls to k?alloc(GFP_KERNEL)")
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/staging/lustre/lnet/libcfs/hash.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/hash.c b/drivers/staging/lustre/lnet/libcfs/hash.c
index 48be66f0d654..f452c4540ca1 100644
--- a/drivers/staging/lustre/lnet/libcfs/hash.c
+++ b/drivers/staging/lustre/lnet/libcfs/hash.c
@@ -904,7 +904,7 @@ cfs_hash_buckets_realloc(struct cfs_hash *hs, struct cfs_hash_bucket **old_bkts,
 		new_bkts[i] = kzalloc(cfs_hash_bkt_size(hs), GFP_KERNEL);
 		if (!new_bkts[i]) {
 			cfs_hash_buckets_free(new_bkts, cfs_hash_bkt_size(hs),
-					      old_size, new_size);
+					      old_size, i);
 			return NULL;
 		}
 


* [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
  2018-07-30  3:49 ` [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer() NeilBrown
@ 2018-07-30  3:49 ` NeilBrown
  2018-07-30 21:32   ` Andreas Dilger
  2018-08-02  3:46   ` James Simmons
  2018-07-30  3:49 ` [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state() NeilBrown
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

Large memory allocations should be avoided at module-init,
but registering services is appropriate.
So move the registration of debugfs files
back into libcfs_init().
Without this, /sys/kernel/debug/lnet etc. are not visible
as soon as libcfs is loaded.
No debugfs file access needs anything allocated by libcfs_setup().

Fixes: 64bf0b1a079d ("staging: lustre: refactor libcfs initialization.")
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/staging/lustre/lnet/libcfs/module.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
index bfadfcfa3c44..5d2be941777e 100644
--- a/drivers/staging/lustre/lnet/libcfs/module.c
+++ b/drivers/staging/lustre/lnet/libcfs/module.c
@@ -719,10 +719,6 @@ int libcfs_setup(void)
 		goto err;
 	}
 
-	lnet_insert_debugfs(lnet_table);
-	if (!IS_ERR_OR_NULL(lnet_debugfs_root))
-		lnet_insert_debugfs_links(lnet_debugfs_symlinks);
-
 	CDEBUG(D_OTHER, "portals setup OK\n");
 out:
 	libcfs_active = 1;
@@ -743,6 +739,10 @@ static int libcfs_init(void)
 {
 	int rc;
 
+	lnet_insert_debugfs(lnet_table);
+	if (!IS_ERR_OR_NULL(lnet_debugfs_root))
+		lnet_insert_debugfs_links(lnet_debugfs_symlinks);
+
 	rc = misc_register(&libcfs_dev);
 	if (rc)
 		CERROR("misc_register: error %d\n", rc);


* [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes.
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
                   ` (2 preceding siblings ...)
  2018-07-30  3:49 ` [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state() NeilBrown
@ 2018-07-30  3:49 ` NeilBrown
  2018-07-30 21:32   ` Andreas Dilger
  2018-08-02  3:46   ` James Simmons
  2018-07-30  3:49 ` [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure NeilBrown
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

There are three different trace contexts:
 process, softirq, irq.
Each has its own lock (tcd_lock) which is locked
as appropriate for that context.
lockdep currently doesn't see that they are different
and so deduces that the different uses might lead to
deadlocks.
So use separate calls to spin_lock_init() so that they
each get a separate lock class, and lockdep sees no
problem.
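
The reason separate calls matter is that, with lockdep enabled,
spin_lock_init() is a macro that declares a static key at each call
site, and the key determines the lock class. The user-space mock below
sketches that mechanism; the mock_ names and TYPE_ constants are
hypothetical, and a "class" is represented purely by the address of the
per-call-site key.

```c
/* Hypothetical stand-ins: a "class key" identified purely by its address. */
struct mock_class_key { char unused; };
struct mock_lock { const struct mock_class_key *key; };

/*
 * Each expansion of this macro gets its own static key, so the key
 * (and hence the class) depends on the call site, not on the lock.
 */
#define mock_lock_init(lock)					\
	do {							\
		static struct mock_class_key key_;		\
		(lock)->key = &key_;				\
	} while (0)

enum { TYPE_PROC, TYPE_SOFTIRQ, TYPE_IRQ, TYPE_MAX };

/* One call site for every type: all locks share a single class. */
static void init_shared(struct mock_lock locks[TYPE_MAX])
{
	for (int i = 0; i < TYPE_MAX; i++)
		mock_lock_init(&locks[i]);
}

/* One call site per type: each type gets a class of its own. */
static void init_separate(struct mock_lock locks[TYPE_MAX])
{
	for (int i = 0; i < TYPE_MAX; i++) {
		switch (i) {
		case TYPE_PROC:
			mock_lock_init(&locks[i]);
			break;
		case TYPE_SOFTIRQ:
			mock_lock_init(&locks[i]);
			break;
		case TYPE_IRQ:
			mock_lock_init(&locks[i]);
			break;
		}
	}
}
```

In this model init_shared() leaves all three locks with the same key,
while init_separate() gives each lock a distinct key even though the
three case bodies look identical.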

Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/staging/lustre/lnet/libcfs/tracefile.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
index d4c80cf254e4..40048165fc16 100644
--- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
+++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
@@ -1285,7 +1285,23 @@ int cfs_tracefile_init(int max_pages)
 	cfs_tcd_for_each(tcd, i, j) {
 		int factor = pages_factor[i];
 
-		spin_lock_init(&tcd->tcd_lock);
+		/* Note that we have three separate calls so
+		 * that the locks get three separate classes
+		 * and lockdep never thinks they are related.
+		 * As they are used in different interrupt
+		 * contexts, lockdep would think the usage conflicts.
+		 */
+		switch (i) {
+		case CFS_TCD_TYPE_PROC:
+			spin_lock_init(&tcd->tcd_lock);
+			break;
+		case CFS_TCD_TYPE_SOFTIRQ:
+			spin_lock_init(&tcd->tcd_lock);
+			break;
+		case CFS_TCD_TYPE_IRQ:
+			spin_lock_init(&tcd->tcd_lock);
+			break;
+		}
 		tcd->tcd_pages_factor = factor;
 		tcd->tcd_type = i;
 		tcd->tcd_cpu = j;


* [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer()
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
@ 2018-07-30  3:49 ` NeilBrown
  2018-07-30 21:37   ` Andreas Dilger
  2018-08-02  3:47   ` James Simmons
  2018-07-30  3:49 ` [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init NeilBrown
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

cfs_trace_allocate_string_buffer() is a simple wrapper
around kzalloc() that adds little value.  The code is
clearer if we perform the test and the allocation
directly where needed.

Also change the test from '>' to '>=' to ensure we
never try to allocate more than 2 pages, as that seems
to be the intent.
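
A user-space sketch of the open-coded check, under stated assumptions:
MOCK_PAGE_SIZE is an assumed 4096-byte page, alloc_string_buffer() is a
hypothetical helper written only for this illustration, and calloc()
stands in for kzalloc(). The explicit negative-length test is part of
the sketch, not something taken from the patch.

```c
#include <errno.h>
#include <stdlib.h>

#define MOCK_PAGE_SIZE 4096L	/* assumed page size for this sketch */

/*
 * Reject any length for which len + 1 bytes (string plus terminating NUL)
 * would exceed two pages.  On success *out is a zeroed buffer of
 * len + 1 bytes; on failure *out is left untouched.
 */
static int alloc_string_buffer(char **out, long len)
{
	if (len < 0 || len >= 2 * MOCK_PAGE_SIZE)
		return -EINVAL;

	*out = calloc(1, (size_t)len + 1);
	if (!*out)
		return -ENOMEM;

	return 0;
}
```

With this check the longest accepted string is 2 * MOCK_PAGE_SIZE - 1
bytes, so the buffer including the NUL is exactly two pages and never
more.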

Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/staging/lustre/lnet/libcfs/module.c    |    6 +++--
 drivers/staging/lustre/lnet/libcfs/tracefile.c |   28 +++++++++---------------
 drivers/staging/lustre/lnet/libcfs/tracefile.h |    1 -
 3 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
index 5d2be941777e..1de83b1997c6 100644
--- a/drivers/staging/lustre/lnet/libcfs/module.c
+++ b/drivers/staging/lustre/lnet/libcfs/module.c
@@ -305,9 +305,9 @@ static int proc_dobitmasks(struct ctl_table *table, int write,
 	int is_subsys = (mask == &libcfs_subsystem_debug) ? 1 : 0;
 	int is_printk = (mask == &libcfs_printk) ? 1 : 0;
 
-	rc = cfs_trace_allocate_string_buffer(&tmpstr, tmpstrlen);
-	if (rc < 0)
-		return rc;
+	tmpstr = kzalloc(tmpstrlen, GFP_KERNEL);
+	if (!tmpstr)
+		return -ENOMEM;
 
 	if (!write) {
 		libcfs_debug_mask2str(tmpstr, tmpstrlen, *mask, is_subsys);
diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
index 40048165fc16..b273107b3815 100644
--- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
+++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
@@ -963,26 +963,16 @@ int cfs_trace_copyout_string(char __user *usr_buffer, int usr_buffer_nob,
 }
 EXPORT_SYMBOL(cfs_trace_copyout_string);
 
-int cfs_trace_allocate_string_buffer(char **str, int nob)
-{
-	if (nob > 2 * PAGE_SIZE)	    /* string must be "sensible" */
-		return -EINVAL;
-
-	*str = kmalloc(nob, GFP_KERNEL | __GFP_ZERO);
-	if (!*str)
-		return -ENOMEM;
-
-	return 0;
-}
-
 int cfs_trace_dump_debug_buffer_usrstr(void __user *usr_str, int usr_str_nob)
 {
 	char *str;
 	int rc;
 
-	rc = cfs_trace_allocate_string_buffer(&str, usr_str_nob + 1);
-	if (rc)
-		return rc;
+	if (usr_str_nob >= 2 * PAGE_SIZE)
+		return -EINVAL;
+	str = kzalloc(usr_str_nob + 1, GFP_KERNEL);
+	if (!str)
+		return -ENOMEM;
 
 	rc = cfs_trace_copyin_string(str, usr_str_nob + 1,
 				     usr_str, usr_str_nob);
@@ -1044,9 +1034,11 @@ int cfs_trace_daemon_command_usrstr(void __user *usr_str, int usr_str_nob)
 	char *str;
 	int rc;
 
-	rc = cfs_trace_allocate_string_buffer(&str, usr_str_nob + 1);
-	if (rc)
-		return rc;
+	if (usr_str_nob >= 2 * PAGE_SIZE)
+		return -EINVAL;
+	str = kzalloc(usr_str_nob + 1, GFP_KERNEL);
+	if (!str)
+		return -ENOMEM;
 
 	rc = cfs_trace_copyin_string(str, usr_str_nob + 1,
 				     usr_str, usr_str_nob);
diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.h b/drivers/staging/lustre/lnet/libcfs/tracefile.h
index 82f090fd8dfa..2134549bb3d7 100644
--- a/drivers/staging/lustre/lnet/libcfs/tracefile.h
+++ b/drivers/staging/lustre/lnet/libcfs/tracefile.h
@@ -63,7 +63,6 @@ int cfs_trace_copyin_string(char *knl_buffer, int knl_buffer_nob,
 			    const char __user *usr_buffer, int usr_buffer_nob);
 int cfs_trace_copyout_string(char __user *usr_buffer, int usr_buffer_nob,
 			     const char *knl_str, char *append);
-int cfs_trace_allocate_string_buffer(char **str, int nob);
 int cfs_trace_dump_debug_buffer_usrstr(void __user *usr_str, int usr_str_nob);
 int cfs_trace_daemon_command(char *str);
 int cfs_trace_daemon_command_usrstr(void __user *usr_str, int usr_str_nob);


* [lustre-devel] [PATCH 6/7] lustre: lnet: convert ni_refs to percpu_refcount.
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
                   ` (5 preceding siblings ...)
  2018-07-30  3:49 ` [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE NeilBrown
@ 2018-07-30  3:49 ` NeilBrown
  2018-08-02  3:47   ` James Simmons
  2018-07-30 21:45 ` [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes Andreas Dilger
  7 siblings, 1 reply; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

ni_refs is a per-cpt refcount.
Linux already has a per-cpu refcount implementation
which doesn't require any locking.

So convert ni_refs to percpu_refcount.
As a bonus, we can get a wake-up when the refcount
reaches zero, rather than having to wait a full second.
The waiting in lnet_clear_zombies_nis_locked() is
modified so that instead of waiting one second each
time, and printing a warning on power-of-two seconds,
we wait an increasing power-of-two seconds and print
a warning if the wait ever timed out.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |   11 +-----
 .../staging/lustre/include/linux/lnet/lib-types.h  |    2 +
 drivers/staging/lustre/lnet/lnet/api-ni.c          |   34 +++++++++-----------
 drivers/staging/lustre/lnet/lnet/config.c          |   13 +++++---
 drivers/staging/lustre/lnet/lnet/router_proc.c     |    2 +
 5 files changed, 28 insertions(+), 34 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 0fecf0d32c58..371002825a7d 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -338,34 +338,27 @@ static inline void
 lnet_ni_addref_locked(struct lnet_ni *ni, int cpt)
 {
 	LASSERT(cpt >= 0 && cpt < LNET_CPT_NUMBER);
-	LASSERT(*ni->ni_refs[cpt] >= 0);
-
-	(*ni->ni_refs[cpt])++;
+	percpu_ref_get(&ni->ni_refs);
 }
 
 static inline void
 lnet_ni_addref(struct lnet_ni *ni)
 {
-	lnet_net_lock(0);
 	lnet_ni_addref_locked(ni, 0);
-	lnet_net_unlock(0);
 }
 
 static inline void
 lnet_ni_decref_locked(struct lnet_ni *ni, int cpt)
 {
 	LASSERT(cpt >= 0 && cpt < LNET_CPT_NUMBER);
-	LASSERT(*ni->ni_refs[cpt] > 0);
 
-	(*ni->ni_refs[cpt])--;
+	percpu_ref_put(&ni->ni_refs);
 }
 
 static inline void
 lnet_ni_decref(struct lnet_ni *ni)
 {
-	lnet_net_lock(0);
 	lnet_ni_decref_locked(ni, 0);
-	lnet_net_unlock(0);
 }
 
 void lnet_ni_free(struct lnet_ni *ni);
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 6d4106fd9039..7527fef90cac 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -269,7 +269,7 @@ struct lnet_ni {
 	void			 *ni_data;	/* instance-specific data */
 	struct lnet_lnd		 *ni_lnd;	/* procedural interface */
 	struct lnet_tx_queue	**ni_tx_queues;	/* percpt TX queues */
-	int			**ni_refs;	/* percpt reference count */
+	struct percpu_ref	  ni_refs;
 	time64_t		  ni_last_alive;/* when I was last alive */
 	struct lnet_ni_status	 *ni_status;	/* my health status */
 	/* per NI LND tunables */
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index cdbbe9cc8d95..fea03737439a 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1055,7 +1055,7 @@ lnet_ni_unlink_locked(struct lnet_ni *ni)
 	/* move it to zombie list and nobody can find it anymore */
 	LASSERT(!list_empty(&ni->ni_list));
 	list_move(&ni->ni_list, &the_lnet.ln_nis_zombie);
-	lnet_ni_decref_locked(ni, 0);	/* drop ln_nis' ref */
+	percpu_ref_kill_and_confirm(&ni->ni_refs, NULL);	/* drop ln_nis' ref */
 }
 
 static void
@@ -1069,34 +1069,32 @@ lnet_clear_zombies_nis_locked(void)
 	 * Now wait for the NI's I just nuked to show up on ln_zombie_nis
 	 * and shut them down in guaranteed thread context
 	 */
-	i = 2;
+	i = 1;
 	while (!list_empty(&the_lnet.ln_nis_zombie)) {
-		int *ref;
-		int j;
 
 		ni = list_entry(the_lnet.ln_nis_zombie.next,
 				struct lnet_ni, ni_list);
-		list_del_init(&ni->ni_list);
-		cfs_percpt_for_each(ref, j, ni->ni_refs) {
-			if (!*ref)
-				continue;
-			/* still busy, add it back to zombie list */
-			list_add(&ni->ni_list, &the_lnet.ln_nis_zombie);
-			break;
-		}
 
-		if (!list_empty(&ni->ni_list)) {
+		if (!percpu_ref_is_zero(&ni->ni_refs)) {
+			/* still busy, wait a while */
+
 			lnet_net_unlock(LNET_LOCK_EX);
 			++i;
-			if ((i & (-i)) == i) {
+
+			if (wait_var_event_timeout(
+				    &ni->ni_refs,
+				    percpu_ref_is_zero(&ni->ni_refs),
+				    HZ << i) == 0)
 				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
 				       libcfs_nid2str(ni->ni_nid));
-			}
+
 			schedule_timeout_uninterruptible(HZ);
 			lnet_net_lock(LNET_LOCK_EX);
 			continue;
 		}
 
+		list_del_init(&ni->ni_list);
+
 		ni->ni_lnd->lnd_refcount--;
 		lnet_net_unlock(LNET_LOCK_EX);
 
@@ -1114,7 +1112,7 @@ lnet_clear_zombies_nis_locked(void)
 			       libcfs_nid2str(ni->ni_nid));
 
 		lnet_ni_free(ni);
-		i = 2;
+		i = 1;
 
 		lnet_net_lock(LNET_LOCK_EX);
 	}
@@ -1305,8 +1303,8 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf)
 	LASSERT(ni->ni_peertimeout <= 0 || lnd->lnd_query);
 
 	lnet_net_lock(LNET_LOCK_EX);
-	/* refcount for ln_nis */
-	lnet_ni_addref_locked(ni, 0);
+	/* Initialise refcount for ln_nis to 1 */
+	percpu_ref_reinit(&ni->ni_refs);
 	list_add_tail(&ni->ni_list, &the_lnet.ln_nis);
 	if (ni->ni_cpts) {
 		lnet_ni_addref_locked(ni, 0);
diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
index 091c4f714e84..4145c7431576 100644
--- a/drivers/staging/lustre/lnet/lnet/config.c
+++ b/drivers/staging/lustre/lnet/lnet/config.c
@@ -96,8 +96,7 @@ lnet_ni_free(struct lnet_ni *ni)
 {
 	int i;
 
-	if (ni->ni_refs)
-		cfs_percpt_free(ni->ni_refs);
+	percpu_ref_exit(&ni->ni_refs);
 
 	if (ni->ni_tx_queues)
 		cfs_percpt_free(ni->ni_tx_queues);
@@ -117,6 +116,11 @@ lnet_ni_free(struct lnet_ni *ni)
 	kfree(ni);
 }
 
+static void ref_release(struct percpu_ref *ref)
+{
+	wake_up_var(ref);
+}
+
 struct lnet_ni *
 lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist)
 {
@@ -140,9 +144,8 @@ lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist)
 
 	spin_lock_init(&ni->ni_lock);
 	INIT_LIST_HEAD(&ni->ni_cptlist);
-	ni->ni_refs = cfs_percpt_alloc(lnet_cpt_table(),
-				       sizeof(*ni->ni_refs[0]));
-	if (!ni->ni_refs)
+	if (percpu_ref_init(&ni->ni_refs, ref_release,
+			    PERCPU_REF_INIT_DEAD, GFP_KERNEL) < 0)
 		goto failed;
 
 	ni->ni_tx_queues = cfs_percpt_alloc(lnet_cpt_table(),
diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
index d779445fefb5..8856798d263f 100644
--- a/drivers/staging/lustre/lnet/lnet/router_proc.c
+++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
@@ -703,7 +703,7 @@ static int proc_lnet_nis(struct ctl_table *table, int write,
 				s += snprintf(s, tmpstr + tmpsiz - s,
 					      "%-24s %6s %5lld %4d %4d %4d %5d %5d %5d\n",
 					      libcfs_nid2str(ni->ni_nid), stat,
-					      last_alive, *ni->ni_refs[i],
+					      last_alive, 0/* No per-cpt refcount */,
 					      ni->ni_peertxcredits,
 					      ni->ni_peerrtrcredits,
 					      tq->tq_credits_max,

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE.
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
                   ` (4 preceding siblings ...)
  2018-07-30  3:49 ` [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure NeilBrown
@ 2018-07-30  3:49 ` NeilBrown
  2018-07-30 21:43   ` Andreas Dilger
  2018-08-02  3:48   ` James Simmons
  2018-07-30  3:49 ` [lustre-devel] [PATCH 6/7] lustre: lnet: convert ni_refs to percpu_refcount NeilBrown
  2018-07-30 21:45 ` [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes Andreas Dilger
  7 siblings, 2 replies; 24+ messages in thread
From: NeilBrown @ 2018-07-30  3:49 UTC (permalink / raw)
  To: lustre-devel

TASK_NOLOAD is not a task state to be used by
itself; it should only be used together with
TASK_UNINTERRUPTIBLE, which is easily done
by using TASK_IDLE.

So convert to TASK_IDLE.
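
A small userspace sketch of the bit arithmetic may help here. The
constants below are copied from my reading of include/linux/sched.h at
around this kernel version and should be treated as illustrative. The
two helper functions are simplified models written for this note, not
kernel APIs: one models the usual wake-up filter (a waker passes a mask,
commonly TASK_NORMAL, and only tasks whose state intersects it are
woken), the other models load-average accounting.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; check include/linux/sched.h for your kernel. */
#define TASK_RUNNING		0x0000
#define TASK_INTERRUPTIBLE	0x0001
#define TASK_UNINTERRUPTIBLE	0x0002
#define TASK_NOLOAD		0x0400

#define TASK_IDLE	(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)
#define TASK_NORMAL	(TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* Simplified model: a wake-up only matches if the state intersects the mask. */
bool wakeable_by(unsigned int state, unsigned int wake_mask)
{
	return (state & wake_mask) != 0;
}

/* Simplified model: uninterruptible sleepers count unless flagged NOLOAD. */
bool contributes_to_load(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) && !(state & TASK_NOLOAD);
}
```

Under this model a task sleeping in bare TASK_NOLOAD carries no sleep
bit that a TASK_NORMAL wake-up can match, whereas TASK_IDLE is matched
by an ordinary wake-up and still stays out of the load average.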

Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/staging/lustre/lnet/lnet/lib-eq.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-eq.c b/drivers/staging/lustre/lnet/lnet/lib-eq.c
index 8347cc44e47d..f085388895ea 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-eq.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-eq.c
@@ -349,7 +349,7 @@ __must_hold(&the_lnet.ln_eq_wait_lock)
  * \param timeout Time in jiffies to wait for an event to occur on
  * one of the EQs. The constant MAX_SCHEDULE_TIMEOUT can be used to indicate an
  * infinite timeout.
- * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_NOLOAD
+ * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_IDLE
  * \param event,which On successful return (1 or -EOVERFLOW), \a event will
  * hold the next event in the EQs, and \a which will contain the index of the
  * EQ from which the event was taken.
@@ -406,7 +406,7 @@ LNetEQPoll(struct lnet_handle_eq *eventqs, int neq, signed long timeout,
 		 */
 		wait = lnet_eq_wait_locked(&timeout,
 					   interruptible ? TASK_INTERRUPTIBLE
-					   : TASK_NOLOAD);
+					   : TASK_IDLE);
 		if (wait < 0) /* no new event */
 			break;
 	}

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state().
  2018-07-30  3:49 ` [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state() NeilBrown
@ 2018-07-30 21:30   ` Andreas Dilger
  2018-08-02  3:45   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Dilger @ 2018-07-30 21:30 UTC (permalink / raw)
  To: lustre-devel

On Jul 29, 2018, at 21:49, NeilBrown <neilb@suse.com> wrote:
> 
> Lustre has many calls to
>   set_current_state(STATE);
>   schedule_timeout(time);
> 
> 
> These can more easily be done as
>    schedule_timeout_STATE(time);
> 
> Also clean up some oddities, such as setting the state
> to TASK_RUNNING after the timeout, and simplify
> some time calculations.
> 
> Some schedule_timeout() calls remain as the state was set earlier,
> before an 'add_wait_queue()'.  It would be incorrect to convert these.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
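
For anyone skimming the thread, a toy userspace model of the pattern
being converted, assuming the documented behaviour that
schedule_timeout() does not sleep when the task is still in
TASK_RUNNING and always returns with the task runnable again. The toy_*
names and the "slept jiffies" counter are invented for this sketch;
nothing here is kernel code.

```c
#include <assert.h>

enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE, TOY_UNINTERRUPTIBLE };

/* Stand-ins for current->state and for elapsed sleep time. */
enum toy_state toy_current_state = TOY_RUNNING;
long toy_slept_jiffies;

/*
 * Model of schedule_timeout(): sleeps only if the caller has already left
 * the running state, and always comes back in the running state.
 * Returns the jiffies left unslept (0 after a full sleep).
 */
long toy_schedule_timeout(long timeout)
{
	long remaining = timeout;

	if (toy_current_state != TOY_RUNNING) {
		toy_slept_jiffies += timeout;
		remaining = 0;
	}
	toy_current_state = TOY_RUNNING;
	return remaining;
}

/* Model of the helper: set the state, then sleep, in one call. */
long toy_schedule_timeout_uninterruptible(long timeout)
{
	toy_current_state = TOY_UNINTERRUPTIBLE;
	return toy_schedule_timeout(timeout);
}
```

In this model a bare toy_schedule_timeout() from the running state
sleeps for nothing, which is the situation the lustre_mdc.h hunk
addresses, and setting the state back to running afterwards is
redundant, which is why those lines can go.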

> ---
> .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |   12 ++++--------
> .../staging/lustre/lnet/klnds/socklnd/socklnd.c    |    6 ++----
> .../staging/lustre/lnet/klnds/socklnd/socklnd_cb.c |   14 ++++++--------
> drivers/staging/lustre/lnet/libcfs/fail.c          |    3 +--
> drivers/staging/lustre/lnet/libcfs/tracefile.c     |    3 +--
> drivers/staging/lustre/lnet/lnet/acceptor.c        |    3 +--
> drivers/staging/lustre/lnet/lnet/api-ni.c          |    6 ++----
> drivers/staging/lustre/lnet/lnet/peer.c            |    3 +--
> drivers/staging/lustre/lnet/lnet/router.c          |   19 +++++--------------
> drivers/staging/lustre/lnet/selftest/conrpc.c      |    3 +--
> drivers/staging/lustre/lnet/selftest/rpc.c         |    3 +--
> drivers/staging/lustre/lnet/selftest/selftest.h    |    3 +--
> drivers/staging/lustre/lustre/include/lustre_mdc.h |    2 +-
> drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |    3 +--
> drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    8 +++-----
> drivers/staging/lustre/lustre/llite/llite_lib.c    |    6 ++----
> .../staging/lustre/lustre/obdecho/echo_client.c    |    3 +--
> drivers/staging/lustre/lustre/ptlrpc/client.c      |    4 +---
> drivers/staging/lustre/lustre/ptlrpc/sec.c         |    3 +--
> 19 files changed, 36 insertions(+), 71 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> index e15ad94151bd..f496e6fcc416 100644
> --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> @@ -1201,8 +1201,7 @@ static struct kib_hca_dev *kiblnd_current_hdev(struct kib_dev *dev)
> 		if (!(i++ % 50))
> 			CDEBUG(D_NET, "%s: Wait for failover\n",
> 			       dev->ibd_ifname);
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(HZ / 100);
> +		schedule_timeout_interruptible(HZ / 100);
> 
> 		read_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
> 	}
> @@ -1916,8 +1915,7 @@ struct list_head *kiblnd_pool_alloc_node(struct kib_poolset *ps)
> 		CDEBUG(D_NET, "Another thread is allocating new %s pool, waiting %d HZs for her to complete. trips = %d\n",
> 		       ps->ps_name, interval, trips);
> 
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(interval);
> +		schedule_timeout_interruptible(interval);
> 		if (interval < HZ)
> 			interval *= 2;
> 
> @@ -2548,8 +2546,7 @@ static void kiblnd_base_shutdown(void)
> 			CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET,
> 			       "Waiting for %d threads to terminate\n",
> 			       atomic_read(&kiblnd_data.kib_nthreads));
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
> 		}
> 
> 		/* fall through */
> @@ -2599,8 +2596,7 @@ static void kiblnd_shutdown(struct lnet_ni *ni)
> 			       "%s: waiting for %d peers to disconnect\n",
> 			       libcfs_nid2str(ni->ni_nid),
> 			       atomic_read(&net->ibn_npeers));
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
> 		}
> 
> 		kiblnd_net_fini_pools(net);
> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> index f0b0480686dc..4dde158451ea 100644
> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> @@ -2307,8 +2307,7 @@ ksocknal_base_shutdown(void)
> 			       "waiting for %d threads to terminate\n",
> 				ksocknal_data.ksnd_nthreads);
> 			read_unlock(&ksocknal_data.ksnd_global_lock);
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
> 			read_lock(&ksocknal_data.ksnd_global_lock);
> 		}
> 		read_unlock(&ksocknal_data.ksnd_global_lock);
> @@ -2533,8 +2532,7 @@ ksocknal_shutdown(struct lnet_ni *ni)
> 		CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET, /* power of 2? */
> 		       "waiting for %d peers to disconnect\n",
> 		       net->ksnn_npeers);
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
> 
> 		ksocknal_debug_peerhash(ni);
> 
> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
> index a5c0e8a9bc40..32b76727f400 100644
> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
> @@ -188,10 +188,9 @@ ksocknal_transmit(struct ksock_conn *conn, struct ksock_tx *tx)
> 	int rc;
> 	int bufnob;
> 
> -	if (ksocknal_data.ksnd_stall_tx) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(ksocknal_data.ksnd_stall_tx * HZ);
> -	}
> +	if (ksocknal_data.ksnd_stall_tx)
> +		schedule_timeout_uninterruptible(
> +			ksocknal_data.ksnd_stall_tx * HZ);
> 
> 	LASSERT(tx->tx_resid);
> 
> @@ -293,10 +292,9 @@ ksocknal_receive(struct ksock_conn *conn)
> 	 */
> 	int rc;
> 
> -	if (ksocknal_data.ksnd_stall_rx) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(ksocknal_data.ksnd_stall_rx * HZ);
> -	}
> +	if (ksocknal_data.ksnd_stall_rx)
> +		schedule_timeout_uninterruptible(
> +			ksocknal_data.ksnd_stall_rx * HZ);
> 
> 	rc = ksocknal_connsock_addref(conn);
> 	if (rc) {
> diff --git a/drivers/staging/lustre/lnet/libcfs/fail.c b/drivers/staging/lustre/lnet/libcfs/fail.c
> index bd86b3b5bc34..6ee4de2178ce 100644
> --- a/drivers/staging/lustre/lnet/libcfs/fail.c
> +++ b/drivers/staging/lustre/lnet/libcfs/fail.c
> @@ -137,8 +137,7 @@ int __cfs_fail_timeout_set(u32 id, u32 value, int ms, int set)
> 	if (ret && likely(ms > 0)) {
> 		CERROR("cfs_fail_timeout id %x sleeping for %dms\n",
> 		       id, ms);
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(ms * HZ / 1000);
> +		schedule_timeout_uninterruptible(ms * HZ / 1000);
> 		CERROR("cfs_fail_timeout id %x awake\n", id);
> 	}
> 	return ret;
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> index a4768e930021..d4c80cf254e4 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> @@ -1208,8 +1208,7 @@ static int tracefiled(void *arg)
> 		}
> 		init_waitqueue_entry(&__wait, current);
> 		add_wait_queue(&tctl->tctl_waitq, &__wait);
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_interruptible(HZ);
> 		remove_wait_queue(&tctl->tctl_waitq, &__wait);
> 	}
> 	complete(&tctl->tctl_stop);
> diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
> index 5648f17eddc0..3ae3ca1311a1 100644
> --- a/drivers/staging/lustre/lnet/lnet/acceptor.c
> +++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
> @@ -362,8 +362,7 @@ lnet_acceptor(void *arg)
> 		if (rc) {
> 			if (rc != -EAGAIN) {
> 				CWARN("Accept error %d: pausing...\n", rc);
> -				set_current_state(TASK_UNINTERRUPTIBLE);
> -				schedule_timeout(HZ);
> +				schedule_timeout_uninterruptible(HZ);
> 			}
> 			continue;
> 		}
> diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
> index 14b797802a85..cdbbe9cc8d95 100644
> --- a/drivers/staging/lustre/lnet/lnet/api-ni.c
> +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
> @@ -955,8 +955,7 @@ lnet_ping_md_unlink(struct lnet_ping_info *pinfo,
> 	/* NB md could be busy; this just starts the unlink */
> 	while (pinfo->pi_features != LNET_PING_FEAT_INVAL) {
> 		CDEBUG(D_NET, "Still waiting for ping MD to unlink\n");
> -		set_current_state(TASK_NOLOAD);
> -		schedule_timeout(HZ);
> +		schedule_timeout_idle(HZ);
> 	}
> }
> 
> @@ -1093,8 +1092,7 @@ lnet_clear_zombies_nis_locked(void)
> 				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
> 				       libcfs_nid2str(ni->ni_nid));
> 			}
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
> 			lnet_net_lock(LNET_LOCK_EX);
> 			continue;
> 		}
> diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
> index 7c303ef6bb34..d9452c322e4d 100644
> --- a/drivers/staging/lustre/lnet/lnet/peer.c
> +++ b/drivers/staging/lustre/lnet/lnet/peer.c
> @@ -136,8 +136,7 @@ lnet_peer_table_deathrow_wait_locked(struct lnet_peer_table *ptable,
> 			       "Waiting for %d zombies on peer table\n",
> 			       ptable->pt_zombies);
> 		}
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ >> 1);
> +		schedule_timeout_uninterruptible(HZ >> 1);
> 		lnet_net_lock(cpt_locked);
> 	}
> }
> diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
> index 53373372b526..02241fbc9eaa 100644
> --- a/drivers/staging/lustre/lnet/lnet/router.c
> +++ b/drivers/staging/lustre/lnet/lnet/router.c
> @@ -783,8 +783,7 @@ lnet_wait_known_routerstate(void)
> 		if (all_known)
> 			return;
> 
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
> 	}
> }
> 
> @@ -1159,8 +1158,7 @@ lnet_prune_rc_data(int wait_unlink)
> 		i++;
> 		CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET,
> 		       "Waiting for rc buffers to unlink\n");
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ / 4);
> +		schedule_timeout_uninterruptible(HZ / 4);
> 
> 		lnet_net_lock(LNET_LOCK_EX);
> 	}
> @@ -1236,23 +1234,16 @@ lnet_router_checker(void *arg)
> 
> 		lnet_prune_rc_data(0); /* don't wait for UNLINK */
> 
> -		/*
> -		 * Call schedule_timeout() here always adds 1 to load average
> -		 * because kernel counts # active tasks as nr_running
> -		 * + nr_uninterruptible.
> -		 */
> 		/*
> 		 * if there are any routes then wakeup every second.  If
> 		 * there are no routes then sleep indefinitely until woken
> 		 * up by a user adding a route
> 		 */
> 		if (!lnet_router_checker_active())
> -			wait_event_interruptible(the_lnet.ln_rc_waitq,
> -						 lnet_router_checker_active());
> +			wait_event_idle(the_lnet.ln_rc_waitq,
> +					lnet_router_checker_active());
> 		else
> -			wait_event_interruptible_timeout(the_lnet.ln_rc_waitq,
> -							 false,
> -							 HZ);
> +			schedule_timeout_idle(HZ);
> 	}
> 
> 	lnet_prune_rc_data(1); /* wait for UNLINK */
> diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c b/drivers/staging/lustre/lnet/selftest/conrpc.c
> index e73b956d15e4..7809c1fc6f73 100644
> --- a/drivers/staging/lustre/lnet/selftest/conrpc.c
> +++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
> @@ -1345,8 +1345,7 @@ lstcon_rpc_cleanup_wait(void)
> 		mutex_unlock(&console_session.ses_mutex);
> 
> 		CWARN("Session is shutting down, waiting for termination of transactions\n");
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
> 
> 		mutex_lock(&console_session.ses_mutex);
> 	}
> diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
> index e097ef8414a6..298de41444b3 100644
> --- a/drivers/staging/lustre/lnet/selftest/rpc.c
> +++ b/drivers/staging/lustre/lnet/selftest/rpc.c
> @@ -1603,8 +1603,7 @@ srpc_startup(void)
> 	spin_lock_init(&srpc_data.rpc_glock);
> 
> 	/* 1 second pause to avoid timestamp reuse */
> -	set_current_state(TASK_UNINTERRUPTIBLE);
> -	schedule_timeout(HZ);
> +	schedule_timeout_uninterruptible(HZ);
> 	srpc_data.rpc_matchbits = ((__u64)ktime_get_real_seconds()) << 48;
> 
> 	srpc_data.rpc_state = SRPC_STATE_NONE;
> diff --git a/drivers/staging/lustre/lnet/selftest/selftest.h b/drivers/staging/lustre/lnet/selftest/selftest.h
> index ad9be095c4ea..9dbb0a51d430 100644
> --- a/drivers/staging/lustre/lnet/selftest/selftest.h
> +++ b/drivers/staging/lustre/lnet/selftest/selftest.h
> @@ -573,8 +573,7 @@ swi_state2str(int state)
> 
> #define selftest_wait_events()					\
> 	do {							\
> -		set_current_state(TASK_UNINTERRUPTIBLE);	\
> -		schedule_timeout(HZ / 10);	\
> +		schedule_timeout_uninterruptible(HZ / 10);	\
> 	} while (0)
> 
> #define lst_wait_until(cond, lock, fmt, ...)				\
> diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h b/drivers/staging/lustre/lustre/include/lustre_mdc.h
> index a9c9992a2502..6ac7fc4fa8c6 100644
> --- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
> +++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
> @@ -124,7 +124,7 @@ static inline void mdc_get_rpc_lock(struct mdc_rpc_lock *lck,
> 	 */
> 	while (unlikely(lck->rpcl_it == MDC_FAKE_RPCL_IT)) {
> 		mutex_unlock(&lck->rpcl_mutex);
> -		schedule_timeout(HZ / 4);
> +		schedule_timeout_uninterruptible(HZ / 4);
> 		goto again;
> 	}
> 
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
> index 5b125fdc7321..0ee4798f1bb9 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
> @@ -167,8 +167,7 @@ static void ldlm_handle_cp_callback(struct ptlrpc_request *req,
> 		int to = HZ;
> 
> 		while (to > 0) {
> -			set_current_state(TASK_INTERRUPTIBLE);
> -			schedule_timeout(to);
> +			schedule_timeout_interruptible(to);
> 			if (lock->l_granted_mode == lock->l_req_mode ||
> 			    ldlm_is_destroyed(lock))
> 				break;
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
> index f06cbd8b6d13..33d73fa8e9d5 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
> @@ -749,11 +749,9 @@ static void cleanup_resource(struct ldlm_resource *res, struct list_head *q,
> 			 */
> 			unlock_res(res);
> 			LDLM_DEBUG(lock, "setting FL_LOCAL_ONLY");
> -			if (lock->l_flags & LDLM_FL_FAIL_LOC) {
> -				set_current_state(TASK_UNINTERRUPTIBLE);
> -				schedule_timeout(4 * HZ);
> -				set_current_state(TASK_RUNNING);
> -			}
> +			if (lock->l_flags & LDLM_FL_FAIL_LOC)
> +				schedule_timeout_uninterruptible(4 * HZ);
> +
> 			if (lock->l_completion_ast)
> 				lock->l_completion_ast(lock, LDLM_FL_FAILED,
> 						       NULL);
> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
> index 5c8d0fe7217e..3dedc61d2257 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
> @@ -706,10 +706,8 @@ void ll_kill_super(struct super_block *sb)
> 		sbi->ll_umounting = 1;
> 
> 		/* wait running statahead threads to quit */
> -		while (atomic_read(&sbi->ll_sa_running) > 0) {
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC >> 3));
> -		}
> +		while (atomic_read(&sbi->ll_sa_running) > 0)
> +			schedule_timeout_uninterruptible(HZ >> 3);
> 	}
> }
> 
> diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
> index 3022706c6985..1ddb4a6dd8f3 100644
> --- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
> +++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
> @@ -751,8 +751,7 @@ static struct lu_device *echo_device_free(const struct lu_env *env,
> 	while (!list_empty(&ec->ec_objects)) {
> 		spin_unlock(&ec->ec_lock);
> 		CERROR("echo_client still has objects at cleanup time, wait for 1 second\n");
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
> 		lu_site_purge(env, ed->ed_site, -1);
> 		spin_lock(&ec->ec_lock);
> 	}
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
> index 7a3d83c0e50b..91dd09867260 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/client.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
> @@ -761,9 +761,7 @@ int ptlrpc_request_bufs_pack(struct ptlrpc_request *request,
> 			/* The RPC is infected, let the test change the
> 			 * fail_loc
> 			 */
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(2 * HZ);
> -			set_current_state(TASK_RUNNING);
> +			schedule_timeout_uninterruptible(2 * HZ);
> 		}
> 	}
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
> index 9b60292370a7..9c598710b576 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
> @@ -514,8 +514,7 @@ static int sptlrpc_req_replace_dead_ctx(struct ptlrpc_request *req)
> 		       "ctx (%p, fl %lx) doesn't switch, relax a little bit\n",
> 		       newctx, newctx->cc_flags);
> 
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC));
> +		schedule_timeout_interruptible(HZ);
> 	} else if (unlikely(!test_bit(PTLRPC_CTX_UPTODATE_BIT, &newctx->cc_flags))) {
> 		/*
> 		 * new ctx not up to date yet
> 
> 

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure.
  2018-07-30  3:49 ` [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure NeilBrown
@ 2018-07-30 21:31   ` Andreas Dilger
  2018-08-02  3:46   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Dilger @ 2018-07-30 21:31 UTC (permalink / raw)
  To: lustre-devel

On Jul 29, 2018, at 21:49, NeilBrown <neilb@suse.com> wrote:
> 
> The new_bkts array is *not* zeroed (any more) so when
> freeing recently allocated buckets on failure, we
> must not free beyond the last bucket successfully
> allocated.
> 
> Fixes: 12e46c461cb9 ("staging: lustre: change some LIBCFS_ALLOC calls to k?alloc(GFP_KERNEL)")
> Signed-off-by: NeilBrown <neilb@suse.com>

Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
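
The rule the fix enforces can be sketched in userspace as follows. This
is a simplified illustration, not the libcfs code: buckets_grow(),
buckets_free() and the failure-injecting flaky_alloc() are hypothetical
names, and the array handling is reduced to the part that matters,
namely that slots past the failing index were never written and so must
not be passed to free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocator hook, so that a failure can be injected for demonstration. */
typedef void *(*alloc_fn)(size_t size);

/* Demo-only allocator: succeeds flaky_budget times, then returns NULL. */
size_t flaky_budget;

void *flaky_alloc(size_t size)
{
	if (flaky_budget == 0)
		return NULL;
	flaky_budget--;
	return calloc(1, size);
}

/* Free the buckets in the half-open index range [from, to). */
void buckets_free(void **bkts, size_t from, size_t to)
{
	for (size_t i = from; i < to; i++) {
		free(bkts[i]);
		bkts[i] = NULL;
	}
}

/*
 * Fill slots [old_size, new_size) of an array that was NOT zeroed.
 * Returns 0 on success, -1 on allocation failure.
 */
int buckets_grow(void **bkts, size_t old_size, size_t new_size,
		 size_t bkt_size, alloc_fn alloc)
{
	for (size_t i = old_size; i < new_size; i++) {
		bkts[i] = alloc(bkt_size);
		if (!bkts[i]) {
			/*
			 * Only [old_size, i) hold pointers we allocated.
			 * Slots after i still contain garbage, so the
			 * upper bound must be i, not new_size.
			 */
			buckets_free(bkts, old_size, i);
			return -1;
		}
	}
	return 0;
}
```

Passing new_size as the upper bound in the error path would hand
whatever happens to be in the untouched slots to free(), which is the
class of bug the one-line change removes.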

> ---
> drivers/staging/lustre/lnet/libcfs/hash.c |    2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/hash.c b/drivers/staging/lustre/lnet/libcfs/hash.c
> index 48be66f0d654..f452c4540ca1 100644
> --- a/drivers/staging/lustre/lnet/libcfs/hash.c
> +++ b/drivers/staging/lustre/lnet/libcfs/hash.c
> @@ -904,7 +904,7 @@ cfs_hash_buckets_realloc(struct cfs_hash *hs, struct cfs_hash_bucket **old_bkts,
> 		new_bkts[i] = kzalloc(cfs_hash_bkt_size(hs), GFP_KERNEL);
> 		if (!new_bkts[i]) {
> 			cfs_hash_buckets_free(new_bkts, cfs_hash_bkt_size(hs),
> -					      old_size, new_size);
> +					      old_size, i);
> 			return NULL;
> 		}
> 
> 
> 

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init
  2018-07-30  3:49 ` [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init NeilBrown
@ 2018-07-30 21:32   ` Andreas Dilger
  2018-08-02  3:46   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Dilger @ 2018-07-30 21:32 UTC (permalink / raw)
  To: lustre-devel

On Jul 29, 2018, at 21:49, NeilBrown <neilb@suse.com> wrote:
> 
> Large memory allocations should be avoided at module-init,
> but registering services is appropriate.
> So move the registration of debugfs files
> back into libcfs_init().
> Without this, /sys/kernel/debug/lnet etc are not visible
> as soon as libcfs is loaded.
> No debugfs file access needs anything allocated by libcfs_setup().
> 
> Fixes: 64bf0b1a079d ("staging: lustre: refactor libcfs initialization.")
> Signed-off-by: NeilBrown <neilb@suse.com>

Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

> ---
> drivers/staging/lustre/lnet/libcfs/module.c |    8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
> index bfadfcfa3c44..5d2be941777e 100644
> --- a/drivers/staging/lustre/lnet/libcfs/module.c
> +++ b/drivers/staging/lustre/lnet/libcfs/module.c
> @@ -719,10 +719,6 @@ int libcfs_setup(void)
> 		goto err;
> 	}
> 
> -	lnet_insert_debugfs(lnet_table);
> -	if (!IS_ERR_OR_NULL(lnet_debugfs_root))
> -		lnet_insert_debugfs_links(lnet_debugfs_symlinks);
> -
> 	CDEBUG(D_OTHER, "portals setup OK\n");
> out:
> 	libcfs_active = 1;
> @@ -743,6 +739,10 @@ static int libcfs_init(void)
> {
> 	int rc;
> 
> +	lnet_insert_debugfs(lnet_table);
> +	if (!IS_ERR_OR_NULL(lnet_debugfs_root))
> +		lnet_insert_debugfs_links(lnet_debugfs_symlinks);
> +
> 	rc = misc_register(&libcfs_dev);
> 	if (rc)
> 		CERROR("misc_register: error %d\n", rc);
> 
> 

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes.
  2018-07-30  3:49 ` [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes NeilBrown
@ 2018-07-30 21:32   ` Andreas Dilger
  2018-08-02  3:46   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Dilger @ 2018-07-30 21:32 UTC (permalink / raw)
  To: lustre-devel

On Jul 29, 2018, at 21:49, NeilBrown <neilb@suse.com> wrote:
> 
> There are three different trace contexts:
> process, softirq, irq.
> Each has its own lock (tcd_lock) which is locked
> as appropriate for that context.
> lockdep currently doesn't see that they are different
> and so deduces that the different uses might lead to
> deadlocks.
> So use separate calls to spin_lock_init() so that they
> each get a separate lock class, and lockdep sees no
> problem.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
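
Why three textually identical calls behave differently from one call in
a loop: with lock debugging enabled, spin_lock_init() is a macro that
declares a static key at the call site, and lockdep uses that key to
identify the lock class. The userspace sketch below imitates only that
mechanism. The toy_* names are invented for illustration and the real
macro does a good deal more.

```c
#include <assert.h>
#include <stddef.h>

struct toy_class_key { char dummy; };
struct toy_lock { const struct toy_class_key *key; };

/* Each expansion of this macro gets its own static key object. */
#define toy_lock_init(lock)					\
	do {							\
		static struct toy_class_key toy_key_;		\
		(lock)->key = &toy_key_;			\
	} while (0)

enum { TOY_PROC, TOY_SOFTIRQ, TOY_IRQ, TOY_TYPES };

/* One call site: every lock ends up sharing a single key (class). */
void init_shared(struct toy_lock locks[TOY_TYPES])
{
	for (int i = 0; i < TOY_TYPES; i++)
		toy_lock_init(&locks[i]);
}

/* Three call sites: each context type gets its own key (class). */
void init_split(struct toy_lock locks[TOY_TYPES])
{
	for (int i = 0; i < TOY_TYPES; i++) {
		switch (i) {
		case TOY_PROC:
			toy_lock_init(&locks[i]);
			break;
		case TOY_SOFTIRQ:
			toy_lock_init(&locks[i]);
			break;
		case TOY_IRQ:
			toy_lock_init(&locks[i]);
			break;
		}
	}
}
```

After init_shared() all three locks point at the same key, while after
init_split() the keys are pairwise distinct, which is what lets lockdep
track the process, softirq and irq locks independently.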

> ---
> drivers/staging/lustre/lnet/libcfs/tracefile.c |   18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> index d4c80cf254e4..40048165fc16 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> @@ -1285,7 +1285,23 @@ int cfs_tracefile_init(int max_pages)
> 	cfs_tcd_for_each(tcd, i, j) {
> 		int factor = pages_factor[i];
> 
> -		spin_lock_init(&tcd->tcd_lock);
> +		/* Note that we have three separate calls so
> +		 * they the locks get three separate classes
> +		 * and lockdep never thinks they are related.
> +		 * As they are used in different interrupt
> +		 * contexts, lockdep think the usage would conflict.
> +		 */
> +		switch(i) {
> +		case CFS_TCD_TYPE_PROC:
> +			spin_lock_init(&tcd->tcd_lock);
> +			break;
> +		case CFS_TCD_TYPE_SOFTIRQ:
> +			spin_lock_init(&tcd->tcd_lock);
> +			break;
> +		case CFS_TCD_TYPE_IRQ:
> +			spin_lock_init(&tcd->tcd_lock);
> +			break;
> +		}
> 		tcd->tcd_pages_factor = factor;
> 		tcd->tcd_type = i;
> 		tcd->tcd_cpu = j;
> 
> 

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer()
  2018-07-30  3:49 ` [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer() NeilBrown
@ 2018-07-30 21:37   ` Andreas Dilger
  2018-08-02  3:47   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Dilger @ 2018-07-30 21:37 UTC (permalink / raw)
  To: lustre-devel



> On Jul 29, 2018, at 21:49, NeilBrown <neilb@suse.com> wrote:
> 
> cfs_trace_allocate_string_buffer() is a simple wrapper
> around kzalloc() that adds little value.  The code is
> clearer if we perform the test and the allocation
> directly where needed.
> 
> Also change the test from '>' to '>=' to ensure we
> never try to allocate more than 2 pages, as that seems
> to be the intent.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

I was going to say that we probably don't have even a single debug message that is larger than PAGE_SIZE, but I suspect in some cases, printing a message with a PATH_MAX-sized filename may result in a buffer that is larger than PAGE_SIZE, though smaller than 2x PAGE_SIZE.

It isn't clear that returning an error for a length of exactly 2 * PAGE_SIZE
makes sense, but I also don't think this corner case is critical (and it
leaves a byte for a NUL terminator in the buffer).

Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
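
A note on the boundary, going only by the quoted hunks: both *_usrstr callers used to pass usr_str_nob + 1 into the wrapper, so the old '>' test on nob and the new '>=' test on usr_str_nob appear to reject the same lengths. A small userspace sketch of the two checks, assuming a 4096-byte page (SKETCH_PAGE_SIZE and the function names are made up for illustration):

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed; the real PAGE_SIZE is per-architecture */

/* Old path: the caller passed usr_str_nob + 1 and the wrapper tested '>'. */
static int old_usrstr_check(int usr_str_nob)
{
	int nob = usr_str_nob + 1;

	return nob > 2 * SKETCH_PAGE_SIZE ? -EINVAL : 0;
}

/* New path: the caller tests usr_str_nob itself with '>='. */
static int new_usrstr_check(int usr_str_nob)
{
	return usr_str_nob >= 2 * SKETCH_PAGE_SIZE ? -EINVAL : 0;
}

/* Walks the lengths around the limit and checks that both paths agree. */
static void check_same_boundary(void)
{
	for (int n = 0; n <= 3 * SKETCH_PAGE_SIZE; n++)
		assert(old_usrstr_check(n) == new_usrstr_check(n));
}
```

Under that reading the largest accepted usr_str_nob is 2 * PAGE_SIZE - 1 in both versions, which makes the allocation exactly two pages including the NUL byte.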

> ---
> drivers/staging/lustre/lnet/libcfs/module.c    |    6 +++--
> drivers/staging/lustre/lnet/libcfs/tracefile.c |   28 +++++++++---------------
> drivers/staging/lustre/lnet/libcfs/tracefile.h |    1 -
> 3 files changed, 13 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
> index 5d2be941777e..1de83b1997c6 100644
> --- a/drivers/staging/lustre/lnet/libcfs/module.c
> +++ b/drivers/staging/lustre/lnet/libcfs/module.c
> @@ -305,9 +305,9 @@ static int proc_dobitmasks(struct ctl_table *table, int write,
> 	int is_subsys = (mask == &libcfs_subsystem_debug) ? 1 : 0;
> 	int is_printk = (mask == &libcfs_printk) ? 1 : 0;
> 
> -	rc = cfs_trace_allocate_string_buffer(&tmpstr, tmpstrlen);
> -	if (rc < 0)
> -		return rc;
> +	tmpstr = kzalloc(tmpstrlen, GFP_KERNEL);
> +	if (!tmpstr)
> +		return -ENOMEM;
> 
> 	if (!write) {
> 		libcfs_debug_mask2str(tmpstr, tmpstrlen, *mask, is_subsys);
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> index 40048165fc16..b273107b3815 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> @@ -963,26 +963,16 @@ int cfs_trace_copyout_string(char __user *usr_buffer, int usr_buffer_nob,
> }
> EXPORT_SYMBOL(cfs_trace_copyout_string);
> 
> -int cfs_trace_allocate_string_buffer(char **str, int nob)
> -{
> -	if (nob > 2 * PAGE_SIZE)	    /* string must be "sensible" */
> -		return -EINVAL;
> -
> -	*str = kmalloc(nob, GFP_KERNEL | __GFP_ZERO);
> -	if (!*str)
> -		return -ENOMEM;
> -
> -	return 0;
> -}
> -
> int cfs_trace_dump_debug_buffer_usrstr(void __user *usr_str, int usr_str_nob)
> {
> 	char *str;
> 	int rc;
> 
> -	rc = cfs_trace_allocate_string_buffer(&str, usr_str_nob + 1);
> -	if (rc)
> -		return rc;
> +	if (usr_str_nob >= 2 * PAGE_SIZE)
> +		return -EINVAL;
> +	str = kzalloc(usr_str_nob + 1, GFP_KERNEL);
> +	if (!str)
> +		return -ENOMEM;
> 
> 	rc = cfs_trace_copyin_string(str, usr_str_nob + 1,
> 				     usr_str, usr_str_nob);
> @@ -1044,9 +1034,11 @@ int cfs_trace_daemon_command_usrstr(void __user *usr_str, int usr_str_nob)
> 	char *str;
> 	int rc;
> 
> -	rc = cfs_trace_allocate_string_buffer(&str, usr_str_nob + 1);
> -	if (rc)
> -		return rc;
> +	if (usr_str_nob >= 2 * PAGE_SIZE)
> +		return -EINVAL;
> +	str = kzalloc(usr_str_nob + 1, GFP_KERNEL);
> +	if (!str)
> +		return -ENOMEM;
> 
> 	rc = cfs_trace_copyin_string(str, usr_str_nob + 1,
> 				     usr_str, usr_str_nob);
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.h b/drivers/staging/lustre/lnet/libcfs/tracefile.h
> index 82f090fd8dfa..2134549bb3d7 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.h
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.h
> @@ -63,7 +63,6 @@ int cfs_trace_copyin_string(char *knl_buffer, int knl_buffer_nob,
> 			    const char __user *usr_buffer, int usr_buffer_nob);
> int cfs_trace_copyout_string(char __user *usr_buffer, int usr_buffer_nob,
> 			     const char *knl_str, char *append);
> -int cfs_trace_allocate_string_buffer(char **str, int nob);
> int cfs_trace_dump_debug_buffer_usrstr(void __user *usr_str, int usr_str_nob);
> int cfs_trace_daemon_command(char *str);
> int cfs_trace_daemon_command_usrstr(void __user *usr_str, int usr_str_nob);
> 
> 

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE.
  2018-07-30  3:49 ` [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE NeilBrown
@ 2018-07-30 21:43   ` Andreas Dilger
  2018-08-02  3:48   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Dilger @ 2018-07-30 21:43 UTC (permalink / raw)
  To: lustre-devel

On Jul 29, 2018, at 21:49, NeilBrown <neilb@suse.com> wrote:
> 
> TASK_NOLOAD is not a task state to be used by
> itself; it should only be used together with
> TASK_UNINTERRUPTIBLE, which is easily done
> by using TASK_IDLE.
> 
> So convert to TASK_IDLE.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

Nice to see this infrastructure is available.  I'm no fan of l_wait_event()
complexity, but without TASK_IDLE (only added in 4.13) we had to find a way
for servers to have lots of service threads without a load average of 100
or whatever...

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
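
To spell out why TASK_NOLOAD alone is not a sleep state: it is a modifier bit, and TASK_IDLE is the combination of TASK_UNINTERRUPTIBLE and TASK_NOLOAD. The userspace sketch below mirrors the bit values as I remember them from include/linux/sched.h of that era, so treat the numbers as illustrative. The load test is a simplified version of the kernel's check (it ignores frozen tasks), and the helper names are invented.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values mirrored from include/linux/sched.h; illustrative only. */
#define TASK_INTERRUPTIBLE	0x0001
#define TASK_UNINTERRUPTIBLE	0x0002
#define TASK_NOLOAD		0x0400
#define TASK_IDLE		(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

/* Simplified load-average rule: uninterruptible sleepers count unless NOLOAD is set. */
static bool contributes_to_load(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) && !(state & TASK_NOLOAD);
}

/* A usable sleep state carries one of the two sleeping bits. */
static bool is_sleep_state(unsigned int state)
{
	return (state & (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)) != 0;
}

/* Summarises the three cases discussed in the commit message. */
static void check_idle_semantics(void)
{
	assert(!is_sleep_state(TASK_NOLOAD));		/* bare modifier: not a sleep */
	assert(is_sleep_state(TASK_IDLE));		/* real sleep ... */
	assert(!contributes_to_load(TASK_IDLE));	/* ... without load */
	assert(contributes_to_load(TASK_UNINTERRUPTIBLE));
}
```

So a task put to sleep in TASK_IDLE sleeps uninterruptibly but does not count towards the load average, which is what the service threads want.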

> ---
> drivers/staging/lustre/lnet/lnet/lib-eq.c |    4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/lnet/lib-eq.c b/drivers/staging/lustre/lnet/lnet/lib-eq.c
> index 8347cc44e47d..f085388895ea 100644
> --- a/drivers/staging/lustre/lnet/lnet/lib-eq.c
> +++ b/drivers/staging/lustre/lnet/lnet/lib-eq.c
> @@ -349,7 +349,7 @@ __must_hold(&the_lnet.ln_eq_wait_lock)
>  * \param timeout Time in jiffies to wait for an event to occur on
>  * one of the EQs. The constant MAX_SCHEDULE_TIMEOUT can be used to indicate an
>  * infinite timeout.
> - * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_NOLOAD
> + * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_IDLE
>  * \param event,which On successful return (1 or -EOVERFLOW), \a event will
>  * hold the next event in the EQs, and \a which will contain the index of the
>  * EQ from which the event was taken.
> @@ -406,7 +406,7 @@ LNetEQPoll(struct lnet_handle_eq *eventqs, int neq, signed long timeout,
> 		 */
> 		wait = lnet_eq_wait_locked(&timeout,
> 					   interruptible ? TASK_INTERRUPTIBLE
> -					   : TASK_NOLOAD);
> +					   : TASK_IDLE);
> 		if (wait < 0) /* no new event */
> 			break;
> 	}
> 
> 

Cheers, Andreas
---
Andreas Dilger
Principal Lustre Architect
Whamcloud








^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes
  2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
                   ` (6 preceding siblings ...)
  2018-07-30  3:49 ` [lustre-devel] [PATCH 6/7] lustre: lnet: convert ni_refs to percpu_refcount NeilBrown
@ 2018-07-30 21:45 ` Andreas Dilger
  7 siblings, 0 replies; 24+ messages in thread
From: Andreas Dilger @ 2018-07-30 21:45 UTC (permalink / raw)
  To: lustre-devel

On Jul 29, 2018, at 21:49, NeilBrown <neilb@suse.com> wrote:
> 
> There is no real pattern here, just minor tidy-up and minor bug-fix.

I didn't review the 6/7 patch, nor a couple of the list_head changes
since they are more involved in LNet and better reviewed by someone
with more experience in that code.

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state().
  2018-07-30  3:49 ` [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state() NeilBrown
  2018-07-30 21:30   ` Andreas Dilger
@ 2018-08-02  3:45   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: James Simmons @ 2018-08-02  3:45 UTC (permalink / raw)
  To: lustre-devel


> Lustre has many calls to
>    set_current_state(STATE);
>    schedule_timeout(time);
> 
> 
> These can more easily be done as
>     schedule_timeout_STATE(time);
> 
> Also clean up some oddities, such as setting the state
> to TASK_RUNNING after the timeout, and simplify
> some time calculations.
> 
> Some schedule_timeout() calls remain as the state was set earlier,
> before an 'add_wait_queue()'.  It would be incorrect to convert these.

From a time long ago.

Reviewed-by: James Simmons <jsimmons@infradead.org>
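
For reference, the helpers are just "set the state, then call schedule_timeout()", and schedule_timeout() always returns with the task in TASK_RUNNING. The toy userspace model below illustrates both points. It is a mock, not kernel code: the mock_* names, the HZ value and the jiffies_slept counter are assumptions made for the example.

```c
#include <assert.h>

#define HZ 250	/* assumed tick rate for the example */

enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE };

static enum task_state current_state = TASK_RUNNING;
static long jiffies_slept;	/* how long the mock task really slept */

static void mock_set_current_state(enum task_state s)
{
	current_state = s;
}

/* Model: a runnable task does not sleep; the state is TASK_RUNNING on return. */
static long mock_schedule_timeout(long timeout)
{
	assert(timeout >= 0);
	if (current_state != TASK_RUNNING)
		jiffies_slept += timeout;
	current_state = TASK_RUNNING;
	return 0;
}

/* Same shape as the kernel helper: set the state, then sleep. */
static long mock_schedule_timeout_uninterruptible(long timeout)
{
	mock_set_current_state(TASK_UNINTERRUPTIBLE);
	return mock_schedule_timeout(timeout);
}
```

In this model a bare mock_schedule_timeout(HZ / 4) from a running task sleeps for zero jiffies, which is the situation the lustre_mdc.h hunk fixes, and resetting the state to TASK_RUNNING after the call is redundant.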
 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c    |   12 ++++--------
>  .../staging/lustre/lnet/klnds/socklnd/socklnd.c    |    6 ++----
>  .../staging/lustre/lnet/klnds/socklnd/socklnd_cb.c |   14 ++++++--------
>  drivers/staging/lustre/lnet/libcfs/fail.c          |    3 +--
>  drivers/staging/lustre/lnet/libcfs/tracefile.c     |    3 +--
>  drivers/staging/lustre/lnet/lnet/acceptor.c        |    3 +--
>  drivers/staging/lustre/lnet/lnet/api-ni.c          |    6 ++----
>  drivers/staging/lustre/lnet/lnet/peer.c            |    3 +--
>  drivers/staging/lustre/lnet/lnet/router.c          |   19 +++++--------------
>  drivers/staging/lustre/lnet/selftest/conrpc.c      |    3 +--
>  drivers/staging/lustre/lnet/selftest/rpc.c         |    3 +--
>  drivers/staging/lustre/lnet/selftest/selftest.h    |    3 +--
>  drivers/staging/lustre/lustre/include/lustre_mdc.h |    2 +-
>  drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |    3 +--
>  drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    8 +++-----
>  drivers/staging/lustre/lustre/llite/llite_lib.c    |    6 ++----
>  .../staging/lustre/lustre/obdecho/echo_client.c    |    3 +--
>  drivers/staging/lustre/lustre/ptlrpc/client.c      |    4 +---
>  drivers/staging/lustre/lustre/ptlrpc/sec.c         |    3 +--
>  19 files changed, 36 insertions(+), 71 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> index e15ad94151bd..f496e6fcc416 100644
> --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
> @@ -1201,8 +1201,7 @@ static struct kib_hca_dev *kiblnd_current_hdev(struct kib_dev *dev)
>  		if (!(i++ % 50))
>  			CDEBUG(D_NET, "%s: Wait for failover\n",
>  			       dev->ibd_ifname);
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(HZ / 100);
> +		schedule_timeout_interruptible(HZ / 100);
>  
>  		read_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
>  	}
> @@ -1916,8 +1915,7 @@ struct list_head *kiblnd_pool_alloc_node(struct kib_poolset *ps)
>  		CDEBUG(D_NET, "Another thread is allocating new %s pool, waiting %d HZs for her to complete. trips = %d\n",
>  		       ps->ps_name, interval, trips);
>  
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(interval);
> +		schedule_timeout_interruptible(interval);
>  		if (interval < HZ)
>  			interval *= 2;
>  
> @@ -2548,8 +2546,7 @@ static void kiblnd_base_shutdown(void)
>  			CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET,
>  			       "Waiting for %d threads to terminate\n",
>  			       atomic_read(&kiblnd_data.kib_nthreads));
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
>  		}
>  
>  		/* fall through */
> @@ -2599,8 +2596,7 @@ static void kiblnd_shutdown(struct lnet_ni *ni)
>  			       "%s: waiting for %d peers to disconnect\n",
>  			       libcfs_nid2str(ni->ni_nid),
>  			       atomic_read(&net->ibn_npeers));
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
>  		}
>  
>  		kiblnd_net_fini_pools(net);
> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> index f0b0480686dc..4dde158451ea 100644
> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> @@ -2307,8 +2307,7 @@ ksocknal_base_shutdown(void)
>  			       "waiting for %d threads to terminate\n",
>  				ksocknal_data.ksnd_nthreads);
>  			read_unlock(&ksocknal_data.ksnd_global_lock);
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
>  			read_lock(&ksocknal_data.ksnd_global_lock);
>  		}
>  		read_unlock(&ksocknal_data.ksnd_global_lock);
> @@ -2533,8 +2532,7 @@ ksocknal_shutdown(struct lnet_ni *ni)
>  		CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET, /* power of 2? */
>  		       "waiting for %d peers to disconnect\n",
>  		       net->ksnn_npeers);
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
>  
>  		ksocknal_debug_peerhash(ni);
>  
> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
> index a5c0e8a9bc40..32b76727f400 100644
> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
> @@ -188,10 +188,9 @@ ksocknal_transmit(struct ksock_conn *conn, struct ksock_tx *tx)
>  	int rc;
>  	int bufnob;
>  
> -	if (ksocknal_data.ksnd_stall_tx) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(ksocknal_data.ksnd_stall_tx * HZ);
> -	}
> +	if (ksocknal_data.ksnd_stall_tx)
> +		schedule_timeout_uninterruptible(
> +			ksocknal_data.ksnd_stall_tx * HZ);
>  
>  	LASSERT(tx->tx_resid);
>  
> @@ -293,10 +292,9 @@ ksocknal_receive(struct ksock_conn *conn)
>  	 */
>  	int rc;
>  
> -	if (ksocknal_data.ksnd_stall_rx) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(ksocknal_data.ksnd_stall_rx * HZ);
> -	}
> +	if (ksocknal_data.ksnd_stall_rx)
> +		schedule_timeout_uninterruptible(
> +			ksocknal_data.ksnd_stall_rx * HZ);
>  
>  	rc = ksocknal_connsock_addref(conn);
>  	if (rc) {
> diff --git a/drivers/staging/lustre/lnet/libcfs/fail.c b/drivers/staging/lustre/lnet/libcfs/fail.c
> index bd86b3b5bc34..6ee4de2178ce 100644
> --- a/drivers/staging/lustre/lnet/libcfs/fail.c
> +++ b/drivers/staging/lustre/lnet/libcfs/fail.c
> @@ -137,8 +137,7 @@ int __cfs_fail_timeout_set(u32 id, u32 value, int ms, int set)
>  	if (ret && likely(ms > 0)) {
>  		CERROR("cfs_fail_timeout id %x sleeping for %dms\n",
>  		       id, ms);
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(ms * HZ / 1000);
> +		schedule_timeout_uninterruptible(ms * HZ / 1000);
>  		CERROR("cfs_fail_timeout id %x awake\n", id);
>  	}
>  	return ret;
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> index a4768e930021..d4c80cf254e4 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> @@ -1208,8 +1208,7 @@ static int tracefiled(void *arg)
>  		}
>  		init_waitqueue_entry(&__wait, current);
>  		add_wait_queue(&tctl->tctl_waitq, &__wait);
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_interruptible(HZ);
>  		remove_wait_queue(&tctl->tctl_waitq, &__wait);
>  	}
>  	complete(&tctl->tctl_stop);
> diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
> index 5648f17eddc0..3ae3ca1311a1 100644
> --- a/drivers/staging/lustre/lnet/lnet/acceptor.c
> +++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
> @@ -362,8 +362,7 @@ lnet_acceptor(void *arg)
>  		if (rc) {
>  			if (rc != -EAGAIN) {
>  				CWARN("Accept error %d: pausing...\n", rc);
> -				set_current_state(TASK_UNINTERRUPTIBLE);
> -				schedule_timeout(HZ);
> +				schedule_timeout_uninterruptible(HZ);
>  			}
>  			continue;
>  		}
> diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
> index 14b797802a85..cdbbe9cc8d95 100644
> --- a/drivers/staging/lustre/lnet/lnet/api-ni.c
> +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
> @@ -955,8 +955,7 @@ lnet_ping_md_unlink(struct lnet_ping_info *pinfo,
>  	/* NB md could be busy; this just starts the unlink */
>  	while (pinfo->pi_features != LNET_PING_FEAT_INVAL) {
>  		CDEBUG(D_NET, "Still waiting for ping MD to unlink\n");
> -		set_current_state(TASK_NOLOAD);
> -		schedule_timeout(HZ);
> +		schedule_timeout_idle(HZ);
>  	}
>  }
>  
> @@ -1093,8 +1092,7 @@ lnet_clear_zombies_nis_locked(void)
>  				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
>  				       libcfs_nid2str(ni->ni_nid));
>  			}
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(HZ);
> +			schedule_timeout_uninterruptible(HZ);
>  			lnet_net_lock(LNET_LOCK_EX);
>  			continue;
>  		}
> diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
> index 7c303ef6bb34..d9452c322e4d 100644
> --- a/drivers/staging/lustre/lnet/lnet/peer.c
> +++ b/drivers/staging/lustre/lnet/lnet/peer.c
> @@ -136,8 +136,7 @@ lnet_peer_table_deathrow_wait_locked(struct lnet_peer_table *ptable,
>  			       "Waiting for %d zombies on peer table\n",
>  			       ptable->pt_zombies);
>  		}
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ >> 1);
> +		schedule_timeout_uninterruptible(HZ >> 1);
>  		lnet_net_lock(cpt_locked);
>  	}
>  }
> diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
> index 53373372b526..02241fbc9eaa 100644
> --- a/drivers/staging/lustre/lnet/lnet/router.c
> +++ b/drivers/staging/lustre/lnet/lnet/router.c
> @@ -783,8 +783,7 @@ lnet_wait_known_routerstate(void)
>  		if (all_known)
>  			return;
>  
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
>  	}
>  }
>  
> @@ -1159,8 +1158,7 @@ lnet_prune_rc_data(int wait_unlink)
>  		i++;
>  		CDEBUG(((i & (-i)) == i) ? D_WARNING : D_NET,
>  		       "Waiting for rc buffers to unlink\n");
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ / 4);
> +		schedule_timeout_uninterruptible(HZ / 4);
>  
>  		lnet_net_lock(LNET_LOCK_EX);
>  	}
> @@ -1236,23 +1234,16 @@ lnet_router_checker(void *arg)
>  
>  		lnet_prune_rc_data(0); /* don't wait for UNLINK */
>  
> -		/*
> -		 * Call schedule_timeout() here always adds 1 to load average
> -		 * because kernel counts # active tasks as nr_running
> -		 * + nr_uninterruptible.
> -		 */
>  		/*
>  		 * if there are any routes then wakeup every second.  If
>  		 * there are no routes then sleep indefinitely until woken
>  		 * up by a user adding a route
>  		 */
>  		if (!lnet_router_checker_active())
> -			wait_event_interruptible(the_lnet.ln_rc_waitq,
> -						 lnet_router_checker_active());
> +			wait_event_idle(the_lnet.ln_rc_waitq,
> +					lnet_router_checker_active());
>  		else
> -			wait_event_interruptible_timeout(the_lnet.ln_rc_waitq,
> -							 false,
> -							 HZ);
> +			schedule_timeout_idle(HZ);
>  	}
>  
>  	lnet_prune_rc_data(1); /* wait for UNLINK */
> diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c b/drivers/staging/lustre/lnet/selftest/conrpc.c
> index e73b956d15e4..7809c1fc6f73 100644
> --- a/drivers/staging/lustre/lnet/selftest/conrpc.c
> +++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
> @@ -1345,8 +1345,7 @@ lstcon_rpc_cleanup_wait(void)
>  		mutex_unlock(&console_session.ses_mutex);
>  
>  		CWARN("Session is shutting down, waiting for termination of transactions\n");
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
>  
>  		mutex_lock(&console_session.ses_mutex);
>  	}
> diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
> index e097ef8414a6..298de41444b3 100644
> --- a/drivers/staging/lustre/lnet/selftest/rpc.c
> +++ b/drivers/staging/lustre/lnet/selftest/rpc.c
> @@ -1603,8 +1603,7 @@ srpc_startup(void)
>  	spin_lock_init(&srpc_data.rpc_glock);
>  
>  	/* 1 second pause to avoid timestamp reuse */
> -	set_current_state(TASK_UNINTERRUPTIBLE);
> -	schedule_timeout(HZ);
> +	schedule_timeout_uninterruptible(HZ);
>  	srpc_data.rpc_matchbits = ((__u64)ktime_get_real_seconds()) << 48;
>  
>  	srpc_data.rpc_state = SRPC_STATE_NONE;
> diff --git a/drivers/staging/lustre/lnet/selftest/selftest.h b/drivers/staging/lustre/lnet/selftest/selftest.h
> index ad9be095c4ea..9dbb0a51d430 100644
> --- a/drivers/staging/lustre/lnet/selftest/selftest.h
> +++ b/drivers/staging/lustre/lnet/selftest/selftest.h
> @@ -573,8 +573,7 @@ swi_state2str(int state)
>  
>  #define selftest_wait_events()					\
>  	do {							\
> -		set_current_state(TASK_UNINTERRUPTIBLE);	\
> -		schedule_timeout(HZ / 10);	\
> +		schedule_timeout_uninterruptible(HZ / 10);	\
>  	} while (0)
>  
>  #define lst_wait_until(cond, lock, fmt, ...)				\
> diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h b/drivers/staging/lustre/lustre/include/lustre_mdc.h
> index a9c9992a2502..6ac7fc4fa8c6 100644
> --- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
> +++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
> @@ -124,7 +124,7 @@ static inline void mdc_get_rpc_lock(struct mdc_rpc_lock *lck,
>  	 */
>  	while (unlikely(lck->rpcl_it == MDC_FAKE_RPCL_IT)) {
>  		mutex_unlock(&lck->rpcl_mutex);
> -		schedule_timeout(HZ / 4);
> +		schedule_timeout_uninterruptible(HZ / 4);
>  		goto again;
>  	}
>  
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
> index 5b125fdc7321..0ee4798f1bb9 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
> @@ -167,8 +167,7 @@ static void ldlm_handle_cp_callback(struct ptlrpc_request *req,
>  		int to = HZ;
>  
>  		while (to > 0) {
> -			set_current_state(TASK_INTERRUPTIBLE);
> -			schedule_timeout(to);
> +			schedule_timeout_interruptible(to);
>  			if (lock->l_granted_mode == lock->l_req_mode ||
>  			    ldlm_is_destroyed(lock))
>  				break;
> diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
> index f06cbd8b6d13..33d73fa8e9d5 100644
> --- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
> +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
> @@ -749,11 +749,9 @@ static void cleanup_resource(struct ldlm_resource *res, struct list_head *q,
>  			 */
>  			unlock_res(res);
>  			LDLM_DEBUG(lock, "setting FL_LOCAL_ONLY");
> -			if (lock->l_flags & LDLM_FL_FAIL_LOC) {
> -				set_current_state(TASK_UNINTERRUPTIBLE);
> -				schedule_timeout(4 * HZ);
> -				set_current_state(TASK_RUNNING);
> -			}
> +			if (lock->l_flags & LDLM_FL_FAIL_LOC)
> +				schedule_timeout_uninterruptible(4 * HZ);
> +
>  			if (lock->l_completion_ast)
>  				lock->l_completion_ast(lock, LDLM_FL_FAILED,
>  						       NULL);
> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
> index 5c8d0fe7217e..3dedc61d2257 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
> @@ -706,10 +706,8 @@ void ll_kill_super(struct super_block *sb)
>  		sbi->ll_umounting = 1;
>  
>  		/* wait running statahead threads to quit */
> -		while (atomic_read(&sbi->ll_sa_running) > 0) {
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC >> 3));
> -		}
> +		while (atomic_read(&sbi->ll_sa_running) > 0)
> +			schedule_timeout_uninterruptible(HZ >> 3);
>  	}
>  }
>  
> diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
> index 3022706c6985..1ddb4a6dd8f3 100644
> --- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
> +++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
> @@ -751,8 +751,7 @@ static struct lu_device *echo_device_free(const struct lu_env *env,
>  	while (!list_empty(&ec->ec_objects)) {
>  		spin_unlock(&ec->ec_lock);
>  		CERROR("echo_client still has objects at cleanup time, wait for 1 second\n");
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		schedule_timeout(HZ);
> +		schedule_timeout_uninterruptible(HZ);
>  		lu_site_purge(env, ed->ed_site, -1);
>  		spin_lock(&ec->ec_lock);
>  	}
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
> index 7a3d83c0e50b..91dd09867260 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/client.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
> @@ -761,9 +761,7 @@ int ptlrpc_request_bufs_pack(struct ptlrpc_request *request,
>  			/* The RPC is infected, let the test change the
>  			 * fail_loc
>  			 */
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> -			schedule_timeout(2 * HZ);
> -			set_current_state(TASK_RUNNING);
> +			schedule_timeout_uninterruptible(2 * HZ);
>  		}
>  	}
>  
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
> index 9b60292370a7..9c598710b576 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
> @@ -514,8 +514,7 @@ static int sptlrpc_req_replace_dead_ctx(struct ptlrpc_request *req)
>  		       "ctx (%p, fl %lx) doesn't switch, relax a little bit\n",
>  		       newctx, newctx->cc_flags);
>  
> -		set_current_state(TASK_INTERRUPTIBLE);
> -		schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC));
> +		schedule_timeout_interruptible(HZ);
>  	} else if (unlikely(!test_bit(PTLRPC_CTX_UPTODATE_BIT, &newctx->cc_flags))) {
>  		/*
>  		 * new ctx not up to date yet
> 
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure.
  2018-07-30  3:49 ` [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure NeilBrown
  2018-07-30 21:31   ` Andreas Dilger
@ 2018-08-02  3:46   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: James Simmons @ 2018-08-02  3:46 UTC (permalink / raw)
  To: lustre-devel


> The new_bkts array is *not* zeroed (any more) so when
> freeing recently allocated buckets on failure, we
> must not free beyond the last bucket successfully
> allocated.

Reviewed-by: James Simmons <jsimmons@infradead.org>
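
The pattern being fixed, as a userspace sketch: when the pointer array is not zeroed, the error path may only free the slots in [old_size, i), because everything from i onward is uninitialised. This is a simplified illustration and not the libcfs code. grow_buckets(), bucket_alloc(), bucket_free() and the alloc_budget fault-injection counter are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int alloc_budget;	/* hypothetical fault injection: allocations left */
static int live_buckets;	/* buckets currently allocated */

static void *bucket_alloc(size_t size)
{
	void *p;

	if (alloc_budget <= 0)
		return NULL;
	alloc_budget--;
	p = calloc(1, size);
	if (p)
		live_buckets++;
	return p;
}

static void bucket_free(void *p)
{
	live_buckets--;
	free(p);
}

/* Grows the bucket array; on failure frees only what this call allocated. */
static void **grow_buckets(void **old_bkts, unsigned int old_size,
			   unsigned int new_size, size_t bkt_size)
{
	void **new_bkts;
	unsigned int i, j;

	assert(new_size > old_size);
	new_bkts = malloc(new_size * sizeof(*new_bkts));	/* not zeroed */
	if (!new_bkts)
		return NULL;
	if (old_size)
		memcpy(new_bkts, old_bkts, old_size * sizeof(*new_bkts));

	for (i = old_size; i < new_size; i++) {
		new_bkts[i] = bucket_alloc(bkt_size);
		if (!new_bkts[i]) {
			/* Slots beyond i were never written, so stop at i. */
			for (j = old_size; j < i; j++)
				bucket_free(new_bkts[j]);
			free(new_bkts);
			return NULL;
		}
	}
	return new_bkts;
}
```

Using new_size instead of i as the upper bound in the error loop would pass uninitialised pointers to free(), which is the bug the one-line change addresses.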
 
> Fixes: 12e46c461cb9 ("staging: lustre: change some LIBCFS_ALLOC calls to k?alloc(GFP_KERNEL)")
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  drivers/staging/lustre/lnet/libcfs/hash.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/hash.c b/drivers/staging/lustre/lnet/libcfs/hash.c
> index 48be66f0d654..f452c4540ca1 100644
> --- a/drivers/staging/lustre/lnet/libcfs/hash.c
> +++ b/drivers/staging/lustre/lnet/libcfs/hash.c
> @@ -904,7 +904,7 @@ cfs_hash_buckets_realloc(struct cfs_hash *hs, struct cfs_hash_bucket **old_bkts,
>  		new_bkts[i] = kzalloc(cfs_hash_bkt_size(hs), GFP_KERNEL);
>  		if (!new_bkts[i]) {
>  			cfs_hash_buckets_free(new_bkts, cfs_hash_bkt_size(hs),
> -					      old_size, new_size);
> +					      old_size, i);
>  			return NULL;
>  		}
>  
> 
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init
  2018-07-30  3:49 ` [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init NeilBrown
  2018-07-30 21:32   ` Andreas Dilger
@ 2018-08-02  3:46   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: James Simmons @ 2018-08-02  3:46 UTC (permalink / raw)
  To: lustre-devel


> Large memory allocations should be avoided at module-init,
> but registering services is appropriate.
> So move the registration of debugfs files
> back into libcfs_init().
> Without this, /sys/kernel/debug/lnet etc are not visible
> immediately that libcfs is loaded.
> No debugfs file access needs anything allocated by libcfs_setup().

Reviewed-by: James Simmons <jsimmons@infradead.org>
 
> Fixes: 64bf0b1a079d ("staging: lustre: refactor libcfs initialization.")
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  drivers/staging/lustre/lnet/libcfs/module.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
> index bfadfcfa3c44..5d2be941777e 100644
> --- a/drivers/staging/lustre/lnet/libcfs/module.c
> +++ b/drivers/staging/lustre/lnet/libcfs/module.c
> @@ -719,10 +719,6 @@ int libcfs_setup(void)
>  		goto err;
>  	}
>  
> -	lnet_insert_debugfs(lnet_table);
> -	if (!IS_ERR_OR_NULL(lnet_debugfs_root))
> -		lnet_insert_debugfs_links(lnet_debugfs_symlinks);
> -
>  	CDEBUG(D_OTHER, "portals setup OK\n");
>  out:
>  	libcfs_active = 1;
> @@ -743,6 +739,10 @@ static int libcfs_init(void)
>  {
>  	int rc;
>  
> +	lnet_insert_debugfs(lnet_table);
> +	if (!IS_ERR_OR_NULL(lnet_debugfs_root))
> +		lnet_insert_debugfs_links(lnet_debugfs_symlinks);
> +
>  	rc = misc_register(&libcfs_dev);
>  	if (rc)
>  		CERROR("misc_register: error %d\n", rc);
> 
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes.
  2018-07-30  3:49 ` [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes NeilBrown
  2018-07-30 21:32   ` Andreas Dilger
@ 2018-08-02  3:46   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: James Simmons @ 2018-08-02  3:46 UTC (permalink / raw)
  To: lustre-devel


> There are three different trace contexts:
>  process, softirq, irq.
> Each has its own lock (tcd_lock) which is locked
> as appropriate for that context.
> lockdep currently doesn't see that they are different
> and so deduces that the different uses might lead to
> deadlocks.
> So use separate calls to spin_lock_init() so that they
> each get a separate lock class, and lockdep sees no
> problem.

Reviewed-by: James Simmons <jsimmons@infradead.org>
 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  drivers/staging/lustre/lnet/libcfs/tracefile.c |   18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> index d4c80cf254e4..40048165fc16 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> @@ -1285,7 +1285,23 @@ int cfs_tracefile_init(int max_pages)
>  	cfs_tcd_for_each(tcd, i, j) {
>  		int factor = pages_factor[i];
>  
> -		spin_lock_init(&tcd->tcd_lock);
> +		/* Note that we have three separate calls so
> +		 * that the locks get three separate classes
> +		 * and lockdep never thinks they are related.
> +		 * As they are used in different interrupt
> +		 * contexts, lockdep would otherwise think the usage conflicts.
> +		 */
> +		switch(i) {
> +		case CFS_TCD_TYPE_PROC:
> +			spin_lock_init(&tcd->tcd_lock);
> +			break;
> +		case CFS_TCD_TYPE_SOFTIRQ:
> +			spin_lock_init(&tcd->tcd_lock);
> +			break;
> +		case CFS_TCD_TYPE_IRQ:
> +			spin_lock_init(&tcd->tcd_lock);
> +			break;
> +		}
>  		tcd->tcd_pages_factor = factor;
>  		tcd->tcd_type = i;
>  		tcd->tcd_cpu = j;
> 
> 
> 
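
Why three textually separate calls matter can be shown with a userspace sketch. In the kernel, spin_lock_init() is a macro that declares a static lock_class_key at each place it is expanded, so each call site yields its own class. The code below imitates only that mechanism; struct fake_lock, fake_lock_init() and the TYPE_* constants are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>

/* Hypothetical stand-ins for lock_class_key and spinlock_t. */
struct lock_class_key { char dummy; };
struct fake_lock { const struct lock_class_key *key; };

/* A macro, so every textual use gets its own static key object. */
#define fake_lock_init(lock)				\
	do {						\
		static struct lock_class_key __key;	\
		(lock)->key = &__key;			\
	} while (0)

enum { TYPE_PROC, TYPE_SOFTIRQ, TYPE_IRQ };

/* One call site: every lock ends up with the same key (class). */
static void init_shared(struct fake_lock *l, int type)
{
	(void)type;
	fake_lock_init(l);
}

/* Three call sites: one distinct key (class) per type. */
static void init_per_type(struct fake_lock *l, int type)
{
	switch (type) {
	case TYPE_PROC:
		fake_lock_init(l);
		break;
	case TYPE_SOFTIRQ:
		fake_lock_init(l);
		break;
	case TYPE_IRQ:
		fake_lock_init(l);
		break;
	default:
		assert(0 && "unknown type");
	}
}
```

Locks initialised through init_shared() all point at one key, while init_per_type() hands out three different keys even though the three statements look identical.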


* [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer()
  2018-07-30  3:49 ` [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer() NeilBrown
  2018-07-30 21:37   ` Andreas Dilger
@ 2018-08-02  3:47   ` James Simmons
  1 sibling, 0 replies; 24+ messages in thread
From: James Simmons @ 2018-08-02  3:47 UTC (permalink / raw)
  To: lustre-devel


> cfs_trace_allocate_string_buffer() is a simple wrapper
> around kzalloc() that adds little value.  The code is
> clearer if we perform the test and the allocation
> directly where needed.
> 
> Also change the test from '>' to '>=' to ensure we
> never try to allocate more than 2 pages, as that seems
> to be the intent.

Reviewed-by: James Simmons <jsimmons@infradead.org>
 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  drivers/staging/lustre/lnet/libcfs/module.c    |    6 +++--
>  drivers/staging/lustre/lnet/libcfs/tracefile.c |   28 +++++++++---------------
>  drivers/staging/lustre/lnet/libcfs/tracefile.h |    1 -
>  3 files changed, 13 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
> index 5d2be941777e..1de83b1997c6 100644
> --- a/drivers/staging/lustre/lnet/libcfs/module.c
> +++ b/drivers/staging/lustre/lnet/libcfs/module.c
> @@ -305,9 +305,9 @@ static int proc_dobitmasks(struct ctl_table *table, int write,
>  	int is_subsys = (mask == &libcfs_subsystem_debug) ? 1 : 0;
>  	int is_printk = (mask == &libcfs_printk) ? 1 : 0;
>  
> -	rc = cfs_trace_allocate_string_buffer(&tmpstr, tmpstrlen);
> -	if (rc < 0)
> -		return rc;
> +	tmpstr = kzalloc(tmpstrlen, GFP_KERNEL);
> +	if (!tmpstr)
> +		return -ENOMEM;
>  
>  	if (!write) {
>  		libcfs_debug_mask2str(tmpstr, tmpstrlen, *mask, is_subsys);
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.c b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> index 40048165fc16..b273107b3815 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.c
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.c
> @@ -963,26 +963,16 @@ int cfs_trace_copyout_string(char __user *usr_buffer, int usr_buffer_nob,
>  }
>  EXPORT_SYMBOL(cfs_trace_copyout_string);
>  
> -int cfs_trace_allocate_string_buffer(char **str, int nob)
> -{
> -	if (nob > 2 * PAGE_SIZE)	    /* string must be "sensible" */
> -		return -EINVAL;
> -
> -	*str = kmalloc(nob, GFP_KERNEL | __GFP_ZERO);
> -	if (!*str)
> -		return -ENOMEM;
> -
> -	return 0;
> -}
> -
>  int cfs_trace_dump_debug_buffer_usrstr(void __user *usr_str, int usr_str_nob)
>  {
>  	char *str;
>  	int rc;
>  
> -	rc = cfs_trace_allocate_string_buffer(&str, usr_str_nob + 1);
> -	if (rc)
> -		return rc;
> +	if (usr_str_nob >= 2 * PAGE_SIZE)
> +		return -EINVAL;
> +	str = kzalloc(usr_str_nob + 1, GFP_KERNEL);
> +	if (!str)
> +		return -ENOMEM;
>  
>  	rc = cfs_trace_copyin_string(str, usr_str_nob + 1,
>  				     usr_str, usr_str_nob);
> @@ -1044,9 +1034,11 @@ int cfs_trace_daemon_command_usrstr(void __user *usr_str, int usr_str_nob)
>  	char *str;
>  	int rc;
>  
> -	rc = cfs_trace_allocate_string_buffer(&str, usr_str_nob + 1);
> -	if (rc)
> -		return rc;
> +	if (usr_str_nob >= 2 * PAGE_SIZE)
> +		return -EINVAL;
> +	str = kzalloc(usr_str_nob + 1, GFP_KERNEL);
> +	if (!str)
> +		return -ENOMEM;
>  
>  	rc = cfs_trace_copyin_string(str, usr_str_nob + 1,
>  				     usr_str, usr_str_nob);
> diff --git a/drivers/staging/lustre/lnet/libcfs/tracefile.h b/drivers/staging/lustre/lnet/libcfs/tracefile.h
> index 82f090fd8dfa..2134549bb3d7 100644
> --- a/drivers/staging/lustre/lnet/libcfs/tracefile.h
> +++ b/drivers/staging/lustre/lnet/libcfs/tracefile.h
> @@ -63,7 +63,6 @@ int cfs_trace_copyin_string(char *knl_buffer, int knl_buffer_nob,
>  			    const char __user *usr_buffer, int usr_buffer_nob);
>  int cfs_trace_copyout_string(char __user *usr_buffer, int usr_buffer_nob,
>  			     const char *knl_str, char *append);
> -int cfs_trace_allocate_string_buffer(char **str, int nob);
>  int cfs_trace_dump_debug_buffer_usrstr(void __user *usr_str, int usr_str_nob);
>  int cfs_trace_daemon_command(char *str);
>  int cfs_trace_daemon_command_usrstr(void __user *usr_str, int usr_str_nob);
> 
> 
> 
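
The open-coded check and allocation can be imitated in userspace to see where the boundary falls. This is a sketch under assumptions: SKETCH_PAGE_SIZE is fixed at 4096 for illustration, calloc() stands in for kzalloc(), alloc_user_string() is a hypothetical name, and the explicit negative-length test is an addition of this sketch rather than something the patch contains.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096L	/* assumption: 4 KiB pages */

/*
 * Allocate a zeroed buffer for a user string of usr_str_nob bytes
 * plus a terminating NUL.  Returns 0 and sets *out on success,
 * or a negative errno value on failure.
 */
static int alloc_user_string(char **out, long usr_str_nob)
{
	assert(out != NULL);

	/* With '>=' the largest request is 2 * SKETCH_PAGE_SIZE bytes. */
	if (usr_str_nob < 0 || usr_str_nob >= 2 * SKETCH_PAGE_SIZE)
		return -EINVAL;

	*out = calloc(1, (size_t)usr_str_nob + 1);
	if (!*out)
		return -ENOMEM;

	return 0;
}
```

The largest accepted length is 2 * SKETCH_PAGE_SIZE - 1, so the buffer including the NUL never exceeds two pages.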


* [lustre-devel] [PATCH 6/7] lustre: lnet: convert ni_refs to percpu_refcount.
  2018-07-30  3:49 ` [lustre-devel] [PATCH 6/7] lustre: lnet: convert ni_refs to percpu_refcount NeilBrown
@ 2018-08-02  3:47   ` James Simmons
  0 siblings, 0 replies; 24+ messages in thread
From: James Simmons @ 2018-08-02  3:47 UTC (permalink / raw)
  To: lustre-devel


> ni_refs is a per-cpt refcount.
> Linux already has a per-cpu refcount implementation
> which doesn't require any locking.
> 
> So convert ni_refs to percpu_refcount.
> As a bonus, we can get a wake-up when the refcount
> reaches zero, rather than having to wait a full second.
> The waiting in lnet_clear_zombies_nis_locked() is
> modified so that instead of waiting one second each
> time, and printing a warning on power-of-two seconds,
> we wait an increasing power-of-two seconds and print
> a warning if the wait ever timed out.

Reviewed-by: James Simmons <jsimmons@infradead.org>
 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  .../staging/lustre/include/linux/lnet/lib-lnet.h   |   11 +-----
>  .../staging/lustre/include/linux/lnet/lib-types.h  |    2 +
>  drivers/staging/lustre/lnet/lnet/api-ni.c          |   34 +++++++++-----------
>  drivers/staging/lustre/lnet/lnet/config.c          |   13 +++++---
>  drivers/staging/lustre/lnet/lnet/router_proc.c     |    2 +
>  5 files changed, 28 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> index 0fecf0d32c58..371002825a7d 100644
> --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
> @@ -338,34 +338,27 @@ static inline void
>  lnet_ni_addref_locked(struct lnet_ni *ni, int cpt)
>  {
>  	LASSERT(cpt >= 0 && cpt < LNET_CPT_NUMBER);
> -	LASSERT(*ni->ni_refs[cpt] >= 0);
> -
> -	(*ni->ni_refs[cpt])++;
> +	percpu_ref_get(&ni->ni_refs);
>  }
>  
>  static inline void
>  lnet_ni_addref(struct lnet_ni *ni)
>  {
> -	lnet_net_lock(0);
>  	lnet_ni_addref_locked(ni, 0);
> -	lnet_net_unlock(0);
>  }
>  
>  static inline void
>  lnet_ni_decref_locked(struct lnet_ni *ni, int cpt)
>  {
>  	LASSERT(cpt >= 0 && cpt < LNET_CPT_NUMBER);
> -	LASSERT(*ni->ni_refs[cpt] > 0);
>  
> -	(*ni->ni_refs[cpt])--;
> +	percpu_ref_put(&ni->ni_refs);
>  }
>  
>  static inline void
>  lnet_ni_decref(struct lnet_ni *ni)
>  {
> -	lnet_net_lock(0);
>  	lnet_ni_decref_locked(ni, 0);
> -	lnet_net_unlock(0);
>  }
>  
>  void lnet_ni_free(struct lnet_ni *ni);
> diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
> index 6d4106fd9039..7527fef90cac 100644
> --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
> +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
> @@ -269,7 +269,7 @@ struct lnet_ni {
>  	void			 *ni_data;	/* instance-specific data */
>  	struct lnet_lnd		 *ni_lnd;	/* procedural interface */
>  	struct lnet_tx_queue	**ni_tx_queues;	/* percpt TX queues */
> -	int			**ni_refs;	/* percpt reference count */
> +	struct percpu_ref	  ni_refs;
>  	time64_t		  ni_last_alive;/* when I was last alive */
>  	struct lnet_ni_status	 *ni_status;	/* my health status */
>  	/* per NI LND tunables */
> diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
> index cdbbe9cc8d95..fea03737439a 100644
> --- a/drivers/staging/lustre/lnet/lnet/api-ni.c
> +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
> @@ -1055,7 +1055,7 @@ lnet_ni_unlink_locked(struct lnet_ni *ni)
>  	/* move it to zombie list and nobody can find it anymore */
>  	LASSERT(!list_empty(&ni->ni_list));
>  	list_move(&ni->ni_list, &the_lnet.ln_nis_zombie);
> -	lnet_ni_decref_locked(ni, 0);	/* drop ln_nis' ref */
> +	percpu_ref_kill_and_confirm(&ni->ni_refs, NULL);	/* drop ln_nis' ref */
>  }
>  
>  static void
> @@ -1069,34 +1069,32 @@ lnet_clear_zombies_nis_locked(void)
>  	 * Now wait for the NI's I just nuked to show up on ln_zombie_nis
>  	 * and shut them down in guaranteed thread context
>  	 */
> -	i = 2;
> +	i = 1;
>  	while (!list_empty(&the_lnet.ln_nis_zombie)) {
> -		int *ref;
> -		int j;
>  
>  		ni = list_entry(the_lnet.ln_nis_zombie.next,
>  				struct lnet_ni, ni_list);
> -		list_del_init(&ni->ni_list);
> -		cfs_percpt_for_each(ref, j, ni->ni_refs) {
> -			if (!*ref)
> -				continue;
> -			/* still busy, add it back to zombie list */
> -			list_add(&ni->ni_list, &the_lnet.ln_nis_zombie);
> -			break;
> -		}
>  
> -		if (!list_empty(&ni->ni_list)) {
> +		if (!percpu_ref_is_zero(&ni->ni_refs)) {
> +			/* still busy, wait a while */
> +
>  			lnet_net_unlock(LNET_LOCK_EX);
>  			++i;
> -			if ((i & (-i)) == i) {
> +
> +			if (wait_var_event_timeout(
> +				    &ni->ni_refs,
> +				    percpu_ref_is_zero(&ni->ni_refs),
> +				    HZ << i) == 0)
>  				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
>  				       libcfs_nid2str(ni->ni_nid));
> -			}
> +
>  			schedule_timeout_uninterruptible(HZ);
>  			lnet_net_lock(LNET_LOCK_EX);
>  			continue;
>  		}
>  
> +		list_del_init(&ni->ni_list);
> +
>  		ni->ni_lnd->lnd_refcount--;
>  		lnet_net_unlock(LNET_LOCK_EX);
>  
> @@ -1114,7 +1112,7 @@ lnet_clear_zombies_nis_locked(void)
>  			       libcfs_nid2str(ni->ni_nid));
>  
>  		lnet_ni_free(ni);
> -		i = 2;
> +		i = 1;
>  
>  		lnet_net_lock(LNET_LOCK_EX);
>  	}
> @@ -1305,8 +1303,8 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf)
>  	LASSERT(ni->ni_peertimeout <= 0 || lnd->lnd_query);
>  
>  	lnet_net_lock(LNET_LOCK_EX);
> -	/* refcount for ln_nis */
> -	lnet_ni_addref_locked(ni, 0);
> +	/* Initialise refcount for ln_nis to 1 */
> +	percpu_ref_reinit(&ni->ni_refs);
>  	list_add_tail(&ni->ni_list, &the_lnet.ln_nis);
>  	if (ni->ni_cpts) {
>  		lnet_ni_addref_locked(ni, 0);
> diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c
> index 091c4f714e84..4145c7431576 100644
> --- a/drivers/staging/lustre/lnet/lnet/config.c
> +++ b/drivers/staging/lustre/lnet/lnet/config.c
> @@ -96,8 +96,7 @@ lnet_ni_free(struct lnet_ni *ni)
>  {
>  	int i;
>  
> -	if (ni->ni_refs)
> -		cfs_percpt_free(ni->ni_refs);
> +	percpu_ref_exit(&ni->ni_refs);
>  
>  	if (ni->ni_tx_queues)
>  		cfs_percpt_free(ni->ni_tx_queues);
> @@ -117,6 +116,11 @@ lnet_ni_free(struct lnet_ni *ni)
>  	kfree(ni);
>  }
>  
> +static void ref_release(struct percpu_ref *ref)
> +{
> +	wake_up_var(ref);
> +}
> +
>  struct lnet_ni *
>  lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist)
>  {
> @@ -140,9 +144,8 @@ lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist)
>  
>  	spin_lock_init(&ni->ni_lock);
>  	INIT_LIST_HEAD(&ni->ni_cptlist);
> -	ni->ni_refs = cfs_percpt_alloc(lnet_cpt_table(),
> -				       sizeof(*ni->ni_refs[0]));
> -	if (!ni->ni_refs)
> +	if (percpu_ref_init(&ni->ni_refs, ref_release,
> +			    PERCPU_REF_INIT_DEAD, GFP_KERNEL) < 0)
>  		goto failed;
>  
>  	ni->ni_tx_queues = cfs_percpt_alloc(lnet_cpt_table(),
> diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
> index d779445fefb5..8856798d263f 100644
> --- a/drivers/staging/lustre/lnet/lnet/router_proc.c
> +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
> @@ -703,7 +703,7 @@ static int proc_lnet_nis(struct ctl_table *table, int write,
>  				s += snprintf(s, tmpstr + tmpsiz - s,
>  					      "%-24s %6s %5lld %4d %4d %4d %5d %5d %5d\n",
>  					      libcfs_nid2str(ni->ni_nid), stat,
> -					      last_alive, *ni->ni_refs[i],
> +					      last_alive, 0/* No per-cpt refcount */,
>  					      ni->ni_peertxcredits,
>  					      ni->ni_peerrtrcredits,
>  					      tq->tq_credits_max,
> 
> 
> 
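
The reference lifecycle the patch relies on (an initial reference, a "kill" that drops it, and a release callback that fires when the count reaches zero) can be sketched with a single C11 atomic counter. This is deliberately not the per-cpu implementation; struct sketch_ref and the sketch_ref_*() helpers are hypothetical, and sketch_release() only counts calls where the patch's ref_release() would call wake_up_var().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_ref {
	atomic_long count;
	void (*release)(struct sketch_ref *ref);
};

/* Stand-in for the wake-up done by the patch's ref_release(). */
static int sketch_wakeups;

static void sketch_release(struct sketch_ref *ref)
{
	(void)ref;
	sketch_wakeups++;
}

/* Start live, holding one initial reference. */
static void sketch_ref_init(struct sketch_ref *ref,
			    void (*release)(struct sketch_ref *))
{
	atomic_init(&ref->count, 1);
	ref->release = release;
}

static void sketch_ref_get(struct sketch_ref *ref)
{
	long old = atomic_fetch_add(&ref->count, 1);

	assert(old > 0);
	(void)old;
}

/* Drop one reference; the last put triggers the release callback. */
static void sketch_ref_put(struct sketch_ref *ref)
{
	long old = atomic_fetch_sub(&ref->count, 1);

	assert(old > 0);
	if (old == 1)
		ref->release(ref);
}

/* "Kill" drops the initial reference taken at init time. */
static void sketch_ref_kill(struct sketch_ref *ref)
{
	sketch_ref_put(ref);
}

static bool sketch_ref_is_zero(struct sketch_ref *ref)
{
	return atomic_load(&ref->count) == 0;
}
```

A waiter in this model would sleep until sketch_ref_is_zero() holds and be woken from the release callback, instead of polling once per second.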


* [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE.
  2018-07-30  3:49 ` [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE NeilBrown
  2018-07-30 21:43   ` Andreas Dilger
@ 2018-08-02  3:48   ` James Simmons
  2018-08-02  4:18     ` NeilBrown
  1 sibling, 1 reply; 24+ messages in thread
From: James Simmons @ 2018-08-02  3:48 UTC (permalink / raw)
  To: lustre-devel


> TASK_NOLOAD is not a task state to be used by
> itself; it should only be used together with
> TASK_UNINTERRUPTIBLE, which is easily done
> by using TASK_IDLE.
> 
> So convert to TASK_IDLE.

Sad that only the latest kernels support this :-(

Reviewed-by: James Simmons <jsimmons@infradead.org>
 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  drivers/staging/lustre/lnet/lnet/lib-eq.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/lnet/lib-eq.c b/drivers/staging/lustre/lnet/lnet/lib-eq.c
> index 8347cc44e47d..f085388895ea 100644
> --- a/drivers/staging/lustre/lnet/lnet/lib-eq.c
> +++ b/drivers/staging/lustre/lnet/lnet/lib-eq.c
> @@ -349,7 +349,7 @@ __must_hold(&the_lnet.ln_eq_wait_lock)
>   * \param timeout Time in jiffies to wait for an event to occur on
>   * one of the EQs. The constant MAX_SCHEDULE_TIMEOUT can be used to indicate an
>   * infinite timeout.
> - * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_NOLOAD
> + * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_IDLE
>   * \param event,which On successful return (1 or -EOVERFLOW), \a event will
>   * hold the next event in the EQs, and \a which will contain the index of the
>   * EQ from which the event was taken.
> @@ -406,7 +406,7 @@ LNetEQPoll(struct lnet_handle_eq *eventqs, int neq, signed long timeout,
>  		 */
>  		wait = lnet_eq_wait_locked(&timeout,
>  					   interruptible ? TASK_INTERRUPTIBLE
> -					   : TASK_NOLOAD);
> +					   : TASK_IDLE);
>  		if (wait < 0) /* no new event */
>  			break;
>  	}
> 
> 
> 
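
The reason TASK_NOLOAD is not meant to stand alone can be seen from the state bits. The sketch below reproduces the flag values as I recall them from include/linux/sched.h of that era, so treat them as assumptions. The two helpers are simplified, hypothetical models of the scheduler's wake-up mask test and load-average test, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, reproduced here for illustration only. */
#define TASK_INTERRUPTIBLE	0x0001
#define TASK_UNINTERRUPTIBLE	0x0002
#define TASK_NOLOAD		0x0400

#define TASK_IDLE	(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)
#define TASK_NORMAL	(TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

static_assert(TASK_IDLE == 0x0402, "TASK_IDLE combines both bits");

/* A sleeper is woken only if its state overlaps the waker's mask. */
static bool wake_mask_matches(unsigned int task_state,
			      unsigned int wake_mask)
{
	return (task_state & wake_mask) != 0;
}

/* Uninterruptible sleepers count toward load unless NOLOAD is set. */
static bool counts_toward_load(unsigned int task_state)
{
	return (task_state & TASK_UNINTERRUPTIBLE) &&
	       !(task_state & TASK_NOLOAD);
}
```

In this model a task sleeping with only TASK_NOLOAD set does not match TASK_NORMAL, the mask a plain wake_up() uses as far as I understand, whereas TASK_IDLE is wakeable and still stays out of the load average.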


* [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE.
  2018-08-02  3:48   ` James Simmons
@ 2018-08-02  4:18     ` NeilBrown
  2018-08-02 12:48       ` Patrick Farrell
  0 siblings, 1 reply; 24+ messages in thread
From: NeilBrown @ 2018-08-02  4:18 UTC (permalink / raw)
  To: lustre-devel

On Thu, Aug 02 2018, James Simmons wrote:

>> TASK_NOLOAD is not a task state to be used by
>> itself; it should only be used together with
>> TASK_UNINTERRUPTIBLE, which is easily done
>> by using TASK_IDLE.
>> 
>> So convert to TASK_IDLE.
>
> Sad that only the latest kernels support this :-(

So?  The patch to add support is trivial.

commit 80ed87c8a9ca0cad7ca66cf3bbdfb17559a66dcf
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri May 8 14:23:45 2015 +0200

    sched/wait: Introduce TASK_NOLOAD and TASK_IDLE

....
 include/linux/sched.h        | 10 +++++++---
 include/trace/events/sched.h |  3 ++-
 2 files changed, 9 insertions(+), 4 deletions(-)

You would then need to add some macros like wait_event_idle(), but they
could go in the lustre code.

If you need it in any vendor kernel I'd be quite surprised if they
wouldn't accept it, at least in a service-pack.
I just checked SLES and the only release that isn't in long-term-support
(no new features) that doesn't already have this patch (linux 4.2 and
later) is SLE11-SP4.
Do you still provide new features for older kernels, or just bug-fix
updates?

NeilBrown

>
> Reviewed-by: James Simmons <jsimmons@infradead.org>
>  
>> Signed-off-by: NeilBrown <neilb@suse.com>
>> ---
>>  drivers/staging/lustre/lnet/lnet/lib-eq.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/staging/lustre/lnet/lnet/lib-eq.c b/drivers/staging/lustre/lnet/lnet/lib-eq.c
>> index 8347cc44e47d..f085388895ea 100644
>> --- a/drivers/staging/lustre/lnet/lnet/lib-eq.c
>> +++ b/drivers/staging/lustre/lnet/lnet/lib-eq.c
>> @@ -349,7 +349,7 @@ __must_hold(&the_lnet.ln_eq_wait_lock)
>>   * \param timeout Time in jiffies to wait for an event to occur on
>>   * one of the EQs. The constant MAX_SCHEDULE_TIMEOUT can be used to indicate an
>>   * infinite timeout.
>> - * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_NOLOAD
>> + * \param interruptible, if true, use TASK_INTERRUPTIBLE, else TASK_IDLE
>>   * \param event,which On successful return (1 or -EOVERFLOW), \a event will
>>   * hold the next event in the EQs, and \a which will contain the index of the
>>   * EQ from which the event was taken.
>> @@ -406,7 +406,7 @@ LNetEQPoll(struct lnet_handle_eq *eventqs, int neq, signed long timeout,
>>  		 */
>>  		wait = lnet_eq_wait_locked(&timeout,
>>  					   interruptible ? TASK_INTERRUPTIBLE
>> -					   : TASK_NOLOAD);
>> +					   : TASK_IDLE);
>>  		if (wait < 0) /* no new event */
>>  			break;
>>  	}
>> 
>> 
>> 

* [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE.
  2018-08-02  4:18     ` NeilBrown
@ 2018-08-02 12:48       ` Patrick Farrell
  0 siblings, 0 replies; 24+ messages in thread
From: Patrick Farrell @ 2018-08-02 12:48 UTC (permalink / raw)
  To: lustre-devel

We generally provide full updates for older kernels, until we don't.  There's a gradually moving window of kernel support for new versions of Lustre, with a heavy eye on the major enterprise distro versions.

But when a feature depends on something new, we wrap it in config stuff and either disable it when not found, or we copy the required bits in to Lustre if they're small or hard to do without.

So, kind of what you're suggesting.  No problem at all.

Thread overview: 24+ messages
2018-07-30  3:49 [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes NeilBrown
2018-07-30  3:49 ` [lustre-devel] [PATCH 5/7] lustre/libcfs: discard cfs_trace_allocate_string_buffer() NeilBrown
2018-07-30 21:37   ` Andreas Dilger
2018-08-02  3:47   ` James Simmons
2018-07-30  3:49 ` [lustre-devel] [PATCH 3/7] lustre/libfs: move debugfs registration from libcfs_setup back to libcfs_init NeilBrown
2018-07-30 21:32   ` Andreas Dilger
2018-08-02  3:46   ` James Simmons
2018-07-30  3:49 ` [lustre-devel] [PATCH 1/7] lustre: use schedule_timeout_$state() NeilBrown
2018-07-30 21:30   ` Andreas Dilger
2018-08-02  3:45   ` James Simmons
2018-07-30  3:49 ` [lustre-devel] [PATCH 4/7] lustre: give different tcd_lock types different classes NeilBrown
2018-07-30 21:32   ` Andreas Dilger
2018-08-02  3:46   ` James Simmons
2018-07-30  3:49 ` [lustre-devel] [PATCH 2/7] lustre/libcfs: fix freeing after kmalloc failure NeilBrown
2018-07-30 21:31   ` Andreas Dilger
2018-08-02  3:46   ` James Simmons
2018-07-30  3:49 ` [lustre-devel] [PATCH 7/7] lustre: change TASK_NOLOAD to TASK_IDLE NeilBrown
2018-07-30 21:43   ` Andreas Dilger
2018-08-02  3:48   ` James Simmons
2018-08-02  4:18     ` NeilBrown
2018-08-02 12:48       ` Patrick Farrell
2018-07-30  3:49 ` [lustre-devel] [PATCH 6/7] lustre: lnet: convert ni_refs to percpu_refcount NeilBrown
2018-08-02  3:47   ` James Simmons
2018-07-30 21:45 ` [lustre-devel] [PATCH 0/7] lustre: ad-hoc fixes Andreas Dilger