From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Chris Horn <chris.horn@hpe.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 12/13] lnet: Correct the router ping interval calculation
Date: Sat, 15 May 2021 09:06:09 -0400
Message-ID: <1621083970-32463-13-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org>

From: Chris Horn <chris.horn@hpe.com>

The router ping interval is being divided by the number of local nets,
which results in sending pings more frequently than defined by
alive_router_check_interval. In addition, the current code is structured
such that we may not find a peer net in need of a ping until after
inspecting the router list multiple times. Rework the code so that the
loop that inspects a router's peer nets looks at all of them until
it either loops back around the list or finds one that actually
needs to be pinged.

We also move the check of LNET_PEER_RTR_DISCOVERY so that we avoid the
work of inspecting the router's peer nets if the router is already being
discovered.

HPE-bug-id: LUS-9237
WC-bug-id: https://jira.whamcloud.com/browse/LU-13912
Lustre-commit: 0131d39a622f1efc ("LU-13912 lnet: Correct the router ping interval calculation")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/39694
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-types.h |  4 +--
 net/lnet/lnet/router.c         | 57 ++++++++++++++++++++++++++----------------
 2 files changed, 37 insertions(+), 24 deletions(-)
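
Note (illustration only, not part of the patch): the sketch below restates
the two "is a ping due?" tests described in the commit message, plus a
simplified circular scan over a router's peer nets. struct peer_net,
old_needs_ping(), new_needs_ping() and find_net_to_ping() are hypothetical
names made up for this example; they are not LNet symbols, and the array
walk is an assumed stand-in for the kernel's linked-list traversal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct lnet_peer_net: only what the sketch needs. */
struct peer_net {
	bool    local;      /* is this peer net also one of our local nets? */
	int64_t next_ping;  /* earliest time (seconds) the next ping is due */
};

/* Old test: the interval was divided by the local net count, so with
 * three nets and a 60 s interval a net became eligible after only 20 s.
 */
static bool old_needs_ping(int64_t now, int64_t last_check,
			   int64_t interval, int64_t net_count)
{
	assert(interval > 0 && net_count > 0);
	return now - last_check >= interval / net_count;
}

/* New test: each net carries an absolute deadline, set to
 * now + interval after a successful ping.
 */
static bool new_needs_ping(int64_t now, int64_t next_ping)
{
	return now >= next_ping;
}

/* Walk every peer net once, starting after index 'start' and wrapping
 * around, and return the first local net whose deadline has passed.
 * Returns -1 if no local net is due.
 */
static int find_net_to_ping(const struct peer_net *nets, int n,
			    int start, int64_t now)
{
	for (int i = 1; i <= n; i++) {
		int idx = (start + i) % n;

		if (nets[idx].local && new_needs_ping(now, nets[idx].next_ping))
			return idx;
	}
	return -1;
}
```

With a 60 second interval and three local nets, the old test reports a net
as due 25 seconds after its last check, while the new test waits until the
full 60 seconds have elapsed.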

diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index f199b15..d898066 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -798,8 +798,8 @@ struct lnet_peer_net {
 	/* peer net health */
 	int			lpn_healthv;
 
-	/* time of last router net check attempt */
-	time64_t		lpn_rtrcheck_timestamp;
+	/* time of next router ping on this net */
+	time64_t		lpn_next_ping;
 
 	/* selection sequence number */
 	u32			lpn_seq;
diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index e179997..9003d47 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -603,6 +603,7 @@ static void lnet_shuffle_seed(void)
 	unsigned int offset = 0;
 	unsigned int len = 0;
 	struct list_head *e;
+	time64_t now;
 
 	lnet_shuffle_seed();
 
@@ -623,9 +624,10 @@ static void lnet_shuffle_seed(void)
 	/* force a router check on the gateway to make sure the route is
 	 * alive
 	 */
+	now = ktime_get_real_seconds();
 	list_for_each_entry(lpn, &route->lr_gateway->lp_peer_nets,
 			    lpn_peer_nets) {
-		lpn->lpn_rtrcheck_timestamp = 0;
+		lpn->lpn_next_ping = now;
 	}
 
 	the_lnet.ln_remote_nets_version++;
@@ -1105,11 +1107,12 @@ bool lnet_router_checker_active(void)
 void
 lnet_check_routers(void)
 {
-	struct lnet_peer_net *first_lpn = NULL;
+	struct lnet_peer_net *first_lpn;
 	struct lnet_peer_net *lpn;
 	struct lnet_peer_ni *lpni;
 	struct lnet_peer *rtr;
 	bool push = false;
+	bool needs_ping;
 	bool found_lpn;
 	u64 version;
 	u32 net_id;
@@ -1122,14 +1125,18 @@ bool lnet_router_checker_active(void)
 	version = the_lnet.ln_routers_version;
 
 	list_for_each_entry(rtr, &the_lnet.ln_routers, lp_rtr_list) {
+		/* If we're currently discovering the peer then don't
+		 * issue another discovery
+		 */
+		if (rtr->lp_state & LNET_PEER_RTR_DISCOVERY)
+			continue;
+
 		now = ktime_get_real_seconds();
 
-		/* only discover the router if we've passed
-		 * alive_router_check_interval seconds. Some of the router
-		 * interfaces could be down and in that case they would be
-		 * undergoing recovery separately from this discovery.
-		 */
-		/* find next peer net which is also local */
+		/* find the next local peer net which needs to be ping'd */
+		needs_ping = false;
+		first_lpn = NULL;
+		found_lpn = false;
 		net_id = rtr->lp_disc_net_id;
 		do {
 			lpn = lnet_get_next_peer_net_locked(rtr, net_id);
@@ -1138,13 +1145,27 @@ bool lnet_router_checker_active(void)
 				       libcfs_nid2str(rtr->lp_primary_nid));
 				break;
 			}
+
+			/* We looped back to the first peer net */
 			if (first_lpn == lpn)
 				break;
 			if (!first_lpn)
 				first_lpn = lpn;
-			found_lpn = lnet_islocalnet_locked(lpn->lpn_net_id);
+
 			net_id = lpn->lpn_net_id;
-		} while (!found_lpn);
+			if (!lnet_islocalnet_locked(net_id))
+				continue;
+
+			found_lpn = true;
+
+			CDEBUG(D_NET, "rtr %s(%p) %s(%p) next ping %lld\n",
+			       libcfs_nid2str(rtr->lp_primary_nid), rtr,
+			       libcfs_net2str(net_id), lpn,
+			       lpn->lpn_next_ping);
+
+			needs_ping = now >= lpn->lpn_next_ping;
+
+		} while (!needs_ping);
 
 		if (!found_lpn || !lpn) {
 			CERROR("no local network found for gateway %s\n",
@@ -1152,18 +1173,10 @@ bool lnet_router_checker_active(void)
 			continue;
 		}
 
-		if (now - lpn->lpn_rtrcheck_timestamp <
-		    alive_router_check_interval / lnet_current_net_count)
+		if (!needs_ping)
 			continue;
 
-		/* If we're currently discovering the peer then don't
-		 * issue another discovery
-		 */
 		spin_lock(&rtr->lp_lock);
-		if (rtr->lp_state & LNET_PEER_RTR_DISCOVERY) {
-			spin_unlock(&rtr->lp_lock);
-			continue;
-		}
 		/* make sure we fully discover the router */
 		rtr->lp_state &= ~LNET_PEER_NIDS_UPTODATE;
 		rtr->lp_state |= LNET_PEER_FORCE_PING | LNET_PEER_FORCE_PUSH |
@@ -1188,16 +1201,16 @@ bool lnet_router_checker_active(void)
 		       libcfs_nid2str(lpni->lpni_nid), cpt);
 		rc = lnet_discover_peer_locked(lpni, cpt, false);
 
-		/* decrement ref count acquired by find_peer_ni_locked() */
+		/* drop ref taken above */
 		lnet_peer_ni_decref_locked(lpni);
 
 		if (!rc)
-			lpn->lpn_rtrcheck_timestamp = now;
+			lpn->lpn_next_ping = now + alive_router_check_interval;
 		else
 			CERROR("Failed to discover router %s\n",
 			       libcfs_nid2str(rtr->lp_primary_nid));
 
-		/* NB dropped lock */
+		/* NB cpt lock was dropped in lnet_discover_peer_locked() */
 		if (version != the_lnet.ln_routers_version) {
 			/* the routers list has changed */
 			goto rescan;
-- 
1.8.3.1

