lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] [PATCH 13/22] lnet: handle discovery off properly
Date: Tue,  2 Jun 2020 20:59:52 -0400	[thread overview]
Message-ID: <1591146001-27171-14-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1591146001-27171-1-git-send-email-jsimmons@infradead.org>

From: Amir Shehata <ashehata@whamcloud.com>

Peers need to only be updated when discovery is toggled from
on to off. This way the peers don't attempt to send to a
non-primary NID of the node. However, when discovery is
toggled from off to on, the peer will attempt rediscovery
and the peer information will eventually consolidate.

In order to properly delete the peer only when it makes sense
we have to differentiate between the case when we get the
initial message and when we get a push for an already discovered
peer. We only want to delete our local representation if the peer
is one we have already had in our records.

WC-bug-id: https://jira.whamcloud.com/browse/LU-13478
Lustre-commit: adae4295b62b1 ("LU-13478 lnet: handle discovery off properly")
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38321
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/api-ni.c | 26 +++++++-------------------
 net/lnet/lnet/peer.c   | 12 ++++++++----
 2 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 6f19e63..a966e64 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -270,7 +270,7 @@ static int lnet_discover(struct lnet_process_id id, u32 force,
 discovery_set(const char *val, const struct kernel_param *kp)
 {
 	int rc;
-	unsigned int *discovery = (unsigned int *)kp->arg;
+	unsigned int *discovery_off = (unsigned int *)kp->arg;
 	unsigned long value;
 	struct lnet_ping_buffer *pbuf;
 
@@ -288,7 +288,7 @@ static int lnet_discover(struct lnet_process_id id, u32 force,
 	 */
 	mutex_lock(&the_lnet.ln_api_mutex);
 
-	if (value == *discovery) {
+	if (value == *discovery_off) {
 		mutex_unlock(&the_lnet.ln_api_mutex);
 		return 0;
 	}
@@ -300,7 +300,7 @@ static int lnet_discover(struct lnet_process_id id, u32 force,
 	 * updating the peers
 	 */
 	if (the_lnet.ln_state == LNET_STATE_SHUTDOWN) {
-		*discovery = value;
+		*discovery_off = value;
 		mutex_unlock(&the_lnet.ln_api_mutex);
 		return 0;
 	}
@@ -314,22 +314,10 @@ static int lnet_discover(struct lnet_process_id id, u32 force,
 		pbuf->pb_info.pi_features |= LNET_PING_FEAT_DISCOVERY;
 	lnet_net_unlock(LNET_LOCK_EX);
 
-	/* Always update the peers. This will result in a push to the
-	 * peers with the updated capabilities feature mask. The peer can
-	 * then take appropriate action to update its representation of
-	 * the node.
-	 *
-	 * If discovery is already off, turn it on first before pushing
-	 * the update. The discovery flag must be on before pushing.
-	 * otherwise if the flag is on and we're turning it off then push
-	 * first before turning the flag off. In the former case the flag
-	 * is being set twice, but I find it's better to do that rather
-	 * than have duplicate code in an if/else statement.
-	 */
-	if (*discovery > 0 && value == 0)
-		*discovery = value;
-	lnet_push_update_to_peers(1);
-	*discovery = value;
+	/* only send a push when we're turning off discovery */
+	if (*discovery_off <= 0 && value > 0)
+		lnet_push_update_to_peers(1);
+	*discovery_off = value;
 
 	mutex_unlock(&the_lnet.ln_api_mutex);
 
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index b2065bd..60749da 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -2018,13 +2018,17 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_DISCOVERY)) {
 		CDEBUG(D_NET, "Peer %s has discovery disabled\n",
 		       libcfs_nid2str(lp->lp_primary_nid));
-		/* If the peer is going from discovery enabled to
-		 * discovery disabled, we need to reflect that in our
-		 * representation of the peer.
+		/* Mark the peer for deletion if we already know about it
+		 * and it's going from discovery set to no discovery set
 		 */
 		if (!(lp->lp_state & (LNET_PEER_NO_DISCOVERY |
-				      LNET_PEER_DISCOVERING)))
+				      LNET_PEER_DISCOVERING)) &&
+		    lp->lp_state & LNET_PEER_DISCOVERED) {
+			CDEBUG(D_NET, "Marking %s:0x%x for deletion\n",
+			       libcfs_nid2str(lp->lp_primary_nid),
+			       lp->lp_state);
 			lp->lp_state |= LNET_PEER_MARK_DELETION;
+		}
 		lp->lp_state |= LNET_PEER_NO_DISCOVERY;
 	} else if (lp->lp_state & LNET_PEER_NO_DISCOVERY) {
 		CDEBUG(D_NET, "Peer %s has discovery enabled\n",
-- 
1.8.3.1

  parent reply	other threads:[~2020-06-03  0:59 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-03  0:59 [lustre-devel] [PATCH 00/22] lustre: OpenSFS backport patches for May 29 2020 James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 01/22] lnet: libcfs: fix CPT handling for UP systems James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 02/22] lustre: use BIT() macro where appropriate in include James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 03/22] lustre: use BIT() macro where appropriate James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 04/22] lustre: ptlrpc: change LONG_UNLINK to PTLRPC_REQ_LONG_UNLINK James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 05/22] lustre: llite: use %pd to report dentry names James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 06/22] lnet: tidy lnet_discover and fix mem accounting bug James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 07/22] lustre: llite: prevent MAX_DIO_SIZE 32-bit truncation James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 08/22] lustre: llite: integrate statx() API with Lustre James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 09/22] lustre: ldlm: no current source if lu_ref_del not in same tsk James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 10/22] lnet: always pass struct lnet_md by reference James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 11/22] lustre: llite: fix read if readahead window smaller than rpc size James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 12/22] lustre: obdclass: bind zombie export cleanup workqueue James Simmons
2020-06-03  0:59 ` James Simmons [this message]
2020-06-03  0:59 ` [lustre-devel] [PATCH 14/22] lnet: Force full discovery cycle James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 15/22] lnet: set route aliveness properly James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 16/22] lnet: Correct the default LND timeout James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 17/22] lnet: Add lnet_lnd_timeout to sysfs James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 18/22] lnet: lnd: Allow independent ko2iblnd timeout James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 19/22] lnet: lnd: Allow independent socklnd timeout James Simmons
2020-06-03  0:59 ` [lustre-devel] [PATCH 20/22] lnet: lnd: gracefully handle unexpected events James Simmons
2020-06-03  1:00 ` [lustre-devel] [PATCH 21/22] lustre: update version to 2.13.54 James Simmons
2020-06-03  1:00 ` [lustre-devel] [PATCH 22/22] lnet: procs: print new line based on distro James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1591146001-27171-14-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=lustre-devel@lists.lustre.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).