lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Amir Shehata <ashehata@whamcloud.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 29/40] lnet: don't delete peer created by Lustre
Date: Sun,  9 Apr 2023 08:13:09 -0400	[thread overview]
Message-ID: <1681042400-15491-30-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1681042400-15491-1-git-send-email-jsimmons@infradead.org>

From: Amir Shehata <ashehata@whamcloud.com>

Peers created by Lustre have their primary NIDs locked.
If that peer is deleted, it'll confuse lustre. So when manually
deleting a peer using:
   lnetctl peer del --prim_nid ...
We must continue to preserve the primary NID. Therefore we delete
all the constituent NIDs, but keep the primary NID. We then
flag the peer for rediscovery.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14668
Lustre-commit: 7cc5b4329fc2eecbf ("LU-14668 lnet: don't delete peer created by Lustre")
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43565
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/peer.c | 45 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index fa2ca54..0a5e73a 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -1983,6 +1983,40 @@ int lnet_user_add_peer_ni(struct lnet_nid *prim_nid, struct lnet_nid *nid, bool
 	return lnet_add_peer_ni(prim_nid, nid, mr, LNET_PEER_CONFIGURED);
 }
 
+static int
+lnet_reset_peer(struct lnet_peer *lp)
+{
+	struct lnet_peer_net *lpn, *lpntmp;
+	struct lnet_peer_ni *lpni, *lpnitmp;
+	unsigned int flags;
+	int rc;
+
+	lnet_peer_cancel_discovery(lp);
+
+	flags = LNET_PEER_CONFIGURED;
+	if (lp->lp_state & LNET_PEER_MULTI_RAIL)
+		flags |= LNET_PEER_MULTI_RAIL;
+
+	list_for_each_entry_safe(lpn, lpntmp, &lp->lp_peer_nets, lpn_peer_nets) {
+		list_for_each_entry_safe(lpni, lpnitmp, &lpn->lpn_peer_nis,
+					 lpni_peer_nis) {
+			if (nid_same(&lpni->lpni_nid, &lp->lp_primary_nid))
+				continue;
+
+			rc = lnet_peer_del_nid(lp, &lpni->lpni_nid, flags);
+			if (rc) {
+				CERROR("Failed to delete %s from peer %s\n",
+				       libcfs_nidstr(&lpni->lpni_nid),
+				       libcfs_nidstr(&lp->lp_primary_nid));
+			}
+		}
+	}
+
+	/* mark it for discovery the next time we use it */
+	lp->lp_state &= ~LNET_PEER_NIDS_UPTODATE;
+	return 0;
+}
+
 /*
  * Implementation of IOC_LIBCFS_DEL_PEER_NI.
  *
@@ -2026,8 +2060,15 @@ int lnet_user_add_peer_ni(struct lnet_nid *prim_nid, struct lnet_nid *nid, bool
 	}
 	lnet_net_unlock(LNET_LOCK_EX);
 
-	if (LNET_NID_IS_ANY(nid) || nid_same(nid, &lp->lp_primary_nid))
-		return lnet_peer_del(lp);
+	if (LNET_NID_IS_ANY(nid) || nid_same(nid, &lp->lp_primary_nid)) {
+		if (lp->lp_state & LNET_PEER_LOCK_PRIMARY) {
+			CERROR("peer %s created by Lustre. Must preserve primary NID, but will remove other NIDs\n",
+			       libcfs_nidstr(&lp->lp_primary_nid));
+			return lnet_reset_peer(lp);
+		} else {
+			return lnet_peer_del(lp);
+		}
+	}
 
 	flags = LNET_PEER_CONFIGURED;
 	if (lp->lp_state & LNET_PEER_MULTI_RAIL)
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

  parent reply	other threads:[~2023-04-09 12:39 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-09 12:12 [lustre-devel] [PATCH 00/40] lustre: backport OpenSFS changes from March XX, 2023 James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 01/40] lustre: protocol: basic batching processing framework James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 02/40] lustre: lov: fiemap improperly handles fm_extent_count=0 James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 03/40] lustre: llite: SIGBUS is possible on a race with page reclaim James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 04/40] lustre: osc: page fault in osc_release_bounce_pages() James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 05/40] lustre: readahead: add stats for read-ahead page count James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 06/40] lustre: quota: enforce project quota for root James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 07/40] lustre: ldlm: send the cancel RPC asap James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 08/40] lustre: enc: align Base64 encoding with RFC 4648 base64url James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 09/40] lustre: quota: fix insane grant quota James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 10/40] lustre: llite: check truncated page in ->readpage() James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 11/40] lnet: o2iblnd: Fix key mismatch issue James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 12/40] lustre: sec: fid2path for encrypted files James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 13/40] lustre: sec: Lustre/HSM on enc file with enc key James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 14/40] lustre: llite: check read page past requested James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 15/40] lustre: llite: fix relatime support James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 16/40] lustre: ptlrpc: clarify AT error message James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 17/40] lustre: update version to 2.15.54 James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 18/40] lustre: tgt: skip free inodes in OST weights James Simmons
2023-04-09 12:12 ` [lustre-devel] [PATCH 19/40] lustre: fileset: check fileset for operations by fid James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 20/40] lustre: clio: Remove cl_page_size() James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 21/40] lustre: fid: clean up OBIF_MAX_OID and IDIF_MAX_OID James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 22/40] lustre: llog: fix processing of a wrapped catalog James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 23/40] lustre: llite: replace lld_nfs_dentry flag with opencache handling James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 24/40] lustre: llite: match lock in corresponding namespace James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 25/40] lnet: libcfs: remove unused hash code James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 26/40] lustre: client: -o network needs add_conn processing James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 27/40] lnet: Lock primary NID logic James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 28/40] lnet: Peers added via kernel API should be permanent James Simmons
2023-04-09 12:13 ` James Simmons [this message]
2023-04-09 12:13 ` [lustre-devel] [PATCH 30/40] lnet: memory leak in copy_ioc_udsp_descr James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 31/40] lnet: remove crash with UDSP James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 32/40] lustre: ptlrpc: fix clang build errors James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 33/40] lustre: ldlm: remove client_import_find_conn() James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 34/40] lnet: add 'force' option to lnetctl peer del James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 35/40] lustre: ldlm: BL_AST lock cancel still can be batched James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 36/40] lnet: lnet_parse_route uses wrong loop var James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 37/40] lustre: tgt: add qos debug James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 38/40] lustre: enc: file names encryption when using secure boot James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 39/40] lustre: uapi: add DMV_IMP_INHERIT connect flag James Simmons
2023-04-09 12:13 ` [lustre-devel] [PATCH 40/40] lustre: llite: dir layout inheritance fixes James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1681042400-15491-30-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=adilger@whamcloud.com \
    --cc=ashehata@whamcloud.com \
    --cc=green@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).