From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 19/22] lnet: find correct primary for peer
Date: Sun, 20 Nov 2022 09:17:05 -0500
Message-ID: <1668953828-10909-20-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1668953828-10909-1-git-send-email-jsimmons@infradead.org>

From: Mr NeilBrown <neilb@suse.de>

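If the peer's primary NID is a large address, it can now be found.
When the ping reply sets LNET_PING_FEAT_PRIMARY_LARGE, the first large
NID in the ping buffer is taken as the primary. Otherwise, or if no
large NID is present, pi_ni[1] is used as before.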

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 022b46d887603f703 ("LU-10391 lnet: find correct primary for peer")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44632
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
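Note: the selection order implemented by find_primary() in the diff
below can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the kernel code: the
model_* types, MODEL_FEAT_PRIMARY_LARGE and the two separate arrays
are hypothetical stand-ins, and the real ping buffer layout and the
ping_iter_*() helpers are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LNET_PING_FEAT_PRIMARY_LARGE. */
#define MODEL_FEAT_PRIMARY_LARGE 0x1u

/* Hypothetical, simplified NID: only what the selection logic needs. */
struct model_nid {
	bool large;	/* true for a large (non-nid4) address */
	uint64_t addr;	/* opaque value, used only for comparison */
};

/* Hypothetical, simplified ping reply. */
struct model_ping {
	uint32_t features;
	size_t nnis;			/* nid4 entries; entry 0 models loopback */
	const struct model_nid *nid4;
	size_t nlarge;			/* large entries, in reply order */
	const struct model_nid *large;
};

/*
 * Mirror the order used by the patch: prefer the first large NID when
 * the feature flag is set, otherwise fall back to the second nid4
 * entry. Returns false when no primary can be determined.
 */
static bool model_find_primary(struct model_nid *out,
			       const struct model_ping *p)
{
	if ((p->features & MODEL_FEAT_PRIMARY_LARGE) && p->nlarge > 0) {
		*out = p->large[0];
		return true;
	}
	/* Flag clear, or flag set without any large NID: use entry 1. */
	if (p->nnis < 2)
		return false;
	*out = p->nid4[1];
	return true;
}
```

In this model the output is written only when the function returns
true, so a caller is expected to check the result before using it.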
 net/lnet/lnet/peer.c | 41 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index b33d6ac..a1305b6 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -2585,11 +2585,40 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 	       libcfs_nidstr(&lp->lp_primary_nid), ev->status);
 }
 
+static bool find_primary(struct lnet_nid *nid,
+			 struct lnet_ping_buffer *pbuf)
+{
+	struct lnet_ping_info *pi = &pbuf->pb_info;
+	struct lnet_ping_iter piter;
+	u32 *stp;
+
+	if (pi->pi_features & LNET_PING_FEAT_PRIMARY_LARGE) {
+		/* First large nid is primary */
+		for (stp = ping_iter_first(&piter, pbuf, nid);
+		     stp;
+		     stp = ping_iter_next(&piter, nid)) {
+			if (nid_is_nid4(nid))
+				continue;
+			/* nid has already been copied in */
+			return true;
+		}
+		/* no large nids ... weird ... ignore the flag
+		 * and use first nid.
+		 */
+	}
+	/* pi_ni[1] is primary */
+	if (pi->pi_nnis < 2)
+		return false;
+	lnet_nid4_to_nid(pbuf->pb_info.pi_ni[1].ns_nid, nid);
+	return true;
+}
+
 /* Handle a Reply message. This is the reply to a Ping message. */
 static void
 lnet_discovery_event_reply(struct lnet_peer *lp, struct lnet_event *ev)
 {
 	struct lnet_ping_buffer *pbuf;
+	struct lnet_nid primary;
 	int infobytes;
 	int rc;
 	bool ping_feat_disc;
@@ -2731,9 +2760,8 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 	 * available if the reply came from a Multi-Rail peer.
 	 */
 	if (pbuf->pb_info.pi_features & LNET_PING_FEAT_MULTI_RAIL &&
-	    pbuf->pb_info.pi_nnis > 1 &&
-	    lnet_nid_to_nid4(&lp->lp_primary_nid) ==
-	    pbuf->pb_info.pi_ni[1].ns_nid) {
+	    find_primary(&primary, pbuf) &&
+	    nid_same(&lp->lp_primary_nid, &primary)) {
 		if (LNET_PING_BUFFER_SEQNO(pbuf) < lp->lp_peer_seqno)
 			CDEBUG(D_NET,
 			       "peer %s: seq# got %u have %u. peer rebooted?\n",
@@ -3081,11 +3109,11 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 	 * peer's lp_peer_nets list, and the peer NI for the primary NID should
 	 * be the first entry in its peer net's lpn_peer_nis list.
 	 */
-	lnet_nid4_to_nid(pbuf->pb_info.pi_ni[1].ns_nid, &nid);
+	find_primary(&nid, pbuf);
 	lpni = lnet_peer_ni_find_locked(&nid);
 	if (!lpni) {
 		CERROR("Internal error: Failed to lookup peer NI for primary NID: %s\n",
-		       libcfs_nid2str(pbuf->pb_info.pi_ni[1].ns_nid));
+		       libcfs_nidstr(&nid));
 		goto out;
 	}
 
@@ -3341,11 +3369,10 @@ static int lnet_peer_data_present(struct lnet_peer *lp)
 	 * primary NID to the correct value here. Moreover, this peer
 	 * can show up with only the loopback NID in the ping buffer.
 	 */
-	if (pbuf->pb_info.pi_nnis <= 1) {
+	if (!find_primary(&nid, pbuf)) {
 		lnet_ping_buffer_decref(pbuf);
 		goto out;
 	}
-	lnet_nid4_to_nid(pbuf->pb_info.pi_ni[1].ns_nid, &nid);
 	if (nid_is_lo0(&lp->lp_primary_nid)) {
 		rc = lnet_peer_set_primary_nid(lp, &nid, flags);
 		if (rc)
-- 
1.8.3.1