lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Chris Horn <chris.horn@hpe.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 17/22] lnet: Signal completion on ping send failure
Date: Sun, 20 Nov 2022 09:17:03 -0500	[thread overview]
Message-ID: <1668953828-10909-18-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1668953828-10909-1-git-send-email-jsimmons@infradead.org>

From: Chris Horn <chris.horn@hpe.com>

Call complete() on the ping_data::completion if we get
LNET_EVENT_SEND with non-zero status. Otherwise the thread which
issued the ping is stuck waiting for the full ping timeout.

A pd_unlinked member is added to struct ping_data to indicate whether
the associated MD has been unlinked. This is checked by lnet_ping() to
determine whether it needs to explicitly called LNetMDUnlink().

Lastly, in cases where we do not receive a reply, we now return the
value of pd.rc, if it is non-zero, rather than -EIO. This can provide
more information about the underlying ping failure.

HPE-bug-id: LUS-11317
WC-bug-id: https://jira.whamcloud.com/browse/LU-16290
Lustre-commit: 48c34c71de65e8a25 ("LU-16290 lnet: Signal completion on ping send failure")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49020
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/api-ni.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 935c848..8b53adf 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -5333,6 +5333,7 @@ void LNetDebugPeer(struct lnet_processid *id)
 struct ping_data {
 	int rc;
 	int replied;
+	int pd_unlinked;
 	struct lnet_handle_md mdh;
 	struct completion completion;
 };
@@ -5353,7 +5354,12 @@ struct ping_data {
 		pd->replied = 1;
 		pd->rc = event->mlength;
 	}
+
 	if (event->unlinked)
+		pd->pd_unlinked = 1;
+
+	if (event->unlinked ||
+	    (event->type == LNET_EVENT_SEND && event->status))
 		complete(&pd->completion);
 }
 
@@ -5424,13 +5430,14 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid,
 		/* NB must wait for the UNLINK event below... */
 	}
 
-	if (wait_for_completion_timeout(&pd.completion, timeout) == 0) {
-		/* Ensure completion in finite time... */
+	/* Ensure completion in finite time... */
+	wait_for_completion_timeout(&pd.completion, timeout);
+	if (!pd.pd_unlinked) {
 		LNetMDUnlink(pd.mdh);
 		wait_for_completion(&pd.completion);
 	}
 	if (!pd.replied) {
-		rc = -EIO;
+		rc = pd.rc ?: -EIO;
 		goto fail_ping_buffer_decref;
 	}
 
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

  parent reply	other threads:[~2022-11-20 14:39 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-20 14:16 [lustre-devel] [PATCH 00/22] lustre: backport OpenSFS work as of Nov 20, 2022 James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 01/22] lustre: llite: clear stale page's uptodate bit James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 02/22] lustre: osc: Remove oap lock James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 03/22] lnet: Don't modify uptodate peer with temp NI James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 04/22] lustre: llite: Explicitly support .splice_write James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 05/22] lnet: o2iblnd: add verbose debug prints for rx/tx events James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 06/22] lnet: use Netlink to support old and new NI APIs James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 07/22] lustre: obdclass: improve precision of wakeups for mod_rpcs James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 08/22] lnet: allow ping packet to contain large nids James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 09/22] lustre: llog: skip bad records in llog James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 10/22] lnet: fix build issue when IPv6 is disabled James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 11/22] lustre: obdclass: fill jobid in a safe way James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 12/22] lustre: llite: remove linefeed from LDLM_DEBUG James Simmons
2022-11-20 14:16 ` [lustre-devel] [PATCH 13/22] lnet: selftest: migrate LNet selftest session handling to Netlink James Simmons
2022-11-20 14:17 ` [lustre-devel] [PATCH 14/22] lustre: clio: append to non-existent component James Simmons
2022-11-20 14:17 ` [lustre-devel] [PATCH 15/22] lnet: fix debug message in lnet_discovery_event_reply James Simmons
2022-11-20 14:17 ` [lustre-devel] [PATCH 16/22] lustre: ldlm: group lock unlock fix James Simmons
2022-11-20 14:17 ` James Simmons [this message]
2022-11-20 14:17 ` [lustre-devel] [PATCH 18/22] lnet: extend lnet_is_nid_in_ping_info() James Simmons
2022-11-20 14:17 ` [lustre-devel] [PATCH 19/22] lnet: find correct primary for peer James Simmons
2022-11-20 14:17 ` [lustre-devel] [PATCH 20/22] lnet: change lnet_notify() to take struct lnet_nid James Simmons
2022-11-20 14:17 ` [lustre-devel] [PATCH 21/22] lnet: discard lnet_nid2ni_*() James Simmons
2022-11-20 14:17 ` [lustre-devel] [PATCH 22/22] lnet: change lnet_debug_peer() to struct lnet_nid James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1668953828-10909-18-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=adilger@whamcloud.com \
    --cc=chris.horn@hpe.com \
    --cc=green@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).