lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Serguei Smirnov <ssmirnov@whamcloud.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 08/18] lnet: use ni fatal error when calculating net health
Date: Mon, 19 Jul 2021 08:32:03 -0400	[thread overview]
Message-ID: <1626697933-6971-9-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1626697933-6971-1-git-send-email-jsimmons@infradead.org>

From: Serguei Smirnov <ssmirnov@whamcloud.com>

When ni is flagged with "fatal_error" by LND, its health score
remains unaffected. This allows for the net containing such ni
to be selected for tx even if it is the only ni in this net.
Take "fatal_error" status of the ni into account when calculating
the net health score.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14750
Lustre-commit: 86a69f9eb5cab3f9 ("LU-14750 lnet: use ni fatal error when calculating net health")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43962
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/api-ni.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 687df3b..dc9020d 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -3103,11 +3103,12 @@ int lnet_get_net_healthv_locked(struct lnet_net *net)
 {
 	struct lnet_ni *ni;
 	int best_healthv = 0;
-	int healthv;
+	int healthv, ni_fatal;
 
 	list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
 		healthv = atomic_read(&ni->ni_healthv);
-		if (healthv > best_healthv)
+		ni_fatal = atomic_read(&ni->ni_fatal_error_on);
+		if (!ni_fatal && healthv > best_healthv)
 			best_healthv = healthv;
 	}
 
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

  parent reply	other threads:[~2021-07-19 12:33 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-19 12:31 [lustre-devel] [PATCH 00/18] lustre: sync to OpenSFS as of July 18, 2021 James Simmons
2021-07-19 12:31 ` [lustre-devel] [PATCH 01/18] lustre: statahead: update task management code James Simmons
2021-07-19 12:31 ` [lustre-devel] [PATCH 02/18] lustre: llite: simplify callback handling for async getattr James Simmons
2021-07-19 12:31 ` [lustre-devel] [PATCH 03/18] lustre: uapi: per-user changelog names and mask James Simmons
2021-07-19 12:31 ` [lustre-devel] [PATCH 04/18] lnet: Correct peer NI recovery age out calculation James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 05/18] lustre: lmv: compare space to mkdir on parent MDT James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 06/18] lnet: annotate LNET_WIRE_HANDLE_COOKIE_NONE as u64 James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 07/18] lnet: libcfs: Add checksum speed under /sys/fs James Simmons
2021-07-19 12:32 ` James Simmons [this message]
2021-07-19 12:32 ` [lustre-devel] [PATCH 09/18] lustre: quota: add get/set project support for non-dir/file James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 10/18] lustre: readahead: fix to reserve min pages James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 11/18] lnet: RMDA infrastructure updates James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 12/18] lnet: o2iblnd: Move racy NULL assignment James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 13/18] lnet: o2iblnd: Avoid double posting invalidate James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 14/18] lustre: quota: nodemap squashed root cannot bypass quota James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 15/18] lustre: llite: reset pfid after dir migration James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 16/18] lustre: llite: failed ASSERTION(ldlm_has_layout(lock)) James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 17/18] lustre: pcc: introducing OBD_CONNECT2_PCCRO flag James Simmons
2021-07-19 12:32 ` [lustre-devel] [PATCH 18/18] lustre: sec: migrate/extend/split on encrypted file James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1626697933-6971-9-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=adilger@whamcloud.com \
    --cc=green@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    --cc=neilb@suse.de \
    --cc=ssmirnov@whamcloud.com \
    --subject='Re: [lustre-devel] [PATCH 08/18] lnet: use ni fatal error when calculating net health' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).