lustre-devel-lustre.org archive mirror
* [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022
@ 2022-10-14 21:37 James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 01/20] lustre: ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs() James Simmons
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

This covers the work done for the latest Lustre.
Most of it is IPv6 work, which is needed for merging
upstream.

Bobi Jam (1):
  lustre: osc: take ldlm lock when queue sync pages

Chris Horn (1):
  lnet: Router test interop check and aarch fix

Emoly Liu (1):
  lustre: obdclass: free inst_name correctly

Etienne AUJAMES (1):
  lustre: ptlrpc: add assert for ptlrpc_service_purge_all

James Simmons (1):
  lustre: obdclass: user netlink to collect devices information

Lei Feng (1):
  lustre: ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs()

Li Dongyang (1):
  lustre: obdclass: set OBD_MD_FLGROUP for ladvise RPC

Mikhail Pershin (1):
  lustre: llog: correct llog FID and path output

Mr NeilBrown (6):
  lnet: track pinginfo size in bytes, not nis.
  lnet: add iface index to struct lnet_inetdev
  lnet: ksocklnd: support IPv6 in ksocknal_ip2index()
  lnet: only use PUBLIC IP6 addresses for connections
  lnet: use %pISc for formatting IP addresses
  lnet: socklnd: remove remnants of tcp bonding

Patrick Farrell (1):
  lustre: osc: Remove oap_magic

Serguei Smirnov (4):
  lnet: o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE
  lnet: o2iblnd: fix deadline for tx on peer queue
  lnet: o2iblnd: detect link state to set fatal error on ni
  lnet: socklnd: limit retries on conns_per_peer mismatch

Yang Sheng (1):
  lustre: ptlrpc: lower the message level in no resend case

 fs/lustre/include/lustre_kernelcomm.h         |  37 ++-
 fs/lustre/include/lustre_net.h                |   9 +-
 fs/lustre/include/lustre_osc.h                |   4 +-
 fs/lustre/mdc/mdc_dev.c                       |   3 +
 fs/lustre/obdclass/class_obd.c                |  14 +-
 fs/lustre/obdclass/kernelcomm.c               | 257 +++++++++++++++++-
 fs/lustre/obdclass/llog.c                     |  11 +-
 fs/lustre/obdclass/llog_cat.c                 |  37 +--
 fs/lustre/obdclass/llog_swab.c                |   2 +-
 fs/lustre/obdclass/obd_config.c               |   5 +-
 fs/lustre/obdclass/obdo.c                     |   3 -
 fs/lustre/osc/osc_cache.c                     |  13 +-
 fs/lustre/osc/osc_io.c                        |   3 +-
 fs/lustre/osc/osc_lock.c                      |  19 ++
 fs/lustre/osc/osc_page.c                      |   7 +-
 fs/lustre/ptlrpc/client.c                     |   3 +-
 fs/lustre/ptlrpc/service.c                    |   2 +
 include/linux/lnet/lib-lnet.h                 |   7 +-
 include/linux/lnet/lib-types.h                |  13 +-
 include/uapi/linux/lnet/lnet-idl.h            |   8 +-
 include/uapi/linux/lustre/lustre_kernelcomm.h |  18 ++
 include/uapi/linux/lustre/lustre_user.h       |   1 +
 net/lnet/klnds/o2iblnd/o2iblnd.c              | 219 ++++++++++++---
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c           |  34 ++-
 net/lnet/klnds/socklnd/socklnd.c              | 134 ++++-----
 net/lnet/klnds/socklnd/socklnd.h              |   8 +-
 net/lnet/klnds/socklnd/socklnd_cb.c           |  69 +++--
 net/lnet/klnds/socklnd/socklnd_proto.c        |  30 +-
 net/lnet/lnet/acceptor.c                      |  34 +--
 net/lnet/lnet/api-ni.c                        | 180 ++++++------
 net/lnet/lnet/config.c                        |   1 +
 net/lnet/lnet/lib-move.c                      |  10 +-
 net/lnet/lnet/lib-msg.c                       |  14 +-
 net/lnet/lnet/lib-socket.c                    |  13 +-
 net/lnet/lnet/peer.c                          |  58 ++--
 net/lnet/lnet/router.c                        |  24 +-
 36 files changed, 909 insertions(+), 395 deletions(-)

-- 
2.27.0

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org


* [lustre-devel] [PATCH 01/20] lustre: ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs()
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
@ 2022-10-14 21:37 ` James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 02/20] lustre: obdclass: set OBD_MD_FLGROUP for ladvise RPC James Simmons
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lei Feng, Lustre Development List

From: Lei Feng <flei@whamcloud.com>

There is a race condition on the server side: one thread has sent
the reply message and is deleting it, while another thread is
searching for an existing request and prints some debug information
in _debug_req() if it finds a duplicate request. Both operate on
req->rq_repmsg, but it is not protected in ptlrpc_req_drop_rs().
So protect it with req->rq_early_free_lock.
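
A minimal user-space sketch of the pattern described above, not the
kernel code itself. All toy_* names are hypothetical, the spinlock is a
stand-in built on C11 atomic_flag, and it assumes the debug reader takes
the same lock that the drop path takes. The reference count is a plain
int only to keep the sketch short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy spinlock standing in for the kernel's spinlock_t (assumption). */
struct toy_lock {
	atomic_flag held;
};

static void toy_lock_init(struct toy_lock *l)
{
	atomic_flag_clear(&l->held);
}

static void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct toy_reply_state {
	int refcount;
	char msg[32];
};

struct toy_request {
	struct toy_lock early_free_lock;
	struct toy_reply_state *reply_state;
	char *repmsg;	/* points into reply_state->msg */
};

/* Clear repmsg under the lock BEFORE the reply state can be freed. */
static void toy_req_drop_rs(struct toy_request *req)
{
	if (!req->reply_state)
		return;

	toy_lock_acquire(&req->early_free_lock);
	req->repmsg = NULL;
	toy_lock_release(&req->early_free_lock);

	if (--req->reply_state->refcount == 0)
		free(req->reply_state);
	req->reply_state = NULL;
}

/* Debug reader: copy the message under the same lock.
 * Returns 1 if a reply message was present, 0 otherwise.
 */
static int toy_debug_req(struct toy_request *req, char *out, size_t len)
{
	int found = 0;

	assert(len > 0);
	toy_lock_acquire(&req->early_free_lock);
	if (req->repmsg) {
		strncpy(out, req->repmsg, len - 1);
		out[len - 1] = '\0';
		found = 1;
	}
	toy_lock_release(&req->early_free_lock);
	return found;
}
```

The second hunk of the patch re-initializes the lock after the
structure copy for a related reason: a lock copied by value carries
whatever state the original had at that moment, so the copy needs a
fresh lock of its own.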

WC-bug-id: https://jira.whamcloud.com/browse/LU-15986
Lustre-commit: aaef545cff2dd9584 ("LU-15986 ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs()")
Signed-off-by: Lei Feng <flei@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47839
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lustre_net.h | 9 ++++++++-
 fs/lustre/ptlrpc/service.c     | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/include/lustre_net.h b/fs/lustre/include/lustre_net.h
index f8d28619a6d7..1605fcc64fc4 100644
--- a/fs/lustre/include/lustre_net.h
+++ b/fs/lustre/include/lustre_net.h
@@ -2234,9 +2234,16 @@ static inline void ptlrpc_req_drop_rs(struct ptlrpc_request *req)
 {
 	if (!req->rq_reply_state)
 		return; /* shouldn't occur */
+
+	/* req_repmsg equals rq_reply_state->rs_msg,
+	 * so set it to NULL before rq_reply_state is possibly freed
+	 */
+	spin_lock(&req->rq_early_free_lock);
+	req->rq_repmsg = NULL;
+	spin_unlock(&req->rq_early_free_lock);
+
 	ptlrpc_rs_decref(req->rq_reply_state);
 	req->rq_reply_state = NULL;
-	req->rq_repmsg = NULL;
 }
 
 static inline u32 lustre_request_magic(struct ptlrpc_request *req)
diff --git a/fs/lustre/ptlrpc/service.c b/fs/lustre/ptlrpc/service.c
index 277fbdbc590a..59fe1f4aa18f 100644
--- a/fs/lustre/ptlrpc/service.c
+++ b/fs/lustre/ptlrpc/service.c
@@ -1136,6 +1136,7 @@ static int ptlrpc_at_send_early_reply(struct ptlrpc_request *req)
 	}
 
 	*reqcopy = *req;
+	spin_lock_init(&reqcopy->rq_early_free_lock);
 	reqcopy->rq_reply_state = NULL;
 	reqcopy->rq_rep_swab_mask = 0;
 	reqcopy->rq_pack_bulk = 0;
-- 
2.27.0

* [lustre-devel] [PATCH 02/20] lustre: obdclass: set OBD_MD_FLGROUP for ladvise RPC
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 01/20] lustre: ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs() James Simmons
@ 2022-10-14 21:37 ` James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 03/20] lustre: obdclass: free inst_name correctly James Simmons
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Li Dongyang, Lustre Development List

From: Li Dongyang <dongyangli@ddn.com>

The ladvise RPC doesn't have OBD_MD_FLGROUP set, so when the RPC
reaches the server, tgt_validate_obdo() will corrupt the FID
if its seq is in the FID_SEQ_NORMAL range.

Do not mess with seq in obdo_to_ioobj() and tgt_validate_obdo(),
since as of 2.0 all RPCs should have OBD_MD_FLGROUP set.

Add OBD_MD_FLGROUP for the ladvise RPC to fix new clients talking
to old servers.
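
A minimal sketch of why the missing flag matters, under stated
assumptions: the toy_* and TOY_* names and constant values are
illustrative, and toy_legacy_validate() only imitates the kind of
receiver-side fixup the message describes (resetting seq when the
group flag is absent). It is not the real tgt_validate_obdo().

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag and sequence values (assumptions for this sketch). */
#define TOY_MD_FLID	0x00000001ULL
#define TOY_MD_FLGROUP	0x01000000ULL
#define TOY_SEQ_NORMAL	0x200000400ULL

struct toy_obdo {
	uint64_t seq;
	uint64_t oid;
	uint64_t valid;	/* bitmask saying which fields are meaningful */
};

/* Receiver that distrusts seq unless the sender flagged it as valid. */
static void toy_legacy_validate(struct toy_obdo *oa)
{
	if (!(oa->valid & TOY_MD_FLGROUP))
		oa->seq = 0;	/* the real seq is lost here */
}

/* Sender side after the fix: always advertise that seq is valid. */
static void toy_ladvise_fill(struct toy_obdo *oa, uint64_t seq, uint64_t oid)
{
	oa->seq = seq;
	oa->oid = oid;
	oa->valid = TOY_MD_FLID | TOY_MD_FLGROUP;
}
```

With only TOY_MD_FLID set, the receiver overwrites a sequence in the
normal range; with both flags set, the identifier passes through
unchanged.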

WC-bug-id: https://jira.whamcloud.com/browse/LU-16057
Lustre-commit: bee803c6e440ba6b5 ("LU-16057 obdclass: set OBD_MD_FLGROUP for ladvise RPC")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/48080
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/obdclass/obdo.c | 3 ---
 fs/lustre/osc/osc_io.c    | 2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/lustre/obdclass/obdo.c b/fs/lustre/obdclass/obdo.c
index 7df4ff399890..9cafda16a95f 100644
--- a/fs/lustre/obdclass/obdo.c
+++ b/fs/lustre/obdclass/obdo.c
@@ -123,9 +123,6 @@ EXPORT_SYMBOL(obdo_from_inode);
 void obdo_to_ioobj(const struct obdo *oa, struct obd_ioobj *ioobj)
 {
 	ioobj->ioo_oid = oa->o_oi;
-	if (unlikely(!(oa->o_valid & OBD_MD_FLGROUP)))
-		ostid_set_seq_mdt0(&ioobj->ioo_oid);
-
 	/* Since 2.4 this does not contain o_mode in the low 16 bits.
 	 * Instead, it holds (bd_md_max_brw - 1) for multi-bulk BRW RPCs
 	 */
diff --git a/fs/lustre/osc/osc_io.c b/fs/lustre/osc/osc_io.c
index 655c7c68ab3a..4c9b3d2bb481 100644
--- a/fs/lustre/osc/osc_io.c
+++ b/fs/lustre/osc/osc_io.c
@@ -1036,7 +1036,7 @@ static int osc_io_ladvise_start(const struct lu_env *env,
 
 	memset(oa, 0, sizeof(*oa));
 	oa->o_oi = loi->loi_oi;
-	oa->o_valid = OBD_MD_FLID;
+	oa->o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
 	obdo_set_parent_fid(oa, lio->li_fid);
 
 	ladvise = ladvise_hdr->lah_advise;
-- 
2.27.0

* [lustre-devel] [PATCH 03/20] lustre: obdclass: free inst_name correctly
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 01/20] lustre: ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs() James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 02/20] lustre: obdclass: set OBD_MD_FLGROUP for ladvise RPC James Simmons
@ 2022-10-14 21:37 ` James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 04/20] lustre: osc: take ldlm lock when queue sync pages James Simmons
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Emoly Liu <emoly@whamcloud.com>

In function class_config_llog_handler(), inst_name should be freed
correctly before each break.
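
The fix follows the usual single-exit cleanup pattern. A minimal
user-space sketch, assuming hypothetical toy_* helpers and a live
allocation counter added only so the leak can be checked:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts outstanding allocations so a leak is observable. */
static int toy_live_allocs;

static void *toy_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		toy_live_allocs++;
	return p;
}

static void toy_free(void *p)
{
	if (p)
		toy_live_allocs--;
	free(p);
}

/* skip: take the early "No processing!" exit.
 * fail_alloc: simulate the second allocation failing.
 */
static int toy_handle_record(int skip, int fail_alloc)
{
	char *inst_name;
	char *cfg_new;
	int rc = 0;

	inst_name = toy_alloc(16);
	if (!inst_name)
		return -ENOMEM;

	if (skip)
		goto out_inst;	/* a bare exit here would leak inst_name */

	cfg_new = fail_alloc ? NULL : toy_alloc(64);
	if (!cfg_new) {
		rc = -ENOMEM;
		goto out_inst;
	}

	/* ... the record would be processed here ... */
	toy_free(cfg_new);
out_inst:
	toy_free(inst_name);
	return rc;
}
```

Every path that runs after inst_name is allocated leaves through
out_inst, so the buffer is released exactly once.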

WC-bug-id: https://jira.whamcloud.com/browse/LU-16154
Lustre-commit: e7f17c5e0c95dba3b ("LU-16154 obdclass: free inst_name correctly")
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48542
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/obdclass/obd_config.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/obdclass/obd_config.c b/fs/lustre/obdclass/obd_config.c
index 7d001ffaf121..2b24276880a6 100644
--- a/fs/lustre/obdclass/obd_config.c
+++ b/fs/lustre/obdclass/obd_config.c
@@ -1230,7 +1230,7 @@ int class_config_llog_handler(const struct lu_env *env,
 			       clli->cfg_flags);
 			rc = 0;
 			/* No processing! */
-			break;
+			goto out_inst;
 		}
 
 		/*
@@ -1352,7 +1352,7 @@ int class_config_llog_handler(const struct lu_env *env,
 		lcfg_new = kzalloc(lcfg_len, GFP_NOFS);
 		if (!lcfg_new) {
 			rc = -ENOMEM;
-			goto out;
+			goto out_inst;
 		}
 
 		lustre_cfg_init(lcfg_new, lcfg->lcfg_command, &bufs);
@@ -1379,6 +1379,7 @@ int class_config_llog_handler(const struct lu_env *env,
 
 		rc = class_process_config(lcfg_new);
 		kfree(lcfg_new);
+out_inst:
 		kfree(inst_name);
 		break;
 	}
-- 
2.27.0

* [lustre-devel] [PATCH 04/20] lustre: osc: take ldlm lock when queue sync pages
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (2 preceding siblings ...)
  2022-10-14 21:37 ` [lustre-devel] [PATCH 03/20] lustre: obdclass: free inst_name correctly James Simmons
@ 2022-10-14 21:37 ` James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 05/20] lnet: track pinginfo size in bytes, not nis James Simmons
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Bobi Jam <bobijam@whamcloud.com>

osc_queue_sync_pages() adds an osc_extent to the osc_object's IO extent
list without taking ldlm locks, and then it calls
osc_io_unplug_async() to queue the IO work for the client.

This patch makes sync page queuing take the ldlm lock in the
osc_extent.
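
A minimal sketch of the idea, assuming hypothetical toy_* types: the
extent pins whichever lock the IO holds (the write lock if there is
one, otherwise the read lock) by taking a reference on it. The plain
int counter stands in for the real reference counting.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dlm_lock {
	int refs;
};

struct toy_osc_lock {
	struct toy_dlm_lock *dlmlock;
};

struct toy_io {
	struct toy_osc_lock *write_osclock;
	struct toy_osc_lock *read_osclock;
};

struct toy_extent {
	struct toy_dlm_lock *dlmlock;
};

/* Attach the IO's lock to the extent, preferring the write lock. */
static void toy_extent_take_lock(struct toy_extent *ext,
				 const struct toy_io *io)
{
	struct toy_osc_lock *oscl;

	assert(ext->dlmlock == NULL);
	oscl = io->write_osclock ? io->write_osclock : io->read_osclock;
	if (oscl && oscl->dlmlock) {
		oscl->dlmlock->refs++;		/* pin the lock */
		ext->dlmlock = oscl->dlmlock;
	}
}

/* Drop the reference when the extent is done with the lock. */
static void toy_extent_put_lock(struct toy_extent *ext)
{
	if (ext->dlmlock) {
		ext->dlmlock->refs--;
		ext->dlmlock = NULL;
	}
}
```

Holding a reference keeps the lock alive for as long as the queued
extent refers to it, even if the IO that created the extent has
already finished.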

WC-bug-id: https://jira.whamcloud.com/browse/LU-16160
Lustre-commit: 67aca1fcc6bed2079 ("LU-16160 osc: take ldlm lock when queue sync pages")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/48557
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lustre_osc.h |  3 +++
 fs/lustre/mdc/mdc_dev.c        |  3 +++
 fs/lustre/osc/osc_cache.c      |  7 +++++++
 fs/lustre/osc/osc_io.c         |  1 +
 fs/lustre/osc/osc_lock.c       | 19 +++++++++++++++++++
 5 files changed, 33 insertions(+)

diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h
index 89f02c51dbb3..323eeba3da42 100644
--- a/fs/lustre/include/lustre_osc.h
+++ b/fs/lustre/include/lustre_osc.h
@@ -157,6 +157,7 @@ struct osc_io {
 
 	/* write osc_lock for this IO, used by osc_extent_find(). */
 	struct osc_lock		*oi_write_osclock;
+	struct osc_lock		*oi_read_osclock;
 	struct obdo		oi_oa;
 	struct osc_async_cbargs {
 		bool			opc_rpc_sent;
@@ -724,6 +725,8 @@ int osc_lock_enqueue_wait(const struct lu_env *env, struct osc_object *obj,
 			  struct osc_lock *oscl);
 void osc_lock_set_writer(const struct lu_env *env, const struct cl_io *io,
 			 struct cl_object *obj, struct osc_lock *oscl);
+void osc_lock_set_reader(const struct lu_env *env, const struct cl_io *io,
+			 struct cl_object *obj, struct osc_lock *oscl);
 int osc_lock_print(const struct lu_env *env, void *cookie,
 		   lu_printer_t p, const struct cl_lock_slice *slice);
 void osc_lock_cancel(const struct lu_env *env,
diff --git a/fs/lustre/mdc/mdc_dev.c b/fs/lustre/mdc/mdc_dev.c
index fd0e36225c13..2fd137d2fa4b 100644
--- a/fs/lustre/mdc/mdc_dev.c
+++ b/fs/lustre/mdc/mdc_dev.c
@@ -966,6 +966,9 @@ int mdc_lock_init(const struct lu_env *env, struct cl_object *obj,
 
 	if (io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io))
 		osc_lock_set_writer(env, io, obj, ols);
+	else if (io->ci_type == CIT_READ ||
+		 (io->ci_type == CIT_FAULT && !io->u.ci_fault.ft_mkwrite))
+		osc_lock_set_reader(env, io, obj, ols);
 
 	LDLM_DEBUG_NOLOCK("lock %p, mdc lock %p, flags %llx\n",
 			  lock, ols, ols->ols_flags);
diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c
index b6f0cdb92bdc..36fec837d93e 100644
--- a/fs/lustre/osc/osc_cache.c
+++ b/fs/lustre/osc/osc_cache.c
@@ -2655,6 +2655,7 @@ int osc_queue_sync_pages(const struct lu_env *env, struct cl_io *io,
 			 struct osc_object *obj, struct list_head *list,
 			 int brw_flags)
 {
+	struct osc_io *oio = osc_env_io(env);
 	struct client_obd *cli = osc_cli(obj);
 	struct osc_extent *ext;
 	struct osc_async_page *oap;
@@ -2663,6 +2664,7 @@ int osc_queue_sync_pages(const struct lu_env *env, struct cl_io *io,
 	bool can_merge = true;
 	pgoff_t start = CL_PAGE_EOF;
 	pgoff_t end = 0;
+	struct osc_lock *oscl;
 
 	list_for_each_entry(oap, list, oap_pending_item) {
 		struct osc_page *opg = oap2osc_page(oap);
@@ -2703,6 +2705,11 @@ int osc_queue_sync_pages(const struct lu_env *env, struct cl_io *io,
 	ext->oe_srvlock = !!(brw_flags & OBD_BRW_SRVLOCK);
 	ext->oe_ndelay = !!(brw_flags & OBD_BRW_NDELAY);
 	ext->oe_dio = !!(brw_flags & OBD_BRW_NOCACHE);
+	oscl = oio->oi_write_osclock ? : oio->oi_read_osclock;
+	if (oscl && oscl->ols_dlmlock != NULL) {
+		ext->oe_dlmlock = LDLM_LOCK_GET(oscl->ols_dlmlock);
+		lu_ref_add(&ext->oe_dlmlock->l_reference, "osc_extent", ext);
+	}
 	if (ext->oe_dio && !ext->oe_rw) { /* direct io write */
 		int grants;
 		int ppc;
diff --git a/fs/lustre/osc/osc_io.c b/fs/lustre/osc/osc_io.c
index 4c9b3d2bb481..aa8f61d710b9 100644
--- a/fs/lustre/osc/osc_io.c
+++ b/fs/lustre/osc/osc_io.c
@@ -461,6 +461,7 @@ void osc_io_rw_iter_fini(const struct lu_env *env,
 		oio->oi_lru_reserved = 0;
 	}
 	oio->oi_write_osclock = NULL;
+	oio->oi_read_osclock = NULL;
 
 	osc_io_iter_fini(env, ios);
 }
diff --git a/fs/lustre/osc/osc_lock.c b/fs/lustre/osc/osc_lock.c
index c8f85020c8fe..dd109496d260 100644
--- a/fs/lustre/osc/osc_lock.c
+++ b/fs/lustre/osc/osc_lock.c
@@ -1178,6 +1178,22 @@ void osc_lock_set_writer(const struct lu_env *env, const struct cl_io *io,
 }
 EXPORT_SYMBOL(osc_lock_set_writer);
 
+void osc_lock_set_reader(const struct lu_env *env, const struct cl_io *io,
+			 struct cl_object *obj, struct osc_lock *oscl)
+{
+	struct osc_io *oio = osc_env_io(env);
+
+	if (!cl_object_same(io->ci_obj, obj))
+		return;
+
+	if (oscl->ols_glimpse || osc_lock_is_lockless(oscl))
+		return;
+
+	if (oio->oi_read_osclock == NULL)
+		oio->oi_read_osclock = oscl;
+}
+EXPORT_SYMBOL(osc_lock_set_reader);
+
 int osc_lock_init(const struct lu_env *env,
 		  struct cl_object *obj, struct cl_lock *lock,
 		  const struct cl_io *io)
@@ -1224,6 +1240,9 @@ int osc_lock_init(const struct lu_env *env,
 
 	if (io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io))
 		osc_lock_set_writer(env, io, obj, oscl);
+	else if (io->ci_type == CIT_READ ||
+		 (io->ci_type == CIT_FAULT && !io->u.ci_fault.ft_mkwrite))
+		osc_lock_set_reader(env, io, obj, oscl);
 
 
 	LDLM_DEBUG_NOLOCK("lock %p, osc lock %p, flags %llx",
-- 
2.27.0

* [lustre-devel] [PATCH 05/20] lnet: track pinginfo size in bytes, not nis.
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (3 preceding siblings ...)
  2022-10-14 21:37 ` [lustre-devel] [PATCH 04/20] lustre: osc: take ldlm lock when queue sync pages James Simmons
@ 2022-10-14 21:37 ` James Simmons
  2022-10-14 21:37 ` [lustre-devel] [PATCH 06/20] lnet: add iface index to struct lnet_inetdev James Simmons
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

When we extend the pinginfo to be able to store large-address nids,
there could be nids of different sizes in it, so using the number of
nis to track the size won't work.  Change to using the number of
bytes instead, i.e. the total size of the 'struct lnet_ping_info'.

This affects pb_nnis in the ping_buffer, and the global
ln_push_target_nnis.

LNET_PING_INFO_SIZE is removed as size won't depend on number of nids
any more.

When determining the number of bytes expected in a received ping_info,
use a new macro lnet_ping_info_size() which can extract information
as required from the ping_info.

Note that lnet_ping_target_create() now initializes pi_nnis to 0.
Setting the initial size doesn't seem to be useful.
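
A minimal sketch of byte-based sizing, assuming hypothetical toy_*
structures that only mimic the header-plus-array layout. The size is
computed from the entry count carried in the header, and the walk over
entries is bounded by the buffer's byte length rather than by a
separately tracked entry count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_ni_status {
	uint64_t nid;
	uint32_t status;
	uint32_t unused;
};

struct toy_ping_info {
	uint32_t magic;
	uint32_t features;
	uint32_t pid;
	uint32_t nnis;
	struct toy_ni_status ni[];	/* flexible array of entries */
};

/* Bytes needed for a header followed by nnis fixed-size entries. */
static size_t toy_ping_info_size(uint32_t nnis)
{
	return offsetof(struct toy_ping_info, ni) +
	       (size_t)nnis * sizeof(struct toy_ni_status);
}

/* Visit at most pi->nnis entries, never reading past nbytes.
 * Returns the number of entries that fit completely in the buffer.
 */
static unsigned int toy_walk_entries(struct toy_ping_info *pi, size_t nbytes)
{
	struct toy_ni_status *stat = &pi->ni[0];
	char *end = (char *)pi + nbytes;
	unsigned int i;

	for (i = 0; i < pi->nnis && (char *)(stat + 1) <= end; i++, stat++)
		stat->unused = 0;	/* stand-in for per-entry work */

	return i;
}
```

Bounding the loop by the end pointer means a header that claims more
entries than the buffer can hold is truncated safely.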

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 941218e09e1d6bb9b ("LU-10391 lnet: track pinginfo size in bytes, not nis.")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44627
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h      |   6 +-
 include/linux/lnet/lib-types.h     |  13 ++-
 include/uapi/linux/lnet/lnet-idl.h |   8 +-
 net/lnet/lnet/api-ni.c             | 180 ++++++++++++++++-------------
 net/lnet/lnet/lib-move.c           |  10 +-
 net/lnet/lnet/lib-msg.c            |  14 +--
 net/lnet/lnet/peer.c               |  58 +++++-----
 7 files changed, 157 insertions(+), 132 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index fc086dab080e..a95919e69802 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -646,7 +646,7 @@ void lnet_prep_send(struct lnet_msg *msg, int type,
 int lnet_send(struct lnet_nid *nid, struct lnet_msg *msg,
 	      struct lnet_nid *rtr_nid);
 int lnet_send_ping(struct lnet_nid *dest_nid, struct lnet_handle_md *mdh,
-		   int nnis, void *user_ptr, lnet_handler_t handler,
+		   int bytes, void *user_ptr, lnet_handler_t handler,
 		   bool recovery);
 void lnet_return_tx_credits_locked(struct lnet_msg *msg);
 void lnet_return_rx_credits_locked(struct lnet_msg *msg);
@@ -860,7 +860,7 @@ void lnet_wait_router_start(void);
 void lnet_swap_pinginfo(struct lnet_ping_buffer *pbuf);
 
 int lnet_ping_info_validate(struct lnet_ping_info *pinfo);
-struct lnet_ping_buffer *lnet_ping_buffer_alloc(int nnis, gfp_t gfp);
+struct lnet_ping_buffer *lnet_ping_buffer_alloc(int bytes, gfp_t gfp);
 void lnet_ping_buffer_free(struct lnet_ping_buffer *pbuf);
 
 static inline void lnet_ping_buffer_addref(struct lnet_ping_buffer *pbuf)
@@ -878,7 +878,7 @@ static inline void lnet_ping_buffer_decref(struct lnet_ping_buffer *pbuf)
 
 static inline int lnet_push_target_resize_needed(void)
 {
-	return the_lnet.ln_push_target->pb_nnis < the_lnet.ln_push_target_nnis;
+	return the_lnet.ln_push_target->pb_nbytes < the_lnet.ln_push_target_nbytes;
 }
 
 int lnet_push_target_resize(void);
diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index 2266d1be16a6..499385bb981b 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -567,14 +567,14 @@ struct lnet_ni {
  * area that may be overwritten by network data.
  */
 struct lnet_ping_buffer {
-	int			pb_nnis;
+	int			pb_nbytes;	/* sizeof pb_info */
 	atomic_t		pb_refcnt;
 	bool			pb_needs_post;
 	struct lnet_ping_info	pb_info;
 };
 
-#define LNET_PING_BUFFER_SIZE(NNIDS) \
-	offsetof(struct lnet_ping_buffer, pb_info.pi_ni[NNIDS])
+#define LNET_PING_BUFFER_SIZE(bytes) \
+	(offsetof(struct lnet_ping_buffer, pb_info) + bytes)
 #define LNET_PING_BUFFER_LONI(PBUF)	((PBUF)->pb_info.pi_ni[0].ns_nid)
 #define LNET_PING_BUFFER_SEQNO(PBUF)	((PBUF)->pb_info.pi_ni[0].ns_status)
 
@@ -733,8 +733,8 @@ struct lnet_peer {
 	/* MD handle for push in progress */
 	struct lnet_handle_md	lp_push_mdh;
 
-	/* number of NIDs for sizing push data */
-	int			lp_data_nnis;
+	/* number of bytes for sizing pb_info in push data */
+	int			lp_data_bytes;
 
 	/* NI config sequence number of peer */
 	u32			lp_peer_seqno;
@@ -1255,7 +1255,8 @@ struct lnet {
 	lnet_handler_t			ln_push_target_handler;
 	struct lnet_handle_md		ln_push_target_md;
 	struct lnet_ping_buffer	       *ln_push_target;
-	int				ln_push_target_nnis;
+	/* bytes needed for pb_info to receive push */
+	int				ln_push_target_nbytes;
 
 	/* discovery event queue handle */
 	lnet_handler_t			ln_dc_handler;
diff --git a/include/uapi/linux/lnet/lnet-idl.h b/include/uapi/linux/lnet/lnet-idl.h
index 74036e7ef406..41bbb404af6c 100644
--- a/include/uapi/linux/lnet/lnet-idl.h
+++ b/include/uapi/linux/lnet/lnet-idl.h
@@ -291,9 +291,13 @@ struct lnet_ping_info {
 	struct lnet_ni_status	pi_ni[0];
 } __attribute__((packed));
 
-#define LNET_PING_INFO_SIZE(NNIDS) \
-	offsetof(struct lnet_ping_info, pi_ni[NNIDS])
+#define LNET_PING_INFO_HDR_SIZE \
+	offsetof(struct lnet_ping_info, pi_ni[0])
+#define LNET_PING_INFO_MIN_SIZE \
+	offsetof(struct lnet_ping_info, pi_ni[LNET_INTERFACES_MIN])
 #define LNET_PING_INFO_LONI(PINFO)      ((PINFO)->pi_ni[0].ns_nid)
 #define LNET_PING_INFO_SEQNO(PINFO)     ((PINFO)->pi_ni[0].ns_status)
+#define lnet_ping_info_size(pinfo)	\
+	offsetof(struct lnet_ping_info, pi_ni[(pinfo)->pi_nnis])
 
 #endif
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 89c7b99e45be..9459fc0f103f 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -1720,13 +1720,13 @@ lnet_count_acceptor_nets(void)
 }
 
 struct lnet_ping_buffer *
-lnet_ping_buffer_alloc(int nnis, gfp_t gfp)
+lnet_ping_buffer_alloc(int nbytes, gfp_t gfp)
 {
 	struct lnet_ping_buffer *pbuf;
 
-	pbuf = kmalloc(LNET_PING_BUFFER_SIZE(nnis), gfp);
+	pbuf = kmalloc(LNET_PING_BUFFER_SIZE(nbytes), gfp);
 	if (pbuf) {
-		pbuf->pb_nnis = nnis;
+		pbuf->pb_nbytes = nbytes;	/* sizeof of pb_info */
 		pbuf->pb_needs_post = false;
 		atomic_set(&pbuf->pb_refcnt, 1);
 	}
@@ -1742,17 +1742,17 @@ lnet_ping_buffer_free(struct lnet_ping_buffer *pbuf)
 }
 
 static struct lnet_ping_buffer *
-lnet_ping_target_create(int nnis)
+lnet_ping_target_create(int nbytes)
 {
 	struct lnet_ping_buffer *pbuf;
 
-	pbuf = lnet_ping_buffer_alloc(nnis, GFP_KERNEL);
+	pbuf = lnet_ping_buffer_alloc(nbytes, GFP_KERNEL);
 	if (!pbuf) {
-		CERROR("Can't allocate ping source [%d]\n", nnis);
+		CERROR("Can't allocate ping source [%d]\n", nbytes);
 		return NULL;
 	}
 
-	pbuf->pb_info.pi_nnis = nnis;
+	pbuf->pb_info.pi_nnis = 0;
 	pbuf->pb_info.pi_pid = the_lnet.ln_pid;
 	pbuf->pb_info.pi_magic = LNET_PROTO_PING_MAGIC;
 	pbuf->pb_info.pi_features =
@@ -1762,52 +1762,56 @@ lnet_ping_target_create(int nnis)
 }
 
 static inline int
-lnet_get_net_ni_count_locked(struct lnet_net *net)
+lnet_get_net_ni_bytes_locked(struct lnet_net *net)
 {
 	struct lnet_ni *ni;
-	int count = 0;
+	int bytes = 0;
 
 	list_for_each_entry(ni, &net->net_ni_list, ni_netlist)
-		count++;
+		if (nid_is_nid4(&ni->ni_nid))
+			bytes += sizeof(struct lnet_ni_status);
 
-	return count;
+	return bytes;
 }
 
 static inline int
-lnet_get_net_ni_count_pre(struct lnet_net *net)
+lnet_get_net_ni_bytes_pre(struct lnet_net *net)
 {
 	struct lnet_ni *ni;
-	int count = 0;
+	int bytes = 0;
 
 	list_for_each_entry(ni, &net->net_ni_added, ni_netlist)
-		count++;
+		if (nid_is_nid4(&ni->ni_nid))
+			bytes += sizeof(struct lnet_ni_status);
 
-	return count;
+	return bytes;
 }
 
 static inline int
-lnet_get_ni_count(void)
+lnet_get_ni_bytes(void)
 {
 	struct lnet_ni *ni;
 	struct lnet_net *net;
-	int count = 0;
+	int bytes = 0;
 
 	lnet_net_lock(0);
 
 	list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist)
-			count++;
+			if (nid_is_nid4(&ni->ni_nid))
+				bytes += sizeof(struct lnet_ni_status);
+
 	}
 
 	lnet_net_unlock(0);
 
-	return count;
+	return bytes;
 }
 
 void
 lnet_swap_pinginfo(struct lnet_ping_buffer *pbuf)
 {
-	struct lnet_ni_status *stat;
+	struct lnet_ni_status *stat, *end;
 	int nnis;
 	int i;
 
@@ -1816,10 +1820,9 @@ lnet_swap_pinginfo(struct lnet_ping_buffer *pbuf)
 	__swab32s(&pbuf->pb_info.pi_pid);
 	__swab32s(&pbuf->pb_info.pi_nnis);
 	nnis = pbuf->pb_info.pi_nnis;
-	if (nnis > pbuf->pb_nnis)
-		nnis = pbuf->pb_nnis;
-	for (i = 0; i < nnis; i++) {
-		stat = &pbuf->pb_info.pi_ni[i];
+	stat = &pbuf->pb_info.pi_ni[0];
+	end = (void *)&pbuf->pb_info + pbuf->pb_nbytes;
+	for (i = 0; i < nnis && stat + 1 <= end; i++, stat++) {
 		__swab64s(&stat->ns_nid);
 		__swab32s(&stat->ns_status);
 	}
@@ -1876,7 +1879,7 @@ lnet_ping_target_event_handler(struct lnet_event *event)
 static int
 lnet_ping_target_setup(struct lnet_ping_buffer **ppbuf,
 		       struct lnet_handle_md *ping_mdh,
-		       int ni_count, bool set_eq)
+		       int ni_bytes, bool set_eq)
 {
 	struct lnet_processid id = {
 		.nid = LNET_ANY_NID,
@@ -1890,7 +1893,7 @@ lnet_ping_target_setup(struct lnet_ping_buffer **ppbuf,
 		the_lnet.ln_ping_target_handler =
 			lnet_ping_target_event_handler;
 
-	*ppbuf = lnet_ping_target_create(ni_count);
+	*ppbuf = lnet_ping_target_create(ni_bytes);
 	if (!*ppbuf) {
 		rc = -ENOMEM;
 		goto fail_free_eq;
@@ -1908,7 +1911,7 @@ lnet_ping_target_setup(struct lnet_ping_buffer **ppbuf,
 
 	/* initialize md content */
 	md.start = &(*ppbuf)->pb_info;
-	md.length = LNET_PING_INFO_SIZE((*ppbuf)->pb_nnis);
+	md.length = (*ppbuf)->pb_nbytes;
 	md.threshold = LNET_MD_THRESH_INF;
 	md.max_size = 0;
 	md.options = LNET_MD_OP_GET | LNET_MD_TRUNCATE |
@@ -1949,20 +1952,19 @@ lnet_ping_md_unlink(struct lnet_ping_buffer *pbuf,
 static void
 lnet_ping_target_install_locked(struct lnet_ping_buffer *pbuf)
 {
-	struct lnet_ni_status *ns;
+	struct lnet_ni_status *ns, *end;
 	struct lnet_ni *ni;
 	struct lnet_net *net;
-	int i = 0;
 	int rc;
 
+	pbuf->pb_info.pi_nnis = 0;
+	ns = &pbuf->pb_info.pi_ni[0];
+	end = (void *)&pbuf->pb_info + pbuf->pb_nbytes;
 	list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
-			LASSERT(i < pbuf->pb_nnis);
-
-			ns = &pbuf->pb_info.pi_ni[i];
-
 			if (!nid_is_nid4(&ni->ni_nid))
 				continue;
+			LASSERT(ns + 1 <= end);
 			ns->ns_nid = lnet_nid_to_nid4(&ni->ni_nid);
 
 			lnet_ni_lock(ni);
@@ -1970,11 +1972,12 @@ lnet_ping_target_install_locked(struct lnet_ping_buffer *pbuf)
 			ni->ni_status = &ns->ns_status;
 			lnet_ni_unlock(ni);
 
-			i++;
+			pbuf->pb_info.pi_nnis++;
+			ns++;
 		}
 	}
-	/*
-	 * We (ab)use the ns_status of the loopback interface to
+
+	/* We (ab)use the ns_status of the loopback interface to
 	 * transmit the sequence number. The first interface listed
 	 * must be the loopback interface.
 	 */
@@ -2043,13 +2046,13 @@ int lnet_push_target_resize(void)
 	struct lnet_handle_md old_mdh;
 	struct lnet_ping_buffer *pbuf;
 	struct lnet_ping_buffer *old_pbuf;
-	int nnis;
+	int nbytes;
 	int rc;
 
 again:
-	nnis = the_lnet.ln_push_target_nnis;
-	if (nnis <= 0) {
-		CDEBUG(D_NET, "Invalid nnis %d\n", nnis);
+	nbytes = the_lnet.ln_push_target_nbytes;
+	if (nbytes <= 0) {
+		CDEBUG(D_NET, "Invalid nbytes %d\n", nbytes);
 		return -EINVAL;
 	}
 
@@ -2057,9 +2060,9 @@ int lnet_push_target_resize(void)
 	 * dropped when we need to resize again (see "old_pbuf" below) or when
 	 * LNet is shutdown (see lnet_push_target_fini())
 	 */
-	pbuf = lnet_ping_buffer_alloc(nnis, GFP_NOFS);
+	pbuf = lnet_ping_buffer_alloc(nbytes, GFP_NOFS);
 	if (!pbuf) {
-		CDEBUG(D_NET, "Can't allocate pbuf for nnis %d\n", nnis);
+		CDEBUG(D_NET, "Can't allocate pbuf for nbytes %d\n", nbytes);
 		return  -ENOMEM;
 	}
 
@@ -2084,10 +2087,10 @@ int lnet_push_target_resize(void)
 	}
 
 	/* Received another push or reply that requires a larger buffer */
-	if (nnis < the_lnet.ln_push_target_nnis)
+	if (nbytes < the_lnet.ln_push_target_nbytes)
 		goto again;
 
-	CDEBUG(D_NET, "nnis %d success\n", nnis);
+	CDEBUG(D_NET, "nbytes %d success\n", nbytes);
 	return 0;
 }
 
@@ -2118,7 +2121,7 @@ int lnet_push_target_post(struct lnet_ping_buffer *pbuf,
 
 	/* initialize md content */
 	md.start = &pbuf->pb_info;
-	md.length = LNET_PING_INFO_SIZE(pbuf->pb_nnis);
+	md.length = pbuf->pb_nbytes;
 	md.threshold = 1;
 	md.max_size = 0;
 	md.options = LNET_MD_OP_PUT | LNET_MD_TRUNCATE;
@@ -2175,7 +2178,7 @@ static int lnet_push_target_init(void)
 	LASSERT(rc == 0);
 
 	/* Start at the required minimum, we'll enlarge if required. */
-	the_lnet.ln_push_target_nnis = LNET_INTERFACES_MIN;
+	the_lnet.ln_push_target_nbytes = LNET_PING_INFO_MIN_SIZE;
 
 	rc = lnet_push_target_resize();
 	if (rc) {
@@ -2204,7 +2207,7 @@ static void lnet_push_target_fini(void)
 	/* Drop ref set by lnet_ping_buffer_alloc() */
 	lnet_ping_buffer_decref(the_lnet.ln_push_target);
 	the_lnet.ln_push_target = NULL;
-	the_lnet.ln_push_target_nnis = 0;
+	the_lnet.ln_push_target_nbytes = 0;
 
 	LNetClearLazyPortal(LNET_RESERVED_PORTAL);
 	lnet_assert_handler_unused(the_lnet.ln_push_target_handler);
@@ -2865,7 +2868,7 @@ LNetNIInit(lnet_pid_t requested_pid)
 {
 	int im_a_router = 0;
 	int rc;
-	int ni_count;
+	int ni_bytes;
 	struct lnet_ping_buffer *pbuf;
 	struct lnet_handle_md ping_mdh;
 	LIST_HEAD(net_head);
@@ -2921,11 +2924,9 @@ LNetNIInit(lnet_pid_t requested_pid)
 			goto err_empty_list;
 	}
 
-	ni_count = lnet_startup_lndnets(&net_head);
-	if (ni_count < 0) {
-		rc = ni_count;
+	rc = lnet_startup_lndnets(&net_head);
+	if (rc < 0)
 		goto err_empty_list;
-	}
 
 	if (!the_lnet.ln_nis_from_mod_params) {
 		rc = lnet_parse_routes(lnet_get_routes(), &im_a_router);
@@ -2944,7 +2945,11 @@ LNetNIInit(lnet_pid_t requested_pid)
 	the_lnet.ln_refcount = 1;
 	/* Now I may use my own API functions... */
 
-	rc = lnet_ping_target_setup(&pbuf, &ping_mdh, ni_count, true);
+	ni_bytes = LNET_PING_INFO_HDR_SIZE;
+	list_for_each_entry(net, &the_lnet.ln_nets, net_list)
+		ni_bytes += lnet_get_net_ni_bytes_locked(net);
+
+	rc = lnet_ping_target_setup(&pbuf, &ping_mdh, ni_bytes, true);
 	if (rc)
 		goto err_acceptor_stop;
 
@@ -3363,7 +3368,7 @@ static int lnet_add_net_common(struct lnet_net *net,
 	struct lnet_ping_buffer *pbuf;
 	struct lnet_remotenet *rnet;
 	struct lnet_ni *ni;
-	int net_ni_count;
+	int net_ni_bytes;
 	u32 net_id;
 	int rc;
 
@@ -3388,12 +3393,13 @@ static int lnet_add_net_common(struct lnet_net *net,
 	 * which will be added.
 	 *
 	 * since ni hasn't been configured yet, use
-	 * lnet_get_net_ni_count_pre() which checks the net_ni_added list
+	 * lnet_get_net_ni_bytes_pre() which checks the net_ni_added list
 	 */
-	net_ni_count = lnet_get_net_ni_count_pre(net);
+	net_ni_bytes = lnet_get_net_ni_bytes_pre(net);
 
 	rc = lnet_ping_target_setup(&pbuf, &ping_mdh,
-				    net_ni_count + lnet_get_ni_count(),
+				    LNET_PING_INFO_HDR_SIZE +
+				    net_ni_bytes + lnet_get_ni_bytes(),
 				    false);
 	if (rc < 0) {
 		lnet_net_free(net);
@@ -3589,8 +3595,8 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
 	u32 net_id = LNET_NIDNET(conf->lic_nid);
 	struct lnet_ping_buffer *pbuf;
 	struct lnet_handle_md ping_mdh;
-	int rc;
-	int net_count;
+	int net_bytes, rc;
+	bool net_empty;
 	u32 addr;
 
 	/* don't allow userspace to shutdown the LOLND */
@@ -3616,13 +3622,13 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
 	addr = LNET_NIDADDR(conf->lic_nid);
 	if (addr == 0) {
 		/* remove the entire net */
-		net_count = lnet_get_net_ni_count_locked(net);
+		net_bytes = lnet_get_net_ni_bytes_locked(net);
 
 		lnet_net_unlock(0);
 
 		/* create and link a new ping info, before removing the old one */
 		rc = lnet_ping_target_setup(&pbuf, &ping_mdh,
-					    lnet_get_ni_count() - net_count,
+					    lnet_get_ni_bytes() - net_bytes,
 					    false);
 		if (rc != 0)
 			goto unlock_api_mutex;
@@ -3644,13 +3650,17 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
 		goto unlock_net;
 	}
 
-	net_count = lnet_get_net_ni_count_locked(net);
+	net_bytes = lnet_get_net_ni_bytes_locked(net);
+	net_empty = list_is_singular(&net->net_ni_list);
 
 	lnet_net_unlock(0);
 
 	/* create and link a new ping info, before removing the old one */
 	rc = lnet_ping_target_setup(&pbuf, &ping_mdh,
-				    lnet_get_ni_count() - 1, false);
+				    (LNET_PING_INFO_HDR_SIZE +
+				     lnet_get_ni_bytes() -
+				     sizeof(pbuf->pb_info.pi_ni[0])),
+				    false);
 	if (rc != 0)
 		goto unlock_api_mutex;
 
@@ -3661,7 +3671,7 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
 	lnet_ping_target_update(pbuf, ping_mdh);
 
 	/* check if the net is empty and remove it if it is */
-	if (net_count == 1)
+	if (net_empty)
 		lnet_shutdown_lndnet(net);
 
 	goto unlock_api_mutex;
@@ -3744,8 +3754,7 @@ lnet_dyn_del_net(u32 net_id)
 	struct lnet_net *net;
 	struct lnet_ping_buffer *pbuf;
 	struct lnet_handle_md ping_mdh;
-	int rc;
-	int net_ni_count;
+	int net_ni_bytes, rc;
 
 	/* don't allow userspace to shutdown the LOLND */
 	if (LNET_NETTYP(net_id) == LOLND)
@@ -3766,13 +3775,15 @@ lnet_dyn_del_net(u32 net_id)
 		goto out;
 	}
 
-	net_ni_count = lnet_get_net_ni_count_locked(net);
+	net_ni_bytes = lnet_get_net_ni_bytes_locked(net);
 
 	lnet_net_unlock(0);
 
 	/* create and link a new ping info, before removing the old one */
 	rc = lnet_ping_target_setup(&pbuf, &ping_mdh,
-				    lnet_get_ni_count() - net_ni_count, false);
+				     LNET_PING_INFO_HDR_SIZE +
+				     lnet_get_ni_bytes() - net_ni_bytes,
+				     false);
 	if (rc)
 		goto out;
 
@@ -4626,6 +4637,12 @@ lnet_ping_event_handler(struct lnet_event *event)
 		complete(&pd->completion);
 }
 
+/* lnet_ping() only works with nid4 nids, so we can calculate
+ * size from number of nids
+ */
+#define LNET_PING_INFO_SIZE(NNIDS) \
+	offsetof(struct lnet_ping_info, pi_ni[NNIDS])
+
 static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid,
 		     signed long timeout, struct lnet_process_id __user *ids,
 		     int n_ids)
@@ -4635,6 +4652,7 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid,
 	struct lnet_ping_buffer *pbuf;
 	struct lnet_process_id tmpid;
 	struct lnet_processid id;
+	int id_bytes;
 	int i;
 	int nob;
 	int rc;
@@ -4653,13 +4671,14 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid,
 	if (id4.pid == LNET_PID_ANY)
 		id4.pid = LNET_PID_LUSTRE;
 
-	pbuf = lnet_ping_buffer_alloc(n_ids, GFP_NOFS);
+	id_bytes = LNET_PING_INFO_SIZE(n_ids);
+	pbuf = lnet_ping_buffer_alloc(id_bytes, GFP_NOFS);
 	if (!pbuf)
 		return -ENOMEM;
 
 	/* initialize md content */
 	md.start = &pbuf->pb_info;
-	md.length = LNET_PING_INFO_SIZE(n_ids);
+	md.length = id_bytes;
 	md.threshold = 2; /* GET/REPLY */
 	md.max_size = 0;
 	md.options = LNET_MD_TRUNCATE;
@@ -4696,7 +4715,7 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid,
 	}
 
 	nob = pd.rc;
-	LASSERT(nob >= 0 && nob <= LNET_PING_INFO_SIZE(n_ids));
+	LASSERT(nob >= 0 && nob <= id_bytes);
 
 	rc = -EPROTO;		/* if I can't parse... */
 
@@ -4720,20 +4739,21 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid,
 		goto fail_ping_buffer_decref;
 	}
 
-	if (nob < LNET_PING_INFO_SIZE(0)) {
-		CERROR("%s: Short reply %d(%d min)\n",
-		       libcfs_idstr(&id),
-		       nob, (int)LNET_PING_INFO_SIZE(0));
+	/* Test if smaller than lnet_pinginfo with no pi_ni status info */
+	if (nob < LNET_PING_INFO_HDR_SIZE) {
+		CERROR("%s: Short reply %d(%lu min)\n",
+		       libcfs_idstr(&id), nob, LNET_PING_INFO_HDR_SIZE);
 		goto fail_ping_buffer_decref;
 	}
 
-	if (pbuf->pb_info.pi_nnis < n_ids)
+	if (pbuf->pb_info.pi_nnis < n_ids) {
 		n_ids = pbuf->pb_info.pi_nnis;
+		id_bytes = lnet_ping_info_size(&pbuf->pb_info);
+	}
 
-	if (nob < LNET_PING_INFO_SIZE(n_ids)) {
+	if (nob < id_bytes) {
 		CERROR("%s: Short reply %d(%d expected)\n",
-		       libcfs_idstr(&id),
-		       nob, (int)LNET_PING_INFO_SIZE(n_ids));
+		       libcfs_idstr(&id), nob, id_bytes);
 		goto fail_ping_buffer_decref;
 	}
 
@@ -4753,6 +4773,8 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid,
 	return rc;
 }
 
+#undef LNET_PING_INFO_SIZE
+
 static int
 lnet_discover(struct lnet_process_id id4, u32 force,
 	      struct lnet_process_id __user *ids,
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index a8a5ddbab84a..d46578929d08 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -3334,7 +3334,7 @@ lnet_recover_local_nis(void)
 
 			ev_info->mt_type = MT_TYPE_LOCAL_NI;
 			ev_info->mt_nid = nid;
-			rc = lnet_send_ping(&nid, &mdh, LNET_INTERFACES_MIN,
+			rc = lnet_send_ping(&nid, &mdh, LNET_PING_INFO_MIN_SIZE,
 					    ev_info, the_lnet.ln_mt_handler,
 					    true);
 			/* lookup the nid again */
@@ -3563,7 +3563,7 @@ lnet_recover_peer_nis(void)
 
 			ev_info->mt_type = MT_TYPE_PEER_NI;
 			ev_info->mt_nid = nid;
-			rc = lnet_send_ping(&nid, &mdh, LNET_INTERFACES_MIN,
+			rc = lnet_send_ping(&nid, &mdh, LNET_PING_INFO_MIN_SIZE,
 					    ev_info, the_lnet.ln_mt_handler,
 					    true);
 			lnet_net_lock(0);
@@ -3672,7 +3672,7 @@ lnet_monitor_thread(void *arg)
  */
 int
 lnet_send_ping(struct lnet_nid *dest_nid,
-	       struct lnet_handle_md *mdh, int nnis,
+	       struct lnet_handle_md *mdh, int bytes,
 	       void *user_data, lnet_handler_t handler, bool recovery)
 {
 	struct lnet_md md = { NULL };
@@ -3685,7 +3685,7 @@ lnet_send_ping(struct lnet_nid *dest_nid,
 		goto fail_error;
 	}
 
-	pbuf = lnet_ping_buffer_alloc(nnis, GFP_NOFS);
+	pbuf = lnet_ping_buffer_alloc(bytes, GFP_NOFS);
 	if (!pbuf) {
 		rc = ENOMEM;
 		goto fail_error;
@@ -3693,7 +3693,7 @@ lnet_send_ping(struct lnet_nid *dest_nid,
 
 	/* initialize md content */
 	md.start = &pbuf->pb_info;
-	md.length = LNET_PING_INFO_SIZE(nnis);
+	md.length = bytes;
 	md.threshold = 2; /* GET/REPLY */
 	md.max_size = 0;
 	md.options = LNET_MD_TRUNCATE | LNET_MD_TRACK_RESPONSE;
diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c
index 3b1f6a36bfe0..9fb001e5815e 100644
--- a/net/lnet/lnet/lib-msg.c
+++ b/net/lnet/lnet/lib-msg.c
@@ -814,6 +814,8 @@ lnet_health_check(struct lnet_msg *msg)
 	 * messages with a health status != OK.
 	 */
 	if (hstatus != LNET_MSG_STATUS_OK) {
+		struct lnet_ping_info *pi;
+
 		/* Don't further decrement the health value if a recovery
 		 * message failed.
 		 */
@@ -826,11 +828,10 @@ lnet_health_check(struct lnet_msg *msg)
 		}
 
 		/* For local failures, health/recovery/resends are not needed if
-		 * I only have a single (non-lolnd) interface. NB: pb_nnis
-		 * includes the lolnd interface, so a single-rail node would
-		 * have pb_nnis == 2.
+		 * I only have a single (non-lolnd) interface.
 		 */
-		if (the_lnet.ln_ping_target->pb_nnis <= 2) {
+		pi = &the_lnet.ln_ping_target->pb_info;
+		if (pi->pi_nnis <= 2) {
 			handle_local_health = false;
 			attempt_local_resend = false;
 		}
@@ -840,9 +841,8 @@ lnet_health_check(struct lnet_msg *msg)
 		/* For remote failures, health/recovery/resends are not needed
 		 * if the peer only has a single interface. Special case for
 		 * routers where we rely on health feature to manage route
-		 * aliveness. NB: unlike pb_nnis above, lp_nnis does _not_
-		 * include the lolnd, so a single-rail node would have
-		 * lp_nnis == 1.
+		 * aliveness. NB: lp_nnis does _not_ include the lolnd, so a
+		 * single-rail node would have lp_nnis == 1.
 		 */
 		if (lpni && lpni->lpni_peer_net &&
 		    lpni->lpni_peer_net->lpn_peer &&
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index e7c3c835b528..9b2066028509 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -2254,6 +2254,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 {
 	struct lnet_ping_buffer *pbuf;
 	struct lnet_peer *lp;
+	int infobytes;
 
 	pbuf = LNET_PING_INFO_TO_BUFFER(ev->md_start + ev->offset);
 
@@ -2298,12 +2299,12 @@ void lnet_peer_push_event(struct lnet_event *ev)
 		goto out;
 	}
 
-	/*
-	 * Make sure we'll allocate the correct size ping buffer when
+	/* Make sure we'll allocate the correct size ping buffer when
 	 * pinging the peer.
 	 */
-	if (lp->lp_data_nnis < pbuf->pb_info.pi_nnis)
-		lp->lp_data_nnis = pbuf->pb_info.pi_nnis;
+	infobytes = lnet_ping_info_size(&pbuf->pb_info);
+	if (lp->lp_data_bytes < infobytes)
+		lp->lp_data_bytes = infobytes;
 
 	/*
 	 * A non-Multi-Rail peer is not supposed to be capable of
@@ -2369,13 +2370,12 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	 * and tell discovery to allocate a bigger buffer.
 	 */
 	if (ev->mlength < ev->rlength) {
-		if (the_lnet.ln_push_target_nnis < pbuf->pb_info.pi_nnis)
-			the_lnet.ln_push_target_nnis = pbuf->pb_info.pi_nnis;
+		if (the_lnet.ln_push_target_nbytes < infobytes)
+			the_lnet.ln_push_target_nbytes = infobytes;
 		lp->lp_state &= ~LNET_PEER_NIDS_UPTODATE;
 		lp->lp_state |= LNET_PEER_FORCE_PING;
-		CDEBUG(D_NET, "Truncated Push from %s (%d nids)\n",
-		       libcfs_nidstr(&lp->lp_primary_nid),
-		       pbuf->pb_info.pi_nnis);
+		CDEBUG(D_NET, "Truncated Push from %s (%d bytes)\n",
+		       libcfs_nidstr(&lp->lp_primary_nid), infobytes);
 		goto out;
 	}
 
@@ -2383,8 +2383,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	lp->lp_peer_seqno = LNET_PING_BUFFER_SEQNO(pbuf);
 	lp->lp_state &= ~LNET_PEER_NIDS_UPTODATE;
 
-	/*
-	 * If there is data present that hasn't been processed yet,
+	/* If there is data present that hasn't been processed yet,
 	 * we'll replace it if the Put contained newer data and it
 	 * fits. We're racing with a Ping or earlier Push in this
 	 * case.
@@ -2392,9 +2391,9 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	if (lp->lp_state & LNET_PEER_DATA_PRESENT) {
 		if (LNET_PING_BUFFER_SEQNO(pbuf) >
 			LNET_PING_BUFFER_SEQNO(lp->lp_data) &&
-		    pbuf->pb_info.pi_nnis <= lp->lp_data->pb_nnis) {
+		    infobytes <= lp->lp_data->pb_nbytes) {
 			memcpy(&lp->lp_data->pb_info, &pbuf->pb_info,
-			       LNET_PING_INFO_SIZE(pbuf->pb_info.pi_nnis));
+			       infobytes);
 			CDEBUG(D_NET, "Ping/Push race from %s: %u vs %u\n",
 			       libcfs_nidstr(&lp->lp_primary_nid),
 			       LNET_PING_BUFFER_SEQNO(pbuf),
@@ -2408,7 +2407,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	 * the Push and set FORCE_PING to force the discovery
 	 * thread to fix the problem by pinging the peer.
 	 */
-	lp->lp_data = lnet_ping_buffer_alloc(lp->lp_data_nnis, GFP_ATOMIC);
+	lp->lp_data = lnet_ping_buffer_alloc(lp->lp_data_bytes, GFP_ATOMIC);
 	if (!lp->lp_data) {
 		lp->lp_state |= LNET_PEER_FORCE_PING;
 		CDEBUG(D_NET, "Cannot allocate Push buffer for %s %u\n",
@@ -2418,8 +2417,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	}
 
 	/* Success */
-	memcpy(&lp->lp_data->pb_info, &pbuf->pb_info,
-	       LNET_PING_INFO_SIZE(pbuf->pb_info.pi_nnis));
+	memcpy(&lp->lp_data->pb_info, &pbuf->pb_info, infobytes);
 	lp->lp_state |= LNET_PEER_DATA_PRESENT;
 	CDEBUG(D_NET, "Received Push %s %u\n",
 	       libcfs_nidstr(&lp->lp_primary_nid),
@@ -2580,6 +2578,7 @@ static void
 lnet_discovery_event_reply(struct lnet_peer *lp, struct lnet_event *ev)
 {
 	struct lnet_ping_buffer *pbuf;
+	int infobytes;
 	int rc;
 
 	spin_lock(&lp->lp_lock);
@@ -2692,25 +2691,24 @@ lnet_discovery_event_reply(struct lnet_peer *lp, struct lnet_event *ev)
 		}
 	}
 
+	infobytes = lnet_ping_info_size(&pbuf->pb_info);
 	/*
 	 * Make sure we'll allocate the correct size ping buffer when
 	 * pinging the peer.
 	 */
-	if (lp->lp_data_nnis < pbuf->pb_info.pi_nnis)
-		lp->lp_data_nnis = pbuf->pb_info.pi_nnis;
+	if (lp->lp_data_bytes < infobytes)
+		lp->lp_data_bytes = infobytes;
 
-	/*
-	 * Check for truncation of the Reply. Clear PING_SENT and set
+	/* Check for truncation of the Reply. Clear PING_SENT and set
 	 * PING_FAILED to trigger a retry.
 	 */
-	if (pbuf->pb_nnis < pbuf->pb_info.pi_nnis) {
-		if (the_lnet.ln_push_target_nnis < pbuf->pb_info.pi_nnis)
-			the_lnet.ln_push_target_nnis = pbuf->pb_info.pi_nnis;
+	if (pbuf->pb_nbytes < infobytes) {
+		if (the_lnet.ln_push_target_nbytes < infobytes)
+			the_lnet.ln_push_target_nbytes = infobytes;
 		lp->lp_state |= LNET_PEER_PING_FAILED;
 		lp->lp_ping_error = 0;
-		CDEBUG(D_NET, "Truncated Reply from %s (%d nids)\n",
-		       libcfs_nidstr(&lp->lp_primary_nid),
-		       pbuf->pb_info.pi_nnis);
+		CDEBUG(D_NET, "Truncated Reply from %s (%d bytes)\n",
+		       libcfs_nidstr(&lp->lp_primary_nid), infobytes);
 		goto out;
 	}
 
@@ -3391,7 +3389,7 @@ __must_hold(&lp->lp_lock)
 static int lnet_peer_send_ping(struct lnet_peer *lp)
 __must_hold(&lp->lp_lock)
 {
-	int nnis;
+	int bytes;
 	int rc;
 	int cpt;
 
@@ -3404,9 +3402,9 @@ __must_hold(&lp->lp_lock)
 	lnet_peer_addref_locked(lp);
 	lnet_net_unlock(cpt);
 
-	nnis = max_t(int, lp->lp_data_nnis, LNET_INTERFACES_MIN);
+	bytes = max_t(int, lp->lp_data_bytes, LNET_PING_INFO_MIN_SIZE);
 
-	rc = lnet_send_ping(&lp->lp_primary_nid, &lp->lp_ping_mdh, nnis, lp,
+	rc = lnet_send_ping(&lp->lp_primary_nid, &lp->lp_ping_mdh, bytes, lp,
 			    the_lnet.ln_dc_handler, false);
 	/* if LNetMDBind in lnet_send_ping fails we need to decrement the
 	 * refcount on the peer, otherwise LNetMDUnlink will be called
@@ -3514,7 +3512,7 @@ __must_hold(&lp->lp_lock)
 
 	/* Push source MD */
 	md.start = &pbuf->pb_info;
-	md.length = LNET_PING_INFO_SIZE(pbuf->pb_nnis);
+	md.length = pbuf->pb_nbytes;
 	md.threshold = 2; /* Put/Ack */
 	md.max_size = 0;
 	md.options = LNET_MD_TRACK_RESPONSE;
-- 
2.27.0
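
The hunks above replace interface counts with byte counts throughout. A minimal userspace sketch of that sizing arithmetic follows, assuming a simplified stand-in layout: every `sketch_` name and all field widths are hypothetical and do not reproduce the real `struct lnet_ping_info`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real LNet structures differ. */
struct sketch_ni_status {
	uint64_t ns_nid;
	uint32_t ns_status;
	uint32_t ns_unused;
};

struct sketch_ping_info {
	uint32_t pi_magic;
	uint32_t pi_features;
	uint32_t pi_pid;
	uint32_t pi_nnis;
	struct sketch_ni_status pi_ni[];	/* flexible array of entries */
};

/* Bytes taken by the fixed header alone (no status entries). */
#define SKETCH_PING_INFO_HDR_SIZE offsetof(struct sketch_ping_info, pi_ni)

/* Bytes needed for a header followed by nnis fixed-size entries. */
static size_t sketch_ping_info_size(uint32_t nnis)
{
	return SKETCH_PING_INFO_HDR_SIZE +
	       (size_t)nnis * sizeof(struct sketch_ni_status);
}

/* Grow-only update, mirroring how the patch treats
 * ln_push_target_nbytes and lp_data_bytes.
 */
static size_t sketch_grow(size_t current, size_t wanted)
{
	return current < wanted ? wanted : current;
}
```

Carrying a byte count rather than an entry count means callers no longer assume that every entry has the same size, which appears to be the point of the change for larger (non-nid4) addresses.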

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

* [lustre-devel] [PATCH 06/20] lnet: add iface index to struct lnet_inetdev
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

When getting the list of interfaces, get the index as well. This can be
useful, and it avoids searching the list of interfaces again to find it.
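
As a rough illustration only, recording the index during enumeration could look like the userspace sketch below. The `sketch_` types are hypothetical stand-ins for a kernel net_device and for struct lnet_inetdev; they are not the real definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_IFNAMSIZ 16

/* Hypothetical device record standing in for a kernel net_device. */
struct sketch_dev {
	int ifindex;
	uint32_t ipaddr;	/* host byte order */
	char name[SKETCH_IFNAMSIZ];
};

/* Hypothetical counterpart of struct lnet_inetdev with the new field. */
struct sketch_inetdev {
	uint32_t li_ipaddr;
	uint32_t li_index;
	char li_name[SKETCH_IFNAMSIZ];
};

/* Copy up to max devices, recording each index alongside the address
 * so that consumers do not need a second lookup by address.
 */
static size_t sketch_enumerate(const struct sketch_dev *devs, size_t ndevs,
			       struct sketch_inetdev *out, size_t max)
{
	size_t n = ndevs < max ? ndevs : max;
	size_t i;

	for (i = 0; i < n; i++) {
		out[i].li_ipaddr = devs[i].ipaddr;
		out[i].li_index = (uint32_t)devs[i].ifindex;
		strncpy(out[i].li_name, devs[i].name, SKETCH_IFNAMSIZ - 1);
		out[i].li_name[SKETCH_IFNAMSIZ - 1] = '\0';
	}
	return n;
}
```

A consumer can then read `li_index` straight from the enumerated entry, which is what the socklnd.c hunk in this patch does in place of calling ksocknal_ip2index().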

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 860182ee6e84d391a ("LU-10391 lnet: add iface index to struct lnet_inetdev")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48569
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h    | 1 +
 net/lnet/klnds/socklnd/socklnd.c | 2 +-
 net/lnet/lnet/config.c           | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index a95919e69802..eb48d2900172 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -826,6 +826,7 @@ struct lnet_inetdev {
 	u32	li_flags;
 	u32	li_ipaddr;
 	u32	li_netmask;
+	u32	li_index;
 	char	li_name[IFNAMSIZ];
 };
 
diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 2b08501133dc..69cb738796e7 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -2522,11 +2522,11 @@ ksocknal_startup(struct lnet_ni *ni)
 	}
 
 	ni->ni_dev_cpt = ifaces[i].li_cpt;
+	ksi->ksni_index = ifaces[i].li_index;
 	sa = (void *)&ksi->ksni_addr;
 	memset(sa, 0, sizeof(*sa));
 	sa->sin_family = AF_INET;
 	sa->sin_addr.s_addr = htonl(ifaces[i].li_ipaddr);
-	ksi->ksni_index = ksocknal_ip2index((struct sockaddr *)sa, ni);
 	ksi->ksni_netmask = ifaces[i].li_netmask;
 	strlcpy(ksi->ksni_name, ifaces[i].li_name, sizeof(ksi->ksni_name));
 
diff --git a/net/lnet/lnet/config.c b/net/lnet/lnet/config.c
index da3d20e5bebb..083a9a29697f 100644
--- a/net/lnet/lnet/config.c
+++ b/net/lnet/lnet/config.c
@@ -1538,6 +1538,7 @@ int lnet_inet_enumerate(struct lnet_inetdev **dev_list, struct net *ns)
 
 			ifaces[nip].li_cpt = cpt;
 			ifaces[nip].li_flags = flags;
+			ifaces[nip].li_index = dev->ifindex;
 			ifaces[nip].li_ipaddr = ntohl(ifa->ifa_local);
 			ifaces[nip].li_netmask = ntohl(ifa->ifa_mask);
 			strlcpy(ifaces[nip].li_name, ifa->ifa_label,
-- 
2.27.0

* [lustre-devel] [PATCH 07/20] lnet: ksocklnd: support IPv6 in ksocknal_ip2index()
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

ksocknal_ip2index() can now find the interface index for an IPv6
address.
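
The per-family comparison can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: the `sketch_ifaddr` table is a hypothetical replacement for walking the per-device address lists under RCU, and the sketch returns the first match rather than scanning every address of a device.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical table entry: one address bound to one interface index. */
struct sketch_ifaddr {
	int ifindex;
	struct sockaddr_storage addr;
};

/* Return 1 if a and b hold the same IPv4 or IPv6 address, else 0. */
static int sketch_addr_equal(const struct sockaddr *a,
			     const struct sockaddr *b)
{
	if (a->sa_family != b->sa_family)
		return 0;

	switch (a->sa_family) {
	case AF_INET:
		return ((const struct sockaddr_in *)a)->sin_addr.s_addr ==
		       ((const struct sockaddr_in *)b)->sin_addr.s_addr;
	case AF_INET6:
		return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
			      &((const struct sockaddr_in6 *)b)->sin6_addr,
			      sizeof(struct in6_addr)) == 0;
	default:
		return 0;	/* unsupported family */
	}
}

/* Return the index of the first entry matching addr, or -1 if none. */
static int sketch_ip2index(const struct sketch_ifaddr *tbl, size_t n,
			   const struct sockaddr *addr)
{
	size_t i;

	if (addr->sa_family != AF_INET && addr->sa_family != AF_INET6)
		return -1;

	for (i = 0; i < n; i++) {
		const struct sockaddr *sa =
			(const struct sockaddr *)&tbl[i].addr;

		if (sketch_addr_equal(sa, addr))
			return tbl[i].ifindex;
	}
	return -1;
}
```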

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: c0fdf9efbf927db40 ("LU-10391 socklnd: support IPv6 in ksocknal_ip2index()")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48570
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c | 40 +++++++++++++++++++++++++-------
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 69cb738796e7..89696977ac63 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -40,6 +40,7 @@
 #include <linux/ethtool.h>
 #include <linux/inetdevice.h>
 #include <linux/sunrpc/addr.h>
+#include <net/addrconf.h>
 #include "socklnd.h"
 
 static struct lnet_lnd the_ksocklnd;
@@ -79,8 +80,7 @@ static int ksocknal_ip2index(struct sockaddr *addr, struct lnet_ni *ni)
 	int ret = -1;
 	const struct in_ifaddr *ifa;
 
-	if (addr->sa_family != AF_INET)
-		/* No IPv6 support yet */
+	if (addr->sa_family != AF_INET && addr->sa_family != AF_INET6)
 		return ret;
 
 	rcu_read_lock();
@@ -94,14 +94,36 @@ static int ksocknal_ip2index(struct sockaddr *addr, struct lnet_ni *ni)
 		if (!(flags & IFF_UP))
 			continue;
 
-		in_dev = __in_dev_get_rcu(dev);
-		if (!in_dev)
-			continue;
+		switch (addr->sa_family) {
+		case AF_INET:
+			in_dev = __in_dev_get_rcu(dev);
+			if (!in_dev)
+				continue;
 
-		in_dev_for_each_ifa_rcu(ifa, in_dev) {
-			if (ifa->ifa_local ==
-			    ((struct sockaddr_in *)addr)->sin_addr.s_addr)
-				ret = dev->ifindex;
+			in_dev_for_each_ifa_rcu(ifa, in_dev) {
+				if (ifa->ifa_local ==
+				    ((struct sockaddr_in *)addr)->sin_addr.s_addr)
+					ret = dev->ifindex;
+			}
+			break;
+#if IS_ENABLED(CONFIG_IPV6)
+		case AF_INET6: {
+			struct inet6_dev *in6_dev;
+			const struct inet6_ifaddr *ifa6;
+			struct sockaddr_in6 *addr6 = (struct sockaddr_in6 *)addr;
+
+			in6_dev = __in6_dev_get(dev);
+			if (!in6_dev)
+				continue;
+
+			list_for_each_entry_rcu(ifa6, &in6_dev->addr_list, if_list) {
+				if (ipv6_addr_cmp(&ifa6->addr,
+						  &addr6->sin6_addr) == 0)
+					ret = dev->ifindex;
+			}
+			break;
+			}
+#endif /* IS_ENABLED(CONFIG_IPV6) */
 		}
 		if (ret >= 0)
 			break;
-- 
2.27.0

* [lustre-devel] [PATCH 08/20] lnet: only use PUBLIC IP6 addresses for connections
From: James Simmons @ 2022-10-14 21:37 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

IPv6 can have temporary addresses.  These can be used for short-lived
outgoing connections to increase privacy.  They are not suitable for
long-term connections.

So request that only PUBLIC IPv6 addresses are used when making a
connection.

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: cd3b89be221b4c5b6 ("LU-10391 lnet: only use PUBLIC IP6 addresses for connections")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48571
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/lib-socket.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net/lnet/lnet/lib-socket.c b/net/lnet/lnet/lib-socket.c
index 3a99cb69f66f..01f375ed96a3 100644
--- a/net/lnet/lnet/lib-socket.c
+++ b/net/lnet/lnet/lib-socket.c
@@ -379,6 +379,17 @@ lnet_sock_connect(int interface, int local_port,
 	if (IS_ERR(sock))
 		return sock;
 
+	/* Avoid temporary addresses; they are bad for long-lived
+	 * connections such as lustre mounts.
+	 * RFC4941, section 3.6 suggests that:
+	 *    Individual applications, which have specific
+	 *    knowledge about the normal duration of connections,
+	 *    MAY override this as appropriate.
+	 */
+	if (peeraddr->sa_family == PF_INET6)
+		ip6_sock_set_addr_preferences(sock->sk,
+					      IPV6_PREFER_SRC_PUBLIC);
+
 	rc = kernel_connect(sock, peeraddr, sizeof(struct sockaddr_in6), 0);
 	if (!rc)
 		return sock;
-- 
2.27.0

* [lustre-devel] [PATCH 09/20] lustre: osc: Remove oap_magic
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Patrick Farrell <pfarrell@whamcloud.com>

oap_magic exists only to debug init and allocation
failures, but it is allocated for every page of memory, which
wastes a lot of memory on something that does not need a
dedicated debug field.

Remove it.

WC-bug-id: https://jira.whamcloud.com/browse/LU-15619
Lustre-commit: 721df28648c4b3faa ("LU-15619 osc: Remove oap_magic")
Author: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46713
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lustre_osc.h | 1 -
 fs/lustre/osc/osc_cache.c      | 6 ------
 fs/lustre/osc/osc_page.c       | 7 ++-----
 3 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h
index 323eeba3da42..884eafee8a83 100644
--- a/fs/lustre/include/lustre_osc.h
+++ b/fs/lustre/include/lustre_osc.h
@@ -75,7 +75,6 @@ enum async_flags {
 };
 
 struct osc_async_page {
-	int			oap_magic;
 	unsigned short		oap_cmd;
 
 	struct list_head        oap_pending_item;
diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c
index 36fec837d93e..12d9ab519e48 100644
--- a/fs/lustre/osc/osc_cache.c
+++ b/fs/lustre/osc/osc_cache.c
@@ -2313,7 +2313,6 @@ int osc_prep_async_page(struct osc_object *osc, struct osc_page *ops,
 	if (!page)
 		return -EIO;
 
-	oap->oap_magic = OAP_MAGIC;
 	oap->oap_obj = osc;
 
 	oap->oap_page = vmpage;
@@ -2354,9 +2353,6 @@ int osc_queue_async_io(const struct lu_env *env, struct cl_io *io,
 	bool need_release = false;
 	int rc = 0;
 
-	if (oap->oap_magic != OAP_MAGIC)
-		return -EINVAL;
-
 	if (!cli->cl_import || cli->cl_import->imp_invalid)
 		return -EIO;
 
@@ -2537,8 +2533,6 @@ int osc_teardown_async_page(const struct lu_env *env,
 	struct osc_async_page *oap = &ops->ops_oap;
 	int rc = 0;
 
-	LASSERT(oap->oap_magic == OAP_MAGIC);
-
 	CDEBUG(D_INFO, "teardown oap %p page %p at index %lu.\n",
 	       oap, ops, osc_index(oap2osc(oap)));
 
diff --git a/fs/lustre/osc/osc_page.c b/fs/lustre/osc/osc_page.c
index 12ba10827a23..ba10ba320079 100644
--- a/fs/lustre/osc/osc_page.c
+++ b/fs/lustre/osc/osc_page.c
@@ -125,10 +125,10 @@ static int osc_page_print(const struct lu_env *env,
 	struct client_obd *cli = &osc_export(obj)->exp_obd->u.cli;
 
 	return (*printer)(env, cookie, LUSTRE_OSC_NAME
-			  "-page@%p %lu: 1< %#x %d %c %c > 2< %llu %u %u %#x %#x | %p %p %p > 3< %d %d > 4< %d %d %d %lu %c | %c %c %c %c > 5< %c %c %c %c | %d %c | %d %c %c>\n",
+			  "-page@%p %lu: 1< %d %c %c > 2< %llu %u %u %#x %#x | %p %p %p > 3< %d %d > 4< %d %d %d %lu %c | %c %c %c %c > 5< %c %c %c %c | %d %c | %d %c %c>\n",
 			  opg, osc_index(opg),
 			  /* 1 */
-			  oap->oap_magic, oap->oap_cmd,
+			  oap->oap_cmd,
 			  list_empty_marker(&oap->oap_pending_item),
 			  list_empty_marker(&oap->oap_rpc_item),
 			  /* 2 */
@@ -293,9 +293,6 @@ void osc_page_submit(const struct lu_env *env, struct osc_page *opg,
 	struct osc_io *oio = osc_env_io(env);
 	struct osc_async_page *oap = &opg->ops_oap;
 
-	LASSERTF(oap->oap_magic == OAP_MAGIC,
-		 "Bad oap magic: oap %p, magic 0x%x\n",
-		 oap, oap->oap_magic);
 	LASSERT(oap->oap_async_flags & ASYNC_READY);
 	LASSERT(oap->oap_async_flags & ASYNC_COUNT_STABLE);
 
-- 
2.27.0

* [lustre-devel] [PATCH 10/20] lustre: ptlrpc: add assert for ptlrpc_service_purge_all
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Etienne AUJAMES, Lustre Development List

From: Etienne AUJAMES <etienne.aujames@cea.fr>

ptlrpc_service_purge_all() calls ptlrpc_server_request_get() with
"force=true" to purge all active requests before stopping an NRS
policy (when unregistering a service).

"force" mode should always return a request if a pending request is
present in the NRS policy.

BUG: unable to handle kernel NULL pointer dereference at
0000000000000114
IP: [<ffffffffc0d9e965>] ptlrpc_nrs_req_stop_nolock+0x5/0x150
.....
? ptlrpc_server_finish_active_request+0x2b/0x140 [ptlrpc]
ptlrpc_service_purge_all+0x137/0x920 [ptlrpc]
ptlrpc_unregister_service+0xe7/0x6f0 [ptlrpc]
ost_cleanup+0x52/0x1b0 [ost]
class_free_dev+0x21d/0x720 [obdclass]
class_export_put+0x1f0/0x2c0 [obdclass]
class_unlink_export+0x135/0x170 [obdclass]
class_decref+0x80/0x160 [obdclass]
class_detach+0x1b3/0x2e0 [obdclass]
class_process_config+0x1a38/0x2830 [obdclass]
? complete+0x4a/0x60
? list_del+0xd/0x30
? wait_for_completion+0x4e/0x140
class_manual_cleanup+0x1e0/0x710 [obdclass]
server_stop_servers+0xd5/0x160 [obdclass]
server_put_super+0x12d/0xd00 [obdclass]
generic_shutdown_super+0x6d/0x100
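
The invariant behind the new assertion can be sketched in plain C. This
is a userspace illustration only: the queue layout and the helper names
(req_pending, req_get, purge_all) are hypothetical stand-ins, not the
ptlrpc or NRS API. The point is that a forced "pending" check and a
forced "get" must agree, so a NULL result inside the loop is a bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued request. */
struct request {
	struct request *next;
	bool blocked;	/* a policy would normally hold this one back */
};

/* Hypothetical stand-in for a service partition's pending queue. */
struct queue {
	struct request *head;
};

/* True if a request is eligible; with force set, blocked ones count too. */
static bool req_pending(const struct queue *q, bool force)
{
	const struct request *r;

	for (r = q->head; r; r = r->next)
		if (force || !r->blocked)
			return true;
	return false;
}

/* Dequeue the first eligible request; force ignores the blocked flag. */
static struct request *req_get(struct queue *q, bool force)
{
	struct request **pp;

	for (pp = &q->head; *pp; pp = &(*pp)->next) {
		if (force || !(*pp)->blocked) {
			struct request *r = *pp;

			*pp = r->next;
			r->next = NULL;
			return r;
		}
	}
	return NULL;
}

/* Drain the queue in force mode; returns the number of requests purged. */
static int purge_all(struct queue *q)
{
	int purged = 0;

	while (req_pending(q, true)) {
		struct request *r = req_get(q, true);

		/* pending(force) said yes, so get(force) must not fail */
		assert(r);
		purged++;
	}
	return purged;
}
```

If req_get() honoured the blocked flag even in force mode, the loop
above would trip the assertion rather than pass NULL on to the caller.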

WC-bug-id: https://jira.whamcloud.com/browse/LU-16144
Lustre-commit: 1bba7dd425d3fc9ef3 ("LU-16144 nrs: implement force mode for nrs_tbf_req_get()")
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48494
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/ptlrpc/service.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/lustre/ptlrpc/service.c b/fs/lustre/ptlrpc/service.c
index 59fe1f4aa18f..aaf7529e25f3 100644
--- a/fs/lustre/ptlrpc/service.c
+++ b/fs/lustre/ptlrpc/service.c
@@ -2939,6 +2939,7 @@ ptlrpc_service_purge_all(struct ptlrpc_service *svc)
 
 		while (ptlrpc_server_request_pending(svcpt, true)) {
 			req = ptlrpc_server_request_get(svcpt, true);
+			LASSERT(req);
 			ptlrpc_server_finish_active_request(svcpt, req);
 		}
 
-- 
2.27.0


* [lustre-devel] [PATCH 11/20] lustre: ptlrpc: lower the message level in no resend case
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (9 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 10/20] lustre: ptlrpc: add assert for ptlrpc_service_purge_all James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 12/20] lustre: obdclass: user netlink to collect devices information James Simmons
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Yang Sheng, Lustre Development List

From: Yang Sheng <ys@whamcloud.com>

Don't report a wrong generation as an error message in the
rq_no_resend case.
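
A minimal sketch of the level selection follows. The enum and the
function name are hypothetical; only the ternary mirrors what the patch
does with D_INFO and D_ERROR.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical severity levels standing in for D_INFO and D_ERROR. */
enum log_level { LOG_INFO, LOG_ERROR };

/* Pick the level for the "wrong generation" message. */
static enum log_level wrong_generation_level(bool no_resend)
{
	/* Requests that will not be resent are logged quietly. */
	return no_resend ? LOG_INFO : LOG_ERROR;
}
```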

WC-bug-id: https://jira.whamcloud.com/browse/LU-16166
Lustre-commit: d13cca56a5ae2ad44 ("LU-16166 ptlrpc: lower the message level in no resend case")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48585
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/ptlrpc/client.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/ptlrpc/client.c b/fs/lustre/ptlrpc/client.c
index 069ffdf6f93f..5f0ff476db6a 100644
--- a/fs/lustre/ptlrpc/client.c
+++ b/fs/lustre/ptlrpc/client.c
@@ -1227,7 +1227,8 @@ static int ptlrpc_import_delay_req(struct obd_import *imp,
 			DEBUG_REQ(D_NET, req, "IMP_INVALID");
 		*status = -ESHUTDOWN; /* bz 12940 */
 	} else if (req->rq_import_generation != imp->imp_generation) {
-		DEBUG_REQ(D_ERROR, req, "req wrong generation:");
+		DEBUG_REQ(req->rq_no_resend ? D_INFO : D_ERROR,
+			  req, "req wrong generation:");
 		*status = -EIO;
 	} else if (req->rq_send_state != imp->imp_state) {
 		/* invalidate in progress - any requests should be drop */
-- 
2.27.0


* [lustre-devel] [PATCH 12/20] lustre: obdclass: user netlink to collect devices information
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (10 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 11/20] lustre: ptlrpc: lower the message level in no resend case James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 13/20] lnet: use %pISc for formatting IP addresses James Simmons
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

Our utilities can report a device list with various bits of data
to users through the debugfs file 'devices'. By default this debugfs
file is available only to root, which prevents regular users from
collecting the information. Use netlink to let non-root users collect
the same information for lctl dl. Netlink also removes the 8K ioctl
limit. Add the ability to present this data in YAML format as well.
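
The dump handler in this patch keeps a cursor (gdl_start) and restarts
its device loop from it, which is how a listing can span more than one
message. The sketch below shows that resume pattern in userspace C. The
device table, the dump_ctx type and dump_some() are hypothetical, and no
netlink calls are involved.

```c
#include <assert.h>
#include <stddef.h>

#define NDEV 5

/* Hypothetical device table; a zero entry marks an unused slot. */
static const int dev_table[NDEV] = { 10, 0, 12, 13, 14 };

/* Cursor kept between calls, playing the role of gdl_start. */
struct dump_ctx {
	unsigned int start;
};

/*
 * Copy as many device ids as fit into out[0..cap) and remember where to
 * resume. Returns the number of entries written; 0 means nothing is left.
 */
static size_t dump_some(struct dump_ctx *ctx, int *out, size_t cap)
{
	size_t n = 0;
	unsigned int idx;

	for (idx = ctx->start; idx < NDEV; idx++) {
		if (!dev_table[idx])
			continue;	/* skip empty slots */
		if (n == cap)
			break;		/* buffer full, resume here next time */
		out[n++] = dev_table[idx];
	}
	ctx->start = idx;
	return n;
}
```

A caller keeps invoking dump_some() with the same context until it
returns 0, so the output size is not bounded by a single buffer.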

WC-bug-id: https://jira.whamcloud.com/browse/LU-9680
Lustre-commit: 86ba46c24430f67bb ("LU-9680 obdclass: user netlink to collect devices information")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/31618
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
 fs/lustre/include/lustre_kernelcomm.h         |  37 ++-
 fs/lustre/obdclass/class_obd.c                |  14 +-
 fs/lustre/obdclass/kernelcomm.c               | 257 +++++++++++++++++-
 include/uapi/linux/lustre/lustre_kernelcomm.h |  18 ++
 4 files changed, 315 insertions(+), 11 deletions(-)

diff --git a/fs/lustre/include/lustre_kernelcomm.h b/fs/lustre/include/lustre_kernelcomm.h
index bd5376b2a672..db725ab8a0f9 100644
--- a/fs/lustre/include/lustre_kernelcomm.h
+++ b/fs/lustre/include/lustre_kernelcomm.h
@@ -41,11 +41,46 @@
 /* For declarations shared with userspace */
 #include <uapi/linux/lustre/lustre_kernelcomm.h>
 
+/**
+ * enum lustre_device_attrs	      - Lustre general top-level netlink
+ *					attributes that describe lustre
+ *					'devices'. These values are used
+ *					to piece together messages for
+ *					sending and receiving.
+ *
+ * @LUSTRE_DEVICE_ATTR_UNSPEC:		unspecified attribute to catch errors
+ *
+ * @LUSTRE_DEVICE_ATTR_HDR:		Netlink group this data is for
+ *					(NLA_NUL_STRING)
+ * @LUSTRE_DEVICE_ATTR_INDEX:		device number used as an index (NLA_U16)
+ * @LUSTRE_DEVICE_ATTR_STATUS:		status of the device (NLA_STRING)
+ * @LUSTRE_DEVICE_ATTR_CLASS:		class the device belongs to (NLA_STRING)
+ * @LUSTRE_DEVICE_ATTR_NAME:		name of the device (NLA_STRING)
+ * @LUSTRE_DEVICE_ATTR_UUID:		UUID of the device (NLA_STRING)
+ * @LUSTRE_DEVICE_ATTR_REFCOUNT:	refcount of the device (NLA_U32)
+ */
+enum lustre_device_attrs {
+	LUSTRE_DEVICE_ATTR_UNSPEC = 0,
+
+	LUSTRE_DEVICE_ATTR_HDR,
+	LUSTRE_DEVICE_ATTR_INDEX,
+	LUSTRE_DEVICE_ATTR_STATUS,
+	LUSTRE_DEVICE_ATTR_CLASS,
+	LUSTRE_DEVICE_ATTR_NAME,
+	LUSTRE_DEVICE_ATTR_UUID,
+	LUSTRE_DEVICE_ATTR_REFCOUNT,
+
+	__LUSTRE_DEVICE_ATTR_MAX_PLUS_ONE
+};
+
+#define LUSTRE_DEVICE_ATTR_MAX (__LUSTRE_DEVICE_ATTR_MAX_PLUS_ONE - 1)
+
 /* prototype for callback function on kuc groups */
 typedef int (*libcfs_kkuc_cb_t)(void *data, void *cb_arg);
 
 /* Kernel methods */
-void libcfs_kkuc_init(void);
+int libcfs_kkuc_init(void);
+void libcfs_kkuc_fini(void);
 int libcfs_kkuc_group_put(const struct obd_uuid *uuid, int group, void *data);
 int libcfs_kkuc_group_add(struct file *fp, const struct obd_uuid *uuid, int uid,
 			  int group, void *data, size_t data_len);
diff --git a/fs/lustre/obdclass/class_obd.c b/fs/lustre/obdclass/class_obd.c
index f455ed752c15..67a94222a664 100644
--- a/fs/lustre/obdclass/class_obd.c
+++ b/fs/lustre/obdclass/class_obd.c
@@ -671,15 +671,17 @@ static int __init obdclass_init(void)
 	if (err)
 		return err;
 
-	libcfs_kkuc_init();
+	err = obd_init_checks();
+	if (err)
+		return err;
 
-	err = obd_zombie_impexp_init();
+	err = libcfs_kkuc_init();
 	if (err)
 		return err;
 
-	err = obd_init_checks();
+	err = obd_zombie_impexp_init();
 	if (err)
-		goto cleanup_zombie_impexp;
+		goto cleanup_kkuc;
 
 	err = class_handle_init();
 	if (err)
@@ -754,6 +756,9 @@ static int __init obdclass_init(void)
 cleanup_zombie_impexp:
 	obd_zombie_impexp_stop();
 
+cleanup_kkuc:
+	libcfs_kkuc_fini();
+
 	return err;
 }
 
@@ -771,6 +776,7 @@ static void obdclass_exit(void)
 	class_handle_cleanup();
 	class_del_uuid(NULL); /* Delete all UUIDs. */
 	obd_zombie_impexp_stop();
+	libcfs_kkuc_fini();
 }
 
 void obd_heat_clear(struct obd_heat_instance *instance, int count)
diff --git a/fs/lustre/obdclass/kernelcomm.c b/fs/lustre/obdclass/kernelcomm.c
index e59b6aadf097..5682d4e1ab53 100644
--- a/fs/lustre/obdclass/kernelcomm.c
+++ b/fs/lustre/obdclass/kernelcomm.c
@@ -38,16 +38,254 @@
 #define DEBUG_SUBSYSTEM S_CLASS
 
 #include <linux/file.h>
-#include <linux/libcfs/libcfs.h>
+#include <linux/glob.h>
+#include <net/genetlink.h>
+#include <net/sock.h>
+
+#include <obd_class.h>
 #include <obd_support.h>
 #include <lustre_kernelcomm.h>
 
+static struct genl_family lustre_family;
+
+static struct ln_key_list device_list = {
+	.lkl_maxattr			= LUSTRE_DEVICE_ATTR_MAX,
+	.lkl_list			= {
+		[LUSTRE_DEVICE_ATTR_HDR]	= {
+			.lkp_value		= "devices",
+			.lkp_key_format		= LNKF_SEQUENCE | LNKF_MAPPING,
+			.lkp_data_type		= NLA_NUL_STRING,
+		},
+		[LUSTRE_DEVICE_ATTR_INDEX]	= {
+			.lkp_value		= "index",
+			.lkp_data_type		= NLA_U16
+		},
+		[LUSTRE_DEVICE_ATTR_STATUS]	= {
+			.lkp_value		= "status",
+			.lkp_data_type		= NLA_STRING
+		},
+		[LUSTRE_DEVICE_ATTR_CLASS]	= {
+			.lkp_value		= "type",
+			.lkp_data_type		= NLA_STRING
+		},
+		[LUSTRE_DEVICE_ATTR_NAME]	= {
+			.lkp_value		= "name",
+			.lkp_data_type		= NLA_STRING
+		},
+		[LUSTRE_DEVICE_ATTR_UUID]	= {
+			.lkp_value		= "uuid",
+			.lkp_data_type		= NLA_STRING
+		},
+		[LUSTRE_DEVICE_ATTR_REFCOUNT]	= {
+			.lkp_value		= "refcount",
+			.lkp_data_type		= NLA_U32
+		},
+	},
+};
+
+struct genl_dev_list {
+	struct obd_device	*gdl_target;
+	unsigned int		gdl_start;
+};
+
+static inline struct genl_dev_list *
+device_dump_ctx(struct netlink_callback *cb)
+{
+	return (struct genl_dev_list *)cb->args[0];
+}
+
+/* generic ->start() handler for GET requests */
+static int lustre_device_list_start(struct netlink_callback *cb)
+{
+	struct genlmsghdr *gnlh = nlmsg_data(cb->nlh);
+	struct netlink_ext_ack *extack = cb->extack;
+	struct genl_dev_list *glist;
+	int msg_len, rc = 0;
+
+	glist = kmalloc(sizeof(*glist), GFP_KERNEL);
+	if (!glist)
+		return -ENOMEM;
+
+	cb->args[0] = (long)glist;
+	glist->gdl_target = NULL;
+	glist->gdl_start = 0;
+
+	msg_len = genlmsg_len(gnlh);
+	if (msg_len > 0) {
+		struct nlattr *params = genlmsg_data(gnlh);
+		struct nlattr *dev;
+		int rem;
+
+		nla_for_each_attr(dev, params, msg_len, rem) {
+			struct nlattr *prop;
+			int rem2;
+
+			nla_for_each_nested(prop, dev, rem2) {
+				char name[MAX_OBD_NAME];
+				struct obd_device *obd;
+
+				if (nla_type(prop) != LN_SCALAR_ATTR_VALUE ||
+				    nla_strcmp(prop, "name") != 0)
+					continue;
+
+				prop = nla_next(prop, &rem2);
+				if (nla_type(prop) != LN_SCALAR_ATTR_VALUE) {
+					rc = -EINVAL;
+					goto report_err;
+				}
+
+				rc = nla_strlcpy(name, prop, sizeof(name));
+				if (rc < 0)
+					goto report_err;
+				rc = 0;
+
+				obd = class_name2obd(name);
+				if (obd)
+					glist->gdl_target = obd;
+			}
+		}
+		if (!glist->gdl_target) {
+			NL_SET_ERR_MSG(extack, "No devices found");
+			rc = -ENOENT;
+		}
+	}
+report_err:
+	if (rc < 0) {
+		kfree(glist);
+		cb->args[0] = 0;
+	}
+	return rc;
+}
+
+static int lustre_device_list_dump(struct sk_buff *msg,
+				   struct netlink_callback *cb)
+{
+	struct genl_dev_list *glist = device_dump_ctx(cb);
+	struct obd_device *filter = glist->gdl_target;
+	struct netlink_ext_ack *extack = cb->extack;
+	int portid = NETLINK_CB(cb->skb).portid;
+	int seq = cb->nlh->nlmsg_seq;
+	int idx, rc = 0;
+
+	if (glist->gdl_start == 0) {
+		const struct ln_key_list *all[] = {
+			&device_list, NULL
+		};
+
+		rc = lnet_genl_send_scalar_list(msg, portid, seq,
+						&lustre_family,
+						NLM_F_CREATE | NLM_F_MULTI,
+						LUSTRE_CMD_DEVICES, all);
+		if (rc < 0) {
+			NL_SET_ERR_MSG(extack, "failed to send key table");
+			return rc;
+		}
+	}
+
+	for (idx = glist->gdl_start; idx < class_devno_max(); idx++) {
+		struct obd_device *obd;
+		const char *status;
+		void *hdr;
+
+		obd = class_num2obd(idx);
+		if (!obd)
+			continue;
+
+		if (filter && filter != obd)
+			continue;
+
+		hdr = genlmsg_put(msg, portid, seq, &lustre_family,
+				  NLM_F_MULTI, LUSTRE_CMD_DEVICES);
+		if (!hdr) {
+			NL_SET_ERR_MSG(extack, "failed to send values");
+			genlmsg_cancel(msg, hdr);
+			rc = -EMSGSIZE;
+			break;
+		}
+
+		if (idx == 0)
+			nla_put_string(msg, LUSTRE_DEVICE_ATTR_HDR, "");
+
+		nla_put_u16(msg, LUSTRE_DEVICE_ATTR_INDEX, obd->obd_minor);
+
+		/* Collect only the index value for a single obd */
+		if (filter) {
+			genlmsg_end(msg, hdr);
+			idx++;
+			break;
+		}
+
+		if (obd->obd_stopping)
+			status = "ST";
+		else if (obd->obd_inactive)
+			status = "IN";
+		else if (obd->obd_set_up)
+			status = "UP";
+		else if (obd->obd_attached)
+			status = "AT";
+		else
+			status = "--";
+
+		nla_put_string(msg, LUSTRE_DEVICE_ATTR_STATUS, status);
+
+		nla_put_string(msg, LUSTRE_DEVICE_ATTR_CLASS,
+			       obd->obd_type->typ_name);
+
+		nla_put_string(msg, LUSTRE_DEVICE_ATTR_NAME,
+			       obd->obd_name);
+
+		nla_put_string(msg, LUSTRE_DEVICE_ATTR_UUID,
+			       obd->obd_uuid.uuid);
+
+		nla_put_u32(msg, LUSTRE_DEVICE_ATTR_REFCOUNT,
+			    atomic_read(&obd->obd_refcount));
+
+		genlmsg_end(msg, hdr);
+	}
+
+	glist->gdl_start = idx;
+	return rc < 0 ? rc : msg->len;
+}
+
+int lustre_device_done(struct netlink_callback *cb)
+{
+	struct genl_dev_list *glist;
+
+	glist = device_dump_ctx(cb);
+	kfree(glist);
+	cb->args[0] = 0;
+
+	return 0;
+}
+
+static const struct genl_multicast_group lustre_mcast_grps[] = {
+	{ .name		= "devices",		},
+};
+
+static const struct genl_ops lustre_genl_ops[] = {
+	{
+		.cmd		= LUSTRE_CMD_DEVICES,
+		.start		= lustre_device_list_start,
+		.dumpit		= lustre_device_list_dump,
+		.done		= lustre_device_done,
+	},
+};
+
+static struct genl_family lustre_family = {
+	.name		= LUSTRE_GENL_NAME,
+	.version	= LUSTRE_GENL_VERSION,
+	.module		= THIS_MODULE,
+	.ops		= lustre_genl_ops,
+	.n_ops		= ARRAY_SIZE(lustre_genl_ops),
+	.mcgrps		= lustre_mcast_grps,
+	.n_mcgrps	= ARRAY_SIZE(lustre_mcast_grps),
+};
+
 /**
  * libcfs_kkuc_msg_put - send an message from kernel to userspace
- *
- * @fp:		to send the message to
- * @payload:	Payload data. First field of payload is always
- *		struct kuc_hdr
+ * @param fp	to send the message to
+ * @param payload Payload data. First field of payload is always
+ *  struct kuc_hdr
  */
 static int libcfs_kkuc_msg_put(struct file *filp, void *payload)
 {
@@ -104,12 +342,19 @@ static inline bool libcfs_kkuc_group_is_valid(unsigned int group)
 	return group < ARRAY_SIZE(kkuc_groups);
 }
 
-void libcfs_kkuc_init(void)
+int libcfs_kkuc_init(void)
 {
 	int group;
 
 	for (group = 0; group < ARRAY_SIZE(kkuc_groups); group++)
 		INIT_LIST_HEAD(&kkuc_groups[group]);
+
+	return genl_register_family(&lustre_family);
+}
+
+void libcfs_kkuc_fini(void)
+{
+	genl_unregister_family(&lustre_family);
 }
 
 /** Add a receiver to a broadcast group
diff --git a/include/uapi/linux/lustre/lustre_kernelcomm.h b/include/uapi/linux/lustre/lustre_kernelcomm.h
index 744eeb674f72..91bb686d33e9 100644
--- a/include/uapi/linux/lustre/lustre_kernelcomm.h
+++ b/include/uapi/linux/lustre/lustre_kernelcomm.h
@@ -39,6 +39,24 @@
 
 #include <linux/types.h>
 
+#define LUSTRE_GENL_NAME		"lustre"
+#define LUSTRE_GENL_VERSION		0x1
+
+/*
+ * enum lustre_commands		      - Supported Lustre Netlink commands
+ *
+ * @LUSTRE_CMD_UNSPEC:			unspecified command to catch errors
+ * @LUSTRE_CMD_DEVICES:			command to manage the Lustre devices
+ */
+enum lustre_commands {
+	LUSTRE_CMD_UNSPEC	= 0,
+	LUSTRE_CMD_DEVICES	= 1,
+
+	__LUSTRE_CMD_MAX_PLUS_ONE
+};
+
+#define LUSTRE_CMD_MAX	(__LUSTRE_CMD_MAX_PLUS_ONE - 1)
+
 /* KUC message header.
  * All current and future KUC messages should use this header.
  * To avoid having to include Lustre headers from libcfs, define this here.
-- 
2.27.0


* [lustre-devel] [PATCH 13/20] lnet: use %pISc for formatting IP addresses
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (11 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 12/20] lustre: obdclass: user netlink to collect devices information James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 14/20] lustre: llog: correct llog FID and path output James Simmons
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

The Linux kernel's printf functionality understands %pIS to mean that
the address in a 'struct sockaddr' should be formatted, either as
IPv4 or IPv6.  For IPv6, the verbose format, which shows all 16 bytes
whether zero or not, is used.

To get the more familiar "compressed" format where strings of :0000:
are replaced with ::, we need to add the 'c' flag.  This is ignored
for IPv4.

When requesting the port as well ("%pISp"), the 'c' and 'p' can appear
in either order.

So this patch changes all %pIS to %pISc as we always want the
compressed format.
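
The difference between the two formats can be reproduced in userspace.
The sketch below is an analogue, not the kernel's vsprintf code: the
helper names are hypothetical, the verbose form is built by hand, and
the compressed form relies on POSIX inet_ntop().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Verbose form: eight 4-digit hex groups, zeros included. */
static void format_verbose(const struct in6_addr *a, char *buf, size_t len)
{
	const unsigned char *b = a->s6_addr;

	snprintf(buf, len,
		 "%02x%02x:%02x%02x:%02x%02x:%02x%02x:"
		 "%02x%02x:%02x%02x:%02x%02x:%02x%02x",
		 b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
		 b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
}

/* Compressed form: inet_ntop() collapses a run of zero groups into "::". */
static const char *format_compressed(const struct in6_addr *a,
				     char *buf, size_t len)
{
	return inet_ntop(AF_INET6, a, buf, (socklen_t)len);
}
```

For an address such as fe80::1 the verbose helper spells out every
group, while the compressed helper returns the short form that most
tools display.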

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: ed6b125bab5f7e383 ("LU-10391 lnet: use %pISc for formatting IP addresses")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48685
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c       | 18 +++++------
 net/lnet/klnds/socklnd/socklnd_cb.c    | 44 +++++++++++++-------------
 net/lnet/klnds/socklnd/socklnd_proto.c | 30 +++++++++---------
 net/lnet/lnet/acceptor.c               | 34 ++++++++++----------
 net/lnet/lnet/lib-socket.c             |  2 +-
 5 files changed, 64 insertions(+), 64 deletions(-)

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 89696977ac63..8999580b67b4 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -527,13 +527,13 @@ ksocknal_associate_cb_conn_locked(struct ksock_conn_cb *conn_cb,
 			/* conn_cb wasn't bound locally yet (the initial
 			 * conn_cb)
 			 */
-			CDEBUG(D_NET, "Binding %s %pIS to interface %d\n",
+			CDEBUG(D_NET, "Binding %s %pISc to interface %d\n",
 			       libcfs_idstr(&peer_ni->ksnp_id),
 			       &conn_cb->ksnr_addr,
 			       conn_iface);
 		} else {
 			CDEBUG(D_NET,
-			       "Rebinding %s %pIS from interface %d to %d\n",
+			       "Rebinding %s %pISc from interface %d to %d\n",
 			       libcfs_idstr(&peer_ni->ksnp_id),
 			       &conn_cb->ksnr_addr,
 			       conn_cb->ksnr_myiface,
@@ -835,7 +835,7 @@ ksocknal_accept(struct lnet_ni *ni, struct socket *sock)
 	cr = kzalloc(sizeof(*cr), GFP_NOFS);
 	if (!cr) {
 		LCONSOLE_ERROR_MSG(0x12f,
-				   "Dropping connection request from %pIS: memory exhausted\n",
+				   "Dropping connection request from %pISc: memory exhausted\n",
 				   &peer);
 		return -ENOMEM;
 	}
@@ -1113,7 +1113,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_conn_cb *conn_cb,
 	if (active &&
 	    !rpc_cmp_addr((struct sockaddr *)&conn_cb->ksnr_addr,
 			  (struct sockaddr *)&conn->ksnc_peeraddr)) {
-		CERROR("Route %s %pIS connected to %pIS\n",
+		CERROR("Route %s %pISc connected to %pISc\n",
 		       libcfs_idstr(&peer_ni->ksnp_id),
 		       &conn_cb->ksnr_addr,
 		       &conn->ksnc_peeraddr);
@@ -1183,7 +1183,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_conn_cb *conn_cb,
 	 *	socket callbacks.
 	 */
 	CDEBUG(D_NET,
-	       "New conn %s p %d.x %pIS -> %pISp incarnation:%lld sched[%d]\n",
+	       "New conn %s p %d.x %pISc -> %pIScp incarnation:%lld sched[%d]\n",
 	       libcfs_idstr(&peerid), conn->ksnc_proto->pro_version,
 	       &conn->ksnc_myaddr, &conn->ksnc_peeraddr,
 	       incarnation, cpt);
@@ -1552,7 +1552,7 @@ ksocknal_destroy_conn(struct ksock_conn *conn)
 	case SOCKNAL_RX_LNET_PAYLOAD:
 		last_rcv = conn->ksnc_rx_deadline -
 			   ksocknal_timeout();
-		CERROR("Completing partial receive from %s[%d], ip %pISp, with error, wanted: %zd, left: %d, last alive is %lld secs ago\n",
+		CERROR("Completing partial receive from %s[%d], ip %pIScp, with error, wanted: %zd, left: %d, last alive is %lld secs ago\n",
 		       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 		       conn->ksnc_type,
 		       &conn->ksnc_peeraddr,
@@ -1565,21 +1565,21 @@ ksocknal_destroy_conn(struct ksock_conn *conn)
 		break;
 	case SOCKNAL_RX_LNET_HEADER:
 		if (conn->ksnc_rx_started)
-			CERROR("Incomplete receive of lnet header from %s, ip %pISp, with error, protocol: %d.x.\n",
+			CERROR("Incomplete receive of lnet header from %s, ip %pIScp, with error, protocol: %d.x.\n",
 			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       &conn->ksnc_peeraddr,
 			       conn->ksnc_proto->pro_version);
 		break;
 	case SOCKNAL_RX_KSM_HEADER:
 		if (conn->ksnc_rx_started)
-			CERROR("Incomplete receive of ksock message from %s, ip %pISp, with error, protocol: %d.x.\n",
+			CERROR("Incomplete receive of ksock message from %s, ip %pIScp, with error, protocol: %d.x.\n",
 			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       &conn->ksnc_peeraddr,
 			       conn->ksnc_proto->pro_version);
 		break;
 	case SOCKNAL_RX_SLOP:
 		if (conn->ksnc_rx_started)
-			CERROR("Incomplete receive of slops from %s, ip %pISp, with error\n",
+			CERROR("Incomplete receive of slops from %s, ip %pIScp, with error\n",
 			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       &conn->ksnc_peeraddr);
 	       break;
diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c
index 308d8b0d6366..b2da535fbfbe 100644
--- a/net/lnet/klnds/socklnd/socklnd_cb.c
+++ b/net/lnet/klnds/socklnd/socklnd_cb.c
@@ -533,15 +533,15 @@ ksocknal_process_transmit(struct ksock_conn *conn, struct ksock_tx *tx)
 	if (!conn->ksnc_closing) {
 		switch (rc) {
 		case -ECONNRESET:
-			LCONSOLE_WARN("Host %pIS reset our connection while we were sending data; it may have rebooted.\n",
+			LCONSOLE_WARN("Host %pISc reset our connection while we were sending data; it may have rebooted.\n",
 				      &conn->ksnc_peeraddr);
 			break;
 		default:
-			LCONSOLE_WARN("There was an unexpected network error while writing to %pIS: %d.\n",
+			LCONSOLE_WARN("There was an unexpected network error while writing to %pISc: %d.\n",
 				      &conn->ksnc_peeraddr, rc);
 			break;
 		}
-		CDEBUG(D_NET, "[%p] Error %d on write to %s ip %pISp\n",
+		CDEBUG(D_NET, "[%p] Error %d on write to %s ip %pIScp\n",
 		       conn, rc, libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 		       &conn->ksnc_peeraddr);
 	}
@@ -676,7 +676,7 @@ ksocknal_queue_tx_locked(struct ksock_tx *tx, struct ksock_conn *conn)
 	 */
 	LASSERT(!conn->ksnc_closing);
 
-	CDEBUG(D_NET, "Sending to %s ip %pISp\n",
+	CDEBUG(D_NET, "Sending to %s ip %pIScp\n",
 	       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 	       &conn->ksnc_peeraddr);
 
@@ -777,7 +777,7 @@ ksocknal_find_connectable_conn_cb_locked(struct ksock_peer_ni *peer_ni)
 	if (!(conn_cb->ksnr_retry_interval == 0 || /* first attempt */
 	      now >= conn_cb->ksnr_timeout)) {
 		CDEBUG(D_NET,
-		       "Too soon to retry route %pIS (cnted %d, interval %lld, %lld secs later)\n",
+		       "Too soon to retry route %pISc (cnted %d, interval %lld, %lld secs later)\n",
 		       &conn_cb->ksnr_addr,
 		       conn_cb->ksnr_connected,
 		       conn_cb->ksnr_retry_interval,
@@ -1095,11 +1095,11 @@ ksocknal_process_receive(struct ksock_conn *conn)
 			LASSERT(rc != -EAGAIN);
 
 			if (!rc)
-				CDEBUG(D_NET, "[%p] EOF from %s ip %pISp\n",
+				CDEBUG(D_NET, "[%p] EOF from %s ip %pIScp\n",
 				       conn, libcfs_idstr(ksnp_id),
 				       &conn->ksnc_peeraddr);
 			else if (!conn->ksnc_closing)
-				CERROR("[%p] Error %d on read from %s ip %pISp\n",
+				CERROR("[%p] Error %d on read from %s ip %pIScp\n",
 				       conn, rc, libcfs_idstr(ksnp_id),
 				       &conn->ksnc_peeraddr);
 
@@ -1653,7 +1653,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	rc = lnet_sock_read(sock, &hello->kshm_magic,
 			    sizeof(hello->kshm_magic), timeout);
 	if (rc) {
-		CERROR("Error %d reading HELLO from %pIS\n",
+		CERROR("Error %d reading HELLO from %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0);
 		return rc;
@@ -1663,7 +1663,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	    hello->kshm_magic != __swab32(LNET_PROTO_MAGIC) &&
 	    hello->kshm_magic != le32_to_cpu(LNET_PROTO_TCP_MAGIC)) {
 		/* Unexpected magic! */
-		CERROR("Bad magic(1) %#08x (%#08x expected) from %pIS\n",
+		CERROR("Bad magic(1) %#08x (%#08x expected) from %pISc\n",
 		       __cpu_to_le32(hello->kshm_magic),
 		       LNET_PROTO_TCP_MAGIC, &conn->ksnc_peeraddr);
 		return -EPROTO;
@@ -1672,7 +1672,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	rc = lnet_sock_read(sock, &hello->kshm_version,
 			    sizeof(hello->kshm_version), timeout);
 	if (rc) {
-		CERROR("Error %d reading HELLO from %pIS\n",
+		CERROR("Error %d reading HELLO from %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0);
 		return rc;
@@ -1696,7 +1696,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 					    hello);
 		}
 
-		CERROR("Unknown protocol version (%d.x expected) from %pIS\n",
+		CERROR("Unknown protocol version (%d.x expected) from %pISc\n",
 		       conn->ksnc_proto->pro_version,
 		       &conn->ksnc_peeraddr);
 
@@ -1709,7 +1709,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	/* receive the rest of hello message anyway */
 	rc = conn->ksnc_proto->pro_recv_hello(conn, hello, timeout);
 	if (rc) {
-		CERROR("Error %d reading or checking hello from from %pIS\n",
+		CERROR("Error %d reading or checking hello from from %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0);
 		return rc;
@@ -1718,7 +1718,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	*incarnation = hello->kshm_src_incarnation;
 
 	if (LNET_NID_IS_ANY(&hello->kshm_src_nid)) {
-		CERROR("Expecting a HELLO hdr with a NID, but got LNET_NID_ANY from %pIS\n",
+		CERROR("Expecting a HELLO hdr with a NID, but got LNET_NID_ANY from %pISc\n",
 		       &conn->ksnc_peeraddr);
 		return -EPROTO;
 	}
@@ -1746,7 +1746,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 		/* peer_ni determines type */
 		conn->ksnc_type = ksocknal_invert_type(hello->kshm_ctype);
 		if (conn->ksnc_type == SOCKLND_CONN_NONE) {
-			CERROR("Unexpected type %d from %s ip %pIS\n",
+			CERROR("Unexpected type %d from %s ip %pISc\n",
 			       hello->kshm_ctype, libcfs_idstr(peerid),
 			       &conn->ksnc_peeraddr);
 			return -EPROTO;
@@ -1758,7 +1758,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	if (peerid->pid != recv_id.pid ||
 	    !nid_same(&peerid->nid,  &recv_id.nid)) {
 		LCONSOLE_ERROR_MSG(0x130,
-				   "Connected successfully to %s on host %pIS, but they claimed they were %s; please check your Lustre configuration.\n",
+				   "Connected successfully to %s on host %pISc, but they claimed they were %s; please check your Lustre configuration.\n",
 				   libcfs_idstr(peerid),
 				   &conn->ksnc_peeraddr,
 				   libcfs_idstr(&recv_id));
@@ -1771,7 +1771,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	}
 
 	if (ksocknal_invert_type(hello->kshm_ctype) != conn->ksnc_type) {
-		CERROR("Mismatched types: me %d, %s ip %pIS %d\n",
+		CERROR("Mismatched types: me %d, %s ip %pISc %d\n",
 		       conn->ksnc_type, libcfs_idstr(peerid),
 		       &conn->ksnc_peeraddr, hello->kshm_ctype);
 		return -EPROTO;
@@ -2149,7 +2149,7 @@ ksocknal_connd(void *arg)
 			if (ksocknal_connect(conn_cb)) {
 				/* consecutive retry */
 				if (cons_retry++ > SOCKNAL_INSANITY_RECONN) {
-					CWARN("massive consecutive re-connecting to %pIS\n",
+					CWARN("massive consecutive re-connecting to %pISc\n",
 					      &conn_cb->ksnr_addr);
 					cons_retry = 0;
 				}
@@ -2211,17 +2211,17 @@ ksocknal_find_timed_out_conn(struct ksock_peer_ni *peer_ni)
 
 			switch (error) {
 			case ECONNRESET:
-				CNETERR("A connection with %s (%pISp) was reset; it may have rebooted.\n",
+				CNETERR("A connection with %s (%pIScp) was reset; it may have rebooted.\n",
 					libcfs_idstr(&peer_ni->ksnp_id),
 					&conn->ksnc_peeraddr);
 				break;
 			case ETIMEDOUT:
-				CNETERR("A connection with %s (%pISp) timed out; the network or node may be down.\n",
+				CNETERR("A connection with %s (%pIScp) timed out; the network or node may be down.\n",
 					libcfs_idstr(&peer_ni->ksnp_id),
 					&conn->ksnc_peeraddr);
 				break;
 			default:
-				CNETERR("An unexpected network error %d occurred with %s (%pISp\n",
+				CNETERR("An unexpected network error %d occurred with %s (%pIScp\n",
 					error,
 					libcfs_idstr(&peer_ni->ksnp_id),
 					&conn->ksnc_peeraddr);
@@ -2235,7 +2235,7 @@ ksocknal_find_timed_out_conn(struct ksock_peer_ni *peer_ni)
 		    ktime_get_seconds() >= conn->ksnc_rx_deadline) {
 			/* Timed out incomplete incoming message */
 			ksocknal_conn_addref(conn);
-			CNETERR("Timeout receiving from %s (%pISp), state %d wanted %zd left %d\n",
+			CNETERR("Timeout receiving from %s (%pIScp), state %d wanted %zd left %d\n",
 				libcfs_idstr(&peer_ni->ksnp_id),
 				&conn->ksnc_peeraddr,
 				conn->ksnc_rx_state,
@@ -2257,7 +2257,7 @@ ksocknal_find_timed_out_conn(struct ksock_peer_ni *peer_ni)
 					    tx_list)
 				tx->tx_hstatus =
 					LNET_MSG_STATUS_LOCAL_TIMEOUT;
-			CNETERR("Timeout sending data to %s (%pISp) the network or that node may be down.\n",
+			CNETERR("Timeout sending data to %s (%pIScp) the network or that node may be down.\n",
 				libcfs_idstr(&peer_ni->ksnp_id),
 				&conn->ksnc_peeraddr);
 			spin_unlock_bh(&sched->kss_lock);
diff --git a/net/lnet/klnds/socklnd/socklnd_proto.c b/net/lnet/klnds/socklnd/socklnd_proto.c
index 0a93d572ea68..4d92a1c93aad 100644
--- a/net/lnet/klnds/socklnd/socklnd_proto.c
+++ b/net/lnet/klnds/socklnd/socklnd_proto.c
@@ -548,7 +548,7 @@ ksocknal_send_hello_v1(struct ksock_conn *conn, struct ksock_hello_msg *hello)
 
 	rc = lnet_sock_write(sock, hdr, sizeof(*hdr), lnet_acceptor_timeout());
 	if (rc) {
-		CNETERR("Error %d sending HELLO hdr to %pISp\n",
+		CNETERR("Error %d sending HELLO hdr to %pIScp\n",
 			rc, &conn->ksnc_peeraddr);
 		goto out;
 	}
@@ -563,7 +563,7 @@ ksocknal_send_hello_v1(struct ksock_conn *conn, struct ksock_hello_msg *hello)
 			     hello->kshm_nips * sizeof(u32),
 			     lnet_acceptor_timeout());
 	if (rc) {
-		CNETERR("Error %d sending HELLO payload (%d) to %pISp\n",
+		CNETERR("Error %d sending HELLO payload (%d) to %pIScp\n",
 			rc, hello->kshm_nips,
 			&conn->ksnc_peeraddr);
 	}
@@ -621,7 +621,7 @@ ksocknal_send_hello_v2(struct ksock_conn *conn, struct ksock_hello_msg *hello)
 			     lnet_acceptor_timeout());
 	kfree(hello4);
 	if (rc) {
-		CNETERR("Error %d sending HELLO hdr to %pISp\n",
+		CNETERR("Error %d sending HELLO hdr to %pIScp\n",
 			rc, &conn->ksnc_peeraddr);
 		return rc;
 	}
@@ -633,7 +633,7 @@ ksocknal_send_hello_v2(struct ksock_conn *conn, struct ksock_hello_msg *hello)
 			     hello->kshm_nips * sizeof(u32),
 			     lnet_acceptor_timeout());
 	if (rc) {
-		CNETERR("Error %d sending HELLO payload (%d) to %pISp\n",
+		CNETERR("Error %d sending HELLO payload (%d) to %pIScp\n",
 			rc, hello->kshm_nips,
 			&conn->ksnc_peeraddr);
 	}
@@ -654,7 +654,7 @@ ksocknal_send_hello_v4(struct ksock_conn *conn, struct ksock_hello_msg *hello)
 			     lnet_acceptor_timeout());
 
 	if (rc != 0)
-		CNETERR("Error %d sending HELLO hdr to %pISp\n",
+		CNETERR("Error %d sending HELLO hdr to %pIScp\n",
 			rc, &conn->ksnc_peeraddr);
 	return rc;
 }
@@ -679,7 +679,7 @@ ksocknal_recv_hello_v1(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 						    src_nid),
 			    timeout);
 	if (rc) {
-		CERROR("Error %d reading rest of HELLO hdr from %pIS\n",
+		CERROR("Error %d reading rest of HELLO hdr from %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0 && rc != -EALREADY);
 		goto out;
@@ -687,7 +687,7 @@ ksocknal_recv_hello_v1(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 
 	/* ...and check we got what we expected */
 	if (hdr->type != cpu_to_le32(LNET_MSG_HELLO)) {
-		CERROR("Expecting a HELLO hdr, but got type %d from %pIS\n",
+		CERROR("Expecting a HELLO hdr, but got type %d from %pISc\n",
 		       le32_to_cpu(hdr->type),
 		       &conn->ksnc_peeraddr);
 		rc = -EPROTO;
@@ -701,7 +701,7 @@ ksocknal_recv_hello_v1(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 	hello->kshm_nips = le32_to_cpu(hdr->payload_length) / sizeof(u32);
 
 	if (hello->kshm_nips > LNET_INTERFACES_NUM) {
-		CERROR("Bad nips %d from ip %pIS\n",
+		CERROR("Bad nips %d from ip %pISc\n",
 		       hello->kshm_nips, &conn->ksnc_peeraddr);
 		rc = -EPROTO;
 		goto out;
@@ -713,7 +713,7 @@ ksocknal_recv_hello_v1(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 	rc = lnet_sock_read(sock, hello->kshm_ips,
 			    hello->kshm_nips * sizeof(u32), timeout);
 	if (rc) {
-		CERROR("Error %d reading IPs from ip %pIS\n",
+		CERROR("Error %d reading IPs from ip %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0 && rc != -EALREADY);
 		goto out;
@@ -723,7 +723,7 @@ ksocknal_recv_hello_v1(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 		hello->kshm_ips[i] = __le32_to_cpu(hello->kshm_ips[i]);
 
 		if (!hello->kshm_ips[i]) {
-			CERROR("Zero IP[%d] from ip %pIS\n",
+			CERROR("Zero IP[%d] from ip %pISc\n",
 			       i, &conn->ksnc_peeraddr);
 			rc = -EPROTO;
 			break;
@@ -754,7 +754,7 @@ ksocknal_recv_hello_v2(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 			    offsetof(struct ksock_hello_msg_nid4, kshm_src_nid),
 			    timeout);
 	if (rc) {
-		CERROR("Error %d reading HELLO from %pIS\n",
+		CERROR("Error %d reading HELLO from %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0 && rc != -EALREADY);
 		return rc;
@@ -783,7 +783,7 @@ ksocknal_recv_hello_v2(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 	}
 
 	if (hello->kshm_nips > LNET_INTERFACES_NUM) {
-		CERROR("Bad nips %d from ip %pIS\n",
+		CERROR("Bad nips %d from ip %pISc\n",
 		       hello->kshm_nips, &conn->ksnc_peeraddr);
 		return -EPROTO;
 	}
@@ -794,7 +794,7 @@ ksocknal_recv_hello_v2(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 	rc = lnet_sock_read(sock, hello->kshm_ips,
 			    hello->kshm_nips * sizeof(u32), timeout);
 	if (rc) {
-		CERROR("Error %d reading IPs from ip %pIS\n",
+		CERROR("Error %d reading IPs from ip %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0 && rc != -EALREADY);
 		return rc;
@@ -805,7 +805,7 @@ ksocknal_recv_hello_v2(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 			__swab32s(&hello->kshm_ips[i]);
 
 		if (!hello->kshm_ips[i]) {
-			CERROR("Zero IP[%d] from ip %pIS\n",
+			CERROR("Zero IP[%d] from ip %pISc\n",
 			       i, &conn->ksnc_peeraddr);
 			return -EPROTO;
 		}
@@ -831,7 +831,7 @@ ksocknal_recv_hello_v4(struct ksock_conn *conn, struct ksock_hello_msg *hello,
 			    offsetof(struct ksock_hello_msg, kshm_src_nid),
 			    timeout);
 	if (rc) {
-		CERROR("Error %d reading HELLO from %pIS\n",
+		CERROR("Error %d reading HELLO from %pISc\n",
 		       rc, &conn->ksnc_peeraddr);
 		LASSERT(rc < 0 && rc != -EALREADY);
 		return rc;
diff --git a/net/lnet/lnet/acceptor.c b/net/lnet/lnet/acceptor.c
index 2306760210ae..6e219085cb53 100644
--- a/net/lnet/lnet/acceptor.c
+++ b/net/lnet/lnet/acceptor.c
@@ -91,37 +91,37 @@ lnet_connect_console_error(int rc, struct lnet_nid *peer_nid,
 	switch (rc) {
 	/* "normal" errors */
 	case -ECONNREFUSED:
-		CNETERR("Connection to %s at host %pISp was refused: check that Lustre is running on that node.\n",
+		CNETERR("Connection to %s at host %pIScp was refused: check that Lustre is running on that node.\n",
 			libcfs_nidstr(peer_nid), sa);
 		break;
 	case -EHOSTUNREACH:
 	case -ENETUNREACH:
-		CNETERR("Connection to %s at host %pIS was unreachable: the network or that node may be down, or Lustre may be misconfigured.\n",
+		CNETERR("Connection to %s at host %pISc was unreachable: the network or that node may be down, or Lustre may be misconfigured.\n",
 			libcfs_nidstr(peer_nid), sa);
 		break;
 	case -ETIMEDOUT:
-		CNETERR("Connection to %s at host %pISp took too long: that node may be hung or experiencing high load.\n",
+		CNETERR("Connection to %s at host %pIScp took too long: that node may be hung or experiencing high load.\n",
 			libcfs_nidstr(peer_nid), sa);
 		break;
 	case -ECONNRESET:
 		LCONSOLE_ERROR_MSG(0x11b,
-				   "Connection to %s at host %pISp was reset: is it running a compatible version of Lustre and is %s one of its NIDs?\n",
+				   "Connection to %s at host %pIScp was reset: is it running a compatible version of Lustre and is %s one of its NIDs?\n",
 				   libcfs_nidstr(peer_nid), sa,
 				   libcfs_nidstr(peer_nid));
 		break;
 	case -EPROTO:
 		LCONSOLE_ERROR_MSG(0x11c,
-				   "Protocol error connecting to %s at host %pISp: is it running a compatible version of Lustre?\n",
+				   "Protocol error connecting to %s at host %pIScp: is it running a compatible version of Lustre?\n",
 				   libcfs_nidstr(peer_nid), sa);
 		break;
 	case -EADDRINUSE:
 		LCONSOLE_ERROR_MSG(0x11d,
-				   "No privileged ports available to connect to %s at host %pISp\n",
+				   "No privileged ports available to connect to %s at host %pIScp\n",
 				   libcfs_nidstr(peer_nid), sa);
 		break;
 	default:
 		LCONSOLE_ERROR_MSG(0x11e,
-				   "Unexpected error %d connecting to %s at host %pISp\n",
+				   "Unexpected error %d connecting to %s at host %pIScp\n",
 				   rc, libcfs_nidstr(peer_nid), sa);
 		break;
 	}
@@ -237,7 +237,7 @@ lnet_accept(struct socket *sock, u32 magic)
 					     accept_timeout);
 
 			if (rc)
-				CERROR("Error sending magic+version in response to LNET magic from %pIS: %d\n",
+				CERROR("Error sending magic+version in response to LNET magic from %pISc: %d\n",
 				       &peer, rc);
 			return -EPROTO;
 		}
@@ -248,7 +248,7 @@ lnet_accept(struct socket *sock, u32 magic)
 			str = "unrecognised";
 
 		LCONSOLE_ERROR_MSG(0x11f,
-				   "Refusing connection from %pIS magic %08x: %s acceptor protocol\n",
+				   "Refusing connection from %pISc magic %08x: %s acceptor protocol\n",
 				   &peer, magic, str);
 		return -EPROTO;
 	}
@@ -258,7 +258,7 @@ lnet_accept(struct socket *sock, u32 magic)
 	rc = lnet_sock_read(sock, &cr.acr_version, sizeof(cr.acr_version),
 			    accept_timeout);
 	if (rc) {
-		CERROR("Error %d reading connection request version from %pIS\n",
+		CERROR("Error %d reading connection request version from %pISc\n",
 		       rc, &peer);
 		return -EIO;
 	}
@@ -281,7 +281,7 @@ lnet_accept(struct socket *sock, u32 magic)
 
 		rc = lnet_sock_write(sock, &cr, sizeof(cr), accept_timeout);
 		if (rc)
-			CERROR("Error sending magic+version in response to version %d from %pIS: %d\n",
+			CERROR("Error sending magic+version in response to version %d from %pISc: %d\n",
 			       peer_version, &peer, rc);
 		return -EPROTO;
 
@@ -311,7 +311,7 @@ lnet_accept(struct socket *sock, u32 magic)
 		break;
 	}
 	if (rc) {
-		CERROR("Error %d reading connection request from %pIS\n",
+		CERROR("Error %d reading connection request from %pISc\n",
 		       rc, &peer);
 		return -EIO;
 	}
@@ -323,7 +323,7 @@ lnet_accept(struct socket *sock, u32 magic)
 		if (ni)
 			lnet_ni_decref(ni);
 		LCONSOLE_ERROR_MSG(0x120,
-				   "Refusing connection from %pIS for %s: No matching NI\n",
+				   "Refusing connection from %pISc for %s: No matching NI\n",
 				   &peer, libcfs_nidstr(&nid));
 		return -EPERM;
 	}
@@ -332,12 +332,12 @@ lnet_accept(struct socket *sock, u32 magic)
 		/* This catches a request for the loopback LND */
 		lnet_ni_decref(ni);
 		LCONSOLE_ERROR_MSG(0x121,
-				   "Refusing connection from %pIS for %s: NI doesn not accept IP connections\n",
+				   "Refusing connection from %pISc for %s: NI doesn not accept IP connections\n",
 				   &peer, libcfs_nidstr(&nid));
 		return -EPERM;
 	}
 
-	CDEBUG(D_NET, "Accept %s from %pIS\n", libcfs_nidstr(&nid), &peer);
+	CDEBUG(D_NET, "Accept %s from %pISc\n", libcfs_nidstr(&nid), &peer);
 
 	rc = ni->ni_net->net_lnd->lnd_accept(ni, sock);
 
@@ -430,7 +430,7 @@ lnet_acceptor(void *arg)
 		if (secure &&
 		    rpc_get_port((struct sockaddr *)&peer) >
 		    LNET_ACCEPTOR_MAX_RESERVED_PORT) {
-			CERROR("Refusing connection from %pISp: insecure port\n",
+			CERROR("Refusing connection from %pIScp: insecure port\n",
 			       &peer);
 			goto failed;
 		}
@@ -438,7 +438,7 @@ lnet_acceptor(void *arg)
 		rc = lnet_sock_read(newsock, &magic, sizeof(magic),
 				    accept_timeout);
 		if (rc) {
-			CERROR("Error %d reading connection request from %pIS\n",
+			CERROR("Error %d reading connection request from %pISc\n",
 			       rc, &peer);
 			goto failed;
 		}
diff --git a/net/lnet/lnet/lib-socket.c b/net/lnet/lnet/lib-socket.c
index 01f375ed96a3..e6236c98261d 100644
--- a/net/lnet/lnet/lib-socket.c
+++ b/net/lnet/lnet/lib-socket.c
@@ -401,7 +401,7 @@ lnet_sock_connect(int interface, int local_port,
 	 * port...
 	 */
 	CDEBUG_LIMIT(rc == -EADDRNOTAVAIL ? D_NET : D_NETERROR,
-		     "Error %d connecting %d -> %pISp\n", rc,
+		     "Error %d connecting %d -> %pIScp\n", rc,
 		     local_port, peeraddr);
 
 	sock_release(sock);
-- 
2.27.0

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org


* [lustre-devel] [PATCH 14/20] lustre: llog: correct llog FID and path output
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (12 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 13/20] lnet: use %pISc for formatting IP addresses James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 15/20] lnet: o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE James Simmons
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Mikhail Pershin, Lustre Development List

From: Mikhail Pershin <mpershin@whamcloud.com>

- fix the wrong LLOG_ID-to-FID conversion when printing the llog FID by
  introducing the PLOGID macro to expand an llog ID for the DFID format
- stop printing lgl_ogen along with the llog FID, as it has always been
  zero since 2.3.51 and is no longer used
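
As a rough userspace illustration of the first point, the sketch below shows how a PLOGID-style macro can feed the sequence and object id of an llog ID into a DFID-style format string. All ex_/EX_ names are hypothetical stand-ins, the struct layout is a simplified assumption modelled on the fields named in the diff, and the format string is assumed to have three conversions. It is not the kernel code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the Lustre types (hypothetical ex_ names). */
struct ex_lu_fid {
	uint64_t f_seq;
	uint32_t f_oid;
	uint32_t f_ver;
};

struct ex_ost_id {
	union {
		struct {
			uint64_t oi_id;
			uint64_t oi_seq;
		} oi;
		struct ex_lu_fid oi_fid;	/* overlays oi_id/oi_seq */
	};
};

struct ex_llog_logid {
	struct ex_ost_id lgl_oi;
	uint32_t lgl_ogen;
};

/* Assumed DFID-like format with three conversions. */
#define EX_DFID "[0x%llx:0x%x:0x%x]"

/*
 * Expands to three comma-separated printf arguments: sequence, object id
 * and a constant 0 for the version. In this sketch it is deliberately not
 * wrapped in outer parentheses, because that would turn the expansion
 * into a single comma expression.
 */
#define EX_PLOGID(logid)					\
	(unsigned long long)(logid)->lgl_oi.oi.oi_seq,		\
	(uint32_t)(logid)->lgl_oi.oi.oi_id, 0

/* Format an llog ID as a FID-like string; returns the snprintf() result. */
int ex_format_logid(char *buf, size_t len, const struct ex_llog_logid *id)
{
	return snprintf(buf, len, EX_DFID, EX_PLOGID(id));
}
```

With oi_seq = 0x1 and oi_id = 0x2 the helper formats [0x1:0x2:0x0]. Reading the same bytes through the overlaid oi_fid member would instead pick up oi_id as the sequence, which is the kind of mismatch the first point describes, assuming the layout used in this sketch.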

Fixes: e813619f0a3b ("lustre: llog: update llog print format to use FIDs")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15646
Lustre-commit: e28f3ee185b2ef7ba ("LU-15646 llog: correct llog FID and path output")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48430
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/obdclass/llog.c               | 11 ++++----
 fs/lustre/obdclass/llog_cat.c           | 37 ++++++++-----------------
 fs/lustre/obdclass/llog_swab.c          |  2 +-
 include/uapi/linux/lustre/lustre_user.h |  1 +
 4 files changed, 19 insertions(+), 32 deletions(-)

diff --git a/fs/lustre/obdclass/llog.c b/fs/lustre/obdclass/llog.c
index 38904b6b5da5..eb8f7e5a8c48 100644
--- a/fs/lustre/obdclass/llog.c
+++ b/fs/lustre/obdclass/llog.c
@@ -422,7 +422,7 @@ static int llog_process_thread(void *arg)
 			if (rc) {
 				CERROR("%s: invalid record in llog "DFID" record for index %d/%d: rc = %d\n",
 				       loghandle2name(loghandle),
-                                       PFID(&loghandle->lgh_id.lgl_oi.oi_fid),
+				       PLOGID(&loghandle->lgh_id),
 				       rec->lrh_len, index, rc);
 				/*
 				 * the block seem to be corrupted, let's try
@@ -448,7 +448,7 @@ static int llog_process_thread(void *arg)
 				 */
 				CERROR("%s: "DFID" index %u, expected %u\n",
 				       loghandle2name(loghandle),
-				       PFID(&loghandle->lgh_id.lgl_oi.oi_fid),
+				       PLOGID(&loghandle->lgh_id),
 				       rec->lrh_index, index);
 				index = rec->lrh_index;
 			}
@@ -481,10 +481,9 @@ static int llog_process_thread(void *arg)
 	}
 
 out:
-	CDEBUG(D_HA, "stop processing %s " DOSTID ":%x index %d count %d\n",
+	CDEBUG(D_HA, "stop processing %s "DFID" index %d count %d\n",
 	       ((llh->llh_flags & LLOG_F_IS_CAT) ? "catalog" : "plain"),
-	       POSTID(&loghandle->lgh_id.lgl_oi), loghandle->lgh_id.lgl_ogen,
-	       index, llh->llh_count);
+	       PLOGID(&loghandle->lgh_id), index, llh->llh_count);
 
 	if (cd)
 		cd->lpcd_last_idx = last_called_index;
@@ -534,7 +533,7 @@ int llog_process_or_fork(const struct lu_env *env,
 
 	CDEBUG(D_OTHER,
 	       "Processing " DFID " flags 0x%03x startcat %d startidx %d first_idx %d last_idx %d read_mode %d\n",
-	       PFID(&loghandle->lgh_id.lgl_oi.oi_fid), flags,
+	       PLOGID(&loghandle->lgh_id), flags,
 	       (flags & LLOG_F_IS_CAT) && d ? d->lpd_startcat : -1,
 	       (flags & LLOG_F_IS_CAT) && d ? d->lpd_startidx : -1,
 	       cd ? cd->lpcd_first_idx : -1, cd ? cd->lpcd_last_idx : -1,
diff --git a/fs/lustre/obdclass/llog_cat.c b/fs/lustre/obdclass/llog_cat.c
index 753422be3185..95bfa65d25b1 100644
--- a/fs/lustre/obdclass/llog_cat.c
+++ b/fs/lustre/obdclass/llog_cat.c
@@ -77,18 +77,11 @@ static int llog_cat_id2handle(const struct lu_env *env,
 
 		if (ostid_id(&cgl->lgl_oi) == ostid_id(&logid->lgl_oi) &&
 		    ostid_seq(&cgl->lgl_oi) == ostid_seq(&logid->lgl_oi)) {
-			if (cgl->lgl_ogen != logid->lgl_ogen) {
-				CWARN("%s: log " DFID " generation %x != %x\n",
-				      loghandle2name(loghandle),
-				      PFID(&logid->lgl_oi.oi_fid),
-				      cgl->lgl_ogen, logid->lgl_ogen);
-				continue;
-			}
 			*res = llog_handle_get(loghandle);
 			if (!*res) {
 				CERROR("%s: log "DFID" refcount is zero!\n",
 				       loghandle2name(loghandle),
-				       PFID(&logid->lgl_oi.oi_fid));
+				       PLOGID(logid));
 				continue;
 			}
 			loghandle->u.phd.phd_cat_handle = cathandle;
@@ -101,9 +94,8 @@ static int llog_cat_id2handle(const struct lu_env *env,
 	rc = llog_open(env, cathandle->lgh_ctxt, &loghandle, logid, NULL,
 		       LLOG_OPEN_EXISTS);
 	if (rc < 0) {
-		CERROR("%s: error opening log id " DFID ":%x: rc = %d\n",
-		       loghandle2name(cathandle), PFID(&logid->lgl_oi.oi_fid),
-		       logid->lgl_ogen, rc);
+		CERROR("%s: error opening log id "DFID": rc = %d\n",
+		       loghandle2name(cathandle), PLOGID(logid), rc);
 		return rc;
 	}
 
@@ -153,25 +145,20 @@ static int llog_cat_process_common(const struct lu_env *env,
 
 	if (rec->lrh_type != le32_to_cpu(LLOG_LOGID_MAGIC)) {
 		rc = -EINVAL;
-		CWARN("%s: invalid record in catalog " DFID ":%x: rc = %d\n",
-		      loghandle2name(cat_llh),
-		      PFID(&cat_llh->lgh_id.lgl_oi.oi_fid),
-		      cat_llh->lgh_id.lgl_ogen, rc);
-
+		CWARN("%s: invalid record in catalog "DFID": rc = %d\n",
+		      loghandle2name(cat_llh), PLOGID(&cat_llh->lgh_id), rc);
 		return rc;
 	}
 	CDEBUG(D_HA,
-	       "processing log " DFID ":%x at index %u of catalog " DFID "\n",
-	       PFID(&lir->lid_id.lgl_oi.oi_fid), lir->lid_id.lgl_ogen,
-	       le32_to_cpu(rec->lrh_index),
-	       PFID(&cat_llh->lgh_id.lgl_oi.oi_fid));
+	       "processing log "DFID" at index %u of catalog "DFID"\n",
+	       PLOGID(&lir->lid_id), le32_to_cpu(rec->lrh_index),
+	       PLOGID(&cat_llh->lgh_id));
 
 	rc = llog_cat_id2handle(env, cat_llh, llhp, &lir->lid_id);
 	if (rc) {
-		CWARN("%s: can't find llog handle " DFID ":%x: rc = %d\n",
-		      loghandle2name(cat_llh),
-		      PFID(&lir->lid_id.lgl_oi.oi_fid),
-		      lir->lid_id.lgl_ogen, rc);
+		CWARN("%s: can't find llog handle "DFID": rc = %d\n",
+		      loghandle2name(cat_llh), PLOGID(&lir->lid_id),
+		      rc);
 
 		return rc;
 	}
@@ -238,7 +225,7 @@ static int llog_cat_process_or_fork(const struct lu_env *env,
 
 		CWARN("%s: catlog " DFID " crosses index zero\n",
 		      loghandle2name(cat_llh),
-		      PFID(&cat_llh->lgh_id.lgl_oi.oi_fid));
+		      PLOGID(&cat_llh->lgh_id));
 		/*startcat = 0 is default value for general processing */
 		if ((startcat != LLOG_CAT_FIRST &&
 		    startcat >= llh->llh_cat_idx) || !startcat) {
diff --git a/fs/lustre/obdclass/llog_swab.c b/fs/lustre/obdclass/llog_swab.c
index fcc2a48b50b0..5d2936429b9f 100644
--- a/fs/lustre/obdclass/llog_swab.c
+++ b/fs/lustre/obdclass/llog_swab.c
@@ -45,7 +45,7 @@ static void print_llogd_body(struct llogd_body *d)
 {
 	CDEBUG(D_OTHER, "llogd body: %p\n", d);
 	CDEBUG(D_OTHER, "\tlgd_logid.lgl_oi: " DFID "\n",
-	       PFID(&d->lgd_logid.lgl_oi.oi_fid));
+	       PLOGID(&d->lgd_logid));
 	CDEBUG(D_OTHER, "\tlgd_logid.lgl_ogen: %#x\n", d->lgd_logid.lgl_ogen);
 	CDEBUG(D_OTHER, "\tlgd_ctxt_idx: %#x\n", d->lgd_ctxt_idx);
 	CDEBUG(D_OTHER, "\tlgd_llh_flags: %#x\n", d->lgd_llh_flags);
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index 6577202517af..db18cd5f1fa8 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -987,6 +987,7 @@ static inline void obd_uuid2fsname(char *buf, char *uuid, int buflen)
  */
 #define SFID "0x%llx:0x%x:0x%x"
 #define RFID(fid) &((fid)->f_seq), &((fid)->f_oid), &((fid)->f_ver)
+#define PLOGID(logid) ((unsigned long long)(logid)->lgl_oi.oi.oi_seq, (__u32)(logid)->lgl_oi.oi.oi_id, 0)
 
 /********* Quotas **********/
 
-- 
2.27.0


* [lustre-devel] [PATCH 15/20] lnet: o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (13 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 14/20] lustre: llog: correct llog FID and path output James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 16/20] lnet: socklnd: remove remnants of tcp bonding James Simmons
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Serguei Smirnov, Lustre Development List

From: Serguei Smirnov <ssmirnov@whamcloud.com>

RDMA_CM_EVENT_UNREACHABLE may be received not only while a connection
is being established, but also while it is being closed. Fix the
handling of this event accordingly.
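
The intended behaviour can be sketched in plain C as follows. The enum, struct and function names are hypothetical stand-ins for the o2iblnd ones, and the refcount and status fields only model what kiblnd_connreq_done() and kiblnd_conn_decref() would do. Treat it as an illustration of the state check, not as the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical connection states mirroring the names used in the diff. */
enum ex_conn_state {
	EX_CONN_INIT,
	EX_CONN_ACTIVE_CONNECT,
	EX_CONN_PASSIVE_WAIT,
	EX_CONN_ESTABLISHED,
	EX_CONN_CLOSING,
	EX_CONN_DISCONNECTED,
};

struct ex_conn {
	enum ex_conn_state state;
	int refcount;	/* stands in for the connection reference count */
	int connreq_rc;	/* stands in for the result of the connect request */
};

/*
 * Handle an UNREACHABLE event. Returns true if the event failed a pending
 * connection attempt, false if it arrived in another state (for example
 * while closing) and was only noted.
 */
bool ex_on_unreachable(struct ex_conn *conn)
{
	/* The event is not expected for established or untouched conns. */
	assert(conn->state != EX_CONN_ESTABLISHED &&
	       conn->state != EX_CONN_INIT);

	if (conn->state == EX_CONN_ACTIVE_CONNECT ||
	    conn->state == EX_CONN_PASSIVE_WAIT) {
		conn->connreq_rc = -ENETDOWN;	/* fail the connect request */
		conn->refcount--;		/* drop the connect reference */
		return true;
	}
	return false;
}
```

The point of the design is that the connect-failure path runs only for the two connecting states, so an event that arrives during teardown no longer trips an assertion or drops a reference it does not own.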

WC-bug-id: https://jira.whamcloud.com/browse/LU-15885
Lustre-commit: 3925b1669d519e6c0 ("LU-15885 o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48492
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 30e77c0382f1..919b83d5c6e2 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3171,13 +3171,17 @@ kiblnd_cm_callback(struct rdma_cm_id *cmid, struct rdma_cm_event *event)
 		return event->status;	/* rc destroys cmid */
 
 	case RDMA_CM_EVENT_UNREACHABLE:
-		conn = (struct kib_conn *)cmid->context;
-		LASSERT(conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT ||
-			conn->ibc_state == IBLND_CONN_PASSIVE_WAIT);
-		CNETERR("%s: UNREACHABLE %d\n",
-			libcfs_nid2str(conn->ibc_peer->ibp_nid), event->status);
-		kiblnd_connreq_done(conn, -ENETDOWN);
-		kiblnd_conn_decref(conn);
+		CNETERR("%s: UNREACHABLE %d, ibc_state: %d\n",
+			libcfs_nid2str(conn->ibc_peer->ibp_nid),
+			event->status,
+			conn->ibc_state);
+		LASSERT(conn->ibc_state != IBLND_CONN_ESTABLISHED &&
+			conn->ibc_state != IBLND_CONN_INIT);
+		if (conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT ||
+		    conn->ibc_state == IBLND_CONN_PASSIVE_WAIT) {
+			kiblnd_connreq_done(conn, -ENETDOWN);
+			kiblnd_conn_decref(conn);
+		}
 		return 0;
 
 	case RDMA_CM_EVENT_CONNECT_ERROR:
-- 
2.27.0


* [lustre-devel] [PATCH 16/20] lnet: socklnd: remove remnants of tcp bonding
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (14 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 15/20] lnet: o2iblnd: fix handling of RDMA_CM_EVENT_UNREACHABLE James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 17/20] lnet: Router test interop check and aarch fix James Simmons
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

->ksnp_n_passive_ips is now always zero, so remove it and all uses of
it.  ->ksnp_passive_ips is gone too, as is ksocknal_ip2iface().

WC-bug-id: https://jira.whamcloud.com/browse/LU-13641
Lustre-commit: 3630e1eaf9db562a1 ("LU-13641 socklnd: remove remnants of tcp bonding")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48568
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c | 73 +++-----------------------------
 net/lnet/klnds/socklnd/socklnd.h |  4 --
 2 files changed, 6 insertions(+), 71 deletions(-)

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 8999580b67b4..9c8b75f0b2a2 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -46,20 +46,6 @@
 static struct lnet_lnd the_ksocklnd;
 struct ksock_nal_data ksocknal_data;
 
-static struct ksock_interface *
-ksocknal_ip2iface(struct lnet_ni *ni, struct sockaddr *addr)
-{
-	struct ksock_net *net = ni->ni_data;
-	struct ksock_interface *iface;
-
-	iface = &net->ksnn_interface;
-
-	if (rpc_cmp_addr((struct sockaddr *)&iface->ksni_addr, addr))
-		return iface;
-
-	return NULL;
-}
-
 static struct ksock_interface *
 ksocknal_index2iface(struct lnet_ni *ni, int index)
 {
@@ -281,28 +267,6 @@ ksocknal_find_peer(struct lnet_ni *ni, struct lnet_processid *id)
 static void
 ksocknal_unlink_peer_locked(struct ksock_peer_ni *peer_ni)
 {
-	int i;
-	struct ksock_interface *iface;
-
-	for (i = 0; i < peer_ni->ksnp_n_passive_ips; i++) {
-		struct sockaddr_in sa = { .sin_family = AF_INET };
-
-		LASSERT(i < LNET_INTERFACES_NUM);
-		sa.sin_addr.s_addr = htonl(peer_ni->ksnp_passive_ips[i]);
-
-		iface = ksocknal_ip2iface(peer_ni->ksnp_ni,
-					  (struct sockaddr *)&sa);
-		/*
-		 * All IPs in peer_ni->ksnp_passive_ips[] come from the
-		 * interface list, therefore the call must succeed.
-		 */
-		LASSERT(iface);
-
-		CDEBUG(D_NET, "peer_ni=%p iface=%p ksni_nroutes=%d\n",
-		       peer_ni, iface, iface->ksni_nroutes);
-		iface->ksni_npeers--;
-	}
-
 	LASSERT(list_empty(&peer_ni->ksnp_conns));
 	LASSERT(!peer_ni->ksnp_conn_cb);
 	LASSERT(!peer_ni->ksnp_closing);
@@ -320,7 +284,6 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index,
 	struct ksock_peer_ni *peer_ni;
 	struct ksock_conn_cb *conn_cb;
 	int i;
-	int j;
 	int rc = -ENOENT;
 
 	read_lock(&ksocknal_data.ksnd_global_lock);
@@ -328,12 +291,11 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index,
 	hash_for_each(ksocknal_data.ksnd_peers, i, peer_ni, ksnp_list) {
 		if (peer_ni->ksnp_ni != ni)
 			continue;
+		if (index-- > 0)
+			continue;
 
-		if (!peer_ni->ksnp_n_passive_ips &&
-		    !peer_ni->ksnp_conn_cb) {
-			if (index-- > 0)
-				continue;
-
+		conn_cb = peer_ni->ksnp_conn_cb;
+		if (!conn_cb) {
 			*id = peer_ni->ksnp_id;
 			*myip = 0;
 			*peer_ip = 0;
@@ -341,29 +303,7 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index,
 			*conn_count = 0;
 			*share_count = 0;
 			rc = 0;
-			goto out;
-		}
-
-		for (j = 0; j < peer_ni->ksnp_n_passive_ips; j++) {
-			if (index-- > 0)
-				continue;
-
-			*id = peer_ni->ksnp_id;
-			*myip = peer_ni->ksnp_passive_ips[j];
-			*peer_ip = 0;
-			*port = 0;
-			*conn_count = 0;
-			*share_count = 0;
-			rc = 0;
-			goto out;
-		}
-
-		if (peer_ni->ksnp_conn_cb) {
-			if (index-- > 0)
-				continue;
-
-			conn_cb = peer_ni->ksnp_conn_cb;
-
+		} else {
 			*id = peer_ni->ksnp_id;
 			if (conn_cb->ksnr_addr.ss_family == AF_INET) {
 				struct sockaddr_in *sa;
@@ -383,10 +323,9 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index,
 			}
 			*conn_count = conn_cb->ksnr_conn_count;
 			*share_count = 1;
-			goto out;
 		}
+		break;
 	}
-out:
 	read_unlock(&ksocknal_data.ksnd_global_lock);
 	return rc;
 }
diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h
index 93368bd4139f..dcb4b2952f8e 100644
--- a/net/lnet/klnds/socklnd/socklnd.h
+++ b/net/lnet/klnds/socklnd/socklnd.h
@@ -438,10 +438,6 @@ struct ksock_peer_ni {
 							 */
 	time64_t		ksnp_send_keepalive;	/* time to send keepalive */
 	struct lnet_ni	       *ksnp_ni;		/* which network */
-	int			ksnp_n_passive_ips;	/* # of... */
-
-	/* preferred local interfaces */
-	u32			ksnp_passive_ips[LNET_INTERFACES_NUM];
 };
 
 struct ksock_connreq {
-- 
2.27.0


* [lustre-devel] [PATCH 17/20] lnet: Router test interop check and aarch fix
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (15 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 16/20] lnet: socklnd: remove remnants of tcp bonding James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 18/20] lnet: o2iblnd: fix deadline for tx on peer queue James Simmons
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Chris Horn, Lustre Development List

From: Chris Horn <chris.horn@hpe.com>

Enabling routing may fail on nodes with a small amount of memory (like
the aarch config). Define a small number of router buffers to work around
this issue. Modify the functions which calculate the number of buffers
to allow small sizes to be specified via parameters.
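
A minimal sketch of the revised calculation, under the assumption that all three pools follow the same pattern. The function name and its parameters are hypothetical: the real code reads module parameters and the LNET_NRB_*_MIN and LNET_CPT_NUMBER constants rather than taking arguments, and the numbers mentioned below are illustrative only.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Compute the per-CPT buffer count for one router buffer pool.
 *   param           user-supplied module parameter (0 means "use default")
 *   default_nrbs    default total number of buffers
 *   recommended_min recommended minimum, now only used for a warning
 *   ncpts           number of CPU partitions, assumed to be >= 1
 */
int ex_nrb_calculate(int param, int default_nrbs, int recommended_min,
		     int ncpts)
{
	int nrbs = default_nrbs;

	if (param < 0)
		return -EINVAL;

	if (param > 0) {
		if (param < recommended_min)
			fprintf(stderr,
				"router_buffers=%d less than recommended minimum %d\n",
				param, recommended_min);
		nrbs = param;
	}

	nrbs /= ncpts;
	/* Clamp to 1 instead of the recommended minimum. */
	return nrbs > 1 ? nrbs : 1;
}
```

For example, ex_nrb_calculate(64, 2048, 512, 4) returns 16 and prints a warning, whereas clamping to the recommended minimum, as the old max() did, would have raised the result back to 512 and ignored the small value the administrator asked for.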

WC-bug-id: https://jira.whamcloud.com/browse/LU-15595
Lustre-commit: 1aba6b0d9b661d369 ("LU-15595 tests: Router test interop check and aarch fix")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48578
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/router.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index 5d1e5a05a9fb..ee4f1d84dc9f 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -1405,11 +1405,15 @@ lnet_nrb_tiny_calculate(void)
 		return -EINVAL;
 	}
 
-	if (tiny_router_buffers > 0)
+	if (tiny_router_buffers > 0) {
+		if (tiny_router_buffers < LNET_NRB_TINY_MIN)
+			CWARN("tiny_router_buffers=%d less than recommended minimum %d\n",
+			      tiny_router_buffers, LNET_NRB_TINY_MIN);
 		nrbs = tiny_router_buffers;
+	}
 
 	nrbs /= LNET_CPT_NUMBER;
-	return max(nrbs, LNET_NRB_TINY_MIN);
+	return max(nrbs, 1);
 }
 
 static int
@@ -1424,11 +1428,15 @@ lnet_nrb_small_calculate(void)
 		return -EINVAL;
 	}
 
-	if (small_router_buffers > 0)
+	if (small_router_buffers > 0) {
+		if (small_router_buffers < LNET_NRB_SMALL_MIN)
+			CWARN("small_router_buffers=%d less than recommended minimum %d\n",
+			      small_router_buffers, LNET_NRB_SMALL_MIN);
 		nrbs = small_router_buffers;
+	}
 
 	nrbs /= LNET_CPT_NUMBER;
-	return max(nrbs, LNET_NRB_SMALL_MIN);
+	return max(nrbs, 1);
 }
 
 static int
@@ -1443,11 +1451,15 @@ lnet_nrb_large_calculate(void)
 		return -EINVAL;
 	}
 
-	if (large_router_buffers > 0)
+	if (large_router_buffers > 0) {
+		if (large_router_buffers < LNET_NRB_LARGE_MIN)
+			CWARN("large_router_buffers=%d less than recommended minimum %d\n",
+			      large_router_buffers, LNET_NRB_LARGE_MIN);
 		nrbs = large_router_buffers;
+	}
 
 	nrbs /= LNET_CPT_NUMBER;
-	return max(nrbs, LNET_NRB_LARGE_MIN);
+	return max(nrbs, 1);
 }
 
 int
-- 
2.27.0


* [lustre-devel] [PATCH 18/20] lnet: o2iblnd: fix deadline for tx on peer queue
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (16 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 17/20] lnet: Router test interop check and aarch fix James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 19/20] lnet: o2iblnd: detect link state to set fatal error on ni James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 20/20] lnet: socklnd: limit retries on conns_per_peer mismatch James Simmons
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Serguei Smirnov, Lustre Development List

From: Serguei Smirnov <ssmirnov@whamcloud.com>

In o2iblnd, deadline is checked for txs on peer queue,
but not set prior to adding the tx to the queue. This
may cause the tx to be dropped unnecessarily with
"Timed out tx for ..." warning.

Fix it by setting the tx_deadline when adding tx to peer queue.

WC-bug-id: https://jira.whamcloud.com/browse/LU-16184
Lustre-commit: 4c89ee7d7b098c7f1 ("LU-16184 o2iblnd: fix deadline for tx on peer queue")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48640
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)
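
Note for readers: the commit message above says the deadline was checked but never set before queueing. A minimal user-space sketch of why that drops a tx early follows, assuming the queue scan is a plain comparison of the current time against the stored deadline. The demo_* names are hypothetical and only model struct kib_tx, tx_deadline and kiblnd_timeout().

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for struct kib_tx; only the deadline matters. */
struct demo_tx {
	int64_t deadline_ns;	/* absolute time after which the tx is stale */
};

/* Stamp the deadline at enqueue time, as the patch does for tx_deadline. */
static void demo_enqueue(struct demo_tx *tx, int64_t now_ns,
			 int64_t timeout_sec)
{
	tx->deadline_ns = now_ns + timeout_sec * DEMO_NSEC_PER_SEC;
}

/* Queue scan: a tx counts as timed out once the clock reaches its deadline. */
static bool demo_tx_timed_out(const struct demo_tx *tx, int64_t now_ns)
{
	return now_ns >= tx->deadline_ns;
}
```

A tx whose deadline was never set (left at zero, or holding a stale value from earlier use) is reported as timed out on the first scan, while a tx stamped through demo_enqueue() survives until the timeout has actually elapsed.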

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 919b83d5c6e2..6f040964121c 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -1422,6 +1422,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 	int rc;
 	int i;
 	struct lnet_ioctl_config_o2iblnd_tunables *tunables;
+	s64 timeout_ns;
 
 	/*
 	 * If I get here, I've committed to send, so I complete the tx with
@@ -1450,6 +1451,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 		return;
 	}
 
+	timeout_ns = kiblnd_timeout() * NSEC_PER_SEC;
 	read_unlock(g_lock);
 	/* Re-try with a write lock */
 	write_lock(g_lock);
@@ -1459,9 +1461,12 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 		if (list_empty(&peer_ni->ibp_conns)) {
 			/* found a peer_ni, but it's still connecting... */
 			LASSERT(kiblnd_peer_connecting(peer_ni));
-			if (tx)
+			if (tx) {
+				tx->tx_deadline = ktime_add_ns(ktime_get(),
+							       timeout_ns);
 				list_add_tail(&tx->tx_list,
 					      &peer_ni->ibp_tx_queue);
+			}
 			write_unlock_irqrestore(g_lock, flags);
 		} else {
 			conn = kiblnd_get_conn_locked(peer_ni);
@@ -1498,9 +1503,12 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 		if (list_empty(&peer2->ibp_conns)) {
 			/* found a peer_ni, but it's still connecting... */
 			LASSERT(kiblnd_peer_connecting(peer2));
-			if (tx)
+			if (tx) {
+				tx->tx_deadline = ktime_add_ns(ktime_get(),
+							       timeout_ns);
 				list_add_tail(&tx->tx_list,
 					      &peer2->ibp_tx_queue);
+			}
 			write_unlock_irqrestore(g_lock, flags);
 		} else {
 			conn = kiblnd_get_conn_locked(peer2);
@@ -1525,8 +1533,10 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 	/* always called with a ref on ni, which prevents ni being shutdown */
 	LASSERT(!((struct kib_net *)ni->ni_data)->ibn_shutdown);
 
-	if (tx)
+	if (tx) {
+		tx->tx_deadline = ktime_add_ns(ktime_get(), timeout_ns);
 		list_add_tail(&tx->tx_list, &peer_ni->ibp_tx_queue);
+	}
 
 	kiblnd_peer_addref(peer_ni);
 	hash_add(kiblnd_data.kib_peers, &peer_ni->ibp_list, nid);
-- 
2.27.0


* [lustre-devel] [PATCH 19/20] lnet: o2iblnd: detect link state to set fatal error on ni
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (17 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 18/20] lnet: o2iblnd: fix deadline for tx on peer queue James Simmons
@ 2022-10-14 21:38 ` James Simmons
  2022-10-14 21:38 ` [lustre-devel] [PATCH 20/20] lnet: socklnd: limit retries on conns_per_peer mismatch James Simmons
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Serguei Smirnov, Lustre Development List

From: Serguei Smirnov <ssmirnov@whamcloud.com>

To avoid selecting lnet ni which corresponds to a downed link
for sending, add a mechanism for detecting ip-layer link events
in o2iblnd. On ip link up/down events, find corresponding
ni and toggle ni_fatal_error_on flag. This complements the
existing mechanism for ib-layer link event handling.

WC-bug-id: https://jira.whamcloud.com/browse/LU-16051
Lustre-commit: 30d73908087d5b2f0 ("LU-16051 o2iblnd: detect link state to set fatal error on ni")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48644
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/o2iblnd/o2iblnd.c | 219 ++++++++++++++++++++++++++-----
 1 file changed, 186 insertions(+), 33 deletions(-)
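
Note for readers: the decision made per NI in kiblnd_handle_link_state_change() in the diff below can be summarised as a small pure function. This is a sketch under stated assumptions, not the kernel code: struct demo_link_event and demo_fatal_error_on() are hypothetical, and carrier_ok stands in for a non-zero return from kiblnd_get_link_status(dev).

```c
#include <stdbool.h>

/* Hypothetical snapshot of what the handler inspects for one NI. */
struct demo_link_event {
	bool has_ipv4;		/* interface has IPv4 configuration at all */
	bool ip_matches;	/* the NI's configured IP is on the interface */
	bool oper_up;		/* operstate == IF_OPER_UP */
	bool carrier_ok;	/* models kiblnd_get_link_status(dev) != 0 */
};

/* Returns the value that would be stored in ni_fatal_error_on. */
static int demo_fatal_error_on(const struct demo_link_event *ev)
{
	/* No IPv4 state, or the NI's address is gone: mark the NI fatal. */
	if (!ev->has_ipv4 || !ev->ip_matches)
		return 1;

	/* Link reported down: mark the NI fatal. */
	if (!ev->oper_up)
		return 1;

	/* Link reported up: clear the flag only if the link status agrees. */
	return ev->carrier_ok ? 0 : 1;
}
```

The flag is cleared only when all four conditions hold, so an "up" event on an interface that has lost the NI's IP address still leaves the NI marked as failed.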

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c
index c713528a7e7c..d5ca1a3dd25c 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd.c
@@ -2527,6 +2527,184 @@ void kiblnd_destroy_dev(struct kib_dev *dev)
 	kfree(dev);
 }
 
+static struct kib_dev *
+kiblnd_dev_search(char *ifname)
+{
+	struct kib_dev *alias = NULL;
+	struct kib_dev *dev;
+	char *colon;
+	char *colon2;
+
+	colon = strchr(ifname, ':');
+	list_for_each_entry(dev, &kiblnd_data.kib_devs, ibd_list) {
+		if (strcmp(&dev->ibd_ifname[0], ifname) == 0)
+			return dev;
+
+		if (alias)
+			continue;
+
+		colon2 = strchr(dev->ibd_ifname, ':');
+		if (colon)
+			*colon = 0;
+		if (colon2)
+			*colon2 = 0;
+
+		if (strcmp(&dev->ibd_ifname[0], ifname) == 0)
+			alias = dev;
+
+		if (colon)
+			*colon = ':';
+		if (colon2)
+			*colon2 = ':';
+	}
+	return alias;
+}
+
+static int
+kiblnd_handle_link_state_change(struct net_device *dev,
+				unsigned char operstate)
+{
+	struct lnet_ni *ni = NULL;
+	struct kib_dev *event_kibdev;
+	struct kib_net *net;
+	struct kib_net *cnxt;
+	bool link_down = !(operstate == IF_OPER_UP);
+	struct in_device *in_dev;
+	bool found_ip = false;
+	const struct in_ifaddr *ifa;
+
+	event_kibdev = kiblnd_dev_search(dev->name);
+
+	if (!event_kibdev)
+		goto out;
+
+	list_for_each_entry_safe(net, cnxt, &event_kibdev->ibd_nets, ibn_list) {
+		found_ip = false;
+
+		ni = net->ibn_ni;
+
+		in_dev = __in_dev_get_rtnl(dev);
+		if (!in_dev) {
+			CDEBUG(D_NET, "Interface %s has no IPv4 status.\n",
+			       dev->name);
+			CDEBUG(D_NET, "%s: set link fatal state to 1\n",
+			       libcfs_nidstr(&net->ibn_ni->ni_nid));
+			atomic_set(&ni->ni_fatal_error_on, 1);
+			continue;
+		}
+		in_dev_for_each_ifa_rtnl(ifa, in_dev) {
+			if (htonl(event_kibdev->ibd_ifip) == ifa->ifa_local)
+				found_ip = true;
+		}
+
+		if (!found_ip) {
+			CDEBUG(D_NET, "Interface %s has no matching ip\n",
+			       dev->name);
+			CDEBUG(D_NET, "%s: set link fatal state to 1\n",
+			       libcfs_nidstr(&net->ibn_ni->ni_nid));
+			atomic_set(&ni->ni_fatal_error_on, 1);
+			continue;
+		}
+
+		if (link_down) {
+			CDEBUG(D_NET, "%s: set link fatal state to 1\n",
+			       libcfs_nidstr(&net->ibn_ni->ni_nid));
+			atomic_set(&ni->ni_fatal_error_on, link_down);
+		} else {
+			CDEBUG(D_NET, "%s: set link fatal state to %u\n",
+			       libcfs_nidstr(&net->ibn_ni->ni_nid),
+			       (kiblnd_get_link_status(dev) == 0));
+			atomic_set(&ni->ni_fatal_error_on,
+				   (kiblnd_get_link_status(dev) == 0));
+		}
+	}
+out:
+	return 0;
+}
+
+static int
+kiblnd_handle_inetaddr_change(struct in_ifaddr *ifa, unsigned long event)
+{
+	struct kib_dev *event_kibdev;
+	struct kib_net *net;
+	struct kib_net *cnxt;
+	struct net_device *event_netdev = ifa->ifa_dev->dev;
+
+	event_kibdev = kiblnd_dev_search(event_netdev->name);
+
+	if (!event_kibdev)
+		goto out;
+
+	if (htonl(event_kibdev->ibd_ifip) != ifa->ifa_local)
+		goto out;
+
+	list_for_each_entry_safe(net, cnxt, &event_kibdev->ibd_nets,
+				 ibn_list) {
+		CDEBUG(D_NET, "%s: set link fatal state to %u\n",
+		       libcfs_nidstr(&net->ibn_ni->ni_nid),
+		       (event == NETDEV_DOWN));
+		atomic_set(&net->ibn_ni->ni_fatal_error_on,
+			   (event == NETDEV_DOWN));
+	}
+out:
+	return 0;
+}
+
+/************************************
+ * Net device notifier event handler
+ ************************************/
+static int kiblnd_device_event(struct notifier_block *unused,
+			       unsigned long event, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	unsigned char operstate;
+
+	operstate = dev->operstate;
+
+	CDEBUG(D_NET, "devevent: status=%ld, iface=%s ifindex %d state %u\n",
+	       event, dev->name, dev->ifindex, operstate);
+
+	switch (event) {
+	case NETDEV_UP:
+	case NETDEV_DOWN:
+	case NETDEV_CHANGE:
+		kiblnd_handle_link_state_change(dev, operstate);
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+/************************************
+ * Inetaddr notifier event handler
+ ************************************/
+static int kiblnd_inetaddr_event(struct notifier_block *unused,
+				 unsigned long event, void *ptr)
+{
+	struct in_ifaddr *ifa = ptr;
+
+	CDEBUG(D_NET, "addrevent: status %ld ip addr %pI4, netmask %pI4.\n",
+	       event, &ifa->ifa_address, &ifa->ifa_mask);
+
+	switch (event) {
+	case NETDEV_UP:
+	case NETDEV_DOWN:
+	case NETDEV_CHANGE:
+		kiblnd_handle_inetaddr_change(ifa, event);
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block kiblnd_dev_notifier_block = {
+	.notifier_call = kiblnd_device_event,
+};
+
+static struct notifier_block kiblnd_inetaddr_notifier_block = {
+	.notifier_call = kiblnd_inetaddr_event,
+};
+
 static void kiblnd_base_shutdown(void)
 {
 	struct kib_sched_info *sched;
@@ -2535,6 +2713,11 @@ static void kiblnd_base_shutdown(void)
 
 	LASSERT(list_empty(&kiblnd_data.kib_devs));
 
+	if (kiblnd_data.kib_init == IBLND_INIT_ALL) {
+		unregister_netdevice_notifier(&kiblnd_dev_notifier_block);
+		unregister_inetaddr_notifier(&kiblnd_inetaddr_notifier_block);
+	}
+
 	switch (kiblnd_data.kib_init) {
 	default:
 		LBUG();
@@ -2723,6 +2906,9 @@ static int kiblnd_base_startup(struct net *ns)
 		goto failed;
 	}
 
+	register_netdevice_notifier(&kiblnd_dev_notifier_block);
+	register_inetaddr_notifier(&kiblnd_inetaddr_notifier_block);
+
 	/* flag everything initialised */
 	kiblnd_data.kib_init = IBLND_INIT_ALL;
 	/*****************************************************/
@@ -2799,39 +2985,6 @@ static int kiblnd_dev_start_threads(struct kib_dev *dev, bool newdev, u32 *cpts,
 	return 0;
 }
 
-static struct kib_dev *
-kiblnd_dev_search(char *ifname)
-{
-	struct kib_dev *alias = NULL;
-	struct kib_dev *dev;
-	char            *colon;
-	char            *colon2;
-
-	colon = strchr(ifname, ':');
-	list_for_each_entry(dev, &kiblnd_data.kib_devs, ibd_list) {
-		if (strcmp(&dev->ibd_ifname[0], ifname) == 0)
-			return dev;
-
-		if (alias)
-			continue;
-
-		colon2 = strchr(dev->ibd_ifname, ':');
-		if (colon)
-			*colon = 0;
-		if (colon2)
-			*colon2 = 0;
-
-		if (strcmp(&dev->ibd_ifname[0], ifname) == 0)
-			alias = dev;
-
-		if (colon)
-			*colon = ':';
-		if (colon2)
-			*colon2 = ':';
-	}
-	return alias;
-}
-
 static int kiblnd_startup(struct lnet_ni *ni)
 {
 	char *ifname = NULL;
-- 
2.27.0


* [lustre-devel] [PATCH 20/20] lnet: socklnd: limit retries on conns_per_peer mismatch
  2022-10-14 21:37 [lustre-devel] [PATCH 00/20] lustre: backport OpenSFS work as of Oct 14, 2022 James Simmons
                   ` (18 preceding siblings ...)
  2022-10-14 21:38 ` [lustre-devel] [PATCH 19/20] lnet: o2iblnd: detect link state to set fatal error on ni James Simmons
@ 2022-10-14 21:38 ` James Simmons
  19 siblings, 0 replies; 21+ messages in thread
From: James Simmons @ 2022-10-14 21:38 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Serguei Smirnov, Lustre Development List

From: Serguei Smirnov <ssmirnov@whamcloud.com>

If connection initiator has a higher conns-per-peer setting than
its peer, don't try to create extra connections forever as the
peer will keep rejecting them. A few retries should suffice to
resolve a valid race.

Fixes: 511ace4a ("lnet: socklnd: add conns_per_peer parameter")
WC-bug-id: https://jira.whamcloud.com/browse/LU-16191
Lustre-commit: da893c6c9707ca3b2 ("LU-16191 socklnd: limit retries on conns_per_peer mismatch")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48664
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c    |  1 +
 net/lnet/klnds/socklnd/socklnd.h    |  4 ++++
 net/lnet/klnds/socklnd/socklnd_cb.c | 25 +++++++++++++++++++------
 3 files changed, 24 insertions(+), 6 deletions(-)
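
Note for readers: the retry limit described in the commit message above can be sketched as a counter of consecutive EALREADY results. This is a simplified model: struct demo_conn_cb and demo_should_retry() are hypothetical, and the sketch leaves out the connection-type check and the ksnr_connected update that the real hunk performs when it gives up.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_MAX_BUSY_RETRIES 3	/* mirrors SOCKNAL_MAX_BUSY_RETRIES */

/* Hypothetical stand-in for the fields of struct ksock_conn_cb used here. */
struct demo_conn_cb {
	unsigned int conn_count;	/* connections already established */
	unsigned int busy_retry_count;	/* consecutive EALREADY results */
};

/* Returns true if the connection attempt should be re-queued. */
static bool demo_should_retry(struct demo_conn_cb *cb, int rc)
{
	/* Count only EALREADY seen while at least one conn already exists. */
	if (rc == EALREADY && cb->conn_count > 0)
		cb->busy_retry_count++;
	else
		cb->busy_retry_count = 0;

	/* Assume the peer allows fewer connections than we asked for. */
	if (cb->busy_retry_count >= DEMO_MAX_BUSY_RETRIES)
		return false;

	/* Any other positive rc still means "lost the race, retry later". */
	return rc != 0;
}
```

With one connection already up, the first two EALREADY results are retried and the third stops the loop. Any other result resets the counter, so a genuine connection race is still resolved by retrying.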

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 9c8b75f0b2a2..00e33c88dfaa 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -144,6 +144,7 @@ ksocknal_create_conn_cb(struct sockaddr *addr)
 	conn_cb->ksnr_blki_conn_count = 0;
 	conn_cb->ksnr_blko_conn_count = 0;
 	conn_cb->ksnr_max_conns = 0;
+	conn_cb->ksnr_busy_retry_count = 0;
 
 	return conn_cb;
 }
diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h
index dcb4b2952f8e..bb68a3df596a 100644
--- a/net/lnet/klnds/socklnd/socklnd.h
+++ b/net/lnet/klnds/socklnd/socklnd.h
@@ -379,6 +379,7 @@ struct ksock_conn {
 };
 
 #define SOCKNAL_CONN_COUNT_MAX_BITS	8	/* max conn count bits */
+#define SOCKNAL_MAX_BUSY_RETRIES	3
 
 struct ksock_conn_cb {
 	struct list_head	ksnr_connd_list;	/* chain on ksnr_connd_routes */
@@ -407,6 +408,9 @@ struct ksock_conn_cb {
 	unsigned int		ksnr_max_conns;		/* conns_per_peer at
 							 * peer creation
 							 */
+	unsigned int		ksnr_busy_retry_count;	/* counts retry attempts
+							 * due to EALREADY rc
+							 */
 };
 
 #define SOCKNAL_KEEPALIVE_PING	1	/* cookie for keepalive ping */
diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c
index b2da535fbfbe..f358875a2afe 100644
--- a/net/lnet/klnds/socklnd/socklnd_cb.c
+++ b/net/lnet/klnds/socklnd/socklnd_cb.c
@@ -1785,7 +1785,7 @@ ksocknal_connect(struct ksock_conn_cb *conn_cb)
 {
 	LIST_HEAD(zombies);
 	struct ksock_peer_ni *peer_ni = conn_cb->ksnr_peer;
-	int type;
+	int type = SOCKLND_CONN_NONE;
 	int wanted;
 	struct socket *sock;
 	time64_t deadline;
@@ -1863,14 +1863,18 @@ ksocknal_connect(struct ksock_conn_cb *conn_cb)
 			goto failed;
 		}
 
-		/*
-		 * A +ve RC means I have to retry because I lost the connection
+		if (rc == EALREADY && conn_cb->ksnr_conn_count > 0)
+			conn_cb->ksnr_busy_retry_count += 1;
+		else
+			conn_cb->ksnr_busy_retry_count = 0;
+
+		/* A +ve RC means I have to retry because I lost the connection
 		 * race or I have to renegotiate protocol version
 		 */
-		retry_later = (rc);
+		retry_later = (rc != 0);
 		if (retry_later)
-			CDEBUG(D_NET, "peer_ni %s: conn race, retry later.\n",
-			       libcfs_nidstr(&peer_ni->ksnp_id.nid));
+			CDEBUG(D_NET, "peer_ni %s: conn race, retry later. rc %d\n",
+			       libcfs_nidstr(&peer_ni->ksnp_id.nid), rc);
 
 		write_lock_bh(&ksocknal_data.ksnd_global_lock);
 	}
@@ -1878,6 +1882,15 @@ ksocknal_connect(struct ksock_conn_cb *conn_cb)
 	conn_cb->ksnr_scheduled = 0;
 	conn_cb->ksnr_connecting = 0;
 
+	if (conn_cb->ksnr_busy_retry_count >= SOCKNAL_MAX_BUSY_RETRIES &&
+	    type > SOCKLND_CONN_NONE) {
+		/* After so many retries due to EALREADY assume that
+		 * the peer doesn't support as many connections as we want
+		 */
+		conn_cb->ksnr_connected |= BIT(type);
+		retry_later = false;
+	}
+
 	if (retry_later) {
 		/*
 		 * re-queue for attention; this frees me up to handle
-- 
2.27.0

