lustre-devel-lustre.org archive mirror
* [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021
@ 2021-09-22  2:19 James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 01/24] lnet: Lock primary NID logic James Simmons
                   ` (23 more replies)
  0 siblings, 24 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

Merge the latest OpenSFS work into the native Linux client. The
biggest change is native support of the Lustre utilities with the
Linux client. I do see new test failures due to the debugfs
root-only issues, which will be addressed later.

Alex Zhuravlev (1):
  lustre: obdclass: EAGAIN after rhashtable_walk_next()

Alexander Boyko (1):
  lustre: llite: don't touch vma after filemap_fault

Amir Shehata (1):
  lnet: Lock primary NID logic

Chris Horn (1):
  lnet: Check for -ESHUTDOWN in lnet_parse

Hongchao Zhang (2):
  lustre: quota: enforce block quota for chgrp
  lustre: llite: check read only mount for setquota

James Simmons (2):
  lustre: uapi: fixup UAPI headers for native Linux client.
  lustre: ptlrpc: separate out server code for wiretest

Mr NeilBrown (10):
  lnet: introduce struct lnet_nid
  lnet: add string formating/parsing for IPv6 nids
  lnet: change lpni_nid in lnet_peer_ni to lnet_nid
  lnet: change lp_primary_nid to struct lnet_nid
  lnet: change lp_disc_*_nid to struct lnet_nid
  lnet: socklnd: factor out key calculation for ksnd_peers
  lnet: introduce lnet_processid for ksock_peer_ni
  lnet: enhance connect/accept to support large addr
  lnet: change lr_nid to struct lnet_nid
  lnet: extend rspt_next_hop_nid in lnet_rsp_tracker

Oleg Drokin (1):
  lustre: llite: Remove inode locking in ll_fsync

Patrick Farrell (1):
  lustre: llite: Always do lookup on ENOENT in open

Qian Yingjin (1):
  lustre: pcc: VM_WRITE should not trigger layout write

Sebastien Buisson (1):
  lustre: sec: filename encryption

Serguei Smirnov (1):
  lnet: socklnd: fix link state detection

Vitaly Fertman (1):
  lustre: ptlrpc: two replay lock threads

 fs/lustre/include/cl_object.h           |   5 -
 fs/lustre/include/lustre_swab.h         |   1 -
 fs/lustre/include/obd.h                 |   4 +
 fs/lustre/include/obd_support.h         |   1 +
 fs/lustre/ldlm/ldlm_request.c           |  10 +-
 fs/lustre/llite/crypto.c                | 144 +++++++++
 fs/lustre/llite/dcache.c                |   8 +
 fs/lustre/llite/dir.c                   |  52 +++-
 fs/lustre/llite/file.c                  |  56 ++--
 fs/lustre/llite/llite_internal.h        |  29 +-
 fs/lustre/llite/llite_lib.c             |  63 +++-
 fs/lustre/llite/llite_mmap.c            |  31 +-
 fs/lustre/llite/namei.c                 |  47 ++-
 fs/lustre/llite/statahead.c             |  48 +++
 fs/lustre/llite/vvp_dev.c               |   6 +
 fs/lustre/llite/vvp_io.c                |   3 +-
 fs/lustre/lov/lov_io.c                  |   6 +-
 fs/lustre/mdc/mdc_lib.c                 |   6 +-
 fs/lustre/obdclass/jobid.c              |   5 +
 fs/lustre/obdclass/llog_swab.c          |  33 ---
 fs/lustre/obdclass/obd_config.c         |   4 +-
 fs/lustre/ptlrpc/layout.c               |   5 +-
 fs/lustre/ptlrpc/pack_generic.c         |  13 +-
 fs/lustre/ptlrpc/wiretest.c             | 210 ++++++-------
 include/linux/lnet/lib-lnet.h           |  39 ++-
 include/linux/lnet/lib-types.h          |  18 +-
 include/uapi/linux/lnet/lnet-idl.h      |  39 ++-
 include/uapi/linux/lnet/lnet-types.h    | 106 ++++++-
 include/uapi/linux/lnet/nidstr.h        |  12 +-
 include/uapi/linux/lustre/lustre_idl.h  |  78 +----
 include/uapi/linux/lustre/lustre_user.h |  20 ++
 net/lnet/klnds/o2iblnd/o2iblnd.c        |  13 +-
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c     |  17 +-
 net/lnet/klnds/socklnd/socklnd.c        | 353 +++++++++++++++-------
 net/lnet/klnds/socklnd/socklnd.h        |  14 +-
 net/lnet/klnds/socklnd/socklnd_cb.c     | 119 ++++----
 net/lnet/klnds/socklnd/socklnd_proto.c  |  14 +-
 net/lnet/lnet/acceptor.c                | 112 ++++---
 net/lnet/lnet/api-ni.c                  | 178 ++++++++---
 net/lnet/lnet/config.c                  |  20 +-
 net/lnet/lnet/lib-move.c                | 155 +++++-----
 net/lnet/lnet/lib-msg.c                 |  13 +-
 net/lnet/lnet/lib-socket.c              |  32 +-
 net/lnet/lnet/lo.c                      |   3 +-
 net/lnet/lnet/net_fault.c               |   4 +-
 net/lnet/lnet/nidstrings.c              | 163 ++++++++++-
 net/lnet/lnet/peer.c                    | 503 +++++++++++++++++++-------------
 net/lnet/lnet/router.c                  |  88 +++---
 net/lnet/lnet/router_proc.c             |  12 +-
 net/lnet/lnet/udsp.c                    |  39 +--
 50 files changed, 1997 insertions(+), 957 deletions(-)

-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org


* [lustre-devel] [PATCH 01/24] lnet: Lock primary NID logic
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 02/24] lustre: quota: enforce block quota for chgrp James Simmons
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Amir Shehata, Lustre Development List

From: Amir Shehata <ashehata@whamcloud.com>

If a peer is created by Lustre, make sure to lock that peer's
primary NID. This peer can be discovered in the background.
There is no need to block until discovery is complete, as Lustre
can continue on with the primary NID it provided.

Discovery will populate the peer with the other interfaces the peer
has but will not change the peer's primary NID. It can also delete
NIDs of the peer that Lustre told it about (but not the primary NID).
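
The locking rule described above can be sketched in plain userspace C.
This is only an illustration, not the kernel code: the struct, the
helper names and the flag value are hypothetical stand-ins, and only
the behaviour (a locked primary NID is neither replaced nor deleted,
and deletion fails with -EPERM) is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag value; the real LNET_PEER_LOCK_PRIMARY bit lives
 * in the LNet headers and may differ.
 */
#define PEER_LOCK_PRIMARY	(1u << 0)

/* Hypothetical, heavily reduced stand-in for struct lnet_peer */
struct peer {
	uint32_t state;
	uint64_t primary_nid;
};

/* Discovery proposes a new primary NID; it is only accepted while the
 * primary NID has not been locked by the upper layer.
 */
static void peer_set_primary(struct peer *lp, uint64_t nid)
{
	assert(lp);
	if (!(lp->state & PEER_LOCK_PRIMARY))
		lp->primary_nid = nid;
}

/* Deleting a NID is refused when it is the locked primary NID.
 * Returns 0 if the deletion may proceed, -EPERM otherwise.
 */
static int peer_del_nid(const struct peer *lp, uint64_t nid)
{
	assert(lp);
	if ((lp->state & PEER_LOCK_PRIMARY) && lp->primary_nid == nid)
		return -EPERM;
	return 0;
}
```

Other NIDs of the same peer can still be added or removed by
discovery; only the primary NID is pinned.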

WC-bug-id: https://jira.whamcloud.com/browse/LU-14668
Lustre-commit: 024f9303bc6f32a31 ("LU-14668 lnet: Lock primary NID logic")
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43563
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/peer.c | 69 ++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 53 insertions(+), 16 deletions(-)

diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index c2f5d8b..720af99 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -530,6 +530,16 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 			goto out;
 		}
 	}
+
+	/* If we're asked to lock down the primary NID we shouldn't be
+	 * deleting it
+	 */
+	if (lp->lp_state & LNET_PEER_LOCK_PRIMARY &&
+	    primary_nid == nid) {
+		rc = -EPERM;
+		goto out;
+	}
+
 	lpni = lnet_find_peer_ni_locked(nid);
 	if (!lpni) {
 		rc = -ENOENT;
@@ -1388,13 +1398,18 @@ struct lnet_peer_ni *
 	 * down then this discovery can introduce long delays into the mount
 	 * process, so skip it if it isn't necessary.
 	 */
-	while (!lnet_peer_discovery_disabled && !lnet_peer_is_uptodate(lp)) {
+	if (!lnet_peer_discovery_disabled && !lnet_peer_is_uptodate(lp)) {
 		spin_lock(&lp->lp_lock);
 		/* force a full discovery cycle */
-		lp->lp_state |= LNET_PEER_FORCE_PING | LNET_PEER_FORCE_PUSH;
+		lp->lp_state |= LNET_PEER_FORCE_PING | LNET_PEER_FORCE_PUSH |
+				LNET_PEER_LOCK_PRIMARY;
 		spin_unlock(&lp->lp_lock);
 
-		rc = lnet_discover_peer_locked(lpni, cpt, true);
+		/* start discovery in the background. Messages to that
+		 * peer will not go through until the discovery is
+		 * complete
+		 */
+		rc = lnet_discover_peer_locked(lpni, cpt, false);
 		if (rc)
 			goto out_decref;
 		/* The lpni (or lp) for this NID may have changed and our ref is
@@ -1408,14 +1423,6 @@ struct lnet_peer_ni *
 			goto out_unlock;
 		}
 		lp = lpni->lpni_peer_net->lpn_peer;
-
-		/* If we find that the peer has discovery disabled then we will
-		 * not modify whatever primary NID is currently set for this
-		 * peer. Thus, we can break out of this loop even if the peer
-		 * is not fully up to date.
-		 */
-		if (lnet_is_discovery_disabled(lp))
-			break;
 	}
 	primary_nid = lp->lp_primary_nid;
 out_decref:
@@ -1522,6 +1529,8 @@ struct lnet_peer_net *
 			lnet_peer_clr_non_mr_pref_nids(lp);
 		}
 	}
+	if (flags & LNET_PEER_LOCK_PRIMARY)
+		lp->lp_state |= LNET_PEER_LOCK_PRIMARY;
 	spin_unlock(&lp->lp_lock);
 
 	lp->lp_nnis++;
@@ -1676,9 +1685,27 @@ struct lnet_peer_net *
 		}
 		/* If this is the primary NID, destroy the peer. */
 		if (lnet_peer_ni_is_primary(lpni)) {
-			struct lnet_peer *rtr_lp =
+			struct lnet_peer *lp2 =
 				lpni->lpni_peer_net->lpn_peer;
-			int rtr_refcount = rtr_lp->lp_rtr_refcount;
+			int rtr_refcount = lp2->lp_rtr_refcount;
+
+			/* If the new peer that this NID belongs to is
+			 * a primary NID for another peer which we're
+			 * suppose to preserve the Primary for then we
+			 * don't want to mess with it. But the
+			 * configuration is wrong at this point, so we
+			 * should flag both of these peers as in a bad
+			 * state
+			 */
+			if (lp2->lp_state & LNET_PEER_LOCK_PRIMARY) {
+				spin_lock(&lp->lp_lock);
+				lp->lp_state |= LNET_PEER_BAD_CONFIG;
+				spin_unlock(&lp->lp_lock);
+				spin_lock(&lp2->lp_lock);
+				lp2->lp_state |= LNET_PEER_BAD_CONFIG;
+				spin_unlock(&lp2->lp_lock);
+				goto out_free_lpni;
+			}
 
 			/* if we're trying to delete a router it means
 			 * we're moving this peer NI to a new peer so must
@@ -1686,9 +1713,9 @@ struct lnet_peer_net *
 			 */
 			if (rtr_refcount > 0) {
 				flags |= LNET_PEER_RTR_NI_FORCE_DEL;
-				lnet_rtr_transfer_to_peer(rtr_lp, lp);
+				lnet_rtr_transfer_to_peer(lp2, lp);
 			}
-			lnet_peer_del(lpni->lpni_peer_net->lpn_peer);
+			lnet_peer_del(lp2);
 			lnet_peer_ni_decref_locked(lpni);
 			lpni = lnet_peer_ni_alloc(nid);
 			if (!lpni) {
@@ -1746,7 +1773,8 @@ struct lnet_peer_net *
 	if (lp->lp_primary_nid == nid)
 		goto out;
 
-	lp->lp_primary_nid = nid;
+	if (!(lp->lp_state & LNET_PEER_LOCK_PRIMARY))
+		lp->lp_primary_nid = nid;
 
 	rc = lnet_peer_add_nid(lp, nid, flags);
 	if (rc) {
@@ -1754,8 +1782,17 @@ struct lnet_peer_net *
 		goto out;
 	}
 out:
+	/* if this is a configured peer or the primary for that peer has
+	 * been locked, then we don't want to flag this scenario as
+	 * a failure
+	 */
+	if (lp->lp_state & LNET_PEER_CONFIGURED ||
+	    lp->lp_state & LNET_PEER_LOCK_PRIMARY)
+		return 0;
+
 	CDEBUG(D_NET, "peer %s NID %s: %d\n",
 	       libcfs_nid2str(old), libcfs_nid2str(nid), rc);
+
 	return rc;
 }
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 02/24] lustre: quota: enforce block quota for chgrp
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 01/24] lnet: Lock primary NID logic James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 03/24] lnet: introduce struct lnet_nid James Simmons
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Hongchao Zhang, Lustre Development List

From: Hongchao Zhang <hongchao@whamcloud.com>

Patch https://review.whamcloud.com/30146 "LU-5152 quota: enforce
block quota for chgrp" introduced problems because the MDS sends
synchronous requests to the OSS to change the quota assignment of
files during chgrp operations. In some cases the OSTs are themselves
out of grant and may send a quota request to the MDS, which may
result in a deadlock. Another issue is the slow performance caused
by the synchronous operation between the MDT and the OSTs.

This patch drops the synchronous RPC requirement of the original
patch #30146 to avoid these problems.

Previously, problems in quota tracking related to chgrp were introduced
by the synchronous RPCs that the MDS sends to the OSS when changing the
group ownership of objects.

Fixes: ("LU-5152 quota: enforce block quota for chgrp")

WC-bug-id: https://jira.whamcloud.com/browse/LU-11303
Lustre-commit: 83f5544d8518ad12 ("LU-11303 quota: enforce block quota for chgrp")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33996
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/llite_lib.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index f540caf..cc50503 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -1714,6 +1714,16 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data)
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
+	/* If this is a chgrp of a regular file, we want to reserve enough
+	 * quota to cover the entire file size.
+	 */
+	if (S_ISREG(inode->i_mode) && op_data->op_attr.ia_valid & ATTR_GID &&
+	    from_kgid(&init_user_ns, op_data->op_attr.ia_gid) !=
+	    from_kgid(&init_user_ns, inode->i_gid)) {
+		op_data->op_xvalid |= OP_XVALID_BLOCKS;
+		op_data->op_attr_blocks = inode->i_blocks;
+	}
+
 	rc = md_setattr(sbi->ll_md_exp, op_data, NULL, 0, &request);
 	if (rc) {
 		ptlrpc_req_finished(request);
-- 
1.8.3.1


* [lustre-devel] [PATCH 03/24] lnet: introduce struct lnet_nid
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 01/24] lnet: Lock primary NID logic James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 02/24] lustre: quota: enforce block quota for chgrp James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 04/24] lnet: add string formating/parsing for IPv6 nids James Simmons
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

LNet nids are currently limited to 4 bytes for addresses.
This excludes the use of IPv6.

In order to support IPv6, introduce 'struct lnet_nid' which can hold
up to a 128-bit address and is extensible, and deprecate 'lnet_nid_t'.
lnet_nid_t will eventually be removed.  Where lnet_nid_t is often
passed around by value, 'struct lnet_nid' will normally be passed
around by reference as it is over twice as large.

The net_type field, which currently has values up to 16, is now limited
to 0-254 with 255 being used as a wildcard.  The most significant byte
is now a size field which gives the size of the whole nid minus 8.  So
zero is correct for current nids with 4-byte addresses.
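
To make the size arithmetic concrete, here is a small userspace mirror
of the layout. The struct and helper names are made up for this
sketch (the kernel uses 'struct lnet_nid', NID_BYTES and
NID_ADDR_BYTES), and the 4..16 byte range check is an assumption
based on the 128-bit limit stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace mirror of the extended nid layout, for illustration only */
struct nid_sketch {
	uint8_t  size;		/* total bytes - 8 */
	uint8_t  type;		/* 0-254 network type, 0xff is the wildcard */
	uint16_t num;		/* network number, big-endian */
	uint32_t addr[4];	/* address words, big-endian */
} __attribute__((packed));

/* Bytes occupied by the whole nid: 8 for a 4-byte address */
static inline unsigned int nid_bytes(const struct nid_sketch *n)
{
	return n->size + 8u;
}

/* Bytes occupied by the address part alone */
static inline unsigned int nid_addr_bytes(const struct nid_sketch *n)
{
	return n->size + 4u;
}

/* A NULL pointer or type 0xff means "any" */
static inline int nid_is_any(const struct nid_sketch *n)
{
	return !n || n->type == 0xff;
}

/* Size field for a given address length (hypothetical helper) */
static inline uint8_t nid_size_for_addr(unsigned int addr_bytes)
{
	assert(addr_bytes >= 4 && addr_bytes <= 16);
	return (uint8_t)(addr_bytes - 4);
}
```

With this layout a 4-byte IPv4 address gives a size field of 0 and an
8-byte nid, while a 16-byte IPv6 address gives a size field of 12 and
fills the whole 20-byte struct.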

Where we still need to use 4-byte-address nids, we will use names
containing "nid4".  So "nid4" is a lnet_nid_t when "nid" is a struct
lnet_nid.  lnet_nid_to_nid4 converts a 'struct lnet_nid' to an
lnet_nid_t.

While lnet_nid_t is stored and often transmitted in host-endian format
(and possibly byte-swapped on receipt), 'struct lnet_nid' is always
stored in network-byte-order (i.e.  big-endian).  This is the more common
approach for network addresses.
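
The conversion between the two forms can be sketched in userspace as
follows. The names are hypothetical, the bit layout of the 64-bit
value (network in the upper 32 bits, address in the lower 32 bits) is
assumed from LNET_MKNID/LNET_NIDNET/LNET_NIDADDR, and the wildcard
handling done by the real lnet_nid4_to_nid() and lnet_nid_to_nid4()
is left out.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the extended nid; all multi-byte fields are
 * kept in network byte order regardless of the host.
 */
struct nid_be {
	uint8_t  size;
	uint8_t  type;
	uint16_t num;
	uint32_t addr[4];
} __attribute__((packed));

/* Host-endian 64-bit nid -> big-endian extended nid */
static void nid4_to_nid_sketch(uint64_t nid4, struct nid_be *nid)
{
	uint32_t net = (uint32_t)(nid4 >> 32);

	assert(nid);
	memset(nid, 0, sizeof(*nid));	/* size 0: 4-byte address */
	nid->type = (net >> 16) & 0xff;
	nid->num = htons((uint16_t)(net & 0xffff));
	nid->addr[0] = htonl((uint32_t)nid4);
}

/* Big-endian extended nid -> host-endian 64-bit nid */
static uint64_t nid_to_nid4_sketch(const struct nid_be *nid)
{
	uint32_t net;

	assert(nid);
	net = ((uint32_t)nid->type << 16) | ntohs(nid->num);
	return ((uint64_t)net << 32) | ntohl(nid->addr[0]);
}
```

Because the struct is always big-endian, its first 8 bytes have the
same byte sequence on every host, so it can be compared or copied to
the wire without a byte swap on receipt.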

In this first instance, 'struct lnet_nid' is used for ni_nid in
'struct lnet_ni', and related support functions.

In particular libcfs_nidstr() is introduced which parallels
libcfs_nid2str(), but takes 'struct lnet_nid'.

In cases where we need to have similar functions for old and new style
nids, the new function is introduced with a slightly different name,
such as libcfs_nidstr above, or LNET_NID_NET (like LNET_NIDNET).
It will be confusing having both, but the plan is to remove the old
names as soon as practical.

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 82a17076f880770a ("LU-10391 lnet: introduce struct lnet_nid")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/42100
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h        |  4 +-
 include/linux/lnet/lib-types.h       |  2 +-
 include/uapi/linux/lnet/lnet-idl.h   | 28 ++++++++++-
 include/uapi/linux/lnet/lnet-types.h | 66 ++++++++++++++++++++++++-
 include/uapi/linux/lnet/nidstr.h     | 10 +++-
 net/lnet/klnds/o2iblnd/o2iblnd.c     | 13 +++--
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c  | 17 ++++---
 net/lnet/klnds/socklnd/socklnd.c     | 10 ++--
 net/lnet/klnds/socklnd/socklnd_cb.c  | 10 ++--
 net/lnet/lnet/acceptor.c             |  3 +-
 net/lnet/lnet/api-ni.c               | 60 +++++++++++++---------
 net/lnet/lnet/config.c               |  5 +-
 net/lnet/lnet/lib-move.c             | 96 +++++++++++++++++++++---------------
 net/lnet/lnet/lib-msg.c              | 11 +++--
 net/lnet/lnet/lib-socket.c           | 32 +++++++++---
 net/lnet/lnet/lo.c                   |  3 +-
 net/lnet/lnet/net_fault.c            |  2 +-
 net/lnet/lnet/nidstrings.c           | 40 +++++++++++++++
 net/lnet/lnet/router.c               |  8 +--
 net/lnet/lnet/router_proc.c          |  2 +-
 net/lnet/lnet/udsp.c                 | 13 ++---
 21 files changed, 312 insertions(+), 123 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index 37489ae..acc069d 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -110,7 +110,7 @@
 
 	if (ni->ni_status && ni->ni_status->ns_status != status) {
 		CDEBUG(D_NET, "ni %s status changed from %#x to %#x\n",
-		       libcfs_nid2str(ni->ni_nid),
+		       libcfs_nidstr(&ni->ni_nid),
 		       ni->ni_status->ns_status, status);
 		ni->ni_status->ns_status = status;
 		update = true;
@@ -123,7 +123,7 @@
 lnet_ni_get_status_locked(struct lnet_ni *ni)
 __must_hold(&ni->ni_lock)
 {
-	if (ni->ni_nid == LNET_NID_LO_0)
+	if (nid_is_lo0(&ni->ni_nid))
 		return LNET_NI_STATUS_UP;
 	else if (atomic_read(&ni->ni_fatal_error_on))
 		return LNET_NI_STATUS_DOWN;
diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index 85b0d54..80cf4f3 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -463,7 +463,7 @@ struct lnet_ni {
 	u32			*ni_cpts;
 
 	/* interface's NID */
-	lnet_nid_t		ni_nid;
+	struct lnet_nid		ni_nid;
 
 	/* instance-specific data */
 	void			*ni_data;
diff --git a/include/uapi/linux/lnet/lnet-idl.h b/include/uapi/linux/lnet/lnet-idl.h
index 1e27f3b..3fc0df1 100644
--- a/include/uapi/linux/lnet/lnet-idl.h
+++ b/include/uapi/linux/lnet/lnet-idl.h
@@ -40,18 +40,42 @@
  * These are sent in sender's byte order (i.e. receiver flips).
  */
 
-/**
- * Address of an end-point in an LNet network.
+/** Address of an end-point in an LNet network.
  *
  * A node can have multiple end-points and hence multiple addresses.
  * An LNet network can be a simple network (e.g. tcp0) or a network of
  * LNet networks connected by LNet routers. Therefore an end-point address
  * has two parts: network ID, and address within a network.
+ * The most-significant-byte in this format is always 0.  A larger value
+ * would imply a larger nid with a larger address.
  *
  * \see LNET_NIDNET, LNET_NIDADDR, and LNET_MKNID.
  */
 typedef __u64 lnet_nid_t;
 
+/*
+ * Address of LNet end-point in extended form
+ *
+ * To support addresses larger than 32bits we have
+ * an extended nid which supports up to 128 bits
+ * of address and is extensible.
+ * If nid_size is 0, then the nid can be stored in an lnet_nid_t,
+ * and the first 8 bytes of the 'struct lnet_nid' are identical to
+ * the lnet_nid_t in big-endian format.
+ * If nid_type == 0xff, then all other fields should be ignored
+ * and this is an ANY wildcard address.  In particular, the nid_size
+ * can be 0xff without making the address too big to fit.
+ */
+struct lnet_nid {
+	__u8	nid_size;	/* total bytes - 8 */
+	__u8	nid_type;
+	__be16	nid_num;
+	__be32	nid_addr[4];
+} __attribute__((packed));
+
+#define NID_BYTES(nid)		((nid)->nid_size + 8)
+#define NID_ADDR_BYTES(nid)	((nid)->nid_size + 4)
+
 /**
  * ID of a process in a node. Shortened as PID to distinguish from
  * lnet_process_id, the global process ID.
diff --git a/include/uapi/linux/lnet/lnet-types.h b/include/uapi/linux/lnet/lnet-types.h
index 0c426ac..ba8a079 100644
--- a/include/uapi/linux/lnet/lnet-types.h
+++ b/include/uapi/linux/lnet/lnet-types.h
@@ -37,6 +37,9 @@
 #include <linux/types.h>
 #include <linux/lnet/lnet-idl.h>
 
+#include <linux/string.h>
+#include <asm/byteorder.h>
+
 /** \addtogroup lnet
  * @{
  */
@@ -57,6 +60,15 @@
 /** wildcard PID that matches any lnet_pid_t */
 #define LNET_PID_ANY	((lnet_pid_t)(-1))
 
+static inline int LNET_NID_IS_ANY(const struct lnet_nid *nid)
+{
+	/* A NULL pointer can be used to mean "ANY" */
+	return !nid || nid->nid_type == 0xFF;
+}
+
+#define LNET_ANY_NID ((struct lnet_nid)			\
+		      {0xFF, 0xFF, ~0, {~0, ~0, ~0, ~0} })
+
 #define LNET_PID_RESERVED 0xf0000000 /* reserved bits in PID */
 #define LNET_PID_USERFLAG 0x80000000 /* set in userspace peers */
 #define LNET_PID_LUSTRE	  12345
@@ -86,7 +98,7 @@ static inline __u32 LNET_NETNUM(__u32 net)
 
 static inline __u32 LNET_NETTYP(__u32 net)
 {
-	return (net >> 16) & 0xffff;
+	return (net >> 16) & 0xff;
 }
 
 static inline __u32 LNET_MKNET(__u32 type, __u32 num)
@@ -99,6 +111,58 @@ static inline __u32 LNET_MKNET(__u32 type, __u32 num)
 
 #define LNET_NET_ANY LNET_NIDNET(LNET_NID_ANY)
 
+static inline int nid_is_nid4(const struct lnet_nid *nid)
+{
+	return NID_ADDR_BYTES(nid) == 4;
+}
+
+/* LOLND may not be defined yet, so we cannot use an inline */
+#define nid_is_lo0(__nid)						\
+	((__nid)->nid_type == LOLND &&					\
+	 nid_is_nid4(__nid) &&						\
+	 (__nid)->nid_num == 0 &&					\
+	 (__nid)->nid_addr[0] == 0)
+
+static inline __u32 LNET_NID_NET(const struct lnet_nid *nid)
+{
+	return LNET_MKNET(nid->nid_type, __be16_to_cpu(nid->nid_num));
+}
+
+static inline void lnet_nid4_to_nid(lnet_nid_t nid4, struct lnet_nid *nid)
+{
+	if (nid4 == LNET_NID_ANY) {
+		/* equal to setting to LNET_ANY_NID */
+		memset(nid, 0xff, sizeof(*nid));
+		return;
+	}
+
+	nid->nid_size = 0;
+	nid->nid_type = LNET_NETTYP(LNET_NIDNET(nid4));
+	nid->nid_num = __cpu_to_be16(LNET_NETNUM(LNET_NIDNET(nid4)));
+	nid->nid_addr[0] = __cpu_to_be32(LNET_NIDADDR(nid4));
+	nid->nid_addr[1] = nid->nid_addr[2] = nid->nid_addr[3] = 0;
+}
+
+static inline lnet_nid_t lnet_nid_to_nid4(const struct lnet_nid *nid)
+{
+	if (LNET_NID_IS_ANY(nid))
+		return LNET_NID_ANY;
+
+	return LNET_MKNID(LNET_NID_NET(nid), __be32_to_cpu(nid->nid_addr[0]));
+}
+
+static inline int nid_same(const struct lnet_nid *n1,
+			    const struct lnet_nid *n2)
+{
+	return n1->nid_size == n2->nid_size &&
+		n1->nid_type == n2->nid_type &&
+		n1->nid_num == n2->nid_num &&
+		n1->nid_addr[0] == n2->nid_addr[0] &&
+		n1->nid_addr[1] == n2->nid_addr[1] &&
+		n1->nid_addr[2] == n2->nid_addr[2] &&
+		n1->nid_addr[3] == n2->nid_addr[3];
+}
+
 struct lnet_counters_health {
 	__u32	lch_rst_alloc;
 	__u32	lch_resend_count;
diff --git a/include/uapi/linux/lnet/nidstr.h b/include/uapi/linux/lnet/nidstr.h
index caf28e2..d5b9d69 100644
--- a/include/uapi/linux/lnet/nidstr.h
+++ b/include/uapi/linux/lnet/nidstr.h
@@ -62,7 +62,7 @@ enum {
 struct list_head;
 
 #define LNET_NIDSTR_COUNT	1024	/* # of nidstrings */
-#define LNET_NIDSTR_SIZE	32	/* size of each one (see below for usage) */
+#define LNET_NIDSTR_SIZE	64	/* size of each one (see below for usage) */
 
 /* support decl needed by both kernel and user space */
 char *libcfs_next_nidstring(void);
@@ -90,6 +90,14 @@ static inline char *libcfs_nid2str(lnet_nid_t nid)
 				LNET_NIDSTR_SIZE);
 }
 
+char *libcfs_nidstr_r(const struct lnet_nid *nid,
+		      char *buf, __kernel_size_t buf_size);
+static inline char *libcfs_nidstr(const struct lnet_nid *nid)
+{
+	return libcfs_nidstr_r(nid, libcfs_next_nidstring(),
+			       LNET_NIDSTR_SIZE);
+}
+
 __u32 libcfs_str2net(const char *str);
 lnet_nid_t libcfs_str2nid(const char *str);
 int libcfs_str2anynid(lnet_nid_t *nid, const char *str);
diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c
index a4949d8..fd807c2 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd.c
@@ -178,8 +178,7 @@ void kiblnd_pack_msg(struct lnet_ni *ni, struct kib_msg *msg, int version,
 {
 	struct kib_net *net = ni->ni_data;
 
-	/*
-	 * CAVEAT EMPTOR! all message fields not set here should have been
+	/* CAVEAT EMPTOR! all message fields not set here should have been
 	 * initialised previously.
 	 */
 	msg->ibm_magic = IBLND_MSG_MAGIC;
@@ -188,7 +187,7 @@ void kiblnd_pack_msg(struct lnet_ni *ni, struct kib_msg *msg, int version,
 	msg->ibm_credits = credits;
 	/*   ibm_nob */
 	msg->ibm_cksum = 0;
-	msg->ibm_srcnid = ni->ni_nid;
+	msg->ibm_srcnid = lnet_nid_to_nid4(&ni->ni_nid);
 	msg->ibm_srcstamp = net->ibn_incarnation;
 	msg->ibm_dstnid = dstnid;
 	msg->ibm_dststamp = dststamp;
@@ -397,7 +396,7 @@ struct kib_peer_ni *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid)
 		 * created.
 		 */
 		if (peer_ni->ibp_nid != nid ||
-		    peer_ni->ibp_ni->ni_nid != ni->ni_nid)
+		    !nid_same(&peer_ni->ibp_ni->ni_nid, &ni->ni_nid))
 			continue;
 
 		CDEBUG(D_NET, "got peer_ni [%p] -> %s (%d) version: %x\n",
@@ -2201,7 +2200,7 @@ static int kiblnd_port_get_attr(struct kib_hca_dev *hdev)
 	list_for_each_entry(net, &hdev->ibh_dev->ibd_nets, ibn_list) {
 		if (val)
 			CDEBUG(D_NETERROR, "Fatal device error for NI %s\n",
-			       libcfs_nid2str(net->ibn_ni->ni_nid));
+			       libcfs_nidstr(&net->ibn_ni->ni_nid));
 		atomic_set(&net->ibn_ni->ni_fatal_error_on, val);
 	}
 }
@@ -2591,7 +2590,7 @@ static void kiblnd_shutdown(struct lnet_ni *ni)
 		wait_var_event_warning(&net->ibn_npeers,
 				       atomic_read(&net->ibn_npeers) == 0,
 				       "%s: waiting for %d peers to disconnect\n",
-				       libcfs_nid2str(ni->ni_nid),
+				       libcfs_nidstr(&ni->ni_nid),
 				       atomic_read(&net->ibn_npeers));
 
 		kiblnd_net_fini_pools(net);
@@ -2906,7 +2905,7 @@ static int kiblnd_startup(struct lnet_ni *ni)
 	}
 
 	net->ibn_dev = ibdev;
-	ni->ni_nid = LNET_MKNID(LNET_NIDNET(ni->ni_nid), ibdev->ibd_ifip);
+	ni->ni_nid.nid_addr[0] = cpu_to_be32(ibdev->ibd_ifip);
 
 	ni->ni_dev_cpt = ifaces[i].li_cpt;
 
diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 8ccd2ab..380374e 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -507,7 +507,7 @@ static int kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type,
 	}
 
 	if (msg->ibm_srcnid != conn->ibc_peer->ibp_nid ||
-	    msg->ibm_dstnid != ni->ni_nid ||
+	    msg->ibm_dstnid != lnet_nid_to_nid4(&ni->ni_nid) ||
 	    msg->ibm_srcstamp != conn->ibc_incarnation ||
 	    msg->ibm_dststamp != net->ibn_incarnation) {
 		CERROR("Stale rx from %s\n",
@@ -2369,11 +2369,12 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 	}
 
 	if (!ni ||				/* no matching net */
-	    ni->ni_nid != reqmsg->ibm_dstnid ||	/* right NET, wrong NID! */
+	    lnet_nid_to_nid4(&ni->ni_nid) !=
+	    reqmsg->ibm_dstnid ||		/* right NET, wrong NID! */
 	    net->ibn_dev != ibdev) {		/* wrong device */
 		CERROR("Can't accept conn from %s on %s (%s:%d:%pI4h): bad dst nid %s\n",
 		       libcfs_nid2str(nid),
-		       !ni ? "NA" : libcfs_nid2str(ni->ni_nid),
+		       ni ? libcfs_nidstr(&ni->ni_nid) : "NA",
 		       ibdev->ibd_ifname, ibdev->ibd_nnets,
 		       &ibdev->ibd_ifip,
 		       libcfs_nid2str(reqmsg->ibm_dstnid));
@@ -2490,8 +2491,8 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 		 * the lower NID connection win so we can move forward.
 		 */
 		if (peer2->ibp_connecting &&
-		    nid < ni->ni_nid && peer2->ibp_races <
-		    MAX_CONN_RACES_BEFORE_ABORT) {
+		    nid < lnet_nid_to_nid4(&ni->ni_nid) &&
+		    peer2->ibp_races < MAX_CONN_RACES_BEFORE_ABORT) {
 			peer2->ibp_races++;
 			write_unlock_irqrestore(g_lock, flags);
 
@@ -2924,7 +2925,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 	}
 
 	read_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
-	if (msg->ibm_dstnid == ni->ni_nid &&
+	if (msg->ibm_dstnid == lnet_nid_to_nid4(&ni->ni_nid) &&
 	    msg->ibm_dststamp == net->ibn_incarnation)
 		rc = 0;
 	else
@@ -3568,13 +3569,13 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 	case IB_EVENT_PORT_ERR:
 	case IB_EVENT_DEVICE_FATAL:
 		CERROR("Fatal device error for NI %s\n",
-		       libcfs_nid2str(conn->ibc_peer->ibp_ni->ni_nid));
+		       libcfs_nidstr(&conn->ibc_peer->ibp_ni->ni_nid));
 		atomic_set(&conn->ibc_peer->ibp_ni->ni_fatal_error_on, 1);
 		return;
 
 	case IB_EVENT_PORT_ACTIVE:
 		CERROR("Port reactivated for NI %s\n",
-		       libcfs_nid2str(conn->ibc_peer->ibp_ni->ni_nid));
+		       libcfs_nidstr(&conn->ibc_peer->ibp_ni->ni_nid));
 		atomic_set(&conn->ibc_peer->ibp_ni->ni_fatal_error_on, 0);
 		return;
 
diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 96cb0e0..21569fb 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -949,7 +949,7 @@ struct ksock_peer_ni *
 		 * Am I already connecting to this guy?  Resolve in
 		 * favour of higher NID...
 		 */
-		if (peerid.nid < ni->ni_nid &&
+		if (peerid.nid < lnet_nid_to_nid4(&ni->ni_nid) &&
 		    ksocknal_connecting(peer_ni->ksnp_conn_cb,
 					((struct sockaddr *)&conn->ksnc_peeraddr))) {
 			rc = EALREADY;
@@ -1820,12 +1820,13 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 
 	case IOC_LIBCFS_REGISTER_MYNID:
 		/* Ignore if this is a noop */
-		if (data->ioc_nid == ni->ni_nid)
+		if (nid_is_nid4(&ni->ni_nid) &&
+		    data->ioc_nid == lnet_nid_to_nid4(&ni->ni_nid))
 			return 0;
 
 		CERROR("obsolete IOC_LIBCFS_REGISTER_MYNID: %s(%s)\n",
 		       libcfs_nid2str(data->ioc_nid),
-		       libcfs_nid2str(ni->ni_nid));
+		       libcfs_nidstr(&ni->ni_nid));
 		return -EINVAL;
 
 	case IOC_LIBCFS_PUSH_CONNECTION:
@@ -2369,8 +2370,7 @@ static int ksocknal_device_event(struct notifier_block *unused,
 
 	LASSERT(ksi);
 	LASSERT(ksi->ksni_addr.ss_family == AF_INET);
-	ni->ni_nid = LNET_MKNID(LNET_NIDNET(ni->ni_nid),
-				ntohl(((struct sockaddr_in *)&ksi->ksni_addr)->sin_addr.s_addr));
+	ni->ni_nid.nid_addr[0] = ((struct sockaddr_in *)&ksi->ksni_addr)->sin_addr.s_addr;
 	list_add(&net->ksnn_list, &ksocknal_data.ksnd_nets);
 	net->ksnn_ni = ni;
 	ksocknal_data.ksnd_nnets++;
diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c
index efec479..e6cd976 100644
--- a/net/lnet/klnds/socklnd/socklnd_cb.c
+++ b/net/lnet/klnds/socklnd/socklnd_cb.c
@@ -1579,7 +1579,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 	/* rely on caller to hold a ref on socket so it wouldn't disappear */
 	LASSERT(conn->ksnc_proto);
 
-	hello->kshm_src_nid = ni->ni_nid;
+	hello->kshm_src_nid = lnet_nid_to_nid4(&ni->ni_nid);
 	hello->kshm_dst_nid = peer_nid;
 	hello->kshm_src_pid = the_lnet.ln_pid;
 
@@ -1628,7 +1628,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 	LASSERT(!active == !(conn->ksnc_type != SOCKLND_CONN_NONE));
 
 	timeout = active ? ksocknal_timeout() :
-			    lnet_acceptor_timeout();
+		lnet_acceptor_timeout();
 
 	rc = lnet_sock_read(sock, &hello->kshm_magic,
 			    sizeof(hello->kshm_magic), timeout);
@@ -1672,7 +1672,9 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 				conn->ksnc_proto = &ksocknal_protocol_v1x;
 #endif
 			hello->kshm_nips = 0;
-			ksocknal_send_hello(ni, conn, ni->ni_nid, hello);
+			ksocknal_send_hello(ni, conn,
+					    lnet_nid_to_nid4(&ni->ni_nid),
+					    hello);
 		}
 
 		CERROR("Unknown protocol version (%d.x expected) from %pIS\n",
@@ -1709,7 +1711,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		recv_id.pid = rpc_get_port((struct sockaddr *)
 					   &conn->ksnc_peeraddr) |
 					   LNET_PID_USERFLAG;
-		recv_id.nid = LNET_MKNID(LNET_NIDNET(ni->ni_nid),
+		recv_id.nid = LNET_MKNID(LNET_NID_NET(&ni->ni_nid),
 					 ntohl(((struct sockaddr_in *)
 					 &conn->ksnc_peeraddr)->sin_addr.s_addr));
 	} else {
diff --git a/net/lnet/lnet/acceptor.c b/net/lnet/lnet/acceptor.c
index 3708b89..243c34f 100644
--- a/net/lnet/lnet/acceptor.c
+++ b/net/lnet/lnet/acceptor.c
@@ -284,7 +284,8 @@ struct socket *
 
 	ni = lnet_nid2ni_addref(cr.acr_nid);
 	if (!ni ||			/* no matching net */
-	    ni->ni_nid != cr.acr_nid) { /* right NET, wrong NID! */
+	    lnet_nid_to_nid4(&ni->ni_nid) != cr.acr_nid) {
+		/* right NET, wrong NID! */
 		if (ni)
 			lnet_ni_decref(ni);
 		LCONSOLE_ERROR_MSG(0x120,
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 41d2d26..9471edb 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -667,6 +667,17 @@ static void lnet_assert_wire_constants(void)
 	BUILD_BUG_ON((int)sizeof(lnet_nid_t) != 8);
 	BUILD_BUG_ON((int)sizeof(lnet_pid_t) != 4);
 
+	/* Checks for struct lnet_nid */
+	BUILD_BUG_ON((int)sizeof(struct lnet_nid) != 20);
+	BUILD_BUG_ON((int)offsetof(struct lnet_nid, nid_size) != 0);
+	BUILD_BUG_ON((int)sizeof(((struct lnet_nid *)0)->nid_size) != 1);
+	BUILD_BUG_ON((int)offsetof(struct lnet_nid, nid_type) != 1);
+	BUILD_BUG_ON((int)sizeof(((struct lnet_nid *)0)->nid_type) != 1);
+	BUILD_BUG_ON((int)offsetof(struct lnet_nid, nid_num) != 2);
+	BUILD_BUG_ON((int)sizeof(((struct lnet_nid *)0)->nid_num) != 2);
+	BUILD_BUG_ON((int)offsetof(struct lnet_nid, nid_addr) != 4);
+	BUILD_BUG_ON((int)sizeof(((struct lnet_nid *)0)->nid_addr) != 16);
+
 	/* Checks for struct lnet_process_id_packed */
 	BUILD_BUG_ON((int)sizeof(struct lnet_process_id_packed) != 12);
 	BUILD_BUG_ON((int)offsetof(struct lnet_process_id_packed, nid) != 0);
@@ -1518,16 +1529,18 @@ struct lnet_net *
 }
 
 struct lnet_ni *
-lnet_nid2ni_locked(lnet_nid_t nid, int cpt)
+lnet_nid2ni_locked(lnet_nid_t nid4, int cpt)
 {
 	struct lnet_net *net;
 	struct lnet_ni *ni;
+	struct lnet_nid nid;
 
 	LASSERT(cpt != LNET_LOCK_EX);
+	lnet_nid4_to_nid(nid4, &nid);
 
 	list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
-			if (ni->ni_nid == nid)
+			if (nid_same(&ni->ni_nid, &nid))
 				return ni;
 		}
 	}
@@ -1826,7 +1839,9 @@ struct lnet_ping_buffer *
 
 			ns = &pbuf->pb_info.pi_ni[i];
 
-			ns->ns_nid = ni->ni_nid;
+			if (!nid_is_nid4(&ni->ni_nid))
+				continue;
+			ns->ns_nid = lnet_nid_to_nid4(&ni->ni_nid);
 
 			lnet_ni_lock(ni);
 			ns->ns_status = lnet_ni_get_status_locked(ni);
@@ -2142,7 +2157,7 @@ static void lnet_push_target_fini(void)
 			++i;
 			if ((i & (-i)) == i) {
 				CDEBUG(D_WARNING, "Waiting for zombie LNI %s\n",
-				       libcfs_nid2str(ni->ni_nid));
+				       libcfs_nidstr(&ni->ni_nid));
 			}
 			schedule_timeout_uninterruptible(HZ);
 
@@ -2167,7 +2182,7 @@ static void lnet_push_target_fini(void)
 
 		if (!islo)
 			CDEBUG(D_LNI, "Removed LNI %s\n",
-			       libcfs_nid2str(ni->ni_nid));
+			       libcfs_nidstr(&ni->ni_nid));
 
 		lnet_ni_free(ni);
 		i = 2;
@@ -2283,7 +2298,6 @@ static void lnet_push_target_fini(void)
 	struct lnet_tx_queue *tq;
 	int i;
 	struct lnet_net *net = ni->ni_net;
-	u32 seed;
 
 	mutex_lock(&the_lnet.ln_lnd_mutex);
 
@@ -2339,18 +2353,12 @@ static void lnet_push_target_fini(void)
 		tq->tq_credits = lnet_ni_tq_credits(ni);
 	}
 
-	/* Nodes with small feet have little entropy. The NID for this
-	 * node gives the most entropy in the low bits.
-	 */
-	seed = LNET_NIDADDR(ni->ni_nid);
-	add_device_randomness(&seed, sizeof(seed));
-
 	atomic_set(&ni->ni_tx_credits,
 		   lnet_ni_tq_credits(ni) * ni->ni_ncpts);
 	atomic_set(&ni->ni_healthv, LNET_MAX_HEALTH_VALUE);
 
 	CDEBUG(D_LNI, "Added LNI %s [%d/%d/%d/%d]\n",
-	       libcfs_nid2str(ni->ni_nid),
+	       libcfs_nidstr(&ni->ni_nid),
 	       ni->ni_net->net_tunables.lct_peer_tx_credits,
 	       lnet_ni_tq_credits(ni) * LNET_CPT_NUMBER,
 	       ni->ni_net->net_tunables.lct_peer_rtr_credits,
@@ -2924,7 +2932,7 @@ void lnet_lib_exit(void)
 	size_t min_size = 0;
 	int i;
 
-	if (!ni || !cfg_ni || !tun)
+	if (!ni || !cfg_ni || !tun || !nid_is_nid4(&ni->ni_nid))
 		return;
 
 	if (ni->ni_interface) {
@@ -2933,7 +2941,7 @@ void lnet_lib_exit(void)
 			sizeof(cfg_ni->lic_ni_intf));
 	}
 
-	cfg_ni->lic_nid = ni->ni_nid;
+	cfg_ni->lic_nid = lnet_nid_to_nid4(&ni->ni_nid);
 	cfg_ni->lic_status = lnet_ni_get_status_locked(ni);
 	cfg_ni->lic_dev_cpt = ni->ni_dev_cpt;
 
@@ -2993,7 +3001,7 @@ void lnet_lib_exit(void)
 	size_t min_size, tunable_size = 0;
 	int i;
 
-	if (!ni || !config)
+	if (!ni || !config || !nid_is_nid4(&ni->ni_nid))
 		return;
 
 	net_config = (struct lnet_ioctl_net_config *)config->cfg_bulk;
@@ -3007,7 +3015,7 @@ void lnet_lib_exit(void)
 		ni->ni_interface,
 		sizeof(net_config->ni_interface));
 
-	config->cfg_nid = ni->ni_nid;
+	config->cfg_nid = lnet_nid_to_nid4(&ni->ni_nid);
 	config->cfg_config_u.cfg_net.net_peer_timeout =
 		ni->ni_net->net_tunables.lct_peer_timeout;
 	config->cfg_config_u.cfg_net.net_max_tx_credits =
@@ -3287,7 +3295,7 @@ static int lnet_add_net_common(struct lnet_net *net,
 		rc = lnet_udsp_apply_policies_on_ni(ni);
 		if (rc)
 			CERROR("Failed to apply UDSPs on ni %s\n",
-			       libcfs_nid2str(ni->ni_nid));
+			       libcfs_nidstr(&ni->ni_nid));
 	}
 	lnet_net_unlock(LNET_LOCK_EX);
 
@@ -3637,12 +3645,13 @@ u32 lnet_get_dlc_seq_locked(void)
 	lnet_net_lock(LNET_LOCK_EX);
 	list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
-			if (ni->ni_nid == nid || all) {
+			if (all || (nid_is_nid4(&ni->ni_nid) &&
+				    lnet_nid_to_nid4(&ni->ni_nid) == nid)) {
 				atomic_set(&ni->ni_healthv, value);
 				if (list_empty(&ni->ni_recovery) &&
 				    value < LNET_MAX_HEALTH_VALUE) {
 					CERROR("manually adding local NI %s to recovery\n",
-					       libcfs_nid2str(ni->ni_nid));
+					       libcfs_nidstr(&ni->ni_nid));
 					list_add_tail(&ni->ni_recovery,
 						      &the_lnet.ln_mt_localNIRecovq);
 					lnet_ni_addref_locked(ni, 0);
@@ -3666,7 +3675,7 @@ u32 lnet_get_dlc_seq_locked(void)
 	lnet_net_lock(LNET_LOCK_EX);
 	list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
-			if (ni->ni_nid != nid && !all)
+			if (lnet_nid_to_nid4(&ni->ni_nid) != nid && !all)
 				continue;
 			if (LNET_NETTYP(net->net_id) == SOCKLND)
 				ni->ni_lnd_tunables.lnd_tun_u.lnd_sock.lnd_conns_per_peer = value;
@@ -3729,7 +3738,9 @@ u32 lnet_get_dlc_seq_locked(void)
 
 	lnet_net_lock(LNET_LOCK_EX);
 	list_for_each_entry(ni, &the_lnet.ln_mt_localNIRecovq, ni_recovery) {
-		list->rlst_nid_array[i] = ni->ni_nid;
+		if (!nid_is_nid4(&ni->ni_nid))
+			continue;
+		list->rlst_nid_array[i] = lnet_nid_to_nid4(&ni->ni_nid);
 		i++;
 		if (i >= LNET_MAX_SHOW_NUM_NID)
 			break;
@@ -4381,10 +4392,13 @@ void LNetDebugPeer(struct lnet_process_id id)
 
 	list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
+			if (!nid_is_nid4(&ni->ni_nid))
+				/* FIXME this needs to be handled */
+				continue;
 			if (index-- != 0)
 				continue;
 
-			id->nid = ni->ni_nid;
+			id->nid = lnet_nid_to_nid4(&ni->ni_nid);
 			id->pid = the_lnet.ln_pid;
 			rc = 0;
 			break;
diff --git a/net/lnet/lnet/config.c b/net/lnet/lnet/config.c
index 0117611..0c833fe 100644
--- a/net/lnet/lnet/config.c
+++ b/net/lnet/lnet/config.c
@@ -375,7 +375,7 @@ struct lnet_net *
 	if (ni->ni_interface) {
 		LCONSOLE_ERROR_MSG(0x115, "%s: interface %s already set for net %s: rc = %d\n",
 				   iface, ni->ni_interface,
-				   libcfs_net2str(LNET_NIDNET(ni->ni_nid)),
+				   libcfs_net2str(LNET_NID_NET(&ni->ni_nid)),
 				   -EINVAL);
 		return -EINVAL;
 	}
@@ -435,7 +435,8 @@ struct lnet_net *
 
 	ni->ni_net = net;
 	/* LND will fill in the address part of the NID */
-	ni->ni_nid = LNET_MKNID(net->net_id, 0);
+	ni->ni_nid.nid_type = LNET_NETTYP(net->net_id);
+	ni->ni_nid.nid_num = cpu_to_be16(LNET_NETNUM(net->net_id));
 
 	/* Store net namespace in which current ni is being created */
 	if (current->nsproxy && current->nsproxy->net_ns)
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 035bda3..c70ec37 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -537,7 +537,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	int rc;
 
 	LASSERT(!in_interrupt());
-	LASSERT(ni->ni_nid == LNET_NID_LO_0 ||
+	LASSERT(nid_is_lo0(&ni->ni_nid) ||
 		(msg->msg_txcredit && msg->msg_peertxcredit));
 
 	rc = ni->ni_net->net_lnd->lnd_send(ni, priv, msg);
@@ -648,7 +648,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 
 	/* can't get here if we're sending to the loopback interface */
 	if (the_lnet.ln_loni)
-		LASSERT(lp->lpni_nid != the_lnet.ln_loni->ni_nid);
+		LASSERT(lp->lpni_nid !=
+			lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid));
 
 	/* NB 'lp' is always the next hop */
 	if (!(msg->msg_target.pid & LNET_PID_USERFLAG) &&
@@ -1133,10 +1134,11 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 		 * preferred, then let's use it
 		 */
 		if (best_ni) {
+			/* FIXME need to handle large-addr nid */
 			lpni_is_preferred = lnet_peer_is_pref_nid_locked(lpni,
-									 best_ni->ni_nid);
+									 lnet_nid_to_nid4(&best_ni->ni_nid));
 			CDEBUG(D_NET, "%s lpni_is_preferred = %d\n",
-			       libcfs_nid2str(best_ni->ni_nid),
+			       libcfs_nidstr(&best_ni->ni_nid),
 			       lpni_is_preferred);
 		} else {
 			lpni_is_preferred = false;
@@ -1514,9 +1516,9 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 		if (best_ni)
 			CDEBUG(D_NET,
 			       "compare ni %s [c:%d, d:%d, s:%d, p:%u, g:%u] with best_ni %s [c:%d, d:%d, s:%d, p:%u, g:%u]\n",
-			       libcfs_nid2str(ni->ni_nid), ni_credits, distance,
+			       libcfs_nidstr(&ni->ni_nid), ni_credits, distance,
 			       ni->ni_seq, ni_sel_prio, ni_dev_prio,
-			       (best_ni) ? libcfs_nid2str(best_ni->ni_nid)
+			       (best_ni) ? libcfs_nidstr(&best_ni->ni_nid)
 			       : "not selected", best_credits, shortest_distance,
 			       (best_ni) ? best_ni->ni_seq : 0,
 			       best_sel_prio, best_dev_prio);
@@ -1561,7 +1563,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	}
 
 	CDEBUG(D_NET, "selected best_ni %s\n",
-	       (best_ni) ? libcfs_nid2str(best_ni->ni_nid) : "no selection");
+	       (best_ni) ? libcfs_nidstr(&best_ni->ni_nid) : "no selection");
 
 	return best_ni;
 }
@@ -1620,11 +1622,12 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 
 	/* No send credit hassles with LOLND */
 	lnet_ni_addref_locked(the_lnet.ln_loni, cpt);
-	msg->msg_hdr.dest_nid = cpu_to_le64(the_lnet.ln_loni->ni_nid);
+	msg->msg_hdr.dest_nid =
+		cpu_to_le64(lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid));
 	if (!msg->msg_routing)
 		msg->msg_hdr.src_nid =
-			cpu_to_le64(the_lnet.ln_loni->ni_nid);
-	msg->msg_target.nid = the_lnet.ln_loni->ni_nid;
+			cpu_to_le64(lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid));
+	msg->msg_target.nid = lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid);
 	lnet_msg_commit(msg, cpt);
 	msg->msg_txni = the_lnet.ln_loni;
 
@@ -1655,7 +1658,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 
 	CDEBUG(D_NET,
 	       "%s NI seq info: [%d:%d:%d:%u] %s LPNI seq info [%d:%d:%d:%u]\n",
-	       libcfs_nid2str(best_ni->ni_nid),
+	       libcfs_nidstr(&best_ni->ni_nid),
 	       best_ni->ni_seq, best_ni->ni_net->net_seq,
 	       atomic_read(&best_ni->ni_tx_credits),
 	       best_ni->ni_sel_priority,
@@ -1719,7 +1722,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	 * originator and set it here.
 	 */
 	if (!msg->msg_routing)
-		msg->msg_hdr.src_nid = cpu_to_le64(msg->msg_txni->ni_nid);
+		msg->msg_hdr.src_nid =
+			cpu_to_le64(lnet_nid_to_nid4(&msg->msg_txni->ni_nid));
 
 	if (routing) {
 		msg->msg_target_is_router = 1;
@@ -1757,7 +1761,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	if (!rc)
 		CDEBUG(D_NET, "TRACE: %s(%s:%s) -> %s(%s:%s) %s : %s try# %d\n",
 		       libcfs_nid2str(msg->msg_hdr.src_nid),
-		       libcfs_nid2str(msg->msg_txni->ni_nid),
+		       libcfs_nidstr(&msg->msg_txni->ni_nid),
 		       libcfs_nid2str(sd->sd_src_nid),
 		       libcfs_nid2str(msg->msg_hdr.dest_nid),
 		       libcfs_nid2str(sd->sd_dst_nid),
@@ -1775,9 +1779,10 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	if (!lnet_peer_is_multi_rail(lpni->lpni_peer_net->lpn_peer) &&
 	    !lnet_msg_is_response(msg) && lpni->lpni_pref_nnids == 0) {
 		CDEBUG(D_NET, "Setting preferred local NID %s on NMR peer %s\n",
-		       libcfs_nid2str(lni->ni_nid),
+		       libcfs_nidstr(&lni->ni_nid),
 		       libcfs_nid2str(lpni->lpni_nid));
-		lnet_peer_ni_set_non_mr_pref_nid(lpni, lni->ni_nid);
+		lnet_peer_ni_set_non_mr_pref_nid(lpni,
+						 lnet_nid_to_nid4(&lni->ni_nid));
 	}
 }
 
@@ -1828,7 +1833,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	}
 
 	if (sd->sd_best_lpni &&
-	    sd->sd_best_lpni->lpni_nid == the_lnet.ln_loni->ni_nid)
+	    sd->sd_best_lpni->lpni_nid ==
+	    lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid))
 		return lnet_handle_lo_send(sd);
 	else if (sd->sd_best_lpni)
 		return lnet_handle_send(sd);
@@ -1951,7 +1957,7 @@ struct lnet_ni *
 	struct lnet_peer_ni *gwni = NULL;
 	bool route_found = false;
 	lnet_nid_t src_nid = (sd->sd_src_nid != LNET_NID_ANY) ? sd->sd_src_nid :
-			      sd->sd_best_ni ? sd->sd_best_ni->ni_nid :
+			      sd->sd_best_ni ? lnet_nid_to_nid4(&sd->sd_best_ni->ni_nid) :
 			      LNET_NID_ANY;
 	int best_lpn_healthv = 0;
 	u32 best_lpn_sel_prio = LNET_MAX_SELECTION_PRIORITY;
@@ -2454,7 +2460,8 @@ struct lnet_ni *
 		 * network
 		 */
 		if (sd->sd_best_lpni &&
-		    sd->sd_best_lpni->lpni_nid == the_lnet.ln_loni->ni_nid) {
+		    sd->sd_best_lpni->lpni_nid ==
+		    lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid)) {
 			/* in case we initially started with a routed
 			 * destination, let's reset to local
 			 */
@@ -3195,7 +3202,7 @@ struct lnet_mt_event_info {
 		lnet_net_unlock(0);
 
 		CDEBUG(D_NET, "attempting to recover local ni: %s\n",
-		       libcfs_nid2str(ni->ni_nid));
+		       libcfs_nidstr(&ni->ni_nid));
 
 		lnet_ni_lock(ni);
 		if (!(ni->ni_recovery_state & LNET_NI_RECOVERY_PENDING)) {
@@ -3205,7 +3212,7 @@ struct lnet_mt_event_info {
 			ev_info = kzalloc(sizeof(*ev_info), GFP_NOFS);
 			if (!ev_info) {
 				CERROR("out of memory. Can't recover %s\n",
-				       libcfs_nid2str(ni->ni_nid));
+				       libcfs_nidstr(&ni->ni_nid));
 				lnet_ni_lock(ni);
 				ni->ni_recovery_state &=
 				  ~LNET_NI_RECOVERY_PENDING;
@@ -3218,7 +3225,8 @@ struct lnet_mt_event_info {
 			 * We'll unlink the mdh in this case below.
 			 */
 			LNetInvalidateMDHandle(&ni->ni_ping_mdh);
-			nid = ni->ni_nid;
+			/* FIXME need to handle large-addr nid */
+			nid = lnet_nid_to_nid4(&ni->ni_nid);
 
 			/* remove the NI from the local queue and drop the
 			 * reference count to it while we're recovering
@@ -3986,11 +3994,12 @@ void lnet_monitor_thr_stop(void)
 	lnet_ni_recv(ni, msg->msg_private, NULL, 0, 0, 0, 0);
 	msg->msg_receiving = 0;
 
-	rc = lnet_send(ni->ni_nid, msg, msg->msg_from);
+	/* FIXME need to handle large-addr nid */
+	rc = lnet_send(lnet_nid_to_nid4(&ni->ni_nid), msg, msg->msg_from);
 	if (rc < 0) {
 		/* didn't get as far as lnet_ni_send() */
 		CERROR("%s: Unable to send REPLY for GET from %s: %d\n",
-		       libcfs_nid2str(ni->ni_nid),
+		       libcfs_nidstr(&ni->ni_nid),
 		       libcfs_id2str(info.mi_id), rc);
 
 		lnet_finalize(msg, rc);
@@ -4020,7 +4029,7 @@ void lnet_monitor_thr_stop(void)
 	md = lnet_wire_handle2md(&hdr->msg.reply.dst_wmd);
 	if (!md || !md->md_threshold || md->md_me) {
 		CNETERR("%s: Dropping REPLY from %s for %s MD %#llx.%#llx\n",
-			libcfs_nid2str(ni->ni_nid), libcfs_id2str(src),
+			libcfs_nidstr(&ni->ni_nid), libcfs_id2str(src),
 			!md ? "invalid" : "inactive",
 			hdr->msg.reply.dst_wmd.wh_interface_cookie,
 			hdr->msg.reply.dst_wmd.wh_object_cookie);
@@ -4040,7 +4049,7 @@ void lnet_monitor_thr_stop(void)
 	if (mlength < rlength &&
 	    !(md->md_options & LNET_MD_TRUNCATE)) {
 		CNETERR("%s: Dropping REPLY from %s length %d for MD %#llx would overflow (%d)\n",
-			libcfs_nid2str(ni->ni_nid), libcfs_id2str(src),
+			libcfs_nidstr(&ni->ni_nid), libcfs_id2str(src),
 			rlength, hdr->msg.reply.dst_wmd.wh_object_cookie,
 			mlength);
 		lnet_res_unlock(cpt);
@@ -4048,7 +4057,7 @@ void lnet_monitor_thr_stop(void)
 	}
 
 	CDEBUG(D_NET, "%s: Reply from %s of length %d/%d into md %#llx\n",
-	       libcfs_nid2str(ni->ni_nid), libcfs_id2str(src),
+	       libcfs_nidstr(&ni->ni_nid), libcfs_id2str(src),
 	       mlength, rlength, hdr->msg.reply.dst_wmd.wh_object_cookie);
 
 	lnet_msg_attach_md(msg, md, 0, mlength);
@@ -4088,7 +4097,7 @@ void lnet_monitor_thr_stop(void)
 		/* Don't moan; this is expected */
 		CDEBUG(D_NET,
 		       "%s: Dropping ACK from %s to %s MD %#llx.%#llx\n",
-		       libcfs_nid2str(ni->ni_nid), libcfs_id2str(src),
+		       libcfs_nidstr(&ni->ni_nid), libcfs_id2str(src),
 		       !md ? "invalid" : "inactive",
 		       hdr->msg.ack.dst_wmd.wh_interface_cookie,
 		       hdr->msg.ack.dst_wmd.wh_object_cookie);
@@ -4101,7 +4110,7 @@ void lnet_monitor_thr_stop(void)
 	}
 
 	CDEBUG(D_NET, "%s: ACK from %s into md %#llx\n",
-	       libcfs_nid2str(ni->ni_nid), libcfs_id2str(src),
+	       libcfs_nidstr(&ni->ni_nid), libcfs_id2str(src),
 	       hdr->msg.ack.dst_wmd.wh_object_cookie);
 
 	lnet_msg_attach_md(msg, md, 0, 0);
@@ -4213,12 +4222,13 @@ void lnet_monitor_thr_stop(void)
 	dest_pid = le32_to_cpu(hdr->dest_pid);
 	payload_length = le32_to_cpu(hdr->payload_length);
 
-	for_me = (ni->ni_nid == dest_nid);
+	/* FIXME handle large-addr nids */
+	for_me = (lnet_nid_to_nid4(&ni->ni_nid) == dest_nid);
 	cpt = lnet_cpt_of_nid(from_nid, ni);
 
 	CDEBUG(D_NET, "TRACE: %s(%s) <- %s : %s\n",
 	       libcfs_nid2str(dest_nid),
-	       libcfs_nid2str(ni->ni_nid),
+	       libcfs_nidstr(&ni->ni_nid),
 	       libcfs_nid2str(src_nid),
 	       lnet_msgtyp2str(type));
 
@@ -4274,7 +4284,7 @@ void lnet_monitor_thr_stop(void)
 	 * or malicious so we chop them off at the knees :)
 	 */
 	if (!for_me) {
-		if (LNET_NIDNET(dest_nid) == LNET_NIDNET(ni->ni_nid)) {
+		if (LNET_NIDNET(dest_nid) == LNET_NID_NET(&ni->ni_nid)) {
 			/* should have gone direct */
 			CERROR("%s, src %s: Bad dest nid %s (should have been sent direct)\n",
 			       libcfs_nid2str(from_nid),
@@ -4324,8 +4334,9 @@ void lnet_monitor_thr_stop(void)
 		goto drop;
 	}
 
+	/* FIXME need to support large-addr nid */
 	if (!list_empty(&the_lnet.ln_drop_rules) &&
-	    lnet_drop_rule_match(hdr, ni->ni_nid, NULL)) {
+	    lnet_drop_rule_match(hdr, lnet_nid_to_nid4(&ni->ni_nid), NULL)) {
 		CDEBUG(D_NET, "%s, src %s, dst %s: Dropping %s to simulate silent message loss\n",
 		       libcfs_nid2str(from_nid), libcfs_nid2str(src_nid),
 		       libcfs_nid2str(dest_nid), lnet_msgtyp2str(type));
@@ -4368,7 +4379,9 @@ void lnet_monitor_thr_stop(void)
 	}
 
 	lnet_net_lock(cpt);
-	lpni = lnet_nid2peerni_locked(from_nid, ni->ni_nid, cpt);
+	/* FIXME support large-addr nid */
+	lpni = lnet_nid2peerni_locked(from_nid, lnet_nid_to_nid4(&ni->ni_nid),
+				      cpt);
 	if (IS_ERR(lpni)) {
 		lnet_net_unlock(cpt);
 		CERROR("%s, src %s: Dropping %s (error %ld looking up sender)\n",
@@ -4790,7 +4803,7 @@ struct lnet_msg *
 	msg = kmem_cache_zalloc(lnet_msg_cachep, GFP_NOFS);
 	if (!msg) {
 		CERROR("%s: Dropping REPLY from %s: can't allocate msg\n",
-		       libcfs_nid2str(ni->ni_nid), libcfs_id2str(peer_id));
+		       libcfs_nidstr(&ni->ni_nid), libcfs_id2str(peer_id));
 		goto drop;
 	}
 
@@ -4801,7 +4814,7 @@ struct lnet_msg *
 
 	if (!getmd->md_threshold) {
 		CERROR("%s: Dropping REPLY from %s for inactive MD %p\n",
-		       libcfs_nid2str(ni->ni_nid), libcfs_id2str(peer_id),
+		       libcfs_nidstr(&ni->ni_nid), libcfs_id2str(peer_id),
 		       getmd);
 		lnet_res_unlock(cpt);
 		goto drop;
@@ -4810,7 +4823,7 @@ struct lnet_msg *
 	LASSERT(!getmd->md_offset);
 
 	CDEBUG(D_NET, "%s: Reply from %s md %p\n",
-	       libcfs_nid2str(ni->ni_nid), libcfs_id2str(peer_id), getmd);
+	       libcfs_nidstr(&ni->ni_nid), libcfs_id2str(peer_id), getmd);
 
 	/* setup information for lnet_build_msg_event */
 	msg->msg_initiator = getmsg->msg_txpeer->lpni_peer_net->lpn_peer->lp_primary_nid;
@@ -5032,7 +5045,8 @@ struct lnet_msg *
 	cpt = lnet_net_lock_current();
 
 	while ((ni = lnet_get_next_ni_locked(NULL, ni))) {
-		if (ni->ni_nid == dstnid) {
+		/* FIXME support large-addr nid */
+		if (lnet_nid_to_nid4(&ni->ni_nid) == dstnid) {
 			if (srcnidp)
 				*srcnidp = dstnid;
 			if (orderp) {
@@ -5046,7 +5060,7 @@ struct lnet_msg *
 			return local_nid_dist_zero ? 0 : 1;
 		}
 
-		if (!matched_dstnet && LNET_NIDNET(ni->ni_nid) == dstnet) {
+		if (!matched_dstnet && LNET_NID_NET(&ni->ni_nid) == dstnet) {
 			matched_dstnet = true;
 			/* We matched the destination net, but we may have
 			 * additional local NIs to inspect.
@@ -5055,7 +5069,8 @@ struct lnet_msg *
 			 * they may be overwritten if we match local NI above.
 			 */
 			if (srcnidp)
-				*srcnidp = ni->ni_nid;
+				/* FIXME support large-addr nids */
+				*srcnidp = lnet_nid_to_nid4(&ni->ni_nid);
 
 			if (orderp) {
 				/* Check if ni was originally created in
@@ -5110,7 +5125,8 @@ struct lnet_msg *
 				net = lnet_get_net_locked(shortest->lr_lnet);
 				LASSERT(net);
 				ni = lnet_get_next_ni_locked(net, NULL);
-				*srcnidp = ni->ni_nid;
+				/* FIXME support large-addr nids */
+				*srcnidp = lnet_nid_to_nid4(&ni->ni_nid);
 			}
 			if (orderp)
 				*orderp = order;
diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c
index e471848..b1f684f 100644
--- a/net/lnet/lnet/lib-msg.c
+++ b/net/lnet/lnet/lib-msg.c
@@ -473,7 +473,7 @@
 
 	CDEBUG(D_NET,
 	       "%s added to recovery queue. ping count: %u next ping: %lld health :%d\n",
-	       libcfs_nid2str(ni->ni_nid),
+	       libcfs_nidstr(&ni->ni_nid),
 	       ni->ni_ping_count,
 	       ni->ni_next_ping,
 	       atomic_read(&ni->ni_healthv));
@@ -796,10 +796,11 @@
 	/* if we're sending to the LOLND then the msg_txpeer will not be
 	 * set. So no need to sanity check it.
 	 */
-	if (msg->msg_tx_committed && msg->msg_txni->ni_nid != LNET_NID_LO_0)
+	if (msg->msg_tx_committed &&
+	    !nid_is_lo0(&msg->msg_txni->ni_nid))
 		LASSERT(msg->msg_txpeer);
 	else if (msg->msg_tx_committed &&
-		 msg->msg_txni->ni_nid == LNET_NID_LO_0)
+		 nid_is_lo0(&msg->msg_txni->ni_nid))
 		lo = true;
 
 	if (hstatus != LNET_MSG_STATUS_OK &&
@@ -827,7 +828,7 @@
 		LASSERT(ni);
 
 	CDEBUG(D_NET, "health check: %s->%s: %s: %s\n",
-	       libcfs_nid2str(ni->ni_nid),
+	       libcfs_nidstr(&ni->ni_nid),
 	       (lo) ? "self" : libcfs_nid2str(lpni->lpni_nid),
 	       lnet_msgtyp2str(msg->msg_type),
 	       lnet_health_error2str(hstatus));
@@ -1114,7 +1115,7 @@
 	CDEBUG(D_NET,
 	       "src %s(%s)->dst %s: %s simulate health error: %s\n",
 	       libcfs_nid2str(msg->msg_hdr.src_nid),
-	       libcfs_nid2str(msg->msg_txni->ni_nid),
+	       libcfs_nidstr(&msg->msg_txni->ni_nid),
 	       libcfs_nid2str(msg->msg_hdr.dest_nid),
 	       lnet_msgtyp2str(msg->msg_type),
 	       lnet_health_error2str(*hstatus));
diff --git a/net/lnet/lnet/lib-socket.c b/net/lnet/lnet/lib-socket.c
index 317d3cf..7deb48a 100644
--- a/net/lnet/lnet/lib-socket.c
+++ b/net/lnet/lnet/lib-socket.c
@@ -235,9 +235,33 @@ int choose_ipv4_src(u32 *ret, int interface, u32 dst_ipaddr, struct net *ns)
 #if IS_ENABLED(CONFIG_IPV6)
 		case AF_INET6: {
 			struct sockaddr_in6 *sin6 = (void *)&locaddr;
+			int val = 0;
 
 			sin6->sin6_family = AF_INET6;
 			sin6->sin6_addr = in6addr_any;
+
+			/* Make sure we get both IPv4 and IPv6 connections.
+			 * This is the default, but it can be overridden so we
+			 * force it back.
+			 */
+			/* From v5.7-rc6-2614-g5a892ff2facb when
+			 * kernel_setsockopt() was removed until
+			 * sockptr_t (above) there is no clean way to
+			 * pass kernel address to setsockopt.  We could
+			 * use get_fs()/set_fs(), but in this particular
+			 * situation there is an easier way.  It depends
+			 * on the fact that at least for these few
+			 * kernels a NULL address to ipv6_setsockopt()
+			 * is treated like the address of a zero.
+			 */
+			if (ipv6_only_sock(sock->sk) && !val) {
+				void *optval = NULL;
+
+				sock->ops->setsockopt(sock,
+						      IPPROTO_IPV6, IPV6_V6ONLY,
+						      optval, sizeof(val));
+			}
+
 			if (interface >= 0 && remaddr) {
 				struct sockaddr_in6 *rem = (void *)remaddr;
 
@@ -352,7 +376,6 @@ struct socket *
 lnet_sock_listen(int local_port, int backlog, struct net *ns)
 {
 	struct socket *sock;
-	int val = 0;
 	int rc;
 
 	sock = lnet_sock_create(-1, NULL, local_port, ns);
@@ -364,13 +387,6 @@ struct socket *
 		return ERR_PTR(rc);
 	}
 
-	/* Make sure we get both IPv4 and IPv6 connections.
-	 * This is the default, but it can be overridden so
-	 * we force it back.
-	 */
-	kernel_setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY,
-			  (char *)&val, sizeof(val));
-
 	rc = kernel_listen(sock, backlog);
 	if (!rc)
 		return sock;
diff --git a/net/lnet/lnet/lo.c b/net/lnet/lnet/lo.c
index 4ddf1cd..3d3dcf8 100644
--- a/net/lnet/lnet/lo.c
+++ b/net/lnet/lnet/lo.c
@@ -40,7 +40,8 @@
 	LASSERT(!lntmsg->msg_routing);
 	LASSERT(!lntmsg->msg_target_is_router);
 
-	return lnet_parse(ni, &lntmsg->msg_hdr, ni->ni_nid, lntmsg, 0);
+	return lnet_parse(ni, &lntmsg->msg_hdr,
+			  lnet_nid_to_nid4(&ni->ni_nid), lntmsg, 0);
 }
 
 static int
diff --git a/net/lnet/lnet/net_fault.c b/net/lnet/lnet/net_fault.c
index 0d19da4..4c50eec 100644
--- a/net/lnet/lnet/net_fault.c
+++ b/net/lnet/lnet/net_fault.c
@@ -684,7 +684,7 @@ struct delay_daemon_data {
 			list_del_init(&msg->msg_list);
 			ni = msg->msg_txni;
 			CDEBUG(D_NET, "TRACE: msg %p %s -> %s : %s\n", msg,
-			       libcfs_nid2str(ni->ni_nid),
+			       libcfs_nidstr(&ni->ni_nid),
 			       libcfs_nid2str(msg->msg_txpeer->lpni_nid),
 			       lnet_msgtyp2str(msg->msg_type));
 			lnet_ni_send(ni, msg);
diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c
index 209da0f..6da43d5 100644
--- a/net/lnet/lnet/nidstrings.c
+++ b/net/lnet/lnet/nidstrings.c
@@ -906,6 +906,46 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 }
 EXPORT_SYMBOL(libcfs_nid2str_r);
 
+char *
+libcfs_nidstr_r(const struct lnet_nid *nid, char *buf, size_t buf_size)
+{
+	u32 nnum = be16_to_cpu(nid->nid_num);
+	u32 lnd  = nid->nid_type;
+	struct netstrfns *nf;
+
+	if (LNET_NID_IS_ANY(nid)) {
+		strncpy(buf, "<?>", buf_size);
+		buf[buf_size - 1] = '\0';
+		return buf;
+	}
+
+	nf = libcfs_lnd2netstrfns(lnd);
+	if (nf && nid_is_nid4(nid)) {
+		size_t addr_len;
+
+		nf->nf_addr2str(ntohl(nid->nid_addr[0]), buf, buf_size);
+		addr_len = strlen(buf);
+		if (nnum == 0)
+			snprintf(buf + addr_len, buf_size - addr_len, "@%s",
+				 nf->nf_name);
+		else
+			snprintf(buf + addr_len, buf_size - addr_len, "@%s%u",
+				 nf->nf_name, nnum);
+	} else {
+		int l = 0;
+		int words = DIV_ROUND_UP(NID_ADDR_BYTES(nid), 4);
+		int i;
+
+		for (i = 0; i < words && i < 4; i++)
+			l += scnprintf(buf + l, buf_size - l, "%s%x",
+				     i ? ":" : "", ntohl(nid->nid_addr[i]));
+		snprintf(buf + l, buf_size - l, "@<%u:%u>", lnd, nnum);
+	}
+
+	return buf;
+}
+EXPORT_SYMBOL(libcfs_nidstr_r);
+
 static struct netstrfns *
 libcfs_str2net_internal(const char *str, u32 *net)
 {
diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index 9003d47..6335425 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -1691,21 +1691,21 @@ bool lnet_router_checker_active(void)
 	LASSERT(!in_interrupt());
 
 	CDEBUG(D_NET, "%s notifying %s: %s\n",
-	       !ni ? "userspace" : libcfs_nid2str(ni->ni_nid),
+	       !ni ? "userspace" : libcfs_nidstr(&ni->ni_nid),
 	       libcfs_nid2str(nid), alive ? "up" : "down");
 
 	if (ni &&
-	    LNET_NIDNET(ni->ni_nid) != LNET_NIDNET(nid)) {
+	    LNET_NID_NET(&ni->ni_nid) != LNET_NIDNET(nid)) {
 		CWARN("Ignoring notification of %s %s by %s (different net)\n",
 		      libcfs_nid2str(nid), alive ? "birth" : "death",
-		      libcfs_nid2str(ni->ni_nid));
+		      libcfs_nidstr(&ni->ni_nid));
 		return -EINVAL;
 	}
 
 	/* can't do predictions... */
 	if (when > now) {
 		CWARN("Ignoring prediction from %s of %s %s %lld seconds in the future\n",
-		      !ni ? "userspace" : libcfs_nid2str(ni->ni_nid),
+		      ni ? libcfs_nidstr(&ni->ni_nid) : "userspace",
 		      libcfs_nid2str(nid), alive ? "up" : "down", when - now);
 		return -EINVAL;
 	}
diff --git a/net/lnet/lnet/router_proc.c b/net/lnet/lnet/router_proc.c
index 43f70b6..6649f06 100644
--- a/net/lnet/lnet/router_proc.c
+++ b/net/lnet/lnet/router_proc.c
@@ -702,7 +702,7 @@ static int proc_lnet_nis(struct ctl_table *table, int write,
 
 				s += scnprintf(s, tmpstr + tmpsiz - s,
 					       "%-24s %6s %5lld %4d %4d %4d %5d %5d %5d\n",
-					       libcfs_nid2str(ni->ni_nid), stat,
+					       libcfs_nidstr(&ni->ni_nid), stat,
 					       last_alive, *ni->ni_refs[i],
 					       ni->ni_net->net_tunables.lct_peer_tx_credits,
 					       ni->ni_net->net_tunables.lct_peer_rtr_credits,
diff --git a/net/lnet/lnet/udsp.c b/net/lnet/lnet/udsp.c
index 516db98..4495062 100644
--- a/net/lnet/lnet/udsp.c
+++ b/net/lnet/lnet/udsp.c
@@ -213,7 +213,7 @@ enum udsp_apply {
 	struct lnet_ud_nid_descr *ni_match = udi->udi_match;
 	u32 priority = (udi->udi_revert) ? -1 : udi->udi_priority;
 
-	rc = cfs_match_nid_net(ni->ni_nid,
+	rc = cfs_match_nid_net(lnet_nid_to_nid4(&ni->ni_nid),
 			       ni_match->ud_net_id.udn_net_type,
 			       &ni_match->ud_net_id.udn_net_num_range,
 			       &ni_match->ud_addr_range);
@@ -221,7 +221,7 @@ enum udsp_apply {
 		return 0;
 
 	CDEBUG(D_NET, "apply udsp on ni %s\n",
-	       libcfs_nid2str(ni->ni_nid));
+	       libcfs_nidstr(&ni->ni_nid));
 
 	/* Detected match. Set NIDs priority */
 	lnet_ni_set_sel_priority_locked(ni, priority);
@@ -481,7 +481,7 @@ enum udsp_apply {
 		    ni_action->ud_net_id.udn_net_type)
 			continue;
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
-			rc = cfs_match_nid_net(ni->ni_nid,
+			rc = cfs_match_nid_net(lnet_nid_to_nid4(&ni->ni_nid),
 					       ni_action->ud_net_id.udn_net_type,
 					       &ni_action->ud_net_id.udn_net_num_range,
 					       &ni_action->ud_addr_range);
@@ -500,15 +500,16 @@ enum udsp_apply {
 				}
 			}
 			CDEBUG(D_NET, "add nid %s as preferred for peer %s\n",
-			       libcfs_nid2str(ni->ni_nid),
+			       libcfs_nidstr(&ni->ni_nid),
 			       libcfs_nid2str(lpni->lpni_nid));
 			/* match. Add to pref NIDs */
-			rc = lnet_peer_add_pref_nid(lpni, ni->ni_nid);
+			rc = lnet_peer_add_pref_nid(lpni,
+						    lnet_nid_to_nid4(&ni->ni_nid));
 			lnet_net_lock(LNET_LOCK_EX);
 			/* success if EEXIST return */
 			if (rc && rc != -EEXIST) {
 				CERROR("Failed to add %s to %s pref nid list\n",
-				       libcfs_nid2str(ni->ni_nid),
+				       libcfs_nidstr(&ni->ni_nid),
 				       libcfs_nid2str(lpni->lpni_nid));
 				return rc;
 			}
-- 
1.8.3.1


* [lustre-devel] [PATCH 04/24] lnet: add string formating/parsing for IPv6 nids
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (2 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 03/24] lnet: introduce struct lnet_nid James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 05/24] lnet: change lpni_nid in lnet_peer_ni to lnet_nid James Simmons
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

New entries for struct netstrfns:
  nf_addr2str_size
  nf_str2addr_size
which accept or report the size of the address in bytes.
New matching functions that can report or parse IPv4 and IPv6
addresses.

New interface - currently unused - libcfs_strnid() which takes a str
and provides a 'struct lnet_nid' with appropriate nid_size.

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 7224b21156639a63 ("LU-10391 lnet: add string formating/parsing for IPv6 nids")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/43942
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-types.h   |   4 ++
 include/uapi/linux/lnet/nidstr.h |   1 +
 net/lnet/lnet/nidstrings.c       | 109 +++++++++++++++++++++++++++++++++++++--
 3 files changed, 110 insertions(+), 4 deletions(-)
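
Reader's note (not part of the commit): the contract of the two new
netstrfns hooks can be tried out in userspace. The sketch below is a
minimal analogue, assuming inet_pton()/inet_ntop() as stand-ins for the
kernel-only rpc_pton()/rpc_ntop(); the demo_* names are hypothetical and
do not exist in the tree. As in the patch, parsing returns 1 on success
and reports an address size of 4 or 16 bytes, and formatting leaves the
buffer untouched for any other size. Unlike the kernel helper, the
sketch takes a NUL-terminated string rather than a (str, nob) pair.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical analogue of libcfs_ip_str2addr_size().
 * 'addr' must have room for 16 bytes.  Returns 1 and sets *asize to
 * 4 (IPv4) or 16 (IPv6) on success, 0 if 'str' is neither.
 */
static int demo_str2addr_size(const char *str, unsigned char *addr,
			      size_t *asize)
{
	if (inet_pton(AF_INET, str, addr) == 1) {
		*asize = 4;
		return 1;
	}
	if (inet_pton(AF_INET6, str, addr) == 1) {
		*asize = 16;
		return 1;
	}
	return 0;
}

/* Hypothetical analogue of libcfs_ip_addr2str_size(): the address
 * family is chosen from the size, and unsupported sizes are ignored.
 */
static void demo_addr2str_size(const unsigned char *addr, size_t asize,
			       char *str, size_t size)
{
	switch (asize) {
	case 4:
		inet_ntop(AF_INET, addr, str, (socklen_t)size);
		break;
	case 16:
		inet_ntop(AF_INET6, addr, str, (socklen_t)size);
		break;
	default:
		break;
	}
}
```

libcfs_strnid() in the diff below stores the reported size as
nid_size = asize - 4, so a 4-byte IPv4 address keeps nid_size == 0.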

diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index 80cf4f3..5b517cc 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -248,7 +248,11 @@ struct netstrfns {
 	char	*nf_name;
 	char	*nf_modname;
 	void	(*nf_addr2str)(u32 addr, char *str, size_t size);
+	void	(*nf_addr2str_size)(const __be32 *addr, size_t asize,
+				    char *str, size_t size);
 	int	(*nf_str2addr)(const char *str, int nob, u32 *addr);
+	int	(*nf_str2addr_size)(const char *str, int nob,
+				    __be32 *addr, size_t *asize);
 	int	(*nf_parse_addrlist)(char *str, int len,
 				     struct list_head *list);
 	int	(*nf_print_addrlist)(char *buffer, int count,
diff --git a/include/uapi/linux/lnet/nidstr.h b/include/uapi/linux/lnet/nidstr.h
index d5b9d69..13a0d10 100644
--- a/include/uapi/linux/lnet/nidstr.h
+++ b/include/uapi/linux/lnet/nidstr.h
@@ -100,6 +100,7 @@ static inline char *libcfs_nidstr(const struct lnet_nid *nid)
 
 __u32 libcfs_str2net(const char *str);
 lnet_nid_t libcfs_str2nid(const char *str);
+int libcfs_strnid(struct lnet_nid *nid, const char *str);
 int libcfs_str2anynid(lnet_nid_t *nid, const char *str);
 char *libcfs_id2str(struct lnet_process_id id);
 void cfs_free_nidlist(struct list_head *list);
diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c
index 6da43d5..08f828b 100644
--- a/net/lnet/lnet/nidstrings.c
+++ b/net/lnet/lnet/nidstrings.c
@@ -38,6 +38,7 @@
 
 #include <linux/spinlock.h>
 #include <linux/slab.h>
+#include <linux/sunrpc/addr.h>
 #include <linux/libcfs/libcfs.h>
 #include <linux/libcfs/libcfs_string.h>
 #include <uapi/linux/lnet/nidstr.h>
@@ -466,8 +467,31 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 		 (addr >> 8) & 0xff, addr & 0xff);
 }
 
-/*
- * CAVEAT EMPTOR XscanfX
+static void
+libcfs_ip_addr2str_size(const __be32 *addr, size_t asize,
+			char *str, size_t size)
+{
+	struct sockaddr_storage sa = {};
+
+	switch (asize) {
+	case 4:
+		sa.ss_family = AF_INET;
+		memcpy(&((struct sockaddr_in *)(&sa))->sin_addr.s_addr,
+		       addr, asize);
+		break;
+	case 16:
+		sa.ss_family = AF_INET6;
+		memcpy(&((struct sockaddr_in6 *)(&sa))->sin6_addr.s6_addr,
+		       addr, asize);
+		break;
+	default:
+		return;
+	}
+
+	rpc_ntop((struct sockaddr *)&sa, str, size);
+}
+
+/* CAVEAT EMPTOR XscanfX
  * I use "%n" at the end of a sscanf format to detect trailing junk.  However
  * sscanf may return immediately if it sees the terminating '0' in a string, so
  * I initialise the %n variable to the expected length.  If sscanf sets it;
@@ -495,6 +519,37 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 	return 0;
 }
 
+static int
+libcfs_ip_str2addr_size(const char *str, int nob,
+			__be32 *addr, size_t *alen)
+{
+	struct sockaddr_storage sa;
+
+	/* Note: 'net' arg to rpc_pton is only needed for link-local
+	 * addresses.  Such addresses would not work with LNet routing,
+	 * so we can assume they aren't used.  So it doesn't matter
+	 * which net namespace is passed.
+	 */
+	if (rpc_pton(&init_net, str, nob,
+		     (struct sockaddr *)&sa, sizeof(sa)) == 0)
+		return 0;
+	if (sa.ss_family == AF_INET6) {
+		memcpy(addr,
+		       &((struct sockaddr_in6 *)(&sa))->sin6_addr.s6_addr,
+		       16);
+		*alen = 16;
+		return 1;
+	}
+	if (sa.ss_family == AF_INET) {
+		memcpy(addr,
+		       &((struct sockaddr_in *)(&sa))->sin_addr.s_addr,
+		       4);
+		*alen = 4;
+		return 1;
+	}
+	return 0;
+}
+
 /* Used by lnet/config.c so it can't be static */
 int
 cfs_ip_addr_parse(char *str, int len, struct list_head *list)
@@ -660,7 +715,9 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 	  .nf_name		= "tcp",
 	  .nf_modname		= "ksocklnd",
 	  .nf_addr2str		= libcfs_ip_addr2str,
+	  .nf_addr2str_size	= libcfs_ip_addr2str_size,
 	  .nf_str2addr		= libcfs_ip_str2addr,
+	  .nf_str2addr_size	= libcfs_ip_str2addr_size,
 	  .nf_parse_addrlist	= cfs_ip_addr_parse,
 	  .nf_print_addrlist	= libcfs_ip_addr_range_print,
 	  .nf_match_addr	= cfs_ip_addr_match
@@ -920,10 +977,14 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 	}
 
 	nf = libcfs_lnd2netstrfns(lnd);
-	if (nf && nid_is_nid4(nid)) {
+	if (nf) {
 		size_t addr_len;
 
-		nf->nf_addr2str(ntohl(nid->nid_addr[0]), buf, buf_size);
+		if (nf->nf_addr2str_size)
+			nf->nf_addr2str_size(nid->nid_addr, NID_ADDR_BYTES(nid),
+					     buf, buf_size);
+		else
+			nf->nf_addr2str(ntohl(nid->nid_addr[0]), buf, buf_size);
 		addr_len = strlen(buf);
 		if (nnum == 0)
 			snprintf(buf + addr_len, buf_size - addr_len, "@%s",
@@ -1020,6 +1081,46 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 }
 EXPORT_SYMBOL(libcfs_str2nid);
 
+int
+libcfs_strnid(struct lnet_nid *nid, const char *str)
+{
+	const char *sep = strchr(str, '@');
+	struct netstrfns *nf;
+	u32 net;
+
+	if (sep) {
+		nf = libcfs_str2net_internal(sep + 1, &net);
+		if (!nf)
+			return -EINVAL;
+	} else {
+		sep = str + strlen(str);
+		net = LNET_MKNET(SOCKLND, 0);
+		nf = libcfs_lnd2netstrfns(SOCKLND);
+		LASSERT(nf);
+	}
+
+	memset(nid, 0, sizeof(*nid));
+	nid->nid_type = LNET_NETTYP(net);
+	nid->nid_num = htons(LNET_NETNUM(net));
+	if (nf->nf_str2addr_size) {
+		size_t asize = 0;
+
+		if (!nf->nf_str2addr_size(str, (int)(sep - str),
+					  nid->nid_addr, &asize))
+			return -EINVAL;
+		nid->nid_size = asize - 4;
+	} else {
+		u32 addr;
+
+		if (!nf->nf_str2addr(str, (int)(sep - str), &addr))
+			return -EINVAL;
+		nid->nid_addr[0] = htonl(addr);
+		nid->nid_size = 0;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(libcfs_strnid);
+
 char *
 libcfs_id2str(struct lnet_process_id id)
 {
-- 
1.8.3.1


* [lustre-devel] [PATCH 05/24] lnet: change lpni_nid in lnet_peer_ni to lnet_nid
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (3 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 04/24] lnet: add string formating/parsing for IPv6 nids James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 06/24] lnet: change lp_primary_nid to struct lnet_nid James Simmons
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

lpni_nid in 'struct lnet_peer_ni' is converted to 'struct lnet_nid'
and various supporting functions updated.

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 6c5561a1e1eeab18e ("LU-10391 lnet: change lpni_nid in lnet_peer_ni to lnet_nid")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/42101
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h  | 14 ++++++---
 include/linux/lnet/lib-types.h |  2 +-
 net/lnet/lnet/api-ni.c         | 61 +++++++++++++++++++++++++++----------
 net/lnet/lnet/lib-move.c       | 49 ++++++++++++++++--------------
 net/lnet/lnet/lib-msg.c        |  2 +-
 net/lnet/lnet/net_fault.c      |  2 +-
 net/lnet/lnet/peer.c           | 68 +++++++++++++++++++++++-------------------
 net/lnet/lnet/router.c         |  7 +++--
 net/lnet/lnet/router_proc.c    |  4 +--
 net/lnet/lnet/udsp.c           | 20 ++++++-------
 10 files changed, 139 insertions(+), 90 deletions(-)
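
Reader's note (not part of the commit): the sketch below models, in
userspace, the two ideas this patch leans on: converting between the
legacy 64-bit NID and the larger structure, and folding all four address
words into the peer hash as the new lnet_nid2peerhash() does. The
demo_* names, the struct layout and the value of DEMO_PEER_HASH_BITS are
assumptions for illustration; the real definitions come from earlier
patches in this series and from lib-types.h. The multiplier is the
kernel's GOLDEN_RATIO_32 constant used by hash_32(). Special cases such
as LNET_NID_ANY are ignored here.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define DEMO_GOLDEN_RATIO_32	0x61C88647u
#define DEMO_PEER_HASH_BITS	9	/* assumed table size of 2^9 */

/* Assumed layout, modelled on struct lnet_nid. */
struct demo_nid {
	uint8_t  size;		/* address bytes minus 4; 0 == legacy */
	uint8_t  type;		/* LND type */
	uint16_t num_be;	/* network number, big-endian */
	uint32_t addr_be[4];	/* address words, big-endian */
};

/* Legacy NID: net (type << 16 | num) in the high 32 bits, addr low. */
static void demo_nid4_to_nid(uint64_t nid4, struct demo_nid *nid)
{
	uint32_t net = (uint32_t)(nid4 >> 32);

	memset(nid, 0, sizeof(*nid));	/* unused address words stay 0 */
	nid->type = (uint8_t)(net >> 16);
	nid->num_be = htons((uint16_t)(net & 0xffff));
	nid->addr_be[0] = htonl((uint32_t)nid4);
}

static int demo_nid_is_nid4(const struct demo_nid *nid)
{
	return nid->size == 0;
}

static uint32_t demo_nid_net(const struct demo_nid *nid)
{
	return ((uint32_t)nid->type << 16) | ntohs(nid->num_be);
}

/* Only meaningful when demo_nid_is_nid4() is true. */
static uint64_t demo_nid_to_nid4(const struct demo_nid *nid)
{
	return ((uint64_t)demo_nid_net(nid) << 32) | ntohl(nid->addr_be[0]);
}

/* Multiplicative hash keeping the top 'bits' bits, like hash_32(). */
static uint32_t demo_hash_32(uint32_t val, unsigned int bits)
{
	return (val * DEMO_GOLDEN_RATIO_32) >> (32 - bits);
}

/* Chain every address word, then mix in the net and reduce. */
static uint32_t demo_nid2peerhash(const struct demo_nid *nid)
{
	uint32_t h = 0;
	int i;

	for (i = 0; i < 4; i++)
		h = demo_hash_32(nid->addr_be[i] ^ h, 32);
	return demo_hash_32(demo_nid_net(nid) ^ h, DEMO_PEER_HASH_BITS);
}
```

Because a converted legacy NID has its last three address words zeroed,
it hashes to the same bucket on every lookup. This is why
lnet_find_peer_ni_locked() in the diff below can convert its nid4
argument on the stack and still find the entry.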

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index acc069d..05c099d 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -459,9 +459,14 @@ struct lnet_ni *
 			  char *iface);
 
 static inline int
-lnet_nid2peerhash(lnet_nid_t nid)
+lnet_nid2peerhash(struct lnet_nid *nid)
 {
-	return hash_long(nid, LNET_PEER_HASH_BITS);
+	u32 h = 0;
+	int i;
+
+	for (i = 0; i < 4; i++)
+		h = hash_32(nid->nid_addr[i]^h, 32);
+	return hash_32(LNET_NID_NET(nid) ^ h, LNET_PEER_HASH_BITS);
 }
 
 static inline struct list_head *
@@ -476,7 +481,7 @@ struct lnet_ni *
 extern int avoid_asym_router_failure;
 
 unsigned int lnet_nid_cpt_hash(lnet_nid_t nid, unsigned int number);
-int lnet_cpt_of_nid_locked(lnet_nid_t nid, struct lnet_ni *ni);
+int lnet_cpt_of_nid_locked(struct lnet_nid *nid, struct lnet_ni *ni);
 int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni);
 struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt);
 struct lnet_ni *lnet_nid2ni_addref(lnet_nid_t nid);
@@ -900,7 +905,8 @@ int lnet_get_peer_ni_info(u32 peer_index, u64 *nid,
 static inline bool
 lnet_peer_ni_is_primary(struct lnet_peer_ni *lpni)
 {
-	return lpni->lpni_nid == lpni->lpni_peer_net->lpn_peer->lp_primary_nid;
+	return lnet_nid_to_nid4(&lpni->lpni_nid) ==
+			lpni->lpni_peer_net->lpn_peer->lp_primary_nid;
 }
 
 bool lnet_peer_is_uptodate(struct lnet_peer *lp);
diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index 5b517cc..a6223d2 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -606,7 +606,7 @@ struct lnet_peer_ni {
 	/* network peer is on */
 	struct lnet_net		*lpni_net;
 	/* peer's NID */
-	lnet_nid_t		 lpni_nid;
+	struct lnet_nid		 lpni_nid;
 	/* # refs */
 	struct kref		 lpni_kref;
 	/* health value for the peer */
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 9471edb..3ae88fc 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -1453,9 +1453,11 @@ struct lnet_net *
 }
 
 int
-lnet_cpt_of_nid_locked(lnet_nid_t nid, struct lnet_ni *ni)
+lnet_cpt_of_nid_locked(struct lnet_nid *nid, struct lnet_ni *ni)
 {
 	struct lnet_net *net;
+	/* FIXME handle long-addr nid */
+	lnet_nid_t nid4 = lnet_nid_to_nid4(nid);
 
 	/* must called with hold of lnet_net_lock */
 	if (LNET_CPT_NUMBER == 1)
@@ -1470,33 +1472,35 @@ struct lnet_net *
 	 */
 	if (ni) {
 		if (ni->ni_cpts)
-			return ni->ni_cpts[lnet_nid_cpt_hash(nid,
+			return ni->ni_cpts[lnet_nid_cpt_hash(nid4,
 							     ni->ni_ncpts)];
 		else
-			return lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
+			return lnet_nid_cpt_hash(nid4, LNET_CPT_NUMBER);
 	}
 
 	/* no NI provided so look at the net */
-	net = lnet_get_net_locked(LNET_NIDNET(nid));
+	net = lnet_get_net_locked(LNET_NID_NET(nid));
 
 	if (net && net->net_cpts) {
-		return net->net_cpts[lnet_nid_cpt_hash(nid, net->net_ncpts)];
+		return net->net_cpts[lnet_nid_cpt_hash(nid4, net->net_ncpts)];
 	}
 
-	return lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
+	return lnet_nid_cpt_hash(nid4, LNET_CPT_NUMBER);
 }
 
 int
-lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni)
+lnet_cpt_of_nid(lnet_nid_t nid4, struct lnet_ni *ni)
 {
 	int cpt;
 	int cpt2;
+	struct lnet_nid nid;
 
 	if (LNET_CPT_NUMBER == 1)
 		return 0; /* the only one */
 
+	lnet_nid4_to_nid(nid4, &nid);
 	cpt = lnet_net_lock_current();
-	cpt2 = lnet_cpt_of_nid_locked(nid, ni);
+	cpt2 = lnet_cpt_of_nid_locked(&nid, ni);
 	lnet_net_unlock(cpt);
 
 	return cpt2;
@@ -1529,18 +1533,16 @@ struct lnet_net *
 }
 
 struct lnet_ni *
-lnet_nid2ni_locked(lnet_nid_t nid4, int cpt)
+lnet_nid_to_ni_locked(struct lnet_nid *nid, int cpt)
 {
 	struct lnet_net *net;
 	struct lnet_ni *ni;
-	struct lnet_nid nid;
 
 	LASSERT(cpt != LNET_LOCK_EX);
-	lnet_nid4_to_nid(nid4, &nid);
 
 	list_for_each_entry(net, &the_lnet.ln_nets, net_list) {
 		list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
-			if (nid_same(&ni->ni_nid, &nid))
+			if (nid_same(&ni->ni_nid, nid))
 				return ni;
 		}
 	}
@@ -1548,13 +1550,25 @@ struct lnet_ni *
 	return NULL;
 }
 
+struct lnet_ni  *
+lnet_nid2ni_locked(lnet_nid_t nid4, int cpt)
+{
+	struct lnet_nid nid;
+
+	lnet_nid4_to_nid(nid4, &nid);
+	return lnet_nid_to_ni_locked(&nid, cpt);
+}
+
 struct lnet_ni *
-lnet_nid2ni_addref(lnet_nid_t nid)
+lnet_nid2ni_addref(lnet_nid_t nid4)
 {
 	struct lnet_ni *ni;
+	struct lnet_nid nid;
+
+	lnet_nid4_to_nid(nid4, &nid);
 
 	lnet_net_lock(0);
-	ni = lnet_nid2ni_locked(nid, 0);
+	ni = lnet_nid_to_ni_locked(&nid, 0);
 	if (ni)
 		lnet_ni_addref_locked(ni, 0);
 	lnet_net_unlock(0);
@@ -1563,6 +1577,21 @@ struct lnet_ni *
 }
 EXPORT_SYMBOL(lnet_nid2ni_addref);
 
+struct lnet_ni *
+lnet_nid_to_ni_addref(struct lnet_nid *nid)
+{
+	struct lnet_ni *ni;
+
+	lnet_net_lock(0);
+	ni = lnet_nid_to_ni_locked(nid, 0);
+	if (ni)
+		lnet_ni_addref_locked(ni, 0);
+	lnet_net_unlock(0);
+
+	return ni;
+}
+EXPORT_SYMBOL(lnet_nid_to_ni_addref);
+
 int
 lnet_islocalnid(lnet_nid_t nid)
 {
@@ -3759,7 +3788,7 @@ u32 lnet_get_dlc_seq_locked(void)
 
 	lnet_net_lock(LNET_LOCK_EX);
 	list_for_each_entry(lpni, &the_lnet.ln_mt_peerNIRecovq, lpni_recovery) {
-		list->rlst_nid_array[i] = lpni->lpni_nid;
+		list->rlst_nid_array[i] = lnet_nid_to_nid4(&lpni->lpni_nid);
 		i++;
 		if (i >= LNET_MAX_SHOW_NUM_NID)
 			break;
@@ -4630,7 +4659,7 @@ static int lnet_ping(struct lnet_process_id id, signed long timeout,
 	p = NULL;
 	while ((p = lnet_get_next_peer_ni_locked(lp, NULL, p)) != NULL) {
 		buf[i].pid = id.pid;
-		buf[i].nid = p->lpni_nid;
+		buf[i].nid = lnet_nid_to_nid4(&p->lpni_nid);
 		if (++i >= n_ids)
 			break;
 	}
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index c70ec37..9a2fdb6 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -562,7 +562,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 					&msg->msg_private);
 	if (rc) {
 		CERROR("recv from %s / send to %s aborted: eager_recv failed %d\n",
-		       libcfs_nid2str(msg->msg_rxpeer->lpni_nid),
+		       libcfs_nidstr(&msg->msg_rxpeer->lpni_nid),
 		       libcfs_id2str(msg->msg_target), rc);
 		LASSERT(rc < 0); /* required by my callers */
 	}
@@ -648,8 +648,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 
 	/* can't get here if we're sending to the loopback interface */
 	if (the_lnet.ln_loni)
-		LASSERT(lp->lpni_nid !=
-			lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid));
+		LASSERT(!nid_same(&lp->lpni_nid, &the_lnet.ln_loni->ni_nid));
 
 	/* NB 'lp' is always the next hop */
 	if (!(msg->msg_target.pid & LNET_PID_USERFLAG) &&
@@ -1150,8 +1149,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 		if (best_lpni)
 			CDEBUG(D_NET,
 			       "n:[%s, %s] h:[%d, %d] p:[%d, %d] c:[%d, %d] s:[%d, %d]\n",
-			       libcfs_nid2str(lpni->lpni_nid),
-			       libcfs_nid2str(best_lpni->lpni_nid),
+			       libcfs_nidstr(&lpni->lpni_nid),
+			       libcfs_nidstr(&best_lpni->lpni_nid),
 			       lpni_healthv, best_lpni_healthv,
 			       lpni_sel_prio, best_sel_prio,
 			       lpni->lpni_txcredits, best_lpni_credits,
@@ -1220,7 +1219,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	}
 
 	CDEBUG(D_NET, "sd_best_lpni = %s\n",
-	       libcfs_nid2str(best_lpni->lpni_nid));
+	       libcfs_nidstr(&best_lpni->lpni_nid));
 
 	return best_lpni;
 }
@@ -1662,7 +1661,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	       best_ni->ni_seq, best_ni->ni_net->net_seq,
 	       atomic_read(&best_ni->ni_tx_credits),
 	       best_ni->ni_sel_priority,
-	       libcfs_nid2str(best_lpni->lpni_nid),
+	       libcfs_nidstr(&best_lpni->lpni_nid),
 	       best_lpni->lpni_seq, best_lpni->lpni_peer_net->lpn_seq,
 	       best_lpni->lpni_txcredits,
 	       best_lpni->lpni_sel_priority);
@@ -1680,7 +1679,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	 * the configuration has changed. We don't have a hold on the best_ni
 	 * yet, and it may have vanished.
 	 */
-	cpt2 = lnet_cpt_of_nid_locked(best_lpni->lpni_nid, best_ni);
+	cpt2 = lnet_cpt_of_nid_locked(&best_lpni->lpni_nid, best_ni);
 	if (sd->sd_cpt != cpt2) {
 		u32 seq = lnet_get_dlc_seq_locked();
 
@@ -1709,7 +1708,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	 * what was originally set in the target or it will be the NID of
 	 * a router if this message should be routed
 	 */
-	msg->msg_target.nid = msg->msg_txpeer->lpni_nid;
+	/* FIXME handle large-addr nids */
+	msg->msg_target.nid = lnet_nid_to_nid4(&msg->msg_txpeer->lpni_nid);
 
 	/* lnet_msg_commit assigns the correct cpt to the message, which
 	 * is used to decrement the correct refcount on the ni when it's
@@ -1737,12 +1737,15 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 		 * lnet_select_pathway() function and is never changed.
 		 * It's safe to use it here.
 		 */
-		msg->msg_hdr.dest_nid = cpu_to_le64(final_dst_lpni->lpni_nid);
+		/* FIXME handle large-addr nid */
+		msg->msg_hdr.dest_nid =
+			cpu_to_le64(lnet_nid_to_nid4(&final_dst_lpni->lpni_nid));
 	} else {
 		/* if we're not routing set the dest_nid to the best peer
 		 * ni NID that we picked earlier in the algorithm.
 		 */
-		msg->msg_hdr.dest_nid = cpu_to_le64(msg->msg_txpeer->lpni_nid);
+		msg->msg_hdr.dest_nid =
+			cpu_to_le64(lnet_nid_to_nid4(&msg->msg_txpeer->lpni_nid));
 	}
 
 	/* if we have response tracker block update it with the next hop
@@ -1751,7 +1754,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	if (msg->msg_md) {
 		rspt = msg->msg_md->md_rspt_ptr;
 		if (rspt) {
-			rspt->rspt_next_hop_nid = msg->msg_txpeer->lpni_nid;
+			rspt->rspt_next_hop_nid =
+				lnet_nid_to_nid4(&msg->msg_txpeer->lpni_nid);
 			CDEBUG(D_NET, "rspt_next_hop_nid = %s\n",
 			       libcfs_nid2str(rspt->rspt_next_hop_nid));
 		}
@@ -1765,7 +1769,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 		       libcfs_nid2str(sd->sd_src_nid),
 		       libcfs_nid2str(msg->msg_hdr.dest_nid),
 		       libcfs_nid2str(sd->sd_dst_nid),
-		       libcfs_nid2str(msg->msg_txpeer->lpni_nid),
+		       libcfs_nidstr(&msg->msg_txpeer->lpni_nid),
 		       libcfs_nid2str(sd->sd_rtr_nid),
 		       lnet_msgtyp2str(msg->msg_type), msg->msg_retry_count);
 
@@ -1780,7 +1784,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	    !lnet_msg_is_response(msg) && lpni->lpni_pref_nnids == 0) {
 		CDEBUG(D_NET, "Setting preferred local NID %s on NMR peer %s\n",
 		       libcfs_nidstr(&lni->ni_nid),
-		       libcfs_nid2str(lpni->lpni_nid));
+		       libcfs_nidstr(&lpni->lpni_nid));
 		lnet_peer_ni_set_non_mr_pref_nid(lpni,
 						 lnet_nid_to_nid4(&lni->ni_nid));
 	}
@@ -1833,8 +1837,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	}
 
 	if (sd->sd_best_lpni &&
-	    sd->sd_best_lpni->lpni_nid ==
-	    lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid))
+	    nid_same(&sd->sd_best_lpni->lpni_nid,
+		      &the_lnet.ln_loni->ni_nid))
 		return lnet_handle_lo_send(sd);
 	else if (sd->sd_best_lpni)
 		return lnet_handle_send(sd);
@@ -1901,7 +1905,7 @@ struct lnet_ni *
 		return rc;
 	}
 
-	new_lpni = lnet_find_peer_ni_locked(lpni->lpni_nid);
+	new_lpni = lnet_find_peer_ni_locked(lnet_nid_to_nid4(&lpni->lpni_nid));
 	if (!new_lpni) {
 		lnet_peer_ni_decref_locked(lpni);
 		return -ENOENT;
@@ -2343,7 +2347,7 @@ struct lnet_ni *
 		/* If there is no best_ni we don't have a route */
 		if (!best_ni) {
 			CERROR("no path to %s from net %s\n",
-			       libcfs_nid2str(best_lpni->lpni_nid),
+			       libcfs_nidstr(&best_lpni->lpni_nid),
 			       libcfs_net2str(best_lpni->lpni_net->net_id));
 			return -EHOSTUNREACH;
 		}
@@ -2460,8 +2464,8 @@ struct lnet_ni *
 		 * network
 		 */
 		if (sd->sd_best_lpni &&
-		    sd->sd_best_lpni->lpni_nid ==
-		    lnet_nid_to_nid4(&the_lnet.ln_loni->ni_nid)) {
+		    nid_same(&sd->sd_best_lpni->lpni_nid,
+			     &the_lnet.ln_loni->ni_nid)) {
 			/* in case we initially started with a routed
 			 * destination, let's reset to local
 			 */
@@ -3461,7 +3465,7 @@ struct lnet_mt_event_info {
 			ev_info = kzalloc(sizeof(*ev_info), GFP_NOFS);
 			if (!ev_info) {
 				CERROR("out of memory. Can't recover %s\n",
-				       libcfs_nid2str(lpni->lpni_nid));
+				       libcfs_nidstr(&lpni->lpni_nid));
 				spin_lock(&lpni->lpni_lock);
 				lpni->lpni_state &=
 					~LNET_PEER_NI_RECOVERY_PENDING;
@@ -3472,7 +3476,8 @@ struct lnet_mt_event_info {
 			/* look at the comments in lnet_recover_local_nis() */
 			mdh = lpni->lpni_recovery_ping_mdh;
 			LNetInvalidateMDHandle(&lpni->lpni_recovery_ping_mdh);
-			nid = lpni->lpni_nid;
+			/* FIXME handle large-addr nid */
+			nid = lnet_nid_to_nid4(&lpni->lpni_nid);
 			lnet_net_lock(0);
 			list_del_init(&lpni->lpni_recovery);
 			lnet_peer_ni_decref_locked(lpni);
diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c
index b1f684f..3c8b7c3 100644
--- a/net/lnet/lnet/lib-msg.c
+++ b/net/lnet/lnet/lib-msg.c
@@ -829,7 +829,7 @@
 
 	CDEBUG(D_NET, "health check: %s->%s: %s: %s\n",
 	       libcfs_nidstr(&ni->ni_nid),
-	       (lo) ? "self" : libcfs_nid2str(lpni->lpni_nid),
+	       (lo) ? "self" : libcfs_nidstr(&lpni->lpni_nid),
 	       lnet_msgtyp2str(msg->msg_type),
 	       lnet_health_error2str(hstatus));
 
diff --git a/net/lnet/lnet/net_fault.c b/net/lnet/lnet/net_fault.c
index 4c50eec..06366df 100644
--- a/net/lnet/lnet/net_fault.c
+++ b/net/lnet/lnet/net_fault.c
@@ -685,7 +685,7 @@ struct delay_daemon_data {
 			ni = msg->msg_txni;
 			CDEBUG(D_NET, "TRACE: msg %p %s -> %s : %s\n", msg,
 			       libcfs_nidstr(&ni->ni_nid),
-			       libcfs_nid2str(msg->msg_txpeer->lpni_nid),
+			       libcfs_nidstr(&msg->msg_txpeer->lpni_nid),
 			       lnet_msgtyp2str(msg->msg_type));
 			lnet_ni_send(ni, msg);
 			continue;
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 720af99..4629a8b 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -59,7 +59,7 @@
 
 	list_for_each_entry_safe(lpni, tmp, &the_lnet.ln_remote_peer_ni_list,
 				 lpni_on_remote_peer_ni_list) {
-		if (LNET_NIDNET(lpni->lpni_nid) == net->net_id) {
+		if (LNET_NID_NET(&lpni->lpni_nid) == net->net_id) {
 			lpni->lpni_net = net;
 
 			spin_lock(&lpni->lpni_lock);
@@ -136,7 +136,7 @@
 	else
 		lpni->lpni_ns_status = LNET_NI_STATUS_UP;
 	lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL;
-	lpni->lpni_nid = nid;
+	lnet_nid4_to_nid(nid, &lpni->lpni_nid);
 	lpni->lpni_cpt = cpt;
 	atomic_set(&lpni->lpni_healthv, LNET_MAX_HEALTH_VALUE);
 
@@ -160,7 +160,7 @@
 			      &the_lnet.ln_remote_peer_ni_list);
 	}
 
-	CDEBUG(D_NET, "%p nid %s\n", lpni, libcfs_nid2str(lpni->lpni_nid));
+	CDEBUG(D_NET, "%p nid %s\n", lpni, libcfs_nidstr(&lpni->lpni_nid));
 
 	return lpni;
 }
@@ -334,7 +334,7 @@
 	}
 	CDEBUG(D_NET, "peer %s NID %s\n",
 	       libcfs_nid2str(lp->lp_primary_nid),
-	       libcfs_nid2str(lpni->lpni_nid));
+	       libcfs_nidstr(&lpni->lpni_nid));
 }
 
 /* called with lnet_net_lock LNET_LOCK_EX held */
@@ -346,7 +346,7 @@
 	/* don't remove a peer_ni if it's also a gateway */
 	if (lnet_isrouter(lpni) && !force) {
 		CERROR("Peer NI %s is a gateway. Can not delete it\n",
-		       libcfs_nid2str(lpni->lpni_nid));
+		       libcfs_nidstr(&lpni->lpni_nid));
 		return -EBUSY;
 	}
 
@@ -567,7 +567,7 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 		/* assign the next peer_ni to be the primary */
 		lpni2 = lnet_get_next_peer_ni_locked(lp, NULL, lpni);
 		LASSERT(lpni2);
-		lp->lp_primary_nid = lpni2->lpni_nid;
+		lp->lp_primary_nid = lnet_nid_to_nid4(&lpni2->lpni_nid);
 	}
 	rc = lnet_peer_ni_del_locked(lpni, force);
 
@@ -596,7 +596,8 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 				continue;
 
 			peer = lpni->lpni_peer_net->lpn_peer;
-			if (peer->lp_primary_nid != lpni->lpni_nid) {
+			if (peer->lp_primary_nid !=
+			    lnet_nid_to_nid4(&lpni->lpni_nid)) {
 				lnet_peer_ni_del_locked(lpni, false);
 				continue;
 			}
@@ -682,7 +683,7 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 }
 
 static struct lnet_peer_ni *
-lnet_get_peer_ni_locked(struct lnet_peer_table *ptable, lnet_nid_t nid)
+lnet_get_peer_ni_locked(struct lnet_peer_table *ptable, struct lnet_nid *nid)
 {
 	struct list_head *peers;
 	struct lnet_peer_ni *lp;
@@ -692,7 +693,7 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 
 	peers = &ptable->pt_hash[lnet_nid2peerhash(nid)];
 	list_for_each_entry(lp, peers, lpni_hashlist) {
-		if (lp->lpni_nid == nid) {
+		if (nid_same(&lp->lpni_nid, nid)) {
 			lnet_peer_ni_addref_locked(lp);
 			return lp;
 		}
@@ -702,16 +703,19 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 }
 
 struct lnet_peer_ni *
-lnet_find_peer_ni_locked(lnet_nid_t nid)
+lnet_find_peer_ni_locked(lnet_nid_t nid4)
 {
 	struct lnet_peer_ni *lpni;
 	struct lnet_peer_table *ptable;
 	int cpt;
+	struct lnet_nid nid;
 
-	cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
+	lnet_nid4_to_nid(nid4, &nid);
+
+	cpt = lnet_nid_cpt_hash(nid4, LNET_CPT_NUMBER);
 
 	ptable = the_lnet.ln_peer_tables[cpt];
-	lpni = lnet_get_peer_ni_locked(ptable, nid);
+	lpni = lnet_get_peer_ni_locked(ptable, &nid);
 
 	return lpni;
 }
@@ -727,7 +731,7 @@ struct lnet_peer_ni *
 		return NULL;
 
 	list_for_each_entry(lpni, &lpn->lpn_peer_nis, lpni_peer_nis) {
-		if (lpni->lpni_nid == nid)
+		if (lnet_nid_to_nid4(&lpni->lpni_nid) == nid)
 			return lpni;
 	}
 
@@ -953,7 +957,7 @@ struct lnet_peer_ni *
 	struct lnet_nid_list *ne;
 
 	CDEBUG(D_NET, "%s: rtr pref emtpy: %d\n",
-	       libcfs_nid2str(lpni->lpni_nid),
+	       libcfs_nidstr(&lpni->lpni_nid),
 	       list_empty(&lpni->lpni_rtr_pref_nids));
 
 	if (list_empty(&lpni->lpni_rtr_pref_nids))
@@ -1071,7 +1075,7 @@ struct lnet_peer_ni *
 	spin_unlock(&lpni->lpni_lock);
 
 	CDEBUG(D_NET, "peer %s nid %s: %d\n",
-	       libcfs_nid2str(lpni->lpni_nid), libcfs_nid2str(nid), rc);
+	       libcfs_nidstr(&lpni->lpni_nid), libcfs_nid2str(nid), rc);
 	return rc;
 }
 
@@ -1096,7 +1100,7 @@ struct lnet_peer_ni *
 	spin_unlock(&lpni->lpni_lock);
 
 	CDEBUG(D_NET, "peer %s: %d\n",
-	       libcfs_nid2str(lpni->lpni_nid), rc);
+	       libcfs_nidstr(&lpni->lpni_nid), rc);
 	return rc;
 }
 
@@ -1472,7 +1476,7 @@ struct lnet_peer_net *
 	lnet_net_lock(LNET_LOCK_EX);
 	/* Add peer_ni to global peer table hash, if necessary. */
 	if (list_empty(&lpni->lpni_hashlist)) {
-		int hash = lnet_nid2peerhash(lpni->lpni_nid);
+		int hash = lnet_nid2peerhash(&lpni->lpni_nid);
 
 		ptable = the_lnet.ln_peer_tables[lpni->lpni_cpt];
 		list_add_tail(&lpni->lpni_hashlist, &ptable->pt_hash[hash]);
@@ -1491,7 +1495,7 @@ struct lnet_peer_net *
 
 	/* Add peer_ni to peer_net */
 	lpni->lpni_peer_net = lpn;
-	if (lp->lp_primary_nid == lpni->lpni_nid)
+	if (lp->lp_primary_nid == lnet_nid_to_nid4(&lpni->lpni_nid))
 		list_add(&lpni->lpni_peer_nis, &lpn->lpn_peer_nis);
 	else
 		list_add_tail(&lpni->lpni_peer_nis, &lpn->lpn_peer_nis);
@@ -1502,7 +1506,7 @@ struct lnet_peer_net *
 	if (!lpn->lpn_peer) {
 		new_lpn = true;
 		lpn->lpn_peer = lp;
-		if (lp->lp_primary_nid == lpni->lpni_nid)
+		if (lp->lp_primary_nid == lnet_nid_to_nid4(&lpni->lpni_nid))
 			list_add(&lpn->lpn_peer_nets, &lp->lp_peer_nets);
 		else
 			list_add_tail(&lpn->lpn_peer_nets, &lp->lp_peer_nets);
@@ -1545,11 +1549,11 @@ struct lnet_peer_net *
 	rc = lnet_udsp_apply_policies_on_lpni(lpni);
 	if (rc)
 		CERROR("Failed to apply UDSPs on lpni %s\n",
-		       libcfs_nid2str(lpni->lpni_nid));
+		       libcfs_nidstr(&lpni->lpni_nid));
 
 	CDEBUG(D_NET, "peer %s NID %s flags %#x\n",
 	       libcfs_nid2str(lp->lp_primary_nid),
-	       libcfs_nid2str(lpni->lpni_nid), flags);
+	       libcfs_nidstr(&lpni->lpni_nid), flags);
 	lnet_peer_ni_decref_locked(lpni);
 	lnet_net_unlock(LNET_LOCK_EX);
 
@@ -1980,7 +1984,7 @@ struct lnet_peer_net *
 	struct lnet_peer_table *ptable;
 	struct lnet_peer_net *lpn;
 
-	CDEBUG(D_NET, "%p nid %s\n", lpni, libcfs_nid2str(lpni->lpni_nid));
+	CDEBUG(D_NET, "%p nid %s\n", lpni, libcfs_nidstr(&lpni->lpni_nid));
 
 	LASSERT(kref_read(&lpni->lpni_kref) == 0);
 	LASSERT(list_empty(&lpni->lpni_txq));
@@ -2581,7 +2585,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 
 	CDEBUG(D_NET, "peer %s NID %s: %d. %s\n",
 	       (lp ? libcfs_nid2str(lp->lp_primary_nid) : "(none)"),
-	       libcfs_nid2str(lpni->lpni_nid), rc,
+	       libcfs_nidstr(&lpni->lpni_nid), rc,
 	       (!block) ? "pending discovery" : "discovery complete");
 
 	return rc;
@@ -2944,7 +2948,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 	/* Construct the list of NIDs present in peer. */
 	lpni = NULL;
 	while ((lpni = lnet_get_next_peer_ni_locked(lp, NULL, lpni)) != NULL)
-		curnis[ncurnis++] = lpni->lpni_nid;
+		curnis[ncurnis++] = lnet_nid_to_nid4(&lpni->lpni_nid);
 
 	/*
 	 * Check for NIDs in pbuf not present in curnis[].
@@ -3897,7 +3901,7 @@ void lnet_peer_discovery_stop(void)
 		aliveness = (lnet_is_peer_ni_alive(lp)) ? "up" : "down";
 
 	CDEBUG(D_WARNING, "%-24s %4d %5s %5d %5d %5d %5d %5d %ld\n",
-	       libcfs_nid2str(lp->lpni_nid), kref_read(&lp->lpni_kref),
+	       libcfs_nidstr(&lp->lpni_nid), kref_read(&lp->lpni_kref),
 	       aliveness, lp->lpni_net->net_tunables.lct_peer_tx_credits,
 	       lp->lpni_rtrcredits, lp->lpni_minrtrcredits,
 	       lp->lpni_txcredits, lp->lpni_mintxcredits, lp->lpni_txqnob);
@@ -3944,6 +3948,8 @@ void lnet_peer_discovery_stop(void)
 		struct list_head *peers = &peer_table->pt_hash[j];
 
 		list_for_each_entry(lp, peers, lpni_hashlist) {
+			if (!nid_is_nid4(&lp->lpni_nid))
+				continue;
 			if (peer_index-- > 0)
 				continue;
 
@@ -3954,7 +3960,7 @@ void lnet_peer_discovery_stop(void)
 					 lnet_is_peer_ni_alive(lp)
 					 ? "up" : "down");
 
-			*nid = lp->lpni_nid;
+			*nid = lnet_nid_to_nid4(&lp->lpni_nid);
 			*refcount = kref_read(&lp->lpni_kref);
 			*ni_peer_tx_credits =
 				lp->lpni_net->net_tunables.lct_peer_tx_credits;
@@ -4028,7 +4034,9 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk)
 	lpni = NULL;
 	rc = -EFAULT;
 	while ((lpni = lnet_get_next_peer_ni_locked(lp, NULL, lpni)) != NULL) {
-		nid = lpni->lpni_nid;
+		if (!nid_is_nid4(&lpni->lpni_nid))
+			continue;
+		nid = lnet_nid_to_nid4(&lpni->lpni_nid);
 		if (copy_to_user(bulk, &nid, sizeof(nid)))
 			goto out_free_hstats;
 		bulk += sizeof(nid);
@@ -4117,7 +4125,7 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk)
 	if (!lpni->lpni_last_alive) {
 		CDEBUG(D_NET,
 		       "lpni %s(%p) not eligible for recovery last alive %lld\n",
-		       libcfs_nid2str(lpni->lpni_nid), lpni,
+		       libcfs_nidstr(&lpni->lpni_nid), lpni,
 		       lpni->lpni_last_alive);
 		return;
 	}
@@ -4125,7 +4133,7 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk)
 	if (lnet_recovery_limit &&
 	    now > lpni->lpni_last_alive + lnet_recovery_limit) {
 		CDEBUG(D_NET, "lpni %s aged out last alive %lld\n",
-		       libcfs_nid2str(lpni->lpni_nid),
+		       libcfs_nidstr(&lpni->lpni_nid),
 		       lpni->lpni_last_alive);
 		/* Reset the ping count so that if this peer NI is added back to
 		 * the recovery queue we will send the first ping right away.
@@ -4141,7 +4149,7 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk)
 
 	CDEBUG(D_NET,
 	       "%s added to recovery queue. ping count: %u next ping: %lld last alive: %lld health: %d\n",
-	       libcfs_nid2str(lpni->lpni_nid),
+	       libcfs_nidstr(&lpni->lpni_nid),
 	       lpni->lpni_ping_count,
 	       lpni->lpni_next_ping,
 	       lpni->lpni_last_alive,
diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index 6335425..f6fcc93 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -1198,7 +1198,7 @@ bool lnet_router_checker_active(void)
 
 		/* discover the router */
 		CDEBUG(D_NET, "discover %s, cpt = %d\n",
-		       libcfs_nid2str(lpni->lpni_nid), cpt);
+		       libcfs_nidstr(&lpni->lpni_nid), cpt);
 		rc = lnet_discover_peer_locked(lpni, cpt, false);
 
 		/* drop ref taken above */
@@ -1772,7 +1772,8 @@ bool lnet_router_checker_active(void)
 		 */
 		if (lnet_is_discovery_disabled(lp)) {
 			list_for_each_entry(route, &lp->lp_routes, lr_gwlist) {
-				if (route->lr_nid == lpni->lpni_nid)
+				if (route->lr_nid ==
+				    lnet_nid_to_nid4(&lpni->lpni_nid))
 					lnet_set_route_aliveness(route, alive);
 			}
 		}
@@ -1781,7 +1782,7 @@ bool lnet_router_checker_active(void)
 	lnet_net_unlock(0);
 
 	if (ni && !alive)
-		lnet_notify_peer_down(ni, lpni->lpni_nid);
+		lnet_notify_peer_down(ni, lnet_nid_to_nid4(&lpni->lpni_nid));
 
 	cpt = lpni->lpni_cpt;
 	lnet_net_lock(cpt);
diff --git a/net/lnet/lnet/router_proc.c b/net/lnet/lnet/router_proc.c
index 6649f06..1a04ac4 100644
--- a/net/lnet/lnet/router_proc.c
+++ b/net/lnet/lnet/router_proc.c
@@ -474,7 +474,7 @@ static int proc_lnet_peers(struct ctl_table *table, int write,
 		}
 
 		if (peer) {
-			lnet_nid_t nid = peer->lpni_nid;
+			struct lnet_nid nid = peer->lpni_nid;
 			int nrefs = kref_read(&peer->lpni_kref);
 			time64_t lastalive = -1;
 			char *aliveness = "NA";
@@ -495,7 +495,7 @@ static int proc_lnet_peers(struct ctl_table *table, int write,
 
 			s += scnprintf(s, tmpstr + tmpsiz - s,
 				       "%-24s %4d %5s %5lld %5d %5d %5d %5d %5d %d\n",
-				       libcfs_nid2str(nid), nrefs, aliveness,
+				       libcfs_nidstr(&nid), nrefs, aliveness,
 				       lastalive, maxcr, rtrcr, minrtrcr, txcr,
 				       mintxcr, txqnob);
 			LASSERT(tmpstr + tmpsiz - s > 0);
diff --git a/net/lnet/lnet/udsp.c b/net/lnet/lnet/udsp.c
index 4495062..03669e6 100644
--- a/net/lnet/lnet/udsp.c
+++ b/net/lnet/lnet/udsp.c
@@ -255,7 +255,7 @@ enum udsp_apply {
 									    lpni)) != NULL) {
 					if (!lnet_get_net_locked(lpni->lpni_peer_net->lpn_net_id))
 						continue;
-					gw_nid = lpni->lpni_nid;
+					gw_nid = lnet_nid_to_nid4(&lpni->lpni_nid);
 					rc = cfs_match_nid_net(gw_nid,
 							       rte_action->ud_net_id.udn_net_type,
 							       &rte_action->ud_net_id.udn_net_num_range,
@@ -437,7 +437,7 @@ enum udsp_apply {
 					CDEBUG(D_NET,
 					       "%spref rtr nids from lpni %s\n",
 					       (revert) ? "revert " : "clear ",
-					       libcfs_nid2str(lpni->lpni_nid));
+					       libcfs_nidstr(&lpni->lpni_nid));
 					lnet_peer_clr_pref_rtrs(lpni);
 					cleared = true;
 					if (revert) {
@@ -448,7 +448,7 @@ enum udsp_apply {
 				CDEBUG(D_NET,
 				       "add gw nid %s as preferred for peer %s\n",
 				       libcfs_nid2str(gw_nid),
-				       libcfs_nid2str(lpni->lpni_nid));
+				       libcfs_nidstr(&lpni->lpni_nid));
 				/* match. Add to pref NIDs */
 				rc = lnet_peer_add_pref_rtr(lpni, gw_nid);
 				lnet_net_lock(LNET_LOCK_EX);
@@ -456,7 +456,7 @@ enum udsp_apply {
 				if (rc && rc != -EEXIST) {
 					CERROR("Failed to add %s to %s pref rtr list\n",
 					       libcfs_nid2str(gw_nid),
-					       libcfs_nid2str(lpni->lpni_nid));
+					       libcfs_nidstr(&lpni->lpni_nid));
 					return rc;
 				}
 			}
@@ -492,7 +492,7 @@ enum udsp_apply {
 				lnet_peer_clr_pref_nids(lpni);
 				CDEBUG(D_NET, "%spref nids from lpni %s\n",
 				       (revert) ? "revert " : "clear ",
-				       libcfs_nid2str(lpni->lpni_nid));
+				       libcfs_nidstr(&lpni->lpni_nid));
 				cleared = true;
 				if (revert) {
 					lnet_net_lock(LNET_LOCK_EX);
@@ -501,7 +501,7 @@ enum udsp_apply {
 			}
 			CDEBUG(D_NET, "add nid %s as preferred for peer %s\n",
 			       libcfs_nidstr(&ni->ni_nid),
-			       libcfs_nid2str(lpni->lpni_nid));
+			       libcfs_nidstr(&lpni->lpni_nid));
 			/* match. Add to pref NIDs */
 			rc = lnet_peer_add_pref_nid(lpni,
 						    lnet_nid_to_nid4(&ni->ni_nid));
@@ -510,7 +510,7 @@ enum udsp_apply {
 			if (rc && rc != -EEXIST) {
 				CERROR("Failed to add %s to %s pref nid list\n",
 				       libcfs_nidstr(&ni->ni_nid),
-				       libcfs_nid2str(lpni->lpni_nid));
+				       libcfs_nidstr(&lpni->lpni_nid));
 				return rc;
 			}
 		}
@@ -530,7 +530,7 @@ enum udsp_apply {
 	bool local = udi->udi_local;
 	enum lnet_udsp_action_type type = udi->udi_type;
 
-	rc = cfs_match_nid_net(lpni->lpni_nid,
+	rc = cfs_match_nid_net(lnet_nid_to_nid4(&lpni->lpni_nid),
 			       lp_match->ud_net_id.udn_net_type,
 			       &lp_match->ud_net_id.udn_net_num_range,
 			       &lp_match->ud_addr_range);
@@ -629,7 +629,7 @@ enum udsp_apply {
 						    lpni_peer_nis) {
 					CDEBUG(D_NET,
 					       "udsp examining lpni %s\n",
-					       libcfs_nid2str(lpni->lpni_nid));
+					       libcfs_nidstr(&lpni->lpni_nid));
 					udi->udi_lpni = lpni;
 					rc = lnet_udsp_apply_rule_on_lpni(udi);
 					if (rc)
@@ -1017,7 +1017,7 @@ struct lnet_udsp *
 
 	info->cud_nid_priority = lpni->lpni_sel_priority;
 	CDEBUG(D_NET, "lpni %s has %d pref nids\n",
-	       libcfs_nid2str(lpni->lpni_nid),
+	       libcfs_nidstr(&lpni->lpni_nid),
 	       lpni->lpni_pref_nnids);
 	if (lpni->lpni_pref_nnids == 1) {
 		info->cud_pref_nid[0] = lpni->lpni_pref.nid;
-- 
1.8.3.1


* [lustre-devel] [PATCH 06/24] lnet: change lp_primary_nid to struct lnet_nid
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (4 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 05/24] lnet: change lpni_nid in lnet_peer_ni to lnet_nid James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 07/24] lnet: change lp_disc_*_nid " James Simmons
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

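Change lp_primary_nid in struct lnet_peer from lnet_nid_t to
struct lnet_nid. lnet_nid_cpt_hash() now takes a struct lnet_nid
pointer, and a new lnet_peer_ni_find_locked() looks up a peer NI
by struct lnet_nid. Callers that still work with lnet_nid_t
convert with lnet_nid_to_nid4() and lnet_nid4_to_nid().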

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 36dd83ee8e143a472 ("LU-10391 lnet: change lp_primary_nid to struct lnet_nid")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/42102
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
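Note for readers: most hunks below follow one pattern. A field that used
to be a 64-bit lnet_nid_t becomes a struct, so "==" comparisons turn into
nid_same() calls and legacy interfaces convert at the boundary. The sketch
below is a simplified user-space model of that pattern, not the kernel
code. Every demo_* name and the field layout are assumptions made for
illustration; the real definitions live in the LNet headers and may
differ (for example in byte order and packing).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for lnet_nid_t: (net << 32) | 32-bit address. */
typedef uint64_t demo_nid4_t;

/* Hypothetical stand-in for struct lnet_nid (layout is an assumption). */
struct demo_nid {
	uint8_t  size;		/* 0 means the address fits the legacy form */
	uint8_t  type;		/* network type */
	uint16_t num;		/* network number */
	uint32_t addr[4];	/* up to 128 bits of address */
};

/* True when the NID can be expressed as a legacy 64-bit value. */
static bool demo_nid_is_nid4(const struct demo_nid *nid)
{
	return nid->size == 0;
}

/* Expand a legacy 64-bit NID into the struct form. */
static void demo_nid4_to_nid(demo_nid4_t nid4, struct demo_nid *nid)
{
	uint32_t net = (uint32_t)(nid4 >> 32);

	memset(nid, 0, sizeof(*nid));
	nid->type = (uint8_t)(net >> 16);
	nid->num = (uint16_t)(net & 0xffff);
	nid->addr[0] = (uint32_t)(nid4 & 0xffffffffu);
}

/* Collapse the struct form back; only valid for small addresses. */
static demo_nid4_t demo_nid_to_nid4(const struct demo_nid *nid)
{
	uint32_t net;

	assert(demo_nid_is_nid4(nid));
	net = ((uint32_t)nid->type << 16) | nid->num;
	return ((demo_nid4_t)net << 32) | nid->addr[0];
}

/* Struct NIDs cannot be compared with "==", so compare field by field. */
static bool demo_nid_same(const struct demo_nid *a, const struct demo_nid *b)
{
	return a->size == b->size && a->type == b->type &&
	       a->num == b->num &&
	       memcmp(a->addr, b->addr, sizeof(a->addr)) == 0;
}
```

In this model a caller checks demo_nid_is_nid4() before collapsing to the
64-bit form, which mirrors the nid_is_nid4() guards this patch adds in
front of the user-space copy-out loops.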
 include/linux/lnet/lib-lnet.h  |  10 +-
 include/linux/lnet/lib-types.h |   2 +-
 net/lnet/lnet/api-ni.c         |  46 +++++--
 net/lnet/lnet/lib-move.c       |   7 +-
 net/lnet/lnet/peer.c           | 297 +++++++++++++++++++++++------------------
 net/lnet/lnet/router.c         |  38 +++---
 net/lnet/lnet/router_proc.c    |   4 +-
 net/lnet/lnet/udsp.c           |   6 +-
 8 files changed, 238 insertions(+), 172 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index 05c099d..a4ec067 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -480,7 +480,8 @@ struct lnet_ni *
 extern struct lnet_lnd the_lolnd;
 extern int avoid_asym_router_failure;
 
-unsigned int lnet_nid_cpt_hash(lnet_nid_t nid, unsigned int number);
+unsigned int lnet_nid_cpt_hash(struct lnet_nid *nid,
+			       unsigned int number);
 int lnet_cpt_of_nid_locked(struct lnet_nid *nid, struct lnet_ni *ni);
 int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni);
 struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt);
@@ -838,6 +839,7 @@ struct lnet_peer_ni *lnet_nid2peerni_locked(lnet_nid_t nid, lnet_nid_t pref,
 struct lnet_peer_ni *lnet_peer_get_ni_locked(struct lnet_peer *lp,
 					     lnet_nid_t nid);
 struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid);
+struct lnet_peer_ni *lnet_peer_ni_find_locked(struct lnet_nid *nid);
 struct lnet_peer *lnet_find_peer(lnet_nid_t nid);
 void lnet_peer_net_added(struct lnet_net *net);
 lnet_nid_t lnet_peer_primary_nid_locked(lnet_nid_t nid);
@@ -905,8 +907,8 @@ int lnet_get_peer_ni_info(u32 peer_index, u64 *nid,
 static inline bool
 lnet_peer_ni_is_primary(struct lnet_peer_ni *lpni)
 {
-	return lnet_nid_to_nid4(&lpni->lpni_nid) ==
-			lpni->lpni_peer_net->lpn_peer->lp_primary_nid;
+	return nid_same(&lpni->lpni_nid,
+			 &lpni->lpni_peer_net->lpn_peer->lp_primary_nid);
 }
 
 bool lnet_peer_is_uptodate(struct lnet_peer *lp);
@@ -1081,7 +1083,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	if (old != alive)
 		CERROR("route to %s through %s has gone from %s to %s\n",
 		       libcfs_net2str(route->lr_net),
-		       libcfs_nid2str(route->lr_gateway->lp_primary_nid),
+		       libcfs_nidstr(&route->lr_gateway->lp_primary_nid),
 		       old ? "up" : "down",
 		       alive ? "up" : "down");
 }
diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index a6223d2..f980f2f 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -669,7 +669,7 @@ struct lnet_peer {
 	struct list_head	lp_rtr_list;
 
 	/* primary NID of the peer */
-	lnet_nid_t		lp_primary_nid;
+	struct lnet_nid		lp_primary_nid;
 
 	/* source NID to use during discovery */
 	lnet_nid_t		lp_disc_src_nid;
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 3ae88fc..f5b022f 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -1433,8 +1433,8 @@ struct lnet_net *
 	return false;
 }
 
-unsigned int
-lnet_nid_cpt_hash(lnet_nid_t nid, unsigned int number)
+static unsigned int
+lnet_nid4_cpt_hash(lnet_nid_t nid, unsigned int number)
 {
 	u64 key = nid;
 	unsigned int val;
@@ -1452,12 +1452,33 @@ struct lnet_net *
 	return (unsigned int)(key + val + (val >> 1)) % number;
 }
 
+unsigned int
+lnet_nid_cpt_hash(struct lnet_nid *nid, unsigned int number)
+{
+	unsigned int val;
+	u32 h = 0;
+	int i;
+
+	LASSERT(number >= 1 && number <= LNET_CPT_NUMBER);
+
+	if (number == 1)
+		return 0;
+
+	if (nid_is_nid4(nid))
+		return lnet_nid4_cpt_hash(lnet_nid_to_nid4(nid), number);
+
+	for (i = 0; i < 4; i++)
+		h = hash_32(nid->nid_addr[i] ^ h, 32);
+	val = hash_32(LNET_NID_NET(nid) ^ h, LNET_CPT_BITS);
+	if (val < number)
+		return val;
+	return (unsigned int)(h + val + (val >> 1)) % number;
+}
+
 int
 lnet_cpt_of_nid_locked(struct lnet_nid *nid, struct lnet_ni *ni)
 {
 	struct lnet_net *net;
-	/* FIXME handle long-addr nid */
-	lnet_nid_t nid4 = lnet_nid_to_nid4(nid);
 
 	/* must called with hold of lnet_net_lock */
 	if (LNET_CPT_NUMBER == 1)
@@ -1472,20 +1493,19 @@ struct lnet_net *
 	 */
 	if (ni) {
 		if (ni->ni_cpts)
-			return ni->ni_cpts[lnet_nid_cpt_hash(nid4,
+			return ni->ni_cpts[lnet_nid_cpt_hash(nid,
 							     ni->ni_ncpts)];
 		else
-			return lnet_nid_cpt_hash(nid4, LNET_CPT_NUMBER);
+			return lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
 	}
 
 	/* no NI provided so look at the net */
 	net = lnet_get_net_locked(LNET_NID_NET(nid));
 
-	if (net && net->net_cpts) {
-		return net->net_cpts[lnet_nid_cpt_hash(nid4, net->net_ncpts)];
-	}
+	if (net && net->net_cpts)
+		return net->net_cpts[lnet_nid_cpt_hash(nid, net->net_ncpts)];
 
-	return lnet_nid_cpt_hash(nid4, LNET_CPT_NUMBER);
+	return lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
 }
 
 int
@@ -4240,7 +4260,8 @@ u32 lnet_get_dlc_seq_locked(void)
 		mutex_lock(&the_lnet.ln_api_mutex);
 		lp = lnet_find_peer(ping->ping_id.nid);
 		if (lp) {
-			ping->ping_id.nid = lp->lp_primary_nid;
+			ping->ping_id.nid =
+				lnet_nid_to_nid4(&lp->lp_primary_nid);
 			ping->mr_info = lnet_peer_is_multi_rail(lp);
 			lnet_peer_decref_locked(lp);
 		}
@@ -4263,7 +4284,8 @@ u32 lnet_get_dlc_seq_locked(void)
 		mutex_lock(&the_lnet.ln_api_mutex);
 		lp = lnet_find_peer(discover->ping_id.nid);
 		if (lp) {
-			discover->ping_id.nid = lp->lp_primary_nid;
+			discover->ping_id.nid =
+				lnet_nid_to_nid4(&lp->lp_primary_nid);
 			discover->mr_info = lnet_peer_is_multi_rail(lp);
 			lnet_peer_decref_locked(lp);
 		}
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 9a2fdb6..8c8db31 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -1324,7 +1324,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	list_for_each_entry(route, &rnet->lrn_routes, lr_list) {
 		if (!lnet_is_route_alive(route))
 			continue;
-		gw_pnid = route->lr_gateway->lp_primary_nid;
+		gw_pnid = lnet_nid_to_nid4(&route->lr_gateway->lp_primary_nid);
 
 		/* no protection on below fields, but it's harmless */
 		if (last_route && (last_route->lr_seq - route->lr_seq < 0))
@@ -1938,7 +1938,7 @@ struct lnet_ni *
 	lnet_peer_ni_decref_locked(new_lpni);
 
 	CDEBUG(D_NET, "msg %p delayed. %s pending discovery\n",
-	       msg, libcfs_nid2str(peer->lp_primary_nid));
+	       msg, libcfs_nidstr(&peer->lp_primary_nid));
 
 	return LNET_DC_WAIT;
 }
@@ -4831,7 +4831,8 @@ struct lnet_msg *
 	       libcfs_nidstr(&ni->ni_nid), libcfs_id2str(peer_id), getmd);
 
 	/* setup information for lnet_build_msg_event */
-	msg->msg_initiator = getmsg->msg_txpeer->lpni_peer_net->lpn_peer->lp_primary_nid;
+	msg->msg_initiator =
+		lnet_nid_to_nid4(&getmsg->msg_txpeer->lpni_peer_net->lpn_peer->lp_primary_nid);
 	msg->msg_from = peer_id.nid;
 	msg->msg_type = LNET_MSG_GET; /* flag this msg as an "optimized" GET */
 	msg->msg_hdr.src_nid = peer_id.nid;
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 4629a8b..7f2c6f3 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -107,13 +107,15 @@
 }
 
 static struct lnet_peer_ni *
-lnet_peer_ni_alloc(lnet_nid_t nid)
+lnet_peer_ni_alloc(lnet_nid_t nid4)
 {
 	struct lnet_peer_ni *lpni;
 	struct lnet_net *net;
+	struct lnet_nid nid;
 	int cpt;
 
-	cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
+	lnet_nid4_to_nid(nid4, &nid);
+	cpt = lnet_nid_cpt_hash(&nid, LNET_CPT_NUMBER);
 
 	lpni = kzalloc_cpt(sizeof(*lpni), GFP_KERNEL, cpt);
 	if (!lpni)
@@ -136,11 +138,11 @@
 	else
 		lpni->lpni_ns_status = LNET_NI_STATUS_UP;
 	lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL;
-	lnet_nid4_to_nid(nid, &lpni->lpni_nid);
+	lpni->lpni_nid = nid;
 	lpni->lpni_cpt = cpt;
 	atomic_set(&lpni->lpni_healthv, LNET_MAX_HEALTH_VALUE);
 
-	net = lnet_get_net_locked(LNET_NIDNET(nid));
+	net = lnet_get_net_locked(LNET_NID_NET(&nid));
 	lpni->lpni_net = net;
 	if (net) {
 		lpni->lpni_txcredits = net->net_tunables.lct_peer_tx_credits;
@@ -202,10 +204,12 @@
 }
 
 static struct lnet_peer *
-lnet_peer_alloc(lnet_nid_t nid)
+lnet_peer_alloc(lnet_nid_t nid4)
 {
 	struct lnet_peer *lp;
+	struct lnet_nid nid;
 
+	lnet_nid4_to_nid(nid4, &nid);
 	lp = kzalloc_cpt(sizeof(*lp), GFP_KERNEL, CFS_CPT_ANY);
 	if (!lp)
 		return NULL;
@@ -239,11 +243,11 @@
 	 * to ever use a different interface when sending messages to
 	 * myself.
 	 */
-	if (nid == LNET_NID_LO_0)
+	if (nid_is_lo0(&nid))
 		lp->lp_state = LNET_PEER_NO_DISCOVERY;
-	lp->lp_cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
+	lp->lp_cpt = lnet_nid_cpt_hash(&nid, LNET_CPT_NUMBER);
 
-	CDEBUG(D_NET, "%p nid %s\n", lp, libcfs_nid2str(lp->lp_primary_nid));
+	CDEBUG(D_NET, "%p nid %s\n", lp, libcfs_nidstr(&lp->lp_primary_nid));
 
 	return lp;
 }
@@ -251,7 +255,7 @@
 void
 lnet_destroy_peer_locked(struct lnet_peer *lp)
 {
-	CDEBUG(D_NET, "%p nid %s\n", lp, libcfs_nid2str(lp->lp_primary_nid));
+	CDEBUG(D_NET, "%p nid %s\n", lp, libcfs_nidstr(&lp->lp_primary_nid));
 
 	LASSERT(atomic_read(&lp->lp_refcount) == 0);
 	LASSERT(lp->lp_rtr_refcount == 0);
@@ -333,7 +337,7 @@
 		wake_up(&the_lnet.ln_dc_waitq);
 	}
 	CDEBUG(D_NET, "peer %s NID %s\n",
-	       libcfs_nid2str(lp->lp_primary_nid),
+	       libcfs_nidstr(&lp->lp_primary_nid),
 	       libcfs_nidstr(&lpni->lpni_nid));
 }
 
@@ -448,7 +452,7 @@ void lnet_peer_uninit(void)
 	struct lnet_peer_ni *lpni = NULL, *lpni2;
 	int rc = 0, rc2 = 0;
 
-	CDEBUG(D_NET, "peer %s\n", libcfs_nid2str(peer->lp_primary_nid));
+	CDEBUG(D_NET, "peer %s\n", libcfs_nidstr(&peer->lp_primary_nid));
 
 	spin_lock(&peer->lp_lock);
 	peer->lp_state |= LNET_PEER_MARK_DELETED;
@@ -517,13 +521,15 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
  *  -EBUSY:  The lnet_peer_ni is the primary, and not the only peer_ni.
  */
 static int
-lnet_peer_del_nid(struct lnet_peer *lp, lnet_nid_t nid, unsigned int flags)
+lnet_peer_del_nid(struct lnet_peer *lp, lnet_nid_t nid4, unsigned int flags)
 {
 	struct lnet_peer_ni *lpni;
-	lnet_nid_t primary_nid = lp->lp_primary_nid;
+	struct lnet_nid primary_nid = lp->lp_primary_nid;
+	struct lnet_nid nid;
 	int rc = 0;
 	bool force = (flags & LNET_PEER_RTR_NI_FORCE_DEL) ? true : false;
 
+	lnet_nid4_to_nid(nid4, &nid);
 	if (!(flags & LNET_PEER_CONFIGURED)) {
 		if (lp->lp_state & LNET_PEER_CONFIGURED) {
 			rc = -EPERM;
@@ -535,12 +541,12 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 	 * deleting it
 	 */
 	if (lp->lp_state & LNET_PEER_LOCK_PRIMARY &&
-	    primary_nid == nid) {
+	    nid_same(&primary_nid, &nid)) {
 		rc = -EPERM;
 		goto out;
 	}
 
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_peer_ni_find_locked(&nid);
 	if (!lpni) {
 		rc = -ENOENT;
 		goto out;
@@ -555,19 +561,19 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 	 * This function only allows deletion of the primary NID if it
 	 * is the only NID.
 	 */
-	if (nid == lp->lp_primary_nid && lp->lp_nnis != 1 && !force) {
+	if (nid_same(&nid, &lp->lp_primary_nid) && lp->lp_nnis != 1 && !force) {
 		rc = -EBUSY;
 		goto out;
 	}
 
 	lnet_net_lock(LNET_LOCK_EX);
 
-	if (nid == lp->lp_primary_nid && lp->lp_nnis != 1 && force) {
+	if (nid_same(&nid, &lp->lp_primary_nid) && lp->lp_nnis != 1 && force) {
 		struct lnet_peer_ni *lpni2;
 		/* assign the next peer_ni to be the primary */
 		lpni2 = lnet_get_next_peer_ni_locked(lp, NULL, lpni);
 		LASSERT(lpni2);
-		lp->lp_primary_nid = lnet_nid_to_nid4(&lpni2->lpni_nid);
+		lp->lp_primary_nid = lpni2->lpni_nid;
 	}
 	rc = lnet_peer_ni_del_locked(lpni, force);
 
@@ -575,7 +581,8 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 
 out:
 	CDEBUG(D_NET, "peer %s NID %s flags %#x: %d\n",
-	       libcfs_nid2str(primary_nid), libcfs_nid2str(nid), flags, rc);
+	       libcfs_nidstr(&primary_nid), libcfs_nidstr(&nid),
+	       flags, rc);
 
 	return rc;
 }
@@ -596,8 +603,8 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 				continue;
 
 			peer = lpni->lpni_peer_net->lpn_peer;
-			if (peer->lp_primary_nid !=
-			    lnet_nid_to_nid4(&lpni->lpni_nid)) {
+			if (!nid_same(&peer->lp_primary_nid,
+				      &lpni->lpni_nid)) {
 				lnet_peer_ni_del_locked(lpni, false);
 				continue;
 			}
@@ -643,8 +650,8 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp)
 			if (!lnet_isrouter(lp))
 				continue;
 
-			gw_nid = lp->lpni_peer_net->lpn_peer->lp_primary_nid;
-
+			/* FIXME handle large-addr nid */
+			gw_nid = lnet_nid_to_nid4(&lp->lpni_peer_net->lpn_peer->lp_primary_nid);
 			lnet_net_unlock(LNET_LOCK_EX);
 			lnet_del_route(LNET_NET_ANY, gw_nid);
 			lnet_net_lock(LNET_LOCK_EX);
@@ -712,7 +719,7 @@ struct lnet_peer_ni *
 
 	lnet_nid4_to_nid(nid4, &nid);
 
-	cpt = lnet_nid_cpt_hash(nid4, LNET_CPT_NUMBER);
+	cpt = lnet_nid_cpt_hash(&nid, LNET_CPT_NUMBER);
 
 	ptable = the_lnet.ln_peer_tables[cpt];
 	lpni = lnet_get_peer_ni_locked(ptable, &nid);
@@ -721,6 +728,21 @@ struct lnet_peer_ni *
 }
 
 struct lnet_peer_ni *
+lnet_peer_ni_find_locked(struct lnet_nid *nid)
+{
+	struct lnet_peer_ni *lpni;
+	struct lnet_peer_table *ptable;
+	int cpt;
+
+	cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
+
+	ptable = the_lnet.ln_peer_tables[cpt];
+	lpni = lnet_get_peer_ni_locked(ptable, nid);
+
+	return lpni;
+}
+
+struct lnet_peer_ni *
 lnet_peer_get_ni_locked(struct lnet_peer *lp, lnet_nid_t nid)
 {
 	struct lnet_peer_net *lpn;
@@ -896,9 +918,11 @@ struct lnet_peer_ni *
 	for (cpt = 0; cpt < lncpt; cpt++) {
 		ptable = the_lnet.ln_peer_tables[cpt];
 		list_for_each_entry(lp, &ptable->pt_peer_list, lp_peer_list) {
+			if (!nid_is_nid4(&lp->lp_primary_nid))
+				continue;
 			if (i >= count)
 				goto done;
-			id.nid = lp->lp_primary_nid;
+			id.nid = lnet_nid_to_nid4(&lp->lp_primary_nid);
 			if (copy_to_user(&ids[i], &id, sizeof(id)))
 				goto done;
 			i++;
@@ -1205,7 +1229,7 @@ struct lnet_peer_ni *
 		spin_unlock(&lpni->lpni_lock);
 	}
 	CDEBUG(D_NET, "peer %s nid %s: %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), libcfs_nid2str(nid), rc);
+	       libcfs_nidstr(&lp->lp_primary_nid), libcfs_nid2str(nid), rc);
 	return rc;
 }
 
@@ -1263,7 +1287,7 @@ struct lnet_peer_ni *
 	kfree(ne);
 out:
 	CDEBUG(D_NET, "peer %s nid %s: %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), libcfs_nid2str(nid), rc);
+	       libcfs_nidstr(&lp->lp_primary_nid), libcfs_nid2str(nid), rc);
 	return rc;
 }
 
@@ -1293,12 +1317,13 @@ struct lnet_peer_ni *
 lnet_nid_t
 lnet_peer_primary_nid_locked(lnet_nid_t nid)
 {
+	/* FIXME handle large-addr nid */
 	struct lnet_peer_ni *lpni;
 	lnet_nid_t primary_nid = nid;
 
 	lpni = lnet_find_peer_ni_locked(nid);
 	if (lpni) {
-		primary_nid = lpni->lpni_peer_net->lpn_peer->lp_primary_nid;
+		primary_nid = lnet_nid_to_nid4(&lpni->lpni_peer_net->lpn_peer->lp_primary_nid);
 		lnet_peer_ni_decref_locked(lpni);
 	}
 
@@ -1379,6 +1404,7 @@ struct lnet_peer_ni *
 }
 EXPORT_SYMBOL(LNetAddPeer);
 
+/* FIXME support large-addr nid */
 lnet_nid_t
 LNetPrimaryNID(lnet_nid_t nid)
 {
@@ -1428,7 +1454,7 @@ struct lnet_peer_ni *
 		}
 		lp = lpni->lpni_peer_net->lpn_peer;
 	}
-	primary_nid = lp->lp_primary_nid;
+	primary_nid = lnet_nid_to_nid4(&lp->lp_primary_nid);
 out_decref:
 	lnet_peer_ni_decref_locked(lpni);
 out_unlock:
@@ -1495,7 +1521,7 @@ struct lnet_peer_net *
 
 	/* Add peer_ni to peer_net */
 	lpni->lpni_peer_net = lpn;
-	if (lp->lp_primary_nid == lnet_nid_to_nid4(&lpni->lpni_nid))
+	if (nid_same(&lp->lp_primary_nid, &lpni->lpni_nid))
 		list_add(&lpni->lpni_peer_nis, &lpn->lpn_peer_nis);
 	else
 		list_add_tail(&lpni->lpni_peer_nis, &lpn->lpn_peer_nis);
@@ -1506,7 +1532,7 @@ struct lnet_peer_net *
 	if (!lpn->lpn_peer) {
 		new_lpn = true;
 		lpn->lpn_peer = lp;
-		if (lp->lp_primary_nid == lnet_nid_to_nid4(&lpni->lpni_nid))
+		if (nid_same(&lp->lp_primary_nid, &lpni->lpni_nid))
 			list_add(&lpn->lpn_peer_nets, &lp->lp_peer_nets);
 		else
 			list_add_tail(&lpn->lpn_peer_nets, &lp->lp_peer_nets);
@@ -1552,7 +1578,7 @@ struct lnet_peer_net *
 		       libcfs_nidstr(&lpni->lpni_nid));
 
 	CDEBUG(D_NET, "peer %s NID %s flags %#x\n",
-	       libcfs_nid2str(lp->lp_primary_nid),
+	       libcfs_nidstr(&lp->lp_primary_nid),
 	       libcfs_nidstr(&lpni->lpni_nid), flags);
 	lnet_peer_ni_decref_locked(lpni);
 	lnet_net_unlock(LNET_LOCK_EX);
@@ -1591,13 +1617,13 @@ struct lnet_peer_net *
 		 * that an existing peer is being modified.
 		 */
 		if (lp->lp_state & LNET_PEER_CONFIGURED) {
-			if (lp->lp_primary_nid != nid)
+			if (lnet_nid_to_nid4(&lp->lp_primary_nid) != nid)
 				rc = -EEXIST;
 			else if ((lp->lp_state ^ flags) & LNET_PEER_MULTI_RAIL)
 				rc = -EPERM;
 			goto out;
 		} else if (!(flags & LNET_PEER_CONFIGURED)) {
-			if (lp->lp_primary_nid == nid) {
+			if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid) {
 				rc = -EEXIST;
 				goto out;
 			}
@@ -1757,7 +1783,7 @@ struct lnet_peer_net *
 	lnet_peer_ni_decref_locked(lpni);
 out:
 	CDEBUG(D_NET, "peer %s NID %s flags %#x: %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), libcfs_nid2str(nid),
+	       libcfs_nidstr(&lp->lp_primary_nid), libcfs_nid2str(nid),
 	       flags, rc);
 	return rc;
 }
@@ -1771,14 +1797,14 @@ struct lnet_peer_net *
 lnet_peer_set_primary_nid(struct lnet_peer *lp, lnet_nid_t nid,
 			  unsigned int flags)
 {
-	lnet_nid_t old = lp->lp_primary_nid;
+	struct lnet_nid old = lp->lp_primary_nid;
 	int rc = 0;
 
-	if (lp->lp_primary_nid == nid)
+	if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid)
 		goto out;
 
 	if (!(lp->lp_state & LNET_PEER_LOCK_PRIMARY))
-		lp->lp_primary_nid = nid;
+		lnet_nid4_to_nid(nid, &lp->lp_primary_nid);
 
 	rc = lnet_peer_add_nid(lp, nid, flags);
 	if (rc) {
@@ -1795,7 +1821,7 @@ struct lnet_peer_net *
 		return 0;
 
 	CDEBUG(D_NET, "peer %s NID %s: %d\n",
-	       libcfs_nid2str(old), libcfs_nid2str(nid), rc);
+	       libcfs_nidstr(&old), libcfs_nid2str(nid), rc);
 
 	return rc;
 }
@@ -1906,10 +1932,10 @@ struct lnet_peer_net *
 	}
 
 	/* Primary NID must match */
-	if (lp->lp_primary_nid != prim_nid) {
+	if (lnet_nid_to_nid4(&lp->lp_primary_nid) != prim_nid) {
 		CDEBUG(D_NET, "prim_nid %s is not primary for peer %s\n",
 		       libcfs_nid2str(prim_nid),
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 		return -ENODEV;
 	}
 
@@ -1950,10 +1976,10 @@ struct lnet_peer_net *
 	lnet_peer_ni_decref_locked(lpni);
 	lp = lpni->lpni_peer_net->lpn_peer;
 
-	if (prim_nid != lp->lp_primary_nid) {
+	if (prim_nid != lnet_nid_to_nid4(&lp->lp_primary_nid)) {
 		CDEBUG(D_NET, "prim_nid %s is not primary for peer %s\n",
 		       libcfs_nid2str(prim_nid),
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 		return -ENODEV;
 	}
 
@@ -1966,7 +1992,7 @@ struct lnet_peer_net *
 	}
 	lnet_net_unlock(LNET_LOCK_EX);
 
-	if (nid == LNET_NID_ANY || nid == lp->lp_primary_nid)
+	if (nid == LNET_NID_ANY || nid == lnet_nid_to_nid4(&lp->lp_primary_nid))
 		return lnet_peer_del(lp);
 
 	flags = LNET_PEER_CONFIGURED;
@@ -2222,7 +2248,7 @@ static int lnet_peer_queue_for_discovery(struct lnet_peer *lp)
 	}
 
 	CDEBUG(D_NET, "Queue peer %s: %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), rc);
+	       libcfs_nidstr(&lp->lp_primary_nid), rc);
 
 	return rc;
 }
@@ -2238,7 +2264,7 @@ static void lnet_peer_discovery_complete(struct lnet_peer *lp)
 	LIST_HEAD(pending_msgs);
 
 	CDEBUG(D_NET, "Discovery complete. Dequeue peer %s\n",
-	       libcfs_nid2str(lp->lp_primary_nid));
+	       libcfs_nidstr(&lp->lp_primary_nid));
 
 	list_del_init(&lp->lp_dc_list);
 	spin_lock(&lp->lp_lock);
@@ -2310,7 +2336,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 		lp->lp_state |= LNET_PEER_FORCE_PING;
 		CDEBUG(D_NET, "Push Put error %d from %s (source %s)\n",
 		       ev->status,
-		       libcfs_nid2str(lp->lp_primary_nid),
+		       libcfs_nidstr(&lp->lp_primary_nid),
 		       libcfs_nid2str(ev->source.nid));
 		goto out;
 	}
@@ -2323,7 +2349,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 		lp->lp_state &= ~LNET_PEER_NIDS_UPTODATE;
 		lp->lp_state |= LNET_PEER_FORCE_PING;
 		CDEBUG(D_NET, "Corrupted Push from %s\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 		goto out;
 	}
 
@@ -2340,43 +2366,32 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	 */
 	if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_MULTI_RAIL)) {
 		CERROR("Push from non-Multi-Rail peer %s dropped\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 		goto out;
 	}
 
 	/* The peer may have discovery disabled at its end. Set
 	 * NO_DISCOVERY as appropriate.
 	 */
-	if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_DISCOVERY) ||
-	    lnet_peer_discovery_disabled) {
+	if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_DISCOVERY)) {
 		CDEBUG(D_NET, "Peer %s has discovery disabled\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 
-		/* Detect whether this peer has toggled discovery from on to
-		 * off and whether we can delete and re-create the peer. Peers
-		 * that were manually configured cannot be deleted by discovery.
-		 * We need to delete this peer and re-create it if the peer was
-		 * not configured manually, is currently considered DD capable,
-		 * and either:
-		 * 1. We've already discovered the peer (the peer has toggled
-		 *    the discovery feature from on to off), or
-		 * 2. The peer is considered MR, but it was not user configured
-		 *    (this was a "temporary" peer created via the kernel APIs
-		 *     that we're discovering for the first time)
+		/* Mark the peer for deletion if we already know about it
+		 * and it's going from discovery set to no discovery set
 		 */
-		if (!(lp->lp_state & (LNET_PEER_CONFIGURED |
-				      LNET_PEER_NO_DISCOVERY)) &&
-		    (lp->lp_state & (LNET_PEER_DISCOVERED |
-				     LNET_PEER_MULTI_RAIL))) {
+		if (!(lp->lp_state & (LNET_PEER_NO_DISCOVERY |
+				      LNET_PEER_DISCOVERING)) &&
+		     lp->lp_state & LNET_PEER_DISCOVERED) {
 			CDEBUG(D_NET, "Marking %s:0x%x for deletion\n",
-			       libcfs_nid2str(lp->lp_primary_nid),
+			       libcfs_nidstr(&lp->lp_primary_nid),
 			       lp->lp_state);
 			lp->lp_state |= LNET_PEER_MARK_DELETION;
 		}
 		lp->lp_state |= LNET_PEER_NO_DISCOVERY;
-	} else {
+	} else if (lp->lp_state & LNET_PEER_NO_DISCOVERY) {
 		CDEBUG(D_NET, "Peer %s has discovery enabled\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 		lp->lp_state &= ~LNET_PEER_NO_DISCOVERY;
 	}
 
@@ -2387,19 +2402,19 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	 */
 	if (lp->lp_state & LNET_PEER_MULTI_RAIL) {
 		CDEBUG(D_NET, "peer %s(%p) is MR\n",
-		       libcfs_nid2str(lp->lp_primary_nid), lp);
+		       libcfs_nidstr(&lp->lp_primary_nid), lp);
 	} else if (lp->lp_state & LNET_PEER_CONFIGURED) {
 		CWARN("Push says %s is Multi-Rail, DLC says not\n",
-		      libcfs_nid2str(lp->lp_primary_nid));
+		      libcfs_nidstr(&lp->lp_primary_nid));
 	} else if (lnet_peer_discovery_disabled) {
 		CDEBUG(D_NET, "peer %s(%p) not MR: DD disabled locally\n",
-		       libcfs_nid2str(lp->lp_primary_nid), lp);
+		       libcfs_nidstr(&lp->lp_primary_nid), lp);
 	} else if (lp->lp_state & LNET_PEER_NO_DISCOVERY) {
 		CDEBUG(D_NET, "peer %s(%p) not MR: DD disabled remotely\n",
-		       libcfs_nid2str(lp->lp_primary_nid), lp);
+		       libcfs_nidstr(&lp->lp_primary_nid), lp);
 	} else {
 		CDEBUG(D_NET, "peer %s(%p) is MR capable\n",
-		       libcfs_nid2str(lp->lp_primary_nid), lp);
+		       libcfs_nidstr(&lp->lp_primary_nid), lp);
 		lp->lp_state |= LNET_PEER_MULTI_RAIL;
 		lnet_peer_clr_non_mr_pref_nids(lp);
 	}
@@ -2414,7 +2429,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 		lp->lp_state &= ~LNET_PEER_NIDS_UPTODATE;
 		lp->lp_state |= LNET_PEER_FORCE_PING;
 		CDEBUG(D_NET, "Truncated Push from %s (%d nids)\n",
-		       libcfs_nid2str(lp->lp_primary_nid),
+		       libcfs_nidstr(&lp->lp_primary_nid),
 		       pbuf->pb_info.pi_nnis);
 		goto out;
 	}
@@ -2436,7 +2451,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 			memcpy(&lp->lp_data->pb_info, &pbuf->pb_info,
 			       LNET_PING_INFO_SIZE(pbuf->pb_info.pi_nnis));
 			CDEBUG(D_NET, "Ping/Push race from %s: %u vs %u\n",
-			       libcfs_nid2str(lp->lp_primary_nid),
+			       libcfs_nidstr(&lp->lp_primary_nid),
 			       LNET_PING_BUFFER_SEQNO(pbuf),
 			       LNET_PING_BUFFER_SEQNO(lp->lp_data));
 		}
@@ -2452,7 +2467,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	if (!lp->lp_data) {
 		lp->lp_state |= LNET_PEER_FORCE_PING;
 		CDEBUG(D_NET, "Cannot allocate Push buffer for %s %u\n",
-		       libcfs_nid2str(lp->lp_primary_nid),
+		       libcfs_nidstr(&lp->lp_primary_nid),
 		       LNET_PING_BUFFER_SEQNO(pbuf));
 		goto out;
 	}
@@ -2462,7 +2477,7 @@ void lnet_peer_push_event(struct lnet_event *ev)
 	       LNET_PING_INFO_SIZE(pbuf->pb_info.pi_nnis));
 	lp->lp_state |= LNET_PEER_DATA_PRESENT;
 	CDEBUG(D_NET, "Received Push %s %u\n",
-	       libcfs_nid2str(lp->lp_primary_nid),
+	       libcfs_nidstr(&lp->lp_primary_nid),
 	       LNET_PING_BUFFER_SEQNO(pbuf));
 
 out:
@@ -2584,7 +2599,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 		goto again;
 
 	CDEBUG(D_NET, "peer %s NID %s: %d. %s\n",
-	       (lp ? libcfs_nid2str(lp->lp_primary_nid) : "(none)"),
+	       (lp ? libcfs_nidstr(&lp->lp_primary_nid) : "(none)"),
 	       libcfs_nidstr(&lpni->lpni_nid), rc,
 	       (!block) ? "pending discovery" : "discovery complete");
 
@@ -2608,7 +2623,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 	spin_unlock(&lp->lp_lock);
 
 	CDEBUG(D_NET, "peer %s ev->status %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), ev->status);
+	       libcfs_nidstr(&lp->lp_primary_nid), ev->status);
 }
 
 /* Handle a Reply message. This is the reply to a Ping message. */
@@ -2632,7 +2647,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 		lp->lp_ping_error = ev->status;
 		CDEBUG(D_NET, "Ping Reply error %d from %s (source %s)\n",
 		       ev->status,
-		       libcfs_nid2str(lp->lp_primary_nid),
+		       libcfs_nidstr(&lp->lp_primary_nid),
 		       libcfs_nid2str(ev->source.nid));
 		goto out;
 	}
@@ -2650,7 +2665,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 		lp->lp_state |= LNET_PEER_PING_FAILED;
 		lp->lp_ping_error = 0;
 		CDEBUG(D_NET, "Corrupted Ping Reply from %s: %d\n",
-		       libcfs_nid2str(lp->lp_primary_nid), rc);
+		       libcfs_nidstr(&lp->lp_primary_nid), rc);
 		goto out;
 	}
 
@@ -2658,15 +2673,37 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 	 * The peer may have discovery disabled at its end. Set
 	 * NO_DISCOVERY as appropriate.
 	 */
-	if ((pbuf->pb_info.pi_features & LNET_PING_FEAT_DISCOVERY) &&
-	    !lnet_peer_discovery_disabled) {
+	if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_DISCOVERY) &&
+	    lnet_peer_discovery_disabled) {
 		CDEBUG(D_NET, "Peer %s has discovery enabled\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
-		lp->lp_state &= ~LNET_PEER_NO_DISCOVERY;
-	} else {
-		CDEBUG(D_NET, "Peer %s has discovery disabled\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
+
+		/* Detect whether this peer has toggled discovery from on to
+		 * off and whether we can delete and re-create the peer. Peers
+		 * that were manually configured cannot be deleted by discovery.
+		 * We need to delete this peer and re-create it if the peer was
+		 * not configured manually, is currently considered DD capable,
+		 * and either:
+		 * 1. We've already discovered the peer (the peer has toggled
+		 *    the discovery feature from on to off), or
+		 * 2. The peer is considered MR, but it was not user configured
+		 *    (this was a "temporary" peer created via the kernel APIs
+		 *     that we're discovering for the first time)
+		 */
+		if (!(lp->lp_state & (LNET_PEER_CONFIGURED |
+				      LNET_PEER_NO_DISCOVERY)) &&
+		    (lp->lp_state & (LNET_PEER_DISCOVERED |
+				     LNET_PEER_MULTI_RAIL))) {
+			CDEBUG(D_NET, "Marking %s:0x%x for deletion\n",
+			       libcfs_nidstr(&lp->lp_primary_nid),
+			       lp->lp_state);
+			lp->lp_state |= LNET_PEER_MARK_DELETION;
+		}
 		lp->lp_state |= LNET_PEER_NO_DISCOVERY;
+	} else {
+		CDEBUG(D_NET, "Peer %s has discovery enabled\n",
+		       libcfs_nidstr(&lp->lp_primary_nid));
+		lp->lp_state &= ~LNET_PEER_NO_DISCOVERY;
 	}
 
 	/*
@@ -2677,31 +2714,31 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 	if (pbuf->pb_info.pi_features & LNET_PING_FEAT_MULTI_RAIL) {
 		if (lp->lp_state & LNET_PEER_MULTI_RAIL) {
 			CDEBUG(D_NET, "peer %s(%p) is MR\n",
-			       libcfs_nid2str(lp->lp_primary_nid), lp);
+			       libcfs_nidstr(&lp->lp_primary_nid), lp);
 		} else if (lp->lp_state & LNET_PEER_CONFIGURED) {
 			CWARN("Reply says %s is Multi-Rail, DLC says not\n",
-			      libcfs_nid2str(lp->lp_primary_nid));
+			      libcfs_nidstr(&lp->lp_primary_nid));
 		} else if (lnet_peer_discovery_disabled) {
 			CDEBUG(D_NET,
 			       "peer %s(%p) not MR: DD disabled locally\n",
-			       libcfs_nid2str(lp->lp_primary_nid), lp);
+			       libcfs_nidstr(&lp->lp_primary_nid), lp);
 		} else if (lp->lp_state & LNET_PEER_NO_DISCOVERY) {
 			CDEBUG(D_NET,
 			       "peer %s(%p) not MR: DD disabled remotely\n",
-			       libcfs_nid2str(lp->lp_primary_nid), lp);
+			       libcfs_nidstr(&lp->lp_primary_nid), lp);
 		} else {
 			CDEBUG(D_NET, "peer %s(%p) is MR capable\n",
-			       libcfs_nid2str(lp->lp_primary_nid), lp);
+			       libcfs_nidstr(&lp->lp_primary_nid), lp);
 			lp->lp_state |= LNET_PEER_MULTI_RAIL;
 			lnet_peer_clr_non_mr_pref_nids(lp);
 		}
 	} else if (lp->lp_state & LNET_PEER_MULTI_RAIL) {
 		if (lp->lp_state & LNET_PEER_CONFIGURED) {
 			CWARN("DLC says %s is Multi-Rail, Reply says not\n",
-			      libcfs_nid2str(lp->lp_primary_nid));
+			      libcfs_nidstr(&lp->lp_primary_nid));
 		} else {
 			CERROR("Multi-Rail state vanished from %s\n",
-			       libcfs_nid2str(lp->lp_primary_nid));
+			       libcfs_nidstr(&lp->lp_primary_nid));
 			lp->lp_state &= ~LNET_PEER_MULTI_RAIL;
 		}
 	}
@@ -2723,7 +2760,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 		lp->lp_state |= LNET_PEER_PING_FAILED;
 		lp->lp_ping_error = 0;
 		CDEBUG(D_NET, "Truncated Reply from %s (%d nids)\n",
-		       libcfs_nid2str(lp->lp_primary_nid),
+		       libcfs_nidstr(&lp->lp_primary_nid),
 		       pbuf->pb_info.pi_nnis);
 		goto out;
 	}
@@ -2734,11 +2771,12 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 	 */
 	if (pbuf->pb_info.pi_features & LNET_PING_FEAT_MULTI_RAIL &&
 	    pbuf->pb_info.pi_nnis > 1 &&
-	    lp->lp_primary_nid == pbuf->pb_info.pi_ni[1].ns_nid) {
+	    lnet_nid_to_nid4(&lp->lp_primary_nid) ==
+	    pbuf->pb_info.pi_ni[1].ns_nid) {
 		if (LNET_PING_BUFFER_SEQNO(pbuf) < lp->lp_peer_seqno)
 			CDEBUG(D_NET,
 			       "peer %s: seq# got %u have %u. peer rebooted?\n",
-			       libcfs_nid2str(lp->lp_primary_nid),
+			       libcfs_nidstr(&lp->lp_primary_nid),
 			       LNET_PING_BUFFER_SEQNO(pbuf),
 			       lp->lp_peer_seqno);
 
@@ -2747,7 +2785,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 
 	/* We're happy with the state of the data in the buffer. */
 	CDEBUG(D_NET, "peer %s data present %u. state = 0x%x\n",
-	       libcfs_nid2str(lp->lp_primary_nid), lp->lp_peer_seqno,
+	       libcfs_nidstr(&lp->lp_primary_nid), lp->lp_peer_seqno,
 	       lp->lp_state);
 	if (lp->lp_state & LNET_PEER_DATA_PRESENT)
 		lnet_ping_buffer_decref(lp->lp_data);
@@ -2817,7 +2855,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 		lp->lp_state |= LNET_PEER_PING_FAILED;
 		lp->lp_ping_error = -ETIMEDOUT;
 		CDEBUG(D_NET, "Ping Unlink for message to peer %s\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 	}
 	/* We've passed through LNetPut() */
 	if (lp->lp_state & LNET_PEER_PUSH_SENT) {
@@ -2825,7 +2863,7 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 		lp->lp_state |= LNET_PEER_PUSH_FAILED;
 		lp->lp_push_error = -ETIMEDOUT;
 		CDEBUG(D_NET, "Push Unlink for message to peer %s\n",
-		       libcfs_nid2str(lp->lp_primary_nid));
+		       libcfs_nidstr(&lp->lp_primary_nid));
 	}
 	spin_unlock(&lp->lp_lock);
 }
@@ -3003,7 +3041,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 		if (rc) {
 			CERROR("Error adding NID %s to peer %s: %d\n",
 			       libcfs_nid2str(addnis[i].ns_nid),
-			       libcfs_nid2str(lp->lp_primary_nid), rc);
+			       libcfs_nidstr(&lp->lp_primary_nid), rc);
 			if (rc == -ENOMEM)
 				goto out;
 		}
@@ -3026,7 +3064,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 		if (rc) {
 			CERROR("Error deleting NID %s from peer %s: %d\n",
 			       libcfs_nid2str(delnis[i]),
-			       libcfs_nid2str(lp->lp_primary_nid), rc);
+			       libcfs_nidstr(&lp->lp_primary_nid), rc);
 			if (rc == -ENOMEM)
 				goto out;
 		}
@@ -3064,8 +3102,8 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 	kfree(addnis);
 	kfree(delnis);
 	lnet_ping_buffer_decref(pbuf);
-	CDEBUG(D_NET, "peer %s (%p): %d\n", libcfs_nid2str(lp->lp_primary_nid),
-	       lp, rc);
+	CDEBUG(D_NET, "peer %s (%p): %d\n",
+	       libcfs_nidstr(&lp->lp_primary_nid), lp, rc);
 
 	if (rc) {
 		spin_lock(&lp->lp_lock);
@@ -3134,7 +3172,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 	if (pbuf)
 		return lnet_peer_merge_data(lp, pbuf);
 
-	CDEBUG(D_NET, "peer %s\n", libcfs_nid2str(lp->lp_primary_nid));
+	CDEBUG(D_NET, "peer %s\n", libcfs_nidstr(&lp->lp_primary_nid));
 	return 0;
 }
 
@@ -3170,7 +3208,7 @@ static int lnet_peer_deletion(struct lnet_peer *lp)
 	lp->lp_state &= ~(LNET_PEER_DISCOVERING | LNET_PEER_FORCE_PING |
 			  LNET_PEER_FORCE_PUSH);
 	CDEBUG(D_NET, "peer %s(%p) state %#x\n",
-	       libcfs_nid2str(lp->lp_primary_nid), lp, lp->lp_state);
+	       libcfs_nidstr(&lp->lp_primary_nid), lp, lp->lp_state);
 
 	/* no-op if lnet_peer_del() has already been called on this peer */
 	if (lp->lp_state & LNET_PEER_MARK_DELETED)
@@ -3280,7 +3318,7 @@ static int lnet_peer_data_present(struct lnet_peer *lp)
 	if (pbuf->pb_info.pi_nnis <= 1)
 		goto out;
 	nid = pbuf->pb_info.pi_ni[1].ns_nid;
-	if (lp->lp_primary_nid == LNET_NID_LO_0) {
+	if (nid_is_lo0(&lp->lp_primary_nid)) {
 		rc = lnet_peer_set_primary_nid(lp, nid, flags);
 		if (!rc)
 			rc = lnet_peer_merge_data(lp, pbuf);
@@ -3291,8 +3329,8 @@ static int lnet_peer_data_present(struct lnet_peer *lp)
 	 * to update the status of the nids that we currently have
 	 * recorded in that peer.
 	 */
-	} else if (lp->lp_primary_nid == nid ||
-		   (lnet_is_nid_in_ping_info(lp->lp_primary_nid,
+	} else if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid ||
+		   (lnet_is_nid_in_ping_info(lnet_nid_to_nid4(&lp->lp_primary_nid),
 					     &pbuf->pb_info) &&
 		    lnet_is_discovery_disabled(lp))) {
 		rc = lnet_peer_merge_data(lp, pbuf);
@@ -3302,7 +3340,7 @@ static int lnet_peer_data_present(struct lnet_peer *lp)
 			rc = lnet_peer_set_primary_nid(lp, nid, flags);
 			if (rc) {
 				CERROR("Primary NID error %s versus %s: %d\n",
-				       libcfs_nid2str(lp->lp_primary_nid),
+				       libcfs_nidstr(&lp->lp_primary_nid),
 				       libcfs_nid2str(nid), rc);
 			} else {
 				rc = lnet_peer_merge_data(lp, pbuf);
@@ -3343,7 +3381,7 @@ static int lnet_peer_data_present(struct lnet_peer *lp)
 	}
 out:
 	CDEBUG(D_NET, "peer %s(%p): %d. state = 0x%x\n",
-	       libcfs_nid2str(lp->lp_primary_nid), lp, rc,
+	       libcfs_nidstr(&lp->lp_primary_nid), lp, rc,
 	       lp->lp_state);
 	mutex_unlock(&the_lnet.ln_api_mutex);
 
@@ -3377,7 +3415,7 @@ static int lnet_peer_ping_failed(struct lnet_peer *lp)
 		LNetMDUnlink(mdh);
 
 	CDEBUG(D_NET, "peer %s:%d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), rc);
+	       libcfs_nidstr(&lp->lp_primary_nid), rc);
 
 	spin_lock(&lp->lp_lock);
 	return rc ? rc : LNET_REDISCOVER_PEER;
@@ -3402,7 +3440,8 @@ static int lnet_peer_send_ping(struct lnet_peer *lp)
 
 	nnis = max_t(int, lp->lp_data_nnis, LNET_INTERFACES_MIN);
 
-	rc = lnet_send_ping(lp->lp_primary_nid, &lp->lp_ping_mdh, nnis, lp,
+	rc = lnet_send_ping(lnet_nid_to_nid4(&lp->lp_primary_nid),
+			    &lp->lp_ping_mdh, nnis, lp,
 			    the_lnet.ln_dc_handler, false);
 	/* if LNetMDBind in lnet_send_ping fails we need to decrement the
 	 * refcount on the peer, otherwise LNetMDUnlink will be called
@@ -3418,13 +3457,13 @@ static int lnet_peer_send_ping(struct lnet_peer *lp)
 		goto fail_error;
 	}
 
-	CDEBUG(D_NET, "peer %s\n", libcfs_nid2str(lp->lp_primary_nid));
+	CDEBUG(D_NET, "peer %s\n", libcfs_nidstr(&lp->lp_primary_nid));
 
 	spin_lock(&lp->lp_lock);
 	return 0;
 
 fail_error:
-	CDEBUG(D_NET, "peer %s: %d\n", libcfs_nid2str(lp->lp_primary_nid), rc);
+	CDEBUG(D_NET, "peer %s: %d\n", libcfs_nidstr(&lp->lp_primary_nid), rc);
 	/*
 	 * The errors that get us here are considered hard errors and
 	 * cause Discovery to terminate. So we clear PING_SENT, but do
@@ -3457,7 +3496,7 @@ static int lnet_peer_push_failed(struct lnet_peer *lp)
 	if (!LNetMDHandleIsInvalid(mdh))
 		LNetMDUnlink(mdh);
 
-	CDEBUG(D_NET, "peer %s\n", libcfs_nid2str(lp->lp_primary_nid));
+	CDEBUG(D_NET, "peer %s\n", libcfs_nidstr(&lp->lp_primary_nid));
 	spin_lock(&lp->lp_lock);
 	return rc ? rc : LNET_REDISCOVER_PEER;
 }
@@ -3472,7 +3511,7 @@ static int lnet_peer_discovered(struct lnet_peer *lp)
 
 	lp->lp_dc_error = 0;
 
-	CDEBUG(D_NET, "peer %s\n", libcfs_nid2str(lp->lp_primary_nid));
+	CDEBUG(D_NET, "peer %s\n", libcfs_nidstr(&lp->lp_primary_nid));
 
 	return 0;
 }
@@ -3531,7 +3570,7 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 	if (lp->lp_disc_dst_nid != LNET_NID_ANY)
 		id.nid = lp->lp_disc_dst_nid;
 	else
-		id.nid = lp->lp_primary_nid;
+		id.nid = lnet_nid_to_nid4(&lp->lp_primary_nid);
 	lnet_net_unlock(cpt);
 
 	rc = LNetPut(lp->lp_disc_src_nid, lp->lp_push_mdh,
@@ -3547,7 +3586,7 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 	if (rc)
 		goto fail_unlink;
 
-	CDEBUG(D_NET, "peer %s\n", libcfs_nid2str(lp->lp_primary_nid));
+	CDEBUG(D_NET, "peer %s\n", libcfs_nidstr(&lp->lp_primary_nid));
 
 	spin_lock(&lp->lp_lock);
 	return 0;
@@ -3556,8 +3595,8 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 	LNetMDUnlink(lp->lp_push_mdh);
 	LNetInvalidateMDHandle(&lp->lp_push_mdh);
 fail_error:
-	CDEBUG(D_NET, "peer %s(%p): %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), lp, rc);
+	CDEBUG(D_NET, "peer %s(%p): %d\n", libcfs_nidstr(&lp->lp_primary_nid),
+	       lp, rc);
 	/*
 	 * The errors that get us here are considered hard errors and
 	 * cause Discovery to terminate. So we clear PUSH_SENT, but do
@@ -3577,7 +3616,7 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 static void lnet_peer_discovery_error(struct lnet_peer *lp, int error)
 {
 	CDEBUG(D_NET, "Discovery error %s: %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), error);
+	       libcfs_nidstr(&lp->lp_primary_nid), error);
 
 	spin_lock(&lp->lp_lock);
 	lp->lp_dc_error = error;
@@ -3738,7 +3777,7 @@ static int lnet_peer_discovery(void *arg)
 			 */
 			spin_lock(&lp->lp_lock);
 			CDEBUG(D_NET, "peer %s(%p) state %#x\n",
-			       libcfs_nid2str(lp->lp_primary_nid), lp,
+			       libcfs_nidstr(&lp->lp_primary_nid), lp,
 			       lp->lp_state);
 			if (lp->lp_state & (LNET_PEER_MARK_DELETION |
 					    LNET_PEER_MARK_DELETED))
@@ -3760,7 +3799,7 @@ static int lnet_peer_discovery(void *arg)
 			else
 				rc = lnet_peer_discovered(lp);
 			CDEBUG(D_NET, "peer %s(%p) state %#x rc %d\n",
-			       libcfs_nid2str(lp->lp_primary_nid), lp,
+			       libcfs_nidstr(&lp->lp_primary_nid), lp,
 			       lp->lp_state, rc);
 			spin_unlock(&lp->lp_lock);
 
@@ -4008,9 +4047,9 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk)
 		goto out_lp_decref;
 	}
 
-	cfg->prcfg_prim_nid = lp->lp_primary_nid;
+	cfg->prcfg_prim_nid = lnet_nid_to_nid4(&lp->lp_primary_nid);
 	cfg->prcfg_mr = lnet_peer_is_multi_rail(lp);
-	cfg->prcfg_cfg_nid = lp->lp_primary_nid;
+	cfg->prcfg_cfg_nid = lnet_nid_to_nid4(&lp->lp_primary_nid);
 	cfg->prcfg_count = lp->lp_nnis;
 	cfg->prcfg_size = size;
 	cfg->prcfg_state = lp->lp_state;
diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index f6fcc93..2d5f0b6 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -175,7 +175,7 @@ static void lnet_del_route_from_rnet(lnet_nid_t gw_nid,
 	/* use the gateway's lp_primary_nid to delete the route as the
 	 * lr_nid can be a constituent NID of the peer
 	 */
-	lnet_del_route_from_rnet(route->lr_gateway->lp_primary_nid,
+	lnet_del_route_from_rnet(lnet_nid_to_nid4(&route->lr_gateway->lp_primary_nid),
 				 &rnet->lrn_routes, l);
 
 	if (lp) {
@@ -201,11 +201,11 @@ static void lnet_del_route_from_rnet(lnet_nid_t gw_nid,
 
 	lnet_net_lock(LNET_LOCK_EX);
 	CDEBUG(D_NET, "transferring routes from %s -> %s\n",
-	       libcfs_nid2str(src->lp_primary_nid),
-	       libcfs_nid2str(target->lp_primary_nid));
+	       libcfs_nidstr(&src->lp_primary_nid),
+	       libcfs_nidstr(&target->lp_primary_nid));
 	list_for_each_entry(route, &src->lp_routes, lr_gwlist) {
 		CDEBUG(D_NET, "%s: %s->%s\n",
-		       libcfs_nid2str(src->lp_primary_nid),
+		       libcfs_nidstr(&src->lp_primary_nid),
 		       libcfs_net2str(route->lr_net),
 		       libcfs_nid2str(route->lr_nid));
 	}
@@ -331,7 +331,7 @@ bool lnet_is_route_alive(struct lnet_route *route)
 		spin_unlock(&gw->lp_lock);
 		if (gw->lp_rtr_refcount > 0)
 			CERROR("peer %s is being used as a gateway but routing feature is not turned on\n",
-			       libcfs_nid2str(gw->lp_primary_nid));
+			       libcfs_nidstr(&gw->lp_primary_nid));
 		return false;
 	}
 	spin_unlock(&gw->lp_lock);
@@ -372,7 +372,7 @@ bool lnet_is_route_alive(struct lnet_route *route)
 	    (route->lr_hops == 1 || route->lr_hops == LNET_UNDEFINED_HOPS)) {
 		CWARN("route %s->%s is detected to be multi-hop but hop count is set to %d\n",
 		      libcfs_net2str(route->lr_net),
-		      libcfs_nid2str(route->lr_gateway->lp_primary_nid),
+		      libcfs_nidstr(&route->lr_gateway->lp_primary_nid),
 		      (int)route->lr_hops);
 	}
 }
@@ -420,7 +420,7 @@ bool lnet_is_route_alive(struct lnet_route *route)
 	if (lp_state & LNET_PEER_PING_FAILED ||
 	    pbuf->pb_info.pi_features & LNET_PING_FEAT_RTE_DISABLED) {
 		CDEBUG(D_NET, "Set routes down for gw %s because %s %d\n",
-		       libcfs_nid2str(lp->lp_primary_nid),
+		       libcfs_nidstr(&lp->lp_primary_nid),
 		       lp_state & LNET_PEER_PING_FAILED ? "ping failed" :
 		       "route feature is disabled", lp->lp_ping_error);
 		/* If the ping failed or the peer has routing disabled then
@@ -432,7 +432,7 @@ bool lnet_is_route_alive(struct lnet_route *route)
 	}
 
 	CDEBUG(D_NET, "Discovery is disabled. Processing reply for gw: %s:%d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), pbuf->pb_info.pi_nnis);
+	       libcfs_nidstr(&lp->lp_primary_nid), pbuf->pb_info.pi_nnis);
 
 	/* examine the ping response to determine if the routes on that
 	 * gateway should be declared alive.
@@ -521,7 +521,7 @@ bool lnet_is_route_alive(struct lnet_route *route)
 	 * determine otherwise.
 	 */
 	CDEBUG(D_NET, "%s: Router discovery failed %d\n",
-	       libcfs_nid2str(lp->lp_primary_nid), lp->lp_dc_error);
+	       libcfs_nidstr(&lp->lp_primary_nid), lp->lp_dc_error);
 	while ((lpni = lnet_get_next_peer_ni_locked(lp, NULL, lpni)) != NULL)
 		lpni->lpni_ns_status = LNET_NI_STATUS_DOWN;
 
@@ -741,7 +741,8 @@ static void lnet_shuffle_seed(void)
 		}
 
 		/* our lookups must be true */
-		LASSERT(route2->lr_gateway->lp_primary_nid != gateway);
+		LASSERT(lnet_nid_to_nid4(&route2->lr_gateway->lp_primary_nid) !=
+			gateway);
 	}
 
 	/* It is possible to add multiple routes through the same peer,
@@ -790,7 +791,7 @@ static void lnet_shuffle_seed(void)
 	list_for_each_entry_safe(route, tmp, route_list, lr_list) {
 		gateway = route->lr_gateway;
 		if (gw_nid != LNET_NID_ANY &&
-		    gw_nid != gateway->lp_primary_nid)
+		    gw_nid != lnet_nid_to_nid4(&gateway->lp_primary_nid))
 			continue;
 
 		/* move to zombie to delete outside the lock
@@ -834,7 +835,7 @@ static void lnet_shuffle_seed(void)
 	if (lpni) {
 		lp = lpni->lpni_peer_net->lpn_peer;
 		LASSERT(lp);
-		gw_nid = lp->lp_primary_nid;
+		gw_nid = lnet_nid_to_nid4(&lp->lp_primary_nid);
 		lnet_peer_ni_decref_locked(lpni);
 	}
 
@@ -1142,7 +1143,7 @@ bool lnet_router_checker_active(void)
 			lpn = lnet_get_next_peer_net_locked(rtr, net_id);
 			if (!lpn) {
 				CERROR("gateway %s has no networks\n",
-				       libcfs_nid2str(rtr->lp_primary_nid));
+				       libcfs_nidstr(&rtr->lp_primary_nid));
 				break;
 			}
 
@@ -1159,7 +1160,7 @@ bool lnet_router_checker_active(void)
 			found_lpn = true;
 
 			CDEBUG(D_NET, "rtr %s(%p) %s(%p) next ping %lld\n",
-			       libcfs_nid2str(rtr->lp_primary_nid), rtr,
+			       libcfs_nidstr(&rtr->lp_primary_nid), rtr,
 			       libcfs_net2str(net_id), lpn,
 			       lpn->lpn_next_ping);
 
@@ -1169,7 +1170,7 @@ bool lnet_router_checker_active(void)
 
 		if (!found_lpn || !lpn) {
 			CERROR("no local network found for gateway %s\n",
-			       libcfs_nid2str(rtr->lp_primary_nid));
+			       libcfs_nidstr(&rtr->lp_primary_nid));
 			continue;
 		}
 
@@ -1184,11 +1185,12 @@ bool lnet_router_checker_active(void)
 		spin_unlock(&rtr->lp_lock);
 
 		/* find the peer_ni associated with the primary NID */
-		lpni = lnet_peer_get_ni_locked(rtr, rtr->lp_primary_nid);
+		lpni = lnet_peer_get_ni_locked(rtr,
+					       lnet_nid_to_nid4(&rtr->lp_primary_nid));
 		if (!lpni) {
 			CDEBUG(D_NET,
 			       "Expected to find an lpni for %s, but non found\n",
-			       libcfs_nid2str(rtr->lp_primary_nid));
+			       libcfs_nidstr(&rtr->lp_primary_nid));
 			continue;
 		}
 		lnet_peer_ni_addref_locked(lpni);
@@ -1208,7 +1210,7 @@ bool lnet_router_checker_active(void)
 			lpn->lpn_next_ping = now + alive_router_check_interval;
 		else
 			CERROR("Failed to discover router %s\n",
-			       libcfs_nid2str(rtr->lp_primary_nid));
+			       libcfs_nidstr(&rtr->lp_primary_nid));
 
 		/* NB cpt lock was dropped in lnet_discover_peer_locked() */
 		if (version != the_lnet.ln_routers_version) {
diff --git a/net/lnet/lnet/router_proc.c b/net/lnet/lnet/router_proc.c
index 1a04ac4..2e3c802 100644
--- a/net/lnet/lnet/router_proc.c
+++ b/net/lnet/lnet/router_proc.c
@@ -310,7 +310,7 @@ static int proc_lnet_routers(struct ctl_table *table, int write,
 		}
 
 		if (peer) {
-			lnet_nid_t nid = peer->lp_primary_nid;
+			struct lnet_nid *nid = &peer->lp_primary_nid;
 			int nrefs = atomic_read(&peer->lp_refcount);
 			int nrtrrefs = peer->lp_rtr_refcount;
 			int alive = lnet_is_gateway_alive(peer);
@@ -319,7 +319,7 @@ static int proc_lnet_routers(struct ctl_table *table, int write,
 				       "%-4d %7d %5s %s\n",
 				       nrefs, nrtrrefs,
 				       alive ? "up" : "down",
-				       libcfs_nid2str(nid));
+				       libcfs_nidstr(nid));
 		}
 
 		lnet_net_unlock(0);
diff --git a/net/lnet/lnet/udsp.c b/net/lnet/lnet/udsp.c
index 03669e6..977a6a6 100644
--- a/net/lnet/lnet/udsp.c
+++ b/net/lnet/lnet/udsp.c
@@ -248,7 +248,7 @@ enum udsp_apply {
 		list_for_each_entry(rnet, rn_list, lrn_list) {
 			list_for_each_entry(route, &rnet->lrn_routes, lr_list) {
 				/* look if gw nid on the same net matches */
-				gw_prim_nid = route->lr_gateway->lp_primary_nid;
+				gw_prim_nid = lnet_nid_to_nid4(&route->lr_gateway->lp_primary_nid);
 				lpni = NULL;
 				while ((lpni = lnet_get_next_peer_ni_locked(route->lr_gateway,
 									    NULL,
@@ -425,7 +425,7 @@ enum udsp_apply {
 		rn_list = &the_lnet.ln_remote_nets_hash[i];
 		list_for_each_entry(rnet, rn_list, lrn_list) {
 			list_for_each_entry(route, &rnet->lrn_routes, lr_list) {
-				gw_nid = route->lr_gateway->lp_primary_nid;
+				gw_nid = lnet_nid_to_nid4(&route->lr_gateway->lp_primary_nid);
 				rc = cfs_match_nid_net(gw_nid,
 						       rte_action->ud_net_id.udn_net_type,
 						       &rte_action->ud_net_id.udn_net_num_range,
@@ -608,7 +608,7 @@ enum udsp_apply {
 		ptable = the_lnet.ln_peer_tables[cpt];
 		list_for_each_entry(lp, &ptable->pt_peer_list, lp_peer_list) {
 			CDEBUG(D_NET, "udsp examining lp %s\n",
-			       libcfs_nid2str(lp->lp_primary_nid));
+			       libcfs_nidstr(&lp->lp_primary_nid));
 			list_for_each_entry(lpn,
 					    &lp->lp_peer_nets,
 					    lpn_peer_nets) {
-- 
1.8.3.1


* [lustre-devel] [PATCH 07/24] lnet: change lp_disc_*_nid to struct lnet_nid
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (5 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 06/24] lnet: change lp_primary_nid to struct lnet_nid James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 08/24] lnet: socklnd: factor out key calculation for ksnd_peers James Simmons
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

Change lp_disc_src_nid and lp_disc_dst_nid in struct lnet_peer to
struct lnet_nid.

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: f38529cd3a1722119 ("LU-10391 lnet: change lp_disc_*_nid to struct lnet_nid")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/44620
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-types.h |  4 ++--
 net/lnet/lnet/peer.c           | 20 ++++++++++----------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index f980f2f..ba900e8 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -672,9 +672,9 @@ struct lnet_peer {
 	struct lnet_nid		lp_primary_nid;
 
 	/* source NID to use during discovery */
-	lnet_nid_t		lp_disc_src_nid;
+	struct lnet_nid		lp_disc_src_nid;
 	/* destination NID to use during discovery */
-	lnet_nid_t		lp_disc_dst_nid;
+	struct lnet_nid		lp_disc_dst_nid;
 
 	/* net to perform discovery on */
 	u32			lp_disc_net_id;
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 7f2c6f3..17f99ee 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -224,8 +224,8 @@
 	init_waitqueue_head(&lp->lp_dc_waitq);
 	spin_lock_init(&lp->lp_lock);
 	lp->lp_primary_nid = nid;
-	lp->lp_disc_src_nid = LNET_NID_ANY;
-	lp->lp_disc_dst_nid = LNET_NID_ANY;
+	lp->lp_disc_src_nid = LNET_ANY_NID;
+	lp->lp_disc_dst_nid = LNET_ANY_NID;
 	if (lnet_peers_start_down())
 		lp->lp_alive = false;
 	else
@@ -2635,8 +2635,8 @@ static void lnet_peer_clear_discovery_error(struct lnet_peer *lp)
 
 	spin_lock(&lp->lp_lock);
 
-	lp->lp_disc_src_nid = ev->target.nid;
-	lp->lp_disc_dst_nid = ev->source.nid;
+	lnet_nid4_to_nid(ev->target.nid, &lp->lp_disc_src_nid);
+	lnet_nid4_to_nid(ev->source.nid, &lp->lp_disc_dst_nid);
 
 	/*
 	 * If some kind of error happened the contents of message
@@ -3367,7 +3367,7 @@ static int lnet_peer_data_present(struct lnet_peer *lp)
 			 * received by lp, we need to set the discovery source
 			 * NID for new_lp to the NID stored in lp.
 			 */
-			if (lp->lp_disc_src_nid != LNET_NID_ANY) {
+			if (!LNET_NID_IS_ANY(&lp->lp_disc_src_nid)) {
 				new_lp->lp_disc_src_nid = lp->lp_disc_src_nid;
 				new_lp->lp_disc_dst_nid = lp->lp_disc_dst_nid;
 			}
@@ -3567,13 +3567,13 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 	/* Refcount for MD. */
 	lnet_peer_addref_locked(lp);
 	id.pid = LNET_PID_LUSTRE;
-	if (lp->lp_disc_dst_nid != LNET_NID_ANY)
-		id.nid = lp->lp_disc_dst_nid;
+	if (!LNET_NID_IS_ANY(&lp->lp_disc_dst_nid))
+		id.nid = lnet_nid_to_nid4(&lp->lp_disc_dst_nid);
 	else
 		id.nid = lnet_nid_to_nid4(&lp->lp_primary_nid);
 	lnet_net_unlock(cpt);
 
-	rc = LNetPut(lp->lp_disc_src_nid, lp->lp_push_mdh,
+	rc = LNetPut(lnet_nid_to_nid4(&lp->lp_disc_src_nid), lp->lp_push_mdh,
 		     LNET_ACK_REQ, id, LNET_RESERVED_PORTAL,
 		     LNET_PROTO_PING_MATCHBITS, 0, 0);
 	/* reset the discovery nid. There is no need to restrict sending
@@ -3581,8 +3581,8 @@ static int lnet_peer_send_push(struct lnet_peer *lp)
 	 * get set to a specific NID, if we initiate discovery from the
 	 * scratch
 	 */
-	lp->lp_disc_src_nid = LNET_NID_ANY;
-	lp->lp_disc_dst_nid = LNET_NID_ANY;
+	lp->lp_disc_src_nid = LNET_ANY_NID;
+	lp->lp_disc_dst_nid = LNET_ANY_NID;
 	if (rc)
 		goto fail_unlink;
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 08/24] lnet: socklnd: factor out key calculation for ksnd_peers
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (6 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 07/24] lnet: change lp_disc_*_nid " James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 09/24] lnet: introduce lnet_processid for ksock_peer_ni James Simmons
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

The hash_table library requires a "long" to be used as a key.  We
currently provide the nid, which at 64bits is a suitable long on 64bit
hosts, but isn't really correct on 32bit hosts.

When we change to an extended nid (which is 160 bits) it will be even
less appropriate.

So create a separate function to compute a 'long' key, and implement
it by simply xoring 'long'-sized parts of the nid together.  On a 64bit
machine, this is currently optimized away for lnet_nid_t, but that
will change when we convert to struct lnet_nid.

This new function is placed in lnet-types.h as it will be more
generally useful later.

The hash_table library calls hash_long() on the key, so we don't need
to do anything more interesting than xoring.
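
As a rough userspace illustration of the xor-folding idea described above
(not the code in this patch, which takes an lnet_nid_t and xors
LNET_NIDNET() with LNET_NIDADDR()), the sketch below folds a 160-bit value
into an unsigned long. The names demo_nid and demo_nidhash are hypothetical
and the 20-byte layout is only an assumption standing in for the future
extended nid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 160-bit extended NID; this is an
 * assumption for illustration, not the kernel's struct lnet_nid.
 */
struct demo_nid {
	uint8_t bytes[20];
};

/* Fold the NID into a single 'long' key by xoring together
 * consecutive 'long'-sized chunks.  A short final chunk is
 * zero-padded before being folded in.
 */
static unsigned long demo_nidhash(const struct demo_nid *nid)
{
	unsigned long hash = 0;
	size_t off;

	for (off = 0; off < sizeof(nid->bytes); off += sizeof(hash)) {
		unsigned long part = 0;
		size_t n = sizeof(nid->bytes) - off;

		if (n > sizeof(part))
			n = sizeof(part);
		/* memcpy avoids unaligned or aliasing-unsafe loads */
		memcpy(&part, nid->bytes + off, n);
		hash ^= part;
	}
	return hash;
}
```

Xor folding is deliberately weak: two NIDs that differ only by the same
bits in two different chunks fold to the same key. That is acceptable
here because, as noted above, the hash_table library applies hash_long()
to the key, and each lookup still compares the full NID.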

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 96a0c378c2e0a0c8f ("LU-10391 socklnd: factor out key calculation for ksnd_peers")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/42103
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/uapi/linux/lnet/lnet-types.h | 10 ++++++++++
 net/lnet/klnds/socklnd/socklnd.c     | 17 +++++++++++------
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/lnet/lnet-types.h b/include/uapi/linux/lnet/lnet-types.h
index ba8a079..5538d4e 100644
--- a/include/uapi/linux/lnet/lnet-types.h
+++ b/include/uapi/linux/lnet/lnet-types.h
@@ -163,6 +163,16 @@ static inline int nid_same(const struct lnet_nid *n1,
 		n1->nid_addr[3] == n2->nid_addr[3];
 }
 
+/* This can be used when we need to hash a nid */
+static inline unsigned long nidhash(lnet_nid_t nid)
+{
+	unsigned long hash = 0;
+
+	hash ^= LNET_NIDNET(nid);
+	hash ^= LNET_NIDADDR(nid);
+	return hash;
+}
+
 struct lnet_counters_health {
 	__u32	lch_rst_alloc;
 	__u32	lch_resend_count;
diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 21569fb..08d1cf4 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -221,9 +221,10 @@ struct ksock_peer_ni *
 ksocknal_find_peer_locked(struct lnet_ni *ni, struct lnet_process_id id)
 {
 	struct ksock_peer_ni *peer_ni;
+	unsigned long hash = nidhash(id.nid);
 
 	hash_for_each_possible(ksocknal_data.ksnd_peers, peer_ni,
-			       ksnp_list, id.nid) {
+			       ksnp_list, hash) {
 		LASSERT(!peer_ni->ksnp_closing);
 
 		if (peer_ni->ksnp_ni != ni)
@@ -602,7 +603,8 @@ struct ksock_peer_ni *
 		peer_ni = peer2;
 	} else {
 		/* peer_ni table takes my ref on peer_ni */
-		hash_add(ksocknal_data.ksnd_peers, &peer_ni->ksnp_list, id.nid);
+		hash_add(ksocknal_data.ksnd_peers, &peer_ni->ksnp_list,
+			 nidhash(id.nid));
 	}
 
 	ksocknal_add_conn_cb_locked(peer_ni, conn_cb);
@@ -656,7 +658,8 @@ struct ksock_peer_ni *
 	write_lock_bh(&ksocknal_data.ksnd_global_lock);
 
 	if (id.nid != LNET_NID_ANY) {
-		lo = hash_min(id.nid, HASH_BITS(ksocknal_data.ksnd_peers));
+		lo = hash_min(nidhash(id.nid),
+			      HASH_BITS(ksocknal_data.ksnd_peers));
 		hi = lo;
 	} else {
 		lo = 0;
@@ -935,7 +938,7 @@ struct ksock_peer_ni *
 			 * table (which takes my ref)
 			 */
 			hash_add(ksocknal_data.ksnd_peers,
-				 &peer_ni->ksnp_list, peerid.nid);
+				 &peer_ni->ksnp_list, nidhash(peerid.nid));
 		} else {
 			ksocknal_peer_decref(peer_ni);
 			peer_ni = peer2;
@@ -1567,7 +1570,8 @@ struct ksock_peer_ni *
 	write_lock_bh(&ksocknal_data.ksnd_global_lock);
 
 	if (id.nid != LNET_NID_ANY) {
-		lo = hash_min(id.nid, HASH_BITS(ksocknal_data.ksnd_peers));
+		lo = hash_min(nidhash(id.nid),
+			      HASH_BITS(ksocknal_data.ksnd_peers));
 		hi = lo;
 	} else {
 		lo = 0;
@@ -1662,7 +1666,8 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 	int rc = -ENOENT;
 
 	if (id.nid != LNET_NID_ANY) {
-		lo = hash_min(id.nid, HASH_BITS(ksocknal_data.ksnd_peers));
+		lo = hash_min(nidhash(id.nid),
+			      HASH_BITS(ksocknal_data.ksnd_peers));
 		hi = lo;
 	} else {
 		lo = 0;
-- 
1.8.3.1

* [lustre-devel] [PATCH 09/24] lnet: introduce lnet_processid for ksock_peer_ni
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (7 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 08/24] lnet: socklnd: factor out key calculation for ksnd_peers James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 10/24] lnet: enhance connect/accept to support large addr James Simmons
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

struct lnet_processid (without the '_' between "process" and "id") is
like struct lnet_process_id, but contains a 'struct lnet_nid' rather
than an lnet_nid_t.

So far it is only used for ksnp_id in struct ksock_peer_ni and in
related functions.
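
As a rough sketch of how the two id types relate, the conversion in
both directions can be modelled in userspace C.  Every sketch_* type
and function below is a hypothetical, simplified stand-in (the real
struct lnet_nid layout differs); only the shape of the conversion is
meant to match the helpers added by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t sketch_nid4_t;		/* stand-in for lnet_nid_t */
typedef uint32_t sketch_pid_t;		/* stand-in for lnet_pid_t */

/* Simplified large nid: a network number plus four address words. */
struct sketch_nid {
	uint32_t net;
	uint32_t addr[4];
};

struct sketch_process_id {		/* small form, 64-bit nid */
	sketch_nid4_t nid;
	sketch_pid_t pid;
};

struct sketch_processid {		/* large form, struct nid */
	struct sketch_nid nid;
	sketch_pid_t pid;
};

static void sketch_nid4_to_nid(sketch_nid4_t nid4, struct sketch_nid *nid)
{
	assert(nid != NULL);
	memset(nid, 0, sizeof(*nid));	/* unused address words stay 0 */
	nid->net = (uint32_t)(nid4 >> 32);
	nid->addr[0] = (uint32_t)(nid4 & 0xffffffffu);
}

static sketch_nid4_t sketch_nid_to_nid4(const struct sketch_nid *nid)
{
	assert(nid != NULL);
	return ((sketch_nid4_t)nid->net << 32) | nid->addr[0];
}

/* Widen a small process id into the large form. */
static void sketch_pid4_to_pid(struct sketch_process_id pid4,
			       struct sketch_processid *pid)
{
	pid->pid = pid4.pid;
	sketch_nid4_to_nid(pid4.nid, &pid->nid);
}

/* Narrow a large process id; only addr[0] survives in this sketch. */
static struct sketch_process_id
sketch_pid_to_pid4(const struct sketch_processid *pid)
{
	struct sketch_process_id ret;

	ret.pid = pid->pid;
	ret.nid = sketch_nid_to_nid4(&pid->nid);
	return ret;
}
```

In this sketch, widening and then narrowing returns the original small
id, so callers that still speak the 64-bit form can convert at the
boundary while the peer table stores the large form.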

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: e1dbfdd53e2ce9543 ("LU-10391 lnet: introduce lnet_processid for ksock_peer_ni")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/42104
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h          |   1 +
 include/uapi/linux/lnet/lnet-types.h   |  36 +++++-
 include/uapi/linux/lnet/nidstr.h       |   1 +
 net/lnet/klnds/socklnd/socklnd.c       | 208 ++++++++++++++++++---------------
 net/lnet/klnds/socklnd/socklnd.h       |  14 +--
 net/lnet/klnds/socklnd/socklnd_cb.c    | 111 ++++++++++--------
 net/lnet/klnds/socklnd/socklnd_proto.c |  14 +--
 net/lnet/lnet/api-ni.c                 |  21 +++-
 net/lnet/lnet/nidstrings.c             |  18 +++
 9 files changed, 259 insertions(+), 165 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index a4ec067..3842976 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -484,6 +484,7 @@ unsigned int lnet_nid_cpt_hash(struct lnet_nid *nid,
 			       unsigned int number);
 int lnet_cpt_of_nid_locked(struct lnet_nid *nid, struct lnet_ni *ni);
 int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni);
+int lnet_nid2cpt(struct lnet_nid *nid, struct lnet_ni *ni);
 struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt);
 struct lnet_ni *lnet_nid2ni_addref(lnet_nid_t nid);
 struct lnet_ni *lnet_net2ni_locked(u32 net, int cpt);
diff --git a/include/uapi/linux/lnet/lnet-types.h b/include/uapi/linux/lnet/lnet-types.h
index 5538d4e..ec0c4ef 100644
--- a/include/uapi/linux/lnet/lnet-types.h
+++ b/include/uapi/linux/lnet/lnet-types.h
@@ -164,12 +164,14 @@ static inline int nid_same(const struct lnet_nid *n1,
 }
 
 /* This can be used when we need to hash a nid */
-static inline unsigned long nidhash(lnet_nid_t nid)
+static inline unsigned long nidhash(const struct lnet_nid *nid)
 {
+	int i;
 	unsigned long hash = 0;
 
-	hash ^= LNET_NIDNET(nid);
-	hash ^= LNET_NIDADDR(nid);
+	hash ^= LNET_NID_NET(nid);
+	for (i = 0; i < 4; i++)
+		hash ^= nid->nid_addr[i];
 	return hash;
 }
 
@@ -241,6 +243,34 @@ struct lnet_process_id {
 	/** process id */
 	lnet_pid_t pid;
 };
+
+/**
+ * Global process ID - with large addresses
+ */
+struct lnet_processid {
+	/** node id */
+	struct lnet_nid nid;
+	/** process id */
+	lnet_pid_t pid;
+};
+
+static inline void
+lnet_pid4_to_pid(struct lnet_process_id pid4, struct lnet_processid *pid)
+{
+	pid->pid = pid4.pid;
+	lnet_nid4_to_nid(pid4.nid, &pid->nid);
+}
+
+static inline struct lnet_process_id
+lnet_pid_to_pid4(struct lnet_processid *pid)
+{
+	struct lnet_process_id ret;
+
+	ret.pid = pid->pid;
+	ret.nid = lnet_nid_to_nid4(&pid->nid);
+	return ret;
+}
+
 /** @} lnet_addr */
 
 /** \addtogroup lnet_me
diff --git a/include/uapi/linux/lnet/nidstr.h b/include/uapi/linux/lnet/nidstr.h
index 13a0d10..bfc9644 100644
--- a/include/uapi/linux/lnet/nidstr.h
+++ b/include/uapi/linux/lnet/nidstr.h
@@ -103,6 +103,7 @@ static inline char *libcfs_nidstr(const struct lnet_nid *nid)
 int libcfs_strnid(struct lnet_nid *nid, const char *str);
 int libcfs_str2anynid(lnet_nid_t *nid, const char *str);
 char *libcfs_id2str(struct lnet_process_id id);
+char *libcfs_idstr(struct lnet_processid *id);
 void cfs_free_nidlist(struct list_head *list);
 int cfs_parse_nidlist(char *str, int len, struct list_head *list);
 int cfs_print_nidlist(char *buffer, int count, struct list_head *list);
diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 08d1cf4..7397ac7 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -152,14 +152,14 @@ static int ksocknal_ip2index(struct sockaddr *addr, struct lnet_ni *ni)
 }
 
 static struct ksock_peer_ni *
-ksocknal_create_peer(struct lnet_ni *ni, struct lnet_process_id id)
+ksocknal_create_peer(struct lnet_ni *ni, struct lnet_processid *id)
 {
-	int cpt = lnet_cpt_of_nid(id.nid, ni);
+	int cpt = lnet_nid2cpt(&id->nid, ni);
 	struct ksock_net *net = ni->ni_data;
 	struct ksock_peer_ni *peer_ni;
 
-	LASSERT(id.nid != LNET_NID_ANY);
-	LASSERT(id.pid != LNET_PID_ANY);
+	LASSERT(!LNET_NID_IS_ANY(&id->nid));
+	LASSERT(id->pid != LNET_PID_ANY);
 	LASSERT(!in_interrupt());
 
 	if (!atomic_inc_unless_negative(&net->ksnn_npeers)) {
@@ -174,7 +174,7 @@ static int ksocknal_ip2index(struct sockaddr *addr, struct lnet_ni *ni)
 	}
 
 	peer_ni->ksnp_ni = ni;
-	peer_ni->ksnp_id = id;
+	peer_ni->ksnp_id = *id;
 	refcount_set(&peer_ni->ksnp_refcount, 1);   /* 1 ref for caller */
 	peer_ni->ksnp_closing = 0;
 	peer_ni->ksnp_accepting = 0;
@@ -197,7 +197,7 @@ static int ksocknal_ip2index(struct sockaddr *addr, struct lnet_ni *ni)
 	struct ksock_net *net = peer_ni->ksnp_ni->ni_data;
 
 	CDEBUG(D_NET, "peer_ni %s %p deleted\n",
-	       libcfs_id2str(peer_ni->ksnp_id), peer_ni);
+	       libcfs_idstr(&peer_ni->ksnp_id), peer_ni);
 
 	LASSERT(!refcount_read(&peer_ni->ksnp_refcount));
 	LASSERT(!peer_ni->ksnp_accepting);
@@ -218,10 +218,10 @@ static int ksocknal_ip2index(struct sockaddr *addr, struct lnet_ni *ni)
 }
 
 struct ksock_peer_ni *
-ksocknal_find_peer_locked(struct lnet_ni *ni, struct lnet_process_id id)
+ksocknal_find_peer_locked(struct lnet_ni *ni, struct lnet_processid *id)
 {
 	struct ksock_peer_ni *peer_ni;
-	unsigned long hash = nidhash(id.nid);
+	unsigned long hash = nidhash(&id->nid);
 
 	hash_for_each_possible(ksocknal_data.ksnd_peers, peer_ni,
 			       ksnp_list, hash) {
@@ -230,12 +230,12 @@ struct ksock_peer_ni *
 		if (peer_ni->ksnp_ni != ni)
 			continue;
 
-		if (peer_ni->ksnp_id.nid != id.nid ||
-		    peer_ni->ksnp_id.pid != id.pid)
+		if (!nid_same(&peer_ni->ksnp_id.nid, &id->nid) ||
+		    peer_ni->ksnp_id.pid != id->pid)
 			continue;
 
 		CDEBUG(D_NET, "got peer_ni [%p] -> %s (%d)\n",
-		       peer_ni, libcfs_id2str(id),
+		       peer_ni, libcfs_idstr(id),
 		       refcount_read(&peer_ni->ksnp_refcount));
 		return peer_ni;
 	}
@@ -243,7 +243,7 @@ struct ksock_peer_ni *
 }
 
 struct ksock_peer_ni *
-ksocknal_find_peer(struct lnet_ni *ni, struct lnet_process_id id)
+ksocknal_find_peer(struct lnet_ni *ni, struct lnet_processid *id)
 {
 	struct ksock_peer_ni *peer_ni;
 
@@ -312,7 +312,8 @@ struct ksock_peer_ni *
 			if (index-- > 0)
 				continue;
 
-			*id = peer_ni->ksnp_id;
+			id->pid = peer_ni->ksnp_id.pid;
+			id->nid = lnet_nid_to_nid4(&peer_ni->ksnp_id.nid);
 			*myip = 0;
 			*peer_ip = 0;
 			*port = 0;
@@ -326,7 +327,8 @@ struct ksock_peer_ni *
 			if (index-- > 0)
 				continue;
 
-			*id = peer_ni->ksnp_id;
+			id->pid = peer_ni->ksnp_id.pid;
+			id->nid = lnet_nid_to_nid4(&peer_ni->ksnp_id.nid);
 			*myip = peer_ni->ksnp_passive_ips[j];
 			*peer_ip = 0;
 			*port = 0;
@@ -342,7 +344,8 @@ struct ksock_peer_ni *
 
 			conn_cb = peer_ni->ksnp_conn_cb;
 
-			*id = peer_ni->ksnp_id;
+			id->pid = peer_ni->ksnp_id.pid;
+			id->nid = lnet_nid_to_nid4(&peer_ni->ksnp_id.nid);
 			if (conn_cb->ksnr_addr.ss_family == AF_INET) {
 				struct sockaddr_in *sa;
 
@@ -465,13 +468,13 @@ struct ksock_peer_ni *
 			 * conn_cb)
 			 */
 			CDEBUG(D_NET, "Binding %s %pIS to interface %d\n",
-			       libcfs_id2str(peer_ni->ksnp_id),
+			       libcfs_idstr(&peer_ni->ksnp_id),
 			       &conn_cb->ksnr_addr,
 			       conn_iface);
 		} else {
 			CDEBUG(D_NET,
 			       "Rebinding %s %pIS from interface %d to %d\n",
-			       libcfs_id2str(peer_ni->ksnp_id),
+			       libcfs_idstr(&peer_ni->ksnp_id),
 			       &conn_cb->ksnr_addr,
 			       conn_cb->ksnr_myiface,
 			       conn_iface);
@@ -567,26 +570,27 @@ struct ksock_peer_ni *
 }
 
 int
-ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id, u32 ipaddr,
-		  int port)
+ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id4,
+		  struct sockaddr *addr)
 {
 	struct ksock_peer_ni *peer_ni;
 	struct ksock_peer_ni *peer2;
 	struct ksock_conn_cb *conn_cb;
-	struct sockaddr_in sa = {.sin_family = AF_INET};
+	struct lnet_processid id;
 
-	if (id.nid == LNET_NID_ANY ||
-	    id.pid == LNET_PID_ANY)
+	if (id4.nid == LNET_NID_ANY ||
+	    id4.pid == LNET_PID_ANY)
 		return -EINVAL;
 
+	id.pid = id4.pid;
+	lnet_nid4_to_nid(id4.nid, &id.nid);
+
 	/* Have a brand new peer_ni ready... */
-	peer_ni = ksocknal_create_peer(ni, id);
+	peer_ni = ksocknal_create_peer(ni, &id);
 	if (IS_ERR(peer_ni))
 		return PTR_ERR(peer_ni);
 
-	sa.sin_addr.s_addr = htonl(ipaddr);
-	sa.sin_port = htons(port);
-	conn_cb = ksocknal_create_conn_cb((struct sockaddr *)&sa);
+	conn_cb = ksocknal_create_conn_cb(addr);
 	if (!conn_cb) {
 		ksocknal_peer_decref(peer_ni);
 		return -ENOMEM;
@@ -597,14 +601,14 @@ struct ksock_peer_ni *
 	/* always called with a ref on ni, so shutdown can't have started */
 	LASSERT(atomic_read(&((struct ksock_net *)ni->ni_data)->ksnn_npeers) >= 0);
 
-	peer2 = ksocknal_find_peer_locked(ni, id);
+	peer2 = ksocknal_find_peer_locked(ni, &id);
 	if (peer2) {
 		ksocknal_peer_decref(peer_ni);
 		peer_ni = peer2;
 	} else {
 		/* peer_ni table takes my ref on peer_ni */
 		hash_add(ksocknal_data.ksnd_peers, &peer_ni->ksnp_list,
-			 nidhash(id.nid));
+			 nidhash(&id.nid));
 	}
 
 	ksocknal_add_conn_cb_locked(peer_ni, conn_cb);
@@ -645,7 +649,7 @@ struct ksock_peer_ni *
 }
 
 static int
-ksocknal_del_peer(struct lnet_ni *ni, struct lnet_process_id id, u32 ip)
+ksocknal_del_peer(struct lnet_ni *ni, struct lnet_process_id id4, u32 ip)
 {
 	LIST_HEAD(zombies);
 	struct hlist_node *pnxt;
@@ -654,11 +658,15 @@ struct ksock_peer_ni *
 	int hi;
 	int i;
 	int rc = -ENOENT;
+	struct lnet_processid id;
+
+	id.pid = id4.pid;
+	lnet_nid4_to_nid(id4.nid, &id.nid);
 
 	write_lock_bh(&ksocknal_data.ksnd_global_lock);
 
-	if (id.nid != LNET_NID_ANY) {
-		lo = hash_min(nidhash(id.nid),
+	if (!LNET_NID_IS_ANY(&id.nid)) {
+		lo = hash_min(nidhash(&id.nid),
 			      HASH_BITS(ksocknal_data.ksnd_peers));
 		hi = lo;
 	} else {
@@ -673,8 +681,8 @@ struct ksock_peer_ni *
 			if (peer_ni->ksnp_ni != ni)
 				continue;
 
-			if (!((id.nid == LNET_NID_ANY ||
-			       peer_ni->ksnp_id.nid == id.nid) &&
+			if (!((LNET_NID_IS_ANY(&id.nid) ||
+			       nid_same(&peer_ni->ksnp_id.nid, &id.nid)) &&
 			      (id.pid == LNET_PID_ANY ||
 			       peer_ni->ksnp_id.pid == id.pid)))
 				continue;
@@ -805,7 +813,7 @@ struct ksock_peer_ni *
 {
 	rwlock_t *global_lock = &ksocknal_data.ksnd_global_lock;
 	LIST_HEAD(zombies);
-	struct lnet_process_id peerid;
+	struct lnet_process_id peerid4;
 	u64 incarnation;
 	struct ksock_conn *conn;
 	struct ksock_conn *conn2;
@@ -879,7 +887,7 @@ struct ksock_peer_ni *
 
 		/* Active connection sends HELLO eagerly */
 		hello->kshm_nips =  0;
-		peerid = peer_ni->ksnp_id;
+		peerid4 = lnet_pid_to_pid4(&peer_ni->ksnp_id);
 
 		write_lock_bh(global_lock);
 		conn->ksnc_proto = peer_ni->ksnp_proto;
@@ -895,32 +903,35 @@ struct ksock_peer_ni *
 #endif
 		}
 
-		rc = ksocknal_send_hello(ni, conn, peerid.nid, hello);
+		rc = ksocknal_send_hello(ni, conn, peerid4.nid, hello);
 		if (rc)
 			goto failed_1;
 	} else {
-		peerid.nid = LNET_NID_ANY;
-		peerid.pid = LNET_PID_ANY;
+		peerid4.nid = LNET_NID_ANY;
+		peerid4.pid = LNET_PID_ANY;
 
 		/* Passive, get protocol from peer_ni */
 		conn->ksnc_proto = NULL;
 	}
 
-	rc = ksocknal_recv_hello(ni, conn, hello, &peerid, &incarnation);
+	rc = ksocknal_recv_hello(ni, conn, hello, &peerid4, &incarnation);
 	if (rc < 0)
 		goto failed_1;
 
 	LASSERT(!rc || active);
 	LASSERT(conn->ksnc_proto);
-	LASSERT(peerid.nid != LNET_NID_ANY);
+	LASSERT(peerid4.nid != LNET_NID_ANY);
 
-	cpt = lnet_cpt_of_nid(peerid.nid, ni);
+	cpt = lnet_cpt_of_nid(peerid4.nid, ni);
 
 	if (active) {
 		ksocknal_peer_addref(peer_ni);
 		write_lock_bh(global_lock);
 	} else {
-		peer_ni = ksocknal_create_peer(ni, peerid);
+		struct lnet_processid peerid;
+
+		lnet_pid4_to_pid(peerid4, &peerid);
+		peer_ni = ksocknal_create_peer(ni, &peerid);
 		if (IS_ERR(peer_ni)) {
 			rc = PTR_ERR(peer_ni);
 			goto failed_1;
@@ -931,14 +942,14 @@ struct ksock_peer_ni *
 		/* called with a ref on ni, so shutdown can't have started */
 		LASSERT(atomic_read(&((struct ksock_net *)ni->ni_data)->ksnn_npeers) >= 0);
 
-		peer2 = ksocknal_find_peer_locked(ni, peerid);
+		peer2 = ksocknal_find_peer_locked(ni, &peerid);
 		if (!peer2) {
 			/*
 			 * NB this puts an "empty" peer_ni in the peer
 			 * table (which takes my ref)
 			 */
 			hash_add(ksocknal_data.ksnd_peers,
-				 &peer_ni->ksnp_list, nidhash(peerid.nid));
+				 &peer_ni->ksnp_list, nidhash(&peerid.nid));
 		} else {
 			ksocknal_peer_decref(peer_ni);
 			peer_ni = peer2;
@@ -952,7 +963,7 @@ struct ksock_peer_ni *
 		 * Am I already connecting to this guy?  Resolve in
 		 * favour of higher NID...
 		 */
-		if (peerid.nid < lnet_nid_to_nid4(&ni->ni_nid) &&
+		if (peerid4.nid < lnet_nid_to_nid4(&ni->ni_nid) &&
 		    ksocknal_connecting(peer_ni->ksnp_conn_cb,
 					((struct sockaddr *)&conn->ksnc_peeraddr))) {
 			rc = EALREADY;
@@ -1051,7 +1062,7 @@ struct ksock_peer_ni *
 	    !rpc_cmp_addr((struct sockaddr *)&conn_cb->ksnr_addr,
 			  (struct sockaddr *)&conn->ksnc_peeraddr)) {
 		CERROR("Route %s %pIS connected to %pIS\n",
-		       libcfs_id2str(peer_ni->ksnp_id),
+		       libcfs_idstr(&peer_ni->ksnp_id),
 		       &conn_cb->ksnr_addr,
 		       &conn->ksnc_peeraddr);
 	}
@@ -1123,13 +1134,13 @@ struct ksock_peer_ni *
 	 */
 	CDEBUG(D_NET,
 	       "New conn %s p %d.x %pIS -> %pISp incarnation:%lld sched[%d]\n",
-	       libcfs_id2str(peerid), conn->ksnc_proto->pro_version,
+	       libcfs_id2str(peerid4), conn->ksnc_proto->pro_version,
 	       &conn->ksnc_myaddr, &conn->ksnc_peeraddr,
 	       incarnation, cpt);
 
 	if (!active) {
 		hello->kshm_nips = 0;
-		rc = ksocknal_send_hello(ni, conn, peerid.nid, hello);
+		rc = ksocknal_send_hello(ni, conn, peerid4.nid, hello);
 	}
 
 	kvfree(hello);
@@ -1185,10 +1196,10 @@ struct ksock_peer_ni *
 	if (warn) {
 		if (rc < 0)
 			CERROR("Not creating conn %s type %d: %s\n",
-			       libcfs_id2str(peerid), conn->ksnc_type, warn);
+			       libcfs_id2str(peerid4), conn->ksnc_type, warn);
 		else
 			CDEBUG(D_NET, "Not creating conn %s type %d: %s\n",
-			       libcfs_id2str(peerid), conn->ksnc_type, warn);
+			       libcfs_id2str(peerid4), conn->ksnc_type, warn);
 	}
 
 	if (!active) {
@@ -1199,7 +1210,7 @@ struct ksock_peer_ni *
 			 */
 			conn->ksnc_type = SOCKLND_CONN_NONE;
 			hello->kshm_nips = 0;
-			ksocknal_send_hello(ni, conn, peerid.nid, hello);
+			ksocknal_send_hello(ni, conn, peerid4.nid, hello);
 		}
 
 		write_lock_bh(global_lock);
@@ -1338,7 +1349,8 @@ struct ksock_peer_ni *
 	read_unlock(&ksocknal_data.ksnd_global_lock);
 
 	if (notify)
-		lnet_notify(peer_ni->ksnp_ni, peer_ni->ksnp_id.nid,
+		lnet_notify(peer_ni->ksnp_ni,
+			    lnet_nid_to_nid4(&peer_ni->ksnp_id.nid),
 			    false, false, last_alive);
 }
 
@@ -1481,7 +1493,8 @@ struct ksock_peer_ni *
 		last_rcv = conn->ksnc_rx_deadline -
 			   ksocknal_timeout();
 		CERROR("Completing partial receive from %s[%d], ip %pISp, with error, wanted: %zd, left: %d, last alive is %lld secs ago\n",
-		       libcfs_id2str(conn->ksnc_peer->ksnp_id), conn->ksnc_type,
+		       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
+		       conn->ksnc_type,
 		       &conn->ksnc_peeraddr,
 		       iov_iter_count(&conn->ksnc_rx_to), conn->ksnc_rx_nob_left,
 		       ktime_get_seconds() - last_rcv);
@@ -1493,21 +1506,21 @@ struct ksock_peer_ni *
 	case SOCKNAL_RX_LNET_HEADER:
 		if (conn->ksnc_rx_started)
 			CERROR("Incomplete receive of lnet header from %s, ip %pISp, with error, protocol: %d.x.\n",
-			       libcfs_id2str(conn->ksnc_peer->ksnp_id),
+			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       &conn->ksnc_peeraddr,
 			       conn->ksnc_proto->pro_version);
 		break;
 	case SOCKNAL_RX_KSM_HEADER:
 		if (conn->ksnc_rx_started)
 			CERROR("Incomplete receive of ksock message from %s, ip %pISp, with error, protocol: %d.x.\n",
-			       libcfs_id2str(conn->ksnc_peer->ksnp_id),
+			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       &conn->ksnc_peeraddr,
 			       conn->ksnc_proto->pro_version);
 		break;
 	case SOCKNAL_RX_SLOP:
 		if (conn->ksnc_rx_started)
 			CERROR("Incomplete receive of slops from %s, ip %pISp, with error\n",
-			       libcfs_id2str(conn->ksnc_peer->ksnp_id),
+			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       &conn->ksnc_peeraddr);
 	       break;
 	default:
@@ -1557,7 +1570,7 @@ struct ksock_peer_ni *
 }
 
 int
-ksocknal_close_matching_conns(struct lnet_process_id id, u32 ipaddr)
+ksocknal_close_matching_conns(struct lnet_processid *id, u32 ipaddr)
 {
 	struct ksock_peer_ni *peer_ni;
 	struct hlist_node *pnxt;
@@ -1569,8 +1582,8 @@ struct ksock_peer_ni *
 
 	write_lock_bh(&ksocknal_data.ksnd_global_lock);
 
-	if (id.nid != LNET_NID_ANY) {
-		lo = hash_min(nidhash(id.nid),
+	if (!LNET_NID_IS_ANY(&id->nid)) {
+		lo = hash_min(nidhash(&id->nid),
 			      HASH_BITS(ksocknal_data.ksnd_peers));
 		hi = lo;
 	} else {
@@ -1583,10 +1596,10 @@ struct ksock_peer_ni *
 		hlist_for_each_entry_safe(peer_ni, pnxt,
 					  &ksocknal_data.ksnd_peers[i],
 					  ksnp_list) {
-			if (!((id.nid == LNET_NID_ANY ||
-			       id.nid == peer_ni->ksnp_id.nid) &&
-			      (id.pid == LNET_PID_ANY ||
-			       id.pid == peer_ni->ksnp_id.pid)))
+			if (!((LNET_NID_IS_ANY(&id->nid) ||
+			       nid_same(&id->nid, &peer_ni->ksnp_id.nid)) &&
+			      (id->pid == LNET_PID_ANY ||
+			       id->pid == peer_ni->ksnp_id.pid)))
 				continue;
 
 			count += ksocknal_close_peer_conns_locked(peer_ni,
@@ -1598,7 +1611,8 @@ struct ksock_peer_ni *
 	write_unlock_bh(&ksocknal_data.ksnd_global_lock);
 
 	/* wildcards always succeed */
-	if (id.nid == LNET_NID_ANY || id.pid == LNET_PID_ANY || !ipaddr)
+	if (LNET_NID_IS_ANY(&id->nid) || id->pid == LNET_PID_ANY ||
+	    !ipaddr)
 		return 0;
 
 	return count ? 0 : -ENOENT;
@@ -1611,15 +1625,15 @@ struct ksock_peer_ni *
 	 * The router is telling me she's been notified of a change in
 	 * gateway state....
 	 */
-	struct lnet_process_id id = {0};
-
-	id.nid = gw_nid;
-	id.pid = LNET_PID_ANY;
+	struct lnet_processid id = {
+		.pid	= LNET_PID_ANY,
+	};
 
 	CDEBUG(D_NET, "gw %s down\n", libcfs_nid2str(gw_nid));
 
+	lnet_nid4_to_nid(gw_nid, &id.nid);
 	/* If the gateway crashed, close all open connections... */
-	ksocknal_close_matching_conns(id, 0);
+	ksocknal_close_matching_conns(&id, 0);
 	return;
 
 	/*
@@ -1658,15 +1672,15 @@ struct ksock_peer_ni *
 	}
 }
 
-static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
+static int ksocknal_push(struct lnet_ni *ni, struct lnet_processid *id)
 {
 	int lo;
 	int hi;
 	int bkt;
 	int rc = -ENOENT;
 
-	if (id.nid != LNET_NID_ANY) {
-		lo = hash_min(nidhash(id.nid),
+	if (!LNET_NID_IS_ANY(&id->nid)) {
+		lo = hash_min(nidhash(&id->nid),
 			      HASH_BITS(ksocknal_data.ksnd_peers));
 		hi = lo;
 	} else {
@@ -1685,10 +1699,11 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 			hlist_for_each_entry(peer_ni,
 					     &ksocknal_data.ksnd_peers[bkt],
 					     ksnp_list) {
-				if (!((id.nid == LNET_NID_ANY ||
-				       id.nid == peer_ni->ksnp_id.nid) &&
-				      (id.pid == LNET_PID_ANY ||
-				       id.pid == peer_ni->ksnp_id.pid)))
+				if (!((LNET_NID_IS_ANY(&id->nid) ||
+				       nid_same(&id->nid,
+						&peer_ni->ksnp_id.nid)) &&
+				      (id->pid == LNET_PID_ANY ||
+				       id->pid == peer_ni->ksnp_id.pid)))
 					continue;
 
 				if (i++ == peer_off) {
@@ -1712,7 +1727,8 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 int
 ksocknal_ctl(struct lnet_ni *ni, unsigned int cmd, void *arg)
 {
-	struct lnet_process_id id = {0};
+	struct lnet_process_id id4 = {};
+	struct lnet_processid id = {};
 	struct libcfs_ioctl_data *data = arg;
 	int rc;
 
@@ -1752,32 +1768,34 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 		int share_count = 0;
 
 		rc = ksocknal_get_peer_info(ni, data->ioc_count,
-					    &id, &myip, &ip, &port,
+					    &id4, &myip, &ip, &port,
 					    &conn_count,  &share_count);
 		if (rc)
 			return rc;
 
-		data->ioc_nid  = id.nid;
+		data->ioc_nid = id4.nid;
 		data->ioc_count = share_count;
 		data->ioc_u32[0] = ip;
 		data->ioc_u32[1] = port;
 		data->ioc_u32[2] = myip;
 		data->ioc_u32[3] = conn_count;
-		data->ioc_u32[4] = id.pid;
+		data->ioc_u32[4] = id4.pid;
 		return 0;
 	}
 
-	case IOC_LIBCFS_ADD_PEER:
-		id.nid = data->ioc_nid;
-		id.pid = LNET_PID_LUSTRE;
-		return ksocknal_add_peer(ni, id,
-					 data->ioc_u32[0], /* IP */
-					 data->ioc_u32[1]); /* port */
+	case IOC_LIBCFS_ADD_PEER: {
+		struct sockaddr_in sa = {.sin_family = AF_INET};
 
+		id4.nid = data->ioc_nid;
+		id4.pid = LNET_PID_LUSTRE;
+		sa.sin_addr.s_addr = htonl(data->ioc_u32[0]);
+		sa.sin_port = htons(data->ioc_u32[1]);
+		return ksocknal_add_peer(ni, id4, (struct sockaddr *)&sa);
+	}
 	case IOC_LIBCFS_DEL_PEER:
-		id.nid = data->ioc_nid;
-		id.pid = LNET_PID_ANY;
-		return ksocknal_del_peer(ni, id,
+		id4.nid = data->ioc_nid;
+		id4.pid = LNET_PID_ANY;
+		return ksocknal_del_peer(ni, id4,
 					 data->ioc_u32[0]); /* IP */
 
 	case IOC_LIBCFS_GET_CONN: {
@@ -1797,7 +1815,7 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 		ksocknal_lib_get_conn_tunables(conn, &txmem, &rxmem, &nagle);
 
 		data->ioc_count = txmem;
-		data->ioc_nid = conn->ksnc_peer->ksnp_id.nid;
+		data->ioc_nid = lnet_nid_to_nid4(&conn->ksnc_peer->ksnp_id.nid);
 		data->ioc_flags = nagle;
 		if (psa->sin_family == AF_INET)
 			data->ioc_u32[0] = ntohl(psa->sin_addr.s_addr);
@@ -1818,9 +1836,9 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 	}
 
 	case IOC_LIBCFS_CLOSE_CONNECTION:
-		id.nid = data->ioc_nid;
+		lnet_nid4_to_nid(data->ioc_nid, &id.nid);
 		id.pid = LNET_PID_ANY;
-		return ksocknal_close_matching_conns(id,
+		return ksocknal_close_matching_conns(&id,
 						     data->ioc_u32[0]);
 
 	case IOC_LIBCFS_REGISTER_MYNID:
@@ -1835,9 +1853,9 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 		return -EINVAL;
 
 	case IOC_LIBCFS_PUSH_CONNECTION:
-		id.nid = data->ioc_nid;
+		lnet_nid4_to_nid(data->ioc_nid, &id.nid);
 		id.pid = LNET_PID_ANY;
-		return ksocknal_push(ni, id);
+		return ksocknal_push(ni, &id);
 
 	default:
 		return -EINVAL;
@@ -2145,7 +2163,7 @@ static int ksocknal_device_event(struct notifier_block *unused,
 			continue;
 
 		CWARN("Active peer_ni on shutdown: %s, ref %d, closing %d, accepting %d, err %d, zcookie %llu, txq %d, zc_req %d\n",
-		      libcfs_id2str(peer_ni->ksnp_id),
+		      libcfs_idstr(&peer_ni->ksnp_id),
 		      refcount_read(&peer_ni->ksnp_refcount),
 		      peer_ni->ksnp_closing,
 		      peer_ni->ksnp_accepting, peer_ni->ksnp_error,
diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h
index 7a55492..fe1bc7d 100644
--- a/net/lnet/klnds/socklnd/socklnd.h
+++ b/net/lnet/klnds/socklnd/socklnd.h
@@ -415,7 +415,7 @@ struct ksock_peer_ni {
 	time64_t		ksnp_last_alive;	/* when (in seconds) I was last
 							 * alive
 							 */
-	struct lnet_process_id	ksnp_id;		/* who's on the other end(s) */
+	struct lnet_processid	ksnp_id;		/* who's on the other end(s) */
 	refcount_t		ksnp_refcount;		/* # users */
 	int			ksnp_closing;		/* being closed */
 	int			ksnp_accepting;		/* # passive connections pending
@@ -625,12 +625,12 @@ int ksocknal_recv(struct lnet_ni *ni, void *private, struct lnet_msg *lntmsg,
 		  int delayed, struct iov_iter *to, unsigned int rlen);
 int ksocknal_accept(struct lnet_ni *ni, struct socket *sock);
 
-int ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id, u32 ip,
-		      int port);
+int ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id,
+		      struct sockaddr *addr);
 struct ksock_peer_ni *ksocknal_find_peer_locked(struct lnet_ni *ni,
-					        struct lnet_process_id id);
+						struct lnet_processid *id);
 struct ksock_peer_ni *ksocknal_find_peer(struct lnet_ni *ni,
-				         struct lnet_process_id id);
+					 struct lnet_processid *id);
 void ksocknal_peer_failed(struct ksock_peer_ni *peer_ni);
 int ksocknal_create_conn(struct lnet_ni *ni, struct ksock_conn_cb *conn_cb,
 			 struct socket *sock, int type);
@@ -640,12 +640,12 @@ int ksocknal_create_conn(struct lnet_ni *ni, struct ksock_conn_cb *conn_cb,
 int ksocknal_close_peer_conns_locked(struct ksock_peer_ni *peer_ni,
 				     struct sockaddr *peer, int why);
 int ksocknal_close_conn_and_siblings(struct ksock_conn *conn, int why);
-int ksocknal_close_matching_conns(struct lnet_process_id id, u32 ipaddr);
+int ksocknal_close_matching_conns(struct lnet_processid *id, u32 ipaddr);
 struct ksock_conn *ksocknal_find_conn_locked(struct ksock_peer_ni *peer_ni,
 					     struct ksock_tx *tx, int nonblk);
 
 int ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
-			   struct lnet_process_id id);
+			   struct lnet_processid *id);
 struct ksock_tx *ksocknal_alloc_tx(int type, int size);
 void ksocknal_free_tx(struct ksock_tx *tx);
 struct ksock_tx *ksocknal_alloc_tx_noop(u64 cookie, int nonblk);
diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c
index e6cd976..a2298275 100644
--- a/net/lnet/klnds/socklnd/socklnd_cb.c
+++ b/net/lnet/klnds/socklnd/socklnd_cb.c
@@ -542,7 +542,7 @@ struct ksock_tx *
 			break;
 		}
 		CDEBUG(D_NET, "[%p] Error %d on write to %s ip %pISp\n",
-		       conn, rc, libcfs_id2str(conn->ksnc_peer->ksnp_id),
+		       conn, rc, libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 		       &conn->ksnc_peeraddr);
 	}
 
@@ -677,7 +677,7 @@ struct ksock_conn *
 	LASSERT(!conn->ksnc_closing);
 
 	CDEBUG(D_NET, "Sending to %s ip %pISp\n",
-	       libcfs_id2str(conn->ksnc_peer->ksnp_id),
+	       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 	       &conn->ksnc_peeraddr);
 
 	ksocknal_tx_prep(conn, tx);
@@ -804,10 +804,11 @@ struct ksock_conn_cb *
 
 int
 ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
-		       struct lnet_process_id id)
+		       struct lnet_processid *id)
 {
 	struct ksock_peer_ni *peer_ni;
 	struct ksock_conn *conn;
+	struct sockaddr_in sa;
 	rwlock_t *g_lock;
 	int retry;
 	int rc;
@@ -846,23 +847,31 @@ struct ksock_conn_cb *
 
 		write_unlock_bh(g_lock);
 
-		if (id.pid & LNET_PID_USERFLAG) {
+		if (id->pid & LNET_PID_USERFLAG) {
 			CERROR("Refusing to create a connection to userspace process %s\n",
-			       libcfs_id2str(id));
+			       libcfs_idstr(id));
 			return -EHOSTUNREACH;
 		}
 
 		if (retry) {
-			CERROR("Can't find peer_ni %s\n", libcfs_id2str(id));
+			CERROR("Can't find peer_ni %s\n", libcfs_idstr(id));
 			return -EHOSTUNREACH;
 		}
 
-		rc = ksocknal_add_peer(ni, id,
-				       LNET_NIDADDR(id.nid),
-				       lnet_acceptor_port());
+		memset(&sa, 0, sizeof(sa));
+		sa.sin_family = AF_INET;
+		sa.sin_addr.s_addr = id->nid.nid_addr[0];
+		sa.sin_port = htons(lnet_acceptor_port());
+		{
+			struct lnet_process_id id4 = {
+				.pid = id->pid,
+				.nid = lnet_nid_to_nid4(&id->nid),
+			};
+			rc = ksocknal_add_peer(ni, id4, (struct sockaddr *)&sa);
+		}
 		if (rc) {
 			CERROR("Can't add peer_ni %s: %d\n",
-			       libcfs_id2str(id), rc);
+			       libcfs_idstr(id), rc);
 			return rc;
 		}
 	}
@@ -892,7 +901,7 @@ struct ksock_conn_cb *
 	write_unlock_bh(g_lock);
 
 	/* NB Routes may be ignored if connections to them failed recently */
-	CNETERR("No usable routes to %s\n", libcfs_id2str(id));
+	CNETERR("No usable routes to %s\n", libcfs_idstr(id));
 	tx->tx_hstatus = LNET_MSG_STATUS_REMOTE_ERROR;
 	return -EHOSTUNREACH;
 }
@@ -902,7 +911,7 @@ struct ksock_conn_cb *
 {
 	unsigned int mpflag = 0;
 	int type = lntmsg->msg_type;
-	struct lnet_process_id target = lntmsg->msg_target;
+	struct lnet_processid target;
 	unsigned int payload_niov = lntmsg->msg_niov;
 	struct bio_vec *payload_kiov = lntmsg->msg_kiov;
 	unsigned int payload_offset = lntmsg->msg_offset;
@@ -911,12 +920,14 @@ struct ksock_conn_cb *
 	int desc_size;
 	int rc;
 
-	/*
-	 * NB 'private' is different depending on what we're sending.
+	/* NB 'private' is different depending on what we're sending.
 	 * Just ignore it...
 	 */
+	target.pid = lntmsg->msg_target.pid;
+	lnet_nid4_to_nid(lntmsg->msg_target.nid, &target.nid);
+
 	CDEBUG(D_NET, "sending %u bytes in %d frags to %s\n",
-	       payload_nob, payload_niov, libcfs_id2str(target));
+	       payload_nob, payload_niov, libcfs_idstr(&target));
 
 	LASSERT(!payload_nob || payload_niov > 0);
 	LASSERT(payload_niov <= LNET_MAX_IOV);
@@ -954,7 +965,7 @@ struct ksock_conn_cb *
 	tx->tx_msg.ksm_zc_cookies[1] = 0;
 
 	/* The first fragment will be set later in pro_pack */
-	rc = ksocknal_launch_packet(ni, tx, target);
+	rc = ksocknal_launch_packet(ni, tx, &target);
 	if (mpflag)
 		memalloc_noreclaim_restore(mpflag);
 
@@ -1051,7 +1062,7 @@ struct ksock_conn_cb *
 {
 	struct kvec *kvec = conn->ksnc_rx_iov_space;
 	struct lnet_hdr *lhdr;
-	struct lnet_process_id *id;
+	struct lnet_processid *id;
 	int rc;
 
 	LASSERT(refcount_read(&conn->ksnc_conn_refcount) > 0);
@@ -1067,19 +1078,19 @@ struct ksock_conn_cb *
 		rc = ksocknal_receive(conn);
 
 		if (rc <= 0) {
-			struct lnet_process_id ksnp_id;
+			struct lnet_processid *ksnp_id;
 
-			ksnp_id = conn->ksnc_peer->ksnp_id;
+			ksnp_id = &conn->ksnc_peer->ksnp_id;
 
 			LASSERT(rc != -EAGAIN);
 
 			if (!rc)
 				CDEBUG(D_NET, "[%p] EOF from %s ip %pISp\n",
-				       conn, libcfs_id2str(ksnp_id),
+				       conn, libcfs_idstr(ksnp_id),
 				       &conn->ksnc_peeraddr);
 			else if (!conn->ksnc_closing)
 				CERROR("[%p] Error %d on read from %s ip %pISp\n",
-				       conn, rc, libcfs_id2str(ksnp_id),
+				       conn, rc, libcfs_idstr(ksnp_id),
 				       &conn->ksnc_peeraddr);
 
 			/* it's not an error if conn is being closed */
@@ -1105,7 +1116,7 @@ struct ksock_conn_cb *
 		if (conn->ksnc_msg.ksm_type != KSOCK_MSG_NOOP &&
 		    conn->ksnc_msg.ksm_type != KSOCK_MSG_LNET) {
 			CERROR("%s: Unknown message type: %x\n",
-			       libcfs_id2str(conn->ksnc_peer->ksnp_id),
+			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       conn->ksnc_msg.ksm_type);
 			ksocknal_new_packet(conn, 0);
 			ksocknal_close_conn_and_siblings(conn, -EPROTO);
@@ -1117,7 +1128,7 @@ struct ksock_conn_cb *
 		    conn->ksnc_msg.ksm_csum != conn->ksnc_rx_csum) {
 			/* NOOP Checksum error */
 			CERROR("%s: Checksum error, wire:0x%08X data:0x%08X\n",
-			       libcfs_id2str(conn->ksnc_peer->ksnp_id),
+			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       conn->ksnc_msg.ksm_csum, conn->ksnc_rx_csum);
 			ksocknal_new_packet(conn, 0);
 			ksocknal_close_conn_and_siblings(conn, -EPROTO);
@@ -1133,12 +1144,12 @@ struct ksock_conn_cb *
 				cookie = conn->ksnc_msg.ksm_zc_cookies[0];
 
 			rc = conn->ksnc_proto->pro_handle_zcack(conn, cookie,
-					       conn->ksnc_msg.ksm_zc_cookies[1]);
-
+								conn->ksnc_msg.ksm_zc_cookies[1]);
 			if (rc) {
 				CERROR("%s: Unknown ZC-ACK cookie: %llu, %llu\n",
-				       libcfs_id2str(conn->ksnc_peer->ksnp_id),
-				       cookie, conn->ksnc_msg.ksm_zc_cookies[1]);
+				       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
+				       cookie,
+				       conn->ksnc_msg.ksm_zc_cookies[1]);
 				ksocknal_new_packet(conn, 0);
 				ksocknal_close_conn_and_siblings(conn, -EPROTO);
 				return rc;
@@ -1172,7 +1183,7 @@ struct ksock_conn_cb *
 
 			/* Substitute process ID assigned at connection time */
 			lhdr->src_pid = cpu_to_le32(id->pid);
-			lhdr->src_nid = cpu_to_le64(id->nid);
+			lhdr->src_nid = cpu_to_le64(lnet_nid_to_nid4(&id->nid));
 		}
 
 		conn->ksnc_rx_state = SOCKNAL_RX_PARSE;
@@ -1180,7 +1191,8 @@ struct ksock_conn_cb *
 
 		rc = lnet_parse(conn->ksnc_peer->ksnp_ni,
 				&conn->ksnc_msg.ksm_u.lnetmsg.ksnm_hdr,
-				conn->ksnc_peer->ksnp_id.nid, conn, 0);
+				lnet_nid_to_nid4(&conn->ksnc_peer->ksnp_id.nid),
+				conn, 0);
 		if (rc < 0) {
 			/* I just received garbage: give up on this conn */
 			ksocknal_new_packet(conn, 0);
@@ -1207,7 +1219,7 @@ struct ksock_conn_cb *
 		    conn->ksnc_msg.ksm_csum &&  /* has checksum */
 		    conn->ksnc_msg.ksm_csum != conn->ksnc_rx_csum) {
 			CERROR("%s: Checksum error, wire:0x%08X data:0x%08X\n",
-			       libcfs_id2str(conn->ksnc_peer->ksnp_id),
+			       libcfs_idstr(&conn->ksnc_peer->ksnp_id),
 			       conn->ksnc_msg.ksm_csum, conn->ksnc_rx_csum);
 			rc = -EIO;
 		}
@@ -1219,9 +1231,10 @@ struct ksock_conn_cb *
 			id = &conn->ksnc_peer->ksnp_id;
 
 			rc = conn->ksnc_proto->pro_handle_zcreq(conn,
-					conn->ksnc_msg.ksm_zc_cookies[0],
-					*ksocknal_tunables.ksnd_nonblk_zcack ||
-					le64_to_cpu(lhdr->src_nid) != id->nid);
+								conn->ksnc_msg.ksm_zc_cookies[0],
+								*ksocknal_tunables.ksnd_nonblk_zcack ||
+								le64_to_cpu(lhdr->src_nid) !=
+								lnet_nid_to_nid4(&id->nid));
 		}
 
 		if (rc && conn->ksnc_lnet_msg)
@@ -1796,7 +1809,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		if (peer_ni->ksnp_accepting > 0) {
 			CDEBUG(D_NET,
 			       "peer_ni %s(%d) already connecting to me, retry later.\n",
-			       libcfs_nid2str(peer_ni->ksnp_id.nid),
+			       libcfs_nidstr(&peer_ni->ksnp_id.nid),
 			       peer_ni->ksnp_accepting);
 			retry_later = true;
 		}
@@ -1820,13 +1833,13 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 
 		if (ktime_get_seconds() >= deadline) {
 			rc = -ETIMEDOUT;
-			lnet_connect_console_error(rc, peer_ni->ksnp_id.nid,
-						   (struct sockaddr *)
-						   &conn_cb->ksnr_addr);
+			lnet_connect_console_error(rc,
+						   lnet_nid_to_nid4(&peer_ni->ksnp_id.nid),
+						   (struct sockaddr *)&conn_cb->ksnr_addr);
 			goto failed;
 		}
 
-		sock = lnet_connect(peer_ni->ksnp_id.nid,
+		sock = lnet_connect(lnet_nid_to_nid4(&peer_ni->ksnp_id.nid),
 				    conn_cb->ksnr_myiface,
 				    (struct sockaddr *)&conn_cb->ksnr_addr,
 				    peer_ni->ksnp_ni->ni_net_ns);
@@ -1838,9 +1851,9 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		rc = ksocknal_create_conn(peer_ni->ksnp_ni, conn_cb, sock,
 					  type);
 		if (rc < 0) {
-			lnet_connect_console_error(rc, peer_ni->ksnp_id.nid,
-						   (struct sockaddr *)
-						   &conn_cb->ksnr_addr);
+			lnet_connect_console_error(rc,
+						   lnet_nid_to_nid4(&peer_ni->ksnp_id.nid),
+						   (struct sockaddr *)&conn_cb->ksnr_addr);
 			goto failed;
 		}
 
@@ -1851,7 +1864,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		retry_later = (rc);
 		if (retry_later)
 			CDEBUG(D_NET, "peer_ni %s: conn race, retry later.\n",
-			       libcfs_nid2str(peer_ni->ksnp_id.nid));
+			       libcfs_nidstr(&peer_ni->ksnp_id.nid));
 
 		write_lock_bh(&ksocknal_data.ksnd_global_lock);
 	}
@@ -2191,18 +2204,18 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 			switch (error) {
 			case ECONNRESET:
 				CNETERR("A connection with %s (%pISp) was reset; it may have rebooted.\n",
-					libcfs_id2str(peer_ni->ksnp_id),
+					libcfs_idstr(&peer_ni->ksnp_id),
 					&conn->ksnc_peeraddr);
 				break;
 			case ETIMEDOUT:
 				CNETERR("A connection with %s (%pISp) timed out; the network or node may be down.\n",
-					libcfs_id2str(peer_ni->ksnp_id),
+					libcfs_idstr(&peer_ni->ksnp_id),
 					&conn->ksnc_peeraddr);
 				break;
 			default:
 				CNETERR("An unexpected network error %d occurred with %s (%pISp\n",
 					error,
-					libcfs_id2str(peer_ni->ksnp_id),
+					libcfs_idstr(&peer_ni->ksnp_id),
 					&conn->ksnc_peeraddr);
 				break;
 			}
@@ -2215,7 +2228,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 			/* Timed out incomplete incoming message */
 			ksocknal_conn_addref(conn);
 			CNETERR("Timeout receiving from %s (%pISp), state %d wanted %zd left %d\n",
-				libcfs_id2str(peer_ni->ksnp_id),
+				libcfs_idstr(&peer_ni->ksnp_id),
 				&conn->ksnc_peeraddr,
 				conn->ksnc_rx_state,
 				iov_iter_count(&conn->ksnc_rx_to),
@@ -2236,7 +2249,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 				tx->tx_hstatus =
 					LNET_MSG_STATUS_LOCAL_TIMEOUT;
 			CNETERR("Timeout sending data to %s (%pISp) the network or that node may be down.\n",
-				libcfs_id2str(peer_ni->ksnp_id),
+				libcfs_idstr(&peer_ni->ksnp_id),
 				&conn->ksnc_peeraddr);
 			return conn;
 		}
@@ -2322,7 +2335,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		return -ENOMEM;
 	}
 
-	if (!ksocknal_launch_packet(peer_ni->ksnp_ni, tx, peer_ni->ksnp_id)) {
+	if (!ksocknal_launch_packet(peer_ni->ksnp_ni, tx, &peer_ni->ksnp_id)) {
 		read_lock(&ksocknal_data.ksnd_global_lock);
 		return 1;
 	}
@@ -2423,7 +2436,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		read_unlock(&ksocknal_data.ksnd_global_lock);
 
 		CERROR("Total %d stale ZC_REQs for peer_ni %s detected; the oldest(%p) timed out %lld secs ago, resid: %d, wmem: %d\n",
-		       n, libcfs_nid2str(peer_ni->ksnp_id.nid), tx_stale,
+		       n, libcfs_nidstr(&peer_ni->ksnp_id.nid), tx_stale,
 		       ktime_get_seconds() - deadline,
 		       resid, conn->ksnc_sock->sk->sk_wmem_queued);
 
diff --git a/net/lnet/klnds/socklnd/socklnd_proto.c b/net/lnet/klnds/socklnd/socklnd_proto.c
index 0a6072b..c3ba070 100644
--- a/net/lnet/klnds/socklnd/socklnd_proto.c
+++ b/net/lnet/klnds/socklnd/socklnd_proto.c
@@ -189,7 +189,7 @@
 	if (cookie == tx->tx_msg.ksm_zc_cookies[0] ||
 	    cookie == tx->tx_msg.ksm_zc_cookies[1]) {
 		CWARN("%s: duplicated ZC cookie: %llu\n",
-		      libcfs_id2str(conn->ksnc_peer->ksnp_id), cookie);
+		      libcfs_idstr(&conn->ksnc_peer->ksnp_id), cookie);
 		return 1; /* XXX return error in the future */
 	}
 
@@ -243,14 +243,14 @@
 		}
 
 	} else {
-		/*
-		 * ksm_zc_cookies[0] < ksm_zc_cookies[1],
-		 * it is range of cookies
+		/* ksm_zc_cookies[0] < ksm_zc_cookies[1], it is range
+		 * of cookies
 		 */
 		if (cookie >= tx->tx_msg.ksm_zc_cookies[0] &&
 		    cookie <= tx->tx_msg.ksm_zc_cookies[1]) {
 			CWARN("%s: duplicated ZC cookie: %llu\n",
-			      libcfs_id2str(conn->ksnc_peer->ksnp_id), cookie);
+			      libcfs_idstr(&conn->ksnc_peer->ksnp_id),
+			      cookie);
 			return 1; /* XXX: return error in the future */
 		}
 
@@ -398,8 +398,8 @@
 	if (!tx)
 		return -ENOMEM;
 
-	rc = ksocknal_launch_packet(peer_ni->ksnp_ni, tx, peer_ni->ksnp_id);
-	if (!rc)
+	rc = ksocknal_launch_packet(peer_ni->ksnp_ni, tx, &peer_ni->ksnp_id);
+	if (rc == 0)
 		return 0;
 
 	ksocknal_free_tx(tx);
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index f5b022f..1f053b3 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -1509,22 +1509,35 @@ struct lnet_net *
 }
 
 int
-lnet_cpt_of_nid(lnet_nid_t nid4, struct lnet_ni *ni)
+lnet_nid2cpt(struct lnet_nid *nid, struct lnet_ni *ni)
 {
 	int cpt;
 	int cpt2;
-	struct lnet_nid nid;
 
 	if (LNET_CPT_NUMBER == 1)
 		return 0; /* the only one */
 
-	lnet_nid4_to_nid(nid4, &nid);
 	cpt = lnet_net_lock_current();
-	cpt2 = lnet_cpt_of_nid_locked(&nid, ni);
+
+	cpt2 = lnet_cpt_of_nid_locked(nid, ni);
+
 	lnet_net_unlock(cpt);
 
 	return cpt2;
 }
+EXPORT_SYMBOL(lnet_nid2cpt);
+
+int
+lnet_cpt_of_nid(lnet_nid_t nid4, struct lnet_ni *ni)
+{
+	struct lnet_nid nid;
+
+	if (LNET_CPT_NUMBER == 1)
+		return 0; /* the only one */
+
+	lnet_nid4_to_nid(nid4, &nid);
+	return lnet_nid2cpt(&nid, ni);
+}
 EXPORT_SYMBOL(lnet_cpt_of_nid);
 
 int
diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c
index 08f828b..d91815d 100644
--- a/net/lnet/lnet/nidstrings.c
+++ b/net/lnet/lnet/nidstrings.c
@@ -1139,6 +1139,24 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 }
 EXPORT_SYMBOL(libcfs_id2str);
 
+char *
+libcfs_idstr(struct lnet_processid *id)
+{
+	char *str = libcfs_next_nidstring();
+
+	if (id->pid == LNET_PID_ANY) {
+		snprintf(str, LNET_NIDSTR_SIZE,
+			 "LNET_PID_ANY-%s", libcfs_nidstr(&id->nid));
+		return str;
+	}
+
+	snprintf(str, LNET_NIDSTR_SIZE, "%s%u-%s",
+		 ((id->pid & LNET_PID_USERFLAG) != 0) ? "U" : "",
+		 (id->pid & ~LNET_PID_USERFLAG), libcfs_nidstr(&id->nid));
+	return str;
+}
+EXPORT_SYMBOL(libcfs_idstr);
+
 int
 libcfs_str2anynid(lnet_nid_t *nidp, const char *str)
 {
-- 
1.8.3.1


* [lustre-devel] [PATCH 10/24] lnet: enhance connect/accept to support large addr
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (8 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 09/24] lnet: introduce lnet_processid for ksock_peer_ni James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 11/24] lnet: change lr_nid to struct lnet_nid James Simmons
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

This patch introduces version 2 of the acceptor protocol.  This
version carries a 'struct lnet_nid' rather than an 'lnet_nid_t'.

lnet_connect() now accepts a struct lnet_nid and uses version 2 only
when the NID does not fit the 4-byte address format.  lnet_accept()
accepts either v1 or v2.
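
As a rough userspace illustration of the version selection described
above (not the kernel code itself), the sketch below builds either
request format and reports how many NID bytes a receiver would still
need to read for a given version.  All sketch_* names are hypothetical,
the magic value is a placeholder, and the NID field layout is an
assumption chosen only to match the 20-byte size that the patch
asserts.  Fields are kept in host byte order for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ACCEPTOR_MAGIC	0xacce7100u	/* placeholder value */
#define SKETCH_VERSION_1	1u		/* 8-byte NID on the wire */
#define SKETCH_VERSION_16	2u		/* 20-byte NID on the wire */

/* Assumed layout; only the 20-byte total comes from the patch. */
struct sketch_nid {
	uint8_t		nid_size;	/* assumed: address bytes beyond 4 */
	uint8_t		nid_type;
	uint16_t	nid_num;
	uint32_t	nid_addr[4];
} __attribute__((packed));

struct sketch_connreq_v1 {
	uint32_t	acr_magic;
	uint32_t	acr_version;
	uint64_t	acr_nid;
} __attribute__((packed));

struct sketch_connreq_v2 {
	uint32_t		acr_magic;
	uint32_t		acr_version;
	struct sketch_nid	acr_nid;
} __attribute__((packed));

/* A NID whose address fits in 4 bytes can use the old format. */
static int sketch_nid_is_nid4(const struct sketch_nid *nid)
{
	return nid->nid_size == 0;
}

/* Simplified packing of type, net number and address into 64 bits. */
static uint64_t sketch_nid_to_nid4(const struct sketch_nid *nid)
{
	return ((uint64_t)nid->nid_type << 48) |
	       ((uint64_t)nid->nid_num << 32) |
	       nid->nid_addr[0];
}

/* Fill buf with a v1 or v2 request; return its size, or 0 if buf is too small. */
static size_t sketch_build_connreq(const struct sketch_nid *nid,
				   unsigned char *buf, size_t buflen)
{
	if (sketch_nid_is_nid4(nid)) {
		struct sketch_connreq_v1 cr1;

		if (buflen < sizeof(cr1))
			return 0;
		cr1.acr_magic = SKETCH_ACCEPTOR_MAGIC;
		cr1.acr_version = SKETCH_VERSION_1;
		cr1.acr_nid = sketch_nid_to_nid4(nid);
		memcpy(buf, &cr1, sizeof(cr1));
		return sizeof(cr1);
	} else {
		struct sketch_connreq_v2 cr2;

		if (buflen < sizeof(cr2))
			return 0;
		cr2.acr_magic = SKETCH_ACCEPTOR_MAGIC;
		cr2.acr_version = SKETCH_VERSION_16;
		cr2.acr_nid = *nid;
		memcpy(buf, &cr2, sizeof(cr2));
		return sizeof(cr2);
	}
}

/* Read back the version field, which sits at offset 4 in both formats. */
static uint32_t sketch_connreq_version(const unsigned char *buf)
{
	uint32_t version;

	memcpy(&version, buf + 4, sizeof(version));
	return version;
}

/* Receiver side: NID bytes left to read after magic+version, -1 if unknown. */
static int sketch_nid_bytes_to_read(uint32_t version)
{
	switch (version) {
	case SKETCH_VERSION_1:
		return (int)sizeof(uint64_t);
	case SKETCH_VERSION_16:
		return (int)sizeof(struct sketch_nid);
	default:
		return -1;
	}
}
```

Because both formats share the same first 8 bytes, a receiver can read
magic and version first and only then decide how much more to read,
which is what lets v1 and v2 peers coexist.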

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 9b0738c53c962f426 ("LU-10391 lnet: enhance connect/accept to support large addr")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/42105
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h       |   5 +-
 include/uapi/linux/lnet/lnet-idl.h  |  11 +++-
 net/lnet/klnds/socklnd/socklnd_cb.c |   8 +--
 net/lnet/lnet/acceptor.c            | 111 +++++++++++++++++++++++-------------
 net/lnet/lnet/api-ni.c              |   9 +++
 5 files changed, 97 insertions(+), 47 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index 3842976..9e7d0b8 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -489,6 +489,7 @@ unsigned int lnet_nid_cpt_hash(struct lnet_nid *nid,
 struct lnet_ni *lnet_nid2ni_addref(lnet_nid_t nid);
 struct lnet_ni *lnet_net2ni_locked(u32 net, int cpt);
 struct lnet_ni *lnet_net2ni_addref(u32 net);
+struct lnet_ni *lnet_nid_to_ni_addref(struct lnet_nid *nid);
 struct lnet_net *lnet_get_net_locked(u32 net_id);
 
 extern unsigned int lnet_response_tracking;
@@ -742,9 +743,9 @@ void lnet_copy_kiov2iter(struct iov_iter *to,
 void lnet_register_lnd(struct lnet_lnd *lnd);
 void lnet_unregister_lnd(struct lnet_lnd *lnd);
 
-struct socket *lnet_connect(lnet_nid_t peer_nid, int interface,
+struct socket *lnet_connect(struct lnet_nid *peer_nid, int interface,
 			    struct sockaddr *peeraddr, struct net *ns);
-void lnet_connect_console_error(int rc, lnet_nid_t peer_nid,
+void lnet_connect_console_error(int rc, struct lnet_nid *peer_nid,
 				struct sockaddr *sa);
 int lnet_count_acceptor_nets(void);
 int lnet_acceptor_timeout(void);
diff --git a/include/uapi/linux/lnet/lnet-idl.h b/include/uapi/linux/lnet/lnet-idl.h
index 3fc0df1..b14723e 100644
--- a/include/uapi/linux/lnet/lnet-idl.h
+++ b/include/uapi/linux/lnet/lnet-idl.h
@@ -191,13 +191,22 @@ struct lnet_magicversion {
 
 /* Acceptor connection request */
 struct lnet_acceptor_connreq {
-	__u32	acr_magic;	/* PTL_ACCEPTOR_PROTO_MAGIC */
+	__u32	acr_magic;	/* LNET_PROTO_ACCEPTOR_MAGIC */
 	__u32	acr_version;	/* protocol version */
 	__u64	acr_nid;	/* target NID */
 } __attribute__((packed));
 
 #define LNET_PROTO_ACCEPTOR_VERSION	1
 
+struct lnet_acceptor_connreq_v2 {
+	__u32			acr_magic;	/* LNET_PROTO_ACCEPTOR_MAGIC */
+	__u32			acr_version;	/* protocol version - 2 */
+	struct lnet_nid		acr_nid;	/* target NID */
+} __attribute__((packed));
+
+/* For use with 16-byte addresses */
+#define LNET_PROTO_ACCEPTOR_VERSION_16  2
+
 struct lnet_counters_common {
 	__u32	lcc_msgs_alloc;
 	__u32	lcc_msgs_max;
diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c
index a2298275..edc584a 100644
--- a/net/lnet/klnds/socklnd/socklnd_cb.c
+++ b/net/lnet/klnds/socklnd/socklnd_cb.c
@@ -1833,13 +1833,12 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 
 		if (ktime_get_seconds() >= deadline) {
 			rc = -ETIMEDOUT;
-			lnet_connect_console_error(rc,
-						   lnet_nid_to_nid4(&peer_ni->ksnp_id.nid),
+			lnet_connect_console_error(rc, &peer_ni->ksnp_id.nid,
 						   (struct sockaddr *)&conn_cb->ksnr_addr);
 			goto failed;
 		}
 
-		sock = lnet_connect(lnet_nid_to_nid4(&peer_ni->ksnp_id.nid),
+		sock = lnet_connect(&peer_ni->ksnp_id.nid,
 				    conn_cb->ksnr_myiface,
 				    (struct sockaddr *)&conn_cb->ksnr_addr,
 				    peer_ni->ksnp_ni->ni_net_ns);
@@ -1851,8 +1850,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		rc = ksocknal_create_conn(peer_ni->ksnp_ni, conn_cb, sock,
 					  type);
 		if (rc < 0) {
-			lnet_connect_console_error(rc,
-						   lnet_nid_to_nid4(&peer_ni->ksnp_id.nid),
+			lnet_connect_console_error(rc, &peer_ni->ksnp_id.nid,
 						   (struct sockaddr *)&conn_cb->ksnr_addr);
 			goto failed;
 		}
diff --git a/net/lnet/lnet/acceptor.c b/net/lnet/lnet/acceptor.c
index 243c34f..2306760 100644
--- a/net/lnet/lnet/acceptor.c
+++ b/net/lnet/lnet/acceptor.c
@@ -85,54 +85,58 @@
 EXPORT_SYMBOL(lnet_acceptor_timeout);
 
 void
-lnet_connect_console_error(int rc, lnet_nid_t peer_nid,
+lnet_connect_console_error(int rc, struct lnet_nid *peer_nid,
 			   struct sockaddr *sa)
 {
 	switch (rc) {
 	/* "normal" errors */
 	case -ECONNREFUSED:
 		CNETERR("Connection to %s at host %pISp was refused: check that Lustre is running on that node.\n",
-			libcfs_nid2str(peer_nid), sa);
+			libcfs_nidstr(peer_nid), sa);
 		break;
 	case -EHOSTUNREACH:
 	case -ENETUNREACH:
 		CNETERR("Connection to %s at host %pIS was unreachable: the network or that node may be down, or Lustre may be misconfigured.\n",
-			libcfs_nid2str(peer_nid), sa);
+			libcfs_nidstr(peer_nid), sa);
 		break;
 	case -ETIMEDOUT:
 		CNETERR("Connection to %s at host %pISp took too long: that node may be hung or experiencing high load.\n",
-			libcfs_nid2str(peer_nid), sa);
+			libcfs_nidstr(peer_nid), sa);
 		break;
 	case -ECONNRESET:
 		LCONSOLE_ERROR_MSG(0x11b,
 				   "Connection to %s at host %pISp was reset: is it running a compatible version of Lustre and is %s one of its NIDs?\n",
-				   libcfs_nid2str(peer_nid), sa,
-				   libcfs_nid2str(peer_nid));
+				   libcfs_nidstr(peer_nid), sa,
+				   libcfs_nidstr(peer_nid));
 		break;
 	case -EPROTO:
 		LCONSOLE_ERROR_MSG(0x11c,
 				   "Protocol error connecting to %s at host %pISp: is it running a compatible version of Lustre?\n",
-				   libcfs_nid2str(peer_nid), sa);
+				   libcfs_nidstr(peer_nid), sa);
 		break;
 	case -EADDRINUSE:
 		LCONSOLE_ERROR_MSG(0x11d,
 				   "No privileged ports available to connect to %s at host %pISp\n",
-				   libcfs_nid2str(peer_nid), sa);
+				   libcfs_nidstr(peer_nid), sa);
 		break;
 	default:
 		LCONSOLE_ERROR_MSG(0x11e,
 				   "Unexpected error %d connecting to %s at host %pISp\n",
-				   rc, libcfs_nid2str(peer_nid), sa);
+				   rc, libcfs_nidstr(peer_nid), sa);
 		break;
 	}
 }
 EXPORT_SYMBOL(lnet_connect_console_error);
 
 struct socket *
-lnet_connect(lnet_nid_t peer_nid, int interface, struct sockaddr *peeraddr,
+lnet_connect(struct lnet_nid *peer_nid, int interface,
+	     struct sockaddr *peeraddr,
 	     struct net *ns)
 {
-	struct lnet_acceptor_connreq cr;
+	struct lnet_acceptor_connreq cr1;
+	struct lnet_acceptor_connreq_v2 cr2;
+	void *cr;
+	int crsize;
 	struct socket *sock;
 	int rc;
 	int port;
@@ -156,20 +160,30 @@ struct socket *
 
 		BUILD_BUG_ON(LNET_PROTO_ACCEPTOR_VERSION != 1);
 
-		cr.acr_magic = LNET_PROTO_ACCEPTOR_MAGIC;
-		cr.acr_version = LNET_PROTO_ACCEPTOR_VERSION;
-		cr.acr_nid = peer_nid;
+		if (nid_is_nid4(peer_nid)) {
+			cr1.acr_magic = LNET_PROTO_ACCEPTOR_MAGIC;
+			cr1.acr_version = LNET_PROTO_ACCEPTOR_VERSION;
+			cr1.acr_nid = lnet_nid_to_nid4(peer_nid);
+			cr = &cr1;
+			crsize = sizeof(cr1);
 
-		if (the_lnet.ln_testprotocompat) {
-			/* single-shot proto check */
-			if (test_and_clear_bit(2, &the_lnet.ln_testprotocompat))
-				cr.acr_version++;
+			if (the_lnet.ln_testprotocompat) {
+				/* single-shot proto check */
+				if (test_and_clear_bit(2, &the_lnet.ln_testprotocompat))
+					cr1.acr_version++;
 
-			if (test_and_clear_bit(3, &the_lnet.ln_testprotocompat))
-				cr.acr_magic = LNET_PROTO_MAGIC;
+				if (test_and_clear_bit(3, &the_lnet.ln_testprotocompat))
+					cr1.acr_magic = LNET_PROTO_MAGIC;
+			}
+		} else {
+			cr2.acr_magic = LNET_PROTO_ACCEPTOR_MAGIC;
+			cr2.acr_version = LNET_PROTO_ACCEPTOR_VERSION_16;
+			cr2.acr_nid = *peer_nid;
+			cr = &cr2;
+			crsize = sizeof(cr2);
 		}
 
-		rc = lnet_sock_write(sock, &cr, sizeof(cr), accept_timeout);
+		rc = lnet_sock_write(sock, cr, crsize, accept_timeout);
 		if (rc)
 			goto failed_sock;
 
@@ -191,7 +205,10 @@ struct socket *
 lnet_accept(struct socket *sock, u32 magic)
 {
 	struct lnet_acceptor_connreq cr;
+	struct lnet_acceptor_connreq_v2 cr2;
+	struct lnet_nid nid;
 	struct sockaddr_storage peer;
+	int peer_version;
 	int rc;
 	int flip;
 	struct lnet_ni *ni;
@@ -249,14 +266,14 @@ struct socket *
 	if (flip)
 		__swab32s(&cr.acr_version);
 
-	if (cr.acr_version != LNET_PROTO_ACCEPTOR_VERSION) {
-		/*
-		 * future version compatibility!
+	switch (cr.acr_version) {
+	default:
+		/* future version compatibility!
 		 * An acceptor-specific protocol rev will first send a version
 		 * query.  I send back my current version to tell her I'm
 		 * "old".
 		 */
-		int peer_version = cr.acr_version;
+		peer_version = cr.acr_version;
 
 		memset(&cr, 0, sizeof(cr));
 		cr.acr_magic = LNET_PROTO_ACCEPTOR_MAGIC;
@@ -267,30 +284,47 @@ struct socket *
 			CERROR("Error sending magic+version in response to version %d from %pIS: %d\n",
 			       peer_version, &peer, rc);
 		return -EPROTO;
-	}
 
-	rc = lnet_sock_read(sock, &cr.acr_nid,
-			    sizeof(cr) -
-			    offsetof(struct lnet_acceptor_connreq, acr_nid),
-			    accept_timeout);
+	case LNET_PROTO_ACCEPTOR_VERSION:
+		rc = lnet_sock_read(sock, &cr.acr_nid,
+				    sizeof(cr) -
+				    offsetof(struct lnet_acceptor_connreq,
+					     acr_nid),
+				    accept_timeout);
+		if (rc)
+			break;
+		if (flip)
+			__swab64s(&cr.acr_nid);
+
+		lnet_nid4_to_nid(cr.acr_nid, &nid);
+		break;
+
+	case LNET_PROTO_ACCEPTOR_VERSION_16:
+		rc = lnet_sock_read(sock, &cr2.acr_nid,
+				    sizeof(cr2) -
+				    offsetof(struct lnet_acceptor_connreq_v2,
+					     acr_nid),
+				    accept_timeout);
+		if (rc)
+			break;
+		nid = cr2.acr_nid;
+		break;
+	}
 	if (rc) {
 		CERROR("Error %d reading connection request from %pIS\n",
 		       rc, &peer);
 		return -EIO;
 	}
 
-	if (flip)
-		__swab64s(&cr.acr_nid);
-
-	ni = lnet_nid2ni_addref(cr.acr_nid);
+	ni = lnet_nid_to_ni_addref(&nid);
 	if (!ni ||			/* no matching net */
-	    lnet_nid_to_nid4(&ni->ni_nid) != cr.acr_nid) {
+	    !nid_same(&ni->ni_nid, &nid)) {
 		/* right NET, wrong NID! */
 		if (ni)
 			lnet_ni_decref(ni);
 		LCONSOLE_ERROR_MSG(0x120,
 				   "Refusing connection from %pIS for %s: No matching NI\n",
-				   &peer, libcfs_nid2str(cr.acr_nid));
+				   &peer, libcfs_nidstr(&nid));
 		return -EPERM;
 	}
 
@@ -299,12 +333,11 @@ struct socket *
 		lnet_ni_decref(ni);
 		LCONSOLE_ERROR_MSG(0x121,
 				   "Refusing connection from %pIS for %s: NI doesn not accept IP connections\n",
-				   &peer, libcfs_nid2str(cr.acr_nid));
+				   &peer, libcfs_nidstr(&nid));
 		return -EPERM;
 	}
 
-	CDEBUG(D_NET, "Accept %s from %pIS\n",
-	       libcfs_nid2str(cr.acr_nid), &peer);
+	CDEBUG(D_NET, "Accept %s from %pIS\n", libcfs_nidstr(&nid), &peer);
 
 	rc = ni->ni_net->net_lnd->lnd_accept(ni, sock);
 
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 1f053b3..31ccb2c 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -810,6 +810,15 @@ static void lnet_assert_wire_constants(void)
 	BUILD_BUG_ON((int)offsetof(struct lnet_acceptor_connreq, acr_nid) != 8);
 	BUILD_BUG_ON((int)sizeof(((struct lnet_acceptor_connreq *)0)->acr_nid) != 8);
 
+	/* Checks for struct lnet_acceptor_connreq_v2 */
+	BUILD_BUG_ON((int)sizeof(struct lnet_acceptor_connreq_v2) != 28);
+	BUILD_BUG_ON((int)offsetof(struct lnet_acceptor_connreq_v2, acr_magic) != 0);
+	BUILD_BUG_ON((int)sizeof(((struct lnet_acceptor_connreq_v2 *)0)->acr_magic) != 4);
+	BUILD_BUG_ON((int)offsetof(struct lnet_acceptor_connreq_v2, acr_version) != 4);
+	BUILD_BUG_ON((int)sizeof(((struct lnet_acceptor_connreq_v2 *)0)->acr_version) != 4);
+	BUILD_BUG_ON((int)offsetof(struct lnet_acceptor_connreq_v2, acr_nid) != 8);
+	BUILD_BUG_ON((int)sizeof(((struct lnet_acceptor_connreq_v2 *)0)->acr_nid) != 20);
+
 	/* Checks for struct lnet_counters_common */
 	BUILD_BUG_ON((int)sizeof(struct lnet_counters_common) != 60);
 	BUILD_BUG_ON((int)offsetof(struct lnet_counters_common, lcc_msgs_alloc) != 0);
-- 
1.8.3.1


* [lustre-devel] [PATCH 11/24] lnet: change lr_nid to struct lnet_nid
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (9 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 10/24] lnet: enhance connect/accept to support large addr James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 12/24] lnet: extend rspt_next_hop_nid in lnet_rsp_tracker James Simmons
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

The nid in 'struct lnet_route' is now a 'struct lnet_nid'.
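
A minimal userspace sketch of the conversion pattern this change relies
on: callers that still hold a 64-bit NID expand it into the struct form
before calling the pointer-based interface.  The sketch_* names, the
field layout, and the host-byte-order packing are assumptions made for
illustration and are not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout, chosen to be 20 bytes like the wire checks expect. */
struct sketch_nid {
	uint8_t		nid_size;	/* assumed: address bytes beyond 4 */
	uint8_t		nid_type;
	uint16_t	nid_num;
	uint32_t	nid_addr[4];
} __attribute__((packed));

/* Hypothetical route record that stores the gateway NID by value. */
struct sketch_route {
	struct sketch_nid	lr_nid;
	uint32_t		lr_net;
};

/* Expand a legacy 64-bit NID into the struct form (simplified packing). */
static void sketch_nid4_to_nid(uint64_t nid4, struct sketch_nid *nid)
{
	memset(nid, 0, sizeof(*nid));
	nid->nid_type = (nid4 >> 48) & 0xff;
	nid->nid_num = (nid4 >> 32) & 0xffff;
	nid->nid_addr[0] = (uint32_t)nid4;
}

/* Whole-struct comparison; valid here because unused bytes are zeroed. */
static bool sketch_nid_same(const struct sketch_nid *a,
			    const struct sketch_nid *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}

/* New-style entry point: takes the NID by pointer and copies it. */
static void sketch_route_init(struct sketch_route *route, uint32_t net,
			      const struct sketch_nid *gateway)
{
	route->lr_net = net;
	route->lr_nid = *gateway;
}

/* Legacy entry point: converts, then defers to the new-style function. */
static void sketch_route_init4(struct sketch_route *route, uint32_t net,
			       uint64_t gateway4)
{
	struct sketch_nid gateway;

	sketch_nid4_to_nid(gateway4, &gateway);
	sketch_route_init(route, net, &gateway);
}
```

Keeping a thin 64-bit wrapper in front of the pointer-based function
lets unconverted callers keep working while the struct form becomes
the single source of truth.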

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: d1e2f6fc688762222 ("LU-10391 lnet: change lr_nid to struct lnet_nid")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/43593
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h  |   9 ++--
 include/linux/lnet/lib-types.h |   2 +-
 net/lnet/lnet/api-ni.c         |  19 +++++++-
 net/lnet/lnet/config.c         |  15 +++---
 net/lnet/lnet/lib-move.c       |   2 +-
 net/lnet/lnet/peer.c           | 105 ++++++++++++++++++++++++-----------------
 net/lnet/lnet/router.c         |  43 +++++++++--------
 net/lnet/lnet/router_proc.c    |   2 +-
 8 files changed, 117 insertions(+), 80 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index 9e7d0b8..890f61a 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -517,7 +517,7 @@ int lnet_notify(struct lnet_ni *ni, lnet_nid_t peer, bool alive, bool reset,
 		time64_t when);
 void lnet_notify_locked(struct lnet_peer_ni *lp, int notifylnd, int alive,
 			time64_t when);
-int lnet_add_route(u32 net, u32 hops, lnet_nid_t gateway_nid,
+int lnet_add_route(u32 net, u32 hops, struct lnet_nid *gateway,
 		   u32 priority, u32 sensitivity);
 int lnet_del_route(u32 net, lnet_nid_t gw_nid);
 void lnet_move_route(struct lnet_route *route, struct lnet_peer *lp,
@@ -567,7 +567,8 @@ void lnet_rtr_transfer_to_peer(struct lnet_peer *src,
 void lnet_net_clr_pref_rtrs(struct lnet_net *net);
 int lnet_net_add_pref_rtr(struct lnet_net *net, lnet_nid_t gw_nid);
 
-int lnet_islocalnid(lnet_nid_t nid);
+int lnet_islocalnid4(lnet_nid_t nid);
+int lnet_islocalnid(struct lnet_nid *nid);
 int lnet_islocalnet(u32 net);
 int lnet_islocalnet_locked(u32 net);
 
@@ -837,9 +838,11 @@ struct lnet_peer_ni *lnet_get_next_peer_ni_locked(struct lnet_peer *peer,
 						  struct lnet_peer_ni *prev);
 struct lnet_peer_ni *lnet_nid2peerni_locked(lnet_nid_t nid, lnet_nid_t pref,
 					    int cpt);
-struct lnet_peer_ni *lnet_nid2peerni_ex(lnet_nid_t nid, int cpt);
+struct lnet_peer_ni *lnet_nid2peerni_ex(struct lnet_nid *nid, int cpt);
 struct lnet_peer_ni *lnet_peer_get_ni_locked(struct lnet_peer *lp,
 					     lnet_nid_t nid);
+struct lnet_peer_ni *lnet_peer_ni_get_locked(struct lnet_peer *lp,
+					     struct lnet_nid *nid);
 struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid);
 struct lnet_peer_ni *lnet_peer_ni_find_locked(struct lnet_nid *nid);
 struct lnet_peer *lnet_find_peer(lnet_nid_t nid);
diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index ba900e8..1e1ddd7 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -905,7 +905,7 @@ struct lnet_route {
 	/* router node */
 	struct lnet_peer       *lr_gateway;
 	/* NID used to add route */
-	lnet_nid_t		lr_nid;
+	struct lnet_nid		lr_nid;
 	/* remote network number */
 	u32			lr_net;
 	/* local network number */
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 31ccb2c..0f4feda 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -1635,7 +1635,7 @@ struct lnet_ni *
 EXPORT_SYMBOL(lnet_nid_to_ni_addref);
 
 int
-lnet_islocalnid(lnet_nid_t nid)
+lnet_islocalnid4(lnet_nid_t nid)
 {
 	struct lnet_ni *ni;
 	int cpt;
@@ -1648,6 +1648,19 @@ struct lnet_ni *
 }
 
 int
+lnet_islocalnid(struct lnet_nid *nid)
+{
+	struct lnet_ni	*ni;
+	int		cpt;
+
+	cpt = lnet_net_lock_current();
+	ni = lnet_nid_to_ni_locked(nid, cpt);
+	lnet_net_unlock(cpt);
+
+	return ni != NULL;
+}
+
+int
 lnet_count_acceptor_nets(void)
 {
 	/* Return the # of NIs that need the acceptor. */
@@ -3852,6 +3865,7 @@ u32 lnet_get_dlc_seq_locked(void)
 	struct lnet_ioctl_config_data *config;
 	struct lnet_process_id id = { 0 };
 	struct lnet_ni *ni;
+	struct lnet_nid nid;
 	int rc;
 
 	BUILD_BUG_ON(LIBCFS_IOC_DATA_MAX <
@@ -3880,10 +3894,11 @@ u32 lnet_get_dlc_seq_locked(void)
 			  config->cfg_config_u.cfg_route.rtr_sensitivity;
 		}
 
+		lnet_nid4_to_nid(config->cfg_nid, &nid);
 		mutex_lock(&the_lnet.ln_api_mutex);
 		rc = lnet_add_route(config->cfg_net,
 				    config->cfg_config_u.cfg_route.rtr_hop,
-				    config->cfg_nid,
+				    &nid,
 				    config->cfg_config_u.cfg_route.rtr_priority,
 				    sensitivity);
 		mutex_unlock(&the_lnet.ln_api_mutex);
diff --git a/net/lnet/lnet/config.c b/net/lnet/lnet/config.c
index 0c833fe..f499c91 100644
--- a/net/lnet/lnet/config.c
+++ b/net/lnet/lnet/config.c
@@ -1065,7 +1065,7 @@ struct lnet_ni *
 	struct list_head *tmp1;
 	struct list_head *tmp2;
 	u32 net;
-	lnet_nid_t nid;
+	struct lnet_nid nid;
 	struct lnet_text_buf *ltb;
 	struct lnet_text_buf *ltb1, *ltb2;
 	int rc;
@@ -1145,8 +1145,8 @@ struct lnet_ni *
 				if (rc < 0)
 					goto token_error;
 
-				nid = libcfs_str2nid(ltb->ltb_text);
-				if (nid == LNET_NID_ANY || nid == LNET_NID_LO_0)
+				if (libcfs_strnid(&nid, ltb->ltb_text) != 0 ||
+				    nid_is_lo0(&nid))
 					goto token_error;
 			}
 		}
@@ -1167,19 +1167,18 @@ struct lnet_ni *
 		LASSERT(net != LNET_NET_ANY);
 
 		list_for_each_entry(ltb2, &gateways, ltb_list) {
-			nid = libcfs_str2nid(ltb2->ltb_text);
-			LASSERT(nid != LNET_NID_ANY);
+			LASSERT(libcfs_strnid(&nid, ltb2->ltb_text) == 0);
 
-			if (lnet_islocalnid(nid)) {
+			if (lnet_islocalnid(&nid)) {
 				*im_a_router = 1;
 				continue;
 			}
 
-			rc = lnet_add_route(net, hops, nid, priority, 1);
+			rc = lnet_add_route(net, hops, &nid, priority, 1);
 			if (rc && rc != -EEXIST && rc != -EHOSTUNREACH) {
 				CERROR("Can't create route to %s via %s\n",
 				       libcfs_net2str(net),
-				       libcfs_nid2str(nid));
+				       libcfs_nidstr(&nid));
 				goto out;
 			}
 		}
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 8c8db31..2454a0c 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -4298,7 +4298,7 @@ void lnet_monitor_thr_stop(void)
 			return -EPROTO;
 		}
 
-		if (lnet_islocalnid(dest_nid)) {
+		if (lnet_islocalnid4(dest_nid)) {
 			/*
 			 * dest is another local NI; sender should have used
 			 * this node's NID on its own network
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 17f99ee..4b6f339 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -107,15 +107,13 @@
 }
 
 static struct lnet_peer_ni *
-lnet_peer_ni_alloc(lnet_nid_t nid4)
+lnet_peer_ni_alloc(struct lnet_nid *nid)
 {
 	struct lnet_peer_ni *lpni;
 	struct lnet_net *net;
-	struct lnet_nid nid;
 	int cpt;
 
-	lnet_nid4_to_nid(nid4, &nid);
-	cpt = lnet_nid_cpt_hash(&nid, LNET_CPT_NUMBER);
+	cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
 
 	lpni = kzalloc_cpt(sizeof(*lpni), GFP_KERNEL, cpt);
 	if (!lpni)
@@ -138,11 +136,11 @@
 	else
 		lpni->lpni_ns_status = LNET_NI_STATUS_UP;
 	lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL;
-	lpni->lpni_nid = nid;
+	lpni->lpni_nid = *nid;
 	lpni->lpni_cpt = cpt;
 	atomic_set(&lpni->lpni_healthv, LNET_MAX_HEALTH_VALUE);
 
-	net = lnet_get_net_locked(LNET_NID_NET(&nid));
+	net = lnet_get_net_locked(LNET_NID_NET(nid));
 	lpni->lpni_net = net;
 	if (net) {
 		lpni->lpni_txcredits = net->net_tunables.lct_peer_tx_credits;
@@ -204,12 +202,10 @@
 }
 
 static struct lnet_peer *
-lnet_peer_alloc(lnet_nid_t nid4)
+lnet_peer_alloc(struct lnet_nid *nid)
 {
 	struct lnet_peer *lp;
-	struct lnet_nid nid;
 
-	lnet_nid4_to_nid(nid4, &nid);
 	lp = kzalloc_cpt(sizeof(*lp), GFP_KERNEL, CFS_CPT_ANY);
 	if (!lp)
 		return NULL;
@@ -223,7 +219,7 @@
 	INIT_LIST_HEAD(&lp->lp_rtr_list);
 	init_waitqueue_head(&lp->lp_dc_waitq);
 	spin_lock_init(&lp->lp_lock);
-	lp->lp_primary_nid = nid;
+	lp->lp_primary_nid = *nid;
 	lp->lp_disc_src_nid = LNET_ANY_NID;
 	lp->lp_disc_dst_nid = LNET_ANY_NID;
 	if (lnet_peers_start_down())
@@ -243,9 +239,9 @@
 	 * to ever use a different interface when sending messages to
 	 * myself.
 	 */
-	if (nid_is_lo0(&nid))
+	if (nid_is_lo0(nid))
 		lp->lp_state = LNET_PEER_NO_DISCOVERY;
-	lp->lp_cpt = lnet_nid_cpt_hash(&nid, LNET_CPT_NUMBER);
+	lp->lp_cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
 
 	CDEBUG(D_NET, "%p nid %s\n", lp, libcfs_nidstr(&lp->lp_primary_nid));
 
@@ -760,6 +756,24 @@ struct lnet_peer_ni *
 	return NULL;
 }
 
+struct lnet_peer_ni *
+lnet_peer_ni_get_locked(struct lnet_peer *lp, struct lnet_nid *nid)
+{
+	struct lnet_peer_net *lpn;
+	struct lnet_peer_ni *lpni;
+
+	lpn = lnet_peer_get_net_locked(lp, LNET_NID_NET(nid));
+	if (!lpn)
+		return NULL;
+
+	list_for_each_entry(lpni, &lpn->lpn_peer_nis, lpni_peer_nis) {
+		if (nid_same(&lpni->lpni_nid, nid))
+			return lpni;
+	}
+
+	return NULL;
+}
+
 struct lnet_peer *
 lnet_find_peer(lnet_nid_t nid)
 {
@@ -1592,20 +1606,21 @@ struct lnet_peer_net *
  * Call with the lnet_api_mutex held.
  */
 static int
-lnet_peer_add(lnet_nid_t nid, unsigned int flags)
+lnet_peer_add(lnet_nid_t nid4, unsigned int flags)
 {
+	struct lnet_nid nid;
 	struct lnet_peer *lp;
 	struct lnet_peer_net *lpn;
 	struct lnet_peer_ni *lpni;
 	int rc = 0;
 
-	LASSERT(nid != LNET_NID_ANY);
+	LASSERT(nid4 != LNET_NID_ANY);
 
 	/*
 	 * No need for the lnet_net_lock here, because the
 	 * lnet_api_mutex is held.
 	 */
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_find_peer_ni_locked(nid4);
 	if (lpni) {
 		/* A peer with this NID already exists. */
 		lp = lpni->lpni_peer_net->lpn_peer;
@@ -1617,13 +1632,13 @@ struct lnet_peer_net *
 		 * that an existing peer is being modified.
 		 */
 		if (lp->lp_state & LNET_PEER_CONFIGURED) {
-			if (lnet_nid_to_nid4(&lp->lp_primary_nid) != nid)
+			if (lnet_nid_to_nid4(&lp->lp_primary_nid) != nid4)
 				rc = -EEXIST;
 			else if ((lp->lp_state ^ flags) & LNET_PEER_MULTI_RAIL)
 				rc = -EPERM;
 			goto out;
 		} else if (!(flags & LNET_PEER_CONFIGURED)) {
-			if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid) {
+			if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid4) {
 				rc = -EEXIST;
 				goto out;
 			}
@@ -1634,13 +1649,14 @@ struct lnet_peer_net *
 
 	/* Create peer, peer_net, and peer_ni. */
 	rc = -ENOMEM;
-	lp = lnet_peer_alloc(nid);
+	lnet_nid4_to_nid(nid4, &nid);
+	lp = lnet_peer_alloc(&nid);
 	if (!lp)
 		goto out;
-	lpn = lnet_peer_net_alloc(LNET_NIDNET(nid));
+	lpn = lnet_peer_net_alloc(LNET_NID_NET(&nid));
 	if (!lpn)
 		goto out_free_lp;
-	lpni = lnet_peer_ni_alloc(nid);
+	lpni = lnet_peer_ni_alloc(&nid);
 	if (!lpni)
 		goto out_free_lpn;
 
@@ -1652,7 +1668,7 @@ struct lnet_peer_net *
 	kfree(lp);
 out:
 	CDEBUG(D_NET, "peer %s NID flags %#x: %d\n",
-	       libcfs_nid2str(nid), flags, rc);
+	       libcfs_nid2str(nid4), flags, rc);
 	return rc;
 }
 
@@ -1667,14 +1683,17 @@ struct lnet_peer_net *
  *             non-multi-rail peer.
  */
 static int
-lnet_peer_add_nid(struct lnet_peer *lp, lnet_nid_t nid, unsigned int flags)
+lnet_peer_add_nid(struct lnet_peer *lp, lnet_nid_t nid4, unsigned int flags)
 {
 	struct lnet_peer_net *lpn;
 	struct lnet_peer_ni *lpni;
+	struct lnet_nid nid;
 	int rc = 0;
 
 	LASSERT(lp);
-	LASSERT(nid != LNET_NID_ANY);
+	LASSERT(nid4 != LNET_NID_ANY);
+
+	lnet_nid4_to_nid(nid4, &nid);
 
 	/* A configured peer can only be updated through configuration. */
 	if (!(flags & LNET_PEER_CONFIGURED)) {
@@ -1700,7 +1719,7 @@ struct lnet_peer_net *
 		goto out;
 	}
 
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_find_peer_ni_locked(nid4);
 	if (lpni) {
 		/*
 		 * A peer_ni already exists. This is only a problem if
@@ -1747,14 +1766,14 @@ struct lnet_peer_net *
 			}
 			lnet_peer_del(lp2);
 			lnet_peer_ni_decref_locked(lpni);
-			lpni = lnet_peer_ni_alloc(nid);
+			lpni = lnet_peer_ni_alloc(&nid);
 			if (!lpni) {
 				rc = -ENOMEM;
 				goto out_free_lpni;
 			}
 		}
 	} else {
-		lpni = lnet_peer_ni_alloc(nid);
+		lpni = lnet_peer_ni_alloc(&nid);
 		if (!lpni) {
 			rc = -ENOMEM;
 			goto out_free_lpni;
@@ -1765,9 +1784,9 @@ struct lnet_peer_net *
 	 * Get the peer_net. Check that we're not adding a second
 	 * peer_ni on a peer_net of a non-multi-rail peer.
 	 */
-	lpn = lnet_peer_get_net_locked(lp, LNET_NIDNET(nid));
+	lpn = lnet_peer_get_net_locked(lp, LNET_NIDNET(nid4));
 	if (!lpn) {
-		lpn = lnet_peer_net_alloc(LNET_NIDNET(nid));
+		lpn = lnet_peer_net_alloc(LNET_NIDNET(nid4));
 		if (!lpn) {
 			rc = -ENOMEM;
 			goto out_free_lpni;
@@ -1783,7 +1802,7 @@ struct lnet_peer_net *
 	lnet_peer_ni_decref_locked(lpni);
 out:
 	CDEBUG(D_NET, "peer %s NID %s flags %#x: %d\n",
-	       libcfs_nidstr(&lp->lp_primary_nid), libcfs_nid2str(nid),
+	       libcfs_nidstr(&lp->lp_primary_nid), libcfs_nid2str(nid4),
 	       flags, rc);
 	return rc;
 }
@@ -1830,7 +1849,7 @@ struct lnet_peer_net *
  * lpni creation initiated due to traffic either sending or receiving.
  */
 static int
-lnet_peer_ni_traffic_add(lnet_nid_t nid, lnet_nid_t pref)
+lnet_peer_ni_traffic_add(struct lnet_nid *nid, lnet_nid_t pref)
 {
 	struct lnet_peer *lp;
 	struct lnet_peer_net *lpn;
@@ -1838,13 +1857,13 @@ struct lnet_peer_net *
 	unsigned int flags = 0;
 	int rc = 0;
 
-	if (nid == LNET_NID_ANY) {
+	if (LNET_NID_IS_ANY(nid)) {
 		rc = -EINVAL;
 		goto out;
 	}
 
 	/* lnet_net_lock is not needed here because ln_api_lock is held */
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_peer_ni_find_locked(nid);
 	if (lpni) {
 		/*
 		 * We must have raced with another thread. Since we
@@ -1861,7 +1880,7 @@ struct lnet_peer_net *
 	lp = lnet_peer_alloc(nid);
 	if (!lp)
 		goto out;
-	lpn = lnet_peer_net_alloc(LNET_NIDNET(nid));
+	lpn = lnet_peer_net_alloc(LNET_NID_NET(nid));
 	if (!lpn)
 		goto out_free_lp;
 	lpni = lnet_peer_ni_alloc(nid);
@@ -1877,7 +1896,7 @@ struct lnet_peer_net *
 out_free_lp:
 	kfree(lp);
 out:
-	CDEBUG(D_NET, "peer %s: %d\n", libcfs_nid2str(nid), rc);
+	CDEBUG(D_NET, "peer %s: %d\n", libcfs_nidstr(nid), rc);
 	return rc;
 }
 
@@ -2047,7 +2066,7 @@ struct lnet_peer_net *
 }
 
 struct lnet_peer_ni *
-lnet_nid2peerni_ex(lnet_nid_t nid, int cpt)
+lnet_nid2peerni_ex(struct lnet_nid *nid, int cpt)
 {
 	struct lnet_peer_ni *lpni = NULL;
 	int rc;
@@ -2059,7 +2078,7 @@ struct lnet_peer_ni *
 	 * find if a peer_ni already exists.
 	 * If so then just return that.
 	 */
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_peer_ni_find_locked(nid);
 	if (lpni)
 		return lpni;
 
@@ -2071,7 +2090,7 @@ struct lnet_peer_ni *
 		goto out_net_relock;
 	}
 
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_peer_ni_find_locked(nid);
 	LASSERT(lpni);
 
 out_net_relock:
@@ -2085,19 +2104,21 @@ struct lnet_peer_ni *
  * hold on the peer_ni.
  */
 struct lnet_peer_ni *
-lnet_nid2peerni_locked(lnet_nid_t nid, lnet_nid_t pref, int cpt)
+lnet_nid2peerni_locked(lnet_nid_t nid4, lnet_nid_t pref, int cpt)
 {
 	struct lnet_peer_ni *lpni = NULL;
+	struct lnet_nid nid;
 	int rc;
 
 	if (the_lnet.ln_state != LNET_STATE_RUNNING)
 		return ERR_PTR(-ESHUTDOWN);
 
+	lnet_nid4_to_nid(nid4, &nid);
 	/*
 	 * find if a peer_ni already exists.
 	 * If so then just return that.
 	 */
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_find_peer_ni_locked(nid4);
 	if (lpni)
 		return lpni;
 
@@ -2124,13 +2145,13 @@ struct lnet_peer_ni *
 		goto out_mutex_unlock;
 	}
 
-	rc = lnet_peer_ni_traffic_add(nid, pref);
+	rc = lnet_peer_ni_traffic_add(&nid, pref);
 	if (rc) {
 		lpni = ERR_PTR(rc);
 		goto out_mutex_unlock;
 	}
 
-	lpni = lnet_find_peer_ni_locked(nid);
+	lpni = lnet_find_peer_ni_locked(nid4);
 	LASSERT(lpni);
 
 out_mutex_unlock:
@@ -3242,7 +3263,7 @@ static int lnet_peer_deletion(struct lnet_peer *lp)
 		/* re-add these routes */
 		lnet_add_route(route->lr_net,
 			       route->lr_hops,
-			       route->lr_nid,
+			       &route->lr_nid,
 			       route->lr_priority,
 			       sensitivity);
 		kfree(route);
diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index 2d5f0b6..6cfcead 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -170,7 +170,7 @@ static void lnet_del_route_from_rnet(lnet_nid_t gw_nid,
 
 	CDEBUG(D_NET, "deleting route %s->%s\n",
 	       libcfs_net2str(route->lr_net),
-	       libcfs_nid2str(route->lr_nid));
+	       libcfs_nidstr(&route->lr_nid));
 
 	/* use the gateway's lp_primary_nid to delete the route as the
 	 * lr_nid can be a constituent NID of the peer
@@ -207,7 +207,7 @@ static void lnet_del_route_from_rnet(lnet_nid_t gw_nid,
 		CDEBUG(D_NET, "%s: %s->%s\n",
 		       libcfs_nidstr(&src->lp_primary_nid),
 		       libcfs_net2str(route->lr_net),
-		       libcfs_nid2str(route->lr_nid));
+		       libcfs_nidstr(&route->lr_nid));
 	}
 	list_splice_init(&src->lp_rtrq, &target->lp_rtrq);
 	list_for_each_entry_safe(route, tmp, &src->lp_routes, lr_gwlist) {
@@ -356,7 +356,7 @@ bool lnet_is_route_alive(struct lnet_route *route)
 	 * intent here is not to confuse the user who added the route.
 	 */
 	list_for_each_entry(route, &orig_lp->lp_routes, lr_gwlist) {
-		lpni = lnet_peer_get_ni_locked(orig_lp, route->lr_nid);
+		lpni = lnet_peer_ni_get_locked(orig_lp, &route->lr_nid);
 		if (!lpni) {
 			lnet_net_lock(LNET_LOCK_EX);
 			list_move(&route->lr_gwlist, &new_lp->lp_routes);
@@ -640,7 +640,7 @@ static void lnet_shuffle_seed(void)
 }
 
 int
-lnet_add_route(u32 net, u32 hops, lnet_nid_t gateway,
+lnet_add_route(u32 net, u32 hops, struct lnet_nid *gateway,
 	       u32 priority, u32 sensitivity)
 {
 	struct list_head *route_entry;
@@ -653,13 +653,13 @@ static void lnet_shuffle_seed(void)
 	int rc;
 
 	CDEBUG(D_NET, "Add route: remote net %s hops %d priority %u gw %s\n",
-	       libcfs_net2str(net), hops, priority, libcfs_nid2str(gateway));
+	       libcfs_net2str(net), hops, priority, libcfs_nidstr(gateway));
 
-	if (gateway == LNET_NID_ANY ||
-	    gateway == LNET_NID_LO_0 ||
+	if (LNET_NID_IS_ANY(gateway) ||
+	    nid_is_lo0(gateway) ||
 	    net == LNET_NET_ANY ||
 	    LNET_NETTYP(net) == LOLND ||
-	    LNET_NIDNET(gateway) == net ||
+	    LNET_NID_NET(gateway) == net ||
 	    (hops != LNET_UNDEFINED_HOPS && (hops < 1 || hops > 255)))
 		return -EINVAL;
 
@@ -667,10 +667,10 @@ static void lnet_shuffle_seed(void)
 	if (lnet_islocalnet(net))
 		return -EEXIST;
 
-	if (!lnet_islocalnet(LNET_NIDNET(gateway))) {
+	if (!lnet_islocalnet(LNET_NID_NET(gateway))) {
 		CERROR("Cannot add route with gateway %s. There is no local interface configured on LNet %s\n",
-		       libcfs_nid2str(gateway),
-		       libcfs_net2str(LNET_NIDNET(gateway)));
+		       libcfs_nidstr(gateway),
+		       libcfs_net2str(LNET_NID_NET(gateway)));
 		return -EHOSTUNREACH;
 	}
 
@@ -679,7 +679,7 @@ static void lnet_shuffle_seed(void)
 	rnet = kzalloc(sizeof(*rnet), GFP_NOFS);
 	if (!route || !rnet) {
 		CERROR("Out of memory creating route %s %d %s\n",
-		       libcfs_net2str(net), hops, libcfs_nid2str(gateway));
+		       libcfs_net2str(net), hops, libcfs_nidstr(gateway));
 		kfree(route);
 		kfree(rnet);
 		return -ENOMEM;
@@ -688,9 +688,9 @@ static void lnet_shuffle_seed(void)
 	INIT_LIST_HEAD(&rnet->lrn_routes);
 	rnet->lrn_net = net;
 	/* store the local and remote net that the route represents */
-	route->lr_lnet = LNET_NIDNET(gateway);
+	route->lr_lnet = LNET_NID_NET(gateway);
 	route->lr_net = net;
-	route->lr_nid = gateway;
+	route->lr_nid = *gateway;
 	route->lr_priority = priority;
 	route->lr_hops = hops;
 	if (lnet_peers_start_down())
@@ -713,7 +713,7 @@ static void lnet_shuffle_seed(void)
 		rc = PTR_ERR(lpni);
 		CERROR("Error %d creating route %s %d %s\n", rc,
 		       libcfs_net2str(net), hops,
-		       libcfs_nid2str(gateway));
+		       libcfs_nidstr(gateway));
 		return rc;
 	}
 
@@ -741,8 +741,8 @@ static void lnet_shuffle_seed(void)
 		}
 
 		/* our lookups must be true */
-		LASSERT(lnet_nid_to_nid4(&route2->lr_gateway->lp_primary_nid) !=
-			gateway);
+		LASSERT(!nid_same(&route2->lr_gateway->lp_primary_nid,
+				  gateway));
 	}
 
 	/* It is possible to add multiple routes through the same peer,
@@ -933,8 +933,8 @@ int lnet_get_rtr_pool_cfg(int cpt, struct lnet_ioctl_pool_cfg *pool_cfg)
 }
 
 int
-lnet_get_route(int idx, u32 *net, u32 *hops,
-	       lnet_nid_t *gateway, u32 *flags, u32 *priority, u32 *sensitivity)
+lnet_get_route(int idx, u32 *net, u32 *hops, lnet_nid_t *gateway,
+	       u32 *flags, u32 *priority, u32 *sensitivity)
 {
 	struct lnet_remotenet *rnet;
 	struct list_head *rn_list;
@@ -950,7 +950,7 @@ int lnet_get_rtr_pool_cfg(int cpt, struct lnet_ioctl_pool_cfg *pool_cfg)
 			list_for_each_entry(route, &rnet->lrn_routes, lr_list) {
 				if (!idx--) {
 					*net = rnet->lrn_net;
-					*gateway = route->lr_nid;
+					*gateway = lnet_nid_to_nid4(&route->lr_nid);
 					*hops = route->lr_hops;
 					*priority =
 					    route->lr_priority;
@@ -1774,8 +1774,7 @@ bool lnet_router_checker_active(void)
 		 */
 		if (lnet_is_discovery_disabled(lp)) {
 			list_for_each_entry(route, &lp->lp_routes, lr_gwlist) {
-				if (route->lr_nid ==
-				    lnet_nid_to_nid4(&lpni->lpni_nid))
+				if (nid_same(&route->lr_nid, &lpni->lpni_nid))
 					lnet_set_route_aliveness(route, alive);
 			}
 		}
diff --git a/net/lnet/lnet/router_proc.c b/net/lnet/lnet/router_proc.c
index 2e3c802..a53d6fa 100644
--- a/net/lnet/lnet/router_proc.c
+++ b/net/lnet/lnet/router_proc.c
@@ -217,7 +217,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write,
 				       libcfs_net2str(net), hops,
 				       priority,
 				       alive ? "up" : "down",
-				       libcfs_nid2str(route->lr_nid));
+				       libcfs_nidstr(&route->lr_nid));
 			LASSERT(tmpstr + tmpsiz - s > 0);
 		}
 
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [lustre-devel] [PATCH 12/24] lnet: extend rspt_next_hop_nid in lnet_rsp_tracker
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (10 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 11/24] lnet: change lr_nid to struct lnet_nid James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 13/24] lustre: ptlrpc: two replay lock threads James Simmons
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Mr NeilBrown <neilb@suse.de>

rspt_next_hop_nid in 'struct lnet_rsp_tracker' is now
a 'struct lnet_nid'.
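
As a rough illustration of what widening the field means for callers, the sketch below uses a toy NID structure. The names toy_nid, toy_rsp_tracker, toy_record_next_hop() and toy_nid_same() are hypothetical and the field layout is an assumption, not the real struct lnet_nid. The point is only that the tracker now stores a whole structure by value and that comparisons go through a helper rather than '=='.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified NID; the layout is an assumption for this sketch. */
struct toy_nid {
	uint8_t  size;
	uint8_t  type;
	uint16_t num;
	uint32_t addr[4];
};

/* Hypothetical tracker holding the next hop as a full structure. */
struct toy_rsp_tracker {
	struct toy_nid next_hop_nid;
};

/* Structures cannot be compared with '==', so use a helper. */
static bool toy_nid_same(const struct toy_nid *a, const struct toy_nid *b)
{
	/* This layout has no padding bytes, so memcmp is safe here. */
	return memcmp(a, b, sizeof(*a)) == 0;
}

/* Record the next hop with a plain struct copy, no 64-bit conversion. */
static void toy_record_next_hop(struct toy_rsp_tracker *rspt,
				const struct toy_nid *peer)
{
	rspt->next_hop_nid = *peer;
}
```

Because the copy is by value, later changes to the peer's NID do not affect what the tracker recorded.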

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: a34afe7f20ec7d618 ("LU-10391 lnet: extend rspt_next_hop_nid in lnet_rsp_tracker")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/43594
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-types.h |  2 +-
 net/lnet/lnet/lib-move.c       | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index 1e1ddd7..380a7b9 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -87,7 +87,7 @@ struct lnet_rsp_tracker {
 	/* cpt to lock */
 	int rspt_cpt;
 	/* nid of next hop */
-	lnet_nid_t rspt_next_hop_nid;
+	struct lnet_nid rspt_next_hop_nid;
 	/* deadline of the REPLY/ACK */
 	ktime_t rspt_deadline;
 	/* parent MD */
diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 2454a0c..f2978eb 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -1755,9 +1755,9 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 		rspt = msg->msg_md->md_rspt_ptr;
 		if (rspt) {
 			rspt->rspt_next_hop_nid =
-				lnet_nid_to_nid4(&msg->msg_txpeer->lpni_nid);
+				msg->msg_txpeer->lpni_nid;
 			CDEBUG(D_NET, "rspt_next_hop_nid = %s\n",
-			       libcfs_nid2str(rspt->rspt_next_hop_nid));
+			       libcfs_nidstr(&rspt->rspt_next_hop_nid));
 		}
 	}
 
@@ -2969,7 +2969,7 @@ struct lnet_mt_event_info {
 			if (ktime_compare(now, rspt->rspt_deadline) >= 0 ||
 			    the_lnet.ln_mt_state == LNET_MT_STATE_SHUTDOWN) {
 				struct lnet_peer_ni *lpni;
-				lnet_nid_t nid;
+				struct lnet_nid nid;
 
 				md = lnet_handle2md(&rspt->rspt_mdh);
 				if (!md) {
@@ -3028,14 +3028,14 @@ struct lnet_mt_event_info {
 
 				CDEBUG(D_NET,
 				       "Response timeout: md = %p: nid = %s\n",
-				       md, libcfs_nid2str(nid));
+				       md, libcfs_nidstr(&nid));
 
 				/* If there is a timeout on the response
 				 * from the next hop decrement its health
 				 * value so that we don't use it
 				 */
 				lnet_net_lock(0);
-				lpni = lnet_find_peer_ni_locked(nid);
+				lpni = lnet_peer_ni_find_locked(&nid);
 				if (lpni) {
 					lnet_handle_remote_failure_locked(lpni);
 					lnet_peer_ni_decref_locked(lpni);
-- 
1.8.3.1


* [lustre-devel] [PATCH 13/24] lustre: ptlrpc: two replay lock threads
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (11 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 12/24] lnet: extend rspt_next_hop_nid in lnet_rsp_tracker James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 14/24] lustre: llite: Always do lookup on ENOENT in open James Simmons
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Vitaly Fertman, Vitaly Fertman, Lustre Development List

From: Vitaly Fertman <c17818@cray.com>

Two lock replay threads can conflict with each other, which leads to:
        ASSERTION( atomic_read(&imp->imp_replay_inflight) == 1 )

replay_lock_interpret() calls ptlrpc_connect_import() on error, so one
thread starts from the connect reply interpret callback.

replay_lock_interpret() also wakes up ldlm_lock_replay_thread() which
does ptlrpc_import_recovery_state_machine().

Both threads may reach ldlm_replay_locks() at the same time on the
next round; both increment imp_replay_inflight, and the second one
trips the assertion.

The problem was introduced by LU-13600, which added
ldlm_lock_replay_thread() with the
ptlrpc_import_recovery_state_machine() call.
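
The guard that the diff adds to ldlm_replay_locks() can be sketched in user space with C11 atomics. This is only a model under stated assumptions: toy_import, toy_replay_try_start() and toy_replay_done() are hypothetical names, and atomic_fetch_add() plus one stands in for the kernel's atomic_inc_return().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the import; only the counter matters here. */
struct toy_import {
	atomic_int replay_inflight;
};

/*
 * Return true if the caller may start a replay, false if another
 * replay is already in flight. The loser undoes its own increment so
 * the counter still reflects exactly one active replay.
 */
static bool toy_replay_try_start(struct toy_import *imp)
{
	/* atomic_fetch_add() returns the old value; add 1 for the new one. */
	if (atomic_fetch_add(&imp->replay_inflight, 1) + 1 > 1) {
		atomic_fetch_sub(&imp->replay_inflight, 1);
		return false;
	}
	return true;
}

/* Drop the in-flight count once the replay has finished. */
static void toy_replay_done(struct toy_import *imp)
{
	atomic_fetch_sub(&imp->replay_inflight, 1);
}
```

With this shape the second thread backs out quietly instead of hitting an assertion, which appears to be the intent of the change.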

HPE-bug-id: LUS-10147
WC-bug-id: https://jira.whamcloud.com/browse/LU-14847
Lustre-commit: d7d7eb50c8f5fd3fc ("LU-14847 ptlrpc: two replay lock threads")
Fixes: 8cc7f22847 ("lustre: ptlrpc: limit rate of lock replays")
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-on: https://es-gerrit.dev.cray.com/158931
Reviewed-on: https://review.whamcloud.com/44294
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/ldlm/ldlm_request.c   | 10 +++++++---
 fs/lustre/obdclass/obd_config.c |  4 ++--
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/lustre/ldlm/ldlm_request.c b/fs/lustre/ldlm/ldlm_request.c
index 7718e07..746c45b 100644
--- a/fs/lustre/ldlm/ldlm_request.c
+++ b/fs/lustre/ldlm/ldlm_request.c
@@ -2253,7 +2253,8 @@ int __ldlm_replay_locks(struct obd_import *imp, bool rate_limit)
 	struct ldlm_lock *lock;
 	int rc = 0;
 
-	LASSERT(atomic_read(&imp->imp_replay_inflight) == 1);
+	while (atomic_read(&imp->imp_replay_inflight) != 1)
+		cond_resched();
 
 	/* don't replay locks if import failed recovery */
 	if (imp->imp_vbr_failed)
@@ -2311,9 +2312,12 @@ int ldlm_replay_locks(struct obd_import *imp)
 	struct task_struct *task;
 	int rc = 0;
 
-	class_import_get(imp);
 	/* ensure this doesn't fall to 0 before all have been queued */
-	atomic_inc(&imp->imp_replay_inflight);
+	if (atomic_inc_return(&imp->imp_replay_inflight) > 1) {
+		atomic_dec(&imp->imp_replay_inflight);
+		return 0;
+	}
+	class_import_get(imp);
 
 	task = kthread_run(ldlm_lock_replay_thread, imp, "ldlm_lock_replay");
 	if (IS_ERR(task)) {
diff --git a/fs/lustre/obdclass/obd_config.c b/fs/lustre/obdclass/obd_config.c
index 3a0dbd5..cb70ed5 100644
--- a/fs/lustre/obdclass/obd_config.c
+++ b/fs/lustre/obdclass/obd_config.c
@@ -519,8 +519,8 @@ struct obd_device *class_incref(struct obd_device *obd,
 {
 	lu_ref_add_atomic(&obd->obd_reference, scope, source);
 	atomic_inc(&obd->obd_refcount);
-	CDEBUG(D_INFO, "incref %s (%p) now %d\n", obd->obd_name, obd,
-	       atomic_read(&obd->obd_refcount));
+	CDEBUG(D_INFO, "incref %s (%p) now %d - %s\n", obd->obd_name, obd,
+	       atomic_read(&obd->obd_refcount), scope);
 
 	return obd;
 }
-- 
1.8.3.1


* [lustre-devel] [PATCH 14/24] lustre: llite: Always do lookup on ENOENT in open
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (12 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 13/24] lustre: ptlrpc: two replay lock threads James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 15/24] lustre: llite: Remove inode locking in ll_fsync James Simmons
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Patrick Farrell <pfarrell@whamcloud.com>

When there is no valid dentry found for a file we want to
open, we perform a full lookup, which goes to the server
and looks up the file by name. When we find an existing
dentry in cache *but the file is not open on the node*, we
do not do a full lookup.  We move directly to opening the
file.

When we open files, we use the FID of the file.  The
problem occurs when a new file is renamed *over* the file
we were trying to open.  This removes the FID we are
trying to open, but the file *name* userspace called open()
on is still present.  In this case, we will return ENOENT,
even though there is a file matching the name used in the
open() call.

The solution is when we get an ENOENT on open (indicating
our open raced with an unlink), we always send ESTALE back
to the VFS, which restarts the open and forces a lookup to
the server (by forcing Lustre to consider the dentry
invalid, see comments in ll_intent_file_open and code in
ll_revalidate_dentry).

This causes a lookup by name, which will correctly handle
the rename, allowing the open to proceed normally.

This should only generate extra retries in the case where a
positive dentry exists on the client but the file has been
removed on the server, i.e., open racing with unlink.

This should hopefully be rare.
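
The retry behaviour described above can be modelled with a small user-space sketch. Everything here is hypothetical (toy_server, toy_lookup(), toy_open_by_fid(), toy_map_open_rc(), toy_open()); integer FIDs and a single name stand in for the real namespace, and the retry inside toy_open() stands in for the VFS restarting the open after -ESTALE.

```c
#include <errno.h>

/* Hypothetical server state: the FID currently bound to the name, 0 if none. */
struct toy_server {
	int fid_for_name;
};

/* Open by FID: fails with -ENOENT if that FID no longer exists. */
static int toy_open_by_fid(const struct toy_server *srv, int fid)
{
	return (fid != 0 && fid == srv->fid_for_name) ? 0 : -ENOENT;
}

/* Lookup by name: refreshes the cached FID or reports -ENOENT. */
static int toy_lookup(const struct toy_server *srv, int *fid)
{
	if (srv->fid_for_name == 0)
		return -ENOENT;
	*fid = srv->fid_for_name;
	return 0;
}

/* The mapping the patch makes unconditional: ENOENT becomes ESTALE. */
static int toy_map_open_rc(int rc)
{
	return rc == -ENOENT ? -ESTALE : rc;
}

/* Open using a cached FID, retrying through a lookup by name on -ESTALE. */
static int toy_open(const struct toy_server *srv, int *cached_fid)
{
	int rc = toy_map_open_rc(toy_open_by_fid(srv, *cached_fid));

	if (rc == -ESTALE) {
		/* Stand-in for the VFS retry: redo the lookup by name. */
		rc = toy_lookup(srv, cached_fid);
		if (rc == 0)
			rc = toy_open_by_fid(srv, *cached_fid);
	}
	return rc;
}
```

In this model a rename over the file succeeds after one extra lookup, while a plain unlink still ends in -ENOENT, now reported by the lookup by name.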

WC-bug-id: https://jira.whamcloud.com/browse/LU-14949
Lustre-commit: 72c1f7095203cc1ba ("LU-14949 llite: Always do lookup on ENOENT in open")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44675
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd_support.h |  1 +
 fs/lustre/llite/file.c          | 23 +++++++++++++++--------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index 1e8cebf..540e1e0 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -483,6 +483,7 @@
 #define OBD_FAIL_LLITE_CREATE_FILE_PAUSE2		0x1416
 #define OBD_FAIL_LLITE_RACE_MOUNT			0x1417
 #define OBD_FAIL_LLITE_PAGE_ALLOC			0x1418
+#define OBD_FAIL_LLITE_OPEN_DELAY			0x1419
 
 #define OBD_FAIL_FID_INDIR				0x1501
 #define OBD_FAIL_FID_INLMA				0x1502
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index aa5c662..10450ce 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -639,6 +639,8 @@ static int ll_intent_file_open(struct dentry *de, void *lmm, int lmmsize,
 	op_data->op_data = lmm;
 	op_data->op_data_size = lmmsize;
 
+	OBD_FAIL_TIMEOUT(OBD_FAIL_LLITE_OPEN_DELAY, cfs_fail_val);
+
 	rc = md_intent_lock(sbi->ll_md_exp, op_data, itp, &req,
 			    &ll_md_blocking_ast, 0);
 	kfree(name);
@@ -692,15 +694,20 @@ static int ll_intent_file_open(struct dentry *de, void *lmm, int lmmsize,
 	ptlrpc_req_finished(req);
 	ll_intent_drop_lock(itp);
 
-	/*
-	 * We did open by fid, but by the time we got to the server,
-	 * the object disappeared. If this is a create, we cannot really
-	 * tell the userspace that the file it was trying to create
-	 * does not exist. Instead let's return -ESTALE, and the VFS will
-	 * retry the create with LOOKUP_REVAL that we are going to catch
-	 * in ll_revalidate_dentry() and use lookup then.
+	/* We did open by fid, but by the time we got to the server, the object
+	 * disappeared.  This is possible if the object was unlinked, but it's
+	 * also possible if the object was unlinked by a rename.  In the case
+	 * of an object renamed over our existing one, we can't fail this open.
+	 * O_CREAT also goes through this path if we had an existing dentry,
+	 * and it's obviously wrong to return ENOENT for O_CREAT.
+	 *
+	 * Instead let's return -ESTALE, and the VFS will retry the open with
+	 * LOOKUP_REVAL, which we catch in ll_revalidate_dentry and fail to
+	 * revalidate, causing a lookup.  This causes extra lookups in the case
+	 * where we had a dentry in cache but the file is being unlinked and we
+	 * lose the race with unlink, but this should be very rare.
 	 */
-	if (rc == -ENOENT && itp->it_op & IT_CREAT)
+	if (rc == -ENOENT)
 		rc = -ESTALE;
 
 	return rc;
-- 
1.8.3.1


* [lustre-devel] [PATCH 15/24] lustre: llite: Remove inode locking in ll_fsync
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (13 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 14/24] lustre: llite: Always do lookup on ENOENT in open James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 16/24] lnet: socklnd: fix link state detection James Simmons
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Oleg Drokin <green@whamcloud.com>

The inode locking in ll_fsync() does not appear to be necessary,
so remove it.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14877
Lustre-commit: e8d76d1090e912ee5 ("LU-14877 llite: Remove inode locking in ll_fsync")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44368
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/file.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 10450ce..e60789b 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -4460,7 +4460,6 @@ int ll_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 
 
 	rc = file_write_and_wait_range(file, start, end);
-	inode_lock(inode);
 
 	/* catch async errors that were recorded back when async writeback
 	 * failed for pages in this mapping.
@@ -4503,8 +4502,6 @@ int ll_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 			fd->fd_write_failed = false;
 	}
 
-	inode_unlock(inode);
-
 	if (!rc)
 		ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_FSYNC,
 				   ktime_us_delta(ktime_get(), kstart));
-- 
1.8.3.1


* [lustre-devel] [PATCH 16/24] lnet: socklnd: fix link state detection
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (14 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 15/24] lustre: llite: Remove inode locking in ll_fsync James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 17/24] lustre: llite: check read only mount for setquota James Simmons
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Serguei Smirnov, Lustre Development List

From: Serguei Smirnov <ssmirnov@whamcloud.com>

Link detection implemented in LU-14742 matches only the device
index, so it can confuse link events for virtual interfaces with
link events for the interface that LNet was actually configured
to use. Fix this by improving the identification of the event
source: use both the device name and the device index.

Also, to make sure the link fatal state is cleared only when
the device is bound to the IP address used at NI creation,
subscribe to inetaddr events in addition to the netdev events.
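
As a rough userspace illustration of the stricter match, the sketch
below compares both fields before accepting an event. The struct and
function names are hypothetical stand-ins, not the socklnd types; only
the "index and name must both agree" rule comes from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the fields socklnd compares; the real code
 * uses ksni_index/ksni_name against dev->ifindex/dev->name.
 */
struct iface_key {
	int  index;     /* interface index */
	char name[16];  /* interface name, IFNAMSIZ-sized */
};

/* Accept an event only when both the index and the name match. */
static bool iface_matches(const struct iface_key *configured,
			  const struct iface_key *event_src)
{
	assert(configured && event_src);

	return configured->index == event_src->index &&
	       strcmp(configured->name, event_src->name) == 0;
}
```

With this rule, an event whose source shares only the index (or only
the name) with the configured interface is skipped rather than applied
to the NI.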

Fixes: 1db29e184712 ("lnet: socklnd: detect link state to set fatal error on ni")
WC-bug-id: https://jira.whamcloud.com/browse/LU-14954
Lustre-commit: 008795508d65bb40b ("LU-14954 socklnd: fix link state detection")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44732
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c | 132 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 122 insertions(+), 10 deletions(-)

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 7397ac7..b014aa8 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -1896,11 +1896,15 @@ static int ksocknal_get_link_status(struct net_device *dev)
 
 	LASSERT(dev);
 
-	if (!netif_running(dev))
+	if (!netif_running(dev)) {
 		ret = 0;
+		CDEBUG(D_NET, "device not running\n");
+	}
 	/* Some devices may not be providing link settings */
-	else if (dev->ethtool_ops->get_link)
+	else if (dev->ethtool_ops->get_link) {
 		ret = dev->ethtool_ops->get_link(dev);
+		CDEBUG(D_NET, "get_link returns %u\n", ret);
+	}
 
 	return ret;
 }
@@ -1909,11 +1913,16 @@ static int ksocknal_get_link_status(struct net_device *dev)
 ksocknal_handle_link_state_change(struct net_device *dev,
 				  unsigned char operstate)
 {
-	struct lnet_ni *ni;
+	struct lnet_ni *ni = NULL;
 	struct ksock_net *net;
 	struct ksock_net *cnxt;
 	int ifindex;
 	unsigned char link_down = !(operstate == IF_OPER_UP);
+	struct in_device *in_dev;
+	bool found_ip = false;
+	struct ksock_interface *ksi = NULL;
+	struct sockaddr_in *sa;
+	const struct in_ifaddr *ifa;
 
 	ifindex = dev->ifindex;
 
@@ -1922,20 +1931,91 @@ static int ksocknal_get_link_status(struct net_device *dev)
 
 	list_for_each_entry_safe(net, cnxt, &ksocknal_data.ksnd_nets,
 				 ksnn_list) {
-		if (net->ksnn_interface.ksni_index != ifindex)
+
+		ksi = &net->ksnn_interface;
+		sa = (void *)&ksi->ksni_addr;
+		found_ip = false;
+
+		if (ksi->ksni_index != ifindex ||
+		    strcmp(ksi->ksni_name, dev->name))
 			continue;
+
 		ni = net->ksnn_ni;
-		if (link_down)
+
+		in_dev = __in_dev_get_rtnl(dev);
+		if (!in_dev) {
+			CDEBUG(D_NET, "Interface %s has no IPv4 status.\n",
+			       dev->name);
+			CDEBUG(D_NET, "set link fatal state to 1\n");
+			atomic_set(&ni->ni_fatal_error_on, 1);
+			continue;
+		}
+		in_dev_for_each_ifa_rtnl(ifa, in_dev) {
+			if (sa->sin_addr.s_addr == ifa->ifa_local)
+				found_ip = true;
+		}
+
+		if (!found_ip) {
+			CDEBUG(D_NET, "Interface %s has no matching ip\n",
+			       dev->name);
+			CDEBUG(D_NET, "set link fatal state to 1\n");
+			atomic_set(&ni->ni_fatal_error_on, 1);
+			continue;
+		}
+
+		if (link_down) {
+			CDEBUG(D_NET, "set link fatal state to 1\n");
 			atomic_set(&ni->ni_fatal_error_on, link_down);
-		else
+		} else {
+			CDEBUG(D_NET, "set link fatal state to %u\n",
+			       (ksocknal_get_link_status(dev) == 0));
 			atomic_set(&ni->ni_fatal_error_on,
 				   (ksocknal_get_link_status(dev) == 0));
+		}
 	}
 out:
 	return 0;
 }
 
 
+static int
+ksocknal_handle_inetaddr_change(struct in_ifaddr *ifa, unsigned long event)
+{
+	struct lnet_ni *ni;
+	struct ksock_net *net;
+	struct ksock_net *cnxt;
+	struct net_device *event_netdev = ifa->ifa_dev->dev;
+	int ifindex;
+	struct ksock_interface *ksi = NULL;
+	struct sockaddr_in *sa;
+
+	if (!ksocknal_data.ksnd_nnets)
+		goto out;
+
+	ifindex = event_netdev->ifindex;
+
+	list_for_each_entry_safe(net, cnxt, &ksocknal_data.ksnd_nets,
+				 ksnn_list) {
+
+		ksi = &net->ksnn_interface;
+		sa = (void *)&ksi->ksni_addr;
+
+		if (ksi->ksni_index != ifindex ||
+		    strcmp(ksi->ksni_name, event_netdev->name))
+			continue;
+
+		if (sa->sin_addr.s_addr == ifa->ifa_local) {
+			CDEBUG(D_NET, "set link fatal state to %u\n",
+			       (event == NETDEV_DOWN));
+			ni = net->ksnn_ni;
+			atomic_set(&ni->ni_fatal_error_on,
+				   (event == NETDEV_DOWN));
+		}
+	}
+out:
+	return 0;
+}
+
 /************************************
  * Net device notifier event handler
  ************************************/
@@ -1947,6 +2027,9 @@ static int ksocknal_device_event(struct notifier_block *unused,
 
 	operstate = dev->operstate;
 
+	CDEBUG(D_NET, "devevent: status=%ld, iface=%s ifindex %d state %u\n",
+	       event, dev->name, dev->ifindex, operstate);
+
 	switch (event) {
 	case NETDEV_UP:
 	case NETDEV_DOWN:
@@ -1958,10 +2041,36 @@ static int ksocknal_device_event(struct notifier_block *unused,
 	return NOTIFY_OK;
 }
 
-static struct notifier_block ksocknal_notifier_block = {
+/************************************
+ * Inetaddr notifier event handler
+ ************************************/
+static int ksocknal_inetaddr_event(struct notifier_block *unused,
+				   unsigned long event, void *ptr)
+{
+	struct in_ifaddr *ifa = ptr;
+
+	CDEBUG(D_NET, "addrevent: status %ld ip addr %pI4, netmask %pI4.\n",
+	       event, &ifa->ifa_address, &ifa->ifa_mask);
+
+	switch (event) {
+	case NETDEV_UP:
+	case NETDEV_DOWN:
+	case NETDEV_CHANGE:
+		ksocknal_handle_inetaddr_change(ifa, event);
+		break;
+
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block ksocknal_dev_notifier_block = {
 	.notifier_call = ksocknal_device_event,
 };
 
+static struct notifier_block ksocknal_inetaddr_notifier_block = {
+	.notifier_call = ksocknal_inetaddr_event,
+};
+
 static void
 ksocknal_base_shutdown(void)
 {
@@ -1971,8 +2080,10 @@ static int ksocknal_device_event(struct notifier_block *unused,
 
 	LASSERT(!ksocknal_data.ksnd_nnets);
 
-	if (ksocknal_data.ksnd_init == SOCKNAL_INIT_ALL)
-		unregister_netdevice_notifier(&ksocknal_notifier_block);
+	if (ksocknal_data.ksnd_init == SOCKNAL_INIT_ALL) {
+		unregister_netdevice_notifier(&ksocknal_dev_notifier_block);
+		unregister_inetaddr_notifier(&ksocknal_inetaddr_notifier_block);
+	}
 
 	switch (ksocknal_data.ksnd_init) {
 	default:
@@ -2135,7 +2246,8 @@ static int ksocknal_device_event(struct notifier_block *unused,
 		goto failed;
 	}
 
-	register_netdevice_notifier(&ksocknal_notifier_block);
+	register_netdevice_notifier(&ksocknal_dev_notifier_block);
+	register_inetaddr_notifier(&ksocknal_inetaddr_notifier_block);
 
 	/* flag everything initialised */
 	ksocknal_data.ksnd_init = SOCKNAL_INIT_ALL;
-- 
1.8.3.1


* [lustre-devel] [PATCH 17/24] lustre: llite: check read only mount for setquota
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (15 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 16/24] lnet: socklnd: fix link state detection James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 18/24] lustre: llite: don't touch vma after filemap_fault James Simmons
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Hongchao Zhang, Lustre Development List

From: Hongchao Zhang <hongchao@whamcloud.com>

Setting a quota should fail if the mount is read-only.
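
The ordering of the checks can be sketched in userspace as follows.
The flag, enum and function names are hypothetical; only the sequence
"permission check first, then -EROFS for a read-only mount on the set
commands" is taken from the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_RDONLY 0x1UL	/* hypothetical stand-in for SB_RDONLY */

enum sketch_cmd { SKETCH_GETQUOTA, SKETCH_SETQUOTA };

/* Returns 0 when the command may proceed, or a negative errno. */
static int sketch_quotactl(unsigned long sb_flags, enum sketch_cmd cmd,
			   bool is_admin)
{
	switch (cmd) {
	case SKETCH_SETQUOTA:
		if (!is_admin)
			return -EPERM;
		/* modifying quota on a read-only mount is refused */
		if (sb_flags & SKETCH_RDONLY)
			return -EROFS;
		break;
	case SKETCH_GETQUOTA:
		/* reading quota is unaffected by the mount being read-only */
		break;
	}
	return 0;
}
```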

WC-bug-id: https://jira.whamcloud.com/browse/LU-14696
Lustre-commit: 29e00cecc6019fbdb ("LU-14696 llite: check read only mount for setquota")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43765
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/dir.c            | 8 ++++++--
 fs/lustre/llite/llite_internal.h | 2 +-
 fs/lustre/llite/llite_lib.c      | 2 +-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 57f7c3c..9a4ccfc 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -1080,8 +1080,9 @@ static int check_owner(int type, int id)
 	return 0;
 }
 
-int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
+int quotactl_ioctl(struct super_block *sb, struct if_quotactl *qctl)
 {
+	struct ll_sb_info *sbi = ll_s2sbi(sb);
 	int cmd = qctl->qc_cmd;
 	int type = qctl->qc_type;
 	int id = qctl->qc_id;
@@ -1097,6 +1098,9 @@ int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
 	case LUSTRE_Q_SETDEFAULT_POOL:
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
+
+		if (sb->s_flags & SB_RDONLY)
+			return -EROFS;
 		break;
 	case Q_GETQUOTA:
 	case LUSTRE_Q_GETDEFAULT:
@@ -1873,7 +1877,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 			}
 		}
 
-		rc = quotactl_ioctl(sbi, qctl);
+		rc = quotactl_ioctl(inode->i_sb, qctl);
 		if (rc == 0 && copy_to_user((void __user *)arg, qctl,
 					    sizeof(*qctl)))
 			rc = -EFAULT;
diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index 6b5e318..ed6ff07 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -1080,7 +1080,7 @@ int ll_dir_read(struct inode *inode, u64 *ppos, struct md_op_data *op_data,
 struct page *ll_get_dir_page(struct inode *dir, struct md_op_data *op_data,
 			     u64 offset);
 void ll_release_page(struct inode *inode, struct page *page, bool remove);
-int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl);
+int quotactl_ioctl(struct super_block *sb, struct if_quotactl *qctl);
 
 enum get_default_layout_type {
 	GET_DEFAULT_LAYOUT_ROOT = 1,
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index cc50503..58e60c8 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -2263,7 +2263,7 @@ static int ll_statfs_project(struct inode *inode, struct kstatfs *sfs)
 	int ret;
 
 	qctl.qc_id = ll_i2info(inode)->lli_projid;
-	ret = quotactl_ioctl(ll_i2sbi(inode), &qctl);
+	ret = quotactl_ioctl(inode->i_sb, &qctl);
 	if (ret) {
 		/* ignore errors if project ID does not have
 		 * a quota limit or feature unsupported.
-- 
1.8.3.1


* [lustre-devel] [PATCH 18/24] lustre: llite: don't touch vma after filemap_fault
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (16 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 17/24] lustre: llite: check read only mount for setquota James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 19/24] lnet: Check for -ESHUTDOWN in lnet_parse James Simmons
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Alexander Boyko, Lustre Development List

From: Alexander Boyko <alexander.boyko@hpe.com>

On error, filemap_fault() unlocks vma->vm_mm->mmap_sem, so touching
the vma afterwards is dangerous: it could have been reused or freed.
Use a local file variable, with a reference held, instead of
dereferencing the vma again.
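
The pattern can be modelled in userspace roughly as below. All names
are hypothetical, and the "I/O step" merely simulates the vma becoming
unusable by clearing the caller's pointer; it is a sketch of the idea,
not of the llite code.

```c
#include <assert.h>
#include <stddef.h>

struct pin_file { int refcount; };
struct pin_vma  { struct pin_file *file; };

static void pin_get_file(struct pin_file *f) { f->refcount++; }

static void pin_put_file(struct pin_file *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Models a step that, on failure, leaves the caller's vma pointer
 * unusable. Returns 0 on success, -1 on failure.
 */
static int pin_io_step(struct pin_vma **vmap, int fail)
{
	if (fail) {
		*vmap = NULL;
		return -1;
	}
	return 0;
}

static int pin_fault(struct pin_vma *vma, int fail)
{
	struct pin_file *file = vma->file; /* read once while vma is valid */
	int rc;

	pin_get_file(file);		/* keep the file alive across the step */
	rc = pin_io_step(&vma, fail);
	/* vma may be unusable here; only the pinned local is used */
	pin_put_file(file);
	return rc;
}
```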

HPE-bug-id: LUS-10240
WC-bug-id: https://jira.whamcloud.com/browse/LU-14021
Lustre-commit: 0f5d3c4b954da2f6b ("LU-14021 llite: don't touch vma after filemap_fault")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/44558
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/llite_mmap.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/llite/llite_mmap.c b/fs/lustre/llite/llite_mmap.c
index ebcb8d9..85a082c 100644
--- a/fs/lustre/llite/llite_mmap.c
+++ b/fs/lustre/llite/llite_mmap.c
@@ -38,9 +38,9 @@
 #include <linux/unistd.h>
 #include <linux/uaccess.h>
 #include <linux/delay.h>
-
 #include <linux/fs.h>
 #include <linux/pagemap.h>
+#include <linux/file.h>
 
 #define DEBUG_SUBSYSTEM S_LLITE
 
@@ -317,6 +317,8 @@ static vm_fault_t __ll_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	result = io->ci_result;
 	if (result == 0) {
+		struct file *vm_file = vma->vm_file;
+
 		vio = vvp_env_io(env);
 		vio->u.fault.ft_vma = vma;
 		vio->u.fault.ft_vmpage = NULL;
@@ -324,13 +326,15 @@ static vm_fault_t __ll_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 		vio->u.fault.ft_flags = 0;
 		vio->u.fault.ft_flags_valid = false;
 
+		get_file(vm_file);
+
 		/* May call ll_readpage() */
-		ll_cl_add(vma->vm_file, env, io, LCC_MMAP);
+		ll_cl_add(vm_file, env, io, LCC_MMAP);
 
 		result = cl_io_loop(env, io);
 
-		ll_cl_remove(vma->vm_file, env);
-
+		ll_cl_remove(vm_file, env);
+		fput(vm_file);
 		/* ft_flags are only valid if we reached
 		 * the call to filemap_fault
 		 */
-- 
1.8.3.1


* [lustre-devel] [PATCH 19/24] lnet: Check for -ESHUTDOWN in lnet_parse
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (17 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 18/24] lustre: llite: don't touch vma after filemap_fault James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 20/24] lustre: obdclass: EAGAIN after rhashtable_walk_next() James Simmons
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown
  Cc: Chris Horn, Lustre Development List

From: Chris Horn <chris.horn@hpe.com>

The fix for LU-8106, http://review.whamcloud.com/19993, no longer
works because rc does not have the return value from
lnet_nid2peerni_locked(). Use PTR_ERR to get the return value and
restore the LU-8106 fix.

HPE-bug-id: LUS-10333
Fixes: 6e872a4ffd ("lustre: lnet: peer/peer_ni handling adjustments")
WC-bug-id: https://jira.whamcloud.com/browse/LU-14962
Lustre-commit: cce82630cbf2c7bad ("LU-14962 lnet: Check for -ESHUTDOWN in lnet_parse")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/44743
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/lib-move.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index f2978eb..b9b322a 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -4389,9 +4389,10 @@ void lnet_monitor_thr_stop(void)
 				      cpt);
 	if (IS_ERR(lpni)) {
 		lnet_net_unlock(cpt);
-		CERROR("%s, src %s: Dropping %s (error %ld looking up sender)\n",
+		rc = PTR_ERR(lpni);
+		CERROR("%s, src %s: Dropping %s (error %d looking up sender)\n",
 		       libcfs_nid2str(from_nid), libcfs_nid2str(src_nid),
-		       lnet_msgtyp2str(type), PTR_ERR(lpni));
+		       lnet_msgtyp2str(type), rc);
 		kfree(msg);
 		if (rc == -ESHUTDOWN)
 			/* We are shutting down. Don't do anything more */
-- 
1.8.3.1


* [lustre-devel] [PATCH 20/24] lustre: obdclass: EAGAIN after rhashtable_walk_next()
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (18 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 19/24] lnet: Check for -ESHUTDOWN in lnet_parse James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 21/24] lustre: sec: filename encryption James Simmons
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Alex Zhuravlev <bzzz@whamcloud.com>

rhashtable_walk_next() can return ERR_PTR(-EAGAIN) when a concurrent
resize has happened, so callers should check for this error and
simply call rhashtable_walk_next() again.
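
The loop shape can be tried out in userspace with a mock walker, as in
the sketch below. The wk_* names and the array-backed iterator are
hypothetical; only the "continue on -EAGAIN, stop on any other error"
handling mirrors the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define WK_MAX_ERRNO 4095

static void *wk_err_ptr(long err) { return (void *)(intptr_t)err; }
static long wk_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int wk_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-WK_MAX_ERRNO);
}

/* Mock walker: a NULL-terminated array that may hold error pointers. */
struct wk_iter {
	void **items;
	size_t pos;
};

static void *wk_walk_next(struct wk_iter *it)
{
	void *p = it->items[it->pos];

	if (p)
		it->pos++;
	return p;
}

/* Count the valid objects, retrying past -EAGAIN entries. */
static int wk_count(struct wk_iter *it)
{
	void *obj;
	int n = 0;

	while ((obj = wk_walk_next(it)) != NULL) {
		if (wk_is_err(obj)) {
			if (wk_ptr_err(obj) == -EAGAIN)
				continue;	/* resize seen: just ask again */
			break;			/* any other error ends the walk */
		}
		n++;
	}
	return n;
}
```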

WC-bug-id: https://jira.whamcloud.com/browse/LU-14967
Lustre-commit: 96aa615f91cd25b04 ("LU-14967 obdclass: EAGAIN after rhashtable_walk_next()")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44766
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/vvp_dev.c  | 6 ++++++
 fs/lustre/obdclass/jobid.c | 5 +++++
 2 files changed, 11 insertions(+)

diff --git a/fs/lustre/llite/vvp_dev.c b/fs/lustre/llite/vvp_dev.c
index fdcd314..fda48bb 100644
--- a/fs/lustre/llite/vvp_dev.c
+++ b/fs/lustre/llite/vvp_dev.c
@@ -385,6 +385,12 @@ static struct page *vvp_pgcache_current(struct vvp_seq_private *priv)
 		struct inode *inode;
 		int nr;
 
+		if (IS_ERR(h)) {
+			if (PTR_ERR(h) == -EAGAIN)
+				continue;
+			break;
+		}
+
 		if (!priv->vsp_clob) {
 			struct lu_object *lu_obj;
 
diff --git a/fs/lustre/obdclass/jobid.c b/fs/lustre/obdclass/jobid.c
index 52ba398..da1af51 100644
--- a/fs/lustre/obdclass/jobid.c
+++ b/fs/lustre/obdclass/jobid.c
@@ -164,6 +164,11 @@ static void jobid_prune(struct work_struct *work)
 	rhashtable_walk_enter(&session_jobids, &iter);
 	rhashtable_walk_start(&iter);
 	while ((sj = rhashtable_walk_next(&iter)) != NULL) {
+		if (IS_ERR(sj)) {
+			if (PTR_ERR(sj) == -EAGAIN)
+				continue;
+			break;
+		}
 		if (!hlist_empty(&sj->sj_session->tasks[PIDTYPE_SID])) {
 			remaining++;
 			continue;
-- 
1.8.3.1


* [lustre-devel] [PATCH 21/24] lustre: sec: filename encryption
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (19 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 20/24] lustre: obdclass: EAGAIN after rhashtable_walk_next() James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:19 ` [lustre-devel] [PATCH 22/24] lustre: uapi: fixup UAPI headers for native Linux client James Simmons
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Sebastien Buisson <sbuisson@ddn.com>

On the client side, call the appropriate fscrypt primitives for llite
to proceed with filename encryption before sending requests to servers
and with filename decryption upon receipt.
Note that we need specific overlay functions to handle encoding and
decoding of encrypted filenames, as we do not want the server side to
deal with binary names before they reach the backend file system layer.

On the server side, mainly in the OSD layer, we need to know the
encryption status of the files being processed.
If an object belongs to an encrypted file, the filename has been
encoded by the client because it is binary, so it needs to be decoded
before being handed over to the backend file system layer.
Conversely, the filename of an encrypted file has to be encoded
before being sent over the wire.
Note that the server side is osd-ldiskfs only for now.
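
To make the encode/decode step concrete, here is a userspace sketch of
one possible escaping scheme. It is an assumption for illustration
only: this patch does not show the bodies of critical_chars(),
critical_encode() or critical_decode(), so the set of escaped bytes
and the "'=' followed by byte + 64" rule below are hypothetical, as
are the esc_* names. Only the use of '=' as the escape character is
taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed set of bytes that must not appear in a name on the wire. */
static int esc_is_critical(unsigned char c)
{
	return c == 0x00 || c == '\n' || c == '\r' ||
	       c == '/' || c == 0x7F || c == '=';
}

/* Length of the encoded form: each critical byte takes two bytes. */
static size_t esc_encoded_len(const unsigned char *in, size_t len)
{
	size_t i, n = len;

	for (i = 0; i < len; i++)
		if (esc_is_critical(in[i]))
			n++;
	return n;
}

/* Encode into out, which must hold esc_encoded_len() bytes. */
static size_t esc_encode(const unsigned char *in, size_t len,
			 unsigned char *out)
{
	size_t i, o = 0;

	for (i = 0; i < len; i++) {
		if (esc_is_critical(in[i])) {
			out[o++] = '=';
			out[o++] = (unsigned char)(in[i] + 64);
		} else {
			out[o++] = in[i];
		}
	}
	return o;
}

/* Decode into out, which must hold at least len bytes. */
static size_t esc_decode(const unsigned char *in, size_t len,
			 unsigned char *out)
{
	size_t i, o = 0;

	for (i = 0; i < len; i++) {
		if (in[i] == '=' && i + 1 < len)
			out[o++] = (unsigned char)(in[++i] - 64);
		else
			out[o++] = in[i];
	}
	return o;
}
```

Under these assumptions a name with no critical bytes is passed
through unchanged, which matches the memcpy() fast path taken in
ll_setup_filename() when the presented length equals the disk name
length.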

WC-bug-id: https://jira.whamcloud.com/browse/LU-13717
Lustre-commit: 4d38566a004f6a636 ("LU-13717 sec: filename encryption")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/43390
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd.h          |   4 ++
 fs/lustre/llite/crypto.c         | 144 +++++++++++++++++++++++++++++++++++++++
 fs/lustre/llite/dcache.c         |   8 +++
 fs/lustre/llite/dir.c            |  39 ++++++++++-
 fs/lustre/llite/file.c           |  11 +--
 fs/lustre/llite/llite_internal.h |  27 ++++++++
 fs/lustre/llite/llite_lib.c      |  51 +++++++++++++-
 fs/lustre/llite/namei.c          |  47 ++++++++++---
 fs/lustre/llite/statahead.c      |  48 +++++++++++++
 fs/lustre/mdc/mdc_lib.c          |   6 +-
 fs/lustre/ptlrpc/layout.c        |   2 +-
 11 files changed, 366 insertions(+), 21 deletions(-)

diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h
index 7c5e699..7642973 100644
--- a/fs/lustre/include/obd.h
+++ b/fs/lustre/include/obd.h
@@ -708,6 +708,7 @@ enum md_op_flags {
 	MF_GETATTR_BY_FID	= BIT(5),
 	MF_QOS_MKDIR		= BIT(6),
 	MF_RR_MKDIR		= BIT(7),
+	MF_OPNAME_KMALLOCED	= BIT(8),
 };
 
 enum md_cli_flags {
@@ -725,6 +726,9 @@ enum md_op_code {
 	LUSTRE_OPC_MKNOD,
 	LUSTRE_OPC_CREATE,
 	LUSTRE_OPC_ANY,
+	LUSTRE_OPC_LOOKUP,
+	LUSTRE_OPC_OPEN,
+	LUSTRE_OPC_MIGR,
 };
 
 /**
diff --git a/fs/lustre/llite/crypto.c b/fs/lustre/llite/crypto.c
index 34d0ad1..5d99037 100644
--- a/fs/lustre/llite/crypto.c
+++ b/fs/lustre/llite/crypto.c
@@ -153,6 +153,150 @@ static bool ll_empty_dir(struct inode *inode)
 	return true;
 }
 
+/**
+ * ll_setup_filename() - overlay to fscrypt_setup_filename
+ * @dir: the directory that will be searched
+ * @iname: the user-provided filename being searched for
+ * @lookup: 1 if we're allowed to proceed without the key because it's
+ *	->lookup() or we're finding the dir_entry for deletion; 0 if we cannot
+ *	proceed without the key because we're going to create the dir_entry.
+ * @fname: the filename information to be filled in
+ *
+ * This overlay function is necessary to properly encode @fname after
+ * encryption, as it will be sent over the wire.
+ */
+int ll_setup_filename(struct inode *dir, const struct qstr *iname,
+		      int lookup, struct fscrypt_name *fname)
+{
+	int rc;
+
+	rc = fscrypt_setup_filename(dir, iname, lookup, fname);
+	if (rc)
+		return rc;
+
+	if (IS_ENCRYPTED(dir) &&
+	    !name_is_dot_or_dotdot(fname->disk_name.name,
+				   fname->disk_name.len)) {
+		int presented_len = critical_chars(fname->disk_name.name,
+						   fname->disk_name.len);
+		char *buf;
+
+		buf = kmalloc(presented_len + 1, GFP_NOFS);
+		if (!buf) {
+			rc = -ENOMEM;
+			goto out_free;
+		}
+
+		if (presented_len == fname->disk_name.len)
+			memcpy(buf, fname->disk_name.name, presented_len);
+		else
+			critical_encode(fname->disk_name.name,
+					fname->disk_name.len, buf);
+		buf[presented_len] = '\0';
+		kfree(fname->crypto_buf.name);
+		fname->crypto_buf.name = buf;
+		fname->crypto_buf.len = presented_len;
+		fname->disk_name.name = fname->crypto_buf.name;
+		fname->disk_name.len = fname->crypto_buf.len;
+	}
+
+	return rc;
+
+out_free:
+	fscrypt_free_filename(fname);
+	return rc;
+}
+
+/**
+ * ll_fname_disk_to_usr() - overlay to fscrypt_fname_disk_to_usr
+ * @inode: the inode to convert name
+ * @hash: major hash for inode
+ * @minor_hash: minor hash for inode
+ * @iname: the user-provided filename needing conversion
+ * @oname: the filename information to be filled in
+ *
+ * The caller must have allocated sufficient memory for the @oname string.
+ *
+ * This overlay function is necessary to properly decode @iname before
+ * decryption, as it comes from the wire.
+ */
+int ll_fname_disk_to_usr(struct inode *inode,
+			 u32 hash, u32 minor_hash,
+			 struct fscrypt_str *iname, struct fscrypt_str *oname)
+{
+	struct fscrypt_str lltr = FSTR_INIT(iname->name, iname->len);
+	char *buf = NULL;
+	int rc;
+
+	if (IS_ENCRYPTED(inode) &&
+	    !name_is_dot_or_dotdot(lltr.name, lltr.len) &&
+	    strnchr(lltr.name, lltr.len, '=')) {
+		/* Only proceed to critical decode if
+		 * iname contains escape char '='.
+		 */
+		int len = lltr.len;
+
+		buf = kmalloc(len, GFP_NOFS);
+		if (!buf)
+			return -ENOMEM;
+
+		len = critical_decode(lltr.name, len, buf);
+		lltr.name = buf;
+		lltr.len = len;
+	}
+
+	rc = fscrypt_fname_disk_to_usr(inode, hash, minor_hash, &lltr, oname);
+
+	kfree(buf);
+
+	return rc;
+}
+
+/* Copied from fscrypt_d_revalidate, as it is not exported */
+/*
+ * Validate dentries in encrypted directories to make sure we aren't potentially
+ * caching stale dentries after a key has been added.
+ */
+int ll_revalidate_d_crypto(struct dentry *dentry, unsigned int flags)
+{
+	struct dentry *dir;
+	int err;
+	int valid;
+
+	/*
+	 * Plaintext names are always valid, since llcrypt doesn't support
+	 * reverting to ciphertext names without evicting the directory's inode
+	 * -- which implies eviction of the dentries in the directory.
+	 */
+	if (!(dentry->d_flags & DCACHE_ENCRYPTED_NAME))
+		return 1;
+
+	/*
+	 * Ciphertext name; valid if the directory's key is still unavailable.
+	 *
+	 * Although llcrypt forbids rename() on ciphertext names, we still must
+	 * use dget_parent() here rather than use ->d_parent directly.  That's
+	 * because a corrupted fs image may contain directory hard links, which
+	 * the VFS handles by moving the directory's dentry tree in the dcache
+	 * each time ->lookup() finds the directory and it already has a dentry
+	 * elsewhere.  Thus ->d_parent can be changing, and we must safely grab
+	 * a reference to some ->d_parent to prevent it from being freed.
+	 */
+
+	if (flags & LOOKUP_RCU)
+		return -ECHILD;
+
+	dir = dget_parent(dentry);
+	err = fscrypt_get_encryption_info(d_inode(dir));
+	valid = !fscrypt_has_encryption_key(d_inode(dir));
+	dput(dir);
+
+	if (err < 0)
+		return err;
+
+	return valid;
+}
+
 const struct fscrypt_operations lustre_cryptops = {
 	.key_prefix		= "lustre:",
 	.get_context		= ll_get_context,
diff --git a/fs/lustre/llite/dcache.c b/fs/lustre/llite/dcache.c
index 4162f46..a074a2c 100644
--- a/fs/lustre/llite/dcache.c
+++ b/fs/lustre/llite/dcache.c
@@ -235,6 +235,14 @@ static int ll_revalidate_dentry(struct dentry *dentry,
 				unsigned int lookup_flags)
 {
 	struct inode *dir = d_inode(dentry->d_parent);
+	int rc;
+
+	CDEBUG(D_VFSTRACE, "VFS Op:name=%s, flags=%u\n",
+	       dentry->d_name.name, lookup_flags);
+
+	rc = ll_revalidate_d_crypto(dentry, lookup_flags);
+	if (rc != 1)
+		return rc;
 
 	/* If this is intermediate component path lookup and we were able to get
 	 * to this dentry, then its lock has not been revoked and the
diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 9a4ccfc..f7216db 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -42,6 +42,7 @@
 #include <linux/pagevec.h>
 #include <linux/prefetch.h>
 #include <linux/security.h>
+#include <linux/fscrypt.h>
 
 #define DEBUG_SUBSYSTEM S_LLITE
 
@@ -181,11 +182,18 @@ int ll_dir_read(struct inode *inode, u64 *ppos, struct md_op_data *op_data,
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 	u64 pos = *ppos;
 	bool is_api32 = ll_need_32bit_api(sbi);
-	int is_hash64 = sbi->ll_flags & LL_SBI_64BIT_HASH;
+	bool is_hash64 = sbi->ll_flags & LL_SBI_64BIT_HASH;
+	struct fscrypt_str lltr = FSTR_INIT(NULL, 0);
 	struct page *page;
 	bool done = false;
 	int rc = 0;
 
+	if (IS_ENCRYPTED(inode)) {
+		rc = fscrypt_fname_alloc_buffer(inode, NAME_MAX, &lltr);
+		if (rc < 0)
+			return rc;
+	}
+
 	page = ll_get_dir_page(inode, op_data, pos);
 
 	while (rc == 0 && !done) {
@@ -232,8 +240,26 @@ int ll_dir_read(struct inode *inode, u64 *ppos, struct md_op_data *op_data,
 			 * so the parameter 'name' for 'ctx->actor()'
 			 * must be part of the 'ent'.
 			 */
-			done = !dir_emit(ctx, ent->lde_name,
-					 namelen, ino, type);
+			if (!IS_ENCRYPTED(inode)) {
+				done = !dir_emit(ctx, ent->lde_name, namelen,
+						 ino, type);
+			} else {
+				/* Directory is encrypted */
+				int save_len = lltr.len;
+				struct fscrypt_str de_name
+					= FSTR_INIT(ent->lde_name, namelen);
+
+				rc = ll_fname_disk_to_usr(inode, 0, 0, &de_name,
+							  &lltr);
+				de_name = lltr;
+				lltr.len = save_len;
+				if (rc) {
+					done = 1;
+					break;
+				}
+				done = !dir_emit(ctx, de_name.name, de_name.len,
+						 ino, type);
+			}
 		}
 
 		if (done) {
@@ -264,6 +290,7 @@ int ll_dir_read(struct inode *inode, u64 *ppos, struct md_op_data *op_data,
 	}
 
 	ctx->pos = pos;
+	fscrypt_fname_free_buffer(&lltr);
 	return rc;
 }
 
@@ -285,6 +312,12 @@ static int ll_readdir(struct file *filp, struct dir_context *ctx)
 	       PFID(ll_inode2fid(inode)), inode, (unsigned long)pos,
 	       i_size_read(inode), api32);
 
+	if (IS_ENCRYPTED(inode)) {
+		rc = fscrypt_get_encryption_info(inode);
+		if (rc && rc != -ENOKEY)
+			goto out;
+	}
+
 	if (pos == MDS_DIR_END_OFF) {
 		/*
 		 * end-of-file.
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index e60789b..ab7c72a 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -631,7 +631,7 @@ static int ll_intent_file_open(struct dentry *de, void *lmm, int lmmsize,
 	}
 
 	op_data  = ll_prep_md_op_data(NULL, d_inode(parent), inode, name, len,
-				      O_RDWR, LUSTRE_OPC_ANY, NULL);
+				      O_RDWR, LUSTRE_OPC_OPEN, NULL);
 	if (IS_ERR(op_data)) {
 		kfree(name);
 		return PTR_ERR(op_data);
@@ -2164,7 +2164,7 @@ int ll_lov_getstripe_ea_info(struct inode *inode, const char *filename,
 			     struct ptlrpc_request **request)
 {
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
-	struct mdt_body *body;
+	struct mdt_body  *body;
 	struct lov_mds_md *lmm = NULL;
 	struct ptlrpc_request *req = NULL;
 	struct md_op_data *op_data;
@@ -4744,7 +4744,7 @@ int ll_migrate(struct inode *parent, struct file *file, struct lmv_user_md *lum,
 	}
 
 	op_data = ll_prep_md_op_data(NULL, parent, NULL, name, namelen,
-				     child_inode->i_mode, LUSTRE_OPC_ANY, NULL);
+				     child_inode->i_mode, LUSTRE_OPC_MIGR, NULL);
 	if (IS_ERR(op_data)) {
 		rc = PTR_ERR(op_data);
 		goto out_iput;
@@ -4788,8 +4788,9 @@ int ll_migrate(struct inode *parent, struct file *file, struct lmv_user_md *lum,
 		spin_unlock(&och->och_mod->mod_open_req->rq_lock);
 	}
 
-	rc = md_rename(ll_i2sbi(parent)->ll_md_exp, op_data, name, namelen,
-		       name, namelen, &request);
+	rc = md_rename(ll_i2sbi(parent)->ll_md_exp, op_data,
+		       op_data->op_name, op_data->op_namelen,
+		       op_data->op_name, op_data->op_namelen, &request);
 	if (!rc) {
 		LASSERT(request);
 		ll_update_times(request, parent);
diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index ed6ff07..25bd460 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -1731,6 +1731,33 @@ static inline struct pcc_super *ll_info2pccs(struct ll_inode_info *lli)
 }
 
 /* crypto.c */
+#ifdef CONFIG_FS_ENCRYPTION
+int ll_setup_filename(struct inode *dir, const struct qstr *iname,
+		      int lookup, struct fscrypt_name *fname);
+int ll_fname_disk_to_usr(struct inode *inode,
+			 u32 hash, u32 minor_hash,
+			 struct fscrypt_str *iname, struct fscrypt_str *oname);
+int ll_revalidate_d_crypto(struct dentry *dentry, unsigned int flags);
+#else
+int ll_setup_filename(struct inode *dir, const struct qstr *iname,
+		      int lookup, struct fscrypt_name *fname)
+{
+	return fscrypt_setup_filename(dir, iname, lookup, fname);
+}
+
+int ll_fname_disk_to_usr(struct inode *inode,
+			 u32 hash, u32 minor_hash,
+			 struct fscrypt_str *iname, struct fscrypt_str *oname)
+{
+	return fscrypt_fname_disk_to_usr(inode, hash, minor_hash, iname, oname);
+}
+
+int ll_revalidate_d_crypto(struct dentry *dentry, unsigned int flags)
+{
+	return 1;
+}
+#endif
+
 extern const struct fscrypt_operations lustre_cryptops;
 
 /* llite/llite_foreign.c */
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index 58e60c8..7a822b8 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -3003,6 +3003,9 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data,
 				      u32 mode, enum md_op_code opc,
 				      void *data)
 {
+	struct fscrypt_name fname = { 0 };
+	int rc;
+
 	if (!name) {
 		/* Do not reuse namelen for something else. */
 		if (namelen)
@@ -3025,7 +3028,6 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data,
 
 	ll_i2gids(op_data->op_suppgids, i1, i2);
 	op_data->op_fid1 = *ll_inode2fid(i1);
-	op_data->op_code = opc;
 
 	if (S_ISDIR(i1->i_mode)) {
 		down_read_non_owner(&ll_i2info(i1)->lli_lsm_sem);
@@ -3057,8 +3059,46 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data,
 	if (ll_need_32bit_api(ll_i2sbi(i1)))
 		op_data->op_cli_flags |= CLI_API32;
 
-	op_data->op_name = name;
-	op_data->op_namelen = namelen;
+	if (opc == LUSTRE_OPC_LOOKUP || opc == LUSTRE_OPC_CREATE) {
+		/* In case of lookup, ll_setup_filename() has already been
+		 * called in ll_lookup_it(), so just take provided name.
+		 */
+		fname.disk_name.name = (unsigned char *)name;
+		fname.disk_name.len = namelen;
+	} else if (name && namelen) {
+		struct qstr dname = QSTR_INIT(name, namelen);
+		struct inode *dir;
+		int lookup;
+
+		if (!S_ISDIR(i1->i_mode) && i2 && S_ISDIR(i2->i_mode)) {
+			/* special case when called from ll_link() */
+			dir = i2;
+			lookup = 0;
+		} else {
+			dir = i1;
+			lookup = (int)(opc == LUSTRE_OPC_ANY);
+		}
+		rc = ll_setup_filename(dir, &dname, lookup, &fname);
+		if (rc) {
+			ll_finish_md_op_data(op_data);
+			return ERR_PTR(rc);
+		}
+		if (fname.disk_name.name &&
+		    fname.disk_name.name != (unsigned char *)name)
+			/* op_data->op_name must be freed after use */
+			op_data->op_flags |= MF_OPNAME_KMALLOCED;
+	}
+
+	/* In fact LUSTRE_OPC_LOOKUP, LUSTRE_OPC_OPEN, LUSTRE_OPC_MIGR
+	 * are LUSTRE_OPC_ANY
+	 */
+	if (opc == LUSTRE_OPC_LOOKUP || opc == LUSTRE_OPC_OPEN ||
+	    opc == LUSTRE_OPC_MIGR)
+		op_data->op_code = LUSTRE_OPC_ANY;
+	else
+		op_data->op_code = opc;
+	op_data->op_name = fname.disk_name.name;
+	op_data->op_namelen = fname.disk_name.len;
 	op_data->op_mode = mode;
 	op_data->op_mod_time = ktime_get_real_seconds();
 	op_data->op_fsuid = from_kuid(&init_user_ns, current_fsuid());
@@ -3078,6 +3118,11 @@ void ll_finish_md_op_data(struct md_op_data *op_data)
 	ll_unlock_md_op_lsm(op_data);
 	security_release_secctx(op_data->op_file_secctx,
 				op_data->op_file_secctx_size);
+	if (op_data->op_flags & MF_OPNAME_KMALLOCED)
+		/* allocated via ll_setup_filename called
+		 * from ll_prep_md_op_data
+		 */
+		kfree(op_data->op_name);
 	kfree(op_data->op_file_encctx);
 	kfree(op_data);
 }
diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c
index 54b4e0a..f0f10da 100644
--- a/fs/lustre/llite/namei.c
+++ b/fs/lustre/llite/namei.c
@@ -812,6 +812,7 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 	struct md_op_data *op_data = NULL;
 	struct lov_user_md *lum = NULL;
 	char secctx_name[XATTR_NAME_MAX + 1];
+	struct fscrypt_name fname;
 	struct inode *inode;
 	u32 opc;
 	int rc;
@@ -846,12 +847,31 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 	if (it->it_op & IT_CREAT)
 		opc = LUSTRE_OPC_CREATE;
 	else
-		opc = LUSTRE_OPC_ANY;
+		opc = LUSTRE_OPC_LOOKUP;
+
+	/* Here we should be calling fscrypt_prepare_lookup(). But it installs a
+	 * custom ->d_revalidate() method, so we lose ll_d_ops.
+	 * To workaround this, call ll_setup_filename() and do the rest
+	 * manually. Also make a copy of fscrypt_d_revalidate() (unfortunately
+	 * not exported function) and call it from ll_revalidate_dentry(), to
+	 * ensure we do not cache stale dentries after a key has been added.
+	 */
+	rc = ll_setup_filename(parent, &dentry->d_name, 1, &fname);
+	if ((!rc || rc == -ENOENT) && fname.is_ciphertext_name) {
+		spin_lock(&dentry->d_lock);
+		dentry->d_flags |= DCACHE_ENCRYPTED_NAME;
+		spin_unlock(&dentry->d_lock);
+	}
+	if (rc == -ENOENT)
+		return NULL;
+	if (rc)
+		return ERR_PTR(rc);
 
-	op_data = ll_prep_md_op_data(NULL, parent, NULL, dentry->d_name.name,
-				     dentry->d_name.len, 0, opc, NULL);
+	op_data = ll_prep_md_op_data(NULL, parent, NULL, fname.disk_name.name,
+				     fname.disk_name.len, 0, opc, NULL);
 	if (IS_ERR(op_data)) {
-		retval = ERR_CAST(op_data);
+		fscrypt_free_filename(&fname);
+		return ERR_CAST(op_data);
 		goto out;
 	}
 
@@ -1111,6 +1131,7 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 			op_data->op_file_encctx = NULL;
 			op_data->op_file_encctx_size = 0;
 		}
+		fscrypt_free_filename(&fname);
 		ll_finish_md_op_data(op_data);
 	}
 
@@ -1934,6 +1955,7 @@ static int ll_rename(struct inode *src, struct dentry *src_dchild,
 		     struct inode *tgt, struct dentry *tgt_dchild,
 		     unsigned int flags)
 {
+	struct fscrypt_name foldname, fnewname;
 	struct ptlrpc_request *request = NULL;
 	struct ll_sb_info *sbi = ll_i2sbi(src);
 	struct md_op_data *op_data;
@@ -1977,11 +1999,20 @@ static int ll_rename(struct inode *src, struct dentry *src_dchild,
 	if (tgt_dchild->d_inode)
 		op_data->op_fid4 = *ll_inode2fid(tgt_dchild->d_inode);
 
+	err = ll_setup_filename(src, &src_dchild->d_name, 1, &foldname);
+	if (err)
+		return err;
+	err = ll_setup_filename(tgt, &tgt_dchild->d_name, 1, &fnewname);
+	if (err) {
+		fscrypt_free_filename(&foldname);
+		return err;
+	}
 	err = md_rename(sbi->ll_md_exp, op_data,
-			src_dchild->d_name.name,
-			src_dchild->d_name.len,
-			tgt_dchild->d_name.name,
-			tgt_dchild->d_name.len, &request);
+			foldname.disk_name.name, foldname.disk_name.len,
+			fnewname.disk_name.name, fnewname.disk_name.len,
+			&request);
+	fscrypt_free_filename(&foldname);
+	fscrypt_free_filename(&fnewname);
 	ll_finish_md_op_data(op_data);
 	if (!err) {
 		ll_update_times(request, src);
diff --git a/fs/lustre/llite/statahead.c b/fs/lustre/llite/statahead.c
index e00fe58..cb435d5 100644
--- a/fs/lustre/llite/statahead.c
+++ b/fs/lustre/llite/statahead.c
@@ -330,6 +330,11 @@ static void sa_free(struct ll_statahead_info *sai, struct sa_entry *entry)
 /* finish async stat RPC arguments */
 static void sa_fini_data(struct md_enqueue_info *minfo)
 {
+	struct md_op_data *op_data = &minfo->mi_data;
+
+	if (op_data->op_flags & MF_OPNAME_KMALLOCED)
+		/* allocated via ll_setup_filename called from sa_prep_data */
+		kfree(op_data->op_name);
 	ll_unlock_md_op_lsm(&minfo->mi_data);
 	iput(minfo->mi_dir);
 	kfree(minfo);
@@ -1031,6 +1036,7 @@ static int ll_statahead_thread(void *arg)
 			u64 hash;
 			int namelen;
 			char *name;
+			struct fscrypt_str lltr = FSTR_INIT(NULL, 0);
 
 			hash = le64_to_cpu(ent->lde_hash);
 			if (unlikely(hash < pos))
@@ -1107,7 +1113,27 @@ static int ll_statahead_thread(void *arg)
 			}
 			__set_current_state(TASK_RUNNING);
 
+			if (IS_ENCRYPTED(dir)) {
+				struct fscrypt_str de_name =
+					FSTR_INIT(ent->lde_name, namelen);
+
+				rc = fscrypt_fname_alloc_buffer(dir, NAME_MAX,
+								&lltr);
+				if (rc < 0)
+					continue;
+
+				if (ll_fname_disk_to_usr(dir, 0, 0, &de_name,
+							 &lltr)) {
+					fscrypt_fname_free_buffer(&lltr);
+					continue;
+				}
+
+				name = lltr.name;
+				namelen = lltr.len;
+			}
+
 			sa_statahead(parent, name, namelen, &fid);
+			fscrypt_fname_free_buffer(&lltr);
 		}
 
 		pos = le64_to_cpu(dp->ldp_hash_end);
@@ -1249,6 +1275,7 @@ enum {
 /* file is first dirent under @dir */
 static int is_first_dirent(struct inode *dir, struct dentry *dentry)
 {
+	struct fscrypt_str lltr = FSTR_INIT(NULL, 0);
 	const struct qstr *target = &dentry->d_name;
 	struct md_op_data *op_data;
 	struct page *page;
@@ -1260,6 +1287,14 @@ static int is_first_dirent(struct inode *dir, struct dentry *dentry)
 				     LUSTRE_OPC_ANY, dir);
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
+
+	if (IS_ENCRYPTED(dir)) {
+		int rc2 = fscrypt_fname_alloc_buffer(dir, NAME_MAX, &lltr);
+
+		if (rc2 < 0)
+			return rc2;
+	}
+
 	/**
 	 * FIXME choose the start offset of the readdir
 	 */
@@ -1286,6 +1321,7 @@ static int is_first_dirent(struct inode *dir, struct dentry *dentry)
 			u64 hash;
 			int namelen;
 			char *name;
+			struct fscrypt_str lltr = FSTR_INIT(NULL, 0);
 
 			hash = le64_to_cpu(ent->lde_hash);
 			/*
@@ -1327,6 +1363,17 @@ static int is_first_dirent(struct inode *dir, struct dentry *dentry)
 				continue;
 			}
 
+			if (IS_ENCRYPTED(dir)) {
+				struct fscrypt_str de_name =
+					FSTR_INIT(ent->lde_name, namelen);
+
+				if (ll_fname_disk_to_usr(dir, 0, 0, &de_name,
+							  &lltr))
+					continue;
+				name = lltr.name;
+				namelen = lltr.len;
+			}
+
 			if (target->len != namelen ||
 			    memcmp(target->name, name, namelen) != 0)
 				rc = LS_NOT_FIRST_DE;
@@ -1357,6 +1404,7 @@ static int is_first_dirent(struct inode *dir, struct dentry *dentry)
 		}
 	}
 out:
+	fscrypt_fname_free_buffer(&lltr);
 	ll_finish_md_op_data(op_data);
 
 	return rc;
diff --git a/fs/lustre/mdc/mdc_lib.c b/fs/lustre/mdc/mdc_lib.c
index ccaa0f2..d07ef81 100644
--- a/fs/lustre/mdc/mdc_lib.c
+++ b/fs/lustre/mdc/mdc_lib.c
@@ -101,8 +101,12 @@ static void mdc_pack_name(struct req_capsule *pill,
 	buf = req_capsule_client_get(pill, field);
 	buf_size = req_capsule_get_size(pill, field, RCL_CLIENT);
 
-	LASSERT(name && name_len && buf && buf_size == name_len + 1);
+	LASSERT(buf && buf_size == name_len + 1);
 
+	if (!name) {
+		buf[name_len] = '\0';
+		return;
+	}
 	cpy_len = strlcpy(buf, name, buf_size);
 
 	LASSERT(lu_name_is_valid_2(buf, cpy_len));
diff --git a/fs/lustre/ptlrpc/layout.c b/fs/lustre/ptlrpc/layout.c
index 836b2a2..2874a41 100644
--- a/fs/lustre/ptlrpc/layout.c
+++ b/fs/lustre/ptlrpc/layout.c
@@ -969,7 +969,7 @@ struct req_msg_field RMF_FID_ARRAY =
 EXPORT_SYMBOL(RMF_FID_ARRAY);
 
 struct req_msg_field RMF_SYMTGT =
-	DEFINE_MSGF("symtgt", RMF_F_STRING, -1, NULL, NULL);
+	DEFINE_MSGF("symtgt", 0, -1, NULL, NULL);
 EXPORT_SYMBOL(RMF_SYMTGT);
 
 struct req_msg_field RMF_TGTUUID =
-- 
1.8.3.1

* [lustre-devel] [PATCH 22/24] lustre: uapi: fixup UAPI headers for native Linux client.
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (20 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 21/24] lustre: sec: filename encryption James Simmons
@ 2021-09-22  2:19 ` James Simmons
  2021-09-22  2:20 ` [lustre-devel] [PATCH 23/24] lustre: ptlrpc: separate out server code for wiretest James Simmons
  2021-09-22  2:20 ` [lustre-devel] [PATCH 24/24] lustre: pcc: VM_WRITE should not trigger layout write James Simmons
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:19 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

This covers all the UAPI problems outside of the user land
wiretest utility. One set of problems is build related; the second
is that UAPI header definitions are either user land only or never
used to validate data going to or from user land.

1) Use UAPI header definitions to validate data sent to or from
   kernel space. We check lum_hash_type using LMV_HASH_TYPE_MASK.
   This avoids a round trip to the server that would only report
   back an error. The other case is that we check the values
   returned for LL_IOC_HSM_ACTION. We keep the original behavior
   of passing unknown data to the user land application but add
   debug logging if the data looks corrupt to help track down
   bugs.

2) We can use QIF_DQBLKSIZE* instead of Lustre specific values
   for our quota handling. QIF_DQBLKSIZE* is a Linux UAPI quota
   value.
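
The lum_hash_type check from point 1 can be sketched in plain C as
follows. This is only an illustration: the EX_* names and their
values are placeholders invented for this example, not the real
LMV_HASH_TYPE_MASK / LMV_HASH_FLAG_KNOWN definitions from
lustre_user.h, and ex_check_hash_flags() is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder values for illustration only; the real masks are
 * defined in include/uapi/linux/lustre/lustre_user.h.
 */
#define EX_HASH_TYPE_MASK	0x0000ffffU	/* low bits: hash type */
#define EX_HASH_FLAG_KNOWN	0xf0000000U	/* high bits: known flags */

/* The type bits and the flag bits must not overlap. */
static_assert((EX_HASH_TYPE_MASK & EX_HASH_FLAG_KNOWN) == 0,
	      "hash type mask overlaps the known flags");

/* Return 0 if every bit outside the type mask is a known flag,
 * -EINVAL otherwise, so a bad request is rejected locally instead
 * of being sent to the server.
 */
static int ex_check_hash_flags(uint32_t hash_type)
{
	uint32_t hash_flags = hash_type & ~EX_HASH_TYPE_MASK;

	if (hash_flags & ~EX_HASH_FLAG_KNOWN)
		return -EINVAL;
	return 0;
}
```

With these placeholder masks, a value that sets only type bits or
known flag bits passes, while any other bit outside the type mask
yields -EINVAL.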

WC-bug-id: https://jira.whamcloud.com/browse/LU-13903
Lustre-commit: d963e66f609c3bf47 ("LU-13903 uapi: fixup UAPI headers for native Linux client.")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44664
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
 fs/lustre/llite/dir.c                   |  5 +++++
 fs/lustre/llite/file.c                  | 19 ++++++++++++++++++-
 fs/lustre/ptlrpc/wiretest.c             |  5 +++++
 include/uapi/linux/lustre/lustre_user.h | 20 ++++++++++++++++++++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index f7216db..b7dd2aa 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -423,6 +423,7 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 		},
 	};
 	bool encrypt = false;
+	int hash_flags;
 	int err;
 
 	if (unlikely(!lmv_user_magic_supported(lump->lum_magic)))
@@ -463,6 +464,10 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 					      LMV_HASH_TYPE_FNV_1A_64;
 	}
 
+	hash_flags = lump->lum_hash_type & ~LMV_HASH_TYPE_MASK;
+	if (hash_flags & ~LMV_HASH_FLAG_KNOWN)
+		return -EINVAL;
+
 	if (unlikely(!lmv_user_magic_supported(cpu_to_le32(lump->lum_magic))))
 		lustre_swab_lmv_user_md(lump);
 
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index ab7c72a..f340d67 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -3973,6 +3973,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags)
 	case LL_IOC_HSM_ACTION: {
 		struct md_op_data *op_data;
 		struct hsm_current_action *hca;
+		const char *action;
 		int rc;
 
 		hca = kzalloc(sizeof(*hca), GFP_KERNEL);
@@ -3988,10 +3989,26 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags)
 
 		rc = obd_iocontrol(cmd, ll_i2mdexp(inode), sizeof(*op_data),
 				   op_data, NULL);
+		if (rc < 0)
+			goto skip_copy;
+
+		/* The hsm_current_action retrieved from the server could
+		 * contain corrupt information. If it is incorrect data collect
+		 * debug information. We still send the data even if incorrect
+		 * to user land to handle.
+		 */
+		action = hsm_user_action2name(hca->hca_action);
+		if (strcmp(action, "UNKNOWN") == 0 ||
+		    hca->hca_state > HPS_DONE) {
+			CDEBUG(D_HSM,
+			       "HSM current state %s action %s, offset = %llu, length %llu\n",
+			       hsm_progress_state2name(hca->hca_state), action,
+			       hca->hca_location.offset, hca->hca_location.length);
+		}
 
 		if (copy_to_user((char __user *)arg, hca, sizeof(*hca)))
 			rc = -EFAULT;
-
+skip_copy:
 		ll_finish_md_op_data(op_data);
 		kfree(hca);
 		return rc;
diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c
index b063cb9..1e89974 100644
--- a/fs/lustre/ptlrpc/wiretest.c
+++ b/fs/lustre/ptlrpc/wiretest.c
@@ -1908,6 +1908,11 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(union lquota_id) == 16, "found %lld\n",
 		 (long long)(int)sizeof(union lquota_id));
 
+	LASSERTF(QIF_DQBLKSIZE_BITS == 10, "found %lld\n",
+		 (long long)QIF_DQBLKSIZE_BITS);
+	LASSERTF(QIF_DQBLKSIZE == 1024, "found %lld\n",
+		 (long long)QIF_DQBLKSIZE);
+
 	/* Checks for struct obd_quotactl */
 	LASSERTF((int)sizeof(struct obd_quotactl) == 112, "found %lld\n",
 		 (long long)(int)sizeof(struct obd_quotactl));
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index 1940e52..5c4dadf 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -1877,6 +1877,26 @@ enum hsm_states {
  */
 #define HSM_FLAGS_MASK  (HSM_USER_MASK | HSM_STATUS_MASK)
 
+/**
+ * HSM request progress state
+ */
+enum hsm_progress_states {
+	HPS_NONE	= 0,
+	HPS_WAITING	= 1,
+	HPS_RUNNING	= 2,
+	HPS_DONE	= 3,
+};
+
+static inline const char *hsm_progress_state2name(enum hsm_progress_states s)
+{
+	switch  (s) {
+	case HPS_WAITING:	return "waiting";
+	case HPS_RUNNING:	return "running";
+	case HPS_DONE:		return "done";
+	default:		return "unknown";
+	}
+}
+
 struct hsm_extent {
 	__u64 offset;
 	__u64 length;
-- 
1.8.3.1

* [lustre-devel] [PATCH 23/24] lustre: ptlrpc: separate out server code for wiretest
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (21 preceding siblings ...)
  2021-09-22  2:19 ` [lustre-devel] [PATCH 22/24] lustre: uapi: fixup UAPI headers for native Linux client James Simmons
@ 2021-09-22  2:20 ` James Simmons
  2021-09-22  2:20 ` [lustre-devel] [PATCH 24/24] lustre: pcc: VM_WRITE should not trigger layout write James Simmons
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:20 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

The kernel and userland wiretest utilities are both used by the
client and server to validate data being sent over the network.
Make the userland wiretest buildable on the native Linux client,
which lacks server-specific data structures. Use the UAPI
values to harden testing of userland data passed to the
kernel.
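
The kind of layout check wiretest performs can be sketched in plain
C as below. This is a minimal illustration, not the generated
wiretest code: struct ex_extent and ex_check_wire_layout() are
hypothetical names, with the struct mirroring the two __u64 fields
of struct hsm_extent, and a plain return code stands in for
LASSERTF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same two-field layout as
 * struct hsm_extent (two 64-bit fields).
 */
struct ex_extent {
	uint64_t offset;
	uint64_t length;
};

/* Compile-time checks: the build fails if the layout drifts. */
static_assert(sizeof(struct ex_extent) == 16, "unexpected size");
static_assert(offsetof(struct ex_extent, offset) == 0, "unexpected offset");
static_assert(offsetof(struct ex_extent, length) == 8, "unexpected offset");

/* Runtime variant of the same size and offset checks. Returns 0
 * when the layout matches the expected wire format, -1 otherwise.
 */
static int ex_check_wire_layout(void)
{
	if (sizeof(struct ex_extent) != 16)
		return -1;
	if (offsetof(struct ex_extent, length) != 8)
		return -1;
	if (sizeof(((struct ex_extent *)0)->length) != 8)
		return -1;
	return 0;
}
```

Checking sizes and offsets explicitly, rather than trusting the
compiler, is what lets both ends of the wire agree on a structure
even when they are built from different trees.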

WC-bug-id: https://jira.whamcloud.com/browse/LU-13903
Lustre-commit: 9ef92397c3c806631 ("LU-13903 utils: separate out server code for wiretest")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43873
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
 fs/lustre/include/lustre_swab.h        |   1 -
 fs/lustre/obdclass/llog_swab.c         |  33 ------
 fs/lustre/ptlrpc/layout.c              |   3 +-
 fs/lustre/ptlrpc/pack_generic.c        |  13 +--
 fs/lustre/ptlrpc/wiretest.c            | 205 +++++++++++++--------------------
 include/uapi/linux/lustre/lustre_idl.h |  78 +++----------
 6 files changed, 97 insertions(+), 236 deletions(-)

diff --git a/fs/lustre/include/lustre_swab.h b/fs/lustre/include/lustre_swab.h
index bac3636..000e622 100644
--- a/fs/lustre/include/lustre_swab.h
+++ b/fs/lustre/include/lustre_swab.h
@@ -70,7 +70,6 @@
 void lustre_swab_lmv_desc(struct lmv_desc *ld);
 void lustre_swab_lmv_mds_md(union lmv_mds_md *lmm);
 void lustre_swab_lov_desc(struct lov_desc *ld);
-void lustre_swab_gl_desc(union ldlm_gl_desc *desc);
 void lustre_swab_ldlm_intent(struct ldlm_intent *i);
 void lustre_swab_ldlm_request(struct ldlm_request *rq);
 void lustre_swab_ldlm_reply(struct ldlm_reply *r);
diff --git a/fs/lustre/obdclass/llog_swab.c b/fs/lustre/obdclass/llog_swab.c
index 7bfc304..fcc2a48 100644
--- a/fs/lustre/obdclass/llog_swab.c
+++ b/fs/lustre/obdclass/llog_swab.c
@@ -189,25 +189,6 @@ void lustre_swab_llog_rec(struct llog_rec_hdr *rec)
 		break;
 	}
 
-	case CHANGELOG_USER_REC:
-	case CHANGELOG_USER_REC2:
-	{
-		struct llog_changelog_user_rec2 *cur =
-			(struct llog_changelog_user_rec2 *)rec;
-
-		__swab32s(&cur->cur_id);
-		__swab64s(&cur->cur_endrec);
-		if (cur->cur_hdr.lrh_type == CHANGELOG_USER_REC2) {
-			__swab32s(&cur->cur_mask);
-			BUILD_BUG_ON(offsetof(typeof(*cur), cur_padding1) == 0);
-			BUILD_BUG_ON(offsetof(typeof(*cur), cur_padding2) == 0);
-			BUILD_BUG_ON(offsetof(typeof(*cur), cur_padding3) == 0);
-		}
-		tail = (struct llog_rec_tail *)((char *)rec +
-						rec->lrh_len - sizeof(*tail));
-		break;
-	}
-
 	case HSM_AGENT_REC: {
 		struct llog_agent_req_rec *arr =
 			(struct llog_agent_req_rec *)rec;
@@ -225,20 +206,6 @@ void lustre_swab_llog_rec(struct llog_rec_hdr *rec)
 		break;
 	}
 
-	case MDS_SETATTR64_REC:
-	{
-		struct llog_setattr64_rec *lsr =
-			(struct llog_setattr64_rec *)rec;
-
-		lustre_swab_ost_id(&lsr->lsr_oi);
-		__swab32s(&lsr->lsr_uid);
-		__swab32s(&lsr->lsr_uid_h);
-		__swab32s(&lsr->lsr_gid);
-		__swab32s(&lsr->lsr_gid_h);
-		__swab64s(&lsr->lsr_valid);
-		tail = &lsr->lsr_tail;
-		break;
-	}
 	case OBD_CFG_REC:
 		/* these are swabbed as they are consumed */
 		break;
diff --git a/fs/lustre/ptlrpc/layout.c b/fs/lustre/ptlrpc/layout.c
index 2874a41..f31ab6e 100644
--- a/fs/lustre/ptlrpc/layout.c
+++ b/fs/lustre/ptlrpc/layout.c
@@ -1052,8 +1052,7 @@ struct req_msg_field RMF_DLM_LVB =
 EXPORT_SYMBOL(RMF_DLM_LVB);
 
 struct req_msg_field RMF_DLM_GL_DESC =
-	DEFINE_MSGF("dlm_gl_desc", 0, sizeof(union ldlm_gl_desc),
-		    lustre_swab_gl_desc, NULL);
+	DEFINE_MSGF("dlm_gl_desc", 0, sizeof(union ldlm_gl_desc), NULL, NULL);
 EXPORT_SYMBOL(RMF_DLM_GL_DESC);
 
 struct req_msg_field RMF_MDT_MD =
diff --git a/fs/lustre/ptlrpc/pack_generic.c b/fs/lustre/ptlrpc/pack_generic.c
index 6710e6b..62e060d 100644
--- a/fs/lustre/ptlrpc/pack_generic.c
+++ b/fs/lustre/ptlrpc/pack_generic.c
@@ -1738,17 +1738,6 @@ void lustre_swab_generic_32s(u32 *val)
 	__swab32s(val);
 }
 
-void lustre_swab_gl_desc(union ldlm_gl_desc *desc)
-{
-	lustre_swab_lu_fid(&desc->lquota_desc.gl_id.qid_fid);
-	__swab64s(&desc->lquota_desc.gl_flags);
-	__swab64s(&desc->lquota_desc.gl_ver);
-	__swab64s(&desc->lquota_desc.gl_hardlimit);
-	__swab64s(&desc->lquota_desc.gl_softlimit);
-	__swab64s(&desc->lquota_desc.gl_time);
-	BUILD_BUG_ON(offsetof(typeof(desc->lquota_desc), gl_pad2) == 0);
-}
-
 void lustre_swab_ost_lvb_v1(struct ost_lvb_v1 *lvb)
 {
 	__swab64s(&lvb->lvb_size);
@@ -2321,7 +2310,7 @@ void lustre_swab_ldlm_intent(struct ldlm_intent *i)
 static void lustre_swab_ldlm_resource_desc(struct ldlm_resource_desc *r)
 {
 	__swab32s(&r->lr_type);
-	BUILD_BUG_ON(offsetof(typeof(*r), lr_padding) == 0);
+	BUILD_BUG_ON(offsetof(typeof(*r), lr_pad) == 0);
 	lustre_swab_ldlm_res_id(&r->lr_name);
 }
 
diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c
index 1e89974..bf09341 100644
--- a/fs/lustre/ptlrpc/wiretest.c
+++ b/fs/lustre/ptlrpc/wiretest.c
@@ -24,7 +24,7 @@
  * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved.
  * Use is subject to license terms.
  *
- * Copyright (c) 2011, 2015, Intel Corporation.
+ * Copyright (c) 2011, 2017, Intel Corporation.
  */
 /*
  * This file is part of Lustre, http://www.lustre.org/
@@ -32,14 +32,15 @@
 
 #define DEBUG_SUBSYSTEM S_RPC
 
+#ifdef CONFIG_LUSTRE_FS_POSIX_ACL
 #include <linux/fs.h>
 #include <linux/posix_acl_xattr.h>
+#endif /* CONFIG_LUSTRE_FS_POSIX_ACL */
 
 #include <obd_support.h>
 #include <obd_class.h>
 #include <lustre_net.h>
 #include <lustre_disk.h>
-#include <uapi/linux/lustre/lustre_idl.h>
 #include <uapi/linux/lustre/lustre_cfg.h>
 
 #include "ptlrpc_internal.h"
@@ -48,9 +49,6 @@ void lustre_assert_wire_constants(void)
 {
 	/* Wire protocol assertions generated by 'wirecheck'
 	 * (make -C lustre/utils newwiretest)
-	 * running on Linux centos6-bis 2.6.32-358.0.1.el6-head
-	 * #3 SMP Wed Apr 17 17:37:43 CEST 2013
-	 * with gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)
 	 */
 
 	/* Constants... */
@@ -257,11 +255,12 @@ void lustre_assert_wire_constants(void)
 		 (long long)MDS_ATTR_KILL_SGID);
 	LASSERTF(MDS_ATTR_CTIME_SET == 0x0000000000002000ULL, "found 0x%.16llxULL\n",
 		 (long long)MDS_ATTR_CTIME_SET);
+	LASSERTF(MDS_ATTR_FROM_OPEN == 0x0000000000004000ULL, "found 0x%.16llxULL\n",
+		 (long long)MDS_ATTR_FROM_OPEN);
 	LASSERTF(MDS_ATTR_BLOCKS == 0x0000000000008000ULL, "found 0x%.16llxULL\n",
 		 (long long)MDS_ATTR_BLOCKS);
 	LASSERTF(MDS_ATTR_PROJID == 0x0000000000010000ULL, "found 0x%.16llxULL\n",
 		 (long long)MDS_ATTR_PROJID);
-
 	LASSERTF(MDS_ATTR_LSIZE == 0x0000000000020000ULL, "found 0x%.16llxULL\n",
 		 (long long)MDS_ATTR_LSIZE);
 	LASSERTF(MDS_ATTR_LBLOCKS == 0x0000000000040000ULL, "found 0x%.16llxULL\n",
@@ -420,30 +419,6 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct lustre_som_attrs *)0)->lsa_blocks) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct lustre_som_attrs *)0)->lsa_blocks));
 
-	/* Checks for struct lustre_mdt_attrs */
-	LASSERTF((int)sizeof(struct lustre_mdt_attrs) == 24, "found %lld\n",
-		 (long long)(int)sizeof(struct lustre_mdt_attrs));
-	LASSERTF((int)offsetof(struct lustre_mdt_attrs, lma_compat) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct lustre_mdt_attrs, lma_compat));
-	LASSERTF((int)sizeof(((struct lustre_mdt_attrs *)0)->lma_compat) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct lustre_mdt_attrs *)0)->lma_compat));
-	LASSERTF((int)offsetof(struct lustre_mdt_attrs, lma_incompat) == 4, "found %lld\n",
-		 (long long)(int)offsetof(struct lustre_mdt_attrs, lma_incompat));
-	LASSERTF((int)sizeof(((struct lustre_mdt_attrs *)0)->lma_incompat) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct lustre_mdt_attrs *)0)->lma_incompat));
-	LASSERTF((int)offsetof(struct lustre_mdt_attrs, lma_self_fid) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct lustre_mdt_attrs, lma_self_fid));
-	LASSERTF((int)sizeof(((struct lustre_mdt_attrs *)0)->lma_self_fid) == 16, "found %lld\n",
-		 (long long)(int)sizeof(((struct lustre_mdt_attrs *)0)->lma_self_fid));
-	LASSERTF(LMAI_RELEASED == 0x00000001UL, "found 0x%.8xUL\n",
-		 (unsigned int)LMAI_RELEASED);
-	LASSERTF(LMAC_HSM == 0x00000001UL, "found 0x%.8xUL\n",
-		 (unsigned int)LMAC_HSM);
-	LASSERTF(LMAC_NOT_IN_OI == 0x00000004UL, "found 0x%.8xUL\n",
-		 (unsigned int)LMAC_NOT_IN_OI);
-	LASSERTF(LMAC_FID_ON_OST == 0x00000008UL, "found 0x%.8xUL\n",
-		 (unsigned int)LMAC_FID_ON_OST);
-
 	/* Checks for struct ost_id */
 	LASSERTF((int)sizeof(struct ost_id) == 16, "found %lld\n",
 		 (long long)(int)sizeof(struct ost_id));
@@ -459,10 +434,12 @@ void lustre_assert_wire_constants(void)
 		 (long long)FID_SEQ_LLOG);
 	LASSERTF(FID_SEQ_ECHO == 2, "found %lld\n",
 		 (long long)FID_SEQ_ECHO);
-	LASSERTF(FID_SEQ_OST_MDT1 == 3, "found %lld\n",
-		 (long long)FID_SEQ_OST_MDT1);
-	LASSERTF(FID_SEQ_OST_MAX == 9, "found %lld\n",
-		 (long long)FID_SEQ_OST_MAX);
+	LASSERTF(FID_SEQ_UNUSED_START == 3, "found %lld\n",
+		 (long long)FID_SEQ_UNUSED_START);
+	LASSERTF(FID_SEQ_UNUSED_END == 9, "found %lld\n",
+		 (long long)FID_SEQ_UNUSED_END);
+	LASSERTF(FID_SEQ_LLOG_NAME == 10, "found %lld\n",
+		 (long long)FID_SEQ_LLOG_NAME);
 	LASSERTF(FID_SEQ_RSVD == 11, "found %lld\n",
 		 (long long)FID_SEQ_RSVD);
 	LASSERTF(FID_SEQ_IGIF == 12, "found %lld\n",
@@ -479,6 +456,8 @@ void lustre_assert_wire_constants(void)
 		 (long long)FID_SEQ_LOCAL_FILE);
 	LASSERTF(FID_SEQ_DOT_LUSTRE == 0x0000000200000002ULL, "found 0x%.16llxULL\n",
 		 (long long)FID_SEQ_DOT_LUSTRE);
+	LASSERTF(FID_SEQ_LOCAL_NAME == 0x0000000200000003ULL, "found 0x%.16llxULL\n",
+		 (long long)FID_SEQ_LOCAL_NAME);
 	LASSERTF(FID_SEQ_SPECIAL == 0x0000000200000004ULL, "found 0x%.16llxULL\n",
 		 (long long)FID_SEQ_SPECIAL);
 	LASSERTF(FID_SEQ_QUOTA == 0x0000000200000005ULL, "found 0x%.16llxULL\n",
@@ -497,6 +476,8 @@ void lustre_assert_wire_constants(void)
 		 (unsigned int)FID_OID_DOT_LUSTRE);
 	LASSERTF(FID_OID_DOT_LUSTRE_OBF == 0x00000002UL, "found 0x%.8xUL\n",
 		 (unsigned int)FID_OID_DOT_LUSTRE_OBF);
+	LASSERTF(FID_OID_DOT_LUSTRE_LPF == 0x00000003UL, "found 0x%.8xUL\n",
+		 (unsigned int)FID_OID_DOT_LUSTRE_LPF);
 
 	/* Checks for struct lu_dirent */
 	LASSERTF((int)sizeof(struct lu_dirent) == 32, "found %lld\n",
@@ -1112,8 +1093,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_CONNECT_TRANSNO);
 	LASSERTF(OBD_CONNECT_IBITS == 0x1000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT_IBITS);
-	LASSERTF(OBD_CONNECT_JOIN == 0x2000ULL, "found 0x%.16llxULL\n",
-		 OBD_CONNECT_JOIN);
+	LASSERTF(OBD_CONNECT_BARRIER == 0x2000ULL, "found 0x%.16llxULL\n",
+		 OBD_CONNECT_BARRIER);
 	LASSERTF(OBD_CONNECT_ATTRFID == 0x4000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT_ATTRFID);
 	LASSERTF(OBD_CONNECT_NODEVOH == 0x8000ULL, "found 0x%.16llxULL\n",
@@ -1204,6 +1185,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_CONNECT_DIR_STRIPE);
 	LASSERTF(OBD_CONNECT_SUBTREE == 0x800000000000000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT_SUBTREE);
+	LASSERTF(OBD_CONNECT_BULK_MBITS == 0x2000000000000000ULL, "found 0x%.16llxULL\n",
+		 OBD_CONNECT_BULK_MBITS);
 	LASSERTF(OBD_CONNECT_OBDOPACK == 0x4000000000000000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT_OBDOPACK);
 	LASSERTF(OBD_CONNECT_FLAGS2 == 0x8000000000000000ULL, "found 0x%.16llxULL\n",
@@ -1502,6 +1485,10 @@ void lustre_assert_wire_constants(void)
 		 OBD_MD_FLGETATTRLOCK);
 	LASSERTF(OBD_MD_FLDATAVERSION == (0x0010000000000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLDATAVERSION);
+	LASSERTF(OBD_MD_CLOSE_INTENT_EXECED == (0x0020000000000000ULL), "found 0x%.16llxULL\n",
+		 OBD_MD_CLOSE_INTENT_EXECED);
+	LASSERTF(OBD_MD_DEFAULT_MEA == (0x0040000000000000ULL), "found 0x%.16llxULL\n",
+		 OBD_MD_DEFAULT_MEA);
 	LASSERTF(OBD_MD_FLOSTLAYOUT == (0x0080000000000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLOSTLAYOUT);
 	LASSERTF(OBD_MD_FLPROJID == (0x0100000000000000ULL), "found 0x%.16llxULL\n",
@@ -1538,6 +1525,8 @@ void lustre_assert_wire_constants(void)
 	BUILD_BUG_ON(OBD_FL_MMAP != 0x00040000);
 	BUILD_BUG_ON(OBD_FL_RECOV_RESEND != 0x00080000);
 	BUILD_BUG_ON(OBD_FL_NOSPC_BLK != 0x00100000);
+	BUILD_BUG_ON(OBD_FL_FLUSH != 0x00200000);
+	BUILD_BUG_ON(OBD_FL_SHORT_IO != 0x00400000);
 
 	/* Checks for struct lov_ost_data_v1 */
 	LASSERTF((int)sizeof(struct lov_ost_data_v1) == 24, "found %lld\n",
@@ -1616,10 +1605,10 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct lov_mds_md_v3 *)0)->lmm_layout_gen) == 2, "found %lld\n",
 		 (long long)(int)sizeof(((struct lov_mds_md_v3 *)0)->lmm_layout_gen));
 	BUILD_BUG_ON(LOV_MAXPOOLNAME != 15);
-	LASSERTF((int)offsetof(struct lov_mds_md_v3, lmm_pool_name[16]) == 48, "found %lld\n",
-		 (long long)(int)offsetof(struct lov_mds_md_v3, lmm_pool_name[16]));
-	LASSERTF((int)sizeof(((struct lov_mds_md_v3 *)0)->lmm_pool_name[16]) == 1, "found %lld\n",
-		 (long long)(int)sizeof(((struct lov_mds_md_v3 *)0)->lmm_pool_name[16]));
+	LASSERTF((int)offsetof(struct lov_mds_md_v3, lmm_pool_name[15 + 1]) == 48, "found %lld\n",
+		 (long long)(int)offsetof(struct lov_mds_md_v3, lmm_pool_name[15 + 1]));
+	LASSERTF((int)sizeof(((struct lov_mds_md_v3 *)0)->lmm_pool_name[15 + 1]) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct lov_mds_md_v3 *)0)->lmm_pool_name[15 + 1]));
 	LASSERTF((int)offsetof(struct lov_mds_md_v3, lmm_objects[0]) == 48, "found %lld\n",
 		 (long long)(int)offsetof(struct lov_mds_md_v3, lmm_objects[0]));
 	LASSERTF((int)sizeof(((struct lov_mds_md_v3 *)0)->lmm_objects[0]) == 24, "found %lld\n",
@@ -1769,10 +1758,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct lmv_mds_md_v1, lmv_padding3));
 	LASSERTF((int)sizeof(((struct lmv_mds_md_v1 *)0)->lmv_padding3) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct lmv_mds_md_v1 *)0)->lmv_padding3));
-	LASSERTF((int)offsetof(struct lmv_mds_md_v1, lmv_pool_name[16]) == 56, "found %lld\n",
-		 (long long)(int)offsetof(struct lmv_mds_md_v1, lmv_pool_name[16]));
-	LASSERTF((int)sizeof(((struct lmv_mds_md_v1 *)0)->lmv_pool_name[16]) == 1, "found %lld\n",
-		 (long long)(int)sizeof(((struct lmv_mds_md_v1 *)0)->lmv_pool_name[16]));
+	LASSERTF((int)offsetof(struct lmv_mds_md_v1, lmv_pool_name[15 + 1]) == 56, "found %lld\n",
+		 (long long)(int)offsetof(struct lmv_mds_md_v1, lmv_pool_name[15 + 1]));
+	LASSERTF((int)sizeof(((struct lmv_mds_md_v1 *)0)->lmv_pool_name[15 + 1]) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct lmv_mds_md_v1 *)0)->lmv_pool_name[15 + 1]));
 	LASSERTF((int)offsetof(struct lmv_mds_md_v1, lmv_stripe_fids[0]) == 56, "found %lld\n",
 		 (long long)(int)offsetof(struct lmv_mds_md_v1, lmv_stripe_fids[0]));
 	LASSERTF((int)sizeof(((struct lmv_mds_md_v1 *)0)->lmv_stripe_fids[0]) == 16, "found %lld\n",
@@ -1913,6 +1902,11 @@ void lustre_assert_wire_constants(void)
 	LASSERTF(QIF_DQBLKSIZE == 1024, "found %lld\n",
 		 (long long)QIF_DQBLKSIZE);
 
+	LASSERTF(QIF_DQBLKSIZE_BITS == 10, "found %lld\n",
+		 (long long)QIF_DQBLKSIZE_BITS);
+	LASSERTF(QIF_DQBLKSIZE == 1024, "found %lld\n",
+		 (long long)QIF_DQBLKSIZE);
+
 	/* Checks for struct obd_quotactl */
 	LASSERTF((int)sizeof(struct obd_quotactl) == 112, "found %lld\n",
 		 (long long)(int)sizeof(struct obd_quotactl));
@@ -2060,6 +2054,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_BRW_OVER_GRPQUOTA);
 	LASSERTF(OBD_BRW_SOFT_SYNC == 0x4000, "found 0x%.8x\n",
 		 OBD_BRW_SOFT_SYNC);
+	LASSERTF(OBD_BRW_OVER_PRJQUOTA == 0x8000, "found 0x%.8x\n",
+		 OBD_BRW_OVER_PRJQUOTA);
 	LASSERTF(OBD_BRW_RDMA_ONLY == 0x20000, "found 0x%.8x\n",
 		 OBD_BRW_RDMA_ONLY);
 
@@ -3004,6 +3000,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct mdt_rec_resync, rs_padding0));
 	LASSERTF((int)sizeof(((struct mdt_rec_resync *)0)->rs_padding0) == 16, "found %lld\n",
 		 (long long)(int)sizeof(((struct mdt_rec_resync *)0)->rs_padding0));
+	LASSERTF((int)offsetof(struct mdt_rec_resync, rs_lease_handle) == 72, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_rec_resync, rs_lease_handle));
+	LASSERTF((int)sizeof(((struct mdt_rec_resync *)0)->rs_lease_handle) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_rec_resync *)0)->rs_lease_handle));
 	LASSERTF((int)offsetof(struct mdt_rec_resync, rs_padding1) == 80, "found %lld\n",
 		 (long long)(int)offsetof(struct mdt_rec_resync, rs_padding1));
 	LASSERTF((int)sizeof(((struct mdt_rec_resync *)0)->rs_padding1) == 8, "found %lld\n",
@@ -3278,6 +3278,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct ldlm_inodebits, bits));
 	LASSERTF((int)sizeof(((struct ldlm_inodebits *)0)->bits) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct ldlm_inodebits *)0)->bits));
+	LASSERTF((int)offsetof(struct ldlm_inodebits, cancel_bits) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct ldlm_inodebits, cancel_bits));
+	LASSERTF((int)sizeof(((struct ldlm_inodebits *)0)->cancel_bits) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct ldlm_inodebits *)0)->cancel_bits));
 	LASSERTF((int)offsetof(struct ldlm_inodebits, li_gid) == 16, "found %lld\n",
 		 (long long)(int)offsetof(struct ldlm_inodebits, li_gid));
 	LASSERTF((int)sizeof(((struct ldlm_inodebits *)0)->li_gid) == 8, "found %lld\n",
@@ -3333,10 +3337,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct ldlm_resource_desc, lr_type));
 	LASSERTF((int)sizeof(((struct ldlm_resource_desc *)0)->lr_type) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct ldlm_resource_desc *)0)->lr_type));
-	LASSERTF((int)offsetof(struct ldlm_resource_desc, lr_padding) == 4, "found %lld\n",
-		 (long long)(int)offsetof(struct ldlm_resource_desc, lr_padding));
-	LASSERTF((int)sizeof(((struct ldlm_resource_desc *)0)->lr_padding) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ldlm_resource_desc *)0)->lr_padding));
+	LASSERTF((int)offsetof(struct ldlm_resource_desc, lr_pad) == 4, "found %lld\n",
+		 (long long)(int)offsetof(struct ldlm_resource_desc, lr_pad));
+	LASSERTF((int)sizeof(((struct ldlm_resource_desc *)0)->lr_pad) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct ldlm_resource_desc *)0)->lr_pad));
 	LASSERTF((int)offsetof(struct ldlm_resource_desc, lr_name) == 8, "found %lld\n",
 		 (long long)(int)offsetof(struct ldlm_resource_desc, lr_name));
 	LASSERTF((int)sizeof(((struct ldlm_resource_desc *)0)->lr_name) == 32, "found %lld\n",
@@ -3532,6 +3536,15 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct ldlm_gl_lquota_desc *)0)->gl_pad2) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct ldlm_gl_lquota_desc *)0)->gl_pad2));
 
+	/* Checks for struct mgs_send_param */
+	LASSERTF((int)sizeof(struct mgs_send_param) == 1024, "found %lld\n",
+		 (long long)(int)sizeof(struct mgs_send_param));
+	BUILD_BUG_ON(MGS_PARAM_MAXLEN != 1024);
+	LASSERTF((int)offsetof(struct mgs_send_param, mgs_param[1024]) == 1024, "found %lld\n",
+		 (long long)(int)offsetof(struct mgs_send_param, mgs_param[1024]));
+	LASSERTF((int)sizeof(((struct mgs_send_param *)0)->mgs_param[1024]) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct mgs_send_param *)0)->mgs_param[1024]));
+
 	/* Checks for struct cfg_marker */
 	LASSERTF((int)sizeof(struct cfg_marker) == 160, "found %lld\n",
 		 (long long)(int)sizeof(struct cfg_marker));
@@ -3589,6 +3602,7 @@ void lustre_assert_wire_constants(void)
 	BUILD_BUG_ON(CHANGELOG_USER_REC != 0x10670000);
 	BUILD_BUG_ON(CHANGELOG_USER_REC2 != 0x10670002);
 	BUILD_BUG_ON(HSM_AGENT_REC != 0x10680000);
+	BUILD_BUG_ON(UPDATE_REC != 0x106a0000);
 	BUILD_BUG_ON(LLOG_HDR_MAGIC != 0x10645539);
 	BUILD_BUG_ON(LLOG_LOGID_MAGIC != 0x1064553b);
 
@@ -3727,42 +3741,6 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct llog_unlink64_rec *)0)->lur_padding3) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct llog_unlink64_rec *)0)->lur_padding3));
 
-	/* Checks for struct llog_setattr64_rec */
-	LASSERTF((int)sizeof(struct llog_setattr64_rec) == 64, "found %lld\n",
-		 (long long)(int)sizeof(struct llog_setattr64_rec));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_hdr) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_hdr));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_hdr) == 16, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_hdr));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_oi) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_oi));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_oi) == 16, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_oi));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_uid) == 32, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_uid));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_uid) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_uid));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_uid_h) == 36, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_uid_h));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_uid_h) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_uid_h));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_gid) == 40, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_gid));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_gid) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_gid));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_gid_h) == 44, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_gid_h));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_gid_h) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_gid_h));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_valid) == 48, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_valid));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_valid) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_valid));
-	LASSERTF((int)offsetof(struct llog_setattr64_rec, lsr_tail) == 56, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_setattr64_rec, lsr_tail));
-	LASSERTF((int)sizeof(((struct llog_setattr64_rec *)0)->lsr_tail) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_setattr64_rec *)0)->lsr_tail));
-
 	/* Checks for struct llog_size_change_rec */
 	LASSERTF((int)sizeof(struct llog_size_change_rec) == 64, "found %lld\n",
 		 (long long)(int)sizeof(struct llog_size_change_rec));
@@ -3831,6 +3809,18 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct changelog_rec *)0)->cr_pfid) == 16, "found %lld\n",
 		 (long long)(int)sizeof(((struct changelog_rec *)0)->cr_pfid));
 
+	/* Checks for struct changelog_ext_rename */
+	LASSERTF((int)sizeof(struct changelog_ext_rename) == 32, "found %lld\n",
+		 (long long)(int)sizeof(struct changelog_ext_rename));
+	LASSERTF((int)offsetof(struct changelog_ext_rename, cr_sfid) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct changelog_ext_rename, cr_sfid));
+	LASSERTF((int)sizeof(((struct changelog_ext_rename *)0)->cr_sfid) == 16, "found %lld\n",
+		 (long long)(int)sizeof(((struct changelog_ext_rename *)0)->cr_sfid));
+	LASSERTF((int)offsetof(struct changelog_ext_rename, cr_spfid) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct changelog_ext_rename, cr_spfid));
+	LASSERTF((int)sizeof(((struct changelog_ext_rename *)0)->cr_spfid) == 16, "found %lld\n",
+		 (long long)(int)sizeof(((struct changelog_ext_rename *)0)->cr_spfid));
+
 	/* Checks for struct changelog_setinfo */
 	LASSERTF((int)sizeof(struct changelog_setinfo) == 12, "found %lld\n",
 		 (long long)(int)sizeof(struct changelog_setinfo));
@@ -3859,30 +3849,6 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct llog_changelog_rec *)0)->cr_do_not_use) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct llog_changelog_rec *)0)->cr_do_not_use));
 
-	/* Checks for struct llog_changelog_user_rec */
-	LASSERTF((int)sizeof(struct llog_changelog_user_rec) == 40, "found %lld\n",
-		 (long long)(int)sizeof(struct llog_changelog_user_rec));
-	LASSERTF((int)offsetof(struct llog_changelog_user_rec, cur_hdr) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_changelog_user_rec, cur_hdr));
-	LASSERTF((int)sizeof(((struct llog_changelog_user_rec *)0)->cur_hdr) == 16, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_changelog_user_rec *)0)->cur_hdr));
-	LASSERTF((int)offsetof(struct llog_changelog_user_rec, cur_id) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_changelog_user_rec, cur_id));
-	LASSERTF((int)sizeof(((struct llog_changelog_user_rec *)0)->cur_id) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_changelog_user_rec *)0)->cur_id));
-	LASSERTF((int)offsetof(struct llog_changelog_user_rec, cur_padding) == 20, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_changelog_user_rec, cur_padding));
-	LASSERTF((int)sizeof(((struct llog_changelog_user_rec *)0)->cur_padding) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_changelog_user_rec *)0)->cur_padding));
-	LASSERTF((int)offsetof(struct llog_changelog_user_rec, cur_endrec) == 24, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_changelog_user_rec, cur_endrec));
-	LASSERTF((int)sizeof(((struct llog_changelog_user_rec *)0)->cur_endrec) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_changelog_user_rec *)0)->cur_endrec));
-	LASSERTF((int)offsetof(struct llog_changelog_user_rec, cur_tail) == 32, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_changelog_user_rec, cur_tail));
-	LASSERTF((int)sizeof(((struct llog_changelog_user_rec *)0)->cur_tail) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_changelog_user_rec *)0)->cur_tail));
-
 	/* Checks for struct llog_gen */
 	LASSERTF((int)sizeof(struct llog_gen) == 16, "found %lld\n",
 		 (long long)(int)sizeof(struct llog_gen));
@@ -3946,18 +3912,6 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct llog_log_hdr, llh_tgtuuid));
 	LASSERTF((int)sizeof(((struct llog_log_hdr *)0)->llh_tgtuuid) == 40, "found %lld\n",
 		 (long long)(int)sizeof(((struct llog_log_hdr *)0)->llh_tgtuuid));
-	LASSERTF((int)offsetof(struct llog_log_hdr, llh_reserved) == 84, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_log_hdr, llh_reserved));
-	LASSERTF((int)sizeof(((struct llog_log_hdr *)0)->llh_reserved) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_log_hdr *)0)->llh_reserved));
-	LASSERTF((int)offsetof(struct llog_log_hdr, llh_bitmap) == 88, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_log_hdr, llh_bitmap));
-	LASSERTF((int)sizeof(((struct llog_log_hdr *)0)->llh_bitmap) == 8096, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_log_hdr *)0)->llh_bitmap));
-	LASSERTF((int)offsetof(struct llog_log_hdr, llh_tail) == 8184, "found %lld\n",
-		 (long long)(int)offsetof(struct llog_log_hdr, llh_tail));
-	LASSERTF((int)sizeof(((struct llog_log_hdr *)0)->llh_tail) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct llog_log_hdr *)0)->llh_tail));
 	BUILD_BUG_ON(LLOG_F_ZAP_WHEN_EMPTY != 0x00000001);
 	BUILD_BUG_ON(LLOG_F_IS_CAT != 0x00000002);
 	BUILD_BUG_ON(LLOG_F_IS_PLAIN != 0x00000004);
@@ -4018,7 +3972,9 @@ void lustre_assert_wire_constants(void)
 	BUILD_BUG_ON(LLOG_CHANGELOG_REPL_CTXT != 13);
 	BUILD_BUG_ON(LLOG_CHANGELOG_USER_ORIG_CTXT != 14);
 	BUILD_BUG_ON(LLOG_AGENT_ORIG_CTXT != 15);
-	BUILD_BUG_ON(LLOG_MAX_CTXTS != 16);
+	BUILD_BUG_ON(LLOG_UPDATELOG_ORIG_CTXT != 16);
+	BUILD_BUG_ON(LLOG_UPDATELOG_REPL_CTXT != 17);
+	BUILD_BUG_ON(LLOG_MAX_CTXTS != 18);
 
 	/* Checks for struct llogd_conn_body */
 	LASSERTF((int)sizeof(struct llogd_conn_body) == 40, "found %lld\n",
@@ -4036,7 +3992,7 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct llogd_conn_body *)0)->lgdc_ctxt_idx) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct llogd_conn_body *)0)->lgdc_ctxt_idx));
 
-	/* Checks for struct fiemap_info_key */
+	/* Checks for struct ll_fiemap_info_key */
 	LASSERTF((int)sizeof(struct ll_fiemap_info_key) == 248, "found %lld\n",
 		 (long long)(int)sizeof(struct ll_fiemap_info_key));
 	LASSERTF((int)offsetof(struct ll_fiemap_info_key, lfik_name[8]) == 8, "found %lld\n",
@@ -4167,7 +4123,6 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct mgs_config_body, mcb_units));
 	LASSERTF((int)sizeof(((struct mgs_config_body *)0)->mcb_units) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct mgs_config_body *)0)->mcb_units));
-
 	BUILD_BUG_ON(MGS_CFG_T_CONFIG != 0);
 	BUILD_BUG_ON(MGS_CFG_T_SPTLRPC != 1);
 	BUILD_BUG_ON(MGS_CFG_T_RECOVER != 2);
@@ -4608,6 +4563,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct hsm_current_action, hca_location));
 	LASSERTF((int)sizeof(((struct hsm_current_action *)0)->hca_location) == 16, "found %lld\n",
 		 (long long)(int)sizeof(((struct hsm_current_action *)0)->hca_location));
+	BUILD_BUG_ON(HPS_NONE != 0);
+	BUILD_BUG_ON(HPS_WAITING != 1);
+	BUILD_BUG_ON(HPS_RUNNING != 2);
+	BUILD_BUG_ON(HPS_DONE != 3);
 	BUILD_BUG_ON(HUA_NONE != 1);
 	BUILD_BUG_ON(HUA_ARCHIVE != 10);
 	BUILD_BUG_ON(HUA_RESTORE != 11);
diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h
index 77a64f2..7d92264 100644
--- a/include/uapi/linux/lustre/lustre_idl.h
+++ b/include/uapi/linux/lustre/lustre_idl.h
@@ -150,35 +150,6 @@ struct lu_seq_range_array {
  */
 
 /**
- * Flags for lustre_mdt_attrs::lma_compat and lustre_mdt_attrs::lma_incompat.
- * Deprecated since HSM and SOM attributes are now stored in separate on-disk
- * xattr.
- */
-enum lma_compat {
-	LMAC_HSM	= 0x00000001,
-/*	LMAC_SOM	= 0x00000002, obsolete since 2.8.0 */
-	LMAC_NOT_IN_OI	= 0x00000004, /* the object does NOT need OI mapping */
-	LMAC_FID_ON_OST = 0x00000008, /* For OST-object, its OI mapping is
-				       * under /O/<seq>/d<x>.
-				       */
-};
-
-/**
- * Masks for all features that should be supported by a Lustre version to
- * access a specific file.
- * This information is stored in lustre_mdt_attrs::lma_incompat.
- */
-enum lma_incompat {
-	LMAI_RELEASED		= 0x00000001, /* file is released */
-	LMAI_AGENT		= 0x00000002, /* agent inode */
-	LMAI_REMOTE_PARENT	= 0x00000004, /* the parent of the object
-					       * is on the remote MDT
-					       */
-};
-
-#define LMA_INCOMPAT_SUPP	(LMAI_AGENT | LMAI_REMOTE_PARENT)
-
-/**
  * fid constants
  */
 enum {
@@ -293,8 +264,8 @@ enum fid_seq {
 	FID_SEQ_OST_MDT0	= 0,
 	FID_SEQ_LLOG		= 1, /* unnamed llogs */
 	FID_SEQ_ECHO		= 2,
-	FID_SEQ_OST_MDT1	= 3,
-	FID_SEQ_OST_MAX		= 9, /* Max MDT count before OST_on_FID */
+	FID_SEQ_UNUSED_START	= 3, /* Unused */
+	FID_SEQ_UNUSED_END	= 9, /* Unused */
 	FID_SEQ_LLOG_NAME	= 10, /* named llogs */
 	FID_SEQ_RSVD		= 11,
 	FID_SEQ_IGIF		= 12,
@@ -340,6 +311,7 @@ enum special_oid {
 enum dot_lustre_oid {
 	FID_OID_DOT_LUSTRE	= 1UL,
 	FID_OID_DOT_LUSTRE_OBF	= 2UL,
+	FID_OID_DOT_LUSTRE_LPF	= 3UL,
 };
 
 /** OID for FID_SEQ_ROOT */
@@ -721,11 +693,8 @@ struct ptlrpc_body_v2 {
 #define OBD_CONNECT_LARGE_ACL		0x200ULL /* more than 32 ACL entries */
 #define OBD_CONNECT_TRANSNO		0x800ULL /*replay sends init transno */
 #define OBD_CONNECT_IBITS	       0x1000ULL /* not checked in 2.11+ */
-#define OBD_CONNECT_JOIN	       0x2000ULL /*files can be concatenated.
-						  *We do not support JOIN FILE
-						  *anymore, reserve this flags
-						  *just for preventing such bit
-						  *to be reused.
+#define OBD_CONNECT_BARRIER	       0x2000ULL /* write barrier. Reserved to
+						  * avoid use on client.
 						  */
 #define OBD_CONNECT_ATTRFID	       0x4000ULL /*Server can GetAttr By Fid*/
 #define OBD_CONNECT_NODEVOH	       0x8000ULL /*No open hndl on specl nodes*/
@@ -1214,8 +1183,6 @@ static inline __u32 lov_mds_md_size(__u16 stripes, __u32 lmm_magic)
 							 * requests means the
 							 * client holds the lock
 							 */
-#define OBD_MD_FLOBJCOUNT	(0x0000400000000000ULL) /* for multiple destroy */
-
 /*	OBD_MD_FLRMTLSETFACL	(0x0001000000000000ULL) lfs lsetfacl, obsolete */
 /*	OBD_MD_FLRMTLGETFACL	(0x0002000000000000ULL) lfs lgetfacl, obsolete */
 /*	OBD_MD_FLRMTRSETFACL	(0x0004000000000000ULL) lfs rsetfacl, obsolete */
@@ -1293,6 +1260,7 @@ struct hsm_state_set {
 				      * space for unstable pages; asking
 				      * it to sync quickly
 				      */
+#define OBD_BRW_OVER_PRJQUOTA 0x8000 /* Running out of project quota */
 #define OBD_BRW_RDMA_ONLY    0x20000 /* RPC contains RDMA-only pages*/
 
 #define OBD_MAX_GRANT 0x7fffffffUL /* Max grant allowed to one client: 2 GiB */
@@ -2272,8 +2240,8 @@ struct ldlm_intent {
 };
 
 struct ldlm_resource_desc {
-	enum ldlm_type lr_type;
-	__u32 lr_padding;	/* also fix lustre_swab_ldlm_resource_desc */
+	enum ldlm_type	   lr_type;
+	__u32		   lr_pad; /* also fix lustre_swab_ldlm_resource_desc */
 	struct ldlm_res_id lr_name;
 };
 
@@ -2435,6 +2403,8 @@ enum llog_ctxt_id {
 	/* for multiple changelog consumers */
 	LLOG_CHANGELOG_USER_ORIG_CTXT = 14,
 	LLOG_AGENT_ORIG_CTXT = 15, /**< agent requests generation on cdt */
+	LLOG_UPDATELOG_ORIG_CTXT = 16, /* update log. reserve for the client */
+	LLOG_UPDATELOG_REPL_CTXT = 17, /* update log. reserve for the client */
 	LLOG_MAX_CTXTS
 };
 
@@ -2478,6 +2448,9 @@ enum llog_op_type {
 	CHANGELOG_USER_REC	= LLOG_OP_MAGIC | 0x70000,
 	CHANGELOG_USER_REC2	= LLOG_OP_MAGIC | 0x70002,
 	HSM_AGENT_REC		= LLOG_OP_MAGIC | 0x80000,
+	UPDATE_REC		= LLOG_OP_MAGIC | 0xa0000, /* Reserved to avoid
+							    * use on client.
+							    */
 	LLOG_HDR_MAGIC		= LLOG_OP_MAGIC | 0x45539,
 	LLOG_LOGID_MAGIC	= LLOG_OP_MAGIC | 0x4553b,
 };
@@ -2572,31 +2545,6 @@ struct llog_changelog_rec {
 	struct llog_rec_tail	cr_do_not_use;	/**< for_sizezof_only */
 } __attribute__((packed));
 
-#define CHANGELOG_USER_NAMELEN 16 /* base name including NUL terminator */
-
-struct llog_changelog_user_rec {
-	struct llog_rec_hdr	cur_hdr;
-	__u32			cur_id;
-	__u32			cur_padding;
-	__u64			cur_endrec;
-	struct llog_rec_tail	cur_tail;
-} __attribute__((packed));
-
-/* this is twice the size of CHANGELOG_USER_REC */
-struct llog_changelog_user_rec2 {
-	struct llog_rec_hdr	cur_hdr;
-	__u32			cur_id;
-	/* only for use in relative time comparisons to detect idle users */
-	__u32			cur_time;
-	__u64			cur_endrec;
-	__u32                   cur_mask;
-	__u32			cur_padding1;
-	char			cur_name[CHANGELOG_USER_NAMELEN];
-	__u64			cur_padding2;
-	__u64			cur_padding3;
-	struct llog_rec_tail	cur_tail;
-} __attribute__((packed));
-
 enum agent_req_status {
 	ARS_WAITING,
 	ARS_STARTED,
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

* [lustre-devel] [PATCH 24/24] lustre: pcc: VM_WRITE should not trigger layout write
  2021-09-22  2:19 [lustre-devel] [PATCH 00/24] lustre: Update to OpenSFS Sept 21, 2021 James Simmons
                   ` (22 preceding siblings ...)
  2021-09-22  2:20 ` [lustre-devel] [PATCH 23/24] lustre: ptlrpc: separate out server code for wiretest James Simmons
@ 2021-09-22  2:20 ` James Simmons
  23 siblings, 0 replies; 25+ messages in thread
From: James Simmons @ 2021-09-22  2:20 UTC (permalink / raw)
  To: Andreas Dilger, Oleg Drokin, NeilBrown; +Cc: Lustre Development List

From: Qian Yingjin <qian@ddn.com>

A VM area marked with VM_WRITE means that its pages may be written,
but an mmap page write may never actually happen.
The layout write should therefore be delayed until the file is really
modified in ->page_mkwrite().
Otherwise, it triggers a panic in the PCC-RO test sanity-pcc test_21f().

Fixes: 0b5ce361e ("lustre: flr: mmap write/punch does not stale other mirrors")
WC-bug-id: https://jira.whamcloud.com/browse/LU-14709
Lustre-commit: 373475a4f448c8e26 ("LU-14709 pcc: VM_WRITE should not trigger layout write")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44483
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/cl_object.h |  5 -----
 fs/lustre/llite/llite_mmap.c  | 19 ++++++++++---------
 fs/lustre/llite/vvp_io.c      |  3 +--
 fs/lustre/lov/lov_io.c        |  6 ++----
 4 files changed, 13 insertions(+), 20 deletions(-)
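
The behaviour change described in the commit message can be sketched as
a small userspace model. This is only an illustration under stated
assumptions, not the Lustre code: every model_* name and MODEL_VM_WRITE
is hypothetical, and only the ft_writable/ft_mkwrite field names mirror
the patch. It compares the old rule (any fault on a VM_WRITE mapping
counts as a write) with the new rule (only ->page_mkwrite() does).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's VM_WRITE vma flag. */
#define MODEL_VM_WRITE 0x2UL

/* Simplified model of the fault fields touched by this patch. */
struct model_fault_io {
	bool ft_writable;	/* the fault is treated as a write */
	bool ft_mkwrite;	/* the fault came from ->page_mkwrite() */
};

/* Old rule: VM_WRITE on the mapping was enough to set ft_writable. */
static struct model_fault_io model_init_old(unsigned long vm_flags,
					    bool mkwrite)
{
	struct model_fault_io fio = { false, false };

	if (vm_flags & MODEL_VM_WRITE)
		fio.ft_writable = true;
	if (mkwrite) {
		fio.ft_mkwrite = true;
		fio.ft_writable = true;
	}
	return fio;
}

/* New rule: only the mkwrite path marks the fault as a write. */
static struct model_fault_io model_init_new(unsigned long vm_flags,
					    bool mkwrite)
{
	struct model_fault_io fio = { false, false };

	(void)vm_flags;	/* VM_WRITE alone no longer matters */
	if (mkwrite) {
		fio.ft_mkwrite = true;
		fio.ft_writable = true;
	}
	return fio;
}

/* Old check: mkwrite or "fault writable" triggered a layout write. */
static bool model_layout_write_old(struct model_fault_io fio)
{
	return fio.ft_mkwrite || fio.ft_writable;
}

/* New check: only mkwrite triggers a layout write. */
static bool model_layout_write_new(struct model_fault_io fio)
{
	return fio.ft_mkwrite;
}

/* Usage: a read fault on a writable mapping, then a real mmap write. */
void model_self_check(void)
{
	/* Old rule: the read fault already asks for a layout write. */
	assert(model_layout_write_old(model_init_old(MODEL_VM_WRITE, false)));
	/* New rule: the read fault does not. */
	assert(!model_layout_write_new(model_init_new(MODEL_VM_WRITE, false)));
	/* New rule: the actual write through mkwrite still does. */
	assert(model_layout_write_new(model_init_new(MODEL_VM_WRITE, true)));
}
```

In this model a read fault on a writable mapping no longer requests a
layout write, while the mkwrite path still does, which is the delay the
commit message asks for.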

diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h
index d068454..a65240b 100644
--- a/fs/lustre/include/cl_object.h
+++ b/fs/lustre/include/cl_object.h
@@ -2465,11 +2465,6 @@ int cl_io_lru_reserve(const struct lu_env *env, struct cl_io *io,
 int cl_io_read_ahead(const struct lu_env *env, struct cl_io *io,
 		     pgoff_t start, struct cl_read_ahead *ra);
 
-static inline int cl_io_is_fault_writable(const struct cl_io *io)
-{
-	return io->ci_type == CIT_FAULT && io->u.ci_fault.ft_writable;
-}
-
 /**
  * True, if @io is an O_APPEND write(2).
  */
diff --git a/fs/lustre/llite/llite_mmap.c b/fs/lustre/llite/llite_mmap.c
index 85a082c..8238a4e 100644
--- a/fs/lustre/llite/llite_mmap.c
+++ b/fs/lustre/llite/llite_mmap.c
@@ -83,11 +83,13 @@ struct vm_area_struct *our_vma(struct mm_struct *mm, unsigned long addr,
  * @vma		virtual memory area addressed to page fault
  * @env		corespondent lu_env to processing
  * @index	page index corespondent to fault.
+ * @mkwrite	whether it is mmap write.
  *
  * RETURN	error codes from cl_io_init.
  */
 static struct cl_io *
-ll_fault_io_init(struct lu_env *env, struct vm_area_struct *vma, pgoff_t index)
+ll_fault_io_init(struct lu_env *env, struct vm_area_struct *vma,
+		 pgoff_t index, bool mkwrite)
 {
 	struct file *file = vma->vm_file;
 	struct inode *inode = file_inode(file);
@@ -107,6 +109,11 @@ struct vm_area_struct *our_vma(struct mm_struct *mm, unsigned long addr,
 	fio->ft_index = index;
 	fio->ft_executable = vma->vm_flags & VM_EXEC;
 
+	if (mkwrite) {
+		fio->ft_mkwrite = 1;
+		fio->ft_writable = 1;
+	}
+
 	CDEBUG(D_MMAP,
 	       DFID": vma=%p start=%#lx end=%#lx vm_flags=%#lx idx=%lu\n",
 	       PFID(&ll_i2info(inode)->lli_fid), vma, vma->vm_start,
@@ -117,9 +124,6 @@ struct vm_area_struct *our_vma(struct mm_struct *mm, unsigned long addr,
 	else if (vma->vm_flags & VM_RAND_READ)
 		io->ci_rand_read = 1;
 
-	if (vma->vm_flags & VM_WRITE)
-		fio->ft_writable = 1;
-
 	rc = cl_io_init(env, io, CIT_FAULT, io->ci_obj);
 	if (rc == 0) {
 		struct vvp_io *vio = vvp_env_io(env);
@@ -157,7 +161,7 @@ static int __ll_page_mkwrite(struct vm_area_struct *vma, struct page *vmpage,
 	if (IS_ERR(env))
 		return PTR_ERR(env);
 
-	io = ll_fault_io_init(env, vma, vmpage->index);
+	io = ll_fault_io_init(env, vma, vmpage->index, true);
 	if (IS_ERR(io)) {
 		result = PTR_ERR(io);
 		goto out;
@@ -167,9 +171,6 @@ static int __ll_page_mkwrite(struct vm_area_struct *vma, struct page *vmpage,
 	if (result < 0)
 		goto out_io;
 
-	io->u.ci_fault.ft_mkwrite = 1;
-	io->u.ci_fault.ft_writable = 1;
-
 	vio = vvp_env_io(env);
 	vio->u.fault.ft_vma = vma;
 	vio->u.fault.ft_vmpage = vmpage;
@@ -309,7 +310,7 @@ static vm_fault_t __ll_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 		fault_ret = 0;
 	}
 
-	io = ll_fault_io_init(env, vma, vmf->pgoff);
+	io = ll_fault_io_init(env, vma, vmf->pgoff, false);
 	if (IS_ERR(io)) {
 		fault_ret = to_fault_error(PTR_ERR(io));
 		goto out;
diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c
index a117800..d8951ac 100644
--- a/fs/lustre/llite/vvp_io.c
+++ b/fs/lustre/llite/vvp_io.c
@@ -363,8 +363,7 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios)
 		io->ci_need_write_intent = 0;
 
 		LASSERT(io->ci_type == CIT_WRITE || cl_io_is_fallocate(io) ||
-			cl_io_is_trunc(io) || cl_io_is_mkwrite(io) ||
-			cl_io_is_fault_writable(io));
+			cl_io_is_trunc(io) || cl_io_is_mkwrite(io));
 
 		CDEBUG(D_VFSTRACE, DFID" write layout, type %u " DEXT "\n",
 		       PFID(lu_object_fid(&obj->co_lu)), io->ci_type,
diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c
index 2885943..eb71d7a 100644
--- a/fs/lustre/lov/lov_io.c
+++ b/fs/lustre/lov/lov_io.c
@@ -222,8 +222,7 @@ static int lov_io_mirror_write_intent(struct lov_io *lio,
 	io->ci_need_write_intent = 0;
 
 	if (!(io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io) ||
-	      cl_io_is_fallocate(io) || cl_io_is_trunc(io) ||
-	      cl_io_is_fault_writable(io)))
+	      cl_io_is_fallocate(io) || cl_io_is_trunc(io)))
 		return 0;
 
 	/* FLR: check if it needs to send a write intent RPC to server.
@@ -575,8 +574,7 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 	/* check if it needs to instantiate layout */
 	if (!(io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io) ||
 	      cl_io_is_fallocate(io) ||
-	      (cl_io_is_trunc(io) && io->u.ci_setattr.sa_attr.lvb_size > 0)) ||
-	      cl_io_is_fault_writable(io)) {
+	      (cl_io_is_trunc(io) && io->u.ci_setattr.sa_attr.lvb_size > 0))) {
 		result = 0;
 		goto out;
 	}
-- 
1.8.3.1
