lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] [PATCH 08/37] lnet: socklnd: fix local interface binding
Date: Wed, 15 Jul 2020 16:44:49 -0400	[thread overview]
Message-ID: <1594845918-29027-9-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org>

From: Amir Shehata <ashehata@whamcloud.com>

When a node is configured with multiple interfaces in
Multi-Rail config, socklnd was not utilizing the local interface
requested by LNet. In essence LNet was using all the NIDs in round
robin, however the socklnd module was not binding to the correct
interface. Traffic was thus sent on a subset of the interfaces.

The reason is that the route interface number was not being set.
In most cases lnet_connect() is called to create a socket. The
socket is bound to the interface provided and then
ksocknal_create_conn() is called to create the socklnd connection.
ksocknal_create_conn() calls ksocknal_associate_route_conn_locked()
at which point the route's local interface is assigned. However,
this is already too late as the socket has already been created
and bound to a local interface.

Therefore, it's important to assign the route's interface before
calling lnet_connect() to ensure socket is bound to correct local
interface.

To address this issue, the route's interface index is initialized
to the NI's interface index when it's added to the peer_ni.

Another bug fixed:
The interface index was not being initialized in the startup
routine.

Note: We're strictly assuming that there is one interface for each
NI. This is because tcp bonding will be removed from the socklnd as
it has been deprecated by LNet mutli-rail.

WC-bug-id: https://jira.whamcloud.com/browse/LU-13566
Lustre-commit: a7c9aba5eb96d ("LU-13566 socklnd: fix local interface binding")
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38743
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 444b90b..2b8fd3d 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -409,12 +409,14 @@ struct ksock_peer_ni *
 {
 	struct ksock_conn *conn;
 	struct ksock_route *route2;
+	struct ksock_net *net = peer_ni->ksnp_ni->ni_data;
 
 	LASSERT(!peer_ni->ksnp_closing);
 	LASSERT(!route->ksnr_peer);
 	LASSERT(!route->ksnr_scheduled);
 	LASSERT(!route->ksnr_connecting);
 	LASSERT(!route->ksnr_connected);
+	LASSERT(net->ksnn_ninterfaces > 0);
 
 	/* LASSERT(unique) */
 	list_for_each_entry(route2, &peer_ni->ksnp_routes, ksnr_list) {
@@ -428,6 +430,11 @@ struct ksock_peer_ni *
 
 	route->ksnr_peer = peer_ni;
 	ksocknal_peer_addref(peer_ni);
+
+	/* set the route's interface to the current net's interface */
+	route->ksnr_myiface = net->ksnn_interfaces[0].ksni_index;
+	net->ksnn_interfaces[0].ksni_nroutes++;
+
 	/* peer_ni's routelist takes over my ref on 'route' */
 	list_add_tail(&route->ksnr_list, &peer_ni->ksnp_routes);
 
@@ -2667,6 +2674,7 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 		net->ksnn_ninterfaces = 1;
 		ni->ni_dev_cpt = ifaces[0].li_cpt;
 		ksi->ksni_ipaddr = ifaces[0].li_ipaddr;
+		ksi->ksni_index = ksocknal_ip2index(ksi->ksni_ipaddr, ni);
 		ksi->ksni_netmask = ifaces[0].li_netmask;
 		strlcpy(ksi->ksni_name, ifaces[0].li_name,
 			sizeof(ksi->ksni_name));
@@ -2706,6 +2714,8 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 				ksi = &net->ksnn_interfaces[j];
 				ni->ni_dev_cpt = ifaces[j].li_cpt;
 				ksi->ksni_ipaddr = ifaces[j].li_ipaddr;
+				ksi->ksni_index =
+					ksocknal_ip2index(ksi->ksni_ipaddr, ni);
 				ksi->ksni_netmask = ifaces[j].li_netmask;
 				strlcpy(ksi->ksni_name, ifaces[j].li_name,
 					sizeof(ksi->ksni_name));
-- 
1.8.3.1

  parent reply	other threads:[~2020-07-15 20:44 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-15 20:44 [lustre-devel] [PATCH 00/37] lustre: latest patches landed to OpenSFS 07/14/2020 James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 01/37] lustre: osc: fix osc_extent_find() James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 02/37] lustre: ldlm: check slv and limit before updating James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 03/37] lustre: sec: better struct sepol_downcall_data James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 04/37] lustre: obdclass: remove init to 0 from lustre_init_lsi() James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 05/37] lustre: ptlrpc: handle conn_hash rhashtable resize James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 06/37] lustre: lu_object: convert lu_object cache to rhashtable James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 07/37] lustre: osc: disable ext merging for rdma only pages and non-rdma James Simmons
2020-07-15 20:44 ` James Simmons [this message]
2020-07-15 20:44 ` [lustre-devel] [PATCH 09/37] lnet: o2iblnd: allocate init_qp_attr on stack James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 10/37] lnet: Fix some out-of-date comments James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 11/37] lnet: socklnd: don't fall-back to tcp_sendpage James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 12/37] lustre: ptlrpc: re-enterable signal_completed_replay() James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 13/37] lustre: obdcalss: ensure LCT_QUIESCENT take sync James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 14/37] lustre: remove some "#ifdef CONFIG*" from .c files James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 15/37] lustre: obdclass: use offset instead of cp_linkage James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 16/37] lustre: obdclass: re-declare cl_page variables to reduce its size James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 17/37] lustre: osc: re-declare ops_from/to to shrink osc_page James Simmons
2020-07-15 20:44 ` [lustre-devel] [PATCH 18/37] lustre: llite: Fix lock ordering in pagevec_dirty James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 19/37] lustre: misc: quiet compiler warning on armv7l James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 20/37] lustre: llite: fix to free cl_dio_aio properly James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 21/37] lnet: o2iblnd: Use ib_mtu_int_to_enum() James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 22/37] lnet: o2iblnd: wait properly for fps->increasing James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 23/37] lnet: o2iblnd: use need_resched() James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 24/37] lnet: o2iblnd: Use list_for_each_entry_safe James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 25/37] lnet: socklnd: use need_resched() James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 26/37] lnet: socklnd: use list_for_each_entry_safe() James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 27/37] lnet: socklnd: convert various refcounts to refcount_t James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 28/37] lnet: libcfs: don't call unshare_fs_struct() James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 29/37] lnet: Allow router to forward to healthier NID James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 30/37] lustre: llite: annotate non-owner locking James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 31/37] lustre: osc: consume grants for direct I/O James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 32/37] lnet: remove LNetMEUnlink and clean up related code James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 33/37] lnet: Set remote NI status in lnet_notify James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 34/37] lustre: ptlrpc: fix endless loop issue James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 35/37] lustre: llite: fix short io for AIO James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 36/37] lnet: socklnd: change ksnd_nthreads to atomic_t James Simmons
2020-07-15 20:45 ` [lustre-devel] [PATCH 37/37] lnet: check rtr_nid is a gateway James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1594845918-29027-9-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=lustre-devel@lists.lustre.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).