From: James Simmons <jsimmons@infradead.org>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] [PATCH 24/42] lnet: Do not overwrite destination when routing
Date: Mon, 5 Oct 2020 20:06:03 -0400
Message-ID: <1601942781-24950-25-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1601942781-24950-1-git-send-email-jsimmons@infradead.org>

From: Chris Horn <chris.horn@hpe.com>

MR path selection in a routed environment is supposed to allow the
originator of a message to set the final destination NID. On a
multi-hop route, intermediate routers execute the same code path as
the message originator (i.e. the remote send cases). This causes them
to overwrite the destination NID when forwarding the message. Check
the msg_routing flag to determine whether we should set the final
destination NID (i.e. LNet peer NI).

A somewhat related issue is that because intermediate routers are not
selecting a destination lpni, they need to pick the next-hop lpni
based on the destination NID's remote net.

Fixes: 111c56a3c7e ("lnet: fix remote peer ni selection")
HPE-bug-id: LUS-8919
WC-bug-id: https://jira.whamcloud.com/browse/LU-13605
Lustre-commit: ec94d6f77b61fe ("LU-13605 lnet: Do not overwrite destination when routing")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/38731
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/lib-move.c | 102 +++++++++++++++++++++++++++--------------------
 1 file changed, 59 insertions(+), 43 deletions(-)

diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 7474d44..1c9fb41 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -1830,52 +1830,73 @@ struct lnet_ni *
 	}
 
 	if (!route_found) {
-		/* we've already looked up the initial lpni using dst_nid */
-		lpni = sd->sd_best_lpni;
-		/* the peer tree must be in existence */
-		LASSERT(lpni && lpni->lpni_peer_net &&
-			lpni->lpni_peer_net->lpn_peer);
-		lp = lpni->lpni_peer_net->lpn_peer;
-
-		list_for_each_entry(lpn, &lp->lp_peer_nets, lpn_peer_nets) {
-			/* is this remote network reachable? */
-			rnet = lnet_find_rnet_locked(lpn->lpn_net_id);
-			if (!rnet)
-				continue;
+		if (sd->sd_msg->msg_routing) {
+			/* If I'm routing this message then I need to find the
+			 * next hop based on the destination NID
+			 */
+			best_rnet = lnet_find_rnet_locked(LNET_NIDNET(sd->sd_dst_nid));
+			if (!best_rnet) {
+				CERROR("Unable to route message to %s - Route table may be misconfigured\n",
+				       libcfs_nid2str(sd->sd_dst_nid));
+				return -EHOSTUNREACH;
+			}
+		} else {
+			/* we've already looked up the initial lpni using
+			 * dst_nid
+			 */
+			lpni = sd->sd_best_lpni;
+			/* the peer tree must be in existence */
+			LASSERT(lpni && lpni->lpni_peer_net &&
+				lpni->lpni_peer_net->lpn_peer);
+			lp = lpni->lpni_peer_net->lpn_peer;
+
+			list_for_each_entry(lpn, &lp->lp_peer_nets,
+					    lpn_peer_nets) {
+				/* is this remote network reachable? */
+				rnet = lnet_find_rnet_locked(lpn->lpn_net_id);
+				if (!rnet)
+					continue;
+
+				if (!best_lpn) {
+					best_lpn = lpn;
+					best_rnet = rnet;
+				}
+
+				if (best_lpn->lpn_seq <= lpn->lpn_seq)
+					continue;
 
-			if (!best_lpn) {
 				best_lpn = lpn;
 				best_rnet = rnet;
 			}
 
-			if (best_lpn->lpn_seq <= lpn->lpn_seq)
-				continue;
+			if (!best_lpn) {
+				CERROR("peer %s has no available nets\n",
+				       libcfs_nid2str(sd->sd_dst_nid));
+				return -EHOSTUNREACH;
+			}
 
-			best_lpn = lpn;
-			best_rnet = rnet;
-		}
+			sd->sd_best_lpni = lnet_find_best_lpni(sd->sd_best_ni,
+							       sd->sd_dst_nid,
+							       lp,
+							       best_lpn->lpn_net_id);
+			if (!sd->sd_best_lpni) {
+				CERROR("peer %s is unreachable\n",
+				       libcfs_nid2str(sd->sd_dst_nid));
+				return -EHOSTUNREACH;
+			}
 
-		if (!best_lpn) {
-			CERROR("peer %s has no available nets\n",
-			       libcfs_nid2str(sd->sd_dst_nid));
-			return -EHOSTUNREACH;
-		}
+			/* We're attempting to round robin over the remote peer
+			 * NI's so update the final destination we selected
+			 */
+			sd->sd_final_dst_lpni = sd->sd_best_lpni;
 
-		sd->sd_best_lpni = lnet_find_best_lpni(sd->sd_best_ni,
-						       sd->sd_dst_nid,
-						       lp,
-						       best_lpn->lpn_net_id);
-		if (!sd->sd_best_lpni) {
-			CERROR("peer %s is unreachable\n",
-			       libcfs_nid2str(sd->sd_dst_nid));
-			return -EHOSTUNREACH;
+			/* Increment the sequence number of the remote lpni so
+			 * we can round robin over the different interfaces of
+			 * the remote lpni
+			 */
+			sd->sd_best_lpni->lpni_seq++;
 		}
 
-		/* We're attempting to round robin over the remote peer
-		 * NI's so update the final destination we selected
-		 */
-		sd->sd_final_dst_lpni = sd->sd_best_lpni;
-
 		/* find the best route. Restrict the selection on the net of the
 		 * local NI if we've already picked the local NI to send from.
 		 * Otherwise, let's pick any route we can find and then find
@@ -1903,12 +1924,6 @@ struct lnet_ni *
 		gw = best_route->lr_gateway;
 		LASSERT(gw == gwni->lpni_peer_net->lpn_peer);
 		local_lnet = best_route->lr_lnet;
-
-		/* Increment the sequence number of the remote lpni so we
-		 * can round robin over the different interfaces of the
-		 * remote lpni
-		 */
-		sd->sd_best_lpni->lpni_seq++;
 	}
 
 	/* Discover this gateway if it hasn't already been discovered.
@@ -1945,7 +1960,8 @@ struct lnet_ni *
 	if (sd->sd_rtr_nid == LNET_NID_ANY) {
 		LASSERT(best_route && last_route);
 		best_route->lr_seq = last_route->lr_seq + 1;
-		best_lpn->lpn_seq++;
+		if (best_lpn)
+			best_lpn->lpn_seq++;
 	}
 
 	return 0;
-- 
1.8.3.1
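
For readers without the LNet tree at hand, the branch this patch adds can be sketched as a small userspace model. This is an illustration under stated assumptions, not the LNet implementation: every name prefixed with toy_ (the structs, toy_find_rnet, toy_select_remote_net) is hypothetical, the routing table is reduced to a flat array, and the sequence number is bumped immediately rather than at the end of the send path as in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a routing table entry (a reachable remote net). */
struct toy_rnet {
	uint32_t net_id;
};

/* Hypothetical stand-in for one network of a remote peer (cf. lpn_seq). */
struct toy_peer_net {
	uint32_t net_id;
	unsigned int seq;	/* round-robin sequence number */
};

/* Hypothetical stand-in for the send data used during path selection. */
struct toy_send_data {
	bool routing;		/* true when forwarding for another node */
	uint32_t dst_net;	/* network of the destination NID */
	uint32_t final_dst_net;	/* final destination set by the originator */
};

/* Return the routing table entry for net_id, or NULL if there is no route. */
static const struct toy_rnet *toy_find_rnet(const struct toy_rnet *table,
					    size_t n_routes, uint32_t net_id)
{
	size_t i;

	for (i = 0; i < n_routes; i++)
		if (table[i].net_id == net_id)
			return &table[i];
	return NULL;
}

/* Pick the remote net to route towards; 0 on success, -EHOSTUNREACH if none. */
static int toy_select_remote_net(struct toy_send_data *sd,
				 const struct toy_rnet *table, size_t n_routes,
				 struct toy_peer_net *nets, size_t n_nets,
				 const struct toy_rnet **best_rnet)
{
	struct toy_peer_net *best_lpn = NULL;
	const struct toy_rnet *rnet;
	size_t i;

	assert(sd && best_rnet);

	if (sd->routing) {
		/* Router: the originator already chose the destination, so
		 * only look up a route to the destination's net and leave
		 * final_dst_net and the sequence numbers untouched.
		 */
		*best_rnet = toy_find_rnet(table, n_routes, sd->dst_net);
		return *best_rnet ? 0 : -EHOSTUNREACH;
	}

	/* Originator: choose the reachable peer net with the lowest
	 * sequence number (first one wins on a tie).
	 */
	for (i = 0; i < n_nets; i++) {
		rnet = toy_find_rnet(table, n_routes, nets[i].net_id);
		if (!rnet)
			continue;
		if (!best_lpn || nets[i].seq < best_lpn->seq) {
			best_lpn = &nets[i];
			*best_rnet = rnet;
		}
	}
	if (!best_lpn)
		return -EHOSTUNREACH;

	sd->final_dst_net = best_lpn->net_id;	/* originator sets final dst */
	best_lpn->seq++;			/* advance the round robin */
	return 0;
}
```

With routing set to false the call both selects a net and records it as the final destination; with routing set to true it only resolves a route, which is the behaviour the commit message asks of intermediate routers.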