* [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"
@ 2022-06-16 15:57 Gerd Rausch
2022-06-24 23:11 ` Jason Gunthorpe
0 siblings, 1 reply; 7+ messages in thread
From: Gerd Rausch @ 2022-06-16 15:57 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky; +Cc: linux-rdma
Unlike with IPv[46], where "ip_finish_output2" triggers
a refresh of STALE neighbour entries via "neigh_output",
"rdma_resolve_addr" never triggers an update.
If a wrong STALE entry ever enters the cache, it'll remain
wrong forever (unless refreshed via TCP/IP, or otherwise).
Let the cache inconsistency resolve itself by triggering
an update from "rdma_resolve_addr".
Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com>
---
v2: Add a "(__force u32)" cast for "__ipv4_neigh_lookup_noref"
drivers/infiniband/core/addr.c | 40 ++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index f253295795f0..704dc9cc130e 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -394,6 +394,8 @@ static int addr4_resolve(struct sockaddr *src_sock,
__be32 dst_ip = dst_in->sin_addr.s_addr;
struct rtable *rt;
struct flowi4 fl4;
+ struct net_device *dev;
+ struct neighbour *neigh;
int ret;
memset(&fl4, 0, sizeof(fl4));
@@ -409,6 +411,24 @@ static int addr4_resolve(struct sockaddr *src_sock,
addr->hoplimit = ip4_dst_hoplimit(&rt->dst);
+ /* trigger ARP-entry refresh if necessary,
+ * the same way "ip_finish_output2" does
+ */
+ if (addr->bound_dev_if) {
+ dev = dev_get_by_index(addr->net, addr->bound_dev_if);
+ } else {
+ dev = rt->dst.dev;
+ dev_hold(dev);
+ }
+ if (dev) {
+ rcu_read_lock_bh();
+ neigh = __ipv4_neigh_lookup_noref(dev, (__force u32)dst_ip);
+ if (neigh)
+ neigh_event_send(neigh, NULL);
+ rcu_read_unlock_bh();
+ dev_put(dev);
+ }
+
*prt = rt;
return 0;
}
@@ -424,6 +444,8 @@ static int addr6_resolve(struct sockaddr *src_sock,
(const struct sockaddr_in6 *)dst_sock;
struct flowi6 fl6;
struct dst_entry *dst;
+ struct net_device *dev;
+ struct neighbour *neigh;
memset(&fl6, 0, sizeof fl6);
fl6.daddr = dst_in->sin6_addr;
@@ -439,6 +461,24 @@ static int addr6_resolve(struct sockaddr *src_sock,
addr->hoplimit = ip6_dst_hoplimit(dst);
+ /* trigger neighbour-entry refresh if necessary,
+ * the same way "ip6_finish_output2" does
+ */
+ if (addr->bound_dev_if) {
+ dev = dev_get_by_index(addr->net, addr->bound_dev_if);
+ } else {
+ dev = dst->dev;
+ dev_hold(dev);
+ }
+ if (dev) {
+ rcu_read_lock_bh();
+ neigh = __ipv6_neigh_lookup_noref(dst->dev, &dst_in->sin6_addr);
+ if (neigh)
+ neigh_event_send(neigh, NULL);
+ rcu_read_unlock_bh();
+ dev_put(dev);
+ }
+
*pdst = dst;
return 0;
}
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"
2022-06-16 15:57 [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr" Gerd Rausch
@ 2022-06-24 23:11 ` Jason Gunthorpe
2022-06-25 0:03 ` Gerd Rausch
0 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2022-06-24 23:11 UTC (permalink / raw)
To: Gerd Rausch; +Cc: Leon Romanovsky, linux-rdma
On Thu, Jun 16, 2022 at 08:57:14AM -0700, Gerd Rausch wrote:
> Unlike with IPv[46], where "ip_finish_output2" triggers
> a refresh of STALE neighbour entries via "neigh_output",
> "rdma_resolve_addr" never triggers an update.
Why? We alread call neigh_event_send() right after this in
addr_resolve_neigh(), and this seems to ignore the input resolve_neigh
to addr_resolve() ?
I think there is more going on here than this commit message is
explaining.
Jason
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"
2022-06-24 23:11 ` Jason Gunthorpe
@ 2022-06-25 0:03 ` Gerd Rausch
2022-06-25 0:10 ` Jason Gunthorpe
0 siblings, 1 reply; 7+ messages in thread
From: Gerd Rausch @ 2022-06-25 0:03 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma
Hi Jason,
On Fri, 2022-06-24 at 20:11 -0300, Jason Gunthorpe wrote:
> On Thu, Jun 16, 2022 at 08:57:14AM -0700, Gerd Rausch wrote:
> > Unlike with IPv[46], where "ip_finish_output2" triggers
> > a refresh of STALE neighbour entries via "neigh_output",
> > "rdma_resolve_addr" never triggers an update.
>
> Why? We alread call neigh_event_send() right after this in
> addr_resolve_neigh(), and this seems to ignore the input resolve_neigh
> to addr_resolve() ?
>
This in "dst_fetch_ha"?:
--------%<--------%<--------%<--------%<--------%<--------
if (!(n->nud_state & NUD_VALID)) {
neigh_event_send(n, NULL);
ret = -ENODATA;
--------%<--------%<--------%<--------%<--------%<--------
With:
#define NUD_VALID (NUD_PERMANENT|NUD_NOARP|NUD_REACHABLE|NUD_PROBE
|NUD_STALE|NUD_DELAY)
and the knowledge that the ARP entry is NUD_STALE,
how would that function be called and perform the necessary refresh?
> I think there is more going on here than this commit message is
> explaining.
>
If the intention of the above mentioned "neigh_event_send" was to
refresh stale entries, then the "if" condition ought to change, no?
Thanks,
Gerd
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"
2022-06-25 0:03 ` Gerd Rausch
@ 2022-06-25 0:10 ` Jason Gunthorpe
2022-06-25 0:55 ` Gerd Rausch
0 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2022-06-25 0:10 UTC (permalink / raw)
To: Gerd Rausch; +Cc: Leon Romanovsky, linux-rdma
On Fri, Jun 24, 2022 at 05:03:23PM -0700, Gerd Rausch wrote:
> Hi Jason,
>
> On Fri, 2022-06-24 at 20:11 -0300, Jason Gunthorpe wrote:
> > On Thu, Jun 16, 2022 at 08:57:14AM -0700, Gerd Rausch wrote:
> > > Unlike with IPv[46], where "ip_finish_output2" triggers
> > > a refresh of STALE neighbour entries via "neigh_output",
> > > "rdma_resolve_addr" never triggers an update.
> >
> > Why? We alread call neigh_event_send() right after this in
> > addr_resolve_neigh(), and this seems to ignore the input resolve_neigh
> > to addr_resolve() ?
> >
>
> This in "dst_fetch_ha"?:
> --------%<--------%<--------%<--------%<--------%<--------
> if (!(n->nud_state & NUD_VALID)) {
> neigh_event_send(n, NULL);
> ret = -ENODATA;
> --------%<--------%<--------%<--------%<--------%<--------
>
> With:
> #define NUD_VALID (NUD_PERMANENT|NUD_NOARP|NUD_REACHABLE|NUD_PROBE
> |NUD_STALE|NUD_DELAY)
>
> and the knowledge that the ARP entry is NUD_STALE,
> how would that function be called and perform the necessary refresh?
>
>
> > I think there is more going on here than this commit message is
> > explaining.
> >
>
> If the intention of the above mentioned "neigh_event_send" was to
> refresh stale entries, then the "if" condition ought to change, no?
Maybe yes.
If you are saying with this patch that neigh_event_send() should be
called unconditionally for every 'packet' why not remove the test
above instead of calling it twice?
On the other hand I see many places checking for NUD_VALID before
calling it.
So, really the commit message needs to explain how neigh_event_send()
is supposed to be used and explain that NUD_VALID is OK to drop.
I'm having a hard time guessing from a quick look around.
Jason
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"
2022-06-25 0:10 ` Jason Gunthorpe
@ 2022-06-25 0:55 ` Gerd Rausch
2022-07-28 17:55 ` Gerd Rausch
0 siblings, 1 reply; 7+ messages in thread
From: Gerd Rausch @ 2022-06-25 0:55 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma
Hi Jason,
On Fri, 2022-06-24 at 21:10 -0300, Jason Gunthorpe wrote:
> On Fri, Jun 24, 2022 at 05:03:23PM -0700, Gerd Rausch wrote:
[...]
> > This in "dst_fetch_ha"?:
> > --------%<--------%<--------%<--------%<--------%<--------
> > if (!(n->nud_state & NUD_VALID)) {
> > neigh_event_send(n, NULL);
> > ret = -ENODATA;
> > --------%<--------%<--------%<--------%<--------%<--------
> >
> > With:
> > #define NUD_VALID (NUD_PERMANENT|NUD_NOARP|NUD_REACHABLE|NUD_PROBE
> > > NUD_STALE|NUD_DELAY)
> >
> > and the knowledge that the ARP entry is NUD_STALE,
> > how would that function be called and perform the necessary refresh?
> >
> >
> > > I think there is more going on here than this commit message is
> > > explaining.
> > >
> >
> > If the intention of the above mentioned "neigh_event_send" was to
> > refresh stale entries, then the "if" condition ought to change, no?
>
> Maybe yes.
>
> If you are saying with this patch that neigh_event_send() should be
> called unconditionally for every 'packet' why not remove the test
> above instead of calling it twice?
>
It looks like starting from the very first time when "neigh_event_send"
was introduced here:
935ef2d7a291 RDMA/cma: Use neigh_event_send() to start neighbour discovery
it was always strictly coupled to the "-ENODATA" case.
In other words, I don't see how the STALE case was covered, and I'd have
to guess wether the -ENODATA coupling was intentional or accidental.
Now it may be perfectly possible to just make the "neigh_event_send"
call above unconditional and call it a day.
I mean, simpler fixes always win over more complex ones.
But, I personally don't know the twists & turns of this code well enough
to be able to assert that this won't leave any non-covered conditional
corner cases. I certainly hadn't tested that.
> On the other hand I see many places checking for NUD_VALID before
> calling it.
>
> So, really the commit message needs to explain how neigh_event_send()
> is supposed to be used and explain that NUD_VALID is OK to drop.
>
Some places do check for NUD_VALID (e.g. "__ip_do_redirect"),
many others don't (e.g. "__teql_resolve", or "neigh_resolve_output"
itself).
Even in the case of "dst_fetch_ha":
I can not state whether the check for !NUD_VALUD was done intentionally
or not. I can observe though that the side-effect of that check is that
STALE entries won't get refreshed. So something is clearly off here.
Let's look "__ip_do_redirect":
I would have to guess why that check for !NUD_VALID is there.
It's been there as far back as the GIT history goes.
But I prefer not having to put guess-work into my Commit-Log.
That said, I don't mind putting guess-work in this e-mail though:
Perhaps the author(s) knew that the code would have to go through
"ip_finish_output2" (or whatever the predecessor's name was) eventually,
and therefore it was okay to kick off a refresh only if !NUD_VALID.
At the end of the day, if the desired behavior is to always refresh
STALE entries, we should do so.
Calling "neigh_event_send" for !NUD_VALID only won't achieve that.
> I'm having a hard time guessing from a quick look around.
>
Thanks for looking though,
Gerd
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"
2022-06-25 0:55 ` Gerd Rausch
@ 2022-07-28 17:55 ` Gerd Rausch
2022-07-28 19:24 ` Jason Gunthorpe
0 siblings, 1 reply; 7+ messages in thread
From: Gerd Rausch @ 2022-07-28 17:55 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma
Hi Jason,
On Fri, 2022-06-24 at 17:55 -0700, Gerd Rausch wrote:
> In other words, I don't see how the STALE case was covered, and I'd have
> to guess wether the -ENODATA coupling was intentional or accidental.
>
> Now it may be perfectly possible to just make the "neigh_event_send"
> call above unconditional and call it a day.
>
> I mean, simpler fixes always win over more complex ones.
>
> But, I personally don't know the twists & turns of this code well enough
> to be able to assert that this won't leave any non-covered conditional
> corner cases. I certainly hadn't tested that.
>
I finally got around to trying the much simpler fix, and it appears to
work just as well:
--------%<--------%<--------%<--------%<--------%<--------%<--------
--- drivers/infiniband/core/addr.c-
+++ drivers/infiniband/core/addr.c
@@ -336,11 +336,11 @@ static int dst_fetch_ha(const struct dst
return -ENODATA;
if (!(n->nud_state & NUD_VALID)) {
- neigh_event_send(n, NULL);
ret = -ENODATA;
} else {
neigh_ha_snapshot(dev_addr->dst_dev_addr, n, dst->dev);
}
+ neigh_event_send(n, NULL);
neigh_release(n);
--------%<--------%<--------%<--------%<--------%<--------%<--------
Tested with IPv4 only, but this should work just as well with IPv6:
STALE neighbor entries transition to DELAY -> REACHABLE with this
change, i.e. the entries get updated.
Thanks,
Gerd
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"
2022-07-28 17:55 ` Gerd Rausch
@ 2022-07-28 19:24 ` Jason Gunthorpe
0 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2022-07-28 19:24 UTC (permalink / raw)
To: Gerd Rausch; +Cc: Leon Romanovsky, linux-rdma
On Thu, Jul 28, 2022 at 10:55:39AM -0700, Gerd Rausch wrote:
> Hi Jason,
>
> On Fri, 2022-06-24 at 17:55 -0700, Gerd Rausch wrote:
> > In other words, I don't see how the STALE case was covered, and I'd have
> > to guess wether the -ENODATA coupling was intentional or accidental.
> >
> > Now it may be perfectly possible to just make the "neigh_event_send"
> > call above unconditional and call it a day.
> >
> > I mean, simpler fixes always win over more complex ones.
> >
> > But, I personally don't know the twists & turns of this code well enough
> > to be able to assert that this won't leave any non-covered conditional
> > corner cases. I certainly hadn't tested that.
> >
>
> I finally got around to trying the much simpler fix, and it appears to
> work just as well:
>
> --------%<--------%<--------%<--------%<--------%<--------%<--------
> --- drivers/infiniband/core/addr.c-
> +++ drivers/infiniband/core/addr.c
> @@ -336,11 +336,11 @@ static int dst_fetch_ha(const struct dst
> return -ENODATA;
>
> if (!(n->nud_state & NUD_VALID)) {
> - neigh_event_send(n, NULL);
> ret = -ENODATA;
> } else {
> neigh_ha_snapshot(dev_addr->dst_dev_addr, n, dst->dev);
> }
> + neigh_event_send(n, NULL);
>
> neigh_release(n);
>
> --------%<--------%<--------%<--------%<--------%<--------%<--------
>
> Tested with IPv4 only, but this should work just as well with IPv6:
>
> STALE neighbor entries transition to DELAY -> REACHABLE with this
> change, i.e. the entries get updated.
It seems OK, but I would still like to see an attempt to explain how
nud_state and neigh_event_send is supposed to be used.. Maybe futile,
but lets try at least.
Jason
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-07-28 19:24 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-16 15:57 [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr" Gerd Rausch
2022-06-24 23:11 ` Jason Gunthorpe
2022-06-25 0:03 ` Gerd Rausch
2022-06-25 0:10 ` Jason Gunthorpe
2022-06-25 0:55 ` Gerd Rausch
2022-07-28 17:55 ` Gerd Rausch
2022-07-28 19:24 ` Jason Gunthorpe
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.