linux-rdma.vger.kernel.org archive mirror
* Help understand use of MAC address resolution in RDMA
@ 2021-08-05 14:38 Olga Kornievskaia
  2021-08-06  1:44 ` Jason Gunthorpe
  0 siblings, 1 reply; 3+ messages in thread
From: Olga Kornievskaia @ 2021-08-05 14:38 UTC (permalink / raw)
  To: linux-rdma

Hi folks,

Can somebody help me understand how RoCE (this is probably RDMA core
and not specific to RoCE but I'm not sure) manages destination MAC
addresses for its connection?

Specifically, the problem being observed is this: a server initiates an
RDMA CM disconnect (the client replies), and the client tries to
reconnect. The server sends an ARP advertising a different MAC for the
IP that the RDMA connection was using. The RDMA code keeps sending the
RDMA CM connect message to the old MAC for a certain period of time
(90-100 sec), then it finally sends it to the new MAC address.

Question: how does the core RDMA layer manage the MAC address for the
connection? Why does it seem to ignore the ARP updates?

Thank you.


* Re: Help understand use of MAC address resolution in RDMA
  2021-08-05 14:38 Help understand use of MAC address resolution in RDMA Olga Kornievskaia
@ 2021-08-06  1:44 ` Jason Gunthorpe
  2021-08-06  2:05   ` Olga Kornievskaia
  0 siblings, 1 reply; 3+ messages in thread
From: Jason Gunthorpe @ 2021-08-06  1:44 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-rdma

On Thu, Aug 05, 2021 at 10:38:42AM -0400, Olga Kornievskaia wrote:
> Hi folks,
> 
> Can somebody help me understand how RoCE (this is probably RDMA core
> and not specific to RoCE but I'm not sure) manages destination MAC
> addresses for its connection?
> 
> Specifically the problem being observed is a server initiates an RDMA
> CM disconnect (client replies), client tries to reconnect. Server
> sends an ARP advertising a different MAC for the IP that the RDMA
> connection was using. RDMA code keeps sending the RDMA CM connect
> message to the old MAC for a certain period of time (90-100sec) then
> it finally sends it to the new MAC address.
> 
> Question: how does the core RDMA layer manage the MAC address for the
> connection. Why does it seem like it ignores the ARP updates?

RDMA objects acquire a MAC address when they are created and do not
synchronize with the neighbor cache afterwards.

What you are seeing is that the CM_ID object holds the bad MAC until
it is destroyed; most likely a new CM_ID object then gets created that
holds the updated MAC.
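
The behaviour described above can be pictured with a toy model. This is only an illustrative sketch, not kernel or librdmacm code: the `NeighborCache` and `CmId` classes, their methods, and the IP/MAC values are all hypothetical names made up for the example. It assumes nothing beyond what is stated above, namely that the MAC is captured once at object creation and never refreshed.

```python
class NeighborCache:
    """Hypothetical stand-in for the neighbor (ARP) table: IP -> MAC."""

    def __init__(self):
        self._table = {}

    def update(self, ip, mac):
        # Models an ARP update arriving for `ip`.
        self._table[ip] = mac

    def lookup(self, ip):
        return self._table[ip]


class CmId:
    """Hypothetical stand-in for a CM_ID: the MAC is resolved once, at creation."""

    def __init__(self, cache, dst_ip):
        self.dst_ip = dst_ip
        # Snapshot taken here; it is never re-read for the life of the object.
        self.dst_mac = cache.lookup(dst_ip)

    def connect_target(self):
        # Every connect attempt from this object uses the snapshot.
        return self.dst_mac


cache = NeighborCache()
cache.update("192.0.2.10", "aa:aa:aa:aa:aa:01")

old_id = CmId(cache, "192.0.2.10")

# The server advertises a different MAC for the same IP.
cache.update("192.0.2.10", "aa:aa:aa:aa:aa:02")

stale = old_id.connect_target()   # still the MAC captured at creation
new_id = CmId(cache, "192.0.2.10")
fresh = new_id.connect_target()   # a newly created object sees the update

print(stale, fresh)
# prints: aa:aa:aa:aa:aa:01 aa:aa:aa:aa:aa:02
```

In this model the only way to pick up the new MAC is to destroy the old object and create a new one, which matches a successful connect carrying a different communication ID.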

Jason


* Re: Help understand use of MAC address resolution in RDMA
  2021-08-06  1:44 ` Jason Gunthorpe
@ 2021-08-06  2:05   ` Olga Kornievskaia
  0 siblings, 0 replies; 3+ messages in thread
From: Olga Kornievskaia @ 2021-08-06  2:05 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma

On Thu, Aug 5, 2021 at 9:44 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Thu, Aug 05, 2021 at 10:38:42AM -0400, Olga Kornievskaia wrote:
> > Hi folks,
> >
> > Can somebody help me understand how RoCE (this is probably RDMA core
> > and not specific to RoCE but I'm not sure) manages destination MAC
> > addresses for its connection?
> >
> > Specifically the problem being observed is a server initiates an RDMA
> > CM disconnect (client replies), client tries to reconnect. Server
> > sends an ARP advertising a different MAC for the IP that the RDMA
> > connection was using. RDMA code keeps sending the RDMA CM connect
> > message to the old MAC for a certain period of time (90-100sec) then
> > it finally sends it to the new MAC address.
> >
> > Question: how does the core RDMA layer manage the MAC address for the
> > connection. Why does it seem like it ignores the ARP updates?
>
> RDMA objects acquire a MAC address when they are created and do not
> synchronize with the neighbor cache after.
>
> What you are seeing is that the CM_ID object holds the bad mac until
> it is destroyed and likely a new CM_ID object gets created that holds
> the updated MAC

First of all, thank you for the feedback. A few more questions on
that. It sounds like you agree with me that the ARP update is ignored.

Question: do you think that is acceptable/expected behaviour, or is it
a bug that needs to be fixed? Indeed, the successful CM connect request
has a different communication ID in the network trace.

Question: is the period for which the CMA keeps retrying before giving
up configurable (by the caller of the connection, or system-wide)?
Would tuning that value down, so that it is more sensitive to ARP
updates, be the path forward?

Thank you.

>
> Jason

