* RDMA device renames and node description
@ 2020-02-14 18:13 Dennis Dalessandro
  2020-02-18 14:04 ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Dennis Dalessandro @ 2020-02-14 18:13 UTC
  To: linux-rdma; +Cc: Jason Gunthorpe, Leon Romanovsky, Honggang LI, Gal Pressman

Was there any discussion on the upgrade scenario for existing 
deployments as far as device-rename changing node descriptions?

If someone is running an older version of rdma-core they are going to 
have a certain set of node descriptions for each node. This could be in 
logs, or configuration databases, who knows what. Now if they upgrade to 
a new version of rdma-core their node descriptions all automatically 
change out from under them by default.

Of course the admin could disable the rename prior to upgrade and as 
Leon pointed out previously the upgrade won't remove the disablement 
file. The problem is they would have to know to do that ahead of time.
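
For illustration, an admin could check ahead of an upgrade whether such a disablement is already in place. The following is a minimal Python sketch, not rdma-core code. It assumes the rename is driven by a udev rules file named 60-rdma-persistent-naming.rules and that, per the usual udev override convention, a same-named file under /etc/udev/rules.d (typically a symlink to /dev/null) masks the packaged one. The file name and the helper name are assumptions and are not stated in this thread.

```python
import os

# Assumed name of the rules file that triggers the rename; check your
# rdma-core package for the actual file name.
RULES_FILE = "60-rdma-persistent-naming.rules"


def rename_is_masked(etc_rules_dir="/etc/udev/rules.d", rules_file=RULES_FILE):
    """Return True if an admin override masks the packaged rename rule.

    udev prefers a file in /etc/udev/rules.d over a same-named packaged
    file; a symlink to /dev/null (or an empty file) leaves no rules active.
    """
    override = os.path.join(etc_rules_dir, rules_file)
    if not os.path.lexists(override):
        return False  # no override: the packaged default applies
    if os.path.islink(override):
        # Only a link that resolves to /dev/null counts as "disabled".
        return os.path.realpath(override) == os.path.realpath(os.devnull)
    # A regular file: treat an empty one as "disabled".
    return os.path.getsize(override) == 0
```

Running such a check across a cluster before the upgrade would at least show which nodes are about to pick up the new default.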

-Denny

* Re: RDMA device renames and node description
  2020-02-14 18:13 RDMA device renames and node description Dennis Dalessandro
@ 2020-02-18 14:04 ` Leon Romanovsky
  2020-02-18 17:11   ` Dennis Dalessandro
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2020-02-18 14:04 UTC
  To: Dennis Dalessandro; +Cc: linux-rdma, Jason Gunthorpe, Honggang LI, Gal Pressman

On Fri, Feb 14, 2020 at 01:13:53PM -0500, Dennis Dalessandro wrote:
> Was there any discussion on the upgrade scenario for existing deployments as
> far as device-rename changing node descriptions?
>
> If someone is running an older version of rdma-core they are going to have a
> certain set of node descriptions for each node. This could be in logs, or
> configuration databases, who knows what. Now if they upgrade to a new
> version of rdma-core their node descriptions all automatically change out
> from under them by default.
>
> Of course the admin could disable the rename prior to upgrade and as Leon
> pointed out previously the upgrade won't remove the disablement file. The
> problem is they would have to know to do that ahead of time.

Dennis,

It was discussed, and the conclusion was that most if not all users
follow one of two upgrade strategies.

The first option is to rely on the distro, and every distro behaves
differently in such cases: some of them won't change anything till their
last user dies :) while others, more dynamic and with more up-to-date
packages, have already adopted our default.

The second option is to use one of the numerous OFED stacks, which are
expected to provide a full upgrade of all components that works smoothly.

Users who upgrade their systems from the live upstream repo are expected
to be proficient enough to deal with a change of defaults.

Thanks

>
> -Denny

* Re: RDMA device renames and node description
  2020-02-18 14:04 ` Leon Romanovsky
@ 2020-02-18 17:11   ` Dennis Dalessandro
  2020-02-18 20:08     ` Jason Gunthorpe
  2020-02-19  7:11     ` Leon Romanovsky
  0 siblings, 2 replies; 15+ messages in thread
From: Dennis Dalessandro @ 2020-02-18 17:11 UTC
  To: Leon Romanovsky; +Cc: linux-rdma, Jason Gunthorpe, Honggang LI, Gal Pressman

On 2/18/2020 9:04 AM, Leon Romanovsky wrote:
> On Fri, Feb 14, 2020 at 01:13:53PM -0500, Dennis Dalessandro wrote:
>> Was there any discussion on the upgrade scenario for existing deployments as
>> far as device-rename changing node descriptions?
>>
>> If someone is running an older version of rdma-core they are going to have a
>> certain set of node descriptions for each node. This could be in logs, or
>> configuration databases, who knows what. Now if they upgrade to a new
>> version of rdma-core their node descriptions all automatically change out
>> from under them by default.
>>
>> Of course the admin could disable the rename prior to upgrade and as Leon
>> pointed out previously the upgrade won't remove the disablement file. The
>> problem is they would have to know to do that ahead of time.
> 
> Dennis,
> 
> It was discussed and the conclusion was that most if not all users are
> using one of two upgrade and strategy.

Do you have a pointer to a thread I can read? I apparently missed it.

> First option is to rely on distro and every distro behaves differently
> in such cases, some of them won't change anything till their last user
> dies :) and others more dynamic with more up-to-date packages already
> adopted our default.

This is the issue I see. The problem is when the distro doesn't know any 
better and pulls in a new rdma-core and breaks things unintentionally. 
Up to date is good, but up to date that brings with it what is 
essentially an ABI breakage is not.

> Second option is to use numerous OFED stacks, which are expected to
> provide full upgrade to all components which will work smoothly.

Yeah I'm sure OFED will handle things for themselves.

> 
> Users who upgrade their system from live upstream repo are expected to
> be proficient enough to be deal with change of defaults.

Yeah this I completely agree with.

-Denny

* Re: RDMA device renames and node description
  2020-02-18 17:11   ` Dennis Dalessandro
@ 2020-02-18 20:08     ` Jason Gunthorpe
  2020-02-19  7:11     ` Leon Romanovsky
  1 sibling, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2020-02-18 20:08 UTC
  To: Dennis Dalessandro; +Cc: Leon Romanovsky, linux-rdma, Honggang LI, Gal Pressman

On Tue, Feb 18, 2020 at 12:11:47PM -0500, Dennis Dalessandro wrote:

> > First option is to rely on distro and every distro behaves differently
> > in such cases, some of them won't change anything till their last user
> > dies :) and others more dynamic with more up-to-date packages already
> > adopted our default.
> 
> This is the issue I see. The problem is when the distro doesn't know any
> better and pulls in a new rdma-core and breaks things unintentionally. Up to
> date is good, but up to date that brings with it what is essentially an ABI
> breakage is not.

The point of having the distros update is to get the breakage
fixed. Fedora, Ubuntu, etc should all track upstream and resolve
breakage through their bug process.

Remember, at the time this was set out nobody came forward to say that
there was distro-included userspace that (wrongly) hard coded
names. Now it is a bit late to backtrack - we need to move forward
with fixed userspace.

Jason

* Re: RDMA device renames and node description
  2020-02-18 17:11   ` Dennis Dalessandro
  2020-02-18 20:08     ` Jason Gunthorpe
@ 2020-02-19  7:11     ` Leon Romanovsky
  2020-02-19 14:14       ` Dennis Dalessandro
  1 sibling, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2020-02-19  7:11 UTC
  To: Dennis Dalessandro; +Cc: linux-rdma, Jason Gunthorpe, Honggang LI, Gal Pressman

On Tue, Feb 18, 2020 at 12:11:47PM -0500, Dennis Dalessandro wrote:
> On 2/18/2020 9:04 AM, Leon Romanovsky wrote:
> > On Fri, Feb 14, 2020 at 01:13:53PM -0500, Dennis Dalessandro wrote:
> > > Was there any discussion on the upgrade scenario for existing deployments as
> > > far as device-rename changing node descriptions?
> > >
> > > If someone is running an older version of rdma-core they are going to have a
> > > certain set of node descriptions for each node. This could be in logs, or
> > > configuration databases, who knows what. Now if they upgrade to a new
> > > version of rdma-core their node descriptions all automatically change out
> > > from under them by default.
> > >
> > > Of course the admin could disable the rename prior to upgrade and as Leon
> > > pointed out previously the upgrade won't remove the disablement file. The
> > > problem is they would have to know to do that ahead of time.
> >
> > Dennis,
> >
> > It was discussed and the conclusion was that most if not all users are
> > using one of two upgrade and strategy.
>
> Do you have a pointer to a thread I can read, I apparently missed it?

First, we started to talk about it even before patches were sent.
See this summary from LPC 2017:
 * the sysadmin will be able to disable this for "backward support"
https://lore.kernel.org/linux-rdma/20170917125603.GA5788@mtr-leonro.local/
Second, it was also discussed during the submission; you just need to
keep googling :)

>
> > First option is to rely on distro and every distro behaves differently
> > in such cases, some of them won't change anything till their last user
> > dies :) and others more dynamic with more up-to-date packages already
> > adopted our default.
>
> This is the issue I see. The problem is when the distro doesn't know any
> better and pulls in a new rdma-core and breaks things unintentionally. Up to
> date is good, but up to date that brings with it what is essentially an ABI
> breakage is not.

ABI breakage is a strong term; luckily enough, it is not defined at all.
We never considered dmesg prints, device names, or device ordering to be
ABI. You can't rely on debug features either; they can disappear too.

So the bottom line: the expectation is that distros should fix all broken
software before enabling device renaming, and their bugs are not an excuse
to declare ABI breakage.

>
> > Second option is to use numerous OFED stacks, which are expected to
> > provide full upgrade to all components which will work smoothly.
>
> Yeah I'm sure OFED will handle things for themselves.

In the end, OFED stacks behave like "mini-distros", so if they manage to
handle it, distros should be able to do the same.

Thanks

* Re: RDMA device renames and node description
  2020-02-19  7:11     ` Leon Romanovsky
@ 2020-02-19 14:14       ` Dennis Dalessandro
  2020-02-19 14:35         ` Gal Pressman
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Dennis Dalessandro @ 2020-02-19 14:14 UTC
  To: Leon Romanovsky; +Cc: linux-rdma, Jason Gunthorpe, Honggang LI, Gal Pressman

On 2/19/2020 2:11 AM, Leon Romanovsky wrote:
> On Tue, Feb 18, 2020 at 12:11:47PM -0500, Dennis Dalessandro wrote:
>> On 2/18/2020 9:04 AM, Leon Romanovsky wrote:
>>> On Fri, Feb 14, 2020 at 01:13:53PM -0500, Dennis Dalessandro wrote:
>>>> Was there any discussion on the upgrade scenario for existing deployments as
>>>> far as device-rename changing node descriptions?
>>>>
>>>> If someone is running an older version of rdma-core they are going to have a
>>>> certain set of node descriptions for each node. This could be in logs, or
>>>> configuration databases, who knows what. Now if they upgrade to a new
>>>> version of rdma-core their node descriptions all automatically change out
>>>> from under them by default.
>>>>
>>>> Of course the admin could disable the rename prior to upgrade and as Leon
>>>> pointed out previously the upgrade won't remove the disablement file. The
>>>> problem is they would have to know to do that ahead of time.
>>>
>>> Dennis,
>>>
>>> It was discussed and the conclusion was that most if not all users are
>>> using one of two upgrade and strategy.
>>
>> Do you have a pointer to a thread I can read, I apparently missed it?
> 
> First, we started to talk about it even before patches were sent.
> See this summary from LPC 2017:
>   * the sysadmin will be able to disable this for "backward support"
> https://lore.kernel.org/linux-rdma/20170917125603.GA5788@mtr-leonro.local/
> Second, during the submission too, just need to continue to google it :)

So it was discussed at a meeting 2 years ago that not everyone was at, I 
certainly wasn't, and you summarized with:

		* predictable or persistent device names?
			* need to be able to rename a device

That's not very helpful. Even Jason's presentation that is linked there 
does not address the down side to the node rename especially as far as 
the impact to node description is concerned.

I have looked at the original submission and again I don't see any 
mention of the node description problem. Just an admission that the 
names are harder to read and not what everyone is used to but being 
consistent in scripts is much more important [1].

I'd have to say the script angle is far less important than 
configuration files for thousands of nodes of a large deployment being 
obsoleted without an end user's knowledge beforehand.

>>
>>> First option is to rely on distro and every distro behaves differently
>>> in such cases, some of them won't change anything till their last user
>>> dies :) and others more dynamic with more up-to-date packages already
>>> adopted our default.
>>
>> This is the issue I see. The problem is when the distro doesn't know any
>> better and pulls in a new rdma-core and breaks things unintentionally. Up to
>> date is good, but up to date that brings with it what is essentially an ABI
>> breakage is not.
> 
> ABI breakage is a strong word, luckily enough it is not defined at all.
> We never considered dmesg prints, device names, device ordering as an
> ABI. You can't rely on debug features too, they can disappear too.

Agree, it is a strong word and we can call it what you want. The point 
is you should be able to rely on the node description not being changed 
out from under you unnecessarily though. We aren't talking about a debug 
feature here but a core feature to real world deployments.

Could you envision a patch to a user space library that just changes a
device's hostname to something that was HW specific because it makes
scripting easier? I contend that in some cases the node description 
remaining constant is just as important.

> So the bottom line, the expectation that distro should fix all broken
> software before enabling device renaming and their bugs are not excuse
> to declare ABI breakage.

Again, call it what you want, but you can't deny that this change to
force the rename by default has broken things. For the record I'm not even
talking about PSM2 here. There are other, more far reaching implications.

>>
>>> Second option is to use numerous OFED stacks, which are expected to
>>> provide full upgrade to all components which will work smoothly.
>>
>> Yeah I'm sure OFED will handle things for themselves.
> 
> At the end, OFED stacks behave like "mini-distros", so if they manage to
> handle it, distro should do the same.
The difference there is that to the distro the RDMA subsystem is but one
small part. To OFED it is the sole focus. So I expect OFED stacks to be
more agile at handling this sort of thing.

[1]https://patchwork.kernel.org/patch/10870445/

-Denny

* Re: RDMA device renames and node description
  2020-02-19 14:14       ` Dennis Dalessandro
@ 2020-02-19 14:35         ` Gal Pressman
  2020-02-19 15:10           ` Leon Romanovsky
  2020-02-19 16:54           ` Jason Gunthorpe
  2020-02-19 14:48         ` Leon Romanovsky
  2020-02-19 16:58         ` Jason Gunthorpe
  2 siblings, 2 replies; 15+ messages in thread
From: Gal Pressman @ 2020-02-19 14:35 UTC
  To: Dennis Dalessandro, Leon Romanovsky
  Cc: linux-rdma, Jason Gunthorpe, Honggang LI, Gal Pressman

On 19/02/2020 16:14, Dennis Dalessandro wrote:
> On 2/19/2020 2:11 AM, Leon Romanovsky wrote:
>> On Tue, Feb 18, 2020 at 12:11:47PM -0500, Dennis Dalessandro wrote:
>>> On 2/18/2020 9:04 AM, Leon Romanovsky wrote:
>>>> On Fri, Feb 14, 2020 at 01:13:53PM -0500, Dennis Dalessandro wrote:
>> ABI breakage is a strong word, luckily enough it is not defined at all.
>> We never considered dmesg prints, device names, device ordering as an
>> ABI. You can't rely on debug features too, they can disappear too.
> 
> Agree, it is a strong word and we can call it what you want. The point is you
> should be able to rely on the node description not being changed out from under
> you unnecessarily though. We aren't talking about a debug feature here but a
> core feature to real world deployments.
> 
> Could you envision a patch to a user space library that just changes a devices
> hostname to something that was HW specific because it makes scripting easier? I
> contend that in some cases the node description remaining constant is just as
> important.
> 
>> So the bottom line, the expectation that distro should fix all broken
>> software before enabling device renaming and their bugs are not excuse
>> to declare ABI breakage.
> 
> Again, call it what you want, but you can't deny this change to force the rename
> by default has not broken things. For the record I'm not even talking about PSM2
> here. There are other, more far reaching implications.

It's not just PSM2, it broke our libfabric provider and apparently MVAPICH as well:
http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2020-January/006960.html

Regarding the issue you described, why not disable the rename on the upgrade
path and only enable it for fresh installations?

* Re: RDMA device renames and node description
  2020-02-19 14:14       ` Dennis Dalessandro
  2020-02-19 14:35         ` Gal Pressman
@ 2020-02-19 14:48         ` Leon Romanovsky
  2020-02-19 15:34           ` Dennis Dalessandro
  2020-02-19 16:58         ` Jason Gunthorpe
  2 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2020-02-19 14:48 UTC
  To: Dennis Dalessandro; +Cc: linux-rdma, Jason Gunthorpe, Honggang LI, Gal Pressman

On Wed, Feb 19, 2020 at 09:14:06AM -0500, Dennis Dalessandro wrote:
> On 2/19/2020 2:11 AM, Leon Romanovsky wrote:
> > On Tue, Feb 18, 2020 at 12:11:47PM -0500, Dennis Dalessandro wrote:
> > > On 2/18/2020 9:04 AM, Leon Romanovsky wrote:
> > > > On Fri, Feb 14, 2020 at 01:13:53PM -0500, Dennis Dalessandro wrote:
> > > > > Was there any discussion on the upgrade scenario for existing deployments as
> > > > > far as device-rename changing node descriptions?
> > > > >
> > > > > If someone is running an older version of rdma-core they are going to have a
> > > > > certain set of node descriptions for each node. This could be in logs, or
> > > > > configuration databases, who knows what. Now if they upgrade to a new
> > > > > version of rdma-core their node descriptions all automatically change out
> > > > > from under them by default.
> > > > >
> > > > > Of course the admin could disable the rename prior to upgrade and as Leon
> > > > > pointed out previously the upgrade won't remove the disablement file. The
> > > > > problem is they would have to know to do that ahead of time.
> > > >
> > > > Dennis,
> > > >
> > > > It was discussed and the conclusion was that most if not all users are
> > > > using one of two upgrade and strategy.
> > >
> > > Do you have a pointer to a thread I can read, I apparently missed it?
> >
> > First, we started to talk about it even before patches were sent.
> > See this summary from LPC 2017:
> >   * the sysadmin will be able to disable this for "backward support"
> > https://lore.kernel.org/linux-rdma/20170917125603.GA5788@mtr-leonro.local/
> > Second, during the submission too, just need to continue to google it :)
>
> So it was discussed at a meeting 2 years ago that not everyone was at, I
> certainly wasn't, and you summarized with:
>
> 		* predictable or persistent device names?
> 			* need to be able to rename a device
>
> That's not very helpful. Even Jason's presentation that is linked there does
> not address the down side to the node rename especially as far as the impact
> to node description is concerned.
>
> I have looked at the original submission and again I don't see any mention
> of the node description problem. Just an admission that the names are harder
> to read and not what everyone is used to but being consistent in scripts is
> much more important [1].
>
> I'd have to say the script angle is far less important than configuration
> files for thousands of nodes of a large deployment being obsoleted without
> an end user's knowledge beforehand.
>
> > >
> > > > First option is to rely on distro and every distro behaves differently
> > > > in such cases, some of them won't change anything till their last user
> > > > dies :) and others more dynamic with more up-to-date packages already
> > > > adopted our default.
> > >
> > > This is the issue I see. The problem is when the distro doesn't know any
> > > better and pulls in a new rdma-core and breaks things unintentionally. Up to
> > > date is good, but up to date that brings with it what is essentially an ABI
> > > breakage is not.
> >
> > ABI breakage is a strong word, luckily enough it is not defined at all.
> > We never considered dmesg prints, device names, device ordering as an
> > ABI. You can't rely on debug features too, they can disappear too.
>
> Agree, it is a strong word and we can call it what you want. The point is
> you should be able to rely on the node description not being changed out
> from under you unnecessarily though. We aren't talking about a debug feature
> here but a core feature to real world deployments.
>
> Could you envision a patch to a user space library that just changes a
> devices hostname to something that was HW specific because it makes
> scripting easier? I contend that in some cases the node description
> remaining constant is just as important.

I think the opposite is true, and I'm afraid that changing back to the old
naming scheme will return us to the state where broken software is left
unfixed.

So why don't you add such a patch to the Intel OFED package?

>
> > So the bottom line, the expectation that distro should fix all broken
> > software before enabling device renaming and their bugs are not excuse
> > to declare ABI breakage.
>
> Again, call it what you want, but you can't deny this change to force the
> rename by default has not broken things. For the record I'm not even talking
> about PSM2 here. There are other, more far reaching implications.

Sure, I'm not arguing about that. However, like any other upstream
project, we want to keep the option to change defaults. We followed the
same path that netdev and systemd did in this regard.

>
> > >
> > > > Second option is to use numerous OFED stacks, which are expected to
> > > > provide full upgrade to all components which will work smoothly.
> > >
> > > Yeah I'm sure OFED will handle things for themselves.
> >
> > At the end, OFED stacks behave like "mini-distros", so if they manage to
> > handle it, distro should do the same.
>  The difference there is to the distro the RDMA sub system is but one small
> part. To OFED it is the sole focus. So I expect OFED stacks to be more agile
> at handling this sort of thing.

I disagree with the first part of this paragraph. All major distributions
closely follow rdma-core and this ML.

>
> [1]https://patchwork.kernel.org/patch/10870445/
>
> -Denny

* Re: RDMA device renames and node description
  2020-02-19 14:35         ` Gal Pressman
@ 2020-02-19 15:10           ` Leon Romanovsky
  2020-02-19 16:54           ` Jason Gunthorpe
  1 sibling, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2020-02-19 15:10 UTC
  To: Gal Pressman; +Cc: Dennis Dalessandro, linux-rdma, Jason Gunthorpe, Honggang LI

On Wed, Feb 19, 2020 at 04:35:40PM +0200, Gal Pressman wrote:
> On 19/02/2020 16:14, Dennis Dalessandro wrote:
> > On 2/19/2020 2:11 AM, Leon Romanovsky wrote:
> >> On Tue, Feb 18, 2020 at 12:11:47PM -0500, Dennis Dalessandro wrote:
> >>> On 2/18/2020 9:04 AM, Leon Romanovsky wrote:
> >>>> On Fri, Feb 14, 2020 at 01:13:53PM -0500, Dennis Dalessandro wrote:
> >> ABI breakage is a strong word, luckily enough it is not defined at all.
> >> We never considered dmesg prints, device names, device ordering as an
> >> ABI. You can't rely on debug features too, they can disappear too.
> >
> > Agree, it is a strong word and we can call it what you want. The point is you
> > should be able to rely on the node description not being changed out from under
> > you unnecessarily though. We aren't talking about a debug feature here but a
> > core feature to real world deployments.
> >
> > Could you envision a patch to a user space library that just changes a devices
> > hostname to something that was HW specific because it makes scripting easier? I
> > contend that in some cases the node description remaining constant is just as
> > important.
> >
> >> So the bottom line, the expectation that distro should fix all broken
> >> software before enabling device renaming and their bugs are not excuse
> >> to declare ABI breakage.
> >
> > Again, call it what you want, but you can't deny this change to force the rename
> > by default has not broken things. For the record I'm not even talking about PSM2
> > here. There are other, more far reaching implications.
>
> It's not just PSM2, it broke our libfabric provider and apparently MVAPICH as well:
> http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2020-January/006960.html
>
> Regarding the issue you described, why not disable the rename on the upgrade
> path and only enable it for fresh installations?

Good suggestion. At least in theory, it can be done for the RPMs. We just
need to be careful to distinguish upgrades from pre-v24 versions from
upgrades from post-v24 versions.
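
That distinction could be sketched roughly as below. This is a hypothetical Python illustration of the decision a package install script would have to make, not actual rdma-core packaging code. The function name and the use of a plain integer for the major version are assumptions; v24 is taken from the sentence above as the boundary, and treating v24 itself as "post" is also an assumption.

```python
from typing import Optional

# Boundary release mentioned above; assumed to be the first release in
# which renaming is the default.
RENAME_DEFAULT_SINCE = 24


def should_disable_rename(previous_major: Optional[int]) -> bool:
    """Decide whether an install script should drop a disablement file.

    previous_major is the major version of the rdma-core being replaced,
    or None on a fresh installation.
    """
    if previous_major is None:
        return False  # fresh install: take the new default
    # Upgrading from a release that never renamed: keep the old names.
    # Upgrading from v24 or later: the system already runs with the rename
    # (or the admin explicitly disabled it), so leave things alone.
    return previous_major < RENAME_DEFAULT_SINCE
```

The second branch is the part that needs care: without it, an upgrade from v25 to v27 would silently revert a system that had already switched to the new names.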

Thanks

* Re: RDMA device renames and node description
  2020-02-19 14:48         ` Leon Romanovsky
@ 2020-02-19 15:34           ` Dennis Dalessandro
  0 siblings, 0 replies; 15+ messages in thread
From: Dennis Dalessandro @ 2020-02-19 15:34 UTC
  To: Leon Romanovsky; +Cc: linux-rdma, Jason Gunthorpe, Honggang LI, Gal Pressman

On 2/19/2020 9:48 AM, Leon Romanovsky wrote:
>>> At the end, OFED stacks behave like "mini-distros", so if they manage to
>>> handle it, distro should do the same.
>>   The difference there is to the distro the RDMA sub system is but one small
>> part. To OFED it is the sole focus. So I expect OFED stacks to be more agile
>> at handling this sort of thing.
> 
> I disagree about first part of this paragraph. All major distributions
> follow closely rdma-core and this ML.
> 

The distros do certainly have people that follow the list. What I'm 
saying is we shouldn't have relied on them to ensure there were no 
implications brought on by the default behavior change. Instead we as 
developers should have brought the node description impact to light more 
proactively vs being reactive.

-Denny

* Re: RDMA device renames and node description
  2020-02-19 14:35         ` Gal Pressman
  2020-02-19 15:10           ` Leon Romanovsky
@ 2020-02-19 16:54           ` Jason Gunthorpe
  1 sibling, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2020-02-19 16:54 UTC
  To: Gal Pressman; +Cc: Dennis Dalessandro, Leon Romanovsky, linux-rdma, Honggang LI

On Wed, Feb 19, 2020 at 04:35:40PM +0200, Gal Pressman wrote:

> > Again, call it what you want, but you can't deny this change to force the rename
> > by default has not broken things. For the record I'm not even talking about PSM2
> > here. There are other, more far reaching implications.
> 
> It's not just PSM2, it broke our libfabric provider and apparently MVAPICH as well:
> http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2020-January/006960.html

You all recognize that name-dependent stuff like this is horribly hacky,
right?

The whole point of doing this, over a long time, is to get all this
hacky stuff fixed up.
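
As a rough illustration of the alternative, software can enumerate whatever RDMA devices the kernel exposes instead of assuming a particular name. A minimal Python sketch follows, assuming the devices appear as entries under /sys/class/infiniband. The helper name and the sysfs_root parameter are hypothetical and exist only to make the example testable.

```python
import os


def list_rdma_devices(sysfs_root="/sys/class/infiniband"):
    """Return the RDMA device names currently registered, sorted.

    The names are whatever the kernel (or a udev rename) assigned, so the
    caller never needs to know the naming scheme in advance.
    """
    try:
        return sorted(os.listdir(sysfs_root))
    except FileNotFoundError:
        return []  # no RDMA devices or subsystem on this machine
```

Real applications would normally go through the libibverbs device enumeration rather than sysfs; the point is only that the name is discovered, not assumed.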

> Regarding the issue you described, why not disable the rename on the upgrade
> path and only enable it for fresh installations?

That isn't really a rdma-core issue, though?

Jason

* Re: RDMA device renames and node description
  2020-02-19 14:14       ` Dennis Dalessandro
  2020-02-19 14:35         ` Gal Pressman
  2020-02-19 14:48         ` Leon Romanovsky
@ 2020-02-19 16:58         ` Jason Gunthorpe
  2020-02-19 19:35           ` Dennis Dalessandro
  2 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2020-02-19 16:58 UTC
  To: Dennis Dalessandro; +Cc: Leon Romanovsky, linux-rdma, Honggang LI, Gal Pressman

On Wed, Feb 19, 2020 at 09:14:06AM -0500, Dennis Dalessandro wrote:

> > ABI breakage is a strong word, luckily enough it is not defined at all.
> > We never considered dmesg prints, device names, device ordering as an
> > ABI. You can't rely on debug features too, they can disappear too.
> 
> Agree, it is a strong word and we can call it what you want. The point is
> you should be able to rely on the node description not being changed out
> from under you unnecessarily though. We aren't talking about a debug feature
> here but a core feature to real world deployments.

People really use the node description as some stable name? And then
they put the HCA name in it? Why?

Is that something unique to the OPA subnet manager?

I don't recall people complaining about this when we introduced
rdma-ndd by default and changed all the node descriptions away from
the kernel default.

Also don't forget the whole thing about the node description is
inherently racy, so relying on it is Rather A Bad Idea.

Should we change the default format string of rdma-ndd to something
else?
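
To make the concern concrete, here is a small sketch of how a device rename propagates into the node description. It assumes the rdma-ndd default format is "%h %d", with %h standing for the hostname and %d for the device name; verify that against your rdma-ndd documentation. The expansion function is a hypothetical stand-in written in Python, not rdma-ndd's code, and the host and device names are made up.

```python
def expand_node_desc(fmt, hostname, device):
    """Expand %h (hostname) and %d (device name) in a format string."""
    return fmt.replace("%h", hostname).replace("%d", device)


DEFAULT_FORMAT = "%h %d"  # assumed default; verify against your rdma-ndd

before = expand_node_desc(DEFAULT_FORMAT, "node01", "mlx5_0")
after = expand_node_desc(DEFAULT_FORMAT, "node01", "ibp0s12f4")
print(before)  # node01 mlx5_0
print(after)   # node01 ibp0s12f4
```

A format without %d, such as "%h", would yield the same description before and after the rename, at the cost of no longer distinguishing multiple devices on one host.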

Jason

* Re: RDMA device renames and node description
  2020-02-19 16:58         ` Jason Gunthorpe
@ 2020-02-19 19:35           ` Dennis Dalessandro
  2020-02-19 23:18             ` Ira Weiny
  0 siblings, 1 reply; 15+ messages in thread
From: Dennis Dalessandro @ 2020-02-19 19:35 UTC
  To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Honggang LI, Gal Pressman

On 2/19/2020 11:58 AM, Jason Gunthorpe wrote:
> On Wed, Feb 19, 2020 at 09:14:06AM -0500, Dennis Dalessandro wrote:
> 
>>> ABI breakage is a strong word, luckily enough it is not defined at all.
>>> We never considered dmesg prints, device names, device ordering as an
>>> ABI. You can't rely on debug features too, they can disappear too.
>>
>> Agree, it is a strong word and we can call it what you want. The point is
>> you should be able to rely on the node description not being changed out
>> from under you unnecessarily though. We aren't talking about a debug feature
>> here but a core feature to real world deployments.
> 
> People really use the node description as some stable name? And then
> they put the HCA name in it? Why?

I've seen it in multiple places. Including storage configuration files. 
Suffice to say, yes people use it.

> Is that some thing unique to the OPA subnet manager?

I don't think so.

> I don't recall people complaining about this when we introduced
> rdma-ndd by default and changed all the node descriptions away from
> the kernel default.

Sure but the reason rdma-ndd exists is because people care about the 
node descriptions. I can't really speak to the historical adoption of 
rdma-ndd but I believe it was a stand alone package/feature and was a 
conscious decision to use or not as opposed to the one package to rule 
them all rdma-core like we have now.

> Also don't forget the whole thing about the node description is
> inherently racey, so relying on it is Rather A Bad Idea.

I think that point is well taken and I don't think anyone is against the 
idea of fixing the "hacky" things as you like to say. This one just 
caught people by surprise is all.

> Should we change the default format string of rdma-ndd to something
> else?

I'm not sure. I can envision situations where a user has updated 
libraries that are happy with the new persistent names but still want 
the node description to not change. If rdma-ndd could do something to 
keep the node desc the same, then in situations like this the device 
rename would not have to be disabled.
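
To make that concrete, here is a minimal sketch of the idea, not anything rdma-ndd does today. It assumes the node description has the form "<hostname> <device>" and that an admin supplies a hypothetical mapping from the new persistent device name back to the legacy one. The function name, the mapping, and the device names below are illustrative assumptions only.

```python
def node_desc(hostname, device, legacy_names=None):
    """Build a "<hostname> <device>" node description.

    legacy_names is a hypothetical admin-supplied dict mapping a renamed
    (persistent) device name back to the name used before the rename.
    """
    legacy_names = legacy_names or {}
    # Fall back to the current device name when no legacy name is pinned.
    shown = legacy_names.get(device, device)
    return f"{hostname} {shown}"


# Without a pin, the description follows the renamed device.
print(node_desc("node0170", "ibp24s0f0"))
# With a pin, the description stays what it was before the upgrade.
print(node_desc("node0170", "ibp24s0f0", {"ibp24s0f0": "mlx5_0"}))
```

With something along these lines the device rename itself could stay enabled while the node description seen by the fabric does not change.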

Given that we have seen problems with MVAPICH (even with mlx5), 
libfabric, psm2, and I believe Open MPI has a similar issue, and that 
Intel, Amazon, Red Hat, and SUSE are experiencing issues from this, I 
think we should make things as flexible as possible to protect users 
from breakages.

We do want to move in a forward direction though so we don't want to go 
back to the old way unilaterally. I think distros can handle their 
upgrade situations, and we could build protection into rdma-ndd, 
something like a specific udev rule for keeping the node desc the same. 
That would give us flexibility until all the software and use cases 
catch up.

-Denny


* Re: RDMA device renames and node description
  2020-02-19 19:35           ` Dennis Dalessandro
@ 2020-02-19 23:18             ` Ira Weiny
  2020-02-20  2:26               ` Dennis Dalessandro
  0 siblings, 1 reply; 15+ messages in thread
From: Ira Weiny @ 2020-02-19 23:18 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, Honggang LI, Gal Pressman

On Wed, Feb 19, 2020 at 02:35:09PM -0500, Dennis Dalessandro wrote:
> On 2/19/2020 11:58 AM, Jason Gunthorpe wrote:
> > On Wed, Feb 19, 2020 at 09:14:06AM -0500, Dennis Dalessandro wrote:
> > 
> > > > ABI breakage is a strong word, luckily enough it is not defined at all.
> > > > We never considered dmesg prints, device names, device ordering as an
> > > > ABI. You can't rely on debug features too, they can disappear too.
> > > 
> > > Agree, it is a strong word and we can call it what you want. The point is
> > > you should be able to rely on the node description not being changed out
> > > from under you unnecessarily though. We aren't talking about a debug feature
> > > here but a core feature to real world deployments.
> > 
> > People really use the node description as some stable name? And then
> > they put the HCA name in it? Why?
> 
> I've seen it in multiple places. Including storage configuration files.
> Suffice to say, yes people use it.
> 
> > Is that some thing unique to the OPA subnet manager?
> 
> I don't think so.
> 
> > I don't recall people complaining about this when we introduced
> > rdma-ndd by default and changed all the node descriptions away from
> > the kernel default.
> 
> Sure but the reason rdma-ndd exists is because people care about the node
> descriptions.

Yes people do.  Give a sys-admin 

0x00117501017af5cc
vs
node0170 hca-0

And see which one they get frustrated with.

>
> I can't really speak to the historical adoption of rdma-ndd

I originally wrote it...  So I have some history.

> but I believe it was a stand alone package/feature and was a conscious
> decision to use or not as opposed to the one package to rule them all
> rdma-core like we have now.

rdma-ndd was built to solve the race between potential host name changes and
ports coming on line.

The background is that many people use hostnames to describe their nodes.  If
configured, rdma-ndd reacts to new ports and/or hostname updates and then
updates the node descriptor according to the specified configuration.  If the
user wanted to use hostnames they could, or it could be configured with some
static name if that is what admins wanted.  Hostname was just the "most likely
choice".
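
The behavior described above can be sketched roughly as follows. This is an illustrative model, not the rdma-ndd source. It assumes a format string in which %h stands for the hostname and %d for the device name; the class and method names are hypothetical, and the device name is only an example value.

```python
def expand_nd_format(fmt, hostname, device):
    """Expand %h (hostname) and %d (device name) in a format string."""
    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            code = fmt[i + 1]
            if code == "h":
                out.append(hostname)
            elif code == "d":
                out.append(device)
            elif code == "%":
                out.append("%")
            else:
                # Unknown specifier: keep it verbatim.
                out.append("%" + code)
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)


class NodeDescTracker:
    """Hypothetical model: recompute descriptions on the two kinds of event."""

    def __init__(self, fmt, hostname):
        self.fmt = fmt
        self.hostname = hostname
        self.descs = {}  # device name -> current node description

    def on_device_added(self, device):
        # A new device/port appeared: compute its description now.
        self.descs[device] = expand_nd_format(self.fmt, self.hostname, device)

    def on_hostname_changed(self, hostname):
        # The hostname changed: refresh every known device.
        self.hostname = hostname
        for device in self.descs:
            self.descs[device] = expand_nd_format(self.fmt, hostname, device)


tracker = NodeDescTracker("%h %d", "node0170")
tracker.on_device_added("hfi1_0")
tracker.on_hostname_changed("node0171")
print(tracker.descs)
```

A format with no specifiers (a plain static name) passes through unchanged, which corresponds to the "static name" case. Handling both events is what avoids the race between hostname changes and ports coming online.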

> 
> > Also don't forget the whole thing about the node description is
> > inherently racey, so relying on it is Rather A Bad Idea.
> 
> I think that point is well taken and I don't think anyone is against the
> idea of fixing the "hacky" things as you like to say. This one just caught
> people by surprise is all.
> 
> > Should we change the default format string of rdma-ndd to something
> > else?
> 
> I'm not sure. I can envision situations where a user has updated libraries
> that are happy with the new persistent names but still want the node
> description to not change. If rdma-ndd could do something to keep the node
> desc the same, then in situations like this the device rename would not have
> to be disabled.
> 
> Given that we have seen problems with MVAPICH (even with mlx5), libfabric,
> psm2, and I believe Open MPI has a similar issue, and that Intel, Amazon,
> Red Hat, and SUSE are experiencing issues from this, I think we should make
> things as flexible as possible to protect users from breakages.
> 
> We do want to move in a forward direction though so we don't want to go back
> to the old way unilaterally. I think distros can handle their upgrade
> situations, and we could build protection into rdma-ndd, something like a
> specific udev rule for keeping the node desc the same. That would give us
> flexibility until all the software and use cases catch up.

The use of node descriptor was intended to be entirely up to the installation
in a manner to debug/locate nodes.  Not be used in libraries.  I'm surprised
that libraries are broken.

Regardless does the old rdma-ndd config exist?  Could it be configured and/or
modified to give the old names?  When it was written we designed the default
config to give the old names for backwards compatibility.  Apparently this is
no longer true?

Ira

> 
> -Denny


* Re: RDMA device renames and node description
  2020-02-19 23:18             ` Ira Weiny
@ 2020-02-20  2:26               ` Dennis Dalessandro
  0 siblings, 0 replies; 15+ messages in thread
From: Dennis Dalessandro @ 2020-02-20  2:26 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, Honggang LI, Gal Pressman

On 2/19/2020 6:18 PM, Ira Weiny wrote:
> The use of node descriptor was intended to be entirely up to the installation
> in a manner to debug/locate nodes.  Not be used in libraries.  I'm surprised
> that libraries are broken.

Libraries are broken due to the rename of the device. The changing of 
the node descriptor is another consequence of the device rename. 
Libraries can be patched. Node descriptors changing out from under a sys 
admin is another problem altogether.

> Regardless does the old rdma-ndd config exist?  Could it be configured and/or
> modified to give the old names?  When it was written we designed the default
> config to give the old names for backwards compatibility.  Apparently this is
> no longer true?

I'm sure there are ways to get back to the old name. The problem is what 
happens by default when users upgrade.

-Denny

Thread overview: 15+ messages
2020-02-14 18:13 RDMA device renames and node description Dennis Dalessandro
2020-02-18 14:04 ` Leon Romanovsky
2020-02-18 17:11   ` Dennis Dalessandro
2020-02-18 20:08     ` Jason Gunthorpe
2020-02-19  7:11     ` Leon Romanovsky
2020-02-19 14:14       ` Dennis Dalessandro
2020-02-19 14:35         ` Gal Pressman
2020-02-19 15:10           ` Leon Romanovsky
2020-02-19 16:54           ` Jason Gunthorpe
2020-02-19 14:48         ` Leon Romanovsky
2020-02-19 15:34           ` Dennis Dalessandro
2020-02-19 16:58         ` Jason Gunthorpe
2020-02-19 19:35           ` Dennis Dalessandro
2020-02-19 23:18             ` Ira Weiny
2020-02-20  2:26               ` Dennis Dalessandro
