linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
       [not found]       ` <e4603511-6dae-e26d-12a9-e9fa727a8d03@grimberg.me>
@ 2019-08-26  6:56         ` Christoph Hellwig
  2019-08-26  7:59           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-08-26  6:56 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, linux-nvme, Keith Busch, James Smart,
	linux-kernel, Greg Kroah-Hartman

On Thu, Aug 22, 2019 at 12:10:23PM -0700, Sagi Grimberg wrote:
>> You are correct that this information can be derived from sysfs, but the
>> main reason why we add these here, is because in udev rule we can't
>> just go ahead and start looking these up and parsing these..
>>
>> We could send the discovery aen with NVME_CTRL_NAME and have
>> then have systemd run something like:
>>
>> nvme connect-all -d nvme0 --sysfs
>>
>> and have nvme-cli retrieve all this stuff from sysfs?
>
> Actually that may be a problem.
>
> There could be a hypothetical case where after the event was fired
> and before it was handled, the discovery controller went away and
> came back again with a different controller instance, and the old
> instance is now a different discovery controller.
>
> This is why we need this information in the event. And we verify this
> information in sysfs in nvme-cli.

Well, that must be a usual issue with uevents, right?  Don't we usually
have a increasing serial number for that or something?

If I look at other callers of kobject_uevent_env none throws in such
a huge context.

>
> Makes sense?
---end quoted text---

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-26  6:56         ` [PATCH v2 3/3] nvme: fire discovery log page change events to userspace Christoph Hellwig
@ 2019-08-26  7:59           ` Greg Kroah-Hartman
  2019-08-29 18:21             ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: Greg Kroah-Hartman @ 2019-08-26  7:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, linux-nvme, Keith Busch, James Smart, linux-kernel

On Mon, Aug 26, 2019 at 08:56:39AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 22, 2019 at 12:10:23PM -0700, Sagi Grimberg wrote:
> >> You are correct that this information can be derived from sysfs, but the
> >> main reason why we add these here, is because in udev rule we can't
> >> just go ahead and start looking these up and parsing these..
> >>
> >> We could send the discovery aen with NVME_CTRL_NAME and have
> >> then have systemd run something like:
> >>
> >> nvme connect-all -d nvme0 --sysfs
> >>
> >> and have nvme-cli retrieve all this stuff from sysfs?
> >
> > Actually that may be a problem.
> >
> > There could be a hypothetical case where after the event was fired
> > and before it was handled, the discovery controller went away and
> > came back again with a different controller instance, and the old
> > instance is now a different discovery controller.
> >
> > This is why we need this information in the event. And we verify this
> > information in sysfs in nvme-cli.
> 
> Well, that must be a usual issue with uevents, right?  Don't we usually
> have a increasing serial number for that or something?

Yes we do, userspace should use it to order events.  Does udev not
handle that properly today?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-26  7:59           ` Greg Kroah-Hartman
@ 2019-08-29 18:21             ` Sagi Grimberg
  2019-08-30  5:55               ` Christoph Hellwig
  2019-08-30  6:20               ` Greg Kroah-Hartman
  0 siblings, 2 replies; 14+ messages in thread
From: Sagi Grimberg @ 2019-08-29 18:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Christoph Hellwig
  Cc: linux-nvme, Keith Busch, James Smart, linux-kernel


>>>> You are correct that this information can be derived from sysfs, but the
>>>> main reason why we add these here, is because in udev rule we can't
>>>> just go ahead and start looking these up and parsing these..
>>>>
>>>> We could send the discovery aen with NVME_CTRL_NAME and have
>>>> then have systemd run something like:
>>>>
>>>> nvme connect-all -d nvme0 --sysfs
>>>>
>>>> and have nvme-cli retrieve all this stuff from sysfs?
>>>
>>> Actually that may be a problem.
>>>
>>> There could be a hypothetical case where after the event was fired
>>> and before it was handled, the discovery controller went away and
>>> came back again with a different controller instance, and the old
>>> instance is now a different discovery controller.
>>>
>>> This is why we need this information in the event. And we verify this
>>> information in sysfs in nvme-cli.
>>
>> Well, that must be a usual issue with uevents, right?  Don't we usually
>> have a increasing serial number for that or something?
> 
> Yes we do, userspace should use it to order events.  Does udev not
> handle that properly today?

The problem is not ordering of events, its really about the fact that
the chardev can be removed and reallocated for a different controller
(could be a completely different discovery controller) by the time
that userspace handles the event.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-29 18:21             ` Sagi Grimberg
@ 2019-08-30  5:55               ` Christoph Hellwig
  2019-08-30 18:08                 ` Sagi Grimberg
  2019-08-30  6:20               ` Greg Kroah-Hartman
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-08-30  5:55 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Greg Kroah-Hartman, Christoph Hellwig, linux-nvme, Keith Busch,
	James Smart, linux-kernel

On Thu, Aug 29, 2019 at 11:21:02AM -0700, Sagi Grimberg wrote:
>> Yes we do, userspace should use it to order events.  Does udev not
>> handle that properly today?
>
> The problem is not ordering of events, its really about the fact that
> the chardev can be removed and reallocated for a different controller
> (could be a completely different discovery controller) by the time
> that userspace handles the event.

The same is generally true for lot of kernel devices.  We could reduce
the chance by using the idr cyclic allocator.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-29 18:21             ` Sagi Grimberg
  2019-08-30  5:55               ` Christoph Hellwig
@ 2019-08-30  6:20               ` Greg Kroah-Hartman
  2019-08-30 18:14                 ` Sagi Grimberg
  1 sibling, 1 reply; 14+ messages in thread
From: Greg Kroah-Hartman @ 2019-08-30  6:20 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, linux-nvme, Keith Busch, James Smart, linux-kernel

On Thu, Aug 29, 2019 at 11:21:02AM -0700, Sagi Grimberg wrote:
> 
> > > > > You are correct that this information can be derived from sysfs, but the
> > > > > main reason why we add these here, is because in udev rule we can't
> > > > > just go ahead and start looking these up and parsing these..
> > > > > 
> > > > > We could send the discovery aen with NVME_CTRL_NAME and have
> > > > > then have systemd run something like:
> > > > > 
> > > > > nvme connect-all -d nvme0 --sysfs
> > > > > 
> > > > > and have nvme-cli retrieve all this stuff from sysfs?
> > > > 
> > > > Actually that may be a problem.
> > > > 
> > > > There could be a hypothetical case where after the event was fired
> > > > and before it was handled, the discovery controller went away and
> > > > came back again with a different controller instance, and the old
> > > > instance is now a different discovery controller.
> > > > 
> > > > This is why we need this information in the event. And we verify this
> > > > information in sysfs in nvme-cli.
> > > 
> > > Well, that must be a usual issue with uevents, right?  Don't we usually
> > > have a increasing serial number for that or something?
> > 
> > Yes we do, userspace should use it to order events.  Does udev not
> > handle that properly today?
> 
> The problem is not ordering of events, its really about the fact that
> the chardev can be removed and reallocated for a different controller
> (could be a completely different discovery controller) by the time
> that userspace handles the event.

So?  You will have gotten the remove and then new addition uevent in
order showing you this.  So your userspace code knows that something
went away and then came back properly so you should be kept in sync.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-30  5:55               ` Christoph Hellwig
@ 2019-08-30 18:08                 ` Sagi Grimberg
  2019-08-30 18:36                   ` James Smart
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2019-08-30 18:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Greg Kroah-Hartman, linux-nvme, Keith Busch, James Smart,
	linux-kernel, Hannes Reinecke


>>> Yes we do, userspace should use it to order events.  Does udev not
>>> handle that properly today?
>>
>> The problem is not ordering of events, its really about the fact that
>> the chardev can be removed and reallocated for a different controller
>> (could be a completely different discovery controller) by the time
>> that userspace handles the event.
> 
> The same is generally true for lot of kernel devices.  We could reduce
> the chance by using the idr cyclic allocator.

Well, it was raised by Hannes and James, so I'll ask them respond here
because I don't mind having it this way. I personally think that this
is a better approach than having a cyclic idr allocator. In general, I
don't necessarily think that this is a good idea to have cyclic
controller enumerations if we don't absolutely have to...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-30  6:20               ` Greg Kroah-Hartman
@ 2019-08-30 18:14                 ` Sagi Grimberg
  2019-09-02 19:33                   ` Greg Kroah-Hartman
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2019-08-30 18:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christoph Hellwig, linux-nvme, Keith Busch, James Smart, linux-kernel


>>>>>> You are correct that this information can be derived from sysfs, but the
>>>>>> main reason why we add these here, is because in udev rule we can't
>>>>>> just go ahead and start looking these up and parsing these..
>>>>>>
>>>>>> We could send the discovery aen with NVME_CTRL_NAME and have
>>>>>> then have systemd run something like:
>>>>>>
>>>>>> nvme connect-all -d nvme0 --sysfs
>>>>>>
>>>>>> and have nvme-cli retrieve all this stuff from sysfs?
>>>>>
>>>>> Actually that may be a problem.
>>>>>
>>>>> There could be a hypothetical case where after the event was fired
>>>>> and before it was handled, the discovery controller went away and
>>>>> came back again with a different controller instance, and the old
>>>>> instance is now a different discovery controller.
>>>>>
>>>>> This is why we need this information in the event. And we verify this
>>>>> information in sysfs in nvme-cli.
>>>>
>>>> Well, that must be a usual issue with uevents, right?  Don't we usually
>>>> have a increasing serial number for that or something?
>>>
>>> Yes we do, userspace should use it to order events.  Does udev not
>>> handle that properly today?
>>
>> The problem is not ordering of events, its really about the fact that
>> the chardev can be removed and reallocated for a different controller
>> (could be a completely different discovery controller) by the time
>> that userspace handles the event.
> 
> So?  You will have gotten the remove and then new addition uevent in
> order showing you this.  So your userspace code knows that something
> went away and then came back properly so you should be kept in sync.

Still don't understand how this is ok...

I have /dev/nvme0 represents a network endpoint that I would discover
from, it is raising me an event to do a discovery operation (namely to
issue an ioctl to it) so my udev code calls a systemd script.

By the time I actually get to do that, /dev/nvme0 represents now a new
network endpoint (where the event is no longer relevant to). I would
rather the discovery to explicitly fail than to give me something
different, so we pass some arguments that we verify in the operation.

Its a stretch case, but it was raised by people as a potential issue.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-30 18:08                 ` Sagi Grimberg
@ 2019-08-30 18:36                   ` James Smart
  2019-08-30 21:07                     ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2019-08-30 18:36 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Greg Kroah-Hartman, linux-nvme, Keith Busch, linux-kernel,
	Hannes Reinecke

On 8/30/2019 11:08 AM, Sagi Grimberg wrote:
>
>>>> Yes we do, userspace should use it to order events.  Does udev not
>>>> handle that properly today?
>>>
>>> The problem is not ordering of events, its really about the fact that
>>> the chardev can be removed and reallocated for a different controller
>>> (could be a completely different discovery controller) by the time
>>> that userspace handles the event.
>>
>> The same is generally true for lot of kernel devices.  We could reduce
>> the chance by using the idr cyclic allocator.
>
> Well, it was raised by Hannes and James, so I'll ask them respond here
> because I don't mind having it this way. I personally think that this
> is a better approach than having a cyclic idr allocator. In general, I
> don't necessarily think that this is a good idea to have cyclic
> controller enumerations if we don't absolutely have to...

We hit it right and left without the cyclic allocator, but that won't 
necessarily remove it.

Perhaps we should have had a unique token assigned to the controller, 
and have the event pass the name and the token.  The cli would then, if 
the token is present, validate it via an ioctl before proceeding with 
other ioctls.

Where all the connection arguments were added we due to the reuse issue 
and then solving the question of how to verify and/or lookup the desired 
controller, by using the shotgun approach rather than being very 
pointed, which is what the name/token would do.

-- james


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-30 18:36                   ` James Smart
@ 2019-08-30 21:07                     ` Sagi Grimberg
  2019-08-30 22:24                       ` James Smart
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2019-08-30 21:07 UTC (permalink / raw)
  To: James Smart, Christoph Hellwig
  Cc: Greg Kroah-Hartman, linux-nvme, Keith Busch, linux-kernel,
	Hannes Reinecke


>>>>> Yes we do, userspace should use it to order events.  Does udev not
>>>>> handle that properly today?
>>>>
>>>> The problem is not ordering of events, its really about the fact that
>>>> the chardev can be removed and reallocated for a different controller
>>>> (could be a completely different discovery controller) by the time
>>>> that userspace handles the event.
>>>
>>> The same is generally true for lot of kernel devices.  We could reduce
>>> the chance by using the idr cyclic allocator.
>>
>> Well, it was raised by Hannes and James, so I'll ask them respond here
>> because I don't mind having it this way. I personally think that this
>> is a better approach than having a cyclic idr allocator. In general, I
>> don't necessarily think that this is a good idea to have cyclic
>> controller enumerations if we don't absolutely have to...
> 
> We hit it right and left without the cyclic allocator, but that won't 
> necessarily remove it.
> 
> Perhaps we should have had a unique token assigned to the controller, 
> and have the event pass the name and the token.  The cli would then, if 
> the token is present, validate it via an ioctl before proceeding with 
> other ioctls.
> 
> Where all the connection arguments were added we due to the reuse issue 
> and then solving the question of how to verify and/or lookup the desired 
> controller, by using the shotgun approach rather than being very 
> pointed, which is what the name/token would do.

This unique token is: trtype:traddr:trsvcid:host-traddr ...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-30 21:07                     ` Sagi Grimberg
@ 2019-08-30 22:24                       ` James Smart
  2019-09-09 15:10                         ` Hannes Reinecke
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2019-08-30 22:24 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Greg Kroah-Hartman, linux-nvme, Keith Busch, linux-kernel,
	Hannes Reinecke

On 8/30/2019 2:07 PM, Sagi Grimberg wrote:
>
>>>>>> Yes we do, userspace should use it to order events.  Does udev not
>>>>>> handle that properly today?
>>>>>
>>>>> The problem is not ordering of events, its really about the fact that
>>>>> the chardev can be removed and reallocated for a different controller
>>>>> (could be a completely different discovery controller) by the time
>>>>> that userspace handles the event.
>>>>
>>>> The same is generally true for lot of kernel devices.  We could reduce
>>>> the chance by using the idr cyclic allocator.
>>>
>>> Well, it was raised by Hannes and James, so I'll ask them respond here
>>> because I don't mind having it this way. I personally think that this
>>> is a better approach than having a cyclic idr allocator. In general, I
>>> don't necessarily think that this is a good idea to have cyclic
>>> controller enumerations if we don't absolutely have to...
>>
>> We hit it right and left without the cyclic allocator, but that won't 
>> necessarily remove it.
>>
>> Perhaps we should have had a unique token assigned to the controller, 
>> and have the event pass the name and the token.  The cli would then, 
>> if the token is present, validate it via an ioctl before proceeding 
>> with other ioctls.
>>
>> Where all the connection arguments were added we due to the reuse 
>> issue and then solving the question of how to verify and/or lookup 
>> the desired controller, by using the shotgun approach rather than 
>> being very pointed, which is what the name/token would do.
>
> This unique token is: trtype:traddr:trsvcid:host-traddr ...

well yes :)  though rather verbose.   There is still a minute window as 
we're comparing values in sysfs, prior to opening the device, so 
technically something could change in that window between when we 
checked sysfs and when we open'd.   We can certain check after we open 
the device to solve that issue.

There is some elegance to a 32-bit token for the controller (can be an 
incrementing value) passed in the event and used when servicing the 
event that avoids a bunch of work.

Doing either of these would eliminate what Hannes liked - looking for 
the discovery controller with those attributes. Although, I don't know 
that looking for it is all that meaningful.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-30 18:14                 ` Sagi Grimberg
@ 2019-09-02 19:33                   ` Greg Kroah-Hartman
  2019-09-04  1:35                     ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: Greg Kroah-Hartman @ 2019-09-02 19:33 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, linux-nvme, Keith Busch, James Smart, linux-kernel

On Fri, Aug 30, 2019 at 11:14:39AM -0700, Sagi Grimberg wrote:
> 
> > > > > > > You are correct that this information can be derived from sysfs, but the
> > > > > > > main reason why we add these here, is because in udev rule we can't
> > > > > > > just go ahead and start looking these up and parsing these..
> > > > > > > 
> > > > > > > We could send the discovery aen with NVME_CTRL_NAME and have
> > > > > > > then have systemd run something like:
> > > > > > > 
> > > > > > > nvme connect-all -d nvme0 --sysfs
> > > > > > > 
> > > > > > > and have nvme-cli retrieve all this stuff from sysfs?
> > > > > > 
> > > > > > Actually that may be a problem.
> > > > > > 
> > > > > > There could be a hypothetical case where after the event was fired
> > > > > > and before it was handled, the discovery controller went away and
> > > > > > came back again with a different controller instance, and the old
> > > > > > instance is now a different discovery controller.
> > > > > > 
> > > > > > This is why we need this information in the event. And we verify this
> > > > > > information in sysfs in nvme-cli.
> > > > > 
> > > > > Well, that must be a usual issue with uevents, right?  Don't we usually
> > > > > have a increasing serial number for that or something?
> > > > 
> > > > Yes we do, userspace should use it to order events.  Does udev not
> > > > handle that properly today?
> > > 
> > > The problem is not ordering of events, its really about the fact that
> > > the chardev can be removed and reallocated for a different controller
> > > (could be a completely different discovery controller) by the time
> > > that userspace handles the event.
> > 
> > So?  You will have gotten the remove and then new addition uevent in
> > order showing you this.  So your userspace code knows that something
> > went away and then came back properly so you should be kept in sync.
> 
> Still don't understand how this is ok...
> 
> I have /dev/nvme0 represents a network endpoint that I would discover
> from, it is raising me an event to do a discovery operation (namely to
> issue an ioctl to it) so my udev code calls a systemd script.
> 
> By the time I actually get to do that, /dev/nvme0 represents now a new
> network endpoint (where the event is no longer relevant to). I would
> rather the discovery to explicitly fail than to give me something
> different, so we pass some arguments that we verify in the operation.
> 
> Its a stretch case, but it was raised by people as a potential issue.

Ok, and how do you handle this same thing for something like /dev/sda ?
(hint, it isn't new, and is something we solved over a decade ago)

If you worry about stuff like this, use a persistant device naming
scheme and have your device node be pointed to by a symlink.  Create
that symlink by using the information in the initial 'ADD' uevent.

That way, when userspace opens the device node, it "knows" exactly which
one it opens.  It sounds like you have a bunch of metadata to describe
these uniquely, so pass that in the ADD event, not in some other crazy
non-standard manner.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-09-02 19:33                   ` Greg Kroah-Hartman
@ 2019-09-04  1:35                     ` Sagi Grimberg
  2019-09-04  5:25                       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2019-09-04  1:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christoph Hellwig, linux-nvme, Keith Busch, James Smart, linux-kernel


>> Still don't understand how this is ok...
>>
>> I have /dev/nvme0 represents a network endpoint that I would discover
>> from, it is raising me an event to do a discovery operation (namely to
>> issue an ioctl to it) so my udev code calls a systemd script.
>>
>> By the time I actually get to do that, /dev/nvme0 represents now a new
>> network endpoint (where the event is no longer relevant to). I would
>> rather the discovery to explicitly fail than to give me something
>> different, so we pass some arguments that we verify in the operation.
>>
>> Its a stretch case, but it was raised by people as a potential issue.
> 
> Ok, and how do you handle this same thing for something like /dev/sda ?
> (hint, it isn't new, and is something we solved over a decade ago)
> 
> If you worry about stuff like this, use a persistant device naming
> scheme and have your device node be pointed to by a symlink.  Create
> that symlink by using the information in the initial 'ADD' uevent.
> 
> That way, when userspace opens the device node, it "knows" exactly which
> one it opens.  It sounds like you have a bunch of metadata to describe
> these uniquely, so pass that in the ADD event, not in some other crazy
> non-standard manner.

We could send these variables when adding the device and then validating
them using a rich-text-explanatory symlink. Seems slightly backwards to
me, but that would work too.

We create the char device using device_add (in nvme_init_subsystem),
I didn't find any way to append env variables to that ADD uevent.

Did you mean that we should add another flavor of device_add that
accepts char *envp_ext[]?

What exactly is the "standard manner" to pass these variables to
the chardev KOBJ_ADD uevent? All other examples I could find use
KOBJ_CHANGE to pass private stuff..

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-09-04  1:35                     ` Sagi Grimberg
@ 2019-09-04  5:25                       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 14+ messages in thread
From: Greg Kroah-Hartman @ 2019-09-04  5:25 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, linux-nvme, Keith Busch, James Smart, linux-kernel

On Tue, Sep 03, 2019 at 06:35:30PM -0700, Sagi Grimberg wrote:
> 
> > > Still don't understand how this is ok...
> > > 
> > > I have /dev/nvme0 represents a network endpoint that I would discover
> > > from, it is raising me an event to do a discovery operation (namely to
> > > issue an ioctl to it) so my udev code calls a systemd script.
> > > 
> > > By the time I actually get to do that, /dev/nvme0 represents now a new
> > > network endpoint (where the event is no longer relevant to). I would
> > > rather the discovery to explicitly fail than to give me something
> > > different, so we pass some arguments that we verify in the operation.
> > > 
> > > Its a stretch case, but it was raised by people as a potential issue.
> > 
> > Ok, and how do you handle this same thing for something like /dev/sda ?
> > (hint, it isn't new, and is something we solved over a decade ago)
> > 
> > If you worry about stuff like this, use a persistant device naming
> > scheme and have your device node be pointed to by a symlink.  Create
> > that symlink by using the information in the initial 'ADD' uevent.
> > 
> > That way, when userspace opens the device node, it "knows" exactly which
> > one it opens.  It sounds like you have a bunch of metadata to describe
> > these uniquely, so pass that in the ADD event, not in some other crazy
> > non-standard manner.
> 
> We could send these variables when adding the device and then validating
> them using a rich-text-explanatory symlink. Seems slightly backwards to
> me, but that would work too.

That's the way the driver model is expected to work, instead of having
to do crazy device-specific stuff.

> We create the char device using device_add (in nvme_init_subsystem),
> I didn't find any way to append env variables to that ADD uevent.

You do that in your uevent or dev_uevent callback like all other
subsystems.  Nothing "new" to do here, again, it's been working fine for
everyone else for well over a decade now :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace
  2019-08-30 22:24                       ` James Smart
@ 2019-09-09 15:10                         ` Hannes Reinecke
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Reinecke @ 2019-09-09 15:10 UTC (permalink / raw)
  To: James Smart, Sagi Grimberg, Christoph Hellwig
  Cc: Greg Kroah-Hartman, linux-nvme, Keith Busch, linux-kernel

On 8/31/19 12:24 AM, James Smart wrote:
> On 8/30/2019 2:07 PM, Sagi Grimberg wrote:
>>
>>>>>>> Yes we do, userspace should use it to order events.  Does udev not
>>>>>>> handle that properly today?
>>>>>>
>>>>>> The problem is not ordering of events, its really about the fact that
>>>>>> the chardev can be removed and reallocated for a different controller
>>>>>> (could be a completely different discovery controller) by the time
>>>>>> that userspace handles the event.
>>>>>
>>>>> The same is generally true for lot of kernel devices.  We could reduce
>>>>> the chance by using the idr cyclic allocator.
>>>>
>>>> Well, it was raised by Hannes and James, so I'll ask them respond here
>>>> because I don't mind having it this way. I personally think that this
>>>> is a better approach than having a cyclic idr allocator. In general, I
>>>> don't necessarily think that this is a good idea to have cyclic
>>>> controller enumerations if we don't absolutely have to...
>>>
>>> We hit it right and left without the cyclic allocator, but that won't
>>> necessarily remove it.
>>>
>>> Perhaps we should have had a unique token assigned to the controller,
>>> and have the event pass the name and the token.  The cli would then,
>>> if the token is present, validate it via an ioctl before proceeding
>>> with other ioctls.
>>>
>>> Where all the connection arguments were added we due to the reuse
>>> issue and then solving the question of how to verify and/or lookup
>>> the desired controller, by using the shotgun approach rather than
>>> being very pointed, which is what the name/token would do.
>>
>> This unique token is: trtype:traddr:trsvcid:host-traddr ...
> 
> well yes :)  though rather verbose.   There is still a minute window as
> we're comparing values in sysfs, prior to opening the device, so
> technically something could change in that window between when we
> checked sysfs and when we open'd.   We can certain check after we open
> the device to solve that issue.
> 
> There is some elegance to a 32-bit token for the controller (can be an
> incrementing value) passed in the event and used when servicing the
> event that avoids a bunch of work.
> 
> Doing either of these would eliminate what Hannes liked - looking for
> the discovery controller with those attributes. Although, I don't know
> that looking for it is all that meaningful.
> 
yeah, we do have this fundamental problem with uevents; they do refer to
a '/dev/nvmeX' device with those parameters, but this device might be
long gone (or have been reallocated) by the time the event is processed.

From my POV we have two choices here:
- rely on the transport options to find a matching controller (ignoring
the device name) and use that for sending out discovery requests. In the
end, it shouldn't really matter device we're using if the transport
options are identical. Although I'm not sure for RDMA; here we don't
necessarily have a host transport address, so we _might_ send the
discovery via the wrong controller in a CMIC enviroment.
- Match the options in nvme-cli, and just discard the event if it
doesn't match. Which is some additional coding in nvme-cli and might ran
afoul if we ever miss events.

I'd go for the second option; after considering the first introduces far
too many conditions rendering debugging impractical.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      Teamlead Storage & Networking
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 247165 (AG München), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-09-09 15:11 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190712180211.26333-1-sagi@grimberg.me>
     [not found] ` <20190712180211.26333-4-sagi@grimberg.me>
     [not found]   ` <20190822002328.GP9511@lst.de>
     [not found]     ` <205d06ab-fedc-739d-323f-b358aff2cbfe@grimberg.me>
     [not found]       ` <e4603511-6dae-e26d-12a9-e9fa727a8d03@grimberg.me>
2019-08-26  6:56         ` [PATCH v2 3/3] nvme: fire discovery log page change events to userspace Christoph Hellwig
2019-08-26  7:59           ` Greg Kroah-Hartman
2019-08-29 18:21             ` Sagi Grimberg
2019-08-30  5:55               ` Christoph Hellwig
2019-08-30 18:08                 ` Sagi Grimberg
2019-08-30 18:36                   ` James Smart
2019-08-30 21:07                     ` Sagi Grimberg
2019-08-30 22:24                       ` James Smart
2019-09-09 15:10                         ` Hannes Reinecke
2019-08-30  6:20               ` Greg Kroah-Hartman
2019-08-30 18:14                 ` Sagi Grimberg
2019-09-02 19:33                   ` Greg Kroah-Hartman
2019-09-04  1:35                     ` Sagi Grimberg
2019-09-04  5:25                       ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).