* [PATCH 0/3] nvme: fix stuck discovery
@ 2018-05-24 14:18 Hannes Reinecke
  2018-05-24 14:18 ` [PATCH 1/3] nvme: centralize discovery controller defaults Hannes Reinecke
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Hannes Reinecke @ 2018-05-24 14:18 UTC (permalink / raw)


Hi all,

When the user manages to abort an 'nvme discover' run, the resulting
discovery controller is never removed, as it will try to reconnect forever.
And as the 'duplicate_connect' option is not given, any further connection
attempt fails with -EALREADY.

This patchset fixes this issue by a) disabling reconnection attempts
and b) allowing for duplicate connections on the discovery controller.
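
For illustration, here is a minimal, self-contained userspace sketch
(plain C, not kernel code; the struct and helper names are made up) of the
connect decision being changed: without 'duplicate_connect' a repeated
connect to the still-present discovery controller is refused with
-EALREADY, while patch 3/3 forces duplicate_connect for the well-known
discovery NQN so a retried 'nvme discover' can go through:

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NVME_DISC_SUBSYS_NAME "nqn.2014-08.org.nvmexpress.discovery"

struct ctrl_opts {
	const char *subsysnqn;
	bool discovery_nqn;
	bool duplicate_connect;
};

/* stand-in for the transport's "do we already have this controller?" check */
static bool existing_controller(const struct ctrl_opts *opts)
{
	(void)opts;
	return true;	/* the aborted discovery controller is still around */
}

static int try_connect(struct ctrl_opts *opts)
{
	opts->discovery_nqn = !strcmp(opts->subsysnqn, NVME_DISC_SUBSYS_NAME);
	if (opts->discovery_nqn)
		opts->duplicate_connect = true;	/* behaviour added by patch 3/3 */

	if (!opts->duplicate_connect && existing_controller(opts))
		return -EALREADY;	/* what the retried 'nvme discover' ran into */
	return 0;
}

int main(void)
{
	struct ctrl_opts opts = { .subsysnqn = NVME_DISC_SUBSYS_NAME };

	printf("retried discovery connect: %d\n", try_connect(&opts));
	return 0;
}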

As usual, comments and reviews are welcome.

Hannes Reinecke (3):
  nvme: centralize discovery controller defaults
  nvme: avoid reconnection for the discovery controller
  nvme: allow duplicate connections to the discovery controller

 drivers/nvme/host/core.c    |  3 ++-
 drivers/nvme/host/fabrics.c | 12 ++++++++----
 2 files changed, 10 insertions(+), 5 deletions(-)

-- 
2.12.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] nvme: centralize discovery controller defaults
  2018-05-24 14:18 [PATCH 0/3] nvme: fix stuck discovery Hannes Reinecke
@ 2018-05-24 14:18 ` Hannes Reinecke
  2018-05-24 20:31   ` James Smart
  2018-05-25  8:55   ` Christoph Hellwig
  2018-05-24 14:18 ` [PATCH 2/3] nvme: avoid reconnection for the discovery controller Hannes Reinecke
  2018-05-24 14:18 ` [PATCH 3/3] nvme: allow duplicate connections to " Hannes Reinecke
  2 siblings, 2 replies; 14+ messages in thread
From: Hannes Reinecke @ 2018-05-24 14:18 UTC (permalink / raw)


When connecting to the discovery controller we have certain defaults
to observe, so centralize them to avoid inconsistencies due to argument
ordering.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/fabrics.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 13c5d3155c41..d25aa4322d16 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -698,10 +698,6 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 			opts->discovery_nqn =
 				!(strcmp(opts->subsysnqn,
 					 NVME_DISC_SUBSYS_NAME));
-			if (opts->discovery_nqn) {
-				opts->kato = 0;
-				opts->nr_io_queues = 0;
-			}
 			break;
 		case NVMF_OPT_TRADDR:
 			p = match_strdup(args);
@@ -856,6 +852,10 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 		}
 	}
 
+	if (opts->discovery_nqn) {
+		opts->kato = 0;
+		opts->nr_io_queues = 0;
+	}
 	if (ctrl_loss_tmo < 0)
 		opts->max_reconnects = -1;
 	else
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] nvme: avoid reconnection for the discovery controller
  2018-05-24 14:18 [PATCH 0/3] nvme: fix stuck discovery Hannes Reinecke
  2018-05-24 14:18 ` [PATCH 1/3] nvme: centralize discovery controller defaults Hannes Reinecke
@ 2018-05-24 14:18 ` Hannes Reinecke
  2018-05-24 20:30   ` James Smart
  2018-05-24 14:18 ` [PATCH 3/3] nvme: allow duplicate connections to " Hannes Reinecke
  2 siblings, 1 reply; 14+ messages in thread
From: Hannes Reinecke @ 2018-05-24 14:18 UTC (permalink / raw)


If the connection to the discovery controller fails for whatever
reason we should not attempt to reconnect; rather the user should
retry the discovery altogether.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/fabrics.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index d25aa4322d16..997d8677ab44 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -474,6 +474,9 @@ EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);
 
 bool nvmf_should_reconnect(struct nvme_ctrl *ctrl)
 {
+	if (ctrl->opts->discovery_nqn)
+		return false;
+
 	if (ctrl->opts->max_reconnects != -1 &&
 	    ctrl->nr_reconnects < ctrl->opts->max_reconnects)
 		return true;
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: allow duplicate connections to the discovery controller
  2018-05-24 14:18 [PATCH 0/3] nvme: fix stuck discovery Hannes Reinecke
  2018-05-24 14:18 ` [PATCH 1/3] nvme: centralize discovery controller defaults Hannes Reinecke
  2018-05-24 14:18 ` [PATCH 2/3] nvme: avoid reconnection for the discovery controller Hannes Reinecke
@ 2018-05-24 14:18 ` Hannes Reinecke
  2018-05-24 20:31   ` James Smart
                     ` (2 more replies)
  2 siblings, 3 replies; 14+ messages in thread
From: Hannes Reinecke @ 2018-05-24 14:18 UTC (permalink / raw)


The whole point of the discovery controller is that it can accept
multiple connections. Additionally the cmic field is not even defined for
the discovery controller identify page.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/core.c    | 3 ++-
 drivers/nvme/host/fabrics.c | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 10b180b8ccf4..28ac580b07e5 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2494,7 +2494,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 		 * Verify that the subsystem actually supports multiple
 		 * controllers, else bail out.
 		 */
-		if (nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
+		if (!ctrl->opts->discovery_nqn &&
+		    nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
 			dev_err(ctrl->device,
 				"ignoring ctrl due to duplicate subnqn (%s).\n",
 				found->subnqn);
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 997d8677ab44..a29b0f00a88d 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -858,6 +858,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	if (opts->discovery_nqn) {
 		opts->kato = 0;
 		opts->nr_io_queues = 0;
+		opts->duplicate_connect = true;
 	}
 	if (ctrl_loss_tmo < 0)
 		opts->max_reconnects = -1;
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] nvme: avoid reconnection for the discovery controller
  2018-05-24 14:18 ` [PATCH 2/3] nvme: avoid reconnection for the discovery controller Hannes Reinecke
@ 2018-05-24 20:30   ` James Smart
  2018-05-25  6:43     ` Hannes Reinecke
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-05-24 20:30 UTC (permalink / raw)




On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
> If the connection to the discovery controller fails for whatever
> reason we should not attempt to reconnect; rather the user should
> retry the discovery altogether.
>
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>   drivers/nvme/host/fabrics.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index d25aa4322d16..997d8677ab44 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -474,6 +474,9 @@ EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);
>   
>   bool nvmf_should_reconnect(struct nvme_ctrl *ctrl)
>   {
> +	if (ctrl->opts->discovery_nqn)
> +		return false;
> +
>   	if (ctrl->opts->max_reconnects != -1 &&
>   	    ctrl->nr_reconnects < ctrl->opts->max_reconnects)
>   		return true;

um no. It should follow the same reconnect policies as any other
controller. There is no guarantee the event that triggered the
discovery controller connect will do so again. Not all discovery controller
connects are manually initiated by an admin. So failing the reconnect
may mean you will never reconnect with the subsystems described by the
discovery controller.

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: allow duplicate connections to the discovery controller
  2018-05-24 14:18 ` [PATCH 3/3] nvme: allow duplicate connections to " Hannes Reinecke
@ 2018-05-24 20:31   ` James Smart
  2018-05-25  8:56   ` Christoph Hellwig
  2018-06-18 16:11   ` James Smart
  2 siblings, 0 replies; 14+ messages in thread
From: James Smart @ 2018-05-24 20:31 UTC (permalink / raw)




On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
> The whole point of the discovery controller is that it can accept
> multiple connections. Additionally the cmic field is not even defined for
> the discovery controller identify page.
>
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>   drivers/nvme/host/core.c    | 3 ++-
>   drivers/nvme/host/fabrics.c | 1 +
>   2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 10b180b8ccf4..28ac580b07e5 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2494,7 +2494,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
>   		 * Verify that the subsystem actually supports multiple
>   		 * controllers, else bail out.
>   		 */
> -		if (nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
> +		if (!ctrl->opts->discovery_nqn &&
> +		    nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>   			dev_err(ctrl->device,
>   				"ignoring ctrl due to duplicate subnqn (%s).\n",
>   				found->subnqn);
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index 997d8677ab44..a29b0f00a88d 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -858,6 +858,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   	if (opts->discovery_nqn) {
>   		opts->kato = 0;
>   		opts->nr_io_queues = 0;
> +		opts->duplicate_connect = true;
>   	}
>   	if (ctrl_loss_tmo < 0)
>   		opts->max_reconnects = -1;

this is reasonable.

-- james


Reviewed-by: James Smart <james.smart at broadcom.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] nvme: centralize discovery controller defaults
  2018-05-24 14:18 ` [PATCH 1/3] nvme: centralize discovery controller defaults Hannes Reinecke
@ 2018-05-24 20:31   ` James Smart
  2018-05-25  8:55   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: James Smart @ 2018-05-24 20:31 UTC (permalink / raw)




On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
> When connecting to the discovery controller we have certain defaults
> to observe, so centralize them to avoid inconsistencies due to argument
> ordering.
>
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>   drivers/nvme/host/fabrics.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index 13c5d3155c41..d25aa4322d16 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -698,10 +698,6 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   			opts->discovery_nqn =
>   				!(strcmp(opts->subsysnqn,
>   					 NVME_DISC_SUBSYS_NAME));
> -			if (opts->discovery_nqn) {
> -				opts->kato = 0;
> -				opts->nr_io_queues = 0;
> -			}
>   			break;
>   		case NVMF_OPT_TRADDR:
>   			p = match_strdup(args);
> @@ -856,6 +852,10 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
>   		}
>   	}
>   
> +	if (opts->discovery_nqn) {
> +		opts->kato = 0;
> +		opts->nr_io_queues = 0;
> +	}
>   	if (ctrl_loss_tmo < 0)
>   		opts->max_reconnects = -1;
>   	else

looks fine

-- james


Reviewed-by: James Smart <james.smart at broadcom.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/3] nvme: avoid reconnection for the discovery controller
  2018-05-24 20:30   ` James Smart
@ 2018-05-25  6:43     ` Hannes Reinecke
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Reinecke @ 2018-05-25  6:43 UTC (permalink / raw)


On 05/24/2018 10:30 PM, James Smart wrote:
> 
> 
> On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
>> If the connection to the discovery controller fails for whatever
>> reason we should not attempt to reconnect; rather the user should
>> retry the discovery altogether.
>>
>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>> ---
>>   drivers/nvme/host/fabrics.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
>> index d25aa4322d16..997d8677ab44 100644
>> --- a/drivers/nvme/host/fabrics.c
>> +++ b/drivers/nvme/host/fabrics.c
>> @@ -474,6 +474,9 @@ EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);
>>   bool nvmf_should_reconnect(struct nvme_ctrl *ctrl)
>>   {
>> +	if (ctrl->opts->discovery_nqn)
>> +		return false;
>> +
>>   	if (ctrl->opts->max_reconnects != -1 &&
>>   	    ctrl->nr_reconnects < ctrl->opts->max_reconnects)
>>   		return true;
> 
> um no. It should follow the same reconnect policies as any other
> controller. There is no guarantee the event that triggered the
> discovery controller connect will do so again. Not all discovery controller
> connects are manually initiated by an admin. So failing the reconnect
> may mean you will never reconnect with the subsystems described by the
> discovery controller.
> 
Ah. Fair enough.
So we should be limiting the number of reconnects only when triggered by 
the admin. Should be doable; I'll be looking into it.
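
A purely hypothetical sketch of that idea (the 'manual_connect' flag and
the helper are made up for illustration, not an actual option or the
existing nvmf_should_reconnect() code):

#include <stdbool.h>
#include <stdio.h>

struct ctrl_opts {
	bool discovery_nqn;
	bool manual_connect;	/* hypothetical: set when the admin ran 'nvme discover' */
	int max_reconnects;
};

struct ctrl {
	struct ctrl_opts *opts;
	int nr_reconnects;
};

/* sketch of a reconnect policy restricted to admin-triggered discovery */
static bool should_reconnect(const struct ctrl *ctrl)
{
	if (ctrl->opts->discovery_nqn && ctrl->opts->manual_connect)
		return false;	/* let the admin simply re-run the discovery */

	if (ctrl->opts->max_reconnects != -1 &&
	    ctrl->nr_reconnects < ctrl->opts->max_reconnects)
		return true;

	return false;
}

int main(void)
{
	struct ctrl_opts opts = { .discovery_nqn = true, .manual_connect = true,
				  .max_reconnects = 10 };
	struct ctrl ctrl = { .opts = &opts, .nr_reconnects = 0 };

	printf("reconnect? %s\n", should_reconnect(&ctrl) ? "yes" : "no");
	return 0;
}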

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] nvme: centralize discovery controller defaults
  2018-05-24 14:18 ` [PATCH 1/3] nvme: centralize discovery controller defaults Hannes Reinecke
  2018-05-24 20:31   ` James Smart
@ 2018-05-25  8:55   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-25  8:55 UTC (permalink / raw)


On Thu, May 24, 2018 at 04:18:15PM +0200, Hannes Reinecke wrote:
> When connecting to the discovery controller we have certain defaults
> to observe, so centralize them to avoid inconsistencies due to argument
> ordering.

Thanks, applied to nvme-4.18-2.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: allow duplicate connections to the discovery controller
  2018-05-24 14:18 ` [PATCH 3/3] nvme: allow duplicate connections to " Hannes Reinecke
  2018-05-24 20:31   ` James Smart
@ 2018-05-25  8:56   ` Christoph Hellwig
  2018-06-18 16:11   ` James Smart
  2 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-25  8:56 UTC (permalink / raw)


Thanks, applied to nvme-4.18-2.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: allow duplicate connections to the discovery controller
  2018-05-24 14:18 ` [PATCH 3/3] nvme: allow duplicate connections to " Hannes Reinecke
  2018-05-24 20:31   ` James Smart
  2018-05-25  8:56   ` Christoph Hellwig
@ 2018-06-18 16:11   ` James Smart
  2018-06-18 22:54     ` Hannes Reinecke
  2 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-06-18 16:11 UTC (permalink / raw)



On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
> The whole point of the discovery controller is that it can accept
> multiple connections. Additionally the cmic field is not even defined for
> the discovery controller identify page.
>
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>   drivers/nvme/host/core.c    | 3 ++-
>   drivers/nvme/host/fabrics.c | 1 +
>   2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 10b180b8ccf4..28ac580b07e5 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2494,7 +2494,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
>   		 * Verify that the subsystem actually supports multiple
>   		 * controllers, else bail out.
>   		 */
> -		if (nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
> +		if (!ctrl->opts->discovery_nqn &&
> +		    nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>   			dev_err(ctrl->device,
>   				"ignoring ctrl due to duplicate subnqn (%s).\n",
>   				found->subnqn);
>

Question: this snippet excludes the discovery controller from failing
the init call. But in doing so, it allows the discovery controller to be
seen as a "single unique device" based on its subnqn with another path
added. I assume the read of the discovery log command will be done
against the /dev/nvme? controller instance, so it will still go to the
right "path" that corresponds to where the "nvme connect or connect-all"
was issued to - correct? And other commands against the discovery
controller should also use ioctls to the /dev/nvme? name so the proper
device/path is used/deleted - correct?

Are there any side effects of the fact that we allow the discovery
controller to have multiple paths?

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: allow duplicate connections to the discovery controller
  2018-06-18 16:11   ` James Smart
@ 2018-06-18 22:54     ` Hannes Reinecke
  2018-06-18 23:57       ` James Smart
  0 siblings, 1 reply; 14+ messages in thread
From: Hannes Reinecke @ 2018-06-18 22:54 UTC (permalink / raw)


On 06/18/2018 06:11 PM, James Smart wrote:
> 
> On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
>> The whole point of the discovery controller is that it can accept
>> multiple connections. Additionally the cmic field is not even defined for
>> the discovery controller identify page.
>>
>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>> ---
>>   drivers/nvme/host/core.c    | 3 ++-
>>   drivers/nvme/host/fabrics.c | 1 +
>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 10b180b8ccf4..28ac580b07e5 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -2494,7 +2494,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
>>  		 * Verify that the subsystem actually supports multiple
>>  		 * controllers, else bail out.
>>  		 */
>> -		if (nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>> +		if (!ctrl->opts->discovery_nqn &&
>> +		    nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>>  			dev_err(ctrl->device,
>>  				"ignoring ctrl due to duplicate subnqn (%s).\n",
>>  				found->subnqn);
>>
> 
> Question: this snippet excludes the discovery controller from failing
> the init call. But in doing so, it allows the discovery controller to be
> seen as a "single unique device" based on its subnqn with another path
> added. I assume the read of the discovery log command will be done
> against the /dev/nvme? controller instance, so it will still go to the
> right "path" that corresponds to where the "nvme connect or connect-all"
> was issued to - correct? And other commands against the discovery
> controller should also use ioctls to the /dev/nvme? name so the proper
> device/path is used/deleted - correct?
> 
> Are there any side effects of the fact that we allow the discovery
> controller to have multiple paths?
> 
That would only be relevant if the discovery log page entry could be 
different for the individual paths.
It's somewhat hard to see how that could happen; it's not that one path 
can set the ANA CMIC bit, and the other path doesn't ...
So even if we somehow fail to get the path right we still would be 
getting the correct information, no?

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: allow duplicate connections to the discovery controller
  2018-06-18 22:54     ` Hannes Reinecke
@ 2018-06-18 23:57       ` James Smart
  2018-06-19 10:05         ` Hannes Reinecke
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-06-18 23:57 UTC (permalink / raw)




On 6/18/2018 3:54 PM, Hannes Reinecke wrote:
> On 06/18/2018 06:11 PM, James Smart wrote:
>>
>> On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
>>> The whole point of the discovery controller is that it can accept
>>> multiple connections. Additionally the cmic field is not even 
>>> defined for
>>> the discovery controller identify page.
>>>
>>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>>> ---
>>>   drivers/nvme/host/core.c    | 3 ++-
>>>   drivers/nvme/host/fabrics.c | 1 +
>>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index 10b180b8ccf4..28ac580b07e5 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -2494,7 +2494,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
>>>  		 * Verify that the subsystem actually supports multiple
>>>  		 * controllers, else bail out.
>>>  		 */
>>> -		if (nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>>> +		if (!ctrl->opts->discovery_nqn &&
>>> +		    nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>>>  			dev_err(ctrl->device,
>>>  				"ignoring ctrl due to duplicate subnqn (%s).\n",
>>>  				found->subnqn);
>>>
>>
>> Question: this snippet excludes the discovery controller from failing
>> the init call. But in doing so, it allows the discovery controller to
>> be seen as a "single unique device" based on its subnqn with another
>> path added. I assume the read of the discovery log command will be
>> done against the /dev/nvme? controller instance, so it will still go
>> to the right "path" that corresponds to where the "nvme connect or
>> connect-all" was issued to - correct? And other commands against
>> the discovery controller should also use ioctls to the /dev/nvme?
>> name so the proper device/path is used/deleted - correct?
>>
>> Are there any side effects of the fact that we allow the discovery
>> controller to have multiple paths?
>>
> That would only be relevant if the discovery log page entry could be 
> different for the individual paths.
> It's somewhat hard to see how that could happen; it's not that one 
> path can set the ANA CMIC bit, and the other path doesn't ...
> So even if we somehow fail to get the path right we still would be 
> getting the correct information, no?

This has little or nothing to do with pathing - but rather with the fact
that all discovery controllers use a single well-known NQN - and they
don't support multipathing.

There can be any number of discovery controllers, each with a different
"view" of the storage subsystems available. Thus it matters very much
which path you use. To take a more extreme example: the discovery
controller on an FC link could be very different from a discovery
controller at an IP address.

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: allow duplicate connections to the discovery controller
  2018-06-18 23:57       ` James Smart
@ 2018-06-19 10:05         ` Hannes Reinecke
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Reinecke @ 2018-06-19 10:05 UTC (permalink / raw)


On 06/19/2018 01:57 AM, James Smart wrote:
> 
> 
> On 6/18/2018 3:54 PM, Hannes Reinecke wrote:
>> On 06/18/2018 06:11 PM, James Smart wrote:
>>>
>>> On 5/24/2018 7:18 AM, Hannes Reinecke wrote:
>>>> The whole point of the discovery controller is that it can accept
>>>> multiple connections. Additionally the cmic field is not even 
>>>> defined for
>>>> the discovery controller identify page.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare at suse.com>
>>>> ---
>>>>   drivers/nvme/host/core.c    | 3 ++-
>>>>   drivers/nvme/host/fabrics.c | 1 +
>>>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>> index 10b180b8ccf4..28ac580b07e5 100644
>>>> --- a/drivers/nvme/host/core.c
>>>> +++ b/drivers/nvme/host/core.c
>>>> @@ -2494,7 +2494,8 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
>>>>  		 * Verify that the subsystem actually supports multiple
>>>>  		 * controllers, else bail out.
>>>>  		 */
>>>> -		if (nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>>>> +		if (!ctrl->opts->discovery_nqn &&
>>>> +		    nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
>>>>  			dev_err(ctrl->device,
>>>>  				"ignoring ctrl due to duplicate subnqn (%s).\n",
>>>>  				found->subnqn);
>>>>
>>>
>>> Question: this snippet excludes the discovery controller from failing
>>> the init call. But in doing so, it allows the discovery controller to
>>> be seen as a "single unique device" based on its subnqn with another
>>> path added. I assume the read of the discovery log command will be
>>> done against the /dev/nvme? controller instance, so it will still go
>>> to the right "path" that corresponds to where the "nvme connect or
>>> connect-all" was issued to - correct? And other commands against
>>> the discovery controller should also use ioctls to the /dev/nvme?
>>> name so the proper device/path is used/deleted - correct?
>>>
>>> Are there any side effects of the fact that we allow the discovery
>>> controller to have multiple paths?
>>>
>> That would only be relevant if the discovery log page entry could be 
>> different for the individual paths.
>> It's somewhat hard to see how that could happen; it's not that one 
>> path can set the ANA CMIC bit, and the other path doesn't ...
>> So even if we somehow fail to get the path right we still would be 
>> getting the correct information, no?
> 
> This has little or nothing to do with pathing - but rather with the fact
> that all discovery controllers use a single well-known NQN - and they
> don't support multipathing.
> 
> There can be any number of discovery controllers, each with a different
> "view" of the storage subsystems available. Thus it matters very much
> which path you use. To take a more extreme example: the discovery
> controller on an FC link could be very different from a discovery
> controller at an IP address.
> 
Oh, indeed.
I was under the impression that we'd be indexing the discovery entries
per transport/traddr and subsysnqn.

But now I see that we'd be indexing them just by subsysnqn, i.e. we will
only ever have one discovery subsysnqn entry.

Which really is wrong if different paths/transports can return different
discovery log pages.
And locking access to the discovery subsysnqn has scalability impacts, as
we've just seen.

So we really should index the discovery subsysnqn by transport and 
traddr, not by subsysnqn.

Will be preparing a patch.
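
As a purely hypothetical sketch of what such an index key could look like
(names, layout and the example traddr values are illustrative only, not
the eventual patch): match discovery subsystems on transport and traddr
in addition to the well-known subsysnqn:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct disc_subsys {
	const char *subsysnqn;
	const char *transport;	/* e.g. "fc" or "rdma" */
	const char *traddr;
};

/* two discovery endpoints only match if NQN, transport and traddr all agree */
static bool disc_subsys_match(const struct disc_subsys *a,
			      const struct disc_subsys *b)
{
	return !strcmp(a->subsysnqn, b->subsysnqn) &&
	       !strcmp(a->transport, b->transport) &&
	       !strcmp(a->traddr, b->traddr);
}

int main(void)
{
	struct disc_subsys fc  = { "nqn.2014-08.org.nvmexpress.discovery", "fc",
				   "nn-0x20000090fa942779:pn-0x10000090fa942779" };
	struct disc_subsys ip  = { "nqn.2014-08.org.nvmexpress.discovery", "rdma",
				   "192.168.1.10" };

	/* same well-known NQN, but different endpoints: no longer one entry */
	printf("same discovery subsystem? %s\n",
	       disc_subsys_match(&fc, &ip) ? "yes" : "no");
	return 0;
}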


Cheers,

Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-06-19 10:05 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-24 14:18 [PATCH 0/3] nvme: fix stuck discovery Hannes Reinecke
2018-05-24 14:18 ` [PATCH 1/3] nvme: centralize discovery controller defaults Hannes Reinecke
2018-05-24 20:31   ` James Smart
2018-05-25  8:55   ` Christoph Hellwig
2018-05-24 14:18 ` [PATCH 2/3] nvme: avoid reconnection for the discovery controller Hannes Reinecke
2018-05-24 20:30   ` James Smart
2018-05-25  6:43     ` Hannes Reinecke
2018-05-24 14:18 ` [PATCH 3/3] nvme: allow duplicate connections to " Hannes Reinecke
2018-05-24 20:31   ` James Smart
2018-05-25  8:56   ` Christoph Hellwig
2018-06-18 16:11   ` James Smart
2018-06-18 22:54     ` Hannes Reinecke
2018-06-18 23:57       ` James Smart
2018-06-19 10:05         ` Hannes Reinecke
