[PATCH 1/3] nvme-core: improve avoiding false remove namespace

* [PATCH 1/3] nvme-core: improve avoiding false remove namespace
@ 2020-08-20  3:53 ` Chao Leng
  0 siblings, 0 replies; 16+ messages in thread
From: Chao Leng @ 2020-08-20  3:53 UTC (permalink / raw)
  To: linux-nvme; +Cc: linux-block, kbusch, axboe, hch, sagi, lengchao

nvme_revalidate_disk translates the returned error to 0 when the error
is not fatal, which avoids falsely removing the namespace. For negative
error codes, only -ENOMEM is currently translated to 0, but other errors
except -ENODEV, such as -EAGAIN or -EBUSY, also need to be translated
to 0.

Another reason for improving the error translation: if a request times
out during connect, __nvme_submit_sync_cmd returns
NVME_SC_HOST_ABORTED_CMD (> 0). At that point the connect process should
be terminated, but it falsely continues, which may cause a deadlock.
Many functions that call __nvme_submit_sync_cmd treat an error code > 0
as "not supported by the target" and continue, but
NVME_SC_HOST_ABORTED_CMD and NVME_SC_HOST_PATH_ERROR both mean that the
I/O was cancelled by the host. To fix this bug we need to set the
NVME_REQ_CANCELLED flag, so that __nvme_submit_sync_cmd translates the
returned error to -EINTR. This conflicts with the error translation in
nvme_revalidate_disk and may cause false namespace removal.

Signed-off-by: Chao Leng <lengchao@huawei.com>
---
 drivers/nvme/host/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 88cff309d8e4..43ac8a1ad65d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2130,10 +2130,10 @@ static int _nvme_revalidate_disk(struct gendisk *disk)
 	 * Only fail the function if we got a fatal error back from the
 	 * device, otherwise ignore the error and just move on.
 	 */
-	if (ret == -ENOMEM || (ret > 0 && !(ret & NVME_SC_DNR)))
-		ret = 0;
-	else if (ret > 0)
+	if (ret > 0 && (ret & NVME_SC_DNR))
 		ret = blk_status_to_errno(nvme_error_status(ret));
+	else if (ret != -ENODEV)
+		ret = 0;
 	return ret;
 }

-- 
2.16.4
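
For readers following the thread, the effect of the hunk above can be
tabulated with a small userspace sketch. This is not kernel code:
translate_status() is a hypothetical stand-in for
blk_status_to_errno(nvme_error_status(ret)) that always yields -EIO, and
the NVME_SC_* values are reproduced here as assumptions (the
authoritative definitions are in include/linux/nvme.h).

```c
#include <assert.h>
#include <errno.h>

/* Assumed values; check include/linux/nvme.h for the real definitions. */
#define NVME_SC_HOST_ABORTED_CMD 0x371
#define NVME_SC_DNR              0x4000

/* Hypothetical stand-in for blk_status_to_errno(nvme_error_status(ret)). */
static int translate_status(int status)
{
	(void)status;
	return -EIO;
}

/* Tail of _nvme_revalidate_disk() before the patch. */
static int revalidate_tail_old(int ret)
{
	if (ret == -ENOMEM || (ret > 0 && !(ret & NVME_SC_DNR)))
		ret = 0;
	else if (ret > 0)
		ret = translate_status(ret);
	return ret;
}

/* Tail of _nvme_revalidate_disk() with the patch applied. */
static int revalidate_tail_new(int ret)
{
	if (ret > 0 && (ret & NVME_SC_DNR))
		ret = translate_status(ret);
	else if (ret != -ENODEV)
		ret = 0;
	return ret;
}

/* Usage example: the cases the commit message talks about. */
static void check_revalidate_tail(void)
{
	/* -EINTR and -EAGAIN leak through the old tail, not the new one. */
	assert(revalidate_tail_old(-EINTR) == -EINTR);
	assert(revalidate_tail_new(-EINTR) == 0);
	assert(revalidate_tail_old(-EAGAIN) == -EAGAIN);
	assert(revalidate_tail_new(-EAGAIN) == 0);

	/* -ENODEV and statuses with DNR set still fail in both versions. */
	assert(revalidate_tail_old(-ENODEV) == -ENODEV);
	assert(revalidate_tail_new(-ENODEV) == -ENODEV);
	assert(revalidate_tail_old(NVME_SC_HOST_ABORTED_CMD | NVME_SC_DNR) == -EIO);
	assert(revalidate_tail_new(NVME_SC_HOST_ABORTED_CMD | NVME_SC_DNR) == -EIO);
}
```

Under these assumptions the only behavioural difference is for negative
errno values other than -ENOMEM and -ENODEV: the old tail returns them
unchanged, the new tail maps them to 0.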

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme-core: improve avoiding false remove namespace
  2020-08-20  3:53 ` Chao Leng
@ 2020-08-20  4:33   ` Sagi Grimberg
  -1 siblings, 0 replies; 16+ messages in thread
From: Sagi Grimberg @ 2020-08-20  4:33 UTC (permalink / raw)
  To: Chao Leng, linux-nvme; +Cc: linux-block, kbusch, axboe, hch


> nvme_revalidate_disk translate return error to 0 if it is not a fatal
> error, thus avoid false remove namespace. If return error less than 0,
> now only ENOMEM be translated to 0, but other error except ENODEV,
> such as EAGAIN or EBUSY etc, also need translate to 0.
> Another reason for improving the error translation: If request timeout
> when connect, __nvme_submit_sync_cmd will return
> NVME_SC_HOST_ABORTED_CMD(>0). At this time, should terminate the
> connect process, but falsely continue the connect process,
> this may cause deadlock. Many functions which call
> __nvme_submit_sync_cmd treat error code(> 0) as target not support and
> continue, but NVME_SC_HOST_ABORTED_CMD and NVME_SC_HOST_PATH_ERROR both
> are cancled io by host, to fix this bug, we need set the flag:
> NVME_REQ_CANCELLED, thus __nvme_submit_sync_cmd will translate return
> error to INTR. This is conflict with error translation of
> nvme_revalidate_disk, may cause false remove namespace.
> 
> Signed-off-by: Chao Leng <lengchao@huawei.com>
> ---
>   drivers/nvme/host/core.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 88cff309d8e4..43ac8a1ad65d 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2130,10 +2130,10 @@ static int _nvme_revalidate_disk(struct gendisk *disk)
>   	 * Only fail the function if we got a fatal error back from the
>   	 * device, otherwise ignore the error and just move on.
>   	 */
> -	if (ret == -ENOMEM || (ret > 0 && !(ret & NVME_SC_DNR)))
> -		ret = 0;
> -	else if (ret > 0)
> +	if (ret > 0 && (ret & NVME_SC_DNR))
>   		ret = blk_status_to_errno(nvme_error_status(ret));
> +	else if (ret != -ENODEV)
> +		ret = 0;
>   	return ret;

We really need to take a step back here, I really don't like how
we are growing implicit assumptions on how statuses are interpreted.

Why don't we remove the -ENODEV error propagation back and instead
take care of it in the specific call-sites where we want to ignore
errors with proper quirks?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme-core: improve avoiding false remove namespace
  2020-08-20  4:33   ` Sagi Grimberg
@ 2020-08-20  6:22     ` Chao Leng
  -1 siblings, 0 replies; 16+ messages in thread
From: Chao Leng @ 2020-08-20  6:22 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme; +Cc: linux-block, kbusch, axboe, hch



On 2020/8/20 12:33, Sagi Grimberg wrote:
> 
>> nvme_revalidate_disk translate return error to 0 if it is not a fatal
>> error, thus avoid false remove namespace. If return error less than 0,
>> now only ENOMEM be translated to 0, but other error except ENODEV,
>> such as EAGAIN or EBUSY etc, also need translate to 0.
>> Another reason for improving the error translation: If request timeout
>> when connect, __nvme_submit_sync_cmd will return
>> NVME_SC_HOST_ABORTED_CMD(>0). At this time, should terminate the
>> connect process, but falsely continue the connect process,
>> this may cause deadlock. Many functions which call
>> __nvme_submit_sync_cmd treat error code(> 0) as target not support and
>> continue, but NVME_SC_HOST_ABORTED_CMD and NVME_SC_HOST_PATH_ERROR both
>> are cancled io by host, to fix this bug, we need set the flag:
>> NVME_REQ_CANCELLED, thus __nvme_submit_sync_cmd will translate return
>> error to INTR. This is conflict with error translation of
>> nvme_revalidate_disk, may cause false remove namespace.
>>
>> Signed-off-by: Chao Leng <lengchao@huawei.com>
>> ---
>>   drivers/nvme/host/core.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 88cff309d8e4..43ac8a1ad65d 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -2130,10 +2130,10 @@ static int _nvme_revalidate_disk(struct gendisk *disk)
>>        * Only fail the function if we got a fatal error back from the
>>        * device, otherwise ignore the error and just move on.
>>        */
>> -    if (ret == -ENOMEM || (ret > 0 && !(ret & NVME_SC_DNR)))
>> -        ret = 0;
>> -    else if (ret > 0)
>> +    if (ret > 0 && (ret & NVME_SC_DNR))
>>           ret = blk_status_to_errno(nvme_error_status(ret));
>> +    else if (ret != -ENODEV)
>> +        ret = 0;
>>       return ret;
> 
> We really need to take a step back here, I really don't like how
> we are growing implicit assumptions on how statuses are interpreted.
Agree.
> 
> Why don't we remove the -ENODEV error propagation back and instead
> take care of it in the specific call-sites where we want to ignore
> errors with proper quirks?

Maybe we can do it like this:
---
  drivers/nvme/host/core.c | 15 ++++-----------
  1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 88cff309d8e4..dc02208ff655 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2104,7 +2104,7 @@ static int _nvme_revalidate_disk(struct gendisk *disk)

  	ret = nvme_identify_ns(ctrl, ns->head->ns_id, &id);
  	if (ret)
-		goto out;
+		return 0;

  	if (id->ncap == 0) {
  		ret = -ENODEV;
@@ -2112,8 +2112,10 @@ static int _nvme_revalidate_disk(struct gendisk *disk)
  	}

  	ret = nvme_report_ns_ids(ctrl, ns->head->ns_id, id, &ids);
-	if (ret)
+	if (ret) {
+		ret = 0;
  		goto free_id;
+	}

  	if (!nvme_ns_ids_equal(&ns->head->ids, &ids)) {
  		dev_err(ctrl->device,
@@ -2125,15 +2127,6 @@ static int _nvme_revalidate_disk(struct gendisk *disk)
  	ret = __nvme_revalidate_disk(disk, id);
  free_id:
  	kfree(id);
-out:
-	/*
-	 * Only fail the function if we got a fatal error back from the
-	 * device, otherwise ignore the error and just move on.
-	 */
-	if (ret == -ENOMEM || (ret > 0 && !(ret & NVME_SC_DNR)))
-		ret = 0;
-	else if (ret > 0)
-		ret = blk_status_to_errno(nvme_error_status(ret));
  	return ret;
  }

-- 
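
The control flow that the proposal above would leave in
_nvme_revalidate_disk() can be modelled roughly as follows. This is a
userspace sketch only: struct scan_result and revalidate_flow() are
hypothetical names, the struct fields stand in for the results of the
real helpers, and the -ENODEV on an ID mismatch is an assumption, since
that branch is not fully visible in the diff context.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for the results of the real helpers. */
struct scan_result {
	int identify_ret;    /* return value of nvme_identify_ns() */
	unsigned long ncap;  /* id->ncap reported by the controller */
	int report_ids_ret;  /* return value of nvme_report_ns_ids() */
	bool ids_match;      /* result of nvme_ns_ids_equal() */
	int update_ret;      /* return value of __nvme_revalidate_disk() */
};

/* Errors are ignored at each call site instead of in a common exit path. */
static int revalidate_flow(const struct scan_result *r)
{
	if (r->identify_ret)
		return 0;        /* identify failed: keep the namespace */
	if (r->ncap == 0)
		return -ENODEV;  /* namespace is gone */
	if (r->report_ids_ret)
		return 0;        /* could not read the IDs: keep the namespace */
	if (!r->ids_match)
		return -ENODEV;  /* assumed: identifiers changed */
	return r->update_ret;    /* passed through without translation */
}

/* Usage example. */
static void check_revalidate_flow(void)
{
	struct scan_result r = {
		.identify_ret = -EINTR, .ncap = 1,
		.report_ids_ret = 0, .ids_match = true, .update_ret = 0,
	};

	assert(revalidate_flow(&r) == 0);        /* cancelled identify ignored */

	r.identify_ret = 0;
	r.ncap = 0;
	assert(revalidate_flow(&r) == -ENODEV);  /* zero capacity removes it */

	r.ncap = 1;
	r.report_ids_ret = -EAGAIN;
	assert(revalidate_flow(&r) == 0);        /* ID query failure ignored */
}
```

Note that in this model any status from the identify step is dropped,
including one with DNR set, which is a difference from the original
exit-path translation.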

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme-core: improve avoiding false remove namespace
  2020-08-20  4:33   ` Sagi Grimberg
@ 2020-08-20  8:29     ` Christoph Hellwig
  -1 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2020-08-20  8:29 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: Chao Leng, linux-nvme, linux-block, kbusch, axboe, hch

On Wed, Aug 19, 2020 at 09:33:22PM -0700, Sagi Grimberg wrote:
> We really need to take a step back here, I really don't like how
> we are growing implicit assumptions on how statuses are interpreted.
>
> Why don't we remove the -ENODEV error propagation back and instead
> take care of it in the specific call-sites where we want to ignore
> errors with proper quirks?

So the one thing I'm not even sure about is if just ignoring the
errors was a good idea to start with.  It obviously is if we just
did a rescan and ran into an error while rescanning a namespace
that didn't change.  But what if it actually did change?

So I think a logic like in this patch kinda makes sense, but I think
we also need to retry and scan again on these kinds of errors.  Btw,
did you ever actually see -ENOMEM in practice?  With the small
allocations that we do it really should not happen normally, so
special casing for it always felt a little strange.

FYI, I've started rebasing various bits of work I've done to start
untangling the mess.  Here is my current WIP, which in this form
is completely untested:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/nvme-scanning-cleanup

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme-core: improve avoiding false remove namespace
  2020-08-20  8:29     ` Christoph Hellwig
@ 2020-08-20 15:44       ` Sagi Grimberg
  -1 siblings, 0 replies; 16+ messages in thread
From: Sagi Grimberg @ 2020-08-20 15:44 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chao Leng, linux-nvme, linux-block, kbusch, axboe


>> We really need to take a step back here, I really don't like how
>> we are growing implicit assumptions on how statuses are interpreted.
>>
>> Why don't we remove the -ENODEV error propagation back and instead
>> take care of it in the specific call-sites where we want to ignore
>> errors with proper quirks?
> 
> So the one thing I'm not even sure about is if just ignoring the
> errors was a good idea to start with.  They obviously are if we just
> did a rescan and did run into an error while rescanning a namespace
> that didn't change.  But what if it actually did change?

Right, we don't know, so if we failed without DNR, we assume that
we will retry again and ignore the error. The assumption is that
we will retry when we will reconnect as we don't have a retry mechanism
for these requests.

> So I think a logic like in this patch kinda makes sense, but I think
> we also need to retry and scan again on these kinds of errors.

So you are OK with keeping nvme_submit_sync_cmd returning -ENODEV for
cancelled requests and have the scan flow assume that these are
cancelled requests?

At the very least we need a good comment to say what is going on there.

> Btw,
> did you ever actually see -ENOMEM in practice?  With the small
> allocations that we do it really should not happen normally, so
> special casing for it always felt a little strange.

Never seen it, it's there just because we have allocations in the path.

> FYI, I've started rebasing various bits of work I've done to start
> untangling the mess.  Here is my current WIP, which in this form
> is completely untested:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/nvme-scanning-cleanup

This does not yet sort out what is discussed here, correct?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme-core: improve avoiding false remove namespace
  2020-08-20 15:44       ` Sagi Grimberg
@ 2020-08-21  1:36         ` Chao Leng
  -1 siblings, 0 replies; 16+ messages in thread
From: Chao Leng @ 2020-08-21  1:36 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: linux-nvme, linux-block, kbusch, axboe



On 2020/8/20 23:44, Sagi Grimberg wrote:
> 
>>> We really need to take a step back here, I really don't like how
>>> we are growing implicit assumptions on how statuses are interpreted.
>>>
>>> Why don't we remove the -ENODEV error propagation back and instead
>>> take care of it in the specific call-sites where we want to ignore
>>> errors with proper quirks?
>>
>> So the one thing I'm not even sure about is if just ignoring the
>> errors was a good idea to start with.  They obviously are if we just
>> did a rescan and did run into an error while rescanning a namespace
>> that didn't change.  But what if it actually did change?
> 
> Right, we don't know, so if we failed without DNR, we assume that
> we will retry again and ignore the error. The assumption is that
> we will retry when we will reconnect as we don't have a retry mechanism
> for these requests.
Except for the DNR and -ENODEV cases, we cannot know whether the
namespace has changed. This is a low-probability event. Following the
principle of minimal impact and of not assuming a fault without
evidence, assuming that the namespace has not changed may be a good
choice. If the namespace did change, it is handled when the namespace
is accessed, which does not cause any problem.
> 
>> So I think a logic like in this patch kinda makes sense, but I think
>> we also need to retry and scan again on these kinds of errors.
> 
> So you are OK with keeping nvme_submit_sync_cmd returning -ENODEV for
> cancelled requests and have the scan flow assume that these are
> cancelled requests?
> 
> At the very least we need a good comment to say what is going on there.
> 
>    Btw,
>> did you ever actually see -ENOMEM in practice?  With the small
>> allocations that we do it really should not happen normally, so
>> special casing for it always felt a little strange.
Agree.
This applies not only to -ENOMEM: whenever we do not know whether the
namespace has changed, assuming that it has not may be a good choice.
> 
> Never seen it, it's there just because we have allocations in the path.
> 
>> FYI, I've started rebasing various bits of work I've done to start
>> untangling the mess.  Here is my current WIP, which in this form
>> is completely untested:
>>
>> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/nvme-scanning-cleanup
> 
> This does not yet contain sorting out what is discussed here correct?
> .

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme-core: improve avoiding false remove namespace
  2020-08-20 15:44       ` Sagi Grimberg
@ 2020-08-21  6:25         ` Christoph Hellwig
  -1 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2020-08-21  6:25 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, Chao Leng, linux-nvme, linux-block, kbusch, axboe

On Thu, Aug 20, 2020 at 08:44:13AM -0700, Sagi Grimberg wrote:
>> So the one thing I'm not even sure about is if just ignoring the
>> errors was a good idea to start with.  They obviously are if we just
>> did a rescan and did run into an error while rescanning a namespace
>> that didn't change.  But what if it actually did change?
>
> Right, we don't know, so if we failed without DNR, we assume that
> we will retry again and ignore the error. The assumption is that
> we will retry when we will reconnect as we don't have a retry mechanism
> for these requests.

Yes.  And I think for anything related to namespace (re-)scanning
we can actually trivially build a sane retry mechanism.  That is, give
up on the current scan_work, and just rescan after a short wait.
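
As a rough illustration of the "give up and rescan after a short wait"
idea, here is a userspace sketch of the control flow. It is not how the
kernel would implement it (there the scan work would presumably be
re-queued rather than looped), and every name in it, including
scan_with_retry() and the choice of which errors count as transient, is
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

typedef int (*scan_fn)(void *ctx);
typedef void (*wait_fn)(unsigned int ms);

/* Hypothetical policy: errors after which a later rescan may succeed. */
static bool scan_error_is_transient(int err)
{
	return err == -EINTR || err == -EAGAIN ||
	       err == -EBUSY || err == -ENOMEM;
}

/*
 * Run scan(); on a transient error abandon this pass, wait delay_ms and
 * scan again, up to max_attempts passes. Returns the last scan result.
 */
static int scan_with_retry(scan_fn scan, void *ctx, wait_fn wait,
			   unsigned int delay_ms, int max_attempts)
{
	int ret = -EINVAL;

	for (int i = 0; i < max_attempts; i++) {
		ret = scan(ctx);
		if (!scan_error_is_transient(ret))
			return ret;
		if (i + 1 < max_attempts)
			wait(delay_ms);
	}
	return ret;
}

/* Usage example: a scan that fails twice with -EAGAIN, then succeeds. */
static int fake_scan(void *ctx)
{
	int *calls = ctx;

	return ++(*calls) < 3 ? -EAGAIN : 0;
}

static unsigned int waited_ms;

static void fake_wait(unsigned int ms)
{
	waited_ms += ms;
}

static void check_scan_with_retry(void)
{
	int calls = 0;

	waited_ms = 0;
	assert(scan_with_retry(fake_scan, &calls, fake_wait, 10, 5) == 0);
	assert(calls == 3);
	assert(waited_ms == 20);
}
```

The point of the sketch is only that a transient failure no longer has
to be answered with either "remove the namespace" or "silently ignore";
the scan can simply be repeated.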

>> So I think a logic like in this patch kinda makes sense, but I think
>> we also need to retry and scan again on these kinds of errors.
>
> So you are OK with keeping nvme_submit_sync_cmd returning -ENODEV for
> cancelled requests and have the scan flow assume that these are
> cancelled requests?

How does nvme_submit_sync_cmd return -ENODEV?  As far as I can tell
-ENODEV is our special escape for expected-ish errors in namespace
scanning.
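
For reference, the error translation as the patch description states it
can be modelled in userspace like this.  It is only a sketch: the
sketch_* names are made up, and sketch_status_to_errno() is a
hypothetical stand-in for blk_status_to_errno(nvme_error_status(ret))
that always yields -EIO.

```c
#include <errno.h>

#define SKETCH_SC_DNR 0x4000 /* same bit as the kernel's NVME_SC_DNR */

/* Hypothetical stand-in for the real status-to-errno conversion. */
static int sketch_status_to_errno(int status)
{
	(void)status;
	return -EIO;
}

/*
 * Model of the revalidate error translation:
 *  - positive NVMe status with DNR set: fatal, convert to an errno
 *  - -ENODEV: the escape that lets the caller remove the namespace
 *  - everything else (no DNR, -ENOMEM, -EAGAIN, -EINTR, ...): ignored
 */
static int sketch_translate_revalidate_error(int ret)
{
	if (ret > 0 && (ret & SKETCH_SC_DNR))
		return sketch_status_to_errno(ret);
	if (ret == -ENODEV)
		return ret;
	return 0;
}
```

Under this model a cancelled request that surfaces as -EINTR is
translated to 0, so it cannot by itself cause the namespace to be
removed.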

> At the very least we need a good comment to say what is going on there.

Absolutely.

>
>   Btw,
>> did you ever actually see -ENOMEM in practice?  With the small
>> allocations that we do it really should not happen normally, so
>> special casing for it always felt a little strange.
>
> Never seen it, it's there just because we have allocations in the path.
>
>> FYI, I've started rebasing various bits of work I've done to start
>> untangling the mess.  Here is my current WIP, which in this form
>> is completely untested:
>>
>> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/nvme-scanning-cleanup
>
> This does not yet contain sorting out what is discussed here correct?

No, but all the infrastructure needed to implement my above idea.  Most
importantly the crazy revalidate callchains are pretty much gone and we're
down to just a few functions with reasonable call chains.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme-core: improve avoiding false remove namespace
  2020-08-21  6:25         ` Christoph Hellwig
@ 2020-08-21 20:23           ` Sagi Grimberg
  -1 siblings, 0 replies; 16+ messages in thread
From: Sagi Grimberg @ 2020-08-21 20:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-nvme, linux-block, Chao Leng, kbusch


>>> So the one thing I'm not even sure about is if just ignoring the
>>> errors was a good idea to start with.  They obviously are if we just
>>> did a rescan and did run into an error while rescanning a namespace
>>> that didn't change.  But what if it actually did change?
>>
>> Right, we don't know, so if we failed without DNR, we assume that
>> we will retry again and ignore the error. The assumption is that
>> we will retry when we will reconnect as we don't have a retry mechanism
>> for these requests.
> 
> Yes.  And I think for anything related to namespace (re-)scanning
> we can actually trivially build a sane retry mechanism.  That is, give
> up on the current scan_work, and just rescan once after a short wait.

There is no point in doing that if we are disconnected and will in
the future reconnect, which will trigger a scan that can actually
work.
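
That objection can be written down as a guard in front of any delayed
rescan.  The sketch below is an illustration only: the enums and
sketch_pick_followup() are hypothetical names, not driver code, and
nobody in this thread has proposed this exact shape.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the controller state. */
enum sketch_ctrl_state {
	SKETCH_CTRL_LIVE,
	SKETCH_CTRL_CONNECTING,
};

enum sketch_followup {
	SKETCH_NO_FOLLOWUP,        /* scan result is final */
	SKETCH_RESCAN_AFTER_DELAY, /* connected: a delayed rescan may succeed */
	SKETCH_WAIT_FOR_RECONNECT, /* disconnected: the reconnect rescans anyway */
};

/* Decide what to do after a scan pass that hit (or did not hit) a soft error. */
static enum sketch_followup
sketch_pick_followup(enum sketch_ctrl_state state, bool soft_error)
{
	if (!soft_error)
		return SKETCH_NO_FOLLOWUP;
	if (state != SKETCH_CTRL_LIVE)
		return SKETCH_WAIT_FOR_RECONNECT;
	return SKETCH_RESCAN_AFTER_DELAY;
}
```

A delayed rescan is only chosen while the controller is live; in every
disconnected case the sketch leaves the work to the scan that the
reconnect triggers.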

>>> So I think a logic like in this patch kinda makes sense, but I think
>>> we also need to retry and scan again on these kinds of errors.
>>
>> So you are OK with keeping nvme_submit_sync_cmd returning -ENODEV for
>> cancelled requests and have the scan flow assume that these are
>> cancelled requests?
> 
> How does nvme_submit_sync_cmd return -ENODEV?  As far as I can tell
> -ENODEV is our special escape for expected-ish errors in namespace
> scanning.

One of these escapes I guess :)

>> At the very least we need a good comment to say what is going on there.
> 
> Absolutely.
> 
>>
>>    Btw,
>>> did you ever actually see -ENOMEM in practice?  With the small
>>> allocations that we do it really should not happen normally, so
>>> special casing for it always felt a little strange.
>>
>> Never seen it, it's there just because we have allocations in the path.
>>
>>> FYI, I've started rebasing various bits of work I've done to start
>>> untangling the mess.  Here is my current WIP, which in this form
>>> is completely untested:
>>>
>>> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/nvme-scanning-cleanup
>>
>> This does not yet contain sorting out what is discussed here correct?
> 
> No, but all the infrastructure needed to implement my above idea.  Most
> importantly the crazy revalidate callchains are pretty much gone and we're
> down to just a few functions with reasonable call chains.

OK, that makes sense. I'm still not convinced the retry makes sense
though...

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-08-21 20:23 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-20  3:53 [PATCH 1/3] nvme-core: improve avoiding false remove namespace Chao Leng
2020-08-20  3:53 ` Chao Leng
2020-08-20  4:33 ` Sagi Grimberg
2020-08-20  4:33   ` Sagi Grimberg
2020-08-20  6:22   ` Chao Leng
2020-08-20  6:22     ` Chao Leng
2020-08-20  8:29   ` Christoph Hellwig
2020-08-20  8:29     ` Christoph Hellwig
2020-08-20 15:44     ` Sagi Grimberg
2020-08-20 15:44       ` Sagi Grimberg
2020-08-21  1:36       ` Chao Leng
2020-08-21  1:36         ` Chao Leng
2020-08-21  6:25       ` Christoph Hellwig
2020-08-21  6:25         ` Christoph Hellwig
2020-08-21 20:23         ` Sagi Grimberg
2020-08-21 20:23           ` Sagi Grimberg
