All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] nvme: ANA implementation fixes
@ 2018-07-16 10:58 Hannes Reinecke
  2018-07-16 10:58 ` [PATCH 1/5] nvme: count all ANA groups for ANA Log page Hannes Reinecke
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-16 10:58 UTC (permalink / raw)


Hi Christoph,

here's a set of patches relative to your nvme-ana branch.
Most of these issues have been encountered when running
my blktest test case (to be submitted separately).
With these patches I haven't found any other issues with
the ANA implementation, so we're good to go in principle.

Note, I've submitted another patch ("nvme: register ns_id attributes ...")
which will probably clash with the ANA series and these patches.

But we can fix it up once we've agreed on when and how to
submit the ANA series proper.

As usual, comments and reviews are welcome.

Hannes Reinecke (5):
  nvme: count all ANA groups for ANA Log page
  nvmet: fixup crash on NULL device path
  nvme-multipath: fixup crash during disconnect
  nvme-multipath: disable ANA support if parsing fails
  nvme-multipath: parse entire ANA log page in nvme_set_ns_ana_state()

 drivers/nvme/host/core.c        |  2 +-
 drivers/nvme/host/multipath.c   | 77 ++++++++++++++++++++++++++++-------------
 drivers/nvme/target/admin-cmd.c |  5 +++
 drivers/nvme/target/configfs.c  |  6 ++++
 drivers/nvme/target/core.c      |  6 +++-
 5 files changed, 69 insertions(+), 27 deletions(-)

-- 
2.12.3

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/5] nvme: count all ANA groups for ANA Log page
  2018-07-16 10:58 [PATCH 0/5] nvme: ANA implementation fixes Hannes Reinecke
@ 2018-07-16 10:58 ` Hannes Reinecke
  2018-07-19 14:31   ` Christoph Hellwig
  2018-09-17 13:49   ` Christoph Hellwig
  2018-07-16 10:58 ` [PATCH 2/5] nvmet: fixup crash on NULL device path Hannes Reinecke
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-16 10:58 UTC (permalink / raw)


When issuing a short read on the ANA log page the number of groups
should not change, even though the final returned data might contain
less groups than that number.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/target/admin-cmd.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 8de264ece826..92682ed11bc4 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -200,6 +200,11 @@ static void nvmet_execute_get_log_page_ana(struct nvmet_req *req)
 		offset += len;
 		ngrps++;
 	}
+	while (grpid <= NVMET_MAX_ANAGRPS) {
+		if (nvmet_ana_group_enabled[grpid])
+			ngrps++;
+		grpid++;
+	}
 
 	hdr.chgcnt = cpu_to_le64(nvmet_ana_chgcnt);
 	hdr.ngrps = cpu_to_le16(ngrps);
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/5] nvmet: fixup crash on NULL device path
  2018-07-16 10:58 [PATCH 0/5] nvme: ANA implementation fixes Hannes Reinecke
  2018-07-16 10:58 ` [PATCH 1/5] nvme: count all ANA groups for ANA Log page Hannes Reinecke
@ 2018-07-16 10:58 ` Hannes Reinecke
  2018-07-17 13:48   ` Christoph Hellwig
  2018-07-16 10:58 ` [PATCH 3/5] nvme-multipath: fixup crash during disconnect Hannes Reinecke
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-16 10:58 UTC (permalink / raw)


When writing an empty string into the device_path attribute the kernel
will crash with

nvmet: failed to open block device (null): (-22)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000

This patch sanitizes the error handling for invalid device path settings.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/target/configfs.c | 6 ++++++
 drivers/nvme/target/core.c     | 6 +++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 094d216ae62a..76efc8120c00 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -295,6 +295,12 @@ static ssize_t nvmet_ns_device_path_store(struct config_item *item,
 	ns->device_path = kstrndup(page, strcspn(page, "\n"), GFP_KERNEL);
 	if (!ns->device_path)
 		goto out_unlock;
+	if (!strlen(ns->device_path)) {
+		kfree(ns->device_path);
+		ns->device_path = NULL;
+		ret = -EINVAL;
+		goto out_unlock;
+	}
 
 	mutex_unlock(&subsys->lock);
 	return count;
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 0aeccbddd716..956d293d293d 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -373,8 +373,12 @@ int nvmet_ns_enable(struct nvmet_ns *ns)
 	if (ns->enabled)
 		goto out_unlock;
 
+	if (!ns->device_path) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
 	ret = nvmet_bdev_ns_enable(ns);
-	if (ret)
+	if (ret == -ENOTBLK)
 		ret = nvmet_file_ns_enable(ns);
 	if (ret)
 		goto out_unlock;
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/5] nvme-multipath: fixup crash during disconnect
  2018-07-16 10:58 [PATCH 0/5] nvme: ANA implementation fixes Hannes Reinecke
  2018-07-16 10:58 ` [PATCH 1/5] nvme: count all ANA groups for ANA Log page Hannes Reinecke
  2018-07-16 10:58 ` [PATCH 2/5] nvmet: fixup crash on NULL device path Hannes Reinecke
@ 2018-07-16 10:58 ` Hannes Reinecke
  2018-07-19 14:31   ` Christoph Hellwig
  2018-07-16 10:58 ` [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails Hannes Reinecke
  2018-07-16 10:58 ` [PATCH 5/5] nvme-multipath: parse entire ANA log page in nvme_set_ns_ana_state() Hannes Reinecke
  4 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-16 10:58 UTC (permalink / raw)


The kernel will crash during disconnect, if the first path is
'optimized' and the second 'non-optimized'. Then upon removing the
first path any I/O coming in _before_ removing the second path will
find the first path to be unusable, and assign 'fallback' to the
second. But after exiting the loop it will assign the _first_ path
to 'current_path', leading to a crash later on.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/multipath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 8980b0137fa5..5205f41c0f6c 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -129,7 +129,7 @@ static struct nvme_ns *__nvme_find_path(struct nvme_ns_head *head)
 	}
 
 	if (fallback)
-		rcu_assign_pointer(head->current_path, ns);
+		rcu_assign_pointer(head->current_path, fallback);
 	return fallback;
 }
 
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails
  2018-07-16 10:58 [PATCH 0/5] nvme: ANA implementation fixes Hannes Reinecke
                   ` (2 preceding siblings ...)
  2018-07-16 10:58 ` [PATCH 3/5] nvme-multipath: fixup crash during disconnect Hannes Reinecke
@ 2018-07-16 10:58 ` Hannes Reinecke
  2018-07-19 14:43   ` Christoph Hellwig
  2018-07-16 10:58 ` [PATCH 5/5] nvme-multipath: parse entire ANA log page in nvme_set_ns_ana_state() Hannes Reinecke
  4 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-16 10:58 UTC (permalink / raw)


If we cannot parse the ANA log page we should just disable ANA
support altogether, otherwise we fail to connect to the controller
and we have no way of figuring out what went wrong.
And we should kill the WARN_ON() statements as we now have a valid
recovery strategy.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/core.c      |  2 +-
 drivers/nvme/host/multipath.c | 71 ++++++++++++++++++++++++++++++-------------
 2 files changed, 51 insertions(+), 22 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e84eb938b952..16128f03ae37 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3392,7 +3392,7 @@ static void nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result)
 		break;
 #ifdef CONFIG_NVME_MULTIPATH
 	case NVME_AER_NOTICE_ANA:
-		if (WARN_ON_ONCE(!ctrl->ana_log_buf))
+		if (!ctrl->ana_log_buf)
 			break;
 		queue_work(nvme_wq, &ctrl->ana_work);
 		break;
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 5205f41c0f6c..3ba7a975fd2a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -72,7 +72,7 @@ void nvme_failover_req(struct request *req)
 		 */
 		set_bit(NVME_NS_ANA_PENDING, &ns->flags);
 		nvme_mpath_clear_current_path(ns);
-		if (!WARN_ON_ONCE(!ns->ctrl->ana_log_buf))
+		if (ns->ctrl->ana_log_buf)
 			queue_work(nvme_wq, &ns->ctrl->ana_work);
 		break;
 	default:
@@ -303,26 +303,41 @@ static int nvme_parse_ana_log(struct nvme_ctrl *ctrl, void *data,
 		u32 nr_nsids = le32_to_cpu(desc->nnsids);
 		size_t nsid_buf_size = nr_nsids * sizeof(__le32);
 
-		if (WARN_ON_ONCE(desc->grpid == 0))
+		if (desc->grpid == 0) {
+			dev_dbg(ctrl->device,
+				"ANA group %d: invalid grpid\n", i);
 			return -EINVAL;
-		if (WARN_ON_ONCE(le32_to_cpu(desc->grpid) > ctrl->anagrpmax))
-			return -EINVAL;
-		if (WARN_ON_ONCE(desc->state == 0))
+		}
+		if (le32_to_cpu(desc->grpid) > ctrl->anagrpmax) {
+			dev_dbg(ctrl->device,
+				"ANA group %d: grpid too large (%d, max %d)\n",
+				i, le32_to_cpu(desc->grpid), ctrl->anagrpmax);
 			return -EINVAL;
-		if (WARN_ON_ONCE(desc->state > NVME_ANA_CHANGE))
+		}
+		if (desc->state == 0 || desc->state > NVME_ANA_CHANGE) {
+			dev_dbg(ctrl->device,
+				"ANA group %d: invalid ANA state %d\n",
+				i, desc->state);
 			return -EINVAL;
+		}
 
 		offset += sizeof(*desc);
-		if (WARN_ON_ONCE(offset > ctrl->ana_log_size - nsid_buf_size))
+		if (offset > ctrl->ana_log_size - nsid_buf_size) {
+			dev_dbg(ctrl->device,
+				"ANA group %d: descriptor overflow\n", i);
 			return -EINVAL;
+		}
 
 		error = cb(ctrl, desc, data);
 		if (error)
 			return error;
 
 		offset += nsid_buf_size;
-		if (WARN_ON_ONCE(offset > ctrl->ana_log_size - sizeof(*desc)))
+		if (offset > ctrl->ana_log_size - sizeof(*desc)) {
+			dev_dbg(ctrl->device,
+				"ANA group %d: end of buffer reached\n", i);
 			return -EINVAL;
+		}
 	}
 
 	return 0;
@@ -353,12 +368,12 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 		struct nvme_ana_group_desc *desc, void *data)
 {
 	u32 nr_nsids = le32_to_cpu(desc->nnsids), n = 0;
+	u32 grpid = le32_to_cpu(desc->grpid);
 	unsigned *nr_change_groups = data;
 	struct nvme_ns *ns;
 
 	dev_info(ctrl->device, "ANA group %d: %s.\n",
-			le32_to_cpu(desc->grpid),
-			nvme_ana_state_names[desc->state]);
+			grpid, nvme_ana_state_names[desc->state]);
 
 	if (desc->state == NVME_ANA_CHANGE)
 		(*nr_change_groups)++;
@@ -375,16 +390,23 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
 			break;
 	}
 	up_write(&ctrl->namespaces_rwsem);
-	WARN_ON_ONCE(n < nr_nsids);
+	if (n < nr_nsids)
+		dev_dbg(ctrl->device,
+			"ANA group %d: only found %d of %d namespaces\n",
+			grpid, n, nr_nsids);
 	return 0;
 }
 
 static int nvme_read_ana_log(struct nvme_ctrl *ctrl, bool groups_only)
 {
 	u32 nr_change_groups = 0;
-	int error;
+	int error = -ENXIO;
 
 	mutex_lock(&ctrl->ana_lock);
+	if (!ctrl->ana_log_buf) {
+		dev_warn(ctrl->device, "ANA disabled, skip reading ANA log\n");
+		goto out_unlock;
+	}
 	error = nvme_get_log(ctrl, NVME_NSID_ALL, NVME_LOG_ANA,
 			groups_only ? NVME_ANA_LOG_RGO : 0,
 			ctrl->ana_log_buf, ctrl->ana_log_size, 0);
@@ -395,8 +417,10 @@ static int nvme_read_ana_log(struct nvme_ctrl *ctrl, bool groups_only)
 
 	error = nvme_parse_ana_log(ctrl, &nr_change_groups,
 			nvme_update_ana_state);
-	if (error)
+	if (error) {
+		dev_warn(ctrl->device, "Failed to parse ANA log: %d\n", error);
 		goto out_unlock;
+	}
 
 	/*
 	 * In theory we should have an ANATT timer per group as they might enter
@@ -472,10 +496,16 @@ static int nvme_set_ns_ana_state(struct nvme_ctrl *ctrl,
 
 void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id)
 {
+	int ret = -ENXIO;
+
 	if (nvme_ctrl_use_ana(ns->ctrl)) {
 		mutex_lock(&ns->ctrl->ana_lock);
 		ns->ana_grpid = le32_to_cpu(id->anagrpid);
-		nvme_parse_ana_log(ns->ctrl, ns, nvme_set_ns_ana_state);
+		if (ns->ctrl->ana_log_buf)
+			ret = nvme_parse_ana_log(ns->ctrl, ns,
+						 nvme_set_ns_ana_state);
+		if (ret)
+			ns->ana_state = NVME_ANA_OPTIMIZED;
 		mutex_unlock(&ns->ctrl->ana_lock);
 	} else {
 		mutex_lock(&ns->head->lock);
@@ -533,16 +563,15 @@ int nvme_mpath_init(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 	INIT_WORK(&ctrl->ana_work, nvme_ana_work);
 	ctrl->ana_log_buf = kmalloc(ctrl->ana_log_size, GFP_KERNEL);
 	if (!ctrl->ana_log_buf)
-		goto out;
+		return -ENOMEM;
 
 	error = nvme_read_ana_log(ctrl, true);
-	if (error)
-		goto out_free_ana_log_buf;
+	if (error) {
+		dev_err(ctrl->device, "disabling ANA support.\n");
+		kfree(ctrl->ana_log_buf);
+		ctrl->ana_log_buf = NULL;
+	}
 	return 0;
-out_free_ana_log_buf:
-	kfree(ctrl->ana_log_buf);
-out:
-	return -ENOMEM;
 }
 
 void nvme_mpath_uninit(struct nvme_ctrl *ctrl)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/5] nvme-multipath: parse entire ANA log page in nvme_set_ns_ana_state()
  2018-07-16 10:58 [PATCH 0/5] nvme: ANA implementation fixes Hannes Reinecke
                   ` (3 preceding siblings ...)
  2018-07-16 10:58 ` [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails Hannes Reinecke
@ 2018-07-16 10:58 ` Hannes Reinecke
  2018-07-19 14:46   ` Christoph Hellwig
  4 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-16 10:58 UTC (permalink / raw)


We need to parse the entire ANA log page, otherwise later namespaces
will not be updates with the new ANA state.

Signed-off-by: Hannes Reinecke <hare at suse.com>
---
 drivers/nvme/host/multipath.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 3ba7a975fd2a..b92f0a6048c2 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -486,10 +486,8 @@ static int nvme_set_ns_ana_state(struct nvme_ctrl *ctrl,
 {
 	struct nvme_ns *ns = data;
 
-	if (ns->ana_grpid == le32_to_cpu(desc->grpid)) {
+	if (ns->ana_grpid == le32_to_cpu(desc->grpid))
 		nvme_update_ns_ana_state(desc, ns);
-		return -ENXIO; /* just break out of the loop */
-	}
 
 	return 0;
 }
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/5] nvmet: fixup crash on NULL device path
  2018-07-16 10:58 ` [PATCH 2/5] nvmet: fixup crash on NULL device path Hannes Reinecke
@ 2018-07-17 13:48   ` Christoph Hellwig
  2018-07-17 13:48     ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-17 13:48 UTC (permalink / raw)


On Mon, Jul 16, 2018@12:58:34PM +0200, Hannes Reinecke wrote:
> index 094d216ae62a..76efc8120c00 100644
> --- a/drivers/nvme/target/configfs.c
> +++ b/drivers/nvme/target/configfs.c
> @@ -295,6 +295,12 @@ static ssize_t nvmet_ns_device_path_store(struct config_item *item,
>  	ns->device_path = kstrndup(page, strcspn(page, "\n"), GFP_KERNEL);
>  	if (!ns->device_path)
>  		goto out_unlock;
> +	if (!strlen(ns->device_path)) {
> +		kfree(ns->device_path);
> +		ns->device_path = NULL;
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}

I think this should rather be:

	ret = -EINVAL;
	len = strcspn(page, "\n");
	if (len == 0)
		goto out_unlock;
	ret = -ENOMEM;
	ns->device_path = kstrndup(page, len, GFP_KERNEL);
	if (!ns->device_path)
		goto out_unlock;

> index 0aeccbddd716..956d293d293d 100644
> --- a/drivers/nvme/target/core.c
> +++ b/drivers/nvme/target/core.c
> @@ -373,8 +373,12 @@ int nvmet_ns_enable(struct nvmet_ns *ns)
>  	if (ns->enabled)
>  		goto out_unlock;
>  
> +	if (!ns->device_path) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}

I don't think we should need this fix.

>  	ret = nvmet_bdev_ns_enable(ns);
> -	if (ret)
> +	if (ret == -ENOTBLK)
>  		ret = nvmet_file_ns_enable(ns);

Looks sensible, but please split this into a separate patch.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/5] nvmet: fixup crash on NULL device path
  2018-07-17 13:48   ` Christoph Hellwig
@ 2018-07-17 13:48     ` Hannes Reinecke
  2018-07-17 13:55       ` Christoph Hellwig
  2018-07-24 14:02       ` Christoph Hellwig
  0 siblings, 2 replies; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-17 13:48 UTC (permalink / raw)


On 07/17/2018 03:48 PM, Christoph Hellwig wrote:
> On Mon, Jul 16, 2018@12:58:34PM +0200, Hannes Reinecke wrote:
>> index 094d216ae62a..76efc8120c00 100644
>> --- a/drivers/nvme/target/configfs.c
>> +++ b/drivers/nvme/target/configfs.c
>> @@ -295,6 +295,12 @@ static ssize_t nvmet_ns_device_path_store(struct config_item *item,
>>  	ns->device_path = kstrndup(page, strcspn(page, "\n"), GFP_KERNEL);
>>  	if (!ns->device_path)
>>  		goto out_unlock;
>> +	if (!strlen(ns->device_path)) {
>> +		kfree(ns->device_path);
>> +		ns->device_path = NULL;
>> +		ret = -EINVAL;
>> +		goto out_unlock;
>> +	}
> 
> I think this should rather be:
> 
> 	ret = -EINVAL;
> 	len = strcspn(page, "\n");
> 	if (len == 0)
> 		goto out_unlock;
> 	ret = -ENOMEM;
> 	ns->device_path = kstrndup(page, len, GFP_KERNEL);
> 	if (!ns->device_path)
> 		goto out_unlock;
> 
Okay.

>> index 0aeccbddd716..956d293d293d 100644
>> --- a/drivers/nvme/target/core.c
>> +++ b/drivers/nvme/target/core.c
>> @@ -373,8 +373,12 @@ int nvmet_ns_enable(struct nvmet_ns *ns)
>>  	if (ns->enabled)
>>  		goto out_unlock;
>>  
>> +	if (!ns->device_path) {
>> +		ret = -EINVAL;
>> +		goto out_unlock;
>> +	}
> 
> I don't think we should need this fix.
> 
>>  	ret = nvmet_bdev_ns_enable(ns);
>> -	if (ret)
>> +	if (ret == -ENOTBLK)
>>  		ret = nvmet_file_ns_enable(ns);
> 
> Looks sensible, but please split this into a separate patch.
> 
Right. And should be sent independently of the ANA stuff, too.

Will be updating both.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare at suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/5] nvmet: fixup crash on NULL device path
  2018-07-17 13:48     ` Hannes Reinecke
@ 2018-07-17 13:55       ` Christoph Hellwig
  2018-07-24 14:02       ` Christoph Hellwig
  1 sibling, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-17 13:55 UTC (permalink / raw)


On Tue, Jul 17, 2018@03:48:41PM +0200, Hannes Reinecke wrote:
> On 07/17/2018 03:48 PM, Christoph Hellwig wrote:
> > On Mon, Jul 16, 2018@12:58:34PM +0200, Hannes Reinecke wrote:
> >> index 094d216ae62a..76efc8120c00 100644
> >> --- a/drivers/nvme/target/configfs.c
> >> +++ b/drivers/nvme/target/configfs.c
> >> @@ -295,6 +295,12 @@ static ssize_t nvmet_ns_device_path_store(struct config_item *item,
> >>  	ns->device_path = kstrndup(page, strcspn(page, "\n"), GFP_KERNEL);
> >>  	if (!ns->device_path)
> >>  		goto out_unlock;
> >> +	if (!strlen(ns->device_path)) {
> >> +		kfree(ns->device_path);
> >> +		ns->device_path = NULL;
> >> +		ret = -EINVAL;
> >> +		goto out_unlock;
> >> +	}
> > 
> > I think this should rather be:
> > 
> > 	ret = -EINVAL;
> > 	len = strcspn(page, "\n");
> > 	if (len == 0)
> > 		goto out_unlock;
> > 	ret = -ENOMEM;
> > 	ns->device_path = kstrndup(page, len, GFP_KERNEL);
> > 	if (!ns->device_path)
> > 		goto out_unlock;
> > 
> Okay.

Actually, based on a comment in mm/util.c we could even use kmemdup_nul
instead of kstrndup, might be worth to give it a spin.

> > Looks sensible, but please split this into a separate patch.
> > 
> Right. And should be sent independently of the ANA stuff, too.

Yes, this looks like 4.18 material.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/5] nvme: count all ANA groups for ANA Log page
  2018-07-16 10:58 ` [PATCH 1/5] nvme: count all ANA groups for ANA Log page Hannes Reinecke
@ 2018-07-19 14:31   ` Christoph Hellwig
  2018-09-17 13:49   ` Christoph Hellwig
  1 sibling, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-19 14:31 UTC (permalink / raw)


On Mon, Jul 16, 2018@12:58:33PM +0200, Hannes Reinecke wrote:
> When issuing a short read on the ANA log page the number of groups
> should not change, even though the final returned data might contain
> less groups than that number.

This is based on a "clarification" to the spec that Fred posted a few
days ago, which honestly looks entirely bogus.  I'll wait until that
is sorted out in the WG.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 3/5] nvme-multipath: fixup crash during disconnect
  2018-07-16 10:58 ` [PATCH 3/5] nvme-multipath: fixup crash during disconnect Hannes Reinecke
@ 2018-07-19 14:31   ` Christoph Hellwig
  2018-07-19 14:35     ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-19 14:31 UTC (permalink / raw)


On Mon, Jul 16, 2018@12:58:35PM +0200, Hannes Reinecke wrote:
> The kernel will crash during disconnect, if the first path is
> 'optimized' and the second 'non-optimized'. Then upon removing the
> first path any I/O coming in _before_ removing the second path will
> find the first path to be unusable, and assign 'fallback' to the
> second. But after exiting the loop it will assign the _first_ path
> to 'current_path', leading to a crash later on.
> 
> Signed-off-by: Hannes Reinecke <hare at suse.com>
> ---
>  drivers/nvme/host/multipath.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 8980b0137fa5..5205f41c0f6c 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -129,7 +129,7 @@ static struct nvme_ns *__nvme_find_path(struct nvme_ns_head *head)
>  	}
>  
>  	if (fallback)
> -		rcu_assign_pointer(head->current_path, ns);
> +		rcu_assign_pointer(head->current_path, fallback);
>  	return fallback;

This doesn't match the current version of __nvme_find_path that was
in the last posted series or the git tree.  What tree do you work against?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 3/5] nvme-multipath: fixup crash during disconnect
  2018-07-19 14:31   ` Christoph Hellwig
@ 2018-07-19 14:35     ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-19 14:35 UTC (permalink / raw)


On Thu, Jul 19, 2018@04:31:50PM +0200, Christoph Hellwig wrote:
> This doesn't match the current version of __nvme_find_path that was
> in the last posted series or the git tree.  What tree do you work against?

Actually, error on my side - the tree on this laptop wasn't uptodate,
sorry.  Applied for the next version.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails
  2018-07-16 10:58 ` [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails Hannes Reinecke
@ 2018-07-19 14:43   ` Christoph Hellwig
  2018-07-19 16:00     ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-19 14:43 UTC (permalink / raw)


On Mon, Jul 16, 2018@12:58:36PM +0200, Hannes Reinecke wrote:
> If we cannot parse the ANA log page we should just disable ANA
> support altogether, otherwise we fail to connect to the controller
> and we have no way of figuring out what went wrong.
> And we should kill the WARN_ON() statements as we now have a valid
> recovery strategy.

I don't like this.  There is a lot magic now where we need to
check if we have a buffer independent of wheter the controller
supports ANA to start with.

If we really have a that buggy controller we'll need to quirk it.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 5/5] nvme-multipath: parse entire ANA log page in nvme_set_ns_ana_state()
  2018-07-16 10:58 ` [PATCH 5/5] nvme-multipath: parse entire ANA log page in nvme_set_ns_ana_state() Hannes Reinecke
@ 2018-07-19 14:46   ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-19 14:46 UTC (permalink / raw)


On Mon, Jul 16, 2018@12:58:37PM +0200, Hannes Reinecke wrote:
> We need to parse the entire ANA log page, otherwise later namespaces
> will not be updates with the new ANA state.

nvme_set_ns_ana_state is only used if we looks for exactly one
namespace, there are no later namespaces to be updated.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails
  2018-07-19 14:43   ` Christoph Hellwig
@ 2018-07-19 16:00     ` Hannes Reinecke
  2018-07-20 15:00       ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-19 16:00 UTC (permalink / raw)


On 07/19/2018 04:43 PM, Christoph Hellwig wrote:
> On Mon, Jul 16, 2018@12:58:36PM +0200, Hannes Reinecke wrote:
>> If we cannot parse the ANA log page we should just disable ANA
>> support altogether, otherwise we fail to connect to the controller
>> and we have no way of figuring out what went wrong.
>> And we should kill the WARN_ON() statements as we now have a valid
>> recovery strategy.
> 
> I don't like this.  There is a lot magic now where we need to
> check if we have a buffer independent of wheter the controller
> supports ANA to start with.
> 
> If we really have a that buggy controller we'll need to quirk it.
> 
Well, I've came across this will testing ANA support on the target.

There are two reasons why I did this:

1. There's this statement in nvme_failover_req():

	if (!WARN_ON_ONCE(!ns->ctrl->ana_log_buf))
		queue_work(nvme_wq, &ns->ctrl->ana_work);

(or a similar statement in nvme_handle_aen_notice())
which really doesn't make sense if we require 'ana_log_buf' to always be 
filled out.
If we require ana_log_buf to always be set we would crash immediately 
afterwards in nvme_read_ana_log() anyway, rendering the WARN_ON quite 
pointless.
So, from my POV, either remove the WARN_ON() (or at least replace it 
with a BUG_ON() to clarify we've messed up) or handle the case where 
ana_log_buf is _NOT_ set. But that involves graceful failure handling 
while parsing ANA log pages.

2. If we come across a target with invalid or incorrect ANA 
implementations we _cannot_ connect at all (as nvme_mpath_init() will 
return an error, causing mpath_init_identify() to fail, too).
Which means we don't have _any_ way of figuring out what went wrong.
So I've opted to handle error gracefully and disabling ANA support for 
those targets (which is perfectly legit methinks), allowing us to 
inspect the log page via nvme-cli and figure out what went wrong.

And in general I thought it was good practice to handle errors gracefully...

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails
  2018-07-19 16:00     ` Hannes Reinecke
@ 2018-07-20 15:00       ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-20 15:00 UTC (permalink / raw)


On Thu, Jul 19, 2018@06:00:43PM +0200, Hannes Reinecke wrote:
> There are two reasons why I did this:
>
> 1. There's this statement in nvme_failover_req():
>
> 	if (!WARN_ON_ONCE(!ns->ctrl->ana_log_buf))
> 		queue_work(nvme_wq, &ns->ctrl->ana_work);
>
> (or a similar statement in nvme_handle_aen_notice())
> which really doesn't make sense if we require 'ana_log_buf' to always be 
> filled out.

We could still have a controller return the ANA status code or
send the AEN if we don't use ANA.  So I guess we do need to remove
the warnings at least.

> If we require ana_log_buf to always be set we would crash immediately 
> afterwards in nvme_read_ana_log() anyway, rendering the WARN_ON quite 
> pointless.

Well, a conditional to skip code and warn when something is not supported
is a lot better than crashing, don't you agree?

> So, from my POV, either remove the WARN_ON() (or at least replace it with a 
> BUG_ON() to clarify we've messed up) or handle the case where ana_log_buf 
> is _NOT_ set. But that involves graceful failure handling while parsing ANA 
> log pages.

As said above we probably should remove the WARN_ON, although these cases
mean we had an issue either in the driver initialization or the data
returned by the controller.

> 2. If we come across a target with invalid or incorrect ANA implementations 
> we _cannot_ connect at all (as nvme_mpath_init() will return an error, 
> causing mpath_init_identify() to fail, too).

Which seems pretty sensible.  Without that people might just keep using
the setup in this form (note that a driver bug is another possibility,
it's not always the controllers that are broken :))

> And in general I thought it was good practice to handle errors gracefully...

And I fail to see where we handle errors much more gracefully here,
the only difference seems to be that we entirely ignore ANA errors.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/5] nvmet: fixup crash on NULL device path
  2018-07-17 13:48     ` Hannes Reinecke
  2018-07-17 13:55       ` Christoph Hellwig
@ 2018-07-24 14:02       ` Christoph Hellwig
  2018-07-25  7:25         ` Hannes Reinecke
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-24 14:02 UTC (permalink / raw)


On Tue, Jul 17, 2018@03:48:41PM +0200, Hannes Reinecke wrote:
> Right. And should be sent independently of the ANA stuff, too.
> 
> Will be updating both.

Can you resend them?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/5] nvmet: fixup crash on NULL device path
  2018-07-24 14:02       ` Christoph Hellwig
@ 2018-07-25  7:25         ` Hannes Reinecke
  0 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2018-07-25  7:25 UTC (permalink / raw)


On 07/24/2018 04:02 PM, Christoph Hellwig wrote:
> On Tue, Jul 17, 2018@03:48:41PM +0200, Hannes Reinecke wrote:
>> Right. And should be sent independently of the ANA stuff, too.
>>
>> Will be updating both.
> 
> Can you resend them?
> 
Done.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare at suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/5] nvme: count all ANA groups for ANA Log page
  2018-07-16 10:58 ` [PATCH 1/5] nvme: count all ANA groups for ANA Log page Hannes Reinecke
  2018-07-19 14:31   ` Christoph Hellwig
@ 2018-09-17 13:49   ` Christoph Hellwig
  1 sibling, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-09-17 13:49 UTC (permalink / raw)


Applied to nvme-4.19 now that 4004a is making progress..

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-09-17 13:49 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-16 10:58 [PATCH 0/5] nvme: ANA implementation fixes Hannes Reinecke
2018-07-16 10:58 ` [PATCH 1/5] nvme: count all ANA groups for ANA Log page Hannes Reinecke
2018-07-19 14:31   ` Christoph Hellwig
2018-09-17 13:49   ` Christoph Hellwig
2018-07-16 10:58 ` [PATCH 2/5] nvmet: fixup crash on NULL device path Hannes Reinecke
2018-07-17 13:48   ` Christoph Hellwig
2018-07-17 13:48     ` Hannes Reinecke
2018-07-17 13:55       ` Christoph Hellwig
2018-07-24 14:02       ` Christoph Hellwig
2018-07-25  7:25         ` Hannes Reinecke
2018-07-16 10:58 ` [PATCH 3/5] nvme-multipath: fixup crash during disconnect Hannes Reinecke
2018-07-19 14:31   ` Christoph Hellwig
2018-07-19 14:35     ` Christoph Hellwig
2018-07-16 10:58 ` [PATCH 4/5] nvme-multipath: disable ANA support if parsing fails Hannes Reinecke
2018-07-19 14:43   ` Christoph Hellwig
2018-07-19 16:00     ` Hannes Reinecke
2018-07-20 15:00       ` Christoph Hellwig
2018-07-16 10:58 ` [PATCH 5/5] nvme-multipath: parse entire ANA log page in nvme_set_ns_ana_state() Hannes Reinecke
2018-07-19 14:46   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.