* [PATCH v3 0/3] Couple of fixes detected with blktests
@ 2019-03-11 21:16 Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE Sagi Grimberg
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Sagi Grimberg @ 2019-03-11 21:16 UTC (permalink / raw)
blktests nvme test 009 uncovered a few issues, both with and without
multipathing:
1. If a namespace has a block size larger than PAGE_SIZE, we either panic
on a divide by zero with multipathing, or see I/O errors without it.
- fix nvmet to always export a block size that is <= PAGE_SIZE
- fix the core to fail namespace allocation if it sees one
2. If namespace allocation fails after nvme_init_ns_head, we leak
subsystems due to a missing ns_head reference put.
- fix it by putting the ns_head reference in the nvme_alloc_ns error flow
Sagi Grimberg (3):
nvme: fail namespace revalidate if block size exceeds PAGE_SIZE
nvme: put ns_head ref if namespace fails allocation
nvmet-file: clamp-down file namespace lba_shift
drivers/nvme/host/core.c | 19 +++++++++++++++----
drivers/nvme/target/io-cmd-file.c | 7 ++++++-
2 files changed, 21 insertions(+), 5 deletions(-)
--
2.17.1
* [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE
2019-03-11 21:16 [PATCH v3 0/3] Couple of fixes detected with blktests Sagi Grimberg
@ 2019-03-11 21:16 ` Sagi Grimberg
2019-03-11 21:21 ` Keith Busch
2019-03-11 22:05 ` Keith Busch
2019-03-11 21:16 ` [PATCH v3 2/3] nvme: put ns_head ref if namespace fails allocation Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 3/3] nvmet-file: clamp-down file namespace lba_shift Sagi Grimberg
2 siblings, 2 replies; 7+ messages in thread
From: Sagi Grimberg @ 2019-03-11 21:16 UTC (permalink / raw)
If our target exposes a namespace with a block size that is greater
than PAGE_SIZE, fail it, as we do not support it.
This issue was encountered when the nvmet namespace was backed by a
tempfile.
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
drivers/nvme/host/core.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7321da21f8c9..751e14468c0e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1635,7 +1635,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
blk_mq_unfreeze_queue(disk->queue);
}
-static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
+static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
{
struct nvme_ns *ns = disk->private_data;
@@ -1644,8 +1644,13 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
* block layer can use before failing read/write for 0 capacity.
*/
ns->lba_shift = id->lbaf[id->flbas & NVME_NS_FLBAS_LBA_MASK].ds;
- if (ns->lba_shift == 0)
+ if (ns->lba_shift > 12) {
+ dev_err(ns->ctrl->device, "ns lba_shift %d not supported\n",
+ ns->lba_shift);
+ return -ENOTSUPP;
+ } else if (ns->lba_shift == 0) {
ns->lba_shift = 9;
+ }
ns->noiob = le16_to_cpu(id->noiob);
ns->ms = le16_to_cpu(id->lbaf[id->flbas & NVME_NS_FLBAS_LBA_MASK].ms);
ns->ext = ns->ms && (id->flbas & NVME_NS_FLBAS_META_EXT);
@@ -1664,6 +1669,7 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
blk_queue_stack_limits(ns->head->disk->queue, ns->queue);
}
#endif
+ return 0;
}
static int nvme_revalidate_disk(struct gendisk *disk)
@@ -1688,7 +1694,10 @@ static int nvme_revalidate_disk(struct gendisk *disk)
goto out;
}
- __nvme_revalidate_disk(disk, id);
+ ret = __nvme_revalidate_disk(disk, id);
+ if (ret)
+ goto out;
+
nvme_report_ns_ids(ctrl, ns->head->ns_id, id, &ids);
if (!nvme_ns_ids_equal(&ns->head->ids, &ids)) {
dev_err(ctrl->device,
@@ -3281,7 +3290,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
memcpy(disk->disk_name, disk_name, DISK_NAME_LEN);
ns->disk = disk;
- __nvme_revalidate_disk(disk, id);
+ if (__nvme_revalidate_disk(disk, id))
+ goto out_put_disk;
if ((ctrl->quirks & NVME_QUIRK_LIGHTNVM) && id->vs[0] == 0x1) {
if (nvme_nvm_register(ns, disk_name, node)) {
--
2.17.1
* [PATCH v3 2/3] nvme: put ns_head ref if namespace fails allocation
2019-03-11 21:16 [PATCH v3 0/3] Couple of fixes detected with blktests Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE Sagi Grimberg
@ 2019-03-11 21:16 ` Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 3/3] nvmet-file: clamp-down file namespace lba_shift Sagi Grimberg
2 siblings, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2019-03-11 21:16 UTC (permalink / raw)
In case nvme_alloc_ns fails after we initialize ns_head, but before we
add the ns to the controller's namespaces list, we need to explicitly put
the ns_head reference, because controller teardown will not find the
namespace, causing us to eventually leak a dangling subsystem.
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
drivers/nvme/host/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 751e14468c0e..5707ad6d987a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3319,6 +3319,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
mutex_lock(&ctrl->subsys->lock);
list_del_rcu(&ns->siblings);
mutex_unlock(&ctrl->subsys->lock);
+ nvme_put_ns_head(ns->head);
out_free_id:
kfree(id);
out_free_queue:
--
2.17.1
* [PATCH v3 3/3] nvmet-file: clamp-down file namespace lba_shift
2019-03-11 21:16 [PATCH v3 0/3] Couple of fixes detected with blktests Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 2/3] nvme: put ns_head ref if namespace fails allocation Sagi Grimberg
@ 2019-03-11 21:16 ` Sagi Grimberg
2 siblings, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2019-03-11 21:16 UTC (permalink / raw)
When the backing file is a tempfile, for example, the inode i_blkbits
can correspond to a block size of 1M, which hosts cannot support as the
disk block size.
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
drivers/nvme/target/io-cmd-file.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 517522305e5c..d52fe965a5d5 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -49,7 +49,12 @@ int nvmet_file_ns_enable(struct nvmet_ns *ns)
goto err;
ns->size = stat.size;
- ns->blksize_shift = file_inode(ns->file)->i_blkbits;
+ /*
+ * inode i_blkbits can be greater than the universally accepted upper
+ * bound, so make sure we export a sane namespace lba_shift.
+ */
+ ns->blksize_shift = min_t(u8,
+ file_inode(ns->file)->i_blkbits, PAGE_SHIFT);
ns->bvec_cache = kmem_cache_create("nvmet-bvec",
NVMET_MAX_MPOOL_BVEC * sizeof(struct bio_vec),
--
2.17.1
* [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE
2019-03-11 21:16 ` [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE Sagi Grimberg
@ 2019-03-11 21:21 ` Keith Busch
2019-03-11 22:02 ` Sagi Grimberg
2019-03-11 22:05 ` Keith Busch
1 sibling, 1 reply; 7+ messages in thread
From: Keith Busch @ 2019-03-11 21:21 UTC (permalink / raw)
On Mon, Mar 11, 2019@02:16:06PM -0700, Sagi Grimberg wrote:
> -static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
> +static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
> {
> struct nvme_ns *ns = disk->private_data;
>
> @@ -1644,8 +1644,13 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
> * block layer can use before failing read/write for 0 capacity.
> */
> ns->lba_shift = id->lbaf[id->flbas & NVME_NS_FLBAS_LBA_MASK].ds;
> - if (ns->lba_shift == 0)
> + if (ns->lba_shift > 12) {
> + dev_err(ns->ctrl->device, "ns lba_shift %d not supported\n",
> + ns->lba_shift);
> + return -ENOTSUPP;
> + } else if (ns->lba_shift == 0) {
> ns->lba_shift = 9;
> + }
Instead of deleting the namespace, can we make this a 0-capacity block
device like we do with extended metadata formats? That will fence it off
from generic read/write where this would cause a problem, and I'd like
to use these namespaces in passthrough mode for product testing purposes.
* [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE
2019-03-11 21:21 ` Keith Busch
@ 2019-03-11 22:02 ` Sagi Grimberg
0 siblings, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2019-03-11 22:02 UTC (permalink / raw)
>> -static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
>> +static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
>> {
>> struct nvme_ns *ns = disk->private_data;
>>
>> @@ -1644,8 +1644,13 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
>> * block layer can use before failing read/write for 0 capacity.
>> */
>> ns->lba_shift = id->lbaf[id->flbas & NVME_NS_FLBAS_LBA_MASK].ds;
>> - if (ns->lba_shift == 0)
>> + if (ns->lba_shift > 12) {
>> + dev_err(ns->ctrl->device, "ns lba_shift %d not supported\n",
>> + ns->lba_shift);
>> + return -ENOTSUPP;
>> + } else if (ns->lba_shift == 0) {
>> ns->lba_shift = 9;
>> + }
>
> Instead of deleting the namespace, can we make this a 0-capacity block
> device like we do with extended metadata formats? That will fence it off
> from generic read/write where this would cause a problem, and I'd like
> to use these namespaces in passthrough mode for product testing purposes.
Yes we can, sent v4...
* [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE
2019-03-11 21:16 ` [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE Sagi Grimberg
2019-03-11 21:21 ` Keith Busch
@ 2019-03-11 22:05 ` Keith Busch
1 sibling, 0 replies; 7+ messages in thread
From: Keith Busch @ 2019-03-11 22:05 UTC (permalink / raw)
On Mon, Mar 11, 2019@02:16:06PM -0700, Sagi Grimberg wrote:
> If our target exposes a namespace with a block size that is greater
> than PAGE_SIZE, fail it, as we do not support it.
>
> This issue was encountered when the nvmet namespace was backed by a
> tempfile.
>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
Thanks, looks good.
Reviewed-by: Keith Busch <keith.busch at intel.com>
end of thread, other threads:[~2019-03-11 22:05 UTC | newest]
Thread overview: 7+ messages
2019-03-11 21:16 [PATCH v3 0/3] Couple of fixes detected with blktests Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 1/3] nvme: fail namespace revalidate if block size exceeds PAGE_SIZE Sagi Grimberg
2019-03-11 21:21 ` Keith Busch
2019-03-11 22:02 ` Sagi Grimberg
2019-03-11 22:05 ` Keith Busch
2019-03-11 21:16 ` [PATCH v3 2/3] nvme: put ns_head ref if namespace fails allocation Sagi Grimberg
2019-03-11 21:16 ` [PATCH v3 3/3] nvmet-file: clamp-down file namespace lba_shift Sagi Grimberg