* [PATCH] nvme: configure discard at init time
@ 2018-05-02 15:52 Jens Axboe
2018-05-02 16:32 ` Keith Busch
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2018-05-02 15:52 UTC (permalink / raw)
Currently nvme reconfigures discard for every disk revalidation. This
is problematic because any O_WRONLY or O_RDWR open will trigger a
partition scan through udev/systemd, and we will reconfigure discard.
This blows away any user settings, like discard_max_bytes.
Configure discard at init time instead.
Signed-off-by: Jens Axboe <axboe at kernel.dk>
---
I'm open to other suggestions as well, currently it sucks that you'd
have to continually re-configure the discard settings when someone opens
the device for writing.
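
To illustrate the problem described above, here is a minimal userspace
sketch, not kernel code. The struct and function names (model_queue,
model_config_discard, model_revalidate) are hypothetical stand-ins, and
max_discard_sectors is assumed to be the value a user adjusts through
discard_max_bytes. It only shows why reconfiguring on every revalidation
discards the user's setting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the discard-related request queue state. */
struct model_queue {
	uint32_t max_discard_sectors;	/* assumed to back discard_max_bytes */
	int discard_enabled;
};

/* Models the pre-patch behaviour: the limits are rewritten on every call. */
static void model_config_discard(struct model_queue *q)
{
	assert(q != NULL);
	q->max_discard_sectors = UINT32_MAX;
	q->discard_enabled = 1;
}

/* Models a revalidation, e.g. one triggered by an open for writing. */
static void model_revalidate(struct model_queue *q)
{
	model_config_discard(q);
}
```

In this model, a value written to max_discard_sectors after the first
configuration is replaced by UINT32_MAX on the next model_revalidate()
call.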
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9df4f71e58ca..a35aa8050749 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1347,13 +1347,13 @@ static void nvme_set_chunk_size(struct nvme_ns *ns)
blk_queue_chunk_sectors(ns->queue, rounddown_pow_of_two(chunk_size));
}
-static void nvme_config_discard(struct nvme_ctrl *ctrl,
- unsigned stream_alignment, struct request_queue *queue)
+static void nvme_config_discard(struct nvme_ctrl *ctrl, struct nvme_ns *ns)
{
+ struct request_queue *queue = ns->queue;
u32 size = queue_logical_block_size(queue);
- if (stream_alignment)
- size *= stream_alignment;
+ if (ctrl->nr_streams && ns->sws && ns->sgs)
+ size *= ns->sws * ns->sgs;
BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
NVME_DSM_MAX_RANGES);
@@ -1426,8 +1426,6 @@ static void nvme_update_disk_info(struct gendisk *disk,
capacity = 0;
set_capacity(disk, capacity);
- if (ns->ctrl->oncs & NVME_CTRL_ONCS_DSM)
- nvme_config_discard(ns->ctrl, stream_alignment, disk->queue);
blk_mq_unfreeze_queue(disk->queue);
}
@@ -3043,6 +3041,9 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
__nvme_revalidate_disk(disk, id);
+ if (ns->ctrl->oncs & NVME_CTRL_ONCS_DSM)
+ nvme_config_discard(ctrl, ns);
+
down_write(&ctrl->namespaces_rwsem);
list_add_tail(&ns->list, &ctrl->namespaces);
up_write(&ctrl->namespaces_rwsem);
--
Jens Axboe
* [PATCH] nvme: configure discard at init time
2018-05-02 15:52 [PATCH] nvme: configure discard at init time Jens Axboe
@ 2018-05-02 16:32 ` Keith Busch
2018-05-02 16:41 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2018-05-02 16:32 UTC (permalink / raw)
On Wed, May 02, 2018 at 09:52:07AM -0600, Jens Axboe wrote:
> Currently nvme reconfigures discard for every disk revalidation. This
> is problematic because any O_WRONLY or O_RDWR open will trigger a
> partition scan through udev/systemd, and we will reconfigure discard.
> This blows away any user settings, like discard_max_bytes.
>
> Configure discard at init time instead.
>
> Signed-off-by: Jens Axboe <axboe at kernel.dk>
>
> ---
>
> I'm open to other suggestions as well, currently it sucks that you'd
> have to continually re-configure the discard settings when someone opens
> the device for writing.
Your suggestion is probably fine. The only problem I can think of is a
_very_ unlikely scenario where a firmware update adds discard support,
then a user would have to reload the module in order to expose the
capability.
How about this?
---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a3771c5729f5..18191547e4bd 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1353,6 +1353,10 @@ static void nvme_config_discard(struct nvme_ctrl *ctrl,
{
u32 size = queue_logical_block_size(queue);
+ /* don't reset discard queue limits */
+ if (blk_queue_flag_test_and_set(QUEUE_FLAG_DISCARD, queue))
+ return;
+
if (stream_alignment)
size *= stream_alignment;
@@ -1364,7 +1368,6 @@ static void nvme_config_discard(struct nvme_ctrl *ctrl,
blk_queue_max_discard_sectors(queue, UINT_MAX);
blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
- blk_queue_flag_set(QUEUE_FLAG_DISCARD, queue);
if (ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
blk_queue_max_write_zeroes_sectors(queue, UINT_MAX);
--
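
The guard in the patch above can be sketched in userspace as follows.
This is a simplified model under stated assumptions: model_queue and
model_flag_test_and_set are hypothetical names, and the helper is a
plain, non-atomic imitation of a test-and-set, not the block layer
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the discard-related request queue state. */
struct model_queue {
	uint32_t max_discard_sectors;
	int discard_enabled;
};

/* Sets the flag and returns its previous value (non-atomic model). */
static int model_flag_test_and_set(struct model_queue *q)
{
	int old = q->discard_enabled;

	q->discard_enabled = 1;
	return old;
}

/* Applies the default limits only the first time discard is enabled. */
static void model_config_discard_once(struct model_queue *q)
{
	assert(q != NULL);

	/* Flag already set: keep whatever limits are in place. */
	if (model_flag_test_and_set(q))
		return;

	q->max_discard_sectors = UINT32_MAX;
}
```

With this guard, a second call leaves a user-adjusted
max_discard_sectors untouched, while a queue that gains discard support
later still gets configured on the first call that sees it.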
* [PATCH] nvme: configure discard at init time
2018-05-02 16:32 ` Keith Busch
@ 2018-05-02 16:41 ` Jens Axboe
2018-05-02 16:54 ` Keith Busch
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2018-05-02 16:41 UTC (permalink / raw)
On 5/2/18 10:32 AM, Keith Busch wrote:
> On Wed, May 02, 2018 at 09:52:07AM -0600, Jens Axboe wrote:
>> Currently nvme reconfigures discard for every disk revalidation. This
>> is problematic because any O_WRONLY or O_RDWR open will trigger a
>> partition scan through udev/systemd, and we will reconfigure discard.
>> This blows away any user settings, like discard_max_bytes.
>>
>> Configure discard at init time instead.
>>
>> Signed-off-by: Jens Axboe <axboe at kernel.dk>
>>
>> ---
>>
>> I'm open to other suggestions as well, currently it sucks that you'd
>> have to continually re-configure the discard settings when someone opens
>> the device for writing.
>
> Your suggestion is probably fine. The only problem I can think of is a
> _very_ unlikely scenario where a firmware update adds discard support,
> then a user would have to reload the module in order to expose the
> capability.
>
> How about this?
If we add discard through a firmware upgrade, we should also handle
the case where we lose discard through a firmware upgrade/downgrade.
Not that any of them are likely to ever happen, but...
How about this? Also handles the case where streams values are updated.
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9df4f71e58ca..b0c1f1ce8226 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1347,13 +1347,19 @@ static void nvme_set_chunk_size(struct nvme_ns *ns)
blk_queue_chunk_sectors(ns->queue, rounddown_pow_of_two(chunk_size));
}
-static void nvme_config_discard(struct nvme_ctrl *ctrl,
- unsigned stream_alignment, struct request_queue *queue)
+static void nvme_config_discard(struct nvme_ns *ns)
{
+ struct nvme_ctrl *ctrl = ns->ctrl;
+ struct request_queue *queue = ns->queue;
u32 size = queue_logical_block_size(queue);
- if (stream_alignment)
- size *= stream_alignment;
+ if (!(ctrl->oncs & NVME_CTRL_ONCS_DSM)) {
+ blk_queue_flag_clear(QUEUE_FLAG_DISCARD, queue);
+ return;
+ }
+
+ if (ctrl->nr_streams && ns->sws && ns->sgs)
+ size *= ns->sws * ns->sgs;
BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
NVME_DSM_MAX_RANGES);
@@ -1361,6 +1367,10 @@ static void nvme_config_discard(struct nvme_ctrl *ctrl,
queue->limits.discard_alignment = 0;
queue->limits.discard_granularity = size;
+ /* If discard is already enabled, don't reset queue limits */
+ if (blk_queue_flag_test_and_set(QUEUE_FLAG_DISCARD, queue))
+ return;
+
blk_queue_max_discard_sectors(queue, UINT_MAX);
blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
blk_queue_flag_set(QUEUE_FLAG_DISCARD, queue);
@@ -1407,10 +1417,6 @@ static void nvme_update_disk_info(struct gendisk *disk,
{
sector_t capacity = le64_to_cpup(&id->nsze) << (ns->lba_shift - 9);
unsigned short bs = 1 << ns->lba_shift;
- unsigned stream_alignment = 0;
-
- if (ns->ctrl->nr_streams && ns->sws && ns->sgs)
- stream_alignment = ns->sws * ns->sgs;
blk_mq_freeze_queue(disk->queue);
blk_integrity_unregister(disk);
@@ -1427,7 +1433,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
set_capacity(disk, capacity);
if (ns->ctrl->oncs & NVME_CTRL_ONCS_DSM)
- nvme_config_discard(ns->ctrl, stream_alignment, disk->queue);
+ nvme_config_discard(ns);
blk_mq_unfreeze_queue(disk->queue);
}
--
Jens Axboe
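
As a rough userspace model of the flow in the patch above (not the
kernel implementation): model_ns and its fields are hypothetical
stand-ins, with dsm_supported representing the ONCS DSM bit and lbs the
logical block size. The sketch shows three things: discard is turned off
when support disappears, granularity is always recomputed, and the
maximum is only set to its default the first time discard is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in combining namespace and queue state. */
struct model_ns {
	int dsm_supported;		/* models ctrl->oncs & NVME_CTRL_ONCS_DSM */
	int nr_streams;
	uint32_t lbs;			/* logical block size in bytes */
	uint32_t sws, sgs;		/* stream write size and granularity size */
	uint32_t discard_granularity;
	uint32_t max_discard_sectors;
	int discard_enabled;
};

static void model_config_discard_v2(struct model_ns *ns)
{
	uint32_t size;

	assert(ns != NULL);
	size = ns->lbs;

	/* Support went away: disable discard and stop. */
	if (!ns->dsm_supported) {
		ns->discard_enabled = 0;
		return;
	}

	/* Scale granularity by the stream geometry when streams are in use. */
	if (ns->nr_streams && ns->sws && ns->sgs)
		size *= ns->sws * ns->sgs;

	/* Granularity is refreshed on every call. */
	ns->discard_granularity = size;

	/* Already enabled: leave the maximum as it is. */
	if (ns->discard_enabled)
		return;

	ns->discard_enabled = 1;
	ns->max_discard_sectors = UINT32_MAX;
}
```

For example, with lbs = 4096, sws = 2 and sgs = 4 and streams enabled,
the model yields a granularity of 4096 * 2 * 4 = 32768 bytes.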
* [PATCH] nvme: configure discard at init time
2018-05-02 16:41 ` Jens Axboe
@ 2018-05-02 16:54 ` Keith Busch
2018-05-02 17:05 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2018-05-02 16:54 UTC (permalink / raw)
On Wed, May 02, 2018 at 10:41:03AM -0600, Jens Axboe wrote:
> If we add discard through a firmware upgrade, we should also handle
> the case where we lose discard through a firmware upgrade/downgrade.
> Not that any of them are likely to ever happen, but...
>
> How about this? Also handles the case where streams values are updated.
Good call. You also have it updating granularity when a format changes
the logical block size, so that's also a good thing.
One minor issue below:
> -static void nvme_config_discard(struct nvme_ctrl *ctrl,
> - unsigned stream_alignment, struct request_queue *queue)
> +static void nvme_config_discard(struct nvme_ns *ns)
> {
> + struct nvme_ctrl *ctrl = ns->ctrl;
> + struct request_queue *queue = ns->queue;
> u32 size = queue_logical_block_size(queue);
>
> - if (stream_alignment)
> - size *= stream_alignment;
> + if (!(ctrl->oncs & NVME_CTRL_ONCS_DSM)) {
> + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, queue);
> + return;
> + }
> +
> + if (ctrl->nr_streams && ns->sws && ns->sgs)
> + size *= ns->sws * ns->sgs;
>
> BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
> NVME_DSM_MAX_RANGES);
> @@ -1361,6 +1367,10 @@ static void nvme_config_discard(struct nvme_ctrl *ctrl,
> queue->limits.discard_alignment = 0;
> queue->limits.discard_granularity = size;
>
> + /* If discard is already enabled, don't reset queue limits */
> + if (blk_queue_flag_test_and_set(QUEUE_FLAG_DISCARD, queue))
> + return;
> +
> blk_queue_max_discard_sectors(queue, UINT_MAX);
> blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
> blk_queue_flag_set(QUEUE_FLAG_DISCARD, queue);
> @@ -1407,10 +1417,6 @@ static void nvme_update_disk_info(struct gendisk *disk,
> {
> sector_t capacity = le64_to_cpup(&id->nsze) << (ns->lba_shift - 9);
> unsigned short bs = 1 << ns->lba_shift;
> - unsigned stream_alignment = 0;
> -
> - if (ns->ctrl->nr_streams && ns->sws && ns->sgs)
> - stream_alignment = ns->sws * ns->sgs;
>
> blk_mq_freeze_queue(disk->queue);
> blk_integrity_unregister(disk);
> @@ -1427,7 +1433,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
> set_capacity(disk, capacity);
>
> if (ns->ctrl->oncs & NVME_CTRL_ONCS_DSM)
> - nvme_config_discard(ns->ctrl, stream_alignment, disk->queue);
> + nvme_config_discard(ns);
Since nvme_config_discard now handles disabling the queue limit, we need
to call this unconditionally, regardless of the ONCS_DSM bit.
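
The point about the caller can be sketched in userspace like this. It is
a simplified model with hypothetical names (model_ns,
model_update_gated, model_update_unconditional), where dsm_supported
stands in for the ONCS DSM bit. It shows that a caller which still
checks the bit never reaches the code that clears the flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for namespace and queue state. */
struct model_ns {
	int dsm_supported;	/* models ctrl->oncs & NVME_CTRL_ONCS_DSM */
	int discard_enabled;
};

/* Models a config function that both enables and disables discard. */
static void model_config_discard(struct model_ns *ns)
{
	assert(ns != NULL);

	if (!ns->dsm_supported) {
		ns->discard_enabled = 0;
		return;
	}
	ns->discard_enabled = 1;
}

/* Caller that keeps the old check: the disable path is unreachable. */
static void model_update_gated(struct model_ns *ns)
{
	if (ns->dsm_supported)
		model_config_discard(ns);
}

/* Caller without the check: losing support clears the flag. */
static void model_update_unconditional(struct model_ns *ns)
{
	model_config_discard(ns);
}
```

In the gated variant, discard_enabled stays set after dsm_supported
drops to zero; the unconditional variant clears it.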
* [PATCH] nvme: configure discard at init time
2018-05-02 16:54 ` Keith Busch
@ 2018-05-02 17:05 ` Jens Axboe
0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2018-05-02 17:05 UTC (permalink / raw)
On 5/2/18 10:54 AM, Keith Busch wrote:
> On Wed, May 02, 2018 at 10:41:03AM -0600, Jens Axboe wrote:
>> If we add discard through a firmware upgrade, we should also handle
>> the case where we lose discard through a firmware upgrade/downgrade.
>> Not that any of them are likely to ever happen, but...
>>
>> How about this? Also handles the case where streams values are updated.
>
> Good call. You also have it updating granularity when a format changes
> the logical block size, so that's also a good thing.
>
> One minor issue below:
>
>> -static void nvme_config_discard(struct nvme_ctrl *ctrl,
>> - unsigned stream_alignment, struct request_queue *queue)
>> +static void nvme_config_discard(struct nvme_ns *ns)
>> {
>> + struct nvme_ctrl *ctrl = ns->ctrl;
>> + struct request_queue *queue = ns->queue;
>> u32 size = queue_logical_block_size(queue);
>>
>> - if (stream_alignment)
>> - size *= stream_alignment;
>> + if (!(ctrl->oncs & NVME_CTRL_ONCS_DSM)) {
>> + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, queue);
>> + return;
>> + }
>> +
>> + if (ctrl->nr_streams && ns->sws && ns->sgs)
>> + size *= ns->sws * ns->sgs;
>>
>> BUILD_BUG_ON(PAGE_SIZE / sizeof(struct nvme_dsm_range) <
>> NVME_DSM_MAX_RANGES);
>> @@ -1361,6 +1367,10 @@ static void nvme_config_discard(struct nvme_ctrl *ctrl,
>> queue->limits.discard_alignment = 0;
>> queue->limits.discard_granularity = size;
>>
>> + /* If discard is already enabled, don't reset queue limits */
>> + if (blk_queue_flag_test_and_set(QUEUE_FLAG_DISCARD, queue))
>> + return;
>> +
>> blk_queue_max_discard_sectors(queue, UINT_MAX);
>> blk_queue_max_discard_segments(queue, NVME_DSM_MAX_RANGES);
>> blk_queue_flag_set(QUEUE_FLAG_DISCARD, queue);
>> @@ -1407,10 +1417,6 @@ static void nvme_update_disk_info(struct gendisk *disk,
>> {
>> sector_t capacity = le64_to_cpup(&id->nsze) << (ns->lba_shift - 9);
>> unsigned short bs = 1 << ns->lba_shift;
>> - unsigned stream_alignment = 0;
>> -
>> - if (ns->ctrl->nr_streams && ns->sws && ns->sgs)
>> - stream_alignment = ns->sws * ns->sgs;
>>
>> blk_mq_freeze_queue(disk->queue);
>> blk_integrity_unregister(disk);
>> @@ -1427,7 +1433,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
>> set_capacity(disk, capacity);
>>
>> if (ns->ctrl->oncs & NVME_CTRL_ONCS_DSM)
>> - nvme_config_discard(ns->ctrl, stream_alignment, disk->queue);
>> + nvme_config_discard(ns);
>
> Since nvme_config_discard now handles disabling the queue limit, we need
> to call this unconditionally, regardless of the ONCS_DSM bit.
Oh yeah, I forgot to make that change. I'll send out a proper version.
--
Jens Axboe