* [PATCH 1/4] block: break discard submissions into the user defined size
2018-05-07 16:13 [PATCHSET v2 0/4] Add throttling for discards Jens Axboe
@ 2018-05-07 16:13 ` Jens Axboe
2018-05-07 23:56 ` Darrick J. Wong
2018-05-08 20:43 ` Omar Sandoval
2018-05-07 16:13 ` [PATCH 2/4] blk-wbt: account any writing command as a write Jens Axboe
` (2 subsequent siblings)
3 siblings, 2 replies; 14+ messages in thread
From: Jens Axboe @ 2018-05-07 16:13 UTC (permalink / raw)
To: linux-block, linux-xfs; +Cc: dchinner, hch, Jens Axboe
Don't build discards bigger than what the user asked for, if the
user decided to limit the size by writing to 'discard_max_bytes'.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-lib.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/block/blk-lib.c b/block/blk-lib.c
index a676084d4740..7417d617091b 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -62,10 +62,11 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
unsigned int req_sects;
sector_t end_sect, tmp;
- /* Make sure bi_size doesn't overflow */
- req_sects = min_t(sector_t, nr_sects, UINT_MAX >> 9);
+ /* Issue in chunks of the user defined max discard setting */
+ req_sects = min_t(sector_t, nr_sects,
+ q->limits.max_discard_sectors);
- /**
+ /*
* If splitting a request, and the next starting sector would be
* misaligned, stop the discard at the previous aligned sector.
*/
--
2.7.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] block: break discard submissions into the user defined size
2018-05-07 16:13 ` [PATCH 1/4] block: break discard submissions into the user defined size Jens Axboe
@ 2018-05-07 23:56 ` Darrick J. Wong
2018-05-08 20:43 ` Omar Sandoval
1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-05-07 23:56 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:32AM -0600, Jens Axboe wrote:
> Don't build discards bigger than what the user asked for, if the
> user decided to limit the size by writing to 'discard_max_bytes'.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Seems fine to me,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> ---
> block/blk-lib.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index a676084d4740..7417d617091b 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -62,10 +62,11 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> unsigned int req_sects;
> sector_t end_sect, tmp;
>
> - /* Make sure bi_size doesn't overflow */
> - req_sects = min_t(sector_t, nr_sects, UINT_MAX >> 9);
> + /* Issue in chunks of the user defined max discard setting */
> + req_sects = min_t(sector_t, nr_sects,
> + q->limits.max_discard_sectors);
>
> - /**
> + /*
> * If splitting a request, and the next starting sector would be
> * misaligned, stop the discard at the previous aligned sector.
> */
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] block: break discard submissions into the user defined size
2018-05-07 16:13 ` [PATCH 1/4] block: break discard submissions into the user defined size Jens Axboe
2018-05-07 23:56 ` Darrick J. Wong
@ 2018-05-08 20:43 ` Omar Sandoval
2018-05-08 20:57 ` Jens Axboe
1 sibling, 1 reply; 14+ messages in thread
From: Omar Sandoval @ 2018-05-08 20:43 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:32AM -0600, Jens Axboe wrote:
> Don't build discards bigger than what the user asked for, if the
> user decided to limit the size by writing to 'discard_max_bytes'.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> block/blk-lib.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index a676084d4740..7417d617091b 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -62,10 +62,11 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> unsigned int req_sects;
> sector_t end_sect, tmp;
>
> - /* Make sure bi_size doesn't overflow */
> - req_sects = min_t(sector_t, nr_sects, UINT_MAX >> 9);
> + /* Issue in chunks of the user defined max discard setting */
> + req_sects = min_t(sector_t, nr_sects,
> + q->limits.max_discard_sectors);
>
Some drivers, including nvme, do
blk_queue_max_discard_sectors(q, UINT_MAX)
That means q->limits.max_discard_sectors can be UINT_MAX, and
bio->bi_iter.bi_size = req_sects << 9;
will overflow. We should probably cap max_discard_sectors at
UINT_MAX >> 9 in a prep patch.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] block: break discard submissions into the user defined size
2018-05-08 20:43 ` Omar Sandoval
@ 2018-05-08 20:57 ` Jens Axboe
0 siblings, 0 replies; 14+ messages in thread
From: Jens Axboe @ 2018-05-08 20:57 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-block, linux-xfs, dchinner, hch
On 5/8/18 2:43 PM, Omar Sandoval wrote:
> On Mon, May 07, 2018 at 10:13:32AM -0600, Jens Axboe wrote:
>> Don't build discards bigger than what the user asked for, if the
>> user decided to limit the size by writing to 'discard_max_bytes'.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>> block/blk-lib.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index a676084d4740..7417d617091b 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -62,10 +62,11 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>> unsigned int req_sects;
>> sector_t end_sect, tmp;
>>
>> - /* Make sure bi_size doesn't overflow */
>> - req_sects = min_t(sector_t, nr_sects, UINT_MAX >> 9);
>> + /* Issue in chunks of the user defined max discard setting */
>> + req_sects = min_t(sector_t, nr_sects,
>> + q->limits.max_discard_sectors);
>>
>
> Some drivers, including nvme, do
>
> blk_queue_max_discard_sectors(q, UINT_MAX)
>
> That means q->limits.max_discard_sectors can be UINT_MAX, and
>
> bio->bi_iter.bi_size = req_sects << 9;
>
> will overflow. We should probably cap max_discard_sectors at
> UINT_MAX >> 9 in a prep patch.
Probably better to just keep the local overflow check, I'll do that.
Thanks for spotting it!
--
Jens Axboe
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/4] blk-wbt: account any writing command as a write
2018-05-07 16:13 [PATCHSET v2 0/4] Add throttling for discards Jens Axboe
2018-05-07 16:13 ` [PATCH 1/4] block: break discard submissions into the user defined size Jens Axboe
@ 2018-05-07 16:13 ` Jens Axboe
2018-05-07 23:58 ` Darrick J. Wong
2018-05-08 20:44 ` Omar Sandoval
2018-05-07 16:13 ` [PATCH 3/4] blk-wbt: pass in enum wbt_flags to get_rq_wait() Jens Axboe
2018-05-07 16:13 ` [PATCH 4/4] blk-wbt: throttle discards like background writes Jens Axboe
3 siblings, 2 replies; 14+ messages in thread
From: Jens Axboe @ 2018-05-07 16:13 UTC (permalink / raw)
To: linux-block, linux-xfs; +Cc: dchinner, hch, Jens Axboe
We currently special case WRITE and FLUSH, but we should really
just include any command with the write bit set. This ensures
that we account DISCARD.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-wbt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index f92fc84b5e2c..3e34b41bcefc 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -701,7 +701,7 @@ static int wbt_data_dir(const struct request *rq)
if (op == REQ_OP_READ)
return READ;
- else if (op == REQ_OP_WRITE || op == REQ_OP_FLUSH)
+ else if (op_is_write(op))
return WRITE;
/* don't account */
--
2.7.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] blk-wbt: account any writing command as a write
2018-05-07 16:13 ` [PATCH 2/4] blk-wbt: account any writing command as a write Jens Axboe
@ 2018-05-07 23:58 ` Darrick J. Wong
2018-05-08 20:44 ` Omar Sandoval
1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-05-07 23:58 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:33AM -0600, Jens Axboe wrote:
> We currently special case WRITE and FLUSH, but we should really
> just include any command with the write bit set. This ensures
> that we account DISCARD.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> ---
> block/blk-wbt.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> index f92fc84b5e2c..3e34b41bcefc 100644
> --- a/block/blk-wbt.c
> +++ b/block/blk-wbt.c
> @@ -701,7 +701,7 @@ static int wbt_data_dir(const struct request *rq)
>
> if (op == REQ_OP_READ)
> return READ;
> - else if (op == REQ_OP_WRITE || op == REQ_OP_FLUSH)
> + else if (op_is_write(op))
> return WRITE;
>
> /* don't account */
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] blk-wbt: account any writing command as a write
2018-05-07 16:13 ` [PATCH 2/4] blk-wbt: account any writing command as a write Jens Axboe
2018-05-07 23:58 ` Darrick J. Wong
@ 2018-05-08 20:44 ` Omar Sandoval
1 sibling, 0 replies; 14+ messages in thread
From: Omar Sandoval @ 2018-05-08 20:44 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:33AM -0600, Jens Axboe wrote:
> We currently special case WRITE and FLUSH, but we should really
> just include any command with the write bit set. This ensures
> that we account DISCARD.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> block/blk-wbt.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> index f92fc84b5e2c..3e34b41bcefc 100644
> --- a/block/blk-wbt.c
> +++ b/block/blk-wbt.c
> @@ -701,7 +701,7 @@ static int wbt_data_dir(const struct request *rq)
>
> if (op == REQ_OP_READ)
> return READ;
> - else if (op == REQ_OP_WRITE || op == REQ_OP_FLUSH)
> + else if (op_is_write(op))
> return WRITE;
>
> /* don't account */
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/4] blk-wbt: pass in enum wbt_flags to get_rq_wait()
2018-05-07 16:13 [PATCHSET v2 0/4] Add throttling for discards Jens Axboe
2018-05-07 16:13 ` [PATCH 1/4] block: break discard submissions into the user defined size Jens Axboe
2018-05-07 16:13 ` [PATCH 2/4] blk-wbt: account any writing command as a write Jens Axboe
@ 2018-05-07 16:13 ` Jens Axboe
2018-05-08 0:07 ` Darrick J. Wong
2018-05-08 20:46 ` Omar Sandoval
2018-05-07 16:13 ` [PATCH 4/4] blk-wbt: throttle discards like background writes Jens Axboe
3 siblings, 2 replies; 14+ messages in thread
From: Jens Axboe @ 2018-05-07 16:13 UTC (permalink / raw)
To: linux-block, linux-xfs; +Cc: dchinner, hch, Jens Axboe
This is in preparation for having more write queues, in which
case we would have needed to pass in more information than just
a simple 'is_kswapd' boolean.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-wbt.c | 25 +++++++++++++++----------
block/blk-wbt.h | 4 +++-
2 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 3e34b41bcefc..25d202345965 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -101,9 +101,13 @@ static bool wb_recent_wait(struct rq_wb *rwb)
return time_before(jiffies, wb->dirty_sleep + HZ);
}
-static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb, bool is_kswapd)
+static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb,
+ enum wbt_flags wb_acct)
{
- return &rwb->rq_wait[is_kswapd];
+ if (wb_acct & WBT_KSWAPD)
+ return &rwb->rq_wait[WBT_RWQ_KSWAPD];
+
+ return &rwb->rq_wait[WBT_RWQ_BG];
}
static void rwb_wake_all(struct rq_wb *rwb)
@@ -126,7 +130,7 @@ void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
if (!(wb_acct & WBT_TRACKED))
return;
- rqw = get_rq_wait(rwb, wb_acct & WBT_KSWAPD);
+ rqw = get_rq_wait(rwb, wb_acct);
inflight = atomic_dec_return(&rqw->inflight);
/*
@@ -529,11 +533,12 @@ static inline bool may_queue(struct rq_wb *rwb, struct rq_wait *rqw,
* Block if we will exceed our limit, or if we are currently waiting for
* the timer to kick off queuing again.
*/
-static void __wbt_wait(struct rq_wb *rwb, unsigned long rw, spinlock_t *lock)
+static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
+ unsigned long rw, spinlock_t *lock)
__releases(lock)
__acquires(lock)
{
- struct rq_wait *rqw = get_rq_wait(rwb, current_is_kswapd());
+ struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
DEFINE_WAIT(wait);
if (may_queue(rwb, rqw, &wait, rw))
@@ -584,7 +589,7 @@ static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
*/
enum wbt_flags wbt_wait(struct rq_wb *rwb, struct bio *bio, spinlock_t *lock)
{
- unsigned int ret = 0;
+ enum wbt_flags ret = 0;
if (!rwb_enabled(rwb))
return 0;
@@ -598,14 +603,14 @@ enum wbt_flags wbt_wait(struct rq_wb *rwb, struct bio *bio, spinlock_t *lock)
return ret;
}
- __wbt_wait(rwb, bio->bi_opf, lock);
+ if (current_is_kswapd())
+ ret |= WBT_KSWAPD;
+
+ __wbt_wait(rwb, ret, bio->bi_opf, lock);
if (!blk_stat_is_active(rwb->cb))
rwb_arm_timer(rwb);
- if (current_is_kswapd())
- ret |= WBT_KSWAPD;
-
return ret | WBT_TRACKED;
}
diff --git a/block/blk-wbt.h b/block/blk-wbt.h
index a232c98fbf4d..8038b4a0d4ef 100644
--- a/block/blk-wbt.h
+++ b/block/blk-wbt.h
@@ -19,7 +19,9 @@ enum wbt_flags {
};
enum {
- WBT_NUM_RWQ = 2,
+ WBT_RWQ_BG = 0,
+ WBT_RWQ_KSWAPD,
+ WBT_NUM_RWQ,
};
/*
--
2.7.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/4] blk-wbt: pass in enum wbt_flags to get_rq_wait()
2018-05-07 16:13 ` [PATCH 3/4] blk-wbt: pass in enum wbt_flags to get_rq_wait() Jens Axboe
@ 2018-05-08 0:07 ` Darrick J. Wong
2018-05-08 20:46 ` Omar Sandoval
1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-05-08 0:07 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:34AM -0600, Jens Axboe wrote:
> This is in preparation for having more write queues, in which
> case we would have needed to pass in more information than just
> a simple 'is_kswapd' boolean.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks ok (having not run any kind of testing on this yet),
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> ---
> block/blk-wbt.c | 25 +++++++++++++++----------
> block/blk-wbt.h | 4 +++-
> 2 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> index 3e34b41bcefc..25d202345965 100644
> --- a/block/blk-wbt.c
> +++ b/block/blk-wbt.c
> @@ -101,9 +101,13 @@ static bool wb_recent_wait(struct rq_wb *rwb)
> return time_before(jiffies, wb->dirty_sleep + HZ);
> }
>
> -static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb, bool is_kswapd)
> +static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb,
> + enum wbt_flags wb_acct)
> {
> - return &rwb->rq_wait[is_kswapd];
> + if (wb_acct & WBT_KSWAPD)
> + return &rwb->rq_wait[WBT_RWQ_KSWAPD];
> +
> + return &rwb->rq_wait[WBT_RWQ_BG];
> }
>
> static void rwb_wake_all(struct rq_wb *rwb)
> @@ -126,7 +130,7 @@ void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
> if (!(wb_acct & WBT_TRACKED))
> return;
>
> - rqw = get_rq_wait(rwb, wb_acct & WBT_KSWAPD);
> + rqw = get_rq_wait(rwb, wb_acct);
> inflight = atomic_dec_return(&rqw->inflight);
>
> /*
> @@ -529,11 +533,12 @@ static inline bool may_queue(struct rq_wb *rwb, struct rq_wait *rqw,
> * Block if we will exceed our limit, or if we are currently waiting for
> * the timer to kick off queuing again.
> */
> -static void __wbt_wait(struct rq_wb *rwb, unsigned long rw, spinlock_t *lock)
> +static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
> + unsigned long rw, spinlock_t *lock)
> __releases(lock)
> __acquires(lock)
> {
> - struct rq_wait *rqw = get_rq_wait(rwb, current_is_kswapd());
> + struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
> DEFINE_WAIT(wait);
>
> if (may_queue(rwb, rqw, &wait, rw))
> @@ -584,7 +589,7 @@ static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
> */
> enum wbt_flags wbt_wait(struct rq_wb *rwb, struct bio *bio, spinlock_t *lock)
> {
> - unsigned int ret = 0;
> + enum wbt_flags ret = 0;
>
> if (!rwb_enabled(rwb))
> return 0;
> @@ -598,14 +603,14 @@ enum wbt_flags wbt_wait(struct rq_wb *rwb, struct bio *bio, spinlock_t *lock)
> return ret;
> }
>
> - __wbt_wait(rwb, bio->bi_opf, lock);
> + if (current_is_kswapd())
> + ret |= WBT_KSWAPD;
> +
> + __wbt_wait(rwb, ret, bio->bi_opf, lock);
>
> if (!blk_stat_is_active(rwb->cb))
> rwb_arm_timer(rwb);
>
> - if (current_is_kswapd())
> - ret |= WBT_KSWAPD;
> -
> return ret | WBT_TRACKED;
> }
>
> diff --git a/block/blk-wbt.h b/block/blk-wbt.h
> index a232c98fbf4d..8038b4a0d4ef 100644
> --- a/block/blk-wbt.h
> +++ b/block/blk-wbt.h
> @@ -19,7 +19,9 @@ enum wbt_flags {
> };
>
> enum {
> - WBT_NUM_RWQ = 2,
> + WBT_RWQ_BG = 0,
> + WBT_RWQ_KSWAPD,
> + WBT_NUM_RWQ,
> };
>
> /*
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/4] blk-wbt: pass in enum wbt_flags to get_rq_wait()
2018-05-07 16:13 ` [PATCH 3/4] blk-wbt: pass in enum wbt_flags to get_rq_wait() Jens Axboe
2018-05-08 0:07 ` Darrick J. Wong
@ 2018-05-08 20:46 ` Omar Sandoval
1 sibling, 0 replies; 14+ messages in thread
From: Omar Sandoval @ 2018-05-08 20:46 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:34AM -0600, Jens Axboe wrote:
> This is in preparation for having more write queues, in which
> case we would have needed to pass in more information than just
> a simple 'is_kswapd' boolean.
Reviewed-by: Omar Sandoval <osandov@fb.com>
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> block/blk-wbt.c | 25 +++++++++++++++----------
> block/blk-wbt.h | 4 +++-
> 2 files changed, 18 insertions(+), 11 deletions(-)
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 4/4] blk-wbt: throttle discards like background writes
2018-05-07 16:13 [PATCHSET v2 0/4] Add throttling for discards Jens Axboe
` (2 preceding siblings ...)
2018-05-07 16:13 ` [PATCH 3/4] blk-wbt: pass in enum wbt_flags to get_rq_wait() Jens Axboe
@ 2018-05-07 16:13 ` Jens Axboe
2018-05-08 0:13 ` Darrick J. Wong
2018-05-08 20:48 ` Omar Sandoval
3 siblings, 2 replies; 14+ messages in thread
From: Jens Axboe @ 2018-05-07 16:13 UTC (permalink / raw)
To: linux-block, linux-xfs; +Cc: dchinner, hch, Jens Axboe
Throttle discards like we would any background write. Discards should
be background activity, so if they are impacting foreground IO, then
we will throttle them down.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-stat.h | 6 +++---
block/blk-wbt.c | 43 ++++++++++++++++++++++++++-----------------
block/blk-wbt.h | 4 +++-
3 files changed, 32 insertions(+), 21 deletions(-)
diff --git a/block/blk-stat.h b/block/blk-stat.h
index 2dd36347252a..c22049a8125e 100644
--- a/block/blk-stat.h
+++ b/block/blk-stat.h
@@ -10,11 +10,11 @@
/*
* from upper:
- * 3 bits: reserved for other usage
+ * 4 bits: reserved for other usage
* 12 bits: size
- * 49 bits: time
+ * 48 bits: time
*/
-#define BLK_STAT_RES_BITS 3
+#define BLK_STAT_RES_BITS 4
#define BLK_STAT_SIZE_BITS 12
#define BLK_STAT_RES_SHIFT (64 - BLK_STAT_RES_BITS)
#define BLK_STAT_SIZE_SHIFT (BLK_STAT_RES_SHIFT - BLK_STAT_SIZE_BITS)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 25d202345965..a7a724580033 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -106,6 +106,8 @@ static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb,
{
if (wb_acct & WBT_KSWAPD)
return &rwb->rq_wait[WBT_RWQ_KSWAPD];
+ else if (wb_acct & WBT_DISCARD)
+ return &rwb->rq_wait[WBT_RWQ_DISCARD];
return &rwb->rq_wait[WBT_RWQ_BG];
}
@@ -143,10 +145,13 @@ void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
}
/*
- * If the device does write back caching, drop further down
- * before we wake people up.
+ * For discards, our limit is always the background. For writes, if
+ * the device does write back caching, drop further down before we
+ * wake people up.
*/
- if (rwb->wc && !wb_recent_wait(rwb))
+ if (wb_acct & WBT_DISCARD)
+ limit = rwb->wb_background;
+ else if (rwb->wc && !wb_recent_wait(rwb))
limit = 0;
else
limit = rwb->wb_normal;
@@ -483,6 +488,9 @@ static inline unsigned int get_limit(struct rq_wb *rwb, unsigned long rw)
{
unsigned int limit;
+ if ((rw & REQ_OP_MASK) == REQ_OP_DISCARD)
+ return rwb->wb_background;
+
/*
* At this point we know it's a buffered write. If this is
* kswapd trying to free memory, or REQ_SYNC is set, then
@@ -564,21 +572,20 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
{
- const int op = bio_op(bio);
-
- /*
- * If not a WRITE, do nothing
- */
- if (op != REQ_OP_WRITE)
- return false;
-
- /*
- * Don't throttle WRITE_ODIRECT
- */
- if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) == (REQ_SYNC | REQ_IDLE))
+ switch (bio_op(bio)) {
+ case REQ_OP_WRITE:
+ /*
+ * Don't throttle WRITE_ODIRECT
+ */
+ if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) ==
+ (REQ_SYNC | REQ_IDLE))
+ return false;
+ /* fallthrough */
+ case REQ_OP_DISCARD:
+ return true;
+ default:
return false;
-
- return true;
+ }
}
/*
@@ -605,6 +612,8 @@ enum wbt_flags wbt_wait(struct rq_wb *rwb, struct bio *bio, spinlock_t *lock)
if (current_is_kswapd())
ret |= WBT_KSWAPD;
+ if (bio_op(bio) == REQ_OP_DISCARD)
+ ret |= WBT_DISCARD;
__wbt_wait(rwb, ret, bio->bi_opf, lock);
diff --git a/block/blk-wbt.h b/block/blk-wbt.h
index 8038b4a0d4ef..d6a125e49db5 100644
--- a/block/blk-wbt.h
+++ b/block/blk-wbt.h
@@ -14,13 +14,15 @@ enum wbt_flags {
WBT_TRACKED = 1, /* write, tracked for throttling */
WBT_READ = 2, /* read */
WBT_KSWAPD = 4, /* write, from kswapd */
+ WBT_DISCARD = 8, /* discard */
- WBT_NR_BITS = 3, /* number of bits */
+ WBT_NR_BITS = 4, /* number of bits */
};
enum {
WBT_RWQ_BG = 0,
WBT_RWQ_KSWAPD,
+ WBT_RWQ_DISCARD,
WBT_NUM_RWQ,
};
--
2.7.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] blk-wbt: throttle discards like background writes
2018-05-07 16:13 ` [PATCH 4/4] blk-wbt: throttle discards like background writes Jens Axboe
@ 2018-05-08 0:13 ` Darrick J. Wong
2018-05-08 20:48 ` Omar Sandoval
1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-05-08 0:13 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:35AM -0600, Jens Axboe wrote:
> Throttle discards like we would any background write. Discards should
> be background activity, so if they are impacting foreground IO, then
> we will throttle them down.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> ---
> block/blk-stat.h | 6 +++---
> block/blk-wbt.c | 43 ++++++++++++++++++++++++++-----------------
> block/blk-wbt.h | 4 +++-
> 3 files changed, 32 insertions(+), 21 deletions(-)
>
> diff --git a/block/blk-stat.h b/block/blk-stat.h
> index 2dd36347252a..c22049a8125e 100644
> --- a/block/blk-stat.h
> +++ b/block/blk-stat.h
> @@ -10,11 +10,11 @@
>
> /*
> * from upper:
> - * 3 bits: reserved for other usage
> + * 4 bits: reserved for other usage
> * 12 bits: size
> - * 49 bits: time
> + * 48 bits: time
> */
> -#define BLK_STAT_RES_BITS 3
> +#define BLK_STAT_RES_BITS 4
> #define BLK_STAT_SIZE_BITS 12
> #define BLK_STAT_RES_SHIFT (64 - BLK_STAT_RES_BITS)
> #define BLK_STAT_SIZE_SHIFT (BLK_STAT_RES_SHIFT - BLK_STAT_SIZE_BITS)
> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> index 25d202345965..a7a724580033 100644
> --- a/block/blk-wbt.c
> +++ b/block/blk-wbt.c
> @@ -106,6 +106,8 @@ static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb,
> {
> if (wb_acct & WBT_KSWAPD)
> return &rwb->rq_wait[WBT_RWQ_KSWAPD];
> + else if (wb_acct & WBT_DISCARD)
> + return &rwb->rq_wait[WBT_RWQ_DISCARD];
>
> return &rwb->rq_wait[WBT_RWQ_BG];
> }
> @@ -143,10 +145,13 @@ void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
> }
>
> /*
> - * If the device does write back caching, drop further down
> - * before we wake people up.
> + * For discards, our limit is always the background. For writes, if
> + * the device does write back caching, drop further down before we
> + * wake people up.
> */
> - if (rwb->wc && !wb_recent_wait(rwb))
> + if (wb_acct & WBT_DISCARD)
> + limit = rwb->wb_background;
> + else if (rwb->wc && !wb_recent_wait(rwb))
> limit = 0;
> else
> limit = rwb->wb_normal;
> @@ -483,6 +488,9 @@ static inline unsigned int get_limit(struct rq_wb *rwb, unsigned long rw)
> {
> unsigned int limit;
>
> + if ((rw & REQ_OP_MASK) == REQ_OP_DISCARD)
> + return rwb->wb_background;
> +
> /*
> * At this point we know it's a buffered write. If this is
> * kswapd trying to free memory, or REQ_SYNC is set, then
> @@ -564,21 +572,20 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
>
> static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
> {
> - const int op = bio_op(bio);
> -
> - /*
> - * If not a WRITE, do nothing
> - */
> - if (op != REQ_OP_WRITE)
> - return false;
> -
> - /*
> - * Don't throttle WRITE_ODIRECT
> - */
> - if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) == (REQ_SYNC | REQ_IDLE))
> + switch (bio_op(bio)) {
> + case REQ_OP_WRITE:
> + /*
> + * Don't throttle WRITE_ODIRECT
> + */
> + if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) ==
> + (REQ_SYNC | REQ_IDLE))
> + return false;
> + /* fallthrough */
> + case REQ_OP_DISCARD:
> + return true;
> + default:
> return false;
> -
> - return true;
> + }
> }
>
> /*
> @@ -605,6 +612,8 @@ enum wbt_flags wbt_wait(struct rq_wb *rwb, struct bio *bio, spinlock_t *lock)
>
> if (current_is_kswapd())
> ret |= WBT_KSWAPD;
> + if (bio_op(bio) == REQ_OP_DISCARD)
> + ret |= WBT_DISCARD;
>
> __wbt_wait(rwb, ret, bio->bi_opf, lock);
>
> diff --git a/block/blk-wbt.h b/block/blk-wbt.h
> index 8038b4a0d4ef..d6a125e49db5 100644
> --- a/block/blk-wbt.h
> +++ b/block/blk-wbt.h
> @@ -14,13 +14,15 @@ enum wbt_flags {
> WBT_TRACKED = 1, /* write, tracked for throttling */
> WBT_READ = 2, /* read */
> WBT_KSWAPD = 4, /* write, from kswapd */
> + WBT_DISCARD = 8, /* discard */
>
> - WBT_NR_BITS = 3, /* number of bits */
> + WBT_NR_BITS = 4, /* number of bits */
> };
>
> enum {
> WBT_RWQ_BG = 0,
> WBT_RWQ_KSWAPD,
> + WBT_RWQ_DISCARD,
> WBT_NUM_RWQ,
> };
>
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] blk-wbt: throttle discards like background writes
2018-05-07 16:13 ` [PATCH 4/4] blk-wbt: throttle discards like background writes Jens Axboe
2018-05-08 0:13 ` Darrick J. Wong
@ 2018-05-08 20:48 ` Omar Sandoval
1 sibling, 0 replies; 14+ messages in thread
From: Omar Sandoval @ 2018-05-08 20:48 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, linux-xfs, dchinner, hch
On Mon, May 07, 2018 at 10:13:35AM -0600, Jens Axboe wrote:
> Throttle discards like we would any background write. Discards should
> be background activity, so if they are impacting foreground IO, then
> we will throttle them down.
Seems reasonable.
Reviewed-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> block/blk-stat.h | 6 +++---
> block/blk-wbt.c | 43 ++++++++++++++++++++++++++-----------------
> block/blk-wbt.h | 4 +++-
> 3 files changed, 32 insertions(+), 21 deletions(-)
^ permalink raw reply [flat|nested] 14+ messages in thread