* [PATCH] blk: don't account discard request size
@ 2015-05-12 21:46 Shaohua Li
  2015-05-13 13:10 ` Jeff Moyer
  0 siblings, 1 reply; 8+ messages in thread
From: Shaohua Li @ 2015-05-12 21:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: axboe

In a workload with discard requests, the reported IO throughput is
generally much higher than expected, which is quite confusing when
checking iostat. A discard request doesn't really write data to the
drive, so don't account it.

Signed-off-by: Shaohua Li <shli@fb.com>
---
 block/blk-core.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fd154b9..0128d18 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2138,7 +2138,11 @@ EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
 
 void blk_account_io_completion(struct request *req, unsigned int bytes)
 {
-	if (blk_do_io_stat(req)) {
+	/*
+	 * discard request doesn't really write @bytes to drive,
+	 * doesn't account it
+	 **/
+	if (blk_do_io_stat(req) && !(req->cmd_flags & REQ_DISCARD)) {
 		const int rw = rq_data_dir(req);
 		struct hd_struct *part;
 		int cpu;
-- 
1.8.1



* Re: [PATCH] blk: don't account discard request size
  2015-05-12 21:46 [PATCH] blk: don't account discard request size Shaohua Li
@ 2015-05-13 13:10 ` Jeff Moyer
  2015-05-13 14:20   ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Moyer @ 2015-05-13 13:10 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-kernel, axboe

Shaohua Li <shli@fb.com> writes:

> In a workload with discard request, the IO throughput is generally much
> higher than expected. This is quite confusing checking iostat. Discard
> request doesn't really write data to drive, so don't account it.
>
> Signed-off-by: Shaohua Li <shli@fb.com>
> ---
>  block/blk-core.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index fd154b9..0128d18 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2138,7 +2138,11 @@ EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
>  
>  void blk_account_io_completion(struct request *req, unsigned int bytes)
>  {
> -	if (blk_do_io_stat(req)) {
> +	/*
> +	 * discard request doesn't really write @bytes to drive,
> +	 * doesn't account it
> +	 **/
> +	if (blk_do_io_stat(req) && !(req->cmd_flags & REQ_DISCARD)) {
>  		const int rw = rq_data_dir(req);
>  		struct hd_struct *part;
>  		int cpu;

I think you want to modify __get_request to not set REQ_IO_STAT for
discard requests.  This patch will still account the start of I/O, which
means in_flight will be off.

Cheers,
Jeff
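
A toy model may help separate the two proposals. It is not kernel code: `toy_stats`, `toy_request`, and every `toy_*` function are hypothetical names, and only the two points named in the thread are modeled (accounting at the start of I/O, and byte accounting at completion). The sketch assumes the posted hunk filters discards at completion only, while the `__get_request` suggestion clears the accounting flag when the request is allocated.

```c
#include <stdbool.h>

/* Toy counters; stand-ins for the real per-partition statistics. */
struct toy_stats {
    unsigned long started;  /* requests seen by the start-of-I/O hook */
    unsigned long sectors;  /* 512-byte sectors credited at completion */
};

struct toy_request {
    bool io_stat;        /* models REQ_IO_STAT: accounting enabled */
    bool is_discard;     /* models REQ_DISCARD */
    unsigned int bytes;  /* size of the request in bytes */
};

enum toy_policy {
    TOY_FILTER_AT_COMPLETION,  /* shape of the posted hunk */
    TOY_FILTER_AT_ALLOC        /* shape of the __get_request suggestion */
};

/* Allocation: optionally refuse to enable accounting for discards. */
static struct toy_request toy_alloc(enum toy_policy p, bool is_discard,
                                    unsigned int bytes)
{
    struct toy_request rq = { .io_stat = true, .is_discard = is_discard,
                              .bytes = bytes };
    if (p == TOY_FILTER_AT_ALLOC && is_discard)
        rq.io_stat = false;
    return rq;
}

/* Start hook: counts every request that has accounting enabled. */
static void toy_start(struct toy_stats *st, const struct toy_request *rq)
{
    if (rq->io_stat)
        st->started++;
}

/* Completion hook: credits sectors, optionally skipping discards here. */
static void toy_completion(struct toy_stats *st, const struct toy_request *rq,
                           enum toy_policy p)
{
    if (!rq->io_stat)
        return;
    if (p == TOY_FILTER_AT_COMPLETION && rq->is_discard)
        return;
    st->sectors += rq->bytes >> 9;
}

/* Push one 4 KiB write and one 1 MiB discard through both hooks. */
static struct toy_stats toy_run(enum toy_policy p)
{
    struct toy_stats st = { 0, 0 };
    struct toy_request w = toy_alloc(p, false, 4096);
    struct toy_request d = toy_alloc(p, true, 1048576);

    toy_start(&st, &w);
    toy_completion(&st, &w, p);
    toy_start(&st, &d);
    toy_completion(&st, &d, p);
    return st;
}
```

In this model, `toy_run(TOY_FILTER_AT_COMPLETION)` still counts the discard at the start hook (`started` is 2) but credits only the write's 8 sectors, whereas `toy_run(TOY_FILTER_AT_ALLOC)` never sees the discard at all (`started` is 1, `sectors` is 8).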


* Re: [PATCH] blk: don't account discard request size
  2015-05-13 13:10 ` Jeff Moyer
@ 2015-05-13 14:20   ` Jens Axboe
  2015-05-13 15:00     ` Jeff Moyer
  2015-05-13 15:22     ` Shaohua Li
  0 siblings, 2 replies; 8+ messages in thread
From: Jens Axboe @ 2015-05-13 14:20 UTC (permalink / raw)
  To: Jeff Moyer, Shaohua Li; +Cc: linux-kernel

On 05/13/2015 09:10 AM, Jeff Moyer wrote:
> Shaohua Li <shli@fb.com> writes:
>
>> In a workload with discard request, the IO throughput is generally much
>> higher than expected. This is quite confusing checking iostat. Discard
>> request doesn't really write data to drive, so don't account it.
>>
>> Signed-off-by: Shaohua Li <shli@fb.com>
>> ---
>>   block/blk-core.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index fd154b9..0128d18 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -2138,7 +2138,11 @@ EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
>>
>>   void blk_account_io_completion(struct request *req, unsigned int bytes)
>>   {
>> -	if (blk_do_io_stat(req)) {
>> +	/*
>> +	 * discard request doesn't really write @bytes to drive,
>> +	 * doesn't account it
>> +	 **/
>> +	if (blk_do_io_stat(req) && !(req->cmd_flags & REQ_DISCARD)) {
>>   		const int rw = rq_data_dir(req);
>>   		struct hd_struct *part;
>>   		int cpu;
>
> I think you want to modify __get_request to not set REQ_IO_STAT for
> discard requests.  This patch will still account the start of I/O, which
> means in_flight will be off.

That would be better. But I'm still not sure we want to turn off 
accounting for discards. For the mixed write/discard cases it's 
definitely confusing. The better option would be to account it as a 
discard and not a write. Preferably in a way that would not break 
existing tools, but so that they could get updated to support it.


-- 
Jens Axboe
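
The idea of accounting a discard "as a discard and not a write" can be sketched with a third statistics bucket. This is only an illustration under stated assumptions: `toy_group`, `toy_disk_stats`, and the `toy_*` helpers are hypothetical, and the two-bucket classifier assumes a discard is carried as a write, which is what the mixed write/discard confusion in this thread suggests.

```c
/* Accounting buckets; TOY_DISCARD is the hypothetical new one. */
enum toy_group { TOY_READ = 0, TOY_WRITE = 1, TOY_DISCARD = 2, TOY_NR_GROUPS };

struct toy_disk_stats {
    unsigned long ios[TOY_NR_GROUPS];      /* completed requests per bucket */
    unsigned long sectors[TOY_NR_GROUPS];  /* 512-byte sectors per bucket */
};

/* Two-bucket classification: a discard is indistinguishable from a write. */
static enum toy_group toy_group_legacy(int is_write, int is_discard)
{
    (void)is_discard;  /* deliberately ignored in the two-bucket scheme */
    return is_write ? TOY_WRITE : TOY_READ;
}

/* Three-bucket classification: discards get their own counters. */
static enum toy_group toy_group_split(int is_write, int is_discard)
{
    if (is_discard)
        return TOY_DISCARD;
    return is_write ? TOY_WRITE : TOY_READ;
}

/* Credit one completed request of the given size to a bucket. */
static void toy_account(struct toy_disk_stats *st, enum toy_group g,
                        unsigned int bytes)
{
    st->ios[g]++;
    st->sectors[g] += bytes >> 9;
}
```

Feeding a 4 KiB write and a 1 MiB discard through `toy_group_legacy` puts 2056 sectors in the write bucket; with `toy_group_split` the write bucket holds 8 sectors and the discard bucket holds 2048, so the discard remains visible without inflating write throughput.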



* Re: [PATCH] blk: don't account discard request size
  2015-05-13 14:20   ` Jens Axboe
@ 2015-05-13 15:00     ` Jeff Moyer
  2015-05-13 15:22       ` Jens Axboe
  2015-05-13 15:22     ` Shaohua Li
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Moyer @ 2015-05-13 15:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Shaohua Li, linux-kernel

Jens Axboe <axboe@fb.com> writes:

> That would be better. But I'm still not sure we want to turn off
> accounting for discards. For the mixed write/discard cases it's
> definitely confusing. The better option would be to account it as a
> discard and not a write. Preferably in a way that would not break
> existing tools, but so that they could get updated to support it.

Are you suggesting adding a few fields to the end of diskstats or adding
a new proc file altogether?  (or something else?)

Cheers,
Jeff


* Re: [PATCH] blk: don't account discard request size
  2015-05-13 14:20   ` Jens Axboe
  2015-05-13 15:00     ` Jeff Moyer
@ 2015-05-13 15:22     ` Shaohua Li
  2015-05-13 15:32       ` Jens Axboe
  1 sibling, 1 reply; 8+ messages in thread
From: Shaohua Li @ 2015-05-13 15:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jeff Moyer, linux-kernel

On Wed, May 13, 2015 at 10:20:12AM -0400, Jens Axboe wrote:
> On 05/13/2015 09:10 AM, Jeff Moyer wrote:
> >Shaohua Li <shli@fb.com> writes:
> >
> >>In a workload with discard request, the IO throughput is generally much
> >>higher than expected. This is quite confusing checking iostat. Discard
> >>request doesn't really write data to drive, so don't account it.
> >>
> >>Signed-off-by: Shaohua Li <shli@fb.com>
> >>---
> >>  block/blk-core.c | 6 +++++-
> >>  1 file changed, 5 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/block/blk-core.c b/block/blk-core.c
> >>index fd154b9..0128d18 100644
> >>--- a/block/blk-core.c
> >>+++ b/block/blk-core.c
> >>@@ -2138,7 +2138,11 @@ EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
> >>
> >>  void blk_account_io_completion(struct request *req, unsigned int bytes)
> >>  {
> >>-	if (blk_do_io_stat(req)) {
> >>+	/*
> >>+	 * discard request doesn't really write @bytes to drive,
> >>+	 * doesn't account it
> >>+	 **/
> >>+	if (blk_do_io_stat(req) && !(req->cmd_flags & REQ_DISCARD)) {
> >>  		const int rw = rq_data_dir(req);
> >>  		struct hd_struct *part;
> >>  		int cpu;
> >
> >I think you want to modify __get_request to not set REQ_IO_STAT for
> >discard requests.  This patch will still account the start of I/O, which
> >means in_flight will be off.
> 
> That would be better. But I'm still not sure we want to turn off
> accounting for discards. For the mixed write/discard cases it's
> definitely confusing. The better option would be to account it as a
> discard and not a write. Preferably in a way that would not break
> existing tools, but so that they could get updated to support it.

It's intentional that the start of a discard IO still gets accounted, so
tools will show there is IO. I'm not sure if this is better, though.

Adding separate columns for discard (maybe flush too) is definitely
preferred. Is breaking existing tools really ok?

Thanks,
Shaohua


* Re: [PATCH] blk: don't account discard request size
  2015-05-13 15:00     ` Jeff Moyer
@ 2015-05-13 15:22       ` Jens Axboe
  2015-05-13 15:48         ` Jeff Moyer
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2015-05-13 15:22 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: Shaohua Li, linux-kernel

On 05/13/2015 11:00 AM, Jeff Moyer wrote:
> Jens Axboe <axboe@fb.com> writes:
>
>> That would be better. But I'm still not sure we want to turn off
>> accounting for discards. For the mixed write/discard cases it's
>> definitely confusing. The better option would be to account it as a
>> discard and not a write. Preferably in a way that would not break
>> existing tools, but so that they could get updated to support it.
>
> Are you suggesting adding a few fields to the end of diskstats or adding
> a new proc file altogether?  (or something else?)

I didn't suggest any specific solution. Obviously it'd be nice if we 
could just extend diskstats, but that might break userland. Worth 
checking up on.

-- 
Jens Axboe



* Re: [PATCH] blk: don't account discard request size
  2015-05-13 15:22     ` Shaohua Li
@ 2015-05-13 15:32       ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2015-05-13 15:32 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Jeff Moyer, linux-kernel

On 05/13/2015 11:22 AM, Shaohua Li wrote:
> On Wed, May 13, 2015 at 10:20:12AM -0400, Jens Axboe wrote:
>> On 05/13/2015 09:10 AM, Jeff Moyer wrote:
>>> Shaohua Li <shli@fb.com> writes:
>>>
>>>> In a workload with discard request, the IO throughput is generally much
>>>> higher than expected. This is quite confusing checking iostat. Discard
>>>> request doesn't really write data to drive, so don't account it.
>>>>
>>>> Signed-off-by: Shaohua Li <shli@fb.com>
>>>> ---
>>>>   block/blk-core.c | 6 +++++-
>>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index fd154b9..0128d18 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -2138,7 +2138,11 @@ EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
>>>>
>>>>   void blk_account_io_completion(struct request *req, unsigned int bytes)
>>>>   {
>>>> -	if (blk_do_io_stat(req)) {
>>>> +	/*
>>>> +	 * discard request doesn't really write @bytes to drive,
>>>> +	 * doesn't account it
>>>> +	 **/
>>>> +	if (blk_do_io_stat(req) && !(req->cmd_flags & REQ_DISCARD)) {
>>>>   		const int rw = rq_data_dir(req);
>>>>   		struct hd_struct *part;
>>>>   		int cpu;
>>>
>>> I think you want to modify __get_request to not set REQ_IO_STAT for
>>> discard requests.  This patch will still account the start of I/O, which
>>> means in_flight will be off.
>>
>> That would be better. But I'm still not sure we want to turn off
>> accounting for discards. For the mixed write/discard cases it's
>> definitely confusing. The better option would be to account it as a
>> discard and not a write. Preferably in a way that would not break
>> existing tools, but so that they could get updated to support it.
>
> It's intentional discard IO start gets accounted, so tools will show
> there is IO. I'm not sure if this is better though.
>
> Adding separate columns for discard (maybe flush too) is definitely
> preferred. Is breaking existing tools really ok?

We can't break them. I was just curious whether adding a field to the
end of diskstats could be done without breaking old applications. If
so, they could just be updated to grab the new field too.

-- 
Jens Axboe



* Re: [PATCH] blk: don't account discard request size
  2015-05-13 15:22       ` Jens Axboe
@ 2015-05-13 15:48         ` Jeff Moyer
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Moyer @ 2015-05-13 15:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Shaohua Li, linux-kernel

Jens Axboe <axboe@fb.com> writes:

> On 05/13/2015 11:00 AM, Jeff Moyer wrote:
>> Jens Axboe <axboe@fb.com> writes:
>>
>>> That would be better. But I'm still not sure we want to turn off
>>> accounting for discards. For the mixed write/discard cases it's
>>> definitely confusing. The better option would be to account it as a
>>> discard and not a write. Preferably in a way that would not break
>>> existing tools, but so that they could get updated to support it.
>>
>> Are you suggesting adding a few fields to the end of diskstats or adding
>> a new proc file altogether?  (or something else?)
>
> I didn't suggest any specific solution. Obviously it'd be nice if we
> could just extend diskstats, but that might break userland. Worth
> checking up on.

OK, I didn't know if you had any other tricks up your sleeve.  We've had
luck adding fields to the end of some files in the past (I'd really have
to dig to remember what files), but I'm fairly certain this would break
*something*.  It's definitely safer (albeit uglier) to just add a file.

-Jeff
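
Whether appending fields breaks a tool depends on how that tool parses each line. As a hedged illustration, the sketch below reads the 14 tokens of a classic diskstats line (major, minor, device name, and 11 counters) and simply counts anything that follows. `toy_diskstat` and `toy_parse_diskstats` are hypothetical names, and the extra trailing fields are an assumption standing in for whatever new counters might be appended.

```c
#include <stdio.h>

struct toy_diskstat {
    unsigned int major, minor;
    char name[32];
    unsigned long long f[11];  /* the 11 classic counters, in file order */
    int extra;                 /* trailing fields beyond the classic 11 */
};

/* Returns 0 on success, -1 if the line has fewer than 14 tokens. */
static int toy_parse_diskstats(const char *line, struct toy_diskstat *out)
{
    int used = 0;
    int n = sscanf(line,
                   "%u %u %31s %llu %llu %llu %llu %llu %llu"
                   " %llu %llu %llu %llu %llu%n",
                   &out->major, &out->minor, out->name,
                   &out->f[0], &out->f[1], &out->f[2], &out->f[3],
                   &out->f[4], &out->f[5], &out->f[6], &out->f[7],
                   &out->f[8], &out->f[9], &out->f[10], &used);
    if (n != 14)
        return -1;

    /* Tolerate, and count, any numeric fields appended after the known ones. */
    out->extra = 0;
    const char *p = line + used;
    unsigned long long v;
    int step = 0;
    while (sscanf(p, "%llu%n", &v, &step) == 1) {
        out->extra++;
        p += step;
    }
    return 0;
}
```

A parser written this way reads the same first 11 counters whether or not extra fields are present. A tool that instead insists on an exact token count per line would reject the longer lines, which is the kind of breakage the thread is worried about.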


