linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] block: fix iolat timestamp and restore accounting semantics
@ 2018-12-11 23:01 Dennis Zhou
  2018-12-13 19:48 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Dennis Zhou @ 2018-12-11 23:01 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo, Josef Bacik
  Cc: kernel-team, linux-block, cgroups, linux-kernel, Dennis Zhou

The blk-iolatency controller measures the time from rq_qos_throttle() to
rq_qos_done_bio() and attributes this time to the first bio that needs
to create the request. This means if a bio is plug-mergeable or
bio-mergeable, it gets to bypass the blk-iolatency controller.

The recent series, to tag all bios w/ blkgs in [1] changed the timing
incorrectly as well. First, the iolatency controller was tagging bios
and using that information if it should process it in rq_qos_done_bio().
However, now that all bios are tagged, this caused the atomic_t for the
struct rq_wait inflight count to underflow resulting in a stall. Second,
now the timing was using the duration a bio from generic_make_request()
rather than the timing mentioned above.

This patch fixes these issues by reusing the BLK_QUEUE_ENTERED flag to
determine if a bio has entered the request layer and is responsible for
starting a request. Stacked drivers don't recurse through
blk_mq_make_request(), so the overhead of using time between
generic_make_request() and the blk_mq_get_request() should be minimal.
blk-iolatency now checks if this flag is set to determine if it should
process the bio in rq_qos_done_bio().

[1] https://lore.kernel.org/lkml/20181205171039.73066-1-dennis@kernel.org/

Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
---
v2:
- Switched to reusing BIO_QUEUE_ENTERED rather than adding a second
  timestamp to the bio struct.

 block/blk-iolatency.c |  2 +-
 block/blk-mq.c        | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index bee092727cad..e408282bdc4c 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -593,7 +593,7 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio)
 	bool enabled = false;
 
 	blkg = bio->bi_blkg;
-	if (!blkg)
+	if (!blkg || !bio_flagged(bio, BIO_QUEUE_ENTERED))
 		return;
 
 	iolat = blkg_to_lat(bio->bi_blkg);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9690f4f8de7e..05ac940e6671 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1920,6 +1920,17 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	struct request *same_queue_rq = NULL;
 	blk_qc_t cookie;
 
+	/*
+	 * The flag BIO_QUEUE_ENTERED is used for two purposes.  First, it
+	 * determines if a bio is being split and has already entered the queue.
+	 * This happens in blk_queue_split() where we can recursively call
+	 * generic_make_request().  The second use is to mark bios that will
+	 * call rq_qos_throttle() and subseqently blk_mq_get_request().  These
+	 * are the bios that fail plug-merging and bio-merging with the primary
+	 * use case for this being the blk-iolatency controller.
+	 */
+	bio_clear_flag(bio, BIO_QUEUE_ENTERED);
+
 	blk_queue_bounce(q, &bio);
 
 	blk_queue_split(q, &bio);
@@ -1934,6 +1945,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	if (blk_mq_sched_bio_merge(q, bio))
 		return BLK_QC_T_NONE;
 
+	bio_set_flag(bio, BIO_QUEUE_ENTERED);
 	rq_qos_throttle(q, bio);
 
 	rq = blk_mq_get_request(q, bio, &data);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] block: fix iolat timestamp and restore accounting semantics
  2018-12-11 23:01 [PATCH v2] block: fix iolat timestamp and restore accounting semantics Dennis Zhou
@ 2018-12-13 19:48 ` Jens Axboe
  2018-12-13 19:52   ` Josef Bacik
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2018-12-13 19:48 UTC (permalink / raw)
  To: Dennis Zhou, Tejun Heo, Josef Bacik
  Cc: kernel-team, linux-block, cgroups, linux-kernel

On 12/11/18 4:01 PM, Dennis Zhou wrote:
> The blk-iolatency controller measures the time from rq_qos_throttle() to
> rq_qos_done_bio() and attributes this time to the first bio that needs
> to create the request. This means if a bio is plug-mergeable or
> bio-mergeable, it gets to bypass the blk-iolatency controller.
> 
> The recent series, to tag all bios w/ blkgs in [1] changed the timing
> incorrectly as well. First, the iolatency controller was tagging bios
> and using that information if it should process it in rq_qos_done_bio().
> However, now that all bios are tagged, this caused the atomic_t for the
> struct rq_wait inflight count to underflow resulting in a stall. Second,
> now the timing was using the duration a bio from generic_make_request()
> rather than the timing mentioned above.
> 
> This patch fixes these issues by reusing the BLK_QUEUE_ENTERED flag to
> determine if a bio has entered the request layer and is responsible for
> starting a request. Stacked drivers don't recurse through
> blk_mq_make_request(), so the overhead of using time between
> generic_make_request() and the blk_mq_get_request() should be minimal.
> blk-iolatency now checks if this flag is set to determine if it should
> process the bio in rq_qos_done_bio().

I'm having a hard time convincing myself that this is correct... Maybe
we should just add a new flag for this specific use case? Or feel free
to convince me otherwise.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] block: fix iolat timestamp and restore accounting semantics
  2018-12-13 19:48 ` Jens Axboe
@ 2018-12-13 19:52   ` Josef Bacik
  2018-12-13 19:59     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2018-12-13 19:52 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Dennis Zhou, Tejun Heo, Josef Bacik, kernel-team, linux-block,
	cgroups, linux-kernel

On Thu, Dec 13, 2018 at 12:48:11PM -0700, Jens Axboe wrote:
> On 12/11/18 4:01 PM, Dennis Zhou wrote:
> > The blk-iolatency controller measures the time from rq_qos_throttle() to
> > rq_qos_done_bio() and attributes this time to the first bio that needs
> > to create the request. This means if a bio is plug-mergeable or
> > bio-mergeable, it gets to bypass the blk-iolatency controller.
> > 
> > The recent series, to tag all bios w/ blkgs in [1] changed the timing
> > incorrectly as well. First, the iolatency controller was tagging bios
> > and using that information if it should process it in rq_qos_done_bio().
> > However, now that all bios are tagged, this caused the atomic_t for the
> > struct rq_wait inflight count to underflow resulting in a stall. Second,
> > now the timing was using the duration a bio from generic_make_request()
> > rather than the timing mentioned above.
> > 
> > This patch fixes these issues by reusing the BLK_QUEUE_ENTERED flag to
> > determine if a bio has entered the request layer and is responsible for
> > starting a request. Stacked drivers don't recurse through
> > blk_mq_make_request(), so the overhead of using time between
> > generic_make_request() and the blk_mq_get_request() should be minimal.
> > blk-iolatency now checks if this flag is set to determine if it should
> > process the bio in rq_qos_done_bio().
> 
> I'm having a hard time convincing myself that this is correct... Maybe
> we should just add a new flag for this specific use case? Or feel free
> to convince me otherwise.
>

I mean it'll work for now, but then when somebody else wants to do something
similar *cough*io.weight*cough* it'll need a new flag.  I kind of hate adding a
new flag for every controller, but then again it's not like there's thousands of
these things.  I'm having a hard time coming up with a solution other than a
per-tracker flag.  As for this specific version, I still think it needs to be in
iolatency itself, trying to make it generic just means it'll get fucked up again
later down the line.  Thanks,

Josef 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] block: fix iolat timestamp and restore accounting semantics
  2018-12-13 19:52   ` Josef Bacik
@ 2018-12-13 19:59     ` Jens Axboe
  2018-12-13 20:03       ` Josef Bacik
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2018-12-13 19:59 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Dennis Zhou, Tejun Heo, kernel-team, linux-block, cgroups, linux-kernel

On 12/13/18 12:52 PM, Josef Bacik wrote:
> On Thu, Dec 13, 2018 at 12:48:11PM -0700, Jens Axboe wrote:
>> On 12/11/18 4:01 PM, Dennis Zhou wrote:
>>> The blk-iolatency controller measures the time from
>>> rq_qos_throttle() to rq_qos_done_bio() and attributes this time to
>>> the first bio that needs to create the request. This means if a bio
>>> is plug-mergeable or bio-mergeable, it gets to bypass the
>>> blk-iolatency controller.
>>>
>>> The recent series, to tag all bios w/ blkgs in [1] changed the
>>> timing incorrectly as well. First, the iolatency controller was
>>> tagging bios and using that information if it should process it in
>>> rq_qos_done_bio().  However, now that all bios are tagged, this
>>> caused the atomic_t for the struct rq_wait inflight count to
>>> underflow resulting in a stall. Second, now the timing was using the
>>> duration a bio from generic_make_request() rather than the timing
>>> mentioned above.
>>>
>>> This patch fixes these issues by reusing the BLK_QUEUE_ENTERED flag
>>> to determine if a bio has entered the request layer and is
>>> responsible for starting a request. Stacked drivers don't recurse
>>> through blk_mq_make_request(), so the overhead of using time between
>>> generic_make_request() and the blk_mq_get_request() should be
>>> minimal.  blk-iolatency now checks if this flag is set to determine
>>> if it should process the bio in rq_qos_done_bio().
>>
>> I'm having a hard time convincing myself that this is correct...
>> Maybe we should just add a new flag for this specific use case? Or
>> feel free to convince me otherwise.
>>
> 
> I mean it'll work for now, but then when somebody else wants to do
> something similar *cough*io.weight*cough* it'll need a new flag.  I
> kind of hate adding a new flag for every controller, but then again
> it's not like there's thousands of these things.  I'm having a hard
> time coming up with a solution other than a per-tracker flag.  As for
> this specific version, I still think it needs to be in iolatency
> itself, trying to make it generic just means it'll get fucked up again
> later down the line.  Thanks,

We definitely don't have that many flags, and I'd hate to add a
per-whatever flag for this.

But do we need that? We really just need single flag for this, my main
worry was overloading ENTERED. Especially since we're adding different
clearing and setting for it. If we had a specific one, if it's set, we
would need to disallow merging for it, I guess.

And there's already BIO_THROTTLED...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] block: fix iolat timestamp and restore accounting semantics
  2018-12-13 19:59     ` Jens Axboe
@ 2018-12-13 20:03       ` Josef Bacik
  2018-12-13 20:10         ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2018-12-13 20:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Josef Bacik, Dennis Zhou, Tejun Heo, kernel-team, linux-block,
	cgroups, linux-kernel

On Thu, Dec 13, 2018 at 12:59:03PM -0700, Jens Axboe wrote:
> On 12/13/18 12:52 PM, Josef Bacik wrote:
> > On Thu, Dec 13, 2018 at 12:48:11PM -0700, Jens Axboe wrote:
> >> On 12/11/18 4:01 PM, Dennis Zhou wrote:
> >>> The blk-iolatency controller measures the time from
> >>> rq_qos_throttle() to rq_qos_done_bio() and attributes this time to
> >>> the first bio that needs to create the request. This means if a bio
> >>> is plug-mergeable or bio-mergeable, it gets to bypass the
> >>> blk-iolatency controller.
> >>>
> >>> The recent series, to tag all bios w/ blkgs in [1] changed the
> >>> timing incorrectly as well. First, the iolatency controller was
> >>> tagging bios and using that information if it should process it in
> >>> rq_qos_done_bio().  However, now that all bios are tagged, this
> >>> caused the atomic_t for the struct rq_wait inflight count to
> >>> underflow resulting in a stall. Second, now the timing was using the
> >>> duration a bio from generic_make_request() rather than the timing
> >>> mentioned above.
> >>>
> >>> This patch fixes these issues by reusing the BLK_QUEUE_ENTERED flag
> >>> to determine if a bio has entered the request layer and is
> >>> responsible for starting a request. Stacked drivers don't recurse
> >>> through blk_mq_make_request(), so the overhead of using time between
> >>> generic_make_request() and the blk_mq_get_request() should be
> >>> minimal.  blk-iolatency now checks if this flag is set to determine
> >>> if it should process the bio in rq_qos_done_bio().
> >>
> >> I'm having a hard time convincing myself that this is correct...
> >> Maybe we should just add a new flag for this specific use case? Or
> >> feel free to convince me otherwise.
> >>
> > 
> > I mean it'll work for now, but then when somebody else wants to do
> > something similar *cough*io.weight*cough* it'll need a new flag.  I
> > kind of hate adding a new flag for every controller, but then again
> > it's not like there's thousands of these things.  I'm having a hard
> > time coming up with a solution other than a per-tracker flag.  As for
> > this specific version, I still think it needs to be in iolatency
> > itself, trying to make it generic just means it'll get fucked up again
> > later down the line.  Thanks,
> 
> We definitely don't have that many flags, and I'd hate to add a
> per-whatever flag for this.
> 
> But do we need that? We really just need single flag for this, my main
> worry was overloading ENTERED. Especially since we're adding different
> clearing and setting for it. If we had a specific one, if it's set, we
> would need to disallow merging for it, I guess.
> 
> And there's already BIO_THROTTLED...
> 

Oh well I guess we only really want to know if we saw the BIO, which should be
able to be shared by all the rq_qos implementations.  I think I'd rather just
have a BIO_TRACKED to indicate it went through the normal rq_qos path.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] block: fix iolat timestamp and restore accounting semantics
  2018-12-13 20:03       ` Josef Bacik
@ 2018-12-13 20:10         ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2018-12-13 20:10 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Dennis Zhou, Tejun Heo, kernel-team, linux-block, cgroups, linux-kernel

On 12/13/18 1:03 PM, Josef Bacik wrote:
> On Thu, Dec 13, 2018 at 12:59:03PM -0700, Jens Axboe wrote:
>> On 12/13/18 12:52 PM, Josef Bacik wrote:
>>> On Thu, Dec 13, 2018 at 12:48:11PM -0700, Jens Axboe wrote:
>>>> On 12/11/18 4:01 PM, Dennis Zhou wrote:
>>>>> The blk-iolatency controller measures the time from
>>>>> rq_qos_throttle() to rq_qos_done_bio() and attributes this time to
>>>>> the first bio that needs to create the request. This means if a bio
>>>>> is plug-mergeable or bio-mergeable, it gets to bypass the
>>>>> blk-iolatency controller.
>>>>>
>>>>> The recent series, to tag all bios w/ blkgs in [1] changed the
>>>>> timing incorrectly as well. First, the iolatency controller was
>>>>> tagging bios and using that information if it should process it in
>>>>> rq_qos_done_bio().  However, now that all bios are tagged, this
>>>>> caused the atomic_t for the struct rq_wait inflight count to
>>>>> underflow resulting in a stall. Second, now the timing was using the
>>>>> duration a bio from generic_make_request() rather than the timing
>>>>> mentioned above.
>>>>>
>>>>> This patch fixes these issues by reusing the BLK_QUEUE_ENTERED flag
>>>>> to determine if a bio has entered the request layer and is
>>>>> responsible for starting a request. Stacked drivers don't recurse
>>>>> through blk_mq_make_request(), so the overhead of using time between
>>>>> generic_make_request() and the blk_mq_get_request() should be
>>>>> minimal.  blk-iolatency now checks if this flag is set to determine
>>>>> if it should process the bio in rq_qos_done_bio().
>>>>
>>>> I'm having a hard time convincing myself that this is correct...
>>>> Maybe we should just add a new flag for this specific use case? Or
>>>> feel free to convince me otherwise.
>>>>
>>>
>>> I mean it'll work for now, but then when somebody else wants to do
>>> something similar *cough*io.weight*cough* it'll need a new flag.  I
>>> kind of hate adding a new flag for every controller, but then again
>>> it's not like there's thousands of these things.  I'm having a hard
>>> time coming up with a solution other than a per-tracker flag.  As for
>>> this specific version, I still think it needs to be in iolatency
>>> itself, trying to make it generic just means it'll get fucked up again
>>> later down the line.  Thanks,
>>
>> We definitely don't have that many flags, and I'd hate to add a
>> per-whatever flag for this.
>>
>> But do we need that? We really just need single flag for this, my main
>> worry was overloading ENTERED. Especially since we're adding different
>> clearing and setting for it. If we had a specific one, if it's set, we
>> would need to disallow merging for it, I guess.
>>
>> And there's already BIO_THROTTLED...
>>
> 
> Oh well I guess we only really want to know if we saw the BIO, which
> should be able to be shared by all the rq_qos implementations.  I
> think I'd rather just have a BIO_TRACKED to indicate it went through
> the normal rq_qos path.  Thanks,

Agree, so how about renaming BIO_THROTTLED to BIO_TRACKED and using
that?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-12-13 20:10 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-11 23:01 [PATCH v2] block: fix iolat timestamp and restore accounting semantics Dennis Zhou
2018-12-13 19:48 ` Jens Axboe
2018-12-13 19:52   ` Josef Bacik
2018-12-13 19:59     ` Jens Axboe
2018-12-13 20:03       ` Josef Bacik
2018-12-13 20:10         ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).