linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Crash in elevator_dispatch_fn() (e.g. deadline_dispatch()) when changing elevators.
       [not found] ` <20140118143145.GD3640@htj.dyndns.org>
@ 2014-01-21 15:58   ` Frank Mayhar
  2014-01-22 15:46     ` Frank Mayhar
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Mayhar @ 2014-01-21 15:58 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Jens Axboe

On Sat, 2014-01-18 at 09:31 -0500, Tejun Heo wrote:
> Hello, Frank.
> 
> On Fri, Jan 17, 2014 at 01:59:36PM -0800, Frank Mayhar wrote:
> > After investigation, it's clear that the elevator switch code is trying
> > to quiesce the request queue and sets the bypass flag.  Unfortunately,
> > scsi_mode_sense() and friends call blk_execute_rq() directly, which pays
> > no attention to said flag.
> 
> Ouch.
> 
> > In fact, the bypass flag, as far as I can tell, is only checked in the
> > blk_cgroup.c, in blkg_lookup() and blkg_lookup_create().
> 
> Hmmm?  IIRC the main place is in __get_request().  If the queue is
> bypassing, the request doesn't get REQ_ELVPRIV which basically
> indicates that the request shouldn't go through elevator.  I don't
> think that part is broken even for scsi_execute_rq() - it uses
> blk_get_request() which should handle it properly.
> 
> > So my question is:  Is this a simple oversight?  Should blk_execute_rq()
> 
> Yeah, it's a bug.  It's not supposed to crash like that.

Clearly. :-)

> > care about the bypass flag?  Should it perhaps hold off the I/O until
> > the bypass flag is cleared?  Since at that level it has nothing to do
> > with cgroups I kinda don't think so but I'm still trying to get my head
> > around how all this stuff goes together.
> 
> The cgroup part isn't really relevant.  That's just because cgroup
> also uses bypassing to quiesce the queue when it's changing blkcg
> policies.  The thing relevant to the elveator is the check in
> __get_request().
> 
> Hmm.... the culprit is that we don't have any check in the dispatch
> path.  It doesn't really matter who's calling blk_peek_request(), that
> function is called as part of all request dispatches and should never
> crash.  I think the only reason we haven't noticed yet is because
> calling evelator dispatch_fn doesn't crash in most cases even if the
> elevator isn't in fully working condition.
> 
> Probably what we need is replacing blk_queue_dying() with
> blk_queue_bypass() test in __elv_next_request() before invoking the
> elevator dispatch function.

Replacing?  Or adding to?  Is BYPASS always set when DYING is set?  (My
guess is not but I haven't done an exhaustive analysis.)  So the
relevant code snippet in __elv_next_request() would be:
		if (unlikely(blk_queue_dying(q)) ||
		    unlikely(blk_queue_bypass(q)) ||
		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
			return NULL;

> Can you please reply w/ Jens and lkml cc'd with the original
> description and my response?  No reason to keep this private.

Sure, doing so.  I just wanted to make sure my understanding was correct
before bringing others into it.

> Thanks a lot for the report!

No problem.  Thanks for the quick response.  Sorry for my
three-day-weekend-delayed reply. :-)


Jens, others, my original email follows:

> Hey, Tejun.  I have a question about stuff going on in the block layer
> that's not quite as straightforward as one might wish.  We have a
> situation in which we need to use something other than the CFQ I/O
> scheduler so after we create a block device (iSCSI, as it happens) we
> switch it to use the deadline scheduler.  Unfortunately we've run into a
> crash when we do this in which a request completion (in some cases) or a
> sync I/O (in others) calls blk_peek_request() which in turn calls
> __elv_next_request() which in its turn calls
> q->elevator->type->ops.elevator_dispatch_fn(), which happens at the
> point of the call be deadline_dispatch_requests(), which promptly falls
> over because we're in the middle of the elevator switch and
> q->elevator->elevator_data is NULL.
> 
> After investigation, it's clear that the elevator switch code is trying
> to quiesce the request queue and sets the bypass flag.  Unfortunately,
> scsi_mode_sense() and friends call blk_execute_rq() directly, which pays
> no attention to said flag.
> 
> In fact, the bypass flag, as far as I can tell, is only checked in the
> blk_cgroup.c, in blkg_lookup() and blkg_lookup_create().
> 
> So my question is:  Is this a simple oversight?  Should blk_execute_rq()
> care about the bypass flag?  Should it perhaps hold off the I/O until
> the bypass flag is cleared?  Since at that level it has nothing to do
> with cgroups I kinda don't think so but I'm still trying to get my head
> around how all this stuff goes together.
> 
> Any hints would be _greatly_ appreciated.  Thanks!
-- 
Frank Mayhar
310-460-4042


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Crash in elevator_dispatch_fn() (e.g. deadline_dispatch()) when changing elevators.
  2014-01-21 15:58   ` Crash in elevator_dispatch_fn() (e.g. deadline_dispatch()) when changing elevators Frank Mayhar
@ 2014-01-22 15:46     ` Frank Mayhar
  2014-01-23 18:38       ` Frank Mayhar
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Mayhar @ 2014-01-22 15:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Jens Axboe

On Tue, 2014-01-21 at 07:58 -0800, Frank Mayhar wrote:
> Replacing?  Or adding to?  Is BYPASS always set when DYING is set?  (My
> guess is not but I haven't done an exhaustive analysis.)  So the
> relevant code snippet in __elv_next_request() would be:
> 		if (unlikely(blk_queue_dying(q)) ||
> 		    unlikely(blk_queue_bypass(q)) ||
> 		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
> 			return NULL;

FYI, I've made this change and tested it.  I can't say for certain that
it fixes the crash (since it's one of those races that's difficult to
reproduce), but it does seem to pass all the tests I've thrown at it so
far.
-- 
Frank Mayhar
310-460-4042


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Crash in elevator_dispatch_fn() (e.g. deadline_dispatch()) when changing elevators.
  2014-01-22 15:46     ` Frank Mayhar
@ 2014-01-23 18:38       ` Frank Mayhar
  2014-01-23 18:56         ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Mayhar @ 2014-01-23 18:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Jens Axboe

On Wed, 2014-01-22 at 07:46 -0800, Frank Mayhar wrote:
> On Tue, 2014-01-21 at 07:58 -0800, Frank Mayhar wrote:
> > Replacing?  Or adding to?  Is BYPASS always set when DYING is set?  (My
> > guess is not but I haven't done an exhaustive analysis.)  So the
> > relevant code snippet in __elv_next_request() would be:
> > 		if (unlikely(blk_queue_dying(q)) ||
> > 		    unlikely(blk_queue_bypass(q)) ||
> > 		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
> > 			return NULL;
> 
> FYI, I've made this change and tested it.  I can't say for certain that
> it fixes the crash (since it's one of those races that's difficult to
> reproduce), but it does seem to pass all the tests I've thrown at it so
> far.

Um, does anyone care about this?  Tejun?  Jens?  Anyone?

This is a real crash; it would be nice if someone would weigh in.
-- 
Frank Mayhar
310-460-4042


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Crash in elevator_dispatch_fn() (e.g. deadline_dispatch()) when changing elevators.
  2014-01-23 18:38       ` Frank Mayhar
@ 2014-01-23 18:56         ` Tejun Heo
  2014-01-23 21:14           ` Frank Mayhar
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2014-01-23 18:56 UTC (permalink / raw)
  To: Frank Mayhar; +Cc: linux-kernel, Jens Axboe

On Thu, Jan 23, 2014 at 10:38:33AM -0800, Frank Mayhar wrote:
> On Wed, 2014-01-22 at 07:46 -0800, Frank Mayhar wrote:
> > On Tue, 2014-01-21 at 07:58 -0800, Frank Mayhar wrote:
> > > Replacing?  Or adding to?  Is BYPASS always set when DYING is set?  (My
> > > guess is not but I haven't done an exhaustive analysis.)  So the
> > > relevant code snippet in __elv_next_request() would be:
> > > 		if (unlikely(blk_queue_dying(q)) ||
> > > 		    unlikely(blk_queue_bypass(q)) ||
> > > 		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
> > > 			return NULL;
> > 
> > FYI, I've made this change and tested it.  I can't say for certain that
> > it fixes the crash (since it's one of those races that's difficult to
> > reproduce), but it does seem to pass all the tests I've thrown at it so
> > far.
> 
> Um, does anyone care about this?  Tejun?  Jens?  Anyone?
> 
> This is a real crash; it would be nice if someone would weigh in.

Yeah, we're gonna fix this and I *think* replacing dying with bypass
is the right thing to do as a queue is always bypassing when killed.
It's probably just that we're in the earlier part of the merge window
and I have some other things on my plate.  Will post a patch in a
couple days.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Crash in elevator_dispatch_fn() (e.g. deadline_dispatch()) when changing elevators.
  2014-01-23 18:56         ` Tejun Heo
@ 2014-01-23 21:14           ` Frank Mayhar
  0 siblings, 0 replies; 5+ messages in thread
From: Frank Mayhar @ 2014-01-23 21:14 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Jens Axboe

On Thu, 2014-01-23 at 13:56 -0500, Tejun Heo wrote:
> On Thu, Jan 23, 2014 at 10:38:33AM -0800, Frank Mayhar wrote:
> > On Wed, 2014-01-22 at 07:46 -0800, Frank Mayhar wrote:
> > > On Tue, 2014-01-21 at 07:58 -0800, Frank Mayhar wrote:
> > > > Replacing?  Or adding to?  Is BYPASS always set when DYING is set?  (My
> > > > guess is not but I haven't done an exhaustive analysis.)  So the
> > > > relevant code snippet in __elv_next_request() would be:
> > > > 		if (unlikely(blk_queue_dying(q)) ||
> > > > 		    unlikely(blk_queue_bypass(q)) ||
> > > > 		    !q->elevator->type->ops.elevator_dispatch_fn(q, 0))
> > > > 			return NULL;
> > > 
> > > FYI, I've made this change and tested it.  I can't say for certain that
> > > it fixes the crash (since it's one of those races that's difficult to
> > > reproduce), but it does seem to pass all the tests I've thrown at it so
> > > far.
> > 
> > Um, does anyone care about this?  Tejun?  Jens?  Anyone?
> > 
> > This is a real crash; it would be nice if someone would weigh in.
> 
> Yeah, we're gonna fix this and I *think* replacing dying with bypass
> is the right thing to do as a queue is always bypassing when killed.
> It's probably just that we're in the earlier part of the merge window
> and I have some other things on my plate.  Will post a patch in a
> couple days.

Thanks!

For the record, I've seriously beaten on the change above (despite it
maybe being redundant, it seems to do the right thing) and have seen no
problems so far.  I've changed it to _just_ check bypass and will beat
on that now.  If you're correct that a dying queue always has bypass set
then I anticipate no problems.
-- 
Frank Mayhar
310-460-4042


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-01-23 21:14 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1389995976.20232.27.camel@bobble.lax.corp.google.com>
     [not found] ` <20140118143145.GD3640@htj.dyndns.org>
2014-01-21 15:58   ` Crash in elevator_dispatch_fn() (e.g. deadline_dispatch()) when changing elevators Frank Mayhar
2014-01-22 15:46     ` Frank Mayhar
2014-01-23 18:38       ` Frank Mayhar
2014-01-23 18:56         ` Tejun Heo
2014-01-23 21:14           ` Frank Mayhar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).