From: snitzer@redhat.com (Mike Snitzer)
Subject: [RFC PATCH] dm: fix excessive dm-mq context switching
Date: Fri, 5 Feb 2016 13:05:15 -0500
Message-ID: <20160205180515.GA25808@redhat.com>
In-Reply-To: <20160205151334.GA82754@redhat.com>

On Fri, Feb 05 2016 at 10:13am -0500,
Mike Snitzer <snitzer@redhat.com> wrote:
 
> Following is RFC because it really speaks to dm-mq _needing_ a variant
> of blk_mq_complete_request() that supports partial completions.  Not
> supporting partial completions really isn't an option for DM multipath.
> 
> From: Mike Snitzer <snitzer at redhat.com>
> Date: Fri, 5 Feb 2016 08:49:01 -0500
> Subject: [RFC PATCH] dm: fix excessive dm-mq context switching
> 
> Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower
> than if an underlying null_blk device were used directly.  The biggest
> reason for this drop in performance is that blk_insert_clone_request()
> was calling blk_mq_insert_request() with @async=true.  This forced the
> use of kblockd_schedule_delayed_work_on() to run the queues which
> ushered in ping-ponging between process context (fio in this case) and
> kblockd's kworker to submit the cloned request.  The ftrace
> function_graph tracer showed:
> 
>   kworker-2013  =>   fio-12190
>   fio-12190    =>  kworker-2013
>   ...
>   kworker-2013  =>   fio-12190
>   fio-12190    =>  kworker-2013
>   ...
> 
> Fixing blk_mq_insert_request() to _not_ use kblockd to submit the cloned
> requests isn't enough to eliminate the observed context switches.
> 
> In addition to this dm-mq specific blk-core fix, there were 2 DM core
> fixes to dm-mq that (when paired with the blk-core fix) completely
> eliminate the observed context switching:
> 
> 1)  don't blk_mq_run_hw_queues in blk-mq request completion
> 
>     Motivated by desire to reduce overhead of dm-mq, punting to kblockd
>     just increases context switches.
> 
>     In my testing against a really fast null_blk device there was no benefit
>     to running blk_mq_run_hw_queues() on completion (and no other blk-mq
>     driver does this).  So hopefully this change doesn't induce the need for
>     yet another revert like commit 621739b00e16ca2d !
> 
> 2)  use blk_mq_complete_request() in dm_complete_request()
> 
>     blk_complete_request() doesn't offer the traditional q->mq_ops vs
>     .request_fn branching pattern that other historic block interfaces
>     do (e.g. blk_get_request).  Using blk_mq_complete_request() for
>     blk-mq requests is important for performance but it doesn't handle
>     partial completions -- which is a pretty big problem given the
>     potential for partial completions with DM multipath due to path
>     failure(s).  As such this makes this entire patch only RFC-worthy.

> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index c683f6d..a618477 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1344,7 +1340,10 @@ static void dm_complete_request(struct request *rq, int error)
>  	struct dm_rq_target_io *tio = tio_from_request(rq);
>  
>  	tio->error = error;
> -	blk_complete_request(rq);
> +	if (!rq->q->mq_ops)
> +		blk_complete_request(rq);
> +	else
> +		blk_mq_complete_request(rq, rq->errors);
>  }
>  
>  /*

Looking closer, DM is very likely OK just using blk_mq_complete_request.

blk_complete_request() also doesn't provide native partial completion
support (it relies on the driver to do it, which DM core does):

/**
 * blk_complete_request - end I/O on a request
 * @req:      the request being processed
 *
 * Description:
 *     Ends all I/O on a request. It does not handle partial completions,
 *     unless the driver actually implements this in its completion callback
 *     through requeueing. The actual completion happens out-of-order,
 *     through a softirq handler. The user must have registered a completion
 *     callback through blk_queue_softirq_done().
 **/

blk_mq_complete_request() is effectively implemented in a comparable
fashion to blk_complete_request(), and DM core already provides partial
completion support itself: dm.c:end_clone_bio() triggers requeueing of
the request via dm-mpath.c:multipath_end_io()'s return of
DM_ENDIO_REQUEUE.
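
To make that reasoning concrete, here is a minimal userspace sketch (not
kernel code) of the pattern described above.  Every name in it (the
mock_* types and functions, and the byte counts in the usage note) is a
hypothetical stand-in invented for illustration.  It only models the
idea that neither completion helper knows about partial completions,
while the driver's own completion callback requeues a request that
finished short.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
enum mock_endio { MOCK_ENDIO_DONE, MOCK_ENDIO_REQUEUE };

struct mock_queue {
	bool has_mq_ops;		/* models q->mq_ops being set */
};

struct mock_request {
	const struct mock_queue *q;
	unsigned int total_bytes;	/* size of the original request */
	unsigned int done_bytes;	/* bytes transferred so far */
	unsigned int requeues;		/* times the driver requeued it */
	bool used_mq_path;		/* which completion helper ran */
};

/* Models the target's end_io hook: ask for a requeue on a short transfer. */
static enum mock_endio mock_end_io(const struct mock_request *rq)
{
	return rq->done_bytes < rq->total_bytes ?
		MOCK_ENDIO_REQUEUE : MOCK_ENDIO_DONE;
}

/*
 * Models the driver's completion callback.  This is the only place that
 * deals with partial completion; it returns false if the request was
 * requeued instead of finished.
 */
static bool mock_softirq_done(struct mock_request *rq)
{
	if (mock_end_io(rq) == MOCK_ENDIO_REQUEUE) {
		rq->requeues++;
		return false;
	}
	return true;
}

/* Both helpers just hand the request to the driver's callback. */
static bool mock_complete_legacy(struct mock_request *rq)
{
	rq->used_mq_path = false;
	return mock_softirq_done(rq);
}

static bool mock_complete_mq(struct mock_request *rq)
{
	rq->used_mq_path = true;
	return mock_softirq_done(rq);
}

/* Same branching shape as the patched dm_complete_request(). */
static bool mock_dm_complete_request(struct mock_request *rq)
{
	if (!rq->q->has_mq_ops)
		return mock_complete_legacy(rq);
	return mock_complete_mq(rq);
}

/*
 * Simulate a path failure: the first attempt transfers only first_bytes,
 * each retry (on a healthy path) transfers the rest.  Returns the number
 * of requeues the request needed.
 */
static unsigned int mock_run_with_path_failure(bool has_mq_ops,
					       unsigned int total,
					       unsigned int first_bytes)
{
	struct mock_queue q = { .has_mq_ops = has_mq_ops };
	struct mock_request rq = { .q = &q, .total_bytes = total };

	assert(first_bytes <= total);
	rq.done_bytes = first_bytes;
	while (!mock_dm_complete_request(&rq))
		rq.done_bytes = rq.total_bytes;	/* retry succeeds */
	assert(rq.used_mq_path == has_mq_ops);
	return rq.requeues;
}
```

In this model, calling mock_run_with_path_failure() with a short first
transfer (say 4096 of 8192 bytes) needs one requeue whether has_mq_ops
is true or false, which is the point: the requeue comes from the
driver's callback, not from the completion helper that was chosen.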

So I'm thinking I can drop the "RFC" for this patch and run with
it, once I get Jens' feedback (hopefully) confirming my understanding.

Jens, please advise.  If you're comfortable providing your Acked-by I
can get this fix in for 4.5-rc4 or so...

Thanks!

Mike

From: Mike Snitzer <snitzer@redhat.com>
To: axboe@kernel.dk, Hannes Reinecke <hare@suse.de>,
	Sagi Grimberg <sagig@dev.mellanox.co.il>,
	Christoph Hellwig <hch@infradead.org>
Cc: "keith.busch@intel.com" <keith.busch@intel.com>,
	linux-block@vger.kernel.org,
	device-mapper development <dm-devel@redhat.com>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Bart Van Assche <bart.vanassche@sandisk.com>