From: "Merla, ShivaKrishna" <ShivaKrishna.Merla@netapp.com>
To: Junichi Nomura <j-nomura@ce.jp.nec.com>,
	device-mapper development <dm-devel@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
	"jmoyer@redhat.com" <jmoyer@redhat.com>
Subject: Re: [PATCH 6/8] dm: don't start current request if it would've merged with the previous
Date: Mon, 9 Mar 2015 03:30:03 +0000	[thread overview]
Message-ID: <230f5fbd7d6c403ab81327a69a52361f@hioexcmbx01-prd.hq.netapp.com> (raw)
In-Reply-To: <54FCF1B2.8030007@ce.jp.nec.com>



> -----Original Message-----
> From: Junichi Nomura [mailto:j-nomura@ce.jp.nec.com]
> Sent: Sunday, March 08, 2015 8:05 PM
> To: device-mapper development; Mike Snitzer
> Cc: axboe@kernel.dk; jmoyer@redhat.com; Hannes Reinecke; Merla,
> ShivaKrishna
> Subject: Re: [dm-devel] [PATCH 6/8] dm: don't start current request if it
> would've merged with the previous
> 
> On 03/04/15 09:47, Mike Snitzer wrote:
> > Request-based DM's dm_request_fn() is so fast to pull requests off the
> > queue that steps need to be taken to promote merging by avoiding
> request
> > processing if it makes sense.
> >
> > If the current request would've merged with previous request let the
> > current request stay on the queue longer.
> 
> Hi Mike,
> 
> Looking at this thread, I think there are two different problems mixed together.
> 
> Firstly, "/dev/skd" is the STEC S1120 block driver, which doesn't have an
> lld_busy function, so back pressure doesn't propagate to the request-based
> dm device and dm feeds as many requests as possible to the lower driver
> (the "pulling too fast" situation).
> If you still have access to the device, can you try a patch like
> the attached one?
> 
> Secondly, for this comment from Merla ShivaKrishna:
> 
> > Yes, indeed this is the exact issue we saw at NetApp. While running
> > sequential 4K write I/O with a large thread count, 2 paths yield better
> > performance than 4 paths, and performance drops drastically with 4 paths.
> > The device queue_depth was 32, and with blktrace we could see better I/O
> > merging happening; the average request size was > 8K per iostat. With
> > 4 paths none of the I/O gets merged and the average request size is
> > always 4K. The scheduler used was noop, as we are using SSD-based
> > storage. We could get I/O merging to happen even with 4 paths, but only
> > with a lower device queue_depth of 16. Even then the performance was
> > lacking compared to 2 paths.
> 
> Have you tried increasing nr_requests of the dm device?
> E.g. setting nr_requests to 256.
> 
> 4 paths with a queue depth of 32 each means 128 I/Os can be in flight.
> With the default nr_requests of 128, the request queue is almost always
> empty and I/O merging cannot happen.
> Increasing nr_requests of the dm device allows more requests to be
> queued, so the chance of merging may increase.
> Reducing the lower device queue depth could be another solution, but if
> the depth is too low, you might not be able to keep the optimal speed.
>
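Nomura's arithmetic above can be put into a toy userspace model (illustrative only; the function names are made up and this is not kernel code):

```c
/* Total requests the lower layers can absorb before back pressure. */
int inflight_capacity(int paths, int per_path_depth)
{
	return paths * per_path_depth;
}

/*
 * Merging needs requests to sit on the dm queue. That only happens for
 * requests beyond what the paths can take in flight, bounded by
 * nr_requests: the "window" below is how many requests can accumulate.
 */
int merge_window(int paths, int per_path_depth, int nr_requests)
{
	int excess = nr_requests - inflight_capacity(paths, per_path_depth);

	return excess > 0 ? excess : 0;
}
```

With 4 paths at queue depth 32 and the default nr_requests of 128 the window is 0, so the queue stays empty and nothing merges; raising nr_requests to 256 leaves room for 128 queued requests, and 2 paths at the default already leave room for 64, matching the better merging seen with 2 paths.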
Yes, we have tried this as well but it didn't help. Indeed, we also tested with a
queue_depth of 16 on each path (64 I/Os in flight) and hit the same issue. We did try
reducing the queue_depth with 4 paths, but couldn't achieve performance comparable
to 2 paths. With Mike's patch, we see a tremendous improvement with just a small delay
of ~20us with 4 paths. This might vary with different configurations, but it has certainly
shown that a tunable to delay dispatches for sequential workloads helps a lot.
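For reference, the merge heuristic being discussed can be modelled in userspace roughly like this (a sketch with made-up names; the actual patch tracks the previous request's position and direction on the mapped_device and applies the configurable dispatch deadline):

```c
struct last_rq {
	unsigned long long end_sector;	/* first sector past the previous request */
	int write;			/* direction of the previous request */
	int valid;			/* have we seen a request yet? */
};

/*
 * Nonzero if the new request starts exactly where the previous one ended,
 * in the same direction, i.e. it would have merged had the previous one
 * still been on the queue. The dispatcher can then hold it briefly (e.g.
 * the ~20us delay mentioned above) instead of starting it immediately.
 */
int wouldve_merged(const struct last_rq *last, unsigned long long pos, int write)
{
	return last->valid && last->write == write && last->end_sector == pos;
}
```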

 
> ----
> Jun'ichi Nomura, NEC Corporation
> 
> 
> [PATCH] skd: Add lld_busy function for request-based stacking driver
> 
> diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c
> index 1e46eb2..0e8f466 100644
> --- a/drivers/block/skd_main.c
> +++ b/drivers/block/skd_main.c
> @@ -565,6 +565,16 @@ skd_prep_discard_cdb(struct skd_scsi_request
> *scsi_req,
>  	blk_add_request_payload(req, page, len);
>  }
> 
> +static int skd_lld_busy(struct request_queue *q)
> +{
> +	struct skd_device *skdev = q->queuedata;
> +
> +	if (skdev->in_flight >= skdev->cur_max_queue_depth)
> +		return 1;
> +
> +	return 0;
> +}
> +
>  static void skd_request_fn_not_online(struct request_queue *q);
> 
>  static void skd_request_fn(struct request_queue *q)
> @@ -4419,6 +4429,9 @@ static int skd_cons_disk(struct skd_device *skdev)
>  	/* set sysfs optimal_io_size to 8K */
>  	blk_queue_io_opt(q, 8192);
> 
> +	/* register feedback function for stacking driver */
> +	blk_queue_lld_busy(q, skd_lld_busy);
> +
>  	/* DISCARD Flag initialization. */
>  	q->limits.discard_granularity = 8192;
>  	q->limits.discard_alignment = 0;
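To see how the callback registered above gets consumed, here is a minimal userspace mock of the lld_busy plumbing (simplified stand-in types, not the real block-layer structures): the stacking driver stops pulling requests once the lower queue reports busy, so requests linger where they can merge.

```c
struct mock_queue {
	int in_flight;
	int max_depth;
	int (*lld_busy_fn)(struct mock_queue *q);
};

/* Lower-driver callback, mirroring skd_lld_busy() in the patch above. */
int mock_skd_busy(struct mock_queue *q)
{
	return q->in_flight >= q->max_depth;
}

/*
 * Stacking-driver side: dispatch until the callback reports busy.
 * Back pressure now propagates instead of dm "pulling too fast".
 */
int dispatch(struct mock_queue *q, int nr_requests)
{
	int sent = 0;

	while (sent < nr_requests && !q->lld_busy_fn(q)) {
		q->in_flight++;
		sent++;
	}
	return sent;
}
```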

Thread overview: 19+ messages
2015-03-04  0:47 [PATCH 0/8] dm: optimize request-based queue processing Mike Snitzer
2015-03-04  0:47 ` [PATCH 1/8] dm: remove unnecessary wrapper around blk_lld_busy Mike Snitzer
2015-03-04  0:47 ` [PATCH 2/8] dm: remove request-based DM queue's lld_busy_fn hook Mike Snitzer
2015-03-04  0:47 ` [PATCH 3/8] dm: remove request-based logic from make_request_fn wrapper Mike Snitzer
2015-03-04  0:47 ` [PATCH 4/8] dm: only run the queue on completion if congested or no requests pending Mike Snitzer
2015-03-04  0:47 ` [PATCH 5/8] dm: don't schedule delayed run of the queue if nothing to do Mike Snitzer
2015-03-04  0:47 ` [PATCH 6/8] dm: don't start current request if it would've merged with the previous Mike Snitzer
2015-03-04  6:36   ` Hannes Reinecke
2015-03-04 17:26     ` Mike Snitzer
2015-03-09  1:04   ` Junichi Nomura
2015-03-09  3:30     ` Merla, ShivaKrishna [this message]
2015-03-09  6:09       ` Junichi Nomura
2015-03-09 16:10         ` Merla, ShivaKrishna
2015-03-10  1:05           ` Junichi Nomura
2015-03-10  1:59             ` Merla, ShivaKrishna
2015-03-10  5:43               ` Junichi Nomura
2015-03-04  0:47 ` [PATCH 7/8] dm sysfs: introduce ability to add writable attributes Mike Snitzer
2015-03-04  0:47 ` [PATCH 8/8] dm: impose configurable deadline for dm_request_fn's merge heuristic Mike Snitzer
2015-03-06 15:30   ` [PATCH 8/8 v2] " Mike Snitzer
