linux-scsi.vger.kernel.org archive mirror
From: Ming Lei <ming.lei@redhat.com>
To: linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	"Ewan D . Milne" <emilne@redhat.com>,
	Omar Sandoval <osandov@fb.com>,
	James Bottomley <james.bottomley@hansenpartnership.com>,
	Christoph Hellwig <hch@lst.de>,
	Kashyap Desai <kashyap.desai@broadcom.com>,
	Hannes Reinecke <hare@suse.de>,
	Laurence Oberman <loberman@redhat.com>,
	Bart Van Assche <bart.vanassche@wdc.com>
Subject: [RFC PATCH V4 2/2] scsi: core: don't limit per-LUN queue depth for SSD
Date: Wed,  9 Oct 2019 17:32:41 +0800
Message-ID: <20191009093241.21481-3-ming.lei@redhat.com>
In-Reply-To: <20191009093241.21481-1-ming.lei@redhat.com>

The SCSI core uses the atomic variable sdev->device_busy to track
in-flight IO requests dispatched to a SCSI device. IO requests may be
submitted from any CPU, so maintaining this shared atomic counter can
be very expensive on large NUMA machines with many CPU cores.

sdev->queue_depth is usually used for two purposes: 1) to improve IO
merging; 2) to spread IO requests fairly among all LUNs.

blk-mq already provides fair request allocation among all active shared
request queues (LUNs); see hctx_may_queue().
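
As a rough illustration of that fairness rule, here is a user-space
sketch modelled on how hctx_may_queue() appears to work around this
kernel version: each active queue gets about an equal share of the
shared tag depth, with a small floor. The struct and function names
below are hypothetical and are not kernel API; the floor of 4 tags is
taken from my reading of the kernel code and should be treated as an
assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state of a shared tag set. */
struct tag_set_model {
	unsigned int depth;          /* total tags in the shared set */
	unsigned int active_queues;  /* request queues (LUNs) currently doing IO */
};

/* Per-queue share: ceil(depth / users), but never fewer than 4 tags. */
static unsigned int fair_share_depth(unsigned int depth, unsigned int users)
{
	unsigned int share;

	assert(users > 0);  /* callers handle the "no active users" case */
	share = (depth + users - 1) / users;
	return share < 4 ? 4 : share;
}

/* True if a queue with nr_active in-flight requests may take another tag. */
static bool may_queue_model(const struct tag_set_model *ts,
			    unsigned int nr_active)
{
	/* Nothing to divide, or nobody to share with: always allow. */
	if (ts->depth == 1 || ts->active_queues == 0)
		return true;

	return nr_active < fair_share_depth(ts->depth, ts->active_queues);
}
```

With a shared depth of 256 and 4 active LUNs, each LUN is limited to 64
in-flight requests in this model, which is the kind of fairness that
sdev->queue_depth otherwise provides.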

NVMe has no such per-request-queue (namespace) queue depth, so it is
reasonable to ignore the limit for SCSI SSDs too. IO merging also does
not play a big role in reaching top SSD performance.

With this patch, the high cost of tracking in-flight per-LUN requests
via an atomic variable is avoided for non-rotational devices.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: James Bottomley <james.bottomley@hansenpartnership.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b6f66dcb15a5..b8f0898a15e4 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -354,7 +354,8 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
 	if (starget->can_queue > 0)
 		atomic_dec(&starget->target_busy);
 
-	atomic_dec(&sdev->device_busy);
+	if (!blk_queue_nonrot(sdev->request_queue))
+		atomic_dec(&sdev->device_busy);
 }
 
 static void scsi_kick_queue(struct request_queue *q)
@@ -410,7 +411,8 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 
 static inline bool scsi_device_is_busy(struct scsi_device *sdev)
 {
-	if (atomic_read(&sdev->device_busy) >= sdev->queue_depth)
+	if (!blk_queue_nonrot(sdev->request_queue) &&
+			atomic_read(&sdev->device_busy) >= sdev->queue_depth)
 		return true;
 	if (atomic_read(&sdev->device_blocked) > 0)
 		return true;
@@ -1283,8 +1285,12 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
 				  struct scsi_device *sdev)
 {
 	unsigned int busy;
+	bool bypass = blk_queue_nonrot(sdev->request_queue);
 
-	busy = atomic_inc_return(&sdev->device_busy) - 1;
+	if (!bypass)
+		busy = atomic_inc_return(&sdev->device_busy) - 1;
+	else
+		busy = 0;
 	if (atomic_read(&sdev->device_blocked)) {
 		if (busy)
 			goto out_dec;
@@ -1298,12 +1304,16 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
 				   "unblocking device at zero depth\n"));
 	}
 
+	if (bypass)
+		return 1;
+
 	if (busy >= sdev->queue_depth)
 		goto out_dec;
 
 	return 1;
 out_dec:
-	atomic_dec(&sdev->device_busy);
+	if (!bypass)
+		atomic_dec(&sdev->device_busy);
 	return 0;
 }
 
@@ -1624,7 +1634,8 @@ static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx)
 	struct request_queue *q = hctx->queue;
 	struct scsi_device *sdev = q->queuedata;
 
-	atomic_dec(&sdev->device_busy);
+	if (!blk_queue_nonrot(sdev->request_queue))
+		atomic_dec(&sdev->device_busy);
 }
 
 static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
@@ -1706,7 +1717,8 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	case BLK_STS_OK:
 		break;
 	case BLK_STS_RESOURCE:
-		if (atomic_read(&sdev->device_busy) ||
+		if ((!blk_queue_nonrot(sdev->request_queue) &&
+		     atomic_read(&sdev->device_busy)) ||
 		    scsi_device_blocked(sdev))
 			ret = BLK_STS_DEV_RESOURCE;
 		break;
-- 
2.20.1
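
For readers who want to experiment with the idea outside the kernel,
the following is a minimal user-space sketch of the budget logic the
diff above introduces. It is not the kernel code: struct sdev_model and
both helpers are hypothetical names, the nonrot flag stands in for
blk_queue_nonrot(sdev->request_queue), C11 atomics replace the kernel's
atomic_t, and the device_blocked handling is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the struct scsi_device fields used here. */
struct sdev_model {
	atomic_int device_busy;  /* in-flight count, rotational media only */
	int queue_depth;         /* per-LUN limit */
	bool nonrot;             /* models blk_queue_nonrot() */
};

/* Take a slot; returns true if the request may be dispatched. */
static bool sdev_get_budget(struct sdev_model *sdev)
{
	int busy;

	/* SSD: skip the shared counter, so no cache line bouncing. */
	if (sdev->nonrot)
		return true;

	/* atomic_fetch_add() returns the value before the increment. */
	busy = atomic_fetch_add(&sdev->device_busy, 1);
	if (busy >= sdev->queue_depth) {
		/* Over the limit: undo the increment and refuse. */
		atomic_fetch_sub(&sdev->device_busy, 1);
		return false;
	}
	return true;
}

/* Release a slot taken by a successful sdev_get_budget(). */
static void sdev_put_budget(struct sdev_model *sdev)
{
	if (!sdev->nonrot) {
		int prev = atomic_fetch_sub(&sdev->device_busy, 1);

		/* An unbalanced put would underflow the counter. */
		assert(prev > 0);
	}
}
```

In this model a rotational device with queue_depth 2 accepts two
requests and rejects the third, while a non-rotational device accepts
any number and never modifies device_busy.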


Thread overview: 24+ messages
2019-10-09  9:32 [PATCH V4 0/2] scsi: avoid atomic operations in IO path Ming Lei
2019-10-09  9:32 ` [PATCH V4 1/2] scsi: core: avoid host-wide host_busy counter for scsi_mq Ming Lei
2019-10-09 16:14   ` Bart Van Assche
2019-10-23  8:52   ` John Garry
2019-10-24  0:58     ` Ming Lei
2019-10-24  9:19       ` John Garry
2019-10-24 21:24         ` Ming Lei
2019-10-25  8:58           ` John Garry
2019-10-25  9:43             ` Ming Lei
2019-10-25 10:13               ` John Garry
2019-10-25 21:53                 ` Ming Lei
2019-10-28  9:42                   ` John Garry
2019-10-09  9:32 ` Ming Lei [this message]
2019-10-09 16:05   ` [RFC PATCH V4 2/2] scsi: core: don't limit per-LUN queue depth for SSD Bart Van Assche
2019-10-10  0:43     ` Ming Lei
2019-10-17 18:30     ` Kashyap Desai
2019-10-23  1:28       ` Ming Lei
2019-10-23  7:46         ` Kashyap Desai
2019-10-24  1:09           ` Ming Lei
2019-10-25 10:04             ` Kashyap Desai
2019-10-25 21:58               ` Ming Lei
2019-11-04  9:30                 ` Kashyap Desai
2019-11-05  0:23                   ` Ming Lei
2019-10-23  0:30   ` [scsi] cc2f854c79: suspend_stress.fail kernel test robot
