kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] scsi: add support for request batching
@ 2019-05-30 11:28 Paolo Bonzini
  2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Paolo Bonzini @ 2019-05-30 11:28 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jejb, martin.petersen, linux-scsi, stefanha

This allows a list of requests to be issued, with the LLD only writing
the hardware doorbell when necessary, after the last request was prepared.
This is more efficient if we have lists of requests to issue, particularly
on virtualized hardware, where writing the doorbell is more expensive than
on real hardware.

This applies to any HBA, either singlequeue or multiqueue; the second
patch implements it for virtio-scsi.

Paolo

Paolo Bonzini (2):
  scsi_host: add support for request batching
  virtio_scsi: implement request batching

 drivers/scsi/scsi_lib.c    | 37 ++++++++++++++++++++++---
 drivers/scsi/virtio_scsi.c | 55 +++++++++++++++++++++++++++-----------
 include/scsi/scsi_cmnd.h   |  1 +
 include/scsi/scsi_host.h   | 16 +++++++++--
 4 files changed, 89 insertions(+), 20 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 11:28 [PATCH 0/2] scsi: add support for request batching Paolo Bonzini
@ 2019-05-30 11:28 ` Paolo Bonzini
  2019-05-30 15:36   ` Bart Van Assche
                     ` (3 more replies)
  2019-05-30 11:28 ` [PATCH 2/2] virtio_scsi: implement " Paolo Bonzini
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 30+ messages in thread
From: Paolo Bonzini @ 2019-05-30 11:28 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jejb, martin.petersen, linux-scsi, stefanha

This allows a list of requests to be issued, with the LLD only writing
the hardware doorbell when necessary, after the last request was prepared.
This is more efficient if we have lists of requests to issue, particularly
on virtualized hardware, where writing the doorbell is more expensive than
on real hardware.

The use case for this is plugged IO, where blk-mq flushes a batch of
requests all at once.

The API is the same as for blk-mq, just with blk-mq concepts tweaked to
fit the SCSI subsystem API: the "last" flag in blk_mq_queue_data becomes
a flag in scsi_cmnd, while the queue_num in the commit_rqs callback is
extracted from the hctx and passed as a parameter.

The only complication is that blk-mq uses different plugging heuristics
depending on whether commit_rqs is present or not.  So we have two
different sets of blk_mq_ops and pick one depending on whether the
scsi_host template uses commit_rqs or not.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/scsi/scsi_lib.c  | 37 ++++++++++++++++++++++++++++++++++---
 include/scsi/scsi_cmnd.h |  1 +
 include/scsi/scsi_host.h | 16 ++++++++++++++--
 3 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 601b9f1de267..eb4e67d02bfe 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1673,10 +1673,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 		blk_mq_start_request(req);
 	}
 
+	cmd->flags &= SCMD_PRESERVED_FLAGS;
 	if (sdev->simple_tags)
 		cmd->flags |= SCMD_TAGGED;
-	else
-		cmd->flags &= ~SCMD_TAGGED;
+	if (bd->last)
+		cmd->flags |= SCMD_LAST;
 
 	scsi_init_cmd_errh(cmd);
 	cmd->scsi_done = scsi_mq_done;
@@ -1807,10 +1808,37 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(__scsi_init_queue);
 
+static const struct blk_mq_ops scsi_mq_ops_no_commit = {
+	.get_budget	= scsi_mq_get_budget,
+	.put_budget	= scsi_mq_put_budget,
+	.queue_rq	= scsi_queue_rq,
+	.complete	= scsi_softirq_done,
+	.timeout	= scsi_timeout,
+#ifdef CONFIG_BLK_DEBUG_FS
+	.show_rq	= scsi_show_rq,
+#endif
+	.init_request	= scsi_mq_init_request,
+	.exit_request	= scsi_mq_exit_request,
+	.initialize_rq_fn = scsi_initialize_rq,
+	.busy		= scsi_mq_lld_busy,
+	.map_queues	= scsi_map_queues,
+};
+
+
+static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
+{
+	struct request_queue *q = hctx->queue;
+	struct scsi_device *sdev = q->queuedata;
+	struct Scsi_Host *shost = sdev->host;
+
+	shost->hostt->commit_rqs(shost, hctx->queue_num);
+}
+
 static const struct blk_mq_ops scsi_mq_ops = {
 	.get_budget	= scsi_mq_get_budget,
 	.put_budget	= scsi_mq_put_budget,
 	.queue_rq	= scsi_queue_rq,
+	.commit_rqs	= scsi_commit_rqs,
 	.complete	= scsi_softirq_done,
 	.timeout	= scsi_timeout,
 #ifdef CONFIG_BLK_DEBUG_FS
@@ -1845,7 +1873,10 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost)
 		cmd_size += sizeof(struct scsi_data_buffer) + sgl_size;
 
 	memset(&shost->tag_set, 0, sizeof(shost->tag_set));
-	shost->tag_set.ops = &scsi_mq_ops;
+	if (shost->hostt->commit_rqs)
+		shost->tag_set.ops = &scsi_mq_ops;
+	else
+		shost->tag_set.ops = &scsi_mq_ops_no_commit;
 	shost->tag_set.nr_hw_queues = shost->nr_hw_queues ? : 1;
 	shost->tag_set.queue_depth = shost->can_queue;
 	shost->tag_set.cmd_size = cmd_size;
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 76ed5e4acd38..91bd749a02f7 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -57,6 +57,7 @@ struct scsi_pointer {
 #define SCMD_TAGGED		(1 << 0)
 #define SCMD_UNCHECKED_ISA_DMA	(1 << 1)
 #define SCMD_INITIALIZED	(1 << 2)
+#define SCMD_LAST		(1 << 3)
 /* flags preserved across unprep / reprep */
 #define SCMD_PRESERVED_FLAGS	(SCMD_UNCHECKED_ISA_DMA | SCMD_INITIALIZED)
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 2b539a1b3f62..28f1c9177cd2 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -80,8 +80,10 @@ struct scsi_host_template {
 	 * command block to the LLDD.  When the driver finished
 	 * processing the command the done callback is invoked.
 	 *
-	 * If queuecommand returns 0, then the HBA has accepted the
-	 * command.  The done() function must be called on the command
+	 * If queuecommand returns 0, then the driver has accepted the
+	 * command.  It must also push it to the HBA if the scsi_cmnd
+	 * flag SCMD_LAST is set, or if the driver does not implement
+	 * commit_rqs.  The done() function must be called on the command
 	 * when the driver has finished with it. (you may call done on the
 	 * command before queuecommand returns, but in this case you
 	 * *must* return 0 from queuecommand).
@@ -109,6 +111,16 @@ struct scsi_host_template {
 	 */
 	int (* queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);
 
+	/*
+	 * The commit_rqs function is used to trigger a hardware
+	 * doorbell after some requests have been queued with
+	 * queuecommand, when an error is encountered before sending
+	 * the request with SCMD_LAST set.
+	 *
+	 * STATUS: OPTIONAL
+	 */
+	void (*commit_rqs)(struct Scsi_Host *, u16);
+
 	/*
 	 * This is an error handling strategy routine.  You don't need to
 	 * define one of these if you don't want to - there is a default
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/2] virtio_scsi: implement request batching
  2019-05-30 11:28 [PATCH 0/2] scsi: add support for request batching Paolo Bonzini
  2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
@ 2019-05-30 11:28 ` Paolo Bonzini
  2019-05-30 17:28   ` Bart Van Assche
                     ` (2 more replies)
  2019-06-10 12:36 ` [PATCH 0/2] scsi: add support for " Stefan Hajnoczi
  2019-06-26 13:51 ` Paolo Bonzini
  3 siblings, 3 replies; 30+ messages in thread
From: Paolo Bonzini @ 2019-05-30 11:28 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jejb, martin.petersen, linux-scsi, stefanha

Adding the command and kicking the virtqueue so far was done one after
another.  Make the kick optional, so that we can take into account SCMD_LAST.
We also need a commit_rqs callback to kick the device if blk-mq aborts
the submission before the last request is reached.

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 55 +++++++++++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 8af01777d09c..918c811cea95 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -375,14 +375,7 @@ static void virtscsi_event_done(struct virtqueue *vq)
 	virtscsi_vq_done(vscsi, &vscsi->event_vq, virtscsi_complete_event);
 };
 
-/**
- * virtscsi_add_cmd - add a virtio_scsi_cmd to a virtqueue
- * @vq		: the struct virtqueue we're talking about
- * @cmd		: command structure
- * @req_size	: size of the request buffer
- * @resp_size	: size of the response buffer
- */
-static int virtscsi_add_cmd(struct virtqueue *vq,
+static int __virtscsi_add_cmd(struct virtqueue *vq,
 			    struct virtio_scsi_cmd *cmd,
 			    size_t req_size, size_t resp_size)
 {
@@ -427,17 +420,39 @@ static int virtscsi_add_cmd(struct virtqueue *vq,
 	return virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
 }
 
-static int virtscsi_kick_cmd(struct virtio_scsi_vq *vq,
+static void virtscsi_kick_vq(struct virtio_scsi_vq *vq)
+{
+	bool needs_kick;
+	unsigned long flags;
+
+	spin_lock_irqsave(&vq->vq_lock, flags);
+	needs_kick = virtqueue_kick_prepare(vq->vq);
+	spin_unlock_irqrestore(&vq->vq_lock, flags);
+
+	if (needs_kick)
+		virtqueue_notify(vq->vq);
+}
+
+/**
+ * virtscsi_add_cmd - add a virtio_scsi_cmd to a virtqueue, optionally kick it
+ * @vq		: the struct virtqueue we're talking about
+ * @cmd		: command structure
+ * @req_size	: size of the request buffer
+ * @resp_size	: size of the response buffer
+ * @kick	: whether to kick the virtqueue immediately
+ */
+static int virtscsi_add_cmd(struct virtio_scsi_vq *vq,
 			     struct virtio_scsi_cmd *cmd,
-			     size_t req_size, size_t resp_size)
+			     size_t req_size, size_t resp_size,
+			     bool kick)
 {
 	unsigned long flags;
 	int err;
 	bool needs_kick = false;
 
 	spin_lock_irqsave(&vq->vq_lock, flags);
-	err = virtscsi_add_cmd(vq->vq, cmd, req_size, resp_size);
-	if (!err)
+	err = __virtscsi_add_cmd(vq->vq, cmd, req_size, resp_size);
+	if (!err && kick)
 		needs_kick = virtqueue_kick_prepare(vq->vq);
 
 	spin_unlock_irqrestore(&vq->vq_lock, flags);
@@ -502,6 +517,7 @@ static int virtscsi_queuecommand(struct Scsi_Host *shost,
 	struct virtio_scsi *vscsi = shost_priv(shost);
 	struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
 	struct virtio_scsi_cmd *cmd = scsi_cmd_priv(sc);
+	bool kick;
 	unsigned long flags;
 	int req_size;
 	int ret;
@@ -531,7 +547,8 @@ static int virtscsi_queuecommand(struct Scsi_Host *shost,
 		req_size = sizeof(cmd->req.cmd);
 	}
 
-	ret = virtscsi_kick_cmd(req_vq, cmd, req_size, sizeof(cmd->resp.cmd));
+	kick = (sc->flags & SCMD_LAST) != 0;
+	ret = virtscsi_add_cmd(req_vq, cmd, req_size, sizeof(cmd->resp.cmd), kick);
 	if (ret == -EIO) {
 		cmd->resp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
 		spin_lock_irqsave(&req_vq->vq_lock, flags);
@@ -549,8 +566,8 @@ static int virtscsi_tmf(struct virtio_scsi *vscsi, struct virtio_scsi_cmd *cmd)
 	int ret = FAILED;
 
 	cmd->comp = &comp;
-	if (virtscsi_kick_cmd(&vscsi->ctrl_vq, cmd,
-			      sizeof cmd->req.tmf, sizeof cmd->resp.tmf) < 0)
+	if (virtscsi_add_cmd(&vscsi->ctrl_vq, cmd,
+			      sizeof cmd->req.tmf, sizeof cmd->resp.tmf, true) < 0)
 		goto out;
 
 	wait_for_completion(&comp);
@@ -664,6 +681,13 @@ static int virtscsi_map_queues(struct Scsi_Host *shost)
 	return blk_mq_virtio_map_queues(qmap, vscsi->vdev, 2);
 }
 
+static void virtscsi_commit_rqs(struct Scsi_Host *shost, u16 hwq)
+{
+	struct virtio_scsi *vscsi = shost_priv(shost);
+
+	virtscsi_kick_vq(&vscsi->req_vqs[hwq]);
+}
+
 /*
  * The host guarantees to respond to each command, although I/O
  * latencies might be higher than on bare metal.  Reset the timer
@@ -681,6 +705,7 @@ static struct scsi_host_template virtscsi_host_template = {
 	.this_id = -1,
 	.cmd_size = sizeof(struct virtio_scsi_cmd),
 	.queuecommand = virtscsi_queuecommand,
+	.commit_rqs = virtscsi_commit_rqs,
 	.change_queue_depth = virtscsi_change_queue_depth,
 	.eh_abort_handler = virtscsi_abort,
 	.eh_device_reset_handler = virtscsi_device_reset,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
@ 2019-05-30 15:36   ` Bart Van Assche
  2019-05-30 15:54     ` Paolo Bonzini
  2019-05-31  3:27   ` Ming Lei
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: Bart Van Assche @ 2019-05-30 15:36 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 5/30/19 4:28 AM, Paolo Bonzini wrote:
> +static const struct blk_mq_ops scsi_mq_ops_no_commit = {
> +	.get_budget	= scsi_mq_get_budget,
> +	.put_budget	= scsi_mq_put_budget,
> +	.queue_rq	= scsi_queue_rq,
> +	.complete	= scsi_softirq_done,
> +	.timeout	= scsi_timeout,
> +#ifdef CONFIG_BLK_DEBUG_FS
> +	.show_rq	= scsi_show_rq,
> +#endif
> +	.init_request	= scsi_mq_init_request,
> +	.exit_request	= scsi_mq_exit_request,
> +	.initialize_rq_fn = scsi_initialize_rq,
> +	.busy		= scsi_mq_lld_busy,
> +	.map_queues	= scsi_map_queues,
> +};
> +
> +static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
> +{
> +	struct request_queue *q = hctx->queue;
> +	struct scsi_device *sdev = q->queuedata;
> +	struct Scsi_Host *shost = sdev->host;
> +
> +	shost->hostt->commit_rqs(shost, hctx->queue_num);
> +}
> +
>   static const struct blk_mq_ops scsi_mq_ops = {
>   	.get_budget	= scsi_mq_get_budget,
>   	.put_budget	= scsi_mq_put_budget,
>   	.queue_rq	= scsi_queue_rq,
> +	.commit_rqs	= scsi_commit_rqs,
>   	.complete	= scsi_softirq_done,
>   	.timeout	= scsi_timeout,
>   #ifdef CONFIG_BLK_DEBUG_FS

Hi Paolo,

Have you considered to modify the block layer such that a single 
scsi_mq_ops structure can be used for all SCSI LLD types?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 15:36   ` Bart Van Assche
@ 2019-05-30 15:54     ` Paolo Bonzini
  2019-05-30 17:54       ` Bart Van Assche
  0 siblings, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2019-05-30 15:54 UTC (permalink / raw)
  To: Bart Van Assche, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 30/05/19 17:36, Bart Van Assche wrote:
> On 5/30/19 4:28 AM, Paolo Bonzini wrote:
>> +static const struct blk_mq_ops scsi_mq_ops_no_commit = {
>> +    .get_budget    = scsi_mq_get_budget,
>> +    .put_budget    = scsi_mq_put_budget,
>> +    .queue_rq    = scsi_queue_rq,
>> +    .complete    = scsi_softirq_done,
>> +    .timeout    = scsi_timeout,
>> +#ifdef CONFIG_BLK_DEBUG_FS
>> +    .show_rq    = scsi_show_rq,
>> +#endif
>> +    .init_request    = scsi_mq_init_request,
>> +    .exit_request    = scsi_mq_exit_request,
>> +    .initialize_rq_fn = scsi_initialize_rq,
>> +    .busy        = scsi_mq_lld_busy,
>> +    .map_queues    = scsi_map_queues,
>> +};
>> +
>> +static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
>> +{
>> +    struct request_queue *q = hctx->queue;
>> +    struct scsi_device *sdev = q->queuedata;
>> +    struct Scsi_Host *shost = sdev->host;
>> +
>> +    shost->hostt->commit_rqs(shost, hctx->queue_num);
>> +}
>> +
>>   static const struct blk_mq_ops scsi_mq_ops = {
>>       .get_budget    = scsi_mq_get_budget,
>>       .put_budget    = scsi_mq_put_budget,
>>       .queue_rq    = scsi_queue_rq,
>> +    .commit_rqs    = scsi_commit_rqs,
>>       .complete    = scsi_softirq_done,
>>       .timeout    = scsi_timeout,
>>   #ifdef CONFIG_BLK_DEBUG_FS
> 
> Hi Paolo,
> 
> Have you considered to modify the block layer such that a single
> scsi_mq_ops structure can be used for all SCSI LLD types?

Yes, but I don't think it's possible to do it in a nice way.
Any adjustment we make to the block layer to fit the SCSI subsystem's
desires would make all other block drivers uglier, so I chose to confine
the ugliness here.

The root issue is that the SCSI subsystem is unique in how it sits on
top of the block layer; this is the famous "adapter" (or "midlayer",
though that is confusing when talking about SCSI) design that Linux
usually tries to avoid.

Paolo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] virtio_scsi: implement request batching
  2019-05-30 11:28 ` [PATCH 2/2] virtio_scsi: implement " Paolo Bonzini
@ 2019-05-30 17:28   ` Bart Van Assche
  2019-05-31  9:03     ` Paolo Bonzini
  2019-07-05  7:52   ` Hannes Reinecke
  2019-07-08  9:47   ` Ming Lei
  2 siblings, 1 reply; 30+ messages in thread
From: Bart Van Assche @ 2019-05-30 17:28 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 5/30/19 4:28 AM, Paolo Bonzini wrote:
> @@ -531,7 +547,8 @@ static int virtscsi_queuecommand(struct Scsi_Host *shost,
>   		req_size = sizeof(cmd->req.cmd);
>   	}
>   
> -	ret = virtscsi_kick_cmd(req_vq, cmd, req_size, sizeof(cmd->resp.cmd));
> +	kick = (sc->flags & SCMD_LAST) != 0;
> +	ret = virtscsi_add_cmd(req_vq, cmd, req_size, sizeof(cmd->resp.cmd), kick);

Have you considered to have the SCSI core call commit_rqs() if bd->last 
is true? I think that would avoid that we need to introduce the 
SCMD_LAST flag and that would also avoid that every SCSI LLD that 
supports a commit_rqs callback has to introduce code to test the 
SCMD_LAST flag.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 15:54     ` Paolo Bonzini
@ 2019-05-30 17:54       ` Bart Van Assche
  2019-05-31  9:12         ` Paolo Bonzini
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Van Assche @ 2019-05-30 17:54 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 5/30/19 8:54 AM, Paolo Bonzini wrote:
> On 30/05/19 17:36, Bart Van Assche wrote:
>> On 5/30/19 4:28 AM, Paolo Bonzini wrote:
>>> +static const struct blk_mq_ops scsi_mq_ops_no_commit = {
>>> +    .get_budget    = scsi_mq_get_budget,
>>> +    .put_budget    = scsi_mq_put_budget,
>>> +    .queue_rq    = scsi_queue_rq,
>>> +    .complete    = scsi_softirq_done,
>>> +    .timeout    = scsi_timeout,
>>> +#ifdef CONFIG_BLK_DEBUG_FS
>>> +    .show_rq    = scsi_show_rq,
>>> +#endif
>>> +    .init_request    = scsi_mq_init_request,
>>> +    .exit_request    = scsi_mq_exit_request,
>>> +    .initialize_rq_fn = scsi_initialize_rq,
>>> +    .busy        = scsi_mq_lld_busy,
>>> +    .map_queues    = scsi_map_queues,
>>> +};
>>> +
>>> +static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
>>> +{
>>> +    struct request_queue *q = hctx->queue;
>>> +    struct scsi_device *sdev = q->queuedata;
>>> +    struct Scsi_Host *shost = sdev->host;
>>> +
>>> +    shost->hostt->commit_rqs(shost, hctx->queue_num);
>>> +}
>>> +
>>>   static const struct blk_mq_ops scsi_mq_ops = {
>>>       .get_budget    = scsi_mq_get_budget,
>>>       .put_budget    = scsi_mq_put_budget,
>>>       .queue_rq    = scsi_queue_rq,
>>> +    .commit_rqs    = scsi_commit_rqs,
>>>       .complete    = scsi_softirq_done,
>>>       .timeout    = scsi_timeout,
>>>   #ifdef CONFIG_BLK_DEBUG_FS
>>
>> Hi Paolo,
>>
>> Have you considered to modify the block layer such that a single
>> scsi_mq_ops structure can be used for all SCSI LLD types?
> 
> Yes, but I don't think it's possible to do it in a nice way.
> Any adjustment we make to the block layer to fit the SCSI subsystem's
> desires would make all other block drivers uglier, so I chose to confine
> the ugliness here.
> 
> The root issue is that the SCSI subsystem is unique in how it sits on
> top of the block layer; this is the famous "adapter" (or "midlayer",
> though that is confusing when talking about SCSI) design that Linux
> usually tries to avoid.

As far as I can see the only impact of defining an empty commit_rqs
callback on the queueing behavior is that blk_mq_make_request() will
queue requests for multiple hwqs on the plug list instead of requests
for a single hwq. The plug list is sorted by hwq before it is submitted
to a block driver. If that helps NVMe performance it should also help
SCSI performance. How about always setting commit_rqs = scsi_commit_rqs
in scsi_mq_ops?

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
  2019-05-30 15:36   ` Bart Van Assche
@ 2019-05-31  3:27   ` Ming Lei
  2019-06-03  8:16     ` Paolo Bonzini
  2019-06-04 22:40   ` Bart Van Assche
  2019-06-19  8:11   ` Hannes Reinecke
  3 siblings, 1 reply; 30+ messages in thread
From: Ming Lei @ 2019-05-31  3:27 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Linux Kernel Mailing List, KVM General, jejb, Martin K. Petersen,
	Linux SCSI List, Stefan Hajnoczi

On Thu, May 30, 2019 at 7:28 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> This allows a list of requests to be issued, with the LLD only writing
> the hardware doorbell when necessary, after the last request was prepared.
> This is more efficient if we have lists of requests to issue, particularly
> on virtualized hardware, where writing the doorbell is more expensive than
> on real hardware.
>
> The use case for this is plugged IO, where blk-mq flushes a batch of
> requests all at once.
>
> The API is the same as for blk-mq, just with blk-mq concepts tweaked to
> fit the SCSI subsystem API: the "last" flag in blk_mq_queue_data becomes
> a flag in scsi_cmnd, while the queue_num in the commit_rqs callback is
> extracted from the hctx and passed as a parameter.
>
> The only complication is that blk-mq uses different plugging heuristics
> depending on whether commit_rqs is present or not.  So we have two
> different sets of blk_mq_ops and pick one depending on whether the
> scsi_host template uses commit_rqs or not.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  drivers/scsi/scsi_lib.c  | 37 ++++++++++++++++++++++++++++++++++---
>  include/scsi/scsi_cmnd.h |  1 +
>  include/scsi/scsi_host.h | 16 ++++++++++++++--
>  3 files changed, 49 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 601b9f1de267..eb4e67d02bfe 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1673,10 +1673,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>                 blk_mq_start_request(req);
>         }
>
> +       cmd->flags &= SCMD_PRESERVED_FLAGS;
>         if (sdev->simple_tags)
>                 cmd->flags |= SCMD_TAGGED;
> -       else
> -               cmd->flags &= ~SCMD_TAGGED;
> +       if (bd->last)
> +               cmd->flags |= SCMD_LAST;
>
>         scsi_init_cmd_errh(cmd);
>         cmd->scsi_done = scsi_mq_done;
> @@ -1807,10 +1808,37 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>  }
>  EXPORT_SYMBOL_GPL(__scsi_init_queue);
>
> +static const struct blk_mq_ops scsi_mq_ops_no_commit = {
> +       .get_budget     = scsi_mq_get_budget,
> +       .put_budget     = scsi_mq_put_budget,
> +       .queue_rq       = scsi_queue_rq,
> +       .complete       = scsi_softirq_done,
> +       .timeout        = scsi_timeout,
> +#ifdef CONFIG_BLK_DEBUG_FS
> +       .show_rq        = scsi_show_rq,
> +#endif
> +       .init_request   = scsi_mq_init_request,
> +       .exit_request   = scsi_mq_exit_request,
> +       .initialize_rq_fn = scsi_initialize_rq,
> +       .busy           = scsi_mq_lld_busy,
> +       .map_queues     = scsi_map_queues,
> +};
> +
> +
> +static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
> +{
> +       struct request_queue *q = hctx->queue;
> +       struct scsi_device *sdev = q->queuedata;
> +       struct Scsi_Host *shost = sdev->host;
> +
> +       shost->hostt->commit_rqs(shost, hctx->queue_num);
> +}

It should be fine to implement scsi_commit_rqs() as:

 if (shost->hostt->commit_rqs)
       shost->hostt->commit_rqs(shost, hctx->queue_num);

then scsi_mq_ops_no_commit can be saved.

Because .commit_rqs() is only called when BLK_STS_*_RESOURCE is
returned from scsi_queue_rq(), at that time shost->hostt->commit_rqs should
have been hit from cache given .queuecommand is called via
host->hostt->queuecommand.

Not mention BLK_STS_*_RESOURCE is just often returned for small queue depth
device.

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] virtio_scsi: implement request batching
  2019-05-30 17:28   ` Bart Van Assche
@ 2019-05-31  9:03     ` Paolo Bonzini
  0 siblings, 0 replies; 30+ messages in thread
From: Paolo Bonzini @ 2019-05-31  9:03 UTC (permalink / raw)
  To: Bart Van Assche, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 30/05/19 19:28, Bart Van Assche wrote:
> On 5/30/19 4:28 AM, Paolo Bonzini wrote:
>> @@ -531,7 +547,8 @@ static int virtscsi_queuecommand(struct Scsi_Host
>> *shost,
>>           req_size = sizeof(cmd->req.cmd);
>>       }
>>   -    ret = virtscsi_kick_cmd(req_vq, cmd, req_size,
>> sizeof(cmd->resp.cmd));
>> +    kick = (sc->flags & SCMD_LAST) != 0;
>> +    ret = virtscsi_add_cmd(req_vq, cmd, req_size,
>> sizeof(cmd->resp.cmd), kick);
> 
> Have you considered to have the SCSI core call commit_rqs() if bd->last
> is true? I think that would avoid that we need to introduce the
> SCMD_LAST flag and that would also avoid that every SCSI LLD that
> supports a commit_rqs callback has to introduce code to test the
> SCMD_LAST flag.

That is slightly worse for performance, as it unlocks and re-locks the
spinlock.

Paolo


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 17:54       ` Bart Van Assche
@ 2019-05-31  9:12         ` Paolo Bonzini
  0 siblings, 0 replies; 30+ messages in thread
From: Paolo Bonzini @ 2019-05-31  9:12 UTC (permalink / raw)
  To: Bart Van Assche, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 30/05/19 19:54, Bart Van Assche wrote:
> As far as I can see the only impact of defining an empty commit_rqs
> callback on the queueing behavior is that blk_mq_make_request() will
> queue requests for multiple hwqs on the plug list instead of requests
> for a single hwq. The plug list is sorted by hwq before it is submitted
> to a block driver. If that helps NVMe performance it should also help
> SCSI performance.

See the comment in blk_mq_make_request(): sorting by hwq helps NVMe
because it uses bd->last, and blk_mq_make_request() uses the presence of
the ->commit_rqs() as a sign that the driver uses bd->last.  The
heuristic basically trades latency (with plugging, the driver rings the
doorbell a bit later) for throughput (ringing the doorbell is slow, so
we want to do it less).  If the driver doesn't use bd->last and the
doorbell is always rung, plugging adds to the latency without the
throughput benefit.

All that the duplicate blk_mq_ops do is letting blk_mq_make_request()
use the same heuristic for SCSI.  This should be beneficial exactly for
the reason that you mention: if the heuristic helps non-SCSI block
driver performance, it should also help performance of SCSI drivers that
have nr_hw_queues > 1 but no commit_rqs (lpfc, qedi, qla2xxx, smartpqi,
storvsc).

Paolo

> How about always setting commit_rqs = scsi_commit_rqs
> in scsi_mq_ops?


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-31  3:27   ` Ming Lei
@ 2019-06-03  8:16     ` Paolo Bonzini
  2019-07-05  1:00       ` Ming Lei
  0 siblings, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2019-06-03  8:16 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, KVM General, jejb, Martin K. Petersen,
	Linux SCSI List, Stefan Hajnoczi

On 31/05/19 05:27, Ming Lei wrote:
> It should be fine to implement scsi_commit_rqs() as:
> 
>  if (shost->hostt->commit_rqs)
>        shost->hostt->commit_rqs(shost, hctx->queue_num);
> 
> then scsi_mq_ops_no_commit can be saved.
> 
> Because .commit_rqs() is only called when BLK_STS_*_RESOURCE is
> returned from scsi_queue_rq(), at that time shost->hostt->commit_rqs should
> have been hit from cache given .queuecommand is called via
> host->hostt->queuecommand.

This is not about d-cache, it's about preserving the heuristics that
blk-mq applies depending on whether commit_rqs is there or not.

Paolo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
  2019-05-30 15:36   ` Bart Van Assche
  2019-05-31  3:27   ` Ming Lei
@ 2019-06-04 22:40   ` Bart Van Assche
  2019-06-19  8:11   ` Hannes Reinecke
  3 siblings, 0 replies; 30+ messages in thread
From: Bart Van Assche @ 2019-06-04 22:40 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 5/30/19 4:28 AM, Paolo Bonzini wrote:
> This allows a list of requests to be issued, with the LLD only writing
> the hardware doorbell when necessary, after the last request was prepared.
> This is more efficient if we have lists of requests to issue, particularly
> on virtualized hardware, where writing the doorbell is more expensive than
> on real hardware.
> 
> The use case for this is plugged IO, where blk-mq flushes a batch of
> requests all at once.
> 
> The API is the same as for blk-mq, just with blk-mq concepts tweaked to
> fit the SCSI subsystem API: the "last" flag in blk_mq_queue_data becomes
> a flag in scsi_cmnd, while the queue_num in the commit_rqs callback is
> extracted from the hctx and passed as a parameter.
> 
> The only complication is that blk-mq uses different plugging heuristics
> depending on whether commit_rqs is present or not.  So we have two
> different sets of blk_mq_ops and pick one depending on whether the
> scsi_host template uses commit_rqs or not.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-05-30 11:28 [PATCH 0/2] scsi: add support for request batching Paolo Bonzini
  2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
  2019-05-30 11:28 ` [PATCH 2/2] virtio_scsi: implement " Paolo Bonzini
@ 2019-06-10 12:36 ` Stefan Hajnoczi
  2019-06-26 13:51 ` Paolo Bonzini
  3 siblings, 0 replies; 30+ messages in thread
From: Stefan Hajnoczi @ 2019-06-10 12:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jejb, martin.petersen, linux-scsi, stefanha

[-- Attachment #1: Type: text/plain, Size: 1004 bytes --]

On Thu, May 30, 2019 at 01:28:09PM +0200, Paolo Bonzini wrote:
> This allows a list of requests to be issued, with the LLD only writing
> the hardware doorbell when necessary, after the last request was prepared.
> This is more efficient if we have lists of requests to issue, particularly
> on virtualized hardware, where writing the doorbell is more expensive than
> on real hardware.
> 
> This applies to any HBA, either singlequeue or multiqueue; the second
> patch implements it for virtio-scsi.
> 
> Paolo
> 
> Paolo Bonzini (2):
>   scsi_host: add support for request batching
>   virtio_scsi: implement request batching
> 
>  drivers/scsi/scsi_lib.c    | 37 ++++++++++++++++++++++---
>  drivers/scsi/virtio_scsi.c | 55 +++++++++++++++++++++++++++-----------
>  include/scsi/scsi_cmnd.h   |  1 +
>  include/scsi/scsi_host.h   | 16 +++++++++--
>  4 files changed, 89 insertions(+), 20 deletions(-)
> 
> -- 
> 2.21.0
> 

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
                     ` (2 preceding siblings ...)
  2019-06-04 22:40   ` Bart Van Assche
@ 2019-06-19  8:11   ` Hannes Reinecke
  2019-06-19 10:31     ` Paolo Bonzini
  3 siblings, 1 reply; 30+ messages in thread
From: Hannes Reinecke @ 2019-06-19  8:11 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 5/30/19 1:28 PM, Paolo Bonzini wrote:
> This allows a list of requests to be issued, with the LLD only writing
> the hardware doorbell when necessary, after the last request was prepared.
> This is more efficient if we have lists of requests to issue, particularly
> on virtualized hardware, where writing the doorbell is more expensive than
> on real hardware.
> 
> The use case for this is plugged IO, where blk-mq flushes a batch of
> requests all at once.
> 
> The API is the same as for blk-mq, just with blk-mq concepts tweaked to
> fit the SCSI subsystem API: the "last" flag in blk_mq_queue_data becomes
> a flag in scsi_cmnd, while the queue_num in the commit_rqs callback is
> extracted from the hctx and passed as a parameter.
> 
> The only complication is that blk-mq uses different plugging heuristics
> depending on whether commit_rqs is present or not.  So we have two
> different sets of blk_mq_ops and pick one depending on whether the
> scsi_host template uses commit_rqs or not.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  drivers/scsi/scsi_lib.c  | 37 ++++++++++++++++++++++++++++++++++---
>  include/scsi/scsi_cmnd.h |  1 +
>  include/scsi/scsi_host.h | 16 ++++++++++++++--
>  3 files changed, 49 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 601b9f1de267..eb4e67d02bfe 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1673,10 +1673,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>  		blk_mq_start_request(req);
>  	}
>  
> +	cmd->flags &= SCMD_PRESERVED_FLAGS;
>  	if (sdev->simple_tags)
>  		cmd->flags |= SCMD_TAGGED;
> -	else
> -		cmd->flags &= ~SCMD_TAGGED;
> +	if (bd->last)
> +		cmd->flags |= SCMD_LAST;
>  
>  	scsi_init_cmd_errh(cmd);
>  	cmd->scsi_done = scsi_mq_done;
> @@ -1807,10 +1808,37 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>  }
>  EXPORT_SYMBOL_GPL(__scsi_init_queue);
>  
> +static const struct blk_mq_ops scsi_mq_ops_no_commit = {
> +	.get_budget	= scsi_mq_get_budget,
> +	.put_budget	= scsi_mq_put_budget,
> +	.queue_rq	= scsi_queue_rq,
> +	.complete	= scsi_softirq_done,
> +	.timeout	= scsi_timeout,
> +#ifdef CONFIG_BLK_DEBUG_FS
> +	.show_rq	= scsi_show_rq,
> +#endif
> +	.init_request	= scsi_mq_init_request,
> +	.exit_request	= scsi_mq_exit_request,
> +	.initialize_rq_fn = scsi_initialize_rq,
> +	.busy		= scsi_mq_lld_busy,
> +	.map_queues	= scsi_map_queues,
> +};
> +
> +
> +static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
> +{
> +	struct request_queue *q = hctx->queue;
> +	struct scsi_device *sdev = q->queuedata;
> +	struct Scsi_Host *shost = sdev->host;
> +
> +	shost->hostt->commit_rqs(shost, hctx->queue_num);
> +}
> +
>  static const struct blk_mq_ops scsi_mq_ops = {
>  	.get_budget	= scsi_mq_get_budget,
>  	.put_budget	= scsi_mq_put_budget,
>  	.queue_rq	= scsi_queue_rq,
> +	.commit_rqs	= scsi_commit_rqs,
>  	.complete	= scsi_softirq_done,
>  	.timeout	= scsi_timeout,
>  #ifdef CONFIG_BLK_DEBUG_FS
> @@ -1845,7 +1873,10 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost)
>  		cmd_size += sizeof(struct scsi_data_buffer) + sgl_size;
>  
>  	memset(&shost->tag_set, 0, sizeof(shost->tag_set));
> -	shost->tag_set.ops = &scsi_mq_ops;
> +	if (shost->hostt->commit_rqs)
> +		shost->tag_set.ops = &scsi_mq_ops;
> +	else
> +		shost->tag_set.ops = &scsi_mq_ops_no_commit;
>  	shost->tag_set.nr_hw_queues = shost->nr_hw_queues ? : 1;
>  	shost->tag_set.queue_depth = shost->can_queue;
>  	shost->tag_set.cmd_size = cmd_size;
> diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
> index 76ed5e4acd38..91bd749a02f7 100644
> --- a/include/scsi/scsi_cmnd.h
> +++ b/include/scsi/scsi_cmnd.h
> @@ -57,6 +57,7 @@ struct scsi_pointer {
>  #define SCMD_TAGGED		(1 << 0)
>  #define SCMD_UNCHECKED_ISA_DMA	(1 << 1)
>  #define SCMD_INITIALIZED	(1 << 2)
> +#define SCMD_LAST		(1 << 3)
>  /* flags preserved across unprep / reprep */
>  #define SCMD_PRESERVED_FLAGS	(SCMD_UNCHECKED_ISA_DMA | SCMD_INITIALIZED)
>  
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index 2b539a1b3f62..28f1c9177cd2 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -80,8 +80,10 @@ struct scsi_host_template {
>  	 * command block to the LLDD.  When the driver finished
>  	 * processing the command the done callback is invoked.
>  	 *
> -	 * If queuecommand returns 0, then the HBA has accepted the
> -	 * command.  The done() function must be called on the command
> +	 * If queuecommand returns 0, then the driver has accepted the
> +	 * command.  It must also push it to the HBA if the scsi_cmnd
> +	 * flag SCMD_LAST is set, or if the driver does not implement
> +	 * commit_rqs.  The done() function must be called on the command
>  	 * when the driver has finished with it. (you may call done on the
>  	 * command before queuecommand returns, but in this case you
>  	 * *must* return 0 from queuecommand).
> @@ -109,6 +111,16 @@ struct scsi_host_template {
>  	 */
>  	int (* queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);
>  
> +	/*
> +	 * The commit_rqs function is used to trigger a hardware
> +	 * doorbell after some requests have been queued with
> +	 * queuecommand, when an error is encountered before sending
> +	 * the request with SCMD_LAST set.
> +	 *
> +	 * STATUS: OPTIONAL
> +	 */
> +	void (*commit_rqs)(struct Scsi_Host *, u16);
> +
>  	/*
>  	 * This is an error handling strategy routine.  You don't need to
>  	 * define one of these if you don't want to - there is a default
> 
I'm a bit unsure if 'bd->last' is always set; it's quite obvious that
it's present if set, but what about requests with 'bd->last == false' ?
Is there a guarantee that they will _always_ be followed with a request
with bd->last == true?
And if so, is there a guarantee that this request is part of the same batch?

Aside from it: I think it's a good idea to match the '->last' setting
onto the SCMD_LAST flag; I would even go so far and make this an
independent patch.

Once to above points are cleared, that is.

But if that one is in, why do we need to have the separate 'commit_rqs'
callback?
Can't we let the driver decide to issue a doorbell kick (or whatever the
driver decides to do there)?
If we ensure that the SCMD_LAST flag is always set for the end of a
batch (even if this batch consists only of one request), the driver
simply can evaluate the flag and do its actions.
Why do we need a new callback here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-06-19  8:11   ` Hannes Reinecke
@ 2019-06-19 10:31     ` Paolo Bonzini
  2019-07-04 13:19       ` Paolo Bonzini
  0 siblings, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2019-06-19 10:31 UTC (permalink / raw)
  To: Hannes Reinecke, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 19/06/19 10:11, Hannes Reinecke wrote:
> On 5/30/19 1:28 PM, Paolo Bonzini wrote:
>> This allows a list of requests to be issued, with the LLD only writing
>> the hardware doorbell when necessary, after the last request was prepared.
>> This is more efficient if we have lists of requests to issue, particularly
>> on virtualized hardware, where writing the doorbell is more expensive than
>> on real hardware.
>>
>> The use case for this is plugged IO, where blk-mq flushes a batch of
>> requests all at once.
>>
>> The API is the same as for blk-mq, just with blk-mq concepts tweaked to
>> fit the SCSI subsystem API: the "last" flag in blk_mq_queue_data becomes
>> a flag in scsi_cmnd, while the queue_num in the commit_rqs callback is
>> extracted from the hctx and passed as a parameter.
>>
>> The only complication is that blk-mq uses different plugging heuristics
>> depending on whether commit_rqs is present or not.  So we have two
>> different sets of blk_mq_ops and pick one depending on whether the
>> scsi_host template uses commit_rqs or not.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  drivers/scsi/scsi_lib.c  | 37 ++++++++++++++++++++++++++++++++++---
>>  include/scsi/scsi_cmnd.h |  1 +
>>  include/scsi/scsi_host.h | 16 ++++++++++++++--
>>  3 files changed, 49 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 601b9f1de267..eb4e67d02bfe 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1673,10 +1673,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  		blk_mq_start_request(req);
>>  	}
>>  
>> +	cmd->flags &= SCMD_PRESERVED_FLAGS;
>>  	if (sdev->simple_tags)
>>  		cmd->flags |= SCMD_TAGGED;
>> -	else
>> -		cmd->flags &= ~SCMD_TAGGED;
>> +	if (bd->last)
>> +		cmd->flags |= SCMD_LAST;
>>  
>>  	scsi_init_cmd_errh(cmd);
>>  	cmd->scsi_done = scsi_mq_done;
>> @@ -1807,10 +1808,37 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>>  }
>>  EXPORT_SYMBOL_GPL(__scsi_init_queue);
>>  
>> +static const struct blk_mq_ops scsi_mq_ops_no_commit = {
>> +	.get_budget	= scsi_mq_get_budget,
>> +	.put_budget	= scsi_mq_put_budget,
>> +	.queue_rq	= scsi_queue_rq,
>> +	.complete	= scsi_softirq_done,
>> +	.timeout	= scsi_timeout,
>> +#ifdef CONFIG_BLK_DEBUG_FS
>> +	.show_rq	= scsi_show_rq,
>> +#endif
>> +	.init_request	= scsi_mq_init_request,
>> +	.exit_request	= scsi_mq_exit_request,
>> +	.initialize_rq_fn = scsi_initialize_rq,
>> +	.busy		= scsi_mq_lld_busy,
>> +	.map_queues	= scsi_map_queues,
>> +};
>> +
>> +
>> +static void scsi_commit_rqs(struct blk_mq_hw_ctx *hctx)
>> +{
>> +	struct request_queue *q = hctx->queue;
>> +	struct scsi_device *sdev = q->queuedata;
>> +	struct Scsi_Host *shost = sdev->host;
>> +
>> +	shost->hostt->commit_rqs(shost, hctx->queue_num);
>> +}
>> +
>>  static const struct blk_mq_ops scsi_mq_ops = {
>>  	.get_budget	= scsi_mq_get_budget,
>>  	.put_budget	= scsi_mq_put_budget,
>>  	.queue_rq	= scsi_queue_rq,
>> +	.commit_rqs	= scsi_commit_rqs,
>>  	.complete	= scsi_softirq_done,
>>  	.timeout	= scsi_timeout,
>>  #ifdef CONFIG_BLK_DEBUG_FS
>> @@ -1845,7 +1873,10 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost)
>>  		cmd_size += sizeof(struct scsi_data_buffer) + sgl_size;
>>  
>>  	memset(&shost->tag_set, 0, sizeof(shost->tag_set));
>> -	shost->tag_set.ops = &scsi_mq_ops;
>> +	if (shost->hostt->commit_rqs)
>> +		shost->tag_set.ops = &scsi_mq_ops;
>> +	else
>> +		shost->tag_set.ops = &scsi_mq_ops_no_commit;
>>  	shost->tag_set.nr_hw_queues = shost->nr_hw_queues ? : 1;
>>  	shost->tag_set.queue_depth = shost->can_queue;
>>  	shost->tag_set.cmd_size = cmd_size;
>> diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
>> index 76ed5e4acd38..91bd749a02f7 100644
>> --- a/include/scsi/scsi_cmnd.h
>> +++ b/include/scsi/scsi_cmnd.h
>> @@ -57,6 +57,7 @@ struct scsi_pointer {
>>  #define SCMD_TAGGED		(1 << 0)
>>  #define SCMD_UNCHECKED_ISA_DMA	(1 << 1)
>>  #define SCMD_INITIALIZED	(1 << 2)
>> +#define SCMD_LAST		(1 << 3)
>>  /* flags preserved across unprep / reprep */
>>  #define SCMD_PRESERVED_FLAGS	(SCMD_UNCHECKED_ISA_DMA | SCMD_INITIALIZED)
>>  
>> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
>> index 2b539a1b3f62..28f1c9177cd2 100644
>> --- a/include/scsi/scsi_host.h
>> +++ b/include/scsi/scsi_host.h
>> @@ -80,8 +80,10 @@ struct scsi_host_template {
>>  	 * command block to the LLDD.  When the driver finished
>>  	 * processing the command the done callback is invoked.
>>  	 *
>> -	 * If queuecommand returns 0, then the HBA has accepted the
>> -	 * command.  The done() function must be called on the command
>> +	 * If queuecommand returns 0, then the driver has accepted the
>> +	 * command.  It must also push it to the HBA if the scsi_cmnd
>> +	 * flag SCMD_LAST is set, or if the driver does not implement
>> +	 * commit_rqs.  The done() function must be called on the command
>>  	 * when the driver has finished with it. (you may call done on the
>>  	 * command before queuecommand returns, but in this case you
>>  	 * *must* return 0 from queuecommand).
>> @@ -109,6 +111,16 @@ struct scsi_host_template {
>>  	 */
>>  	int (* queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);
>>  
>> +	/*
>> +	 * The commit_rqs function is used to trigger a hardware
>> +	 * doorbell after some requests have been queued with
>> +	 * queuecommand, when an error is encountered before sending
>> +	 * the request with SCMD_LAST set.
>> +	 *
>> +	 * STATUS: OPTIONAL
>> +	 */
>> +	void (*commit_rqs)(struct Scsi_Host *, u16);
>> +
>>  	/*
>>  	 * This is an error handling strategy routine.  You don't need to
>>  	 * define one of these if you don't want to - there is a default
>>
> I'm a bit unsure if 'bd->last' is always set; it's quite obvious that
> it's present if set, but what about requests with 'bd->last == false' ?
> Is there a guarantee that they will _always_ be followed with a request
> with bd->last == true?
> And if so, is there a guarantee that this request is part of the same batch?

It's complicated.  A request with bd->last == false _will_ always be
followed by a request with bd->last == true in the same batch.  However,
due to e.g. errors it may be possible that the last request is not sent.
 In that case, the block layer sends commit_rqs, as documented in the
comment above, to flush the requests that have been sent already.

So, a driver that obeys bd->last (or SCMD_LAST) but does not implement
commit_rqs is bound to have bugs, which is why this patch was not split
further.

Makes sense?

Paolo

> Aside from it: I think it's a good idea to match the '->last' setting
> onto the SCMD_LAST flag; I would even go so far and make this an
> independent patch.

> Once to above points are cleared, that is.
> 
> But if that one is in, why do we need to have the separate 'commit_rqs'
> callback?
> Can't we let the driver decide to issue a doorbell kick (or whatever the
> driver decides to do there)?
> If we ensure that the SCMD_LAST flag is always set for the end of a
> batch (even if this batch consists only of one request), the driver
> simply can evaluate the flag and do its actions.
> Why do we need a new callback here?
> 
> Cheers,
> 
> Hannes
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-05-30 11:28 [PATCH 0/2] scsi: add support for request batching Paolo Bonzini
                   ` (2 preceding siblings ...)
  2019-06-10 12:36 ` [PATCH 0/2] scsi: add support for " Stefan Hajnoczi
@ 2019-06-26 13:51 ` Paolo Bonzini
  2019-06-26 14:14   ` Douglas Gilbert
  2019-06-27  3:37   ` Martin K. Petersen
  3 siblings, 2 replies; 30+ messages in thread
From: Paolo Bonzini @ 2019-06-26 13:51 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jejb, martin.petersen, linux-scsi, stefanha

On 30/05/19 13:28, Paolo Bonzini wrote:
> This allows a list of requests to be issued, with the LLD only writing
> the hardware doorbell when necessary, after the last request was prepared.
> This is more efficient if we have lists of requests to issue, particularly
> on virtualized hardware, where writing the doorbell is more expensive than
> on real hardware.
> 
> This applies to any HBA, either singlequeue or multiqueue; the second
> patch implements it for virtio-scsi.
> 
> Paolo
> 
> Paolo Bonzini (2):
>   scsi_host: add support for request batching
>   virtio_scsi: implement request batching
> 
>  drivers/scsi/scsi_lib.c    | 37 ++++++++++++++++++++++---
>  drivers/scsi/virtio_scsi.c | 55 +++++++++++++++++++++++++++-----------
>  include/scsi/scsi_cmnd.h   |  1 +
>  include/scsi/scsi_host.h   | 16 +++++++++--
>  4 files changed, 89 insertions(+), 20 deletions(-)
> 


Ping?  Are there any more objections?

Paolo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-06-26 13:51 ` Paolo Bonzini
@ 2019-06-26 14:14   ` Douglas Gilbert
  2019-06-26 14:50     ` Paolo Bonzini
  2019-06-27  3:37   ` Martin K. Petersen
  1 sibling, 1 reply; 30+ messages in thread
From: Douglas Gilbert @ 2019-06-26 14:14 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 2019-06-26 9:51 a.m., Paolo Bonzini wrote:
> On 30/05/19 13:28, Paolo Bonzini wrote:
>> This allows a list of requests to be issued, with the LLD only writing
>> the hardware doorbell when necessary, after the last request was prepared.
>> This is more efficient if we have lists of requests to issue, particularly
>> on virtualized hardware, where writing the doorbell is more expensive than
>> on real hardware.
>>
>> This applies to any HBA, either singlequeue or multiqueue; the second
>> patch implements it for virtio-scsi.
>>
>> Paolo
>>
>> Paolo Bonzini (2):
>>    scsi_host: add support for request batching
>>    virtio_scsi: implement request batching
>>
>>   drivers/scsi/scsi_lib.c    | 37 ++++++++++++++++++++++---
>>   drivers/scsi/virtio_scsi.c | 55 +++++++++++++++++++++++++++-----------
>>   include/scsi/scsi_cmnd.h   |  1 +
>>   include/scsi/scsi_host.h   | 16 +++++++++--
>>   4 files changed, 89 insertions(+), 20 deletions(-)
>>
> 
> 
> Ping?  Are there any more objections?

I have no objections, just a few questions.

To implement this is the scsi_debug driver, a per device queue would
need to be added, correct? Then a 'commit_rqs' call would be expected
at some later point and it would drain that queue and submit each
command. Or is the queue draining ongoing in the LLD and 'commit_rqs'
means: don't return until that queue is empty?

So does that mean in the normal (i.e. non request batching) case
there are two calls to the LLD for each submitted command? Or is
'commit_rqs' optional, a sync-ing type command?

Doug Gilbert



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-06-26 14:14   ` Douglas Gilbert
@ 2019-06-26 14:50     ` Paolo Bonzini
  0 siblings, 0 replies; 30+ messages in thread
From: Paolo Bonzini @ 2019-06-26 14:50 UTC (permalink / raw)
  To: dgilbert, linux-kernel, kvm; +Cc: jejb, martin.petersen, linux-scsi, stefanha

On 26/06/19 16:14, Douglas Gilbert wrote:
> 
> I have no objections, just a few questions.
> 
> To implement this is the scsi_debug driver, a per device queue would
> need to be added, correct?

Yes, queuecommand would then return before calling schedule_resp (for
all requests except the one with SCMD_LAST, see later).  schedule_resp
would then be called for all requests in a batch.

> Then a 'commit_rqs' call would be expected
> at some later point and it would drain that queue and submit each
> command. Or is the queue draining ongoing in the LLD and 'commit_rqs'
> means: don't return until that queue is empty?

commit_rqs means the former; it is asynchronous.

However, commit_rqs is only called if a request batch fails submission
in the middle of the batch, so the request batch must be sent to the
HBA.  If the whole request batch is sent successfully, then the LLD
takes care of sending the batch to the HBA when it sees SCMD_LAST in the
request.

So, in the scsi_debug case schedule_resp would be called for the whole
batch from commit_rqs *and* when queuecommand sees a command with the
SCMD_LAST flag set.  This is exactly to avoid having two calls to the
LLD in the case of no request batching.

> So does that mean in the normal (i.e. non request batching) case
> there are two calls to the LLD for each submitted command? Or is
> 'commit_rqs' optional, a sync-ing type command?

It's not syncing.  It's mandatory if the queuecommand function observes
SCMD_LAST, not needed at all if queuecommand ignores it.  So it's not
needed at all until your driver adds support for batched submission of
requests to the HBA.

(All this is documented by the patches in the comments for struct
scsi_host_template, if those are not clear please reply to patch 1 with
your doubts).

Paolo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-06-26 13:51 ` Paolo Bonzini
  2019-06-26 14:14   ` Douglas Gilbert
@ 2019-06-27  3:37   ` Martin K. Petersen
  2019-06-27  8:17     ` Paolo Bonzini
  1 sibling, 1 reply; 30+ messages in thread
From: Martin K. Petersen @ 2019-06-27  3:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jejb, martin.petersen, linux-scsi, stefanha


Paolo,

> Ping?  Are there any more objections?

It's a core change so we'll need some more reviews. I suggest you
resubmit.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-06-27  3:37   ` Martin K. Petersen
@ 2019-06-27  8:17     ` Paolo Bonzini
  2019-07-05  7:13       ` Hannes Reinecke
  0 siblings, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2019-06-27  8:17 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: linux-kernel, kvm, jejb, linux-scsi, stefanha

On 27/06/19 05:37, Martin K. Petersen wrote:
>> Ping?  Are there any more objections?
> It's a core change so we'll need some more reviews. I suggest you
> resubmit.

Resubmit exactly the same patches?

Paolo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-06-19 10:31     ` Paolo Bonzini
@ 2019-07-04 13:19       ` Paolo Bonzini
  2019-07-05  7:12         ` Hannes Reinecke
  0 siblings, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2019-07-04 13:19 UTC (permalink / raw)
  To: Hannes Reinecke, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 19/06/19 12:31, Paolo Bonzini wrote:
>> I'm a bit unsure if 'bd->last' is always set; it's quite obvious that
>> it's present if set, but what about requests with 'bd->last == false' ?
>> Is there a guarantee that they will _always_ be followed with a request
>> with bd->last == true?
>> And if so, is there a guarantee that this request is part of the same batch?
> It's complicated.  A request with bd->last == false _will_ always be
> followed by a request with bd->last == true in the same batch.  However,
> due to e.g. errors it may be possible that the last request is not sent.
>  In that case, the block layer sends commit_rqs, as documented in the
> comment above, to flush the requests that have been sent already.
> 
> So, a driver that obeys bd->last (or SCMD_LAST) but does not implement
> commit_rqs is bound to have bugs, which is why this patch was not split
> further.
> 
> Makes sense?

Hannes, can you provide your Reviewed-by?

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-06-03  8:16     ` Paolo Bonzini
@ 2019-07-05  1:00       ` Ming Lei
  0 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-07-05  1:00 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Linux Kernel Mailing List, KVM General, jejb, Martin K. Petersen,
	Linux SCSI List, Stefan Hajnoczi

On Mon, Jun 3, 2019 at 4:16 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 31/05/19 05:27, Ming Lei wrote:
> > It should be fine to implement scsi_commit_rqs() as:
> >
> >  if (shost->hostt->commit_rqs)
> >        shost->hostt->commit_rqs(shost, hctx->queue_num);
> >
> > then scsi_mq_ops_no_commit can be saved.
> >
> > Because .commit_rqs() is only called when BLK_STS_*_RESOURCE is
> > returned from scsi_queue_rq(), at that time shost->hostt->commit_rqs should
> > have been hit from cache given .queuecommand is called via
> > host->hostt->queuecommand.
>
> This is not about d-cache, it's about preserving the heuristics that
> blk-mq applies depending on whether commit_rqs is there or not.

Fair enough, at least difference would be made by the check in
blk_mq_make_request() if scsi_commit_rqs is provided unconditionally,
so looks fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-07-04 13:19       ` Paolo Bonzini
@ 2019-07-05  7:12         ` Hannes Reinecke
  2019-07-05  7:44           ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Hannes Reinecke @ 2019-07-05  7:12 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 7/4/19 3:19 PM, Paolo Bonzini wrote:
> On 19/06/19 12:31, Paolo Bonzini wrote:
>>> I'm a bit unsure if 'bd->last' is always set; it's quite obvious that
>>> it's present if set, but what about requests with 'bd->last == false' ?
>>> Is there a guarantee that they will _always_ be followed with a request
>>> with bd->last == true?
>>> And if so, is there a guarantee that this request is part of the same batch?
>> It's complicated.  A request with bd->last == false _will_ always be
>> followed by a request with bd->last == true in the same batch.  However,
>> due to e.g. errors it may be possible that the last request is not sent.
>>  In that case, the block layer sends commit_rqs, as documented in the
>> comment above, to flush the requests that have been sent already.
>>
>> So, a driver that obeys bd->last (or SCMD_LAST) but does not implement
>> commit_rqs is bound to have bugs, which is why this patch was not split
>> further.
>>
>> Makes sense?
> 
> Hannes, can you provide your Reviewed-by?
> 
Well ... since you asked for it:

Where is the 'commit_rqs' callback actually used?
I seem to be going blind, but I can't find it; should be somewhere in
the first patch, no?
As per description:

 * The commit_rqs function is used to trigger a hardware
 * doorbell after some requests have been queued with
 * queuecommand, when an error is encountered before sending
 * the request with SCMD_LAST set.

So it should be somewhere in the error path, probably scsi_error or
something. But I don't seem to be able to find it ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-06-27  8:17     ` Paolo Bonzini
@ 2019-07-05  7:13       ` Hannes Reinecke
  2019-07-05 11:58         ` Paolo Bonzini
  0 siblings, 1 reply; 30+ messages in thread
From: Hannes Reinecke @ 2019-07-05  7:13 UTC (permalink / raw)
  To: Paolo Bonzini, Martin K. Petersen
  Cc: linux-kernel, kvm, jejb, linux-scsi, stefanha

On 6/27/19 10:17 AM, Paolo Bonzini wrote:
> On 27/06/19 05:37, Martin K. Petersen wrote:
>>> Ping?  Are there any more objections?
>> It's a core change so we'll need some more reviews. I suggest you
>> resubmit.
> 
> Resubmit exactly the same patches?
> Where is the ->commit_rqs() callback invoked?
I don't seem to be able to find it...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-07-05  7:12         ` Hannes Reinecke
@ 2019-07-05  7:44           ` Stefan Hajnoczi
  2019-07-05  7:51             ` Hannes Reinecke
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2019-07-05  7:44 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Paolo Bonzini, linux-kernel, kvm, jejb, martin.petersen, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 1843 bytes --]

On Fri, Jul 05, 2019 at 09:12:37AM +0200, Hannes Reinecke wrote:
> On 7/4/19 3:19 PM, Paolo Bonzini wrote:
> > On 19/06/19 12:31, Paolo Bonzini wrote:
> >>> I'm a bit unsure if 'bd->last' is always set; it's quite obvious that
> >>> it's present if set, but what about requests with 'bd->last == false' ?
> >>> Is there a guarantee that they will _always_ be followed with a request
> >>> with bd->last == true?
> >>> And if so, is there a guarantee that this request is part of the same batch?
> >> It's complicated.  A request with bd->last == false _will_ always be
> >> followed by a request with bd->last == true in the same batch.  However,
> >> due to e.g. errors it may be possible that the last request is not sent.
> >>  In that case, the block layer sends commit_rqs, as documented in the
> >> comment above, to flush the requests that have been sent already.
> >>
> >> So, a driver that obeys bd->last (or SCMD_LAST) but does not implement
> >> commit_rqs is bound to have bugs, which is why this patch was not split
> >> further.
> >>
> >> Makes sense?
> > 
> > Hannes, can you provide your Reviewed-by?
> > 
> Well ... since you asked for it:
> 
> Where is the 'commit_rqs' callback actually used?
> I seem to be going blind, but I can't find it; should be somewhere in
> the first patch, no?
> As per description:
> 
>  * The commit_rqs function is used to trigger a hardware
>  * doorbell after some requests have been queued with
>  * queuecommand, when an error is encountered before sending
>  * the request with SCMD_LAST set.
> 
> So it should be somewhere in the error path, probably scsi_error or
> something. But I don't seem to be able to find it ...

The block layer calls scsi_mq_ops->commit_rqs() from
blk_mq_dispatch_rq_list() and blk_mq_try_issue_list_directly().

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/2] scsi_host: add support for request batching
  2019-07-05  7:44           ` Stefan Hajnoczi
@ 2019-07-05  7:51             ` Hannes Reinecke
  0 siblings, 0 replies; 30+ messages in thread
From: Hannes Reinecke @ 2019-07-05  7:51 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Paolo Bonzini, linux-kernel, kvm, jejb, martin.petersen, linux-scsi

On 7/5/19 9:44 AM, Stefan Hajnoczi wrote:
> On Fri, Jul 05, 2019 at 09:12:37AM +0200, Hannes Reinecke wrote:
>> On 7/4/19 3:19 PM, Paolo Bonzini wrote:
>>> On 19/06/19 12:31, Paolo Bonzini wrote:
>>>>> I'm a bit unsure if 'bd->last' is always set; it's quite obvious that
>>>>> it's present if set, but what about requests with 'bd->last == false' ?
>>>>> Is there a guarantee that they will _always_ be followed with a request
>>>>> with bd->last == true?
>>>>> And if so, is there a guarantee that this request is part of the same batch?
>>>> It's complicated.  A request with bd->last == false _will_ always be
>>>> followed by a request with bd->last == true in the same batch.  However,
>>>> due to e.g. errors it may be possible that the last request is not sent.
>>>>  In that case, the block layer sends commit_rqs, as documented in the
>>>> comment above, to flush the requests that have been sent already.
>>>>
>>>> So, a driver that obeys bd->last (or SCMD_LAST) but does not implement
>>>> commit_rqs is bound to have bugs, which is why this patch was not split
>>>> further.
>>>>
>>>> Makes sense?
>>>
>>> Hannes, can you provide your Reviewed-by?
>>>
>> Well ... since you asked for it:
>>
>> Where is the 'commit_rqs' callback actually used?
>> I seem to be going blind, but I can't find it; should be somewhere in
>> the first patch, no?
>> As per description:
>>
>>  * The commit_rqs function is used to trigger a hardware
>>  * doorbell after some requests have been queued with
>>  * queuecommand, when an error is encountered before sending
>>  * the request with SCMD_LAST set.
>>
>> So it should be somewhere in the error path, probably scsi_error or
>> something. But I don't seem to be able to find it ...
> 
> The block layer calls scsi_mq_ops->commit_rqs() from
> blk_mq_dispatch_rq_list() and blk_mq_try_issue_list_directly().
> 
Ah, right.

Now, then:

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] virtio_scsi: implement request batching
  2019-05-30 11:28 ` [PATCH 2/2] virtio_scsi: implement " Paolo Bonzini
  2019-05-30 17:28   ` Bart Van Assche
@ 2019-07-05  7:52   ` Hannes Reinecke
  2019-07-08  9:47   ` Ming Lei
  2 siblings, 0 replies; 30+ messages in thread
From: Hannes Reinecke @ 2019-07-05  7:52 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jejb, martin.petersen, linux-scsi, stefanha

On 5/30/19 1:28 PM, Paolo Bonzini wrote:
> Adding the command and kicking the virtqueue so far was done one after
> another.  Make the kick optional, so that we can take into account SCMD_LAST.
> We also need a commit_rqs callback to kick the device if blk-mq aborts
> the submission before the last request is reached.
> 
> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  drivers/scsi/virtio_scsi.c | 55 +++++++++++++++++++++++++++-----------
>  1 file changed, 40 insertions(+), 15 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-07-05  7:13       ` Hannes Reinecke
@ 2019-07-05 11:58         ` Paolo Bonzini
  2019-07-12  1:37           ` Martin K. Petersen
  0 siblings, 1 reply; 30+ messages in thread
From: Paolo Bonzini @ 2019-07-05 11:58 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: linux-kernel, kvm, jejb, linux-scsi, stefanha

On 05/07/19 09:13, Hannes Reinecke wrote:
> On 6/27/19 10:17 AM, Paolo Bonzini wrote:
>> On 27/06/19 05:37, Martin K. Petersen wrote:
>>>> Ping?  Are there any more objections?
>>> It's a core change so we'll need some more reviews. I suggest you
>>> resubmit.
>> Resubmit exactly the same patches?
>> Where is the ->commit_rqs() callback invoked?
> I don't seem to be able to find it...

Stefan answered, and the series now has three reviews!  It may be late
for 5.3 but I hope this can go in soon.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/2] virtio_scsi: implement request batching
  2019-05-30 11:28 ` [PATCH 2/2] virtio_scsi: implement " Paolo Bonzini
  2019-05-30 17:28   ` Bart Van Assche
  2019-07-05  7:52   ` Hannes Reinecke
@ 2019-07-08  9:47   ` Ming Lei
  2 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-07-08  9:47 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Linux Kernel Mailing List, KVM General, jejb, Martin K. Petersen,
	Linux SCSI List, Stefan Hajnoczi

On Thu, May 30, 2019 at 7:28 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> Adding the command and kicking the virtqueue so far was done one after
> another.  Make the kick optional, so that we can take into account SCMD_LAST.
> We also need a commit_rqs callback to kick the device if blk-mq aborts
> the submission before the last request is reached.
>
> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  drivers/scsi/virtio_scsi.c | 55 +++++++++++++++++++++++++++-----------
>  1 file changed, 40 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 8af01777d09c..918c811cea95 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -375,14 +375,7 @@ static void virtscsi_event_done(struct virtqueue *vq)
>         virtscsi_vq_done(vscsi, &vscsi->event_vq, virtscsi_complete_event);
>  };
>
> -/**
> - * virtscsi_add_cmd - add a virtio_scsi_cmd to a virtqueue
> - * @vq         : the struct virtqueue we're talking about
> - * @cmd                : command structure
> - * @req_size   : size of the request buffer
> - * @resp_size  : size of the response buffer
> - */
> -static int virtscsi_add_cmd(struct virtqueue *vq,
> +static int __virtscsi_add_cmd(struct virtqueue *vq,
>                             struct virtio_scsi_cmd *cmd,
>                             size_t req_size, size_t resp_size)
>  {
> @@ -427,17 +420,39 @@ static int virtscsi_add_cmd(struct virtqueue *vq,
>         return virtqueue_add_sgs(vq, sgs, out_num, in_num, cmd, GFP_ATOMIC);
>  }
>
> -static int virtscsi_kick_cmd(struct virtio_scsi_vq *vq,
> +static void virtscsi_kick_vq(struct virtio_scsi_vq *vq)
> +{
> +       bool needs_kick;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&vq->vq_lock, flags);
> +       needs_kick = virtqueue_kick_prepare(vq->vq);
> +       spin_unlock_irqrestore(&vq->vq_lock, flags);
> +
> +       if (needs_kick)
> +               virtqueue_notify(vq->vq);
> +}
> +
> +/**
> + * virtscsi_add_cmd - add a virtio_scsi_cmd to a virtqueue, optionally kick it
> + * @vq         : the struct virtqueue we're talking about
> + * @cmd                : command structure
> + * @req_size   : size of the request buffer
> + * @resp_size  : size of the response buffer
> + * @kick       : whether to kick the virtqueue immediately
> + */
> +static int virtscsi_add_cmd(struct virtio_scsi_vq *vq,
>                              struct virtio_scsi_cmd *cmd,
> -                            size_t req_size, size_t resp_size)
> +                            size_t req_size, size_t resp_size,
> +                            bool kick)
>  {
>         unsigned long flags;
>         int err;
>         bool needs_kick = false;
>
>         spin_lock_irqsave(&vq->vq_lock, flags);
> -       err = virtscsi_add_cmd(vq->vq, cmd, req_size, resp_size);
> -       if (!err)
> +       err = __virtscsi_add_cmd(vq->vq, cmd, req_size, resp_size);
> +       if (!err && kick)
>                 needs_kick = virtqueue_kick_prepare(vq->vq);
>
>         spin_unlock_irqrestore(&vq->vq_lock, flags);
> @@ -502,6 +517,7 @@ static int virtscsi_queuecommand(struct Scsi_Host *shost,
>         struct virtio_scsi *vscsi = shost_priv(shost);
>         struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
>         struct virtio_scsi_cmd *cmd = scsi_cmd_priv(sc);
> +       bool kick;
>         unsigned long flags;
>         int req_size;
>         int ret;
> @@ -531,7 +547,8 @@ static int virtscsi_queuecommand(struct Scsi_Host *shost,
>                 req_size = sizeof(cmd->req.cmd);
>         }
>
> -       ret = virtscsi_kick_cmd(req_vq, cmd, req_size, sizeof(cmd->resp.cmd));
> +       kick = (sc->flags & SCMD_LAST) != 0;
> +       ret = virtscsi_add_cmd(req_vq, cmd, req_size, sizeof(cmd->resp.cmd), kick);
>         if (ret == -EIO) {
>                 cmd->resp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
>                 spin_lock_irqsave(&req_vq->vq_lock, flags);
> @@ -549,8 +566,8 @@ static int virtscsi_tmf(struct virtio_scsi *vscsi, struct virtio_scsi_cmd *cmd)
>         int ret = FAILED;
>
>         cmd->comp = &comp;
> -       if (virtscsi_kick_cmd(&vscsi->ctrl_vq, cmd,
> -                             sizeof cmd->req.tmf, sizeof cmd->resp.tmf) < 0)
> +       if (virtscsi_add_cmd(&vscsi->ctrl_vq, cmd,
> +                             sizeof cmd->req.tmf, sizeof cmd->resp.tmf, true) < 0)
>                 goto out;
>
>         wait_for_completion(&comp);
> @@ -664,6 +681,13 @@ static int virtscsi_map_queues(struct Scsi_Host *shost)
>         return blk_mq_virtio_map_queues(qmap, vscsi->vdev, 2);
>  }
>
> +static void virtscsi_commit_rqs(struct Scsi_Host *shost, u16 hwq)
> +{
> +       struct virtio_scsi *vscsi = shost_priv(shost);
> +
> +       virtscsi_kick_vq(&vscsi->req_vqs[hwq]);
> +}
> +
>  /*
>   * The host guarantees to respond to each command, although I/O
>   * latencies might be higher than on bare metal.  Reset the timer
> @@ -681,6 +705,7 @@ static struct scsi_host_template virtscsi_host_template = {
>         .this_id = -1,
>         .cmd_size = sizeof(struct virtio_scsi_cmd),
>         .queuecommand = virtscsi_queuecommand,
> +       .commit_rqs = virtscsi_commit_rqs,
>         .change_queue_depth = virtscsi_change_queue_depth,
>         .eh_abort_handler = virtscsi_abort,
>         .eh_device_reset_handler = virtscsi_device_reset,
> --
> 2.21.0
>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/2] scsi: add support for request batching
  2019-07-05 11:58         ` Paolo Bonzini
@ 2019-07-12  1:37           ` Martin K. Petersen
  0 siblings, 0 replies; 30+ messages in thread
From: Martin K. Petersen @ 2019-07-12  1:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Hannes Reinecke, Martin K. Petersen, linux-kernel, kvm, jejb,
	linux-scsi, stefanha


Paolo,

> Stefan answered, and the series now has three reviews!  It may be late
> for 5.3 but I hope this can go in soon.

I queued these up for 5.4. Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-07-12  1:38 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-30 11:28 [PATCH 0/2] scsi: add support for request batching Paolo Bonzini
2019-05-30 11:28 ` [PATCH 1/2] scsi_host: " Paolo Bonzini
2019-05-30 15:36   ` Bart Van Assche
2019-05-30 15:54     ` Paolo Bonzini
2019-05-30 17:54       ` Bart Van Assche
2019-05-31  9:12         ` Paolo Bonzini
2019-05-31  3:27   ` Ming Lei
2019-06-03  8:16     ` Paolo Bonzini
2019-07-05  1:00       ` Ming Lei
2019-06-04 22:40   ` Bart Van Assche
2019-06-19  8:11   ` Hannes Reinecke
2019-06-19 10:31     ` Paolo Bonzini
2019-07-04 13:19       ` Paolo Bonzini
2019-07-05  7:12         ` Hannes Reinecke
2019-07-05  7:44           ` Stefan Hajnoczi
2019-07-05  7:51             ` Hannes Reinecke
2019-05-30 11:28 ` [PATCH 2/2] virtio_scsi: implement " Paolo Bonzini
2019-05-30 17:28   ` Bart Van Assche
2019-05-31  9:03     ` Paolo Bonzini
2019-07-05  7:52   ` Hannes Reinecke
2019-07-08  9:47   ` Ming Lei
2019-06-10 12:36 ` [PATCH 0/2] scsi: add support for " Stefan Hajnoczi
2019-06-26 13:51 ` Paolo Bonzini
2019-06-26 14:14   ` Douglas Gilbert
2019-06-26 14:50     ` Paolo Bonzini
2019-06-27  3:37   ` Martin K. Petersen
2019-06-27  8:17     ` Paolo Bonzini
2019-07-05  7:13       ` Hannes Reinecke
2019-07-05 11:58         ` Paolo Bonzini
2019-07-12  1:37           ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).