* [PATCH RFC 00/11] blk-mq/libata/scsi: SCSI driver tagging improvements
@ 2022-03-22 10:39 ` John Garry
  0 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Currently SCSI low-level drivers are required to manage tags for commands
which do not come via the block layer - libata internal commands are one
example.

There was some work to provide "reserved commands" support in series such
as https://lore.kernel.org/linux-scsi/20211125151048.103910-1-hare@suse.de/

This was based on allocating a request for the lifetime of the "internal"
command.

This series tries to solve that problem by not just allocating the request
but also sending it through the block layer, which is the normal flow for a
request. We need to do this because request completions may only be polled
through the block layer, so this is required for poll queue support.

There is still scope to allocate commands just to get a tag as a token, as
that may suit some other scenarios, but that is not what we do here.

This series extends blk-mq to support a request queue having a custom set
of ops. In addition, SCSI core code adds support for these types of requests.

This series does not include SCSI core handling for enabling reserved
tags per tagset, but that would be easy to add.

Based on mkp-scsi 5.18/scsi-staging @ 66daf3e6b993 

Please consider as an RFC for now. I think that the libata change has the
largest scope for improvement...

John Garry (11):
  blk-mq: Add blk_mq_init_queue_ops()
  scsi: core: Add SUBMITTED_BY_SCSI_CUSTOM_OPS
  libata: Send internal commands through the block layer
  scsi: libsas: Send SMP commands through the block layer
  scsi: libsas: Send TMF commands through the block layer
  scsi: core: Add scsi_alloc_request_hwq()
  scsi: libsas: Send internal abort commands through the block layer
  scsi: libsas: Change ATA support to deal with each qc having a SCSI
    command
  scsi: libsas: Add sas_task_to_unique_tag()
  scsi: libsas: Add sas_task_to_hwq()
  scsi: hisi_sas: Remove private tag management

 block/blk-mq.c                         |  23 +++-
 drivers/ata/libata-core.c              | 121 +++++++++++++------
 drivers/md/dm-rq.c                     |   2 +-
 drivers/scsi/hisi_sas/hisi_sas_main.c  |  66 +----------
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |   3 +-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |   3 +-
 drivers/scsi/libsas/sas_ata.c          |  11 +-
 drivers/scsi/libsas/sas_expander.c     |  38 ++++--
 drivers/scsi/libsas/sas_internal.h     |   1 +
 drivers/scsi/libsas/sas_scsi_host.c    | 153 ++++++++++++++++++++-----
 drivers/scsi/scsi_lib.c                |  14 +++
 include/linux/blk-mq.h                 |   5 +-
 include/scsi/libsas.h                  |   4 +-
 include/scsi/scsi_cmnd.h               |   4 +
 14 files changed, 298 insertions(+), 150 deletions(-)

-- 
2.26.2



* [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Add an API to allocate a request queue which accepts a custom set of
blk_mq_ops for that request queue.

The reason we may want custom ops is to queue requests which we don't want
to go through the normal queuing path.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq.c         | 23 +++++++++++++++++------
 drivers/md/dm-rq.c     |  2 +-
 include/linux/blk-mq.h |  5 ++++-
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f3bf3358a3bb..8ea3447339ca 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3858,7 +3858,7 @@ void blk_mq_release(struct request_queue *q)
 }
 
 static struct request_queue *blk_mq_init_queue_data(struct blk_mq_tag_set *set,
-		void *queuedata)
+		void *queuedata, const struct blk_mq_ops *ops)
 {
 	struct request_queue *q;
 	int ret;
@@ -3867,27 +3867,35 @@ static struct request_queue *blk_mq_init_queue_data(struct blk_mq_tag_set *set,
 	if (!q)
 		return ERR_PTR(-ENOMEM);
 	q->queuedata = queuedata;
-	ret = blk_mq_init_allocated_queue(set, q);
+	ret = blk_mq_init_allocated_queue(set, q, ops);
 	if (ret) {
 		blk_cleanup_queue(q);
 		return ERR_PTR(ret);
 	}
+
 	return q;
 }
 
 struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *set)
 {
-	return blk_mq_init_queue_data(set, NULL);
+	return blk_mq_init_queue_data(set, NULL, NULL);
 }
 EXPORT_SYMBOL(blk_mq_init_queue);
 
+struct request_queue *blk_mq_init_queue_ops(struct blk_mq_tag_set *set,
+					    const struct blk_mq_ops *custom_ops)
+{
+	return blk_mq_init_queue_data(set, NULL, custom_ops);
+}
+EXPORT_SYMBOL(blk_mq_init_queue_ops);
+
 struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set, void *queuedata,
 		struct lock_class_key *lkclass)
 {
 	struct request_queue *q;
 	struct gendisk *disk;
 
-	q = blk_mq_init_queue_data(set, queuedata);
+	q = blk_mq_init_queue_data(set, queuedata, NULL);
 	if (IS_ERR(q))
 		return ERR_CAST(q);
 
@@ -4010,13 +4018,16 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 }
 
 int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
-		struct request_queue *q)
+		struct request_queue *q, const struct blk_mq_ops *custom_ops)
 {
 	WARN_ON_ONCE(blk_queue_has_srcu(q) !=
 			!!(set->flags & BLK_MQ_F_BLOCKING));
 
 	/* mark the queue as mq asap */
-	q->mq_ops = set->ops;
+	if (custom_ops)
+		q->mq_ops = custom_ops;
+	else
+		q->mq_ops = set->ops;
 
 	q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn,
 					     blk_mq_poll_stats_bkt,
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 3907950a0ddc..9d93f72a3eec 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -560,7 +560,7 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	if (err)
 		goto out_kfree_tag_set;
 
-	err = blk_mq_init_allocated_queue(md->tag_set, md->queue);
+	err = blk_mq_init_allocated_queue(md->tag_set, md->queue, NULL);
 	if (err)
 		goto out_tag_set;
 	return 0;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index d319ffa59354..e12d17c86c52 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -688,8 +688,11 @@ struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set, void *queuedata,
 	__blk_mq_alloc_disk(set, queuedata, &__key);			\
 })
 struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *);
+struct request_queue *blk_mq_init_queue_ops(struct blk_mq_tag_set *,
+		const struct blk_mq_ops *custom_ops);
+
 int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
-		struct request_queue *q);
+		struct request_queue *q, const struct blk_mq_ops *custom_ops);
 void blk_mq_unregister_dev(struct device *, struct request_queue *);
 
 int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set);
-- 
2.26.2



* [PATCH 02/11] scsi: core: Add SUBMITTED_BY_SCSI_CUSTOM_OPS
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Add a new type of submitter, SUBMITTED_BY_SCSI_CUSTOM_OPS, for when a
SCSI cmnd is submitted via the block layer but not by scsi_queue_rq().

Since this is not a true SCSI cmnd, we should do nothing for it in
scsi_done_internal().

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/scsi_lib.c  | 2 ++
 include/scsi/scsi_cmnd.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a7788184908e..d230392f2b4a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1615,6 +1615,8 @@ static void scsi_done_internal(struct scsi_cmnd *cmd, bool complete_directly)
 		return scsi_eh_done(cmd);
 	case SUBMITTED_BY_SCSI_RESET_IOCTL:
 		return;
+	case SUBMITTED_BY_SCSI_CUSTOM_OPS:
+		return;
 	}
 
 	if (unlikely(blk_should_fake_timeout(scsi_cmd_to_rq(cmd)->q)))
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 76c5eaeeb3b5..ad4bcace1390 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -64,6 +64,7 @@ enum scsi_cmnd_submitter {
 	SUBMITTED_BY_BLOCK_LAYER = 0,
 	SUBMITTED_BY_SCSI_ERROR_HANDLER = 1,
 	SUBMITTED_BY_SCSI_RESET_IOCTL = 2,
+	SUBMITTED_BY_SCSI_CUSTOM_OPS = 3,
 } __packed;
 
 struct scsi_cmnd {
-- 
2.26.2



* [PATCH 03/11] libata: Send internal commands through the block layer
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

When SCSI HBA device drivers are required to process an ATA internal
command, they still need a tag for the IO. This often requires the driver
to set aside a set of tags for these sorts of IOs and manage the tags
themselves.

If we associate a SCSI command (and request) with an ATA internal command
then the tag is already provided, so introduce the change to send ATA
internal commands through the block layer with a set of custom blk-mq ops.

Note: I think that the timeout handling needs to be fixed up.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/ata/libata-core.c | 121 ++++++++++++++++++++++++++++----------
 1 file changed, 89 insertions(+), 32 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 67f88027680a..9db0428d0511 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1438,6 +1438,59 @@ static void ata_qc_complete_internal(struct ata_queued_cmd *qc)
 	complete(waiting);
 }
 
+struct ata_internal_sg_data {
+	struct completion wait;
+
+	unsigned int preempted_tag;
+	u32 preempted_sactive;
+	u64 preempted_qc_active;
+	int preempted_nr_active_links;
+};
+
+static blk_status_t ata_exec_internal_sg_queue_rq(struct blk_mq_hw_ctx *hctx,
+		const struct blk_mq_queue_data *bd)
+{
+	struct request *rq = bd->rq;
+	struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(rq);
+	struct ata_queued_cmd *qc = (struct ata_queued_cmd *)scmd->host_scribble;
+	struct ata_internal_sg_data *data;
+	struct ata_device *dev = qc->dev;
+	struct ata_port *ap = qc->ap;
+	struct ata_link *link = dev->link;
+	unsigned long flags;
+
+	data = container_of(qc->private_data, struct ata_internal_sg_data, wait);
+
+	blk_mq_start_request(bd->rq);
+
+	spin_lock_irqsave(ap->lock, flags);
+
+	/* no internal command while frozen */
+	if (ap->pflags & ATA_PFLAG_FROZEN) {
+		spin_unlock_irqrestore(ap->lock, flags);
+		return BLK_STS_TARGET;
+	}
+
+	data->preempted_tag = link->active_tag;
+	data->preempted_sactive = link->sactive;
+	data->preempted_qc_active = ap->qc_active;
+	data->preempted_nr_active_links = ap->nr_active_links;
+	link->active_tag = ATA_TAG_POISON;
+	link->sactive = 0;
+	ap->qc_active = 0;
+	ap->nr_active_links = 0;
+
+	ata_qc_issue(qc);
+
+	spin_unlock_irqrestore(ap->lock, flags);
+
+	return BLK_STS_OK;
+}
+
+static const struct blk_mq_ops ata_exec_internal_sg_mq_ops = {
+	.queue_rq	= ata_exec_internal_sg_queue_rq,
+};
+
 /**
  *	ata_exec_internal_sg - execute libata internal command
  *	@dev: Device to which the command is sent
@@ -1467,45 +1520,46 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
 {
 	struct ata_link *link = dev->link;
 	struct ata_port *ap = link->ap;
+	struct Scsi_Host *scsi_host = ap->scsi_host;
+	struct request_queue *request_queue;
 	u8 command = tf->command;
-	int auto_timeout = 0;
 	struct ata_queued_cmd *qc;
-	unsigned int preempted_tag;
-	u32 preempted_sactive;
-	u64 preempted_qc_active;
-	int preempted_nr_active_links;
-	DECLARE_COMPLETION_ONSTACK(wait);
-	unsigned long flags;
+	struct scsi_cmnd *scmd;
 	unsigned int err_mask;
-	int rc;
+	unsigned long flags;
+	struct request *rq;
+	int rc, auto_timeout = 0;
+	struct ata_internal_sg_data data = {
+		.wait = COMPLETION_INITIALIZER_ONSTACK(data.wait),
+	};
+	unsigned int op;
 
-	spin_lock_irqsave(ap->lock, flags);
+	op = (dma_dir == DMA_TO_DEVICE) ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN;
 
-	/* no internal command while frozen */
-	if (ap->pflags & ATA_PFLAG_FROZEN) {
-		spin_unlock_irqrestore(ap->lock, flags);
-		return AC_ERR_SYSTEM;
+	request_queue = blk_mq_init_queue_ops(&scsi_host->tag_set,
+					      &ata_exec_internal_sg_mq_ops);
+	if (IS_ERR(request_queue))
+		return AC_ERR_OTHER;
+
+	rq = scsi_alloc_request(request_queue, op, 0);
+	if (IS_ERR(rq)) {
+		err_mask = AC_ERR_OTHER;
+		goto out;
 	}
 
+	scmd = blk_mq_rq_to_pdu(rq);
+	scmd->submitter = SUBMITTED_BY_SCSI_CUSTOM_OPS;
+
 	/* initialize internal qc */
 	qc = __ata_qc_from_tag(ap, ATA_TAG_INTERNAL);
 
 	qc->tag = ATA_TAG_INTERNAL;
 	qc->hw_tag = 0;
-	qc->scsicmd = NULL;
+	qc->scsicmd = scmd;
 	qc->ap = ap;
 	qc->dev = dev;
 	ata_qc_reinit(qc);
 
-	preempted_tag = link->active_tag;
-	preempted_sactive = link->sactive;
-	preempted_qc_active = ap->qc_active;
-	preempted_nr_active_links = ap->nr_active_links;
-	link->active_tag = ATA_TAG_POISON;
-	link->sactive = 0;
-	ap->qc_active = 0;
-	ap->nr_active_links = 0;
-
 	/* prepare & issue qc */
 	qc->tf = *tf;
 	if (cdb)
@@ -1529,12 +1583,11 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
 		qc->nbytes = buflen;
 	}
 
-	qc->private_data = &wait;
+	qc->private_data = &data.wait;
 	qc->complete_fn = ata_qc_complete_internal;
 
-	ata_qc_issue(qc);
-
-	spin_unlock_irqrestore(ap->lock, flags);
+	scmd->host_scribble = (unsigned char *)qc;
+	blk_execute_rq_nowait(rq, true, NULL);
 
 	if (!timeout) {
 		if (ata_probe_timeout)
@@ -1548,7 +1601,7 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
 	if (ap->ops->error_handler)
 		ata_eh_release(ap);
 
-	rc = wait_for_completion_timeout(&wait, msecs_to_jiffies(timeout));
+	rc = wait_for_completion_timeout(&data.wait, msecs_to_jiffies(timeout));
 
 	if (ap->ops->error_handler)
 		ata_eh_acquire(ap);
@@ -1603,16 +1656,20 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
 	err_mask = qc->err_mask;
 
 	ata_qc_free(qc);
-	link->active_tag = preempted_tag;
-	link->sactive = preempted_sactive;
-	ap->qc_active = preempted_qc_active;
-	ap->nr_active_links = preempted_nr_active_links;
+	link->active_tag = data.preempted_tag;
+	link->sactive = data.preempted_sactive;
+	ap->qc_active = data.preempted_qc_active;
+	ap->nr_active_links = data.preempted_nr_active_links;
 
 	spin_unlock_irqrestore(ap->lock, flags);
 
 	if ((err_mask & AC_ERR_TIMEOUT) && auto_timeout)
 		ata_internal_cmd_timed_out(dev, command);
 
+	__blk_mq_end_request(rq, BLK_STS_OK);
+
+out:
+	blk_cleanup_queue(request_queue);
 	return err_mask;
 }
 
-- 
2.26.2



* [PATCH 04/11] scsi: libsas: Send SMP commands through the block layer
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Send SMP commands through the block layer so that each command gets a
unique tag associated with it.

We can now also take advantage of the block layer request timeout handling.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/libsas/sas_expander.c  | 38 +++++++++++++++------
 drivers/scsi/libsas/sas_internal.h  |  1 +
 drivers/scsi/libsas/sas_scsi_host.c | 53 +++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 260e735d06fa..bd25b00912ac 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -39,10 +39,19 @@ static int smp_execute_task_sg(struct domain_device *dev,
 	struct sas_internal *i =
 		to_sas_internal(dev->port->ha->core.shost->transportt);
 	struct sas_ha_struct *ha = dev->port->ha;
+	struct Scsi_Host *shost = ha->core.shost;
+	struct request_queue *request_queue;
+	struct request *rq;
+
+	request_queue = sas_alloc_request_queue(shost);
+	if (IS_ERR(request_queue))
+		return PTR_ERR(request_queue);
 
 	pm_runtime_get_sync(ha->dev);
 	mutex_lock(&dev->ex_dev.cmd_mutex);
 	for (retry = 0; retry < 3; retry++) {
+		struct scsi_cmnd *scmd;
+
 		if (test_bit(SAS_DEV_GONE, &dev->state)) {
 			res = -ECOMM;
 			break;
@@ -53,26 +62,30 @@ static int smp_execute_task_sg(struct domain_device *dev,
 			res = -ENOMEM;
 			break;
 		}
+
+		rq = scsi_alloc_request(request_queue, REQ_OP_DRV_IN, 0);
+		if (IS_ERR(rq)) {
+			res = PTR_ERR(rq);
+			break;
+		}
+
+		scmd = blk_mq_rq_to_pdu(rq);
+		scmd->submitter = SUBMITTED_BY_SCSI_CUSTOM_OPS;
+		ASSIGN_SAS_TASK(scmd, task);
+
 		task->dev = dev;
+		task->uldd_task = scmd;
 		task->task_proto = dev->tproto;
 		task->smp_task.smp_req = *req;
 		task->smp_task.smp_resp = *resp;
-
 		task->task_done = sas_task_internal_done;
 
-		task->slow_task->timer.function = sas_task_internal_timedout;
-		task->slow_task->timer.expires = jiffies + SMP_TIMEOUT*HZ;
-		add_timer(&task->slow_task->timer);
+		rq->timeout = SMP_TIMEOUT*HZ;
 
-		res = i->dft->lldd_execute_task(task, GFP_KERNEL);
-
-		if (res) {
-			del_timer(&task->slow_task->timer);
-			pr_notice("executing SMP task failed:%d\n", res);
-			break;
-		}
+		blk_execute_rq_nowait(rq, true, NULL);
 
 		wait_for_completion(&task->slow_task->completion);
+		__blk_mq_end_request(rq, BLK_STS_OK);
 		res = -ECOMM;
 		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
 			pr_notice("smp task timed out or aborted\n");
@@ -117,6 +130,9 @@ static int smp_execute_task_sg(struct domain_device *dev,
 
 	BUG_ON(retry == 3 && task != NULL);
 	sas_free_task(task);
+
+	blk_cleanup_queue(request_queue);
+
 	return res;
 }
 
diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
index 13d0ffaada93..ca2bace16953 100644
--- a/drivers/scsi/libsas/sas_internal.h
+++ b/drivers/scsi/libsas/sas_internal.h
@@ -95,6 +95,7 @@ extern const work_func_t sas_port_event_fns[PORT_NUM_EVENTS];
 
 void sas_task_internal_done(struct sas_task *task);
 void sas_task_internal_timedout(struct timer_list *t);
+struct request_queue *sas_alloc_request_queue(struct Scsi_Host *shost);
 int sas_execute_tmf(struct domain_device *device, void *parameter,
 		    int para_len, int force_phy_id,
 		    struct sas_tmf_task *tmf);
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 9c82e5dc4fcc..91133ad37ae8 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -920,6 +920,59 @@ void sas_task_internal_timedout(struct timer_list *t)
 #define TASK_TIMEOUT			(20 * HZ)
 #define TASK_RETRY			3
 
+static enum blk_eh_timer_return sas_task_timedout(struct request *rq, bool resv)
+{
+	struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(rq);
+	struct sas_task *task = TO_SAS_TASK(scmd);
+	bool is_completed = true;
+	unsigned long flags;
+
+	spin_lock_irqsave(&task->task_state_lock, flags);
+	if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) {
+		task->task_state_flags |= SAS_TASK_STATE_ABORTED;
+		is_completed = false;
+	}
+	spin_unlock_irqrestore(&task->task_state_lock, flags);
+
+	if (!is_completed)
+		complete(&task->slow_task->completion);
+	return BLK_EH_DONE;
+}
+
+static blk_status_t sas_exec_rq(struct blk_mq_hw_ctx *hctx,
+				const struct blk_mq_queue_data *bd)
+{
+	struct request *rq = bd->rq;
+	struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(rq);
+	struct sas_task *task = (struct sas_task *)scmd->host_scribble;
+	struct domain_device *device = task->dev;
+	struct sas_ha_struct *ha = device->port->ha;
+	struct sas_internal *i = to_sas_internal(ha->core.shost->transportt);
+	int res;
+
+	blk_mq_start_request(bd->rq);
+
+	res = i->dft->lldd_execute_task(task, GFP_KERNEL);
+	if (res) {
+		pr_notice("executing task proto 0x%x failed:%d\n", task->task_proto, res);
+		task->task_status.resp = SAS_TASK_UNDELIVERED;
+		task->task_status.stat = SAS_DEVICE_UNKNOWN;
+		complete(&task->slow_task->completion);
+	}
+
+	return BLK_STS_OK;
+}
+
+static const struct blk_mq_ops sas_blk_mq_ops = {
+	.queue_rq	= sas_exec_rq,
+	.timeout	= sas_task_timedout,
+};
+
+struct request_queue *sas_alloc_request_queue(struct Scsi_Host *shost)
+{
+	return blk_mq_init_queue_ops(&shost->tag_set, &sas_blk_mq_ops);
+}
+
 static int sas_execute_internal_abort(struct domain_device *device,
 				      enum sas_internal_abort type, u16 tag,
 				      unsigned int qid, void *data)
-- 
2.26.2
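sas_alloc_request_queue() follows the kernel's pointer-error convention: a single pointer return carries either a valid queue or a negative errno, tested with IS_ERR() as in smp_execute_task_sg() above. A self-contained userspace sketch of that encoding (illustrative; the kernel's names are reused but this is not include/linux/err.h):

```c
/* Error codes live in the top MAX_ERRNO bytes of the address space,
 * which no valid object pointer can occupy, so one return value can
 * carry either a pointer or a negative errno. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}
```

Callers then write `if (IS_ERR(q)) return PTR_ERR(q);` exactly as the patch does, avoiding a separate out-parameter for the error code.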



* [PATCH 05/11] scsi: libsas: Send TMF commands through the block layer
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

As with SMP commands, send TMF commands through the block layer.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/libsas/sas_scsi_host.c | 44 +++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 91133ad37ae8..04c6298d1340 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -1070,15 +1070,39 @@ int sas_execute_tmf(struct domain_device *device, void *parameter,
 	struct sas_internal *i =
 		to_sas_internal(device->port->ha->core.shost->transportt);
 	int res, retry;
+	struct request_queue *request_queue;
+	struct request *rq;
+	struct sas_ha_struct *ha = device->port->ha;
+	struct Scsi_Host *shost = ha->core.shost;
+
+	request_queue = sas_alloc_request_queue(shost);
+	if (IS_ERR(request_queue))
+		return PTR_ERR(request_queue);
 
 	for (retry = 0; retry < TASK_RETRY; retry++) {
+		struct scsi_cmnd *scmd;
+
 		task = sas_alloc_slow_task(GFP_KERNEL);
-		if (!task)
-			return -ENOMEM;
+		if (!task) {
+			res = -ENOMEM;
+			break;
+		}
 
 		task->dev = device;
 		task->task_proto = device->tproto;
 
+		rq = scsi_alloc_request(request_queue, REQ_OP_DRV_IN, 0);
+		if (IS_ERR(rq)) {
+			res = PTR_ERR(rq);
+			break;
+		}
+
+		scmd = blk_mq_rq_to_pdu(rq);
+		scmd->submitter = SUBMITTED_BY_SCSI_CUSTOM_OPS;
+		ASSIGN_SAS_TASK(scmd, task);
+
+		task->uldd_task = scmd;
+
 		if (dev_is_sata(device)) {
 			task->ata_task.device_control_reg_update = 1;
 			if (force_phy_id >= 0) {
@@ -1093,20 +1117,14 @@ int sas_execute_tmf(struct domain_device *device, void *parameter,
 		task->task_done = sas_task_internal_done;
 		task->tmf = tmf;
 
-		task->slow_task->timer.function = sas_task_internal_timedout;
-		task->slow_task->timer.expires = jiffies + TASK_TIMEOUT;
-		add_timer(&task->slow_task->timer);
+		rq->timeout = TASK_TIMEOUT;
 
-		res = i->dft->lldd_execute_task(task, GFP_KERNEL);
-		if (res) {
-			del_timer_sync(&task->slow_task->timer);
-			pr_err("executing TMF task failed %016llx (%d)\n",
-			       SAS_ADDR(device->sas_addr), res);
-			break;
-		}
+		blk_execute_rq_nowait(rq, true, NULL);
 
 		wait_for_completion(&task->slow_task->completion);
 
+		__blk_mq_end_request(rq, BLK_STS_OK);
+
 		if (i->dft->lldd_tmf_exec_complete)
 			i->dft->lldd_tmf_exec_complete(device);
 
@@ -1177,6 +1195,8 @@ int sas_execute_tmf(struct domain_device *device, void *parameter,
 			SAS_ADDR(device->sas_addr), TASK_RETRY);
 	sas_free_task(task);
 
+	blk_cleanup_queue(request_queue);
+
 	return res;
 }
 
-- 
2.26.2



* [PATCH 06/11] scsi: core: Add scsi_alloc_request_hwq()
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Add a variant of scsi_alloc_request() which allocates a request for a
specific hw queue.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/scsi_lib.c  | 12 ++++++++++++
 include/scsi/scsi_cmnd.h |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d230392f2b4a..543cc01b2816 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1137,6 +1137,18 @@ struct request *scsi_alloc_request(struct request_queue *q,
 }
 EXPORT_SYMBOL_GPL(scsi_alloc_request);
 
+struct request *scsi_alloc_request_hwq(struct request_queue *q,
+		unsigned int op, blk_mq_req_flags_t flags, unsigned int hwq)
+{
+	struct request *rq;
+
+	rq = blk_mq_alloc_request_hctx(q, op, flags, hwq);
+	if (!IS_ERR(rq))
+		scsi_initialize_rq(rq);
+	return rq;
+}
+EXPORT_SYMBOL_GPL(scsi_alloc_request_hwq);
+
 /*
  * Only called when the request isn't completed by SCSI, and not freed by
  * SCSI
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index ad4bcace1390..94e65f4c81b5 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -399,4 +399,7 @@ extern void scsi_build_sense(struct scsi_cmnd *scmd, int desc,
 struct request *scsi_alloc_request(struct request_queue *q,
 		unsigned int op, blk_mq_req_flags_t flags);
 
+struct request *scsi_alloc_request_hwq(struct request_queue *q,
+		unsigned int op, blk_mq_req_flags_t flags, unsigned int hwq);
+
 #endif /* _SCSI_SCSI_CMND_H */
-- 
2.26.2
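The reason to allocate requests from the block layer at all is that each one carries a unique tag drawn from the host's shared tag set, here selectable per hardware queue. blk-mq uses a scalable sbitmap for this; the idea can be conveyed with a deliberately simplified single-word bitmap allocator (illustrative only, not blk-mq's implementation):

```c
#define NUM_TAGS 32

static unsigned int tag_bitmap;	/* bit n set => tag n in use */

/* Allocate the lowest free tag, or -1 if the set is exhausted. */
int tag_alloc(void)
{
	for (int tag = 0; tag < NUM_TAGS; tag++) {
		if (!(tag_bitmap & (1u << tag))) {
			tag_bitmap |= 1u << tag;
			return tag;
		}
	}
	return -1;
}

/* Return a tag to the set. */
void tag_free(int tag)
{
	tag_bitmap &= ~(1u << tag);
}
```

A real tag set also needs atomicity, wait queues, and per-hctx partitioning, which is precisely what scsi_alloc_request_hwq() delegates to blk_mq_alloc_request_hctx().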



* [PATCH 07/11] scsi: libsas: Send internal abort commands through the block layer
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

As with SMP commands, send internal abort commands through the block
layer.

At this point no code in libsas relies on sas_slow_task.timer;
unfortunately we can't delete it yet, as some LLDDs (pm8001) still rely
on it - let's try to remove that dependency in future.

But we can delete the legacy timeout handler, sas_task_internal_timedout().

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/libsas/sas_scsi_host.c | 59 +++++++++++++++--------------
 1 file changed, 31 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 04c6298d1340..f139977098c2 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -899,24 +899,6 @@ void sas_task_internal_done(struct sas_task *task)
 	complete(&task->slow_task->completion);
 }
 
-void sas_task_internal_timedout(struct timer_list *t)
-{
-	struct sas_task_slow *slow = from_timer(slow, t, timer);
-	struct sas_task *task = slow->task;
-	bool is_completed = true;
-	unsigned long flags;
-
-	spin_lock_irqsave(&task->task_state_lock, flags);
-	if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) {
-		task->task_state_flags |= SAS_TASK_STATE_ABORTED;
-		is_completed = false;
-	}
-	spin_unlock_irqrestore(&task->task_state_lock, flags);
-
-	if (!is_completed)
-		complete(&task->slow_task->completion);
-}
-
 #define TASK_TIMEOUT			(20 * HZ)
 #define TASK_RETRY			3
 
@@ -979,35 +961,52 @@ static int sas_execute_internal_abort(struct domain_device *device,
 {
 	struct sas_ha_struct *ha = device->port->ha;
 	struct sas_internal *i = to_sas_internal(ha->core.shost->transportt);
+	struct Scsi_Host *shost = ha->core.shost;
 	struct sas_task *task = NULL;
 	int res, retry;
+	struct request_queue *request_queue;
+	struct request *rq;
+
+	request_queue = sas_alloc_request_queue(shost);
+	if (IS_ERR(request_queue))
+		return PTR_ERR(request_queue);
 
 	for (retry = 0; retry < TASK_RETRY; retry++) {
+		struct scsi_cmnd *scmd;
+
 		task = sas_alloc_slow_task(GFP_KERNEL);
-		if (!task)
-			return -ENOMEM;
+		if (!task) {
+			res = -ENOMEM;
+			break;
+		}
 
 		task->dev = device;
 		task->task_proto = SAS_PROTOCOL_INTERNAL_ABORT;
 		task->task_done = sas_task_internal_done;
-		task->slow_task->timer.function = sas_task_internal_timedout;
-		task->slow_task->timer.expires = jiffies + TASK_TIMEOUT;
-		add_timer(&task->slow_task->timer);
 
 		task->abort_task.tag = tag;
 		task->abort_task.type = type;
 		task->abort_task.qid = qid;
 
-		res = i->dft->lldd_execute_task(task, GFP_KERNEL);
-		if (res) {
-			del_timer_sync(&task->slow_task->timer);
-			pr_err("Executing internal abort failed %016llx (%d)\n",
-			       SAS_ADDR(device->sas_addr), res);
+		rq = scsi_alloc_request_hwq(request_queue, REQ_OP_DRV_IN, BLK_MQ_REQ_NOWAIT, qid);
+		if (IS_ERR(rq)) {
+			res = PTR_ERR(rq);
 			break;
 		}
 
+		scmd = blk_mq_rq_to_pdu(rq);
+		scmd->submitter = SUBMITTED_BY_SCSI_CUSTOM_OPS;
+		ASSIGN_SAS_TASK(scmd, task);
+
+		task->uldd_task = scmd;
+
+		rq->timeout = TASK_TIMEOUT;
+
+		blk_execute_rq_nowait(rq, true, NULL);
+
 		wait_for_completion(&task->slow_task->completion);
 		res = TMF_RESP_FUNC_FAILED;
+		__blk_mq_end_request(rq, BLK_STS_OK);
 
 		/* Even if the internal abort timed out, return direct. */
 		if (task->task_state_flags & SAS_TASK_STATE_ABORTED) {
@@ -1042,7 +1041,11 @@ static int sas_execute_internal_abort(struct domain_device *device,
 		task = NULL;
 	}
 	BUG_ON(retry == TASK_RETRY && task != NULL);
+
 	sas_free_task(task);
+
+	blk_cleanup_queue(request_queue);
+
 	return res;
 }
 
-- 
2.26.2
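The deleted sas_task_internal_timedout() and its blk-mq replacement sas_task_timedout() resolve the same race under task_state_lock: the timeout path may mark the task aborted only if the completion path has not already marked it done. A userspace sketch of that state machine (a pthread mutex standing in for the spinlock; function names are illustrative):

```c
#include <pthread.h>
#include <stdbool.h>

#define TASK_STATE_DONE		(1u << 0)
#define TASK_STATE_ABORTED	(1u << 1)

struct task {
	pthread_mutex_t lock;	/* stands in for task_state_lock */
	unsigned int state;
};

void task_init(struct task *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->state = 0;
}

/* Completion path: claim DONE unless the timeout path already aborted us. */
bool task_complete(struct task *t)
{
	bool won;

	pthread_mutex_lock(&t->lock);
	won = !(t->state & TASK_STATE_ABORTED);
	if (won)
		t->state |= TASK_STATE_DONE;
	pthread_mutex_unlock(&t->lock);
	return won;
}

/* Timeout path, mirroring sas_task_timedout(): abort only if not yet done. */
bool task_timeout(struct task *t)
{
	bool aborted = false;

	pthread_mutex_lock(&t->lock);
	if (!(t->state & TASK_STATE_DONE)) {
		t->state |= TASK_STATE_ABORTED;
		aborted = true;
	}
	pthread_mutex_unlock(&t->lock);
	return aborted;
}
```

In this sketch, whichever path takes the lock first wins and the loser observes the other's flag and backs off.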



* [PATCH 08/11] scsi: libsas: Change ATA support to deal with each qc having a SCSI command
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Now that each ATA qc has a SCSI command associated with it, change the
qc <-> scsi_cmnd <-> sas_task referencing to deal with this. Essentially
task->uldd_task = scmd always. We need to do this for sas_execute_tmf(),
as the ATA commands sent there have no qc.

TODO: rename ASSIGN_SAS_TASK and TO_SAS_TASK to be more appropriate.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 11 +----------
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |  3 ++-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |  3 ++-
 drivers/scsi/libsas/sas_ata.c          | 11 ++++++-----
 4 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 461ef8a76c4c..e438c68e249a 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -520,16 +520,7 @@ static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 				return -ECOMM;
 		}
 
-		if (task->uldd_task) {
-			struct ata_queued_cmd *qc;
-
-			if (dev_is_sata(device)) {
-				qc = task->uldd_task;
-				scmd = qc->scsicmd;
-			} else {
-				scmd = task->uldd_task;
-			}
-		}
+		scmd = task->uldd_task;
 
 		if (scmd) {
 			unsigned int dq_index;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 455d49299ddf..d1f2c8fbdc79 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2538,7 +2538,8 @@ static void prep_ata_v2_hw(struct hisi_hba *hisi_hba,
 
 	/* dw2 */
 	if (task->ata_task.use_ncq) {
-		struct ata_queued_cmd *qc = task->uldd_task;
+		struct scsi_cmnd *scmd = task->uldd_task;
+		struct ata_queued_cmd *qc = (struct ata_queued_cmd *)scmd->host_scribble;
 
 		hdr_tag = qc->tag;
 		task->ata_task.fis.sector_count |= (u8) (hdr_tag << 3);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 79f87d7c3e68..e3e02a24ee76 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -1421,7 +1421,8 @@ static void prep_ata_v3_hw(struct hisi_hba *hisi_hba,
 
 	/* dw2 */
 	if (task->ata_task.use_ncq) {
-		struct ata_queued_cmd *qc = task->uldd_task;
+		struct scsi_cmnd *scmd = task->uldd_task;
+		struct ata_queued_cmd *qc = (struct ata_queued_cmd *)scmd->host_scribble;
 
 		hdr_tag = qc->tag;
 		task->ata_task.fis.sector_count |= (u8) (hdr_tag << 3);
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index d34c82e24d9a..4d2b8fb31f30 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -72,7 +72,8 @@ static enum ata_completion_errors sas_to_ata_err(struct task_status_struct *ts)
 
 static void sas_ata_task_done(struct sas_task *task)
 {
-	struct ata_queued_cmd *qc = task->uldd_task;
+	struct scsi_cmnd *scmd = task->uldd_task;
+	struct ata_queued_cmd *qc = TO_SAS_TASK(scmd);
 	struct domain_device *dev = task->dev;
 	struct task_status_struct *stat = &task->task_status;
 	struct ata_task_resp *resp = (struct ata_task_resp *)stat->buf;
@@ -184,7 +185,7 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 		qc->tf.nsect = 0;
 
 	ata_tf_to_fis(&qc->tf, qc->dev->link->pmp, 1, (u8 *)&task->ata_task.fis);
-	task->uldd_task = qc;
+	task->uldd_task = qc->scsicmd;
 	if (ata_is_atapi(qc->tf.protocol)) {
 		memcpy(task->ata_task.atapi_packet, qc->cdb, qc->dev->cdb_len);
 		task->total_xfer_len = qc->nbytes;
@@ -207,8 +208,7 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 	task->ata_task.use_ncq = ata_is_ncq(qc->tf.protocol);
 	task->ata_task.dma_xfer = ata_is_dma(qc->tf.protocol);
 
-	if (qc->scsicmd)
-		ASSIGN_SAS_TASK(qc->scsicmd, task);
+	ASSIGN_SAS_TASK(qc->scsicmd, qc);
 
 	ret = i->dft->lldd_execute_task(task, GFP_ATOMIC);
 	if (ret) {
@@ -583,7 +583,8 @@ int sas_ata_init(struct domain_device *found_dev)
 
 void sas_ata_task_abort(struct sas_task *task)
 {
-	struct ata_queued_cmd *qc = task->uldd_task;
+	struct scsi_cmnd *scmd = task->uldd_task;
+	struct ata_queued_cmd *qc = TO_SAS_TASK(scmd);
 	struct completion *waiting;
 
 	/* Bounce SCSI-initiated commands to the SCSI EH */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 09/11] scsi: libsas: Add sas_task_to_unique_tag()
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Add a function to get the unique request tag from a sas_task.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c |  8 ++++----
 drivers/scsi/libsas/sas_scsi_host.c   | 14 ++++++++++++++
 include/scsi/libsas.h                 |  2 ++
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index e438c68e249a..f6b64c789335 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -177,13 +177,13 @@ static void hisi_sas_slot_index_set(struct hisi_hba *hisi_hba, int slot_idx)
 }
 
 static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba,
-				     struct scsi_cmnd *scsi_cmnd)
+				     struct sas_task *task)
 {
 	int index;
 	void *bitmap = hisi_hba->slot_index_tags;
 
-	if (scsi_cmnd)
-		return scsi_cmd_to_rq(scsi_cmnd)->tag;
+	if (task)
+		return sas_task_to_unique_tag(task);
 
 	spin_lock(&hisi_hba->lock);
 	index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count,
@@ -572,7 +572,7 @@ static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 	if (!internal_abort && hisi_hba->hw->slot_index_alloc)
 		rc = hisi_hba->hw->slot_index_alloc(hisi_hba, device);
 	else
-		rc = hisi_sas_slot_index_alloc(hisi_hba, scmd);
+		rc = hisi_sas_slot_index_alloc(hisi_hba, task);
 
 	if (rc < 0)
 		goto err_out_dif_dma_unmap;
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index f139977098c2..425904fa4cc7 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -1065,6 +1065,20 @@ int sas_execute_internal_abort_dev(struct domain_device *device,
 }
 EXPORT_SYMBOL_GPL(sas_execute_internal_abort_dev);
 
+static u32 sas_task_to_rq_unique_tag(struct sas_task *task)
+{
+	return blk_mq_unique_tag(scsi_cmd_to_rq(task->uldd_task));
+}
+
+unsigned int sas_task_to_unique_tag(struct sas_task *task)
+{
+	u32 unique = sas_task_to_rq_unique_tag(task);
+
+	return blk_mq_unique_tag_to_tag(unique);
+
+}
+EXPORT_SYMBOL_GPL(sas_task_to_unique_tag);
+
 int sas_execute_tmf(struct domain_device *device, void *parameter,
 		    int para_len, int force_phy_id,
 		    struct sas_tmf_task *tmf)
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index ff04eb6d250b..3d9ef4c2c889 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -625,6 +625,8 @@ struct sas_task {
 	struct sas_tmf_task *tmf;
 };
 
+unsigned int sas_task_to_unique_tag(struct sas_task *task);
+
 struct sas_task_slow {
 	/* standard/extra infrastructure for slow path commands (SMP and
 	 * internal lldd commands
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 10/11] scsi: libsas: Add sas_task_to_hwq()
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Add a function to get the hw queue index from a sas_task.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 20 ++------------------
 drivers/scsi/libsas/sas_scsi_host.c   |  9 ++++++++-
 include/scsi/libsas.h                 |  2 +-
 3 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index f6b64c789335..b3e03c229cb5 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -461,7 +461,6 @@ static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 	struct asd_sas_port *sas_port = device->port;
 	struct hisi_sas_device *sas_dev = device->lldd_dev;
 	bool internal_abort = sas_is_internal_abort(task);
-	struct scsi_cmnd *scmd = NULL;
 	struct hisi_sas_dq *dq = NULL;
 	struct hisi_sas_port *port;
 	struct hisi_hba *hisi_hba;
@@ -486,6 +485,8 @@ static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 	hisi_hba = dev_to_hisi_hba(device);
 	dev = hisi_hba->dev;
 
+	dq = &hisi_hba->dq[sas_task_to_hwq(task)];
+
 	switch (task->task_proto) {
 	case SAS_PROTOCOL_SSP:
 	case SAS_PROTOCOL_SMP:
@@ -520,22 +521,6 @@ static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 				return -ECOMM;
 		}
 
-		scmd = task->uldd_task;
-
-		if (scmd) {
-			unsigned int dq_index;
-			u32 blk_tag;
-
-			blk_tag = blk_mq_unique_tag(scsi_cmd_to_rq(scmd));
-			dq_index = blk_mq_unique_tag_to_hwq(blk_tag);
-			dq = &hisi_hba->dq[dq_index];
-		} else {
-			struct Scsi_Host *shost = hisi_hba->shost;
-			struct blk_mq_queue_map *qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT];
-			int queue = qmap->mq_map[raw_smp_processor_id()];
-
-			dq = &hisi_hba->dq[queue];
-		}
 		break;
 	case SAS_PROTOCOL_INTERNAL_ABORT:
 		if (!hisi_hba->hw->prep_abort)
@@ -550,7 +535,6 @@ static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 			return -EINVAL;
 
 		port = to_hisi_sas_port(sas_port);
-		dq = &hisi_hba->dq[task->abort_task.qid];
 		break;
 	default:
 		dev_err(hisi_hba->dev, "task prep: unknown/unsupported proto (0x%x)\n",
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 425904fa4cc7..b808b23e265e 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -986,7 +986,6 @@ static int sas_execute_internal_abort(struct domain_device *device,
 
 		task->abort_task.tag = tag;
 		task->abort_task.type = type;
-		task->abort_task.qid = qid;
 
 		rq = scsi_alloc_request_hwq(request_queue, REQ_OP_DRV_IN, BLK_MQ_REQ_NOWAIT, qid);
 		if (!rq) {
@@ -1079,6 +1078,14 @@ unsigned int sas_task_to_unique_tag(struct sas_task *task)
 }
 EXPORT_SYMBOL_GPL(sas_task_to_unique_tag);
 
+unsigned int sas_task_to_hwq(struct sas_task *task)
+{
+	u32 unique = sas_task_to_rq_unique_tag(task);
+
+	return blk_mq_unique_tag_to_hwq(unique);
+}
+EXPORT_SYMBOL_GPL(sas_task_to_hwq);
+
 int sas_execute_tmf(struct domain_device *device, void *parameter,
 		    int para_len, int force_phy_id,
 		    struct sas_tmf_task *tmf)
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 3d9ef4c2c889..368c45016af0 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -565,7 +565,6 @@ enum sas_internal_abort {
 
 struct sas_internal_abort_task {
 	enum sas_internal_abort type;
-	unsigned int qid;
 	u16 tag;
 };
 
@@ -626,6 +625,7 @@ struct sas_task {
 };
 
 unsigned int sas_task_to_unique_tag(struct sas_task *task);
+unsigned int sas_task_to_hwq(struct sas_task *task);
 
 struct sas_task_slow {
 	/* standard/extra infrastructure for slow path commands (SMP and
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 11/11] scsi: hisi_sas: Remove private tag management
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 10:39   ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 10:39 UTC (permalink / raw)
  To: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo, John Garry

Now every sas_task which the driver sees has an associated SCSI command
and request, so drop the internal tag management.

No reserved tags have been set aside in the tagset yet, but this is simple
to do.

For v2 HW we need to continue to allocate the tag internally, as the HW
is so broken that special rules for tag allocation are required; see
slot_index_alloc_quirk_v2_hw().

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 37 +--------------------------
 1 file changed, 1 insertion(+), 36 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index b3e03c229cb5..19c9ed169c91 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -169,41 +169,6 @@ static void hisi_sas_slot_index_free(struct hisi_hba *hisi_hba, int slot_idx)
 	}
 }
 
-static void hisi_sas_slot_index_set(struct hisi_hba *hisi_hba, int slot_idx)
-{
-	void *bitmap = hisi_hba->slot_index_tags;
-
-	__set_bit(slot_idx, bitmap);
-}
-
-static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba,
-				     struct sas_task *task)
-{
-	int index;
-	void *bitmap = hisi_hba->slot_index_tags;
-
-	if (task)
-		return sas_task_to_unique_tag(task);
-
-	spin_lock(&hisi_hba->lock);
-	index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count,
-				   hisi_hba->last_slot_index + 1);
-	if (index >= hisi_hba->slot_index_count) {
-		index = find_next_zero_bit(bitmap,
-				hisi_hba->slot_index_count,
-				HISI_SAS_UNRESERVED_IPTT);
-		if (index >= hisi_hba->slot_index_count) {
-			spin_unlock(&hisi_hba->lock);
-			return -SAS_QUEUE_FULL;
-		}
-	}
-	hisi_sas_slot_index_set(hisi_hba, index);
-	hisi_hba->last_slot_index = index;
-	spin_unlock(&hisi_hba->lock);
-
-	return index;
-}
-
 void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task,
 			     struct hisi_sas_slot *slot)
 {
@@ -556,7 +521,7 @@ static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 	if (!internal_abort && hisi_hba->hw->slot_index_alloc)
 		rc = hisi_hba->hw->slot_index_alloc(hisi_hba, device);
 	else
-		rc = hisi_sas_slot_index_alloc(hisi_hba, task);
+		rc = sas_task_to_unique_tag(task);
 
 	if (rc < 0)
 		goto err_out_dif_dma_unmap;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 10:39   ` [dm-devel] " John Garry
@ 2022-03-22 11:18     ` Christoph Hellwig
  -1 siblings, 0 replies; 58+ messages in thread
From: Christoph Hellwig @ 2022-03-22 11:18 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

On Tue, Mar 22, 2022 at 06:39:35PM +0800, John Garry wrote:
> Add an API to allocate a request queue which accepts a custom set of
> blk_mq_ops for that request queue.
> 
> The reason which we may want custom ops is for queuing requests which we
> don't want to go through the normal queuing path.

Eww.  I really do not think we should do separate ops per queue, as that
is going to get us into a deep mess eventually.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 02/11] scsi: core: Add SUBMITTED_BY_SCSI_CUSTOM_OPS
  2022-03-22 10:39   ` [dm-devel] " John Garry
@ 2022-03-22 11:20     ` Christoph Hellwig
  -1 siblings, 0 replies; 58+ messages in thread
From: Christoph Hellwig @ 2022-03-22 11:20 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

On Tue, Mar 22, 2022 at 06:39:36PM +0800, John Garry wrote:
> Add a new type of submitter, SUBMITTED_BY_SCSI_CUSTOM_OPS, for when a
> SCSI cmnd is submitted via the block layer but not by scsi_queue_rq().
> 
> Since this is not a true SCSI cmnd we should do nothing for it in
> scsi_done_internal().

CUSTOM_OPS sounds weird.  I think the naming should match the naming
of whatever is used to submit it (haven't read the remaining patches
yet).  And this should probably be folded into the patch that actually
uses it.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 03/11] libata: Send internal commands through the block layer
  2022-03-22 10:39   ` [dm-devel] " John Garry
@ 2022-03-22 11:20     ` Christoph Hellwig
  0 siblings, 0 replies; 58+ messages in thread
From: Christoph Hellwig @ 2022-03-22 11:20 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

On Tue, Mar 22, 2022 at 06:39:37PM +0800, John Garry wrote:
> When SCSI HBA device drivers are required to process an ATA internal
> command they still need a tag for the IO. This often requires the driver
> to set aside a set of tags for these sorts of IOs and manage the tags
> themselves.
> 
> If we associate a SCSI command (and request) with an ATA internal command
> then the tag is already provided, so introduce the change to send ATA
> internal commands through the block layer with a set of custom blk-mq ops.
> 
> note: I think that the timeout handling needs to be fixed up.

Any reason to not just send them through an ATA_16 passthrough CDB and
just use all the normal SCSI command handling?
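
For reference, a minimal userspace sketch of how an ATA_16 (ATA PASS-THROUGH (16)) CDB is laid out per SAT: byte 1 carries the protocol, byte 2 the transfer attributes, and bytes 3..14 the ATA taskfile. The helper and constant names here are illustrative, not the libata ones:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ATA_16          0x85  /* SAT ATA PASS-THROUGH (16) opcode */
#define ATA_PROT_PIO_IN 4     /* SAT protocol value: PIO Data-In  */
#define ATA_CMD_ID_ATA  0xEC  /* IDENTIFY DEVICE                  */

/* Hypothetical helper: fill a 16-byte ATA pass-through CDB. Only the
 * fields needed for a simple non-LBA command are set here. */
static void build_ata16_cdb(uint8_t cdb[16], uint8_t protocol,
			    uint8_t ata_cmd, uint16_t sectors, int data_in)
{
	memset(cdb, 0, 16);
	cdb[0] = ATA_16;
	cdb[1] = (uint8_t)(protocol << 1);	/* protocol in bits 4:1, extend=0 */
	/* t_dir=1 for device->host, byt_blok=1 (count is in sectors),
	 * t_length=2 (transfer length is in the sector-count field) */
	cdb[2] = (uint8_t)((data_in ? 0x08 : 0) | 0x04 | 0x02);
	cdb[6] = (uint8_t)(sectors & 0xff);	/* sector count 7:0 */
	cdb[14] = ata_cmd;			/* ATA command register */
}
```

libata already translates such CDBs back into a taskfile on its SAT(L) pass-through path, which is what would make this route attractive for internal commands.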

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 04/11] scsi: libsas: Send SMP commands through the block layer
  2022-03-22 10:39   ` [dm-devel] " John Garry
@ 2022-03-22 11:23     ` Christoph Hellwig
  0 siblings, 0 replies; 58+ messages in thread
From: Christoph Hellwig @ 2022-03-22 11:23 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen, hch,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

Wouldn't it make more sense to have a new member in the
Scsi_Host_template that gets called for these requests early from
->queue_rq instead of the new request_queue and custom ops?

FYI, I think reusing the block layer for all these commands is a good
idea, but the way it is done here seems a bit ugly and not very
maintainable.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH RFC 00/11] blk-mq/libata/scsi: SCSI driver tagging improvements
  2022-03-22 10:39 ` [dm-devel] " John Garry
@ 2022-03-22 11:30   ` Hannes Reinecke
  0 siblings, 0 replies; 58+ messages in thread
From: Hannes Reinecke @ 2022-03-22 11:30 UTC (permalink / raw)
  To: John Garry, axboe, damien.lemoal, bvanassche, jejb,
	martin.petersen, hch, ming.lei
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo

On 3/22/22 11:39, John Garry wrote:
> Currently SCSI low-level drivers are required to manage tags for commands
> which do not come via the block layer - libata internal commands would be
> an example of one of these.
> 
> There was some work to provide "reserved commands" support in such series
> as https://lore.kernel.org/linux-scsi/20211125151048.103910-1-hare@suse.de/
> 
> This was based on allocating a request for the lifetime of the "internal"
> command.
> 
> This series tries to solve that problem by not just allocating the request
> but also sending it through the block layer, that being the normal flow
> for a request. We need to do this as we may only poll completion of
> requests through the block layer, so would need to do this for poll queue
> support.
> 
> There is still scope to allocate commands just to get a tag as token as
> that may suit some other scenarios, but it's not what we do here.
> 
> This series extends blk-mq to support a request queue having a custom set
> of ops. In addition SCSI core code adds support for these type of requests.
> 
> This series does not include SCSI core handling for enabling reserved
> tags per tagset, but that would be easy to add.
> 
> Based on mkp-scsi 5.18/scsi-staging @ 66daf3e6b993
> 
> Please consider as an RFC for now. I think that the libata change has the
> largest scope for improvement...
> 

Grand seeing that someone is taking it up. Thanks for doing this!

But:
Allocating a queue for every request (eg in patch 3) is overkill. If we 
want to go that route we should be allocating the queue upfront (eg when 
creating the device itself), and then just referencing it.

However, I can't say I like this approach. I've been playing around with
supporting internal commands, and found two constraints really
annoying:

- The tagset supports only _one_ set of payload via
   blk_mq_rq_(to,from)_pdu().
This requires each request to be of the same type, which makes it
really hard to re-purpose the request for internal usage. In the end I
settled on just keeping it and skipping the SCSI command field.
If we could have a distinct PDU type for internal commands I guess 
things would be easier.
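
To illustrate the constraint (with toy stand-ins, not the real kernel structs): the per-driver PDU is carved out directly behind the request in one allocation, which is exactly what ties a tagset to a single PDU layout:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for blk-mq structures; names are illustrative. */
struct mock_request { int tag; };

struct scsi_pdu { char cdb[32]; };	/* the one PDU type the tagset knows */

/* Mirrors blk_mq_rq_to_pdu(): the PDU sits immediately after the request. */
static void *mock_rq_to_pdu(struct mock_request *rq)
{
	return rq + 1;
}

/* Mirrors blk_mq_rq_from_pdu(): step back over the request header. */
static struct mock_request *mock_pdu_to_rq(void *pdu)
{
	return (struct mock_request *)pdu - 1;
}

/* Request and PDU come from one allocation, sized at tagset creation. */
static struct mock_request *alloc_rq_with_pdu(size_t pdu_size)
{
	return calloc(1, sizeof(struct mock_request) + pdu_size);
}
```

Because the PDU size is fixed when the set is created, an internal command either reuses that same layout or has to smuggle its own state in elsewhere.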

- The number of reserved commands is static.
That makes it really hard to use reserved commands with low-queue-depth
devices like ATA; we only have 31 commands to work with, and setting one
or two aside for TMF really makes a difference performance-wise. It
would be _awesome_ if we could allocate reserved commands dynamically
(i.e. just marking a command as 'reserved' once allocated).
Sure, it won't have the same guarantees as 'real' reserved commands, but 
in most cases we don't actually need that.
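
A toy model of that static split on a 32-deep, ATA-like tag map (names are made up): the reserved region is fixed when the set is created, so those tags are lost to normal I/O even when no TMF is in flight. A dynamic scheme would instead take any free tag and only then mark it reserved.

```c
#include <assert.h>
#include <stdint.h>

/* Toy tag map with a statically carved-out reserved region. */
struct tag_map {
	uint32_t depth, nr_reserved;
	uint64_t used;			/* bit i set => tag i in use */
};

/* Reserved allocations come from [0, nr_reserved), normal ones from
 * [nr_reserved, depth); returns -1 when the region is exhausted. */
static int alloc_tag(struct tag_map *m, int reserved)
{
	uint32_t lo = reserved ? 0 : m->nr_reserved;
	uint32_t hi = reserved ? m->nr_reserved : m->depth;

	for (uint32_t t = lo; t < hi; t++) {
		if (!(m->used & (1ull << t))) {
			m->used |= 1ull << t;
			return (int)t;
		}
	}
	return -1;
}
```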

Maybe these are some lines we could investigate?
Hmm?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 11:18     ` [dm-devel] " Christoph Hellwig
@ 2022-03-22 11:33       ` John Garry
  0 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 11:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

On 22/03/2022 11:18, Christoph Hellwig wrote:
> On Tue, Mar 22, 2022 at 06:39:35PM +0800, John Garry wrote:
>> Add an API to allocate a request queue which accepts a custom set of
>> blk_mq_ops for that request queue.
>>
>> The reason which we may want custom ops is for queuing requests which we
>> don't want to go through the normal queuing path.
> 
> Eww.  I really do not think we should do separate ops per queue, as that
> is going to get us into a deep mess eventually.
> 

Yeah... so far (here) it works out quite nicely, as we don't need to 
change the SCSI blk-mq ops nor allocate a scsi_device - everything is 
just separate.

The other method mentioned previously was to add the request "reserved" 
flag and add new paths in scsi_queue_rq() et al to handle this, but that 
gets messy.

Any other ideas ...?

Cheers,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 03/11] libata: Send internal commands through the block layer
  2022-03-22 11:20     ` [dm-devel] " Christoph Hellwig
@ 2022-03-22 11:36       ` John Garry
  0 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 11:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

On 22/03/2022 11:20, Christoph Hellwig wrote:
> On Tue, Mar 22, 2022 at 06:39:37PM +0800, John Garry wrote:
>> When SCSI HBA device drivers are required to process an ATA internal
>> command they still need a tag for the IO. This often requires the driver
>> to set aside a set of tags for these sorts of IOs and manage the tags
>> themselves.
>>
>> If we associate a SCSI command (and request) with an ATA internal command
>> then the tag is already provided, so introduce the change to send ATA
>> internal commands through the block layer with a set of custom blk-mq ops.
>>
>> note: I think that the timeout handling needs to be fixed up.
> 
> Any reason to not just send them through an ATA_16 passthrough CDB and
> just use all the normal SCSI command handling?
> 
> .

No reason that I know about.

TBH, I was hoping that someone would have a better idea on how to do
this, as what I was doing was messy.

Let me check it.

Cheers,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 11:33       ` [dm-devel] " John Garry
@ 2022-03-22 12:16         ` Hannes Reinecke
  0 siblings, 0 replies; 58+ messages in thread
From: Hannes Reinecke @ 2022-03-22 12:16 UTC (permalink / raw)
  To: John Garry, Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, chenxiang66, linux-block, linux-kernel, linux-ide,
	linux-scsi, dm-devel, beanhuo

On 3/22/22 12:33, John Garry wrote:
> On 22/03/2022 11:18, Christoph Hellwig wrote:
>> On Tue, Mar 22, 2022 at 06:39:35PM +0800, John Garry wrote:
>>> Add an API to allocate a request queue which accepts a custom set of
>>> blk_mq_ops for that request queue.
>>>
>>> The reason which we may want custom ops is for queuing requests which we
>>> don't want to go through the normal queuing path.
>>
>> Eww.  I really do not think we should do separate ops per queue, as that
>> is going to get us into a deep mess eventually.
>>
> 
> Yeah... so far (here) it works out quite nicely, as we don't need to 
> change the SCSI blk mq ops nor allocate a scsi_device - everything is 
> just separate.
> 
> The other method mentioned previously was to add the request "reserved" 
> flag and add new paths in scsi_queue_rq() et al to handle this, but that 
> gets messy.
> 
> Any other ideas ...?
> 

As outlined in the other mail, I think it might be useful to have a
_third_ type of request (in addition to the normal and the reserved
ones).
That one would be allocated from the normal I/O pool (and hence could 
fail if the pool is exhausted), but would be able to carry a different 
payload (type) than the normal requests.
And we could have a separate queue_rq for these requests, as we can 
differentiate them in the block layer.
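
Roughly, with a hypothetical request-type flag, the branch could live in one shared submission path rather than in per-queue ops (all names below are made up for illustration):

```c
#include <assert.h>

/* Hypothetical request-type flag carried by the request itself. */
enum rq_type { RQ_NORMAL, RQ_INTERNAL };

struct mock_rq { enum rq_type type; };

/* Stand-ins for the two queue_rq paths; return values just identify
 * which handler ran. */
static int queue_rq_normal(struct mock_rq *rq)   { (void)rq; return 1; }
static int queue_rq_internal(struct mock_rq *rq) { (void)rq; return 2; }

/* One submission path, branching on the flag the block layer can see. */
static int dispatch(struct mock_rq *rq)
{
	return rq->type == RQ_INTERNAL ? queue_rq_internal(rq)
				       : queue_rq_normal(rq);
}
```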

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH RFC 00/11] blk-mq/libata/scsi: SCSI driver tagging improvements
  2022-03-22 11:30   ` [dm-devel] " Hannes Reinecke
@ 2022-03-22 12:17     ` John Garry
  0 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 12:17 UTC (permalink / raw)
  To: Hannes Reinecke, axboe, damien.lemoal, bvanassche, jejb,
	martin.petersen, hch, ming.lei
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo

On 22/03/2022 11:30, Hannes Reinecke wrote:
> On 3/22/22 11:39, John Garry wrote:
>> Currently SCSI low-level drivers are required to manage tags for commands
>> which do not come via the block layer - libata internal commands would be
>> an example of one of these.
>>
>> There was some work to provide "reserved commands" support in such series
>> as 
>> https://lore.kernel.org/linux-scsi/20211125151048.103910-1-hare@suse.de/
>>
>> This was based on allocating a request for the lifetime of the "internal"
>> command.
>>
>> This series tries to solve that problem by not just allocating the 
>> request
>> but also sending it through the block layer, that being the normal flow
>> for a request. We need to do this as we may only poll completion of
>> requests through the block layer, so would need to do this for poll queue
>> support.
>>
>> There is still scope to allocate commands just to get a tag as token as
>> that may suit some other scenarios, but it's not what we do here.
>>
>> This series extends blk-mq to support a request queue having a custom set
>> of ops. In addition SCSI core code adds support for these type of 
>> requests.
>>
>> This series does not include SCSI core handling for enabling reserved
>> tags per tagset, but that would be easy to add.
>>
>> Based on mkp-scsi 5.18/scsi-staging @ 66daf3e6b993
>>
>> Please consider as an RFC for now. I think that the libata change has the
>> largest scope for improvement...
>>
> 
> Grand seeing that someone is taking it up. Thanks for doing this!
> 
> But:
> Allocating a queue for every request (eg in patch 3) is overkill. If we 
> want to go that route we should be allocating the queue upfront (eg when 
> creating the device itself), and then just referencing it.

For patch #3 I needed to allocate a separate request queue as the scsi 
device is not created by that stage.

And then for other scenarios in which we allocate the separate request 
queue, since the scheme here is to allocate a request queue with 
different ops, we can't use the same scsi_device request queue.

note: As for allocating request queues for the lifetime of the host, we 
need to remember that blk-mq fairly reserves a tag budget per request 
queue, and it would be a waste to keep a budget just for these internal 
commands. So that is why I only keep the request queues temporarily.
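
A rough model of that fairness rule (blk-mq's hctx_may_queue() behaviour, much simplified): each active queue sharing a tagset gets roughly depth / active_queues tags, so even a mostly idle internal-command queue eats into the I/O queues' budget for as long as it exists.

```c
#include <assert.h>

/* Simplified version of blk-mq's shared-tag fairness: with more than
 * one active queue on the set, each queue may use only about its fair
 * share of the total depth, but never less than one tag. */
static unsigned int allowed_depth(unsigned int set_depth,
				  unsigned int active_queues)
{
	if (active_queues <= 1)
		return set_depth;
	unsigned int d = set_depth / active_queues;
	return d ? d : 1;
}
```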

> 
> However, can't say I like this approach. I've been playing around with 
> supporting internal commands, and really found two constraints really 
> annoying:
> 
> - The tagset supports only _one_ set of payload via
>    blk_mq_rq_(to,from)_pdu().
> This requires each request to be of the same type, and with that making
> it really hard for re-purposing the request for internal usage. In the
> end I settled by just keeping it and skipping the SCSI command field.

That sounds reasonable.

For this series I am just fixing up libsas, and libsas has a sas_task 
per command already, so we can just allocate the sas_task separately and 
use the host_scribble to point to the sas_task.

> If we could have a distinct PDU type for internal commands I guess 
> things would be easier.

Other drivers can do something similar to above or use the scsi priv data.

> 
> - The number of reserved commands is static.
> With that it's getting really hard using reserved commands with 
> low-queue depth devices like ATA; we only have 31 commands to work with, 
> and setting one or two aside for TMF is really making a difference 
> performance wise. It would be _awesome_ if we could allocate reserved 
> commands dynamically (ie just marking a command as 'reserved' once 
> allocated).

I see. So you want to allocate a request, mark it as "internal", and
then have a flag which can be used to decide which path we need to send
it on. Eh, maybe scsi_cmnd.submitter could be used.

> Sure, it won't have the same guarantees as 'real' reserved commands, but 
> in most cases we don't actually need that.
> 
> Maybe these are some lines we could investigate?
> Hmm?
> 

Thanks,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 12:16         ` [dm-devel] " Hannes Reinecke
@ 2022-03-22 12:30           ` John Garry
  0 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 12:30 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, chenxiang66, linux-block, linux-kernel, linux-ide,
	linux-scsi, dm-devel, beanhuo

On 22/03/2022 12:16, Hannes Reinecke wrote:
> On 3/22/22 12:33, John Garry wrote:
>> On 22/03/2022 11:18, Christoph Hellwig wrote:
>>> On Tue, Mar 22, 2022 at 06:39:35PM +0800, John Garry wrote:
>>>> Add an API to allocate a request queue which accepts a custom set of
>>>> blk_mq_ops for that request queue.
>>>>
>>>> The reason which we may want custom ops is for queuing requests 
>>>> which we
>>>> don't want to go through the normal queuing path.
>>>
>>> Eww.  I really do not think we should do separate ops per queue, as that
>>> is going to get us into a deep mess eventually.
>>>
>>
>> Yeah... so far (here) it works out quite nicely, as we don't need to 
>> change the SCSI blk mq ops nor allocate a scsi_device - everything is 
>> just separate.
>>
>> The other method mentioned previously was to add the request 
>> "reserved" flag and add new paths in scsi_queue_rq() et al to handle 
>> this, but that gets messy.
>>
>> Any other ideas ...?
>>
> 
> As outlined in the other mail, I think might be useful is to have a 
> _third_ type of requests (in addition to the normal and the reserved ones).
> That one would be allocated from the normal I/O pool (and hence could 
> fail if the pool is exhausted), but would be able to carry a different 
> payload (type) than the normal requests.

As mentioned in the cover letter response, it just seems best to keep 
the normal scsi_cmnd payload but have other means to add on the internal 
command data, like using host_scribble or scsi_cmnd priv data.

> And we could have a separate queue_rq for these requests, as we can 
> differentiate them in the block layer.

I don't know, let me think about it. Maybe we could add an "internal" 
blk flag, which uses a separate "internal" queue_rq callback.

Thanks,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 12:30           ` [dm-devel] " John Garry
@ 2022-03-22 14:03             ` Hannes Reinecke
  -1 siblings, 0 replies; 58+ messages in thread
From: Hannes Reinecke @ 2022-03-22 14:03 UTC (permalink / raw)
  To: John Garry, Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, chenxiang66, linux-block, linux-kernel, linux-ide,
	linux-scsi, dm-devel, beanhuo

On 3/22/22 13:30, John Garry wrote:
> On 22/03/2022 12:16, Hannes Reinecke wrote:
>> On 3/22/22 12:33, John Garry wrote:
>>> On 22/03/2022 11:18, Christoph Hellwig wrote:
>>>> On Tue, Mar 22, 2022 at 06:39:35PM +0800, John Garry wrote:
>>>>> Add an API to allocate a request queue which accepts a custom set of
>>>>> blk_mq_ops for that request queue.
>>>>>
>>>>> The reason we may want custom ops is for queuing requests which we
>>>>> don't want to go through the normal queuing path.
>>>>
>>>> Eww.  I really do not think we should do separate ops per queue, as 
>>>> that
>>>> is going to get us into a deep mess eventually.
>>>>
>>>
>>> Yeah... so far (here) it works out quite nicely, as we don't need to 
>>> change the SCSI blk mq ops nor allocate a scsi_device - everything is 
>>> just separate.
>>>
>>> The other method mentioned previously was to add the request 
>>> "reserved" flag and add new paths in scsi_queue_rq() et al to handle 
>>> this, but that gets messy.
>>>
>>> Any other ideas ...?
>>>
>>
>> As outlined in the other mail, I think it might be useful to have a 
>> _third_ type of request (in addition to the normal and the reserved 
>> ones).
>> That one would be allocated from the normal I/O pool (and hence could 
>> fail if the pool is exhausted), but would be able to carry a different 
>> payload (type) than the normal requests.
> 
> As mentioned in the cover letter response, it just seems best to keep 
> the normal scsi_cmnd payload but have other means to add on the internal 
> command data, like using host_scribble or scsi_cmnd priv data.
> 
Well; I found that in most drivers I had been looking at, the scsi 
command payload isn't used at all; the drivers primarily cared about the 
(driver-provided) payload and completely ignored the scsi command 
payload.

Similar for ATA/libsas: you basically never issue real scsi commands, 
but either 'raw' ATA requests or SCSI TMFs, none of which are scsi 
commands, so providing one is a bit of a waste.

(And causes irritations, too, as a scsi command requires associated 
pointers like ->device etc to be set up. Which makes it tricky to use 
for the initial device setup.)

>> And we could have a separate queue_rq for these requests, as we can 
>> differentiate them in the block layer.
> 
> I don't know, let me think about it. Maybe we could add an "internal" 
> blk flag, which uses a separate "internal" queue_rq callback.
> 
Yeah, that's what I had in mind.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 04/11] scsi: libsas: Send SMP commands through the block layer
  2022-03-22 11:23     ` [dm-devel] " Christoph Hellwig
@ 2022-03-22 14:37       ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 14:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

On 22/03/2022 11:23, Christoph Hellwig wrote:
> Wouldn't it make more sense to have a new member in the Scsi_Host_template
> that gets called for these requests early from ->queue_rq instead of the
> new request_queue and custom ops?

I had something like this before, but it also has its disadvantages.

Let me revisit.

> 
> FYI, I think reusing the block layer for all these commands is a good
> idea but the way it is done here seems a bit ugly and not very
> maintainable.
> 

Cheers,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 14:03             ` [dm-devel] " Hannes Reinecke
@ 2022-03-22 15:17               ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-22 15:17 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, chenxiang66, linux-block, linux-kernel, linux-ide,
	linux-scsi, dm-devel, beanhuo

On 22/03/2022 14:03, Hannes Reinecke wrote:
>>
>> As mentioned in the cover letter response, it just seems best to keep 
>> the normal scsi_cmnd payload but have other means to add on the 
>> internal command data, like using host_scribble or scsi_cmnd priv data.
>>
> Well; I found that most drivers I had been looking at the scsi command 
> payload isn't used at all; the drivers primarily cared about the 
> (driver-provided) payload, and were completely ignoring the scsi command 
> payload.
> 
> Similar for ATA/libsas: you basically never issue real scsi commands, 
> but either 'raw' ATA requests or SCSI TMFs. None of which are scsi 
> commands, so providing them is a bit of a waste.
> 
> (And causes irritations, too, as a scsi command requires associated 
> pointers like ->device etc to be set up. Which makes it tricky to use 
> for the initial device setup.)

A problem I see is that in scsi_mq_init_request() we allocate memory for 
things like the sense_buffer and prot_sdb and store the pointers in the 
scsi_cmnd payload. If we then reuse a scsi_cmnd payload as an "internal" 
command payload then this data may be lost.

It might be possible to reuse the scsi_cmnd payload for the "internal" 
command, but I would rather not get hung up on it now if possible.

Thanks,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 15:17               ` [dm-devel] " John Garry
@ 2022-03-22 15:31                 ` Hannes Reinecke
  -1 siblings, 0 replies; 58+ messages in thread
From: Hannes Reinecke @ 2022-03-22 15:31 UTC (permalink / raw)
  To: John Garry, Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, chenxiang66, linux-block, linux-kernel, linux-ide,
	linux-scsi, dm-devel, beanhuo

On 3/22/22 16:17, John Garry wrote:
> On 22/03/2022 14:03, Hannes Reinecke wrote:
>>>
>>> As mentioned in the cover letter response, it just seems best to keep 
>>> the normal scsi_cmnd payload but have other means to add on the 
>>> internal command data, like using host_scribble or scsi_cmnd priv data.
>>>
>> Well; I found that most drivers I had been looking at the scsi command 
>> payload isn't used at all; the drivers primarily cared about the 
>> (driver-provided) payload, and were completely ignoring the scsi 
>> command payload.
>>
>> Similar for ATA/libsas: you basically never issue real scsi commands, 
>> but either 'raw' ATA requests or SCSI TMFs. None of which are scsi 
>> commands, so providing them is a bit of a waste.
>>
>> (And causes irritations, too, as a scsi command requires associated 
>> pointers like ->device etc to be set up. Which makes it tricky to use 
>> for the initial device setup.)
> 
> A problem I see is that in scsi_mq_init_request() we allocate memories 
> like sense_buffer and prot_sdb and store the pointers in the scsi_cmnd 
> payload. If we then reuse a scsi_cmnd payload as an "internal" command 
> payload then this data may be lost.
> 
> It might be possible to reuse the scsi cmnd payload for the "internal", 
> but I would rather not get hung up on it now if possible.
> 
Or, keep the payload as is, and use the 'internal' marker to indicate 
that the scsi payload is not valid.
That would save us quite some checks during endio processing.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-22 10:39   ` [dm-devel] " John Garry
@ 2022-03-23  2:57     ` Bart Van Assche
  -1 siblings, 0 replies; 58+ messages in thread
From: Bart Van Assche @ 2022-03-23  2:57 UTC (permalink / raw)
  To: John Garry, axboe, damien.lemoal, jejb, martin.petersen, hch,
	ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo

On 3/22/22 03:39, John Garry wrote:
> Add an API to allocate a request queue which accepts a custom set of
> blk_mq_ops for that request queue.
> 
> The reason we may want custom ops is for queuing requests which we
> don't want to go through the normal queuing path.

Custom ops shouldn't be required for this. See e.g. how tmf_queue is
used in the UFS driver for an example of a queue implementation with
custom operations that does not require changes to the block layer core.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops()
  2022-03-23  2:57     ` [dm-devel] " Bart Van Assche
@ 2022-03-23  9:01       ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-03-23  9:01 UTC (permalink / raw)
  To: Bart Van Assche, axboe, damien.lemoal, jejb, martin.petersen,
	hch, ming.lei, hare
  Cc: chenxiang66, linux-block, linux-kernel, linux-ide, linux-scsi,
	dm-devel, beanhuo

On 23/03/2022 02:57, Bart Van Assche wrote:
> On 3/22/22 03:39, John Garry wrote:
>> Add an API to allocate a request queue which accepts a custom set of
>> blk_mq_ops for that request queue.
>>
>> The reason we may want custom ops is for queuing requests which we
>> don't want to go through the normal queuing path.
> 

Hi Bart,

 > Custom ops shouldn't be required for this. See e.g. how tmf_queue
 > is used in the UFS driver for an example of a queue implementation
 > with custom operations that does not require changes to the block
 > layer core.

The UFS code uses a private tagset (in ufs_hba.tmf_tag_set) only for 
management of TMF tags/memories. That tagset does not really have any 
custom operations: all it has is a stub .queue_rq callback, 
ufshcd_queue_tmf(), and that is only because the callback is compulsory.

As for the idea of having multiple tagsets per shost with real custom 
operations: it was mentioned before, but I think managing multiple 
tagsets could be trouble. For a start, it would mean that we need a 
distinct allocation of reserved and regular tags, and sometimes we don't 
want this - as Hannes mentioned earlier, many HBAs have a low queue 
depth and cannot afford to permanently carve out a bunch of reserved 
tags.

Thanks,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 03/11] libata: Send internal commands through the block layer
  2022-03-22 11:20     ` [dm-devel] " Christoph Hellwig
@ 2022-04-07 14:32       ` John Garry
  -1 siblings, 0 replies; 58+ messages in thread
From: John Garry @ 2022-04-07 14:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, damien.lemoal, bvanassche, jejb, martin.petersen,
	ming.lei, hare, chenxiang66, linux-block, linux-kernel,
	linux-ide, linux-scsi, dm-devel, beanhuo

On 22/03/2022 11:20, Christoph Hellwig wrote:
> On Tue, Mar 22, 2022 at 06:39:37PM +0800, John Garry wrote:
>> When SCSI HBA device drivers are required to process an ATA internal
>> command they still need a tag for the IO. This often requires the driver
>> to set aside a set of tags for these sorts of IOs and manage those tags
>> itself.
>>
>> If we associate a SCSI command (and request) with an ATA internal command
>> then the tag is already provided, so introduce the change to send ATA
>> internal commands through the block layer with a set of custom blk-mq ops.
>>
>> note: I think that the timeout handling needs to be fixed up.

Hi Christoph,

> Any reason to not just send them through an ATA_16 passthrough CDB and
> just use all the normal SCSI command handling?

I had a go at implementing this but I have come up against a few issues:

- ATA_16 handling translates the passthrough CDB to an ATA TF. However, 
ata_exec_internal_sg() is passed a TF already. So what to do? Change the 
callers to generate an ATA_16 CDB? I guess not. Otherwise we could put 
the already-generated TF in the SCSI cmd CDB somehow and use it directly.

- We may have no SCSI device (yet) for the target when issuing an 
internal command, but only the ATA port+dev, so we need a method to pass 
these pointers to the ATA_16 handling.

- We would need to change ata_scsi_translate(), ata_scsi_pass_thru() and 
other friends to deal with ATA_TAG_INTERNAL and its peculiarities - 
today they just deal with regular qcs.

It still does seem a reasonable idea to use ATA_16, but it looks like 
significant modifications would be required....

Thanks,
John

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2022-04-11  6:19 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-22 10:39 [PATCH RFC 00/11] blk-mq/libata/scsi: SCSI driver tagging improvements John Garry
2022-03-22 10:39 ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 01/11] blk-mq: Add blk_mq_init_queue_ops() John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 11:18   ` Christoph Hellwig
2022-03-22 11:18     ` [dm-devel] " Christoph Hellwig
2022-03-22 11:33     ` John Garry
2022-03-22 11:33       ` [dm-devel] " John Garry
2022-03-22 12:16       ` Hannes Reinecke
2022-03-22 12:16         ` [dm-devel] " Hannes Reinecke
2022-03-22 12:30         ` John Garry
2022-03-22 12:30           ` [dm-devel] " John Garry
2022-03-22 14:03           ` Hannes Reinecke
2022-03-22 14:03             ` [dm-devel] " Hannes Reinecke
2022-03-22 15:17             ` John Garry
2022-03-22 15:17               ` [dm-devel] " John Garry
2022-03-22 15:31               ` Hannes Reinecke
2022-03-22 15:31                 ` [dm-devel] " Hannes Reinecke
2022-03-23  2:57   ` Bart Van Assche
2022-03-23  2:57     ` [dm-devel] " Bart Van Assche
2022-03-23  9:01     ` John Garry
2022-03-23  9:01       ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 02/11] scsi: core: Add SUBMITTED_BY_SCSI_CUSTOM_OPS John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 11:20   ` Christoph Hellwig
2022-03-22 11:20     ` [dm-devel] " Christoph Hellwig
2022-03-22 10:39 ` [PATCH 03/11] libata: Send internal commands through the block layer John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 11:20   ` Christoph Hellwig
2022-03-22 11:20     ` [dm-devel] " Christoph Hellwig
2022-03-22 11:36     ` John Garry
2022-03-22 11:36       ` [dm-devel] " John Garry
2022-04-07 14:32     ` John Garry
2022-04-07 14:32       ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 04/11] scsi: libsas: Send SMP " John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 11:23   ` Christoph Hellwig
2022-03-22 11:23     ` [dm-devel] " Christoph Hellwig
2022-03-22 14:37     ` John Garry
2022-03-22 14:37       ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 05/11] scsi: libsas: Send TMF " John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 06/11] scsi: core: Add scsi_alloc_request_hwq() John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 07/11] scsi: libsas: Send internal abort commands through the block layer John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 08/11] scsi: libsas: Change ATA support to deal with each qc having a SCSI command John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 09/11] scsi: libsas: Add sas_task_to_unique_tag() John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 10/11] scsi: libsas: Add sas_task_to_hwq() John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 10:39 ` [PATCH 11/11] scsi: hisi_sas: Remove private tag management John Garry
2022-03-22 10:39   ` [dm-devel] " John Garry
2022-03-22 11:30 ` [PATCH RFC 00/11] blk-mq/libata/scsi: SCSI driver tagging improvements Hannes Reinecke
2022-03-22 11:30   ` [dm-devel] " Hannes Reinecke
2022-03-22 12:17   ` John Garry
2022-03-22 12:17     ` [dm-devel] " John Garry
