* [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue
@ 2018-03-13  9:42 Ming Lei
  2018-03-13  9:42 ` [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue Ming Lei
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Ming Lei @ 2018-03-13  9:42 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Ming Lei

Hi All,

This patchset fixes reply queue (virt-queue on virtio-scsi) selection in hpsa,
megaraid_sas and virtio-scsi; this issue can easily cause IO hangs.

This issue is triggered by 84676c1f21e8 ("genirq/affinity: assign vectors
to all possible CPUs"). After 84676c1f21e8, it is easy to end up with one
msix vector whose mapped CPUs are all offline. If the reply queue is selected
from all allocated msix vectors (reply queues) in a round-robin way, the
selected reply queue may not have any online CPU mapped, and an IO hang is
caused.

Both hpsa and megaraid_sas use a host-wide tagset, so we can't convert the
reply queues to blk-mq hw queues directly, otherwise IO performance degrades
badly according to Kashyap's test. Instead, this patchset sets up one mapping
table for selecting the reply queue, an approach already used by mpt3sas.

For virtio-scsi, the virt-queue really is a hw queue from the blk-mq point of
view, so we introduce 'force_blk_mq' to fix this issue because: 1) virtio-blk
has been used in blk-mq mode for years; 2) we have discussed recently
that scsi_mq will be enabled by default soon.


gitweb:
	https://github.com/ming1/linux/tree/v4.16-rc-select-reply-queue-fix-V5

V5:
	- cover legacy vector for megaraid_sas(2/5)
	- patch style change (4/5)
	- add one virtio-scsi cleanup patch(5/5)

V4:
	- split from previous patchset
	- handle virtio-scsi by force_blk_mq

Ming Lei (5):
  scsi: hpsa: fix selection of reply queue
  scsi: megaraid_sas: fix selection of reply queue
  scsi: introduce force_blk_mq
  scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity
  scsi: virtio_scsi: unify scsi_host_template

 drivers/scsi/hosts.c                        |   1 +
 drivers/scsi/hpsa.c                         |  73 ++++++++++++----
 drivers/scsi/hpsa.h                         |   1 +
 drivers/scsi/megaraid/megaraid_sas.h        |   1 +
 drivers/scsi/megaraid/megaraid_sas_base.c   |  39 ++++++++-
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  12 +--
 drivers/scsi/virtio_scsi.c                  | 129 ++++------------------------
 include/scsi/scsi_host.h                    |   3 +
 8 files changed, 116 insertions(+), 143 deletions(-)

-- 
2.9.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-13  9:42 [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Ming Lei
@ 2018-03-13  9:42 ` Ming Lei
  2018-03-14  8:57   ` Christoph Hellwig
                     ` (2 more replies)
  2018-03-13  9:42 ` [PATCH V5 2/5] scsi: megaraid_sas: " Ming Lei
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 20+ messages in thread
From: Ming Lei @ 2018-03-13  9:42 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Ming Lei, Hannes Reinecke, James Bottomley,
	Artem Bityutskiy

From 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs"),
one msix vector can be created without any online CPU mapped, so a
command's completion may never be notified.

This patch sets up a mapping between each CPU and a reply queue according to
the irq affinity info retrieved by pci_irq_get_affinity(), and uses this
mapping table to choose the reply queue when queuing a command.

The chosen reply queue is then guaranteed to be active, which fixes the IO
hang caused by using an inactive reply queue that has no online CPU mapped.

Cc: Hannes Reinecke <hare@suse.de>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Meelis Roos <mroos@linux.ee>
Cc: Artem Bityutskiy <artem.bityutskiy@intel.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Don Brace <don.brace@microsemi.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/hpsa.c | 73 +++++++++++++++++++++++++++++++++++++++--------------
 drivers/scsi/hpsa.h |  1 +
 2 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 5293e6827ce5..3a9eca163db8 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1045,11 +1045,7 @@ static void set_performant_mode(struct ctlr_info *h, struct CommandList *c,
 		c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1);
 		if (unlikely(!h->msix_vectors))
 			return;
-		if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-			c->Header.ReplyQueue =
-				raw_smp_processor_id() % h->nreply_queues;
-		else
-			c->Header.ReplyQueue = reply_queue % h->nreply_queues;
+		c->Header.ReplyQueue = reply_queue;
 	}
 }
 
@@ -1063,10 +1059,7 @@ static void set_ioaccel1_performant_mode(struct ctlr_info *h,
 	 * Tell the controller to post the reply to the queue for this
 	 * processor.  This seems to give the best I/O throughput.
 	 */
-	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-		cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
-	else
-		cp->ReplyQueue = reply_queue % h->nreply_queues;
+	cp->ReplyQueue = reply_queue;
 	/*
 	 * Set the bits in the address sent down to include:
 	 *  - performant mode bit (bit 0)
@@ -1087,10 +1080,7 @@ static void set_ioaccel2_tmf_performant_mode(struct ctlr_info *h,
 	/* Tell the controller to post the reply to the queue for this
 	 * processor.  This seems to give the best I/O throughput.
 	 */
-	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-		cp->reply_queue = smp_processor_id() % h->nreply_queues;
-	else
-		cp->reply_queue = reply_queue % h->nreply_queues;
+	cp->reply_queue = reply_queue;
 	/* Set the bits in the address sent down to include:
 	 *  - performant mode bit not used in ioaccel mode 2
 	 *  - pull count (bits 0-3)
@@ -1109,10 +1099,7 @@ static void set_ioaccel2_performant_mode(struct ctlr_info *h,
 	 * Tell the controller to post the reply to the queue for this
 	 * processor.  This seems to give the best I/O throughput.
 	 */
-	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
-		cp->reply_queue = smp_processor_id() % h->nreply_queues;
-	else
-		cp->reply_queue = reply_queue % h->nreply_queues;
+	cp->reply_queue = reply_queue;
 	/*
 	 * Set the bits in the address sent down to include:
 	 *  - performant mode bit not used in ioaccel mode 2
@@ -1157,6 +1144,8 @@ static void __enqueue_cmd_and_start_io(struct ctlr_info *h,
 {
 	dial_down_lockup_detection_during_fw_flash(h, c);
 	atomic_inc(&h->commands_outstanding);
+
+	reply_queue = h->reply_map[raw_smp_processor_id()];
 	switch (c->cmd_type) {
 	case CMD_IOACCEL1:
 		set_ioaccel1_performant_mode(h, c, reply_queue);
@@ -7376,6 +7365,26 @@ static void hpsa_disable_interrupt_mode(struct ctlr_info *h)
 	h->msix_vectors = 0;
 }
 
+static void hpsa_setup_reply_map(struct ctlr_info *h)
+{
+	const struct cpumask *mask;
+	unsigned int queue, cpu;
+
+	for (queue = 0; queue < h->msix_vectors; queue++) {
+		mask = pci_irq_get_affinity(h->pdev, queue);
+		if (!mask)
+			goto fallback;
+
+		for_each_cpu(cpu, mask)
+			h->reply_map[cpu] = queue;
+	}
+	return;
+
+fallback:
+	for_each_possible_cpu(cpu)
+		h->reply_map[cpu] = 0;
+}
+
 /* If MSI/MSI-X is supported by the kernel we will try to enable it on
  * controllers that are capable. If not, we use legacy INTx mode.
  */
@@ -7771,6 +7780,10 @@ static int hpsa_pci_init(struct ctlr_info *h)
 	err = hpsa_interrupt_mode(h);
 	if (err)
 		goto clean1;
+
+	/* setup mapping between CPU and reply queue */
+	hpsa_setup_reply_map(h);
+
 	err = hpsa_pci_find_memory_BAR(h->pdev, &h->paddr);
 	if (err)
 		goto clean2;	/* intmode+region, pci */
@@ -8480,6 +8493,28 @@ static struct workqueue_struct *hpsa_create_controller_wq(struct ctlr_info *h,
 	return wq;
 }
 
+static void hpda_free_ctlr_info(struct ctlr_info *h)
+{
+	kfree(h->reply_map);
+	kfree(h);
+}
+
+static struct ctlr_info *hpda_alloc_ctlr_info(void)
+{
+	struct ctlr_info *h;
+
+	h = kzalloc(sizeof(*h), GFP_KERNEL);
+	if (!h)
+		return NULL;
+
+	h->reply_map = kzalloc(sizeof(*h->reply_map) * nr_cpu_ids, GFP_KERNEL);
+	if (!h->reply_map) {
+		kfree(h);
+		return NULL;
+	}
+	return h;
+}
+
 static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	int dac, rc;
@@ -8517,7 +8552,7 @@ static int hpsa_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	 * the driver.  See comments in hpsa.h for more info.
 	 */
 	BUILD_BUG_ON(sizeof(struct CommandList) % COMMANDLIST_ALIGNMENT);
-	h = kzalloc(sizeof(*h), GFP_KERNEL);
+	h = hpda_alloc_ctlr_info();
 	if (!h) {
 		dev_err(&pdev->dev, "Failed to allocate controller head\n");
 		return -ENOMEM;
@@ -8916,7 +8951,7 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 	h->lockup_detected = NULL;			/* init_one 2 */
 	/* (void) pci_disable_pcie_error_reporting(pdev); */	/* init_one 1 */
 
-	kfree(h);					/* init_one 1 */
+	hpda_free_ctlr_info(h);				/* init_one 1 */
 }
 
 static int hpsa_suspend(__attribute__((unused)) struct pci_dev *pdev,
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 018f980a701c..fb9f5e7f8209 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -158,6 +158,7 @@ struct bmic_controller_parameters {
 #pragma pack()
 
 struct ctlr_info {
+	unsigned int *reply_map;
 	int	ctlr;
 	char	devname[8];
 	char    *product_name;
-- 
2.9.5


* [PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue
  2018-03-13  9:42 [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Ming Lei
  2018-03-13  9:42 ` [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue Ming Lei
@ 2018-03-13  9:42 ` Ming Lei
  2018-03-13 17:13   ` Kashyap Desai
                     ` (2 more replies)
  2018-03-13  9:42 ` [PATCH V5 3/5] scsi: introduce force_blk_mq Ming Lei
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 20+ messages in thread
From: Ming Lei @ 2018-03-13  9:42 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Ming Lei, Hannes Reinecke, James Bottomley,
	Artem Bityutskiy

From 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs"),
one msix vector can be created without any online CPU mapped; a command may
then be queued and its completion never notified.

This patch sets up a mapping between each CPU and a reply queue according to
the irq affinity info retrieved by pci_irq_get_affinity(), and uses this info
to choose the reply queue when queuing a command.

The chosen reply queue is then guaranteed to be active, which fixes the IO
hang caused by using an inactive reply queue that has no online CPU mapped.

Cc: Hannes Reinecke <hare@suse.de>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Meelis Roos <mroos@linux.ee>
Cc: Artem Bityutskiy <artem.bityutskiy@intel.com>
Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/megaraid/megaraid_sas.h        |  1 +
 drivers/scsi/megaraid/megaraid_sas_base.c   | 39 ++++++++++++++++++++++++++---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 12 +++------
 3 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index ba6503f37756..27fab8235ea5 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2128,6 +2128,7 @@ enum MR_PD_TYPE {
 
 struct megasas_instance {
 
+	unsigned int *reply_map;
 	__le32 *producer;
 	dma_addr_t producer_h;
 	__le32 *consumer;
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index a71ee67df084..dde0798b8a91 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -5165,6 +5165,26 @@ megasas_setup_jbod_map(struct megasas_instance *instance)
 		instance->use_seqnum_jbod_fp = false;
 }
 
+static void megasas_setup_reply_map(struct megasas_instance *instance)
+{
+	const struct cpumask *mask;
+	unsigned int queue, cpu;
+
+	for (queue = 0; queue < instance->msix_vectors; queue++) {
+		mask = pci_irq_get_affinity(instance->pdev, queue);
+		if (!mask)
+			goto fallback;
+
+		for_each_cpu(cpu, mask)
+			instance->reply_map[cpu] = queue;
+	}
+	return;
+
+fallback:
+	for_each_possible_cpu(cpu)
+		instance->reply_map[cpu] = cpu % instance->msix_vectors;
+}
+
 /**
  * megasas_init_fw -	Initializes the FW
  * @instance:		Adapter soft state
@@ -5343,6 +5363,8 @@ static int megasas_init_fw(struct megasas_instance *instance)
 			goto fail_setup_irqs;
 	}
 
+	megasas_setup_reply_map(instance);
+
 	dev_info(&instance->pdev->dev,
 		"firmware supports msix\t: (%d)", fw_msix_count);
 	dev_info(&instance->pdev->dev,
@@ -6123,20 +6145,29 @@ static inline int megasas_alloc_mfi_ctrl_mem(struct megasas_instance *instance)
  */
 static int megasas_alloc_ctrl_mem(struct megasas_instance *instance)
 {
+	instance->reply_map = kzalloc(sizeof(unsigned int) * nr_cpu_ids,
+				      GFP_KERNEL);
+	if (!instance->reply_map)
+		return -ENOMEM;
+
 	switch (instance->adapter_type) {
 	case MFI_SERIES:
 		if (megasas_alloc_mfi_ctrl_mem(instance))
-			return -ENOMEM;
+			goto fail;
 		break;
 	case VENTURA_SERIES:
 	case THUNDERBOLT_SERIES:
 	case INVADER_SERIES:
 		if (megasas_alloc_fusion_context(instance))
-			return -ENOMEM;
+			goto fail;
 		break;
 	}
 
 	return 0;
+ fail:
+	kfree(instance->reply_map);
+	instance->reply_map = NULL;
+	return -ENOMEM;
 }
 
 /*
@@ -6148,6 +6179,7 @@ static int megasas_alloc_ctrl_mem(struct megasas_instance *instance)
  */
 static inline void megasas_free_ctrl_mem(struct megasas_instance *instance)
 {
+	kfree(instance->reply_map);
 	if (instance->adapter_type == MFI_SERIES) {
 		if (instance->producer)
 			pci_free_consistent(instance->pdev, sizeof(u32),
@@ -6540,7 +6572,6 @@ static int megasas_probe_one(struct pci_dev *pdev,
 		pci_free_irq_vectors(instance->pdev);
 fail_init_mfi:
 	scsi_host_put(host);
-
 fail_alloc_instance:
 	pci_disable_device(pdev);
 
@@ -6746,6 +6777,8 @@ megasas_resume(struct pci_dev *pdev)
 	if (rval < 0)
 		goto fail_reenable_msix;
 
+	megasas_setup_reply_map(instance);
+
 	if (instance->adapter_type != MFI_SERIES) {
 		megasas_reset_reply_desc(instance);
 		if (megasas_ioc_init_fusion(instance)) {
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 073ced07e662..2994176a0121 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -2655,11 +2655,8 @@ megasas_build_ldio_fusion(struct megasas_instance *instance,
 			fp_possible = (io_info.fpOkForIo > 0) ? true : false;
 	}
 
-	/* Use raw_smp_processor_id() for now until cmd->request->cpu is CPU
-	   id by default, not CPU group id, otherwise all MSI-X queues won't
-	   be utilized */
-	cmd->request_desc->SCSIIO.MSIxIndex = instance->msix_vectors ?
-		raw_smp_processor_id() % instance->msix_vectors : 0;
+	cmd->request_desc->SCSIIO.MSIxIndex =
+		instance->reply_map[raw_smp_processor_id()];
 
 	praid_context = &io_request->RaidContext;
 
@@ -2985,10 +2982,9 @@ megasas_build_syspd_fusion(struct megasas_instance *instance,
 	}
 
 	cmd->request_desc->SCSIIO.DevHandle = io_request->DevHandle;
-	cmd->request_desc->SCSIIO.MSIxIndex =
-		instance->msix_vectors ?
-		(raw_smp_processor_id() % instance->msix_vectors) : 0;
 
+	cmd->request_desc->SCSIIO.MSIxIndex =
+		instance->reply_map[raw_smp_processor_id()];
 
 	if (!fp_possible) {
 		/* system pd firmware path */
-- 
2.9.5


* [PATCH V5 3/5] scsi: introduce force_blk_mq
  2018-03-13  9:42 [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Ming Lei
  2018-03-13  9:42 ` [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue Ming Lei
  2018-03-13  9:42 ` [PATCH V5 2/5] scsi: megaraid_sas: " Ming Lei
@ 2018-03-13  9:42 ` Ming Lei
  2018-03-13  9:42 ` [PATCH V5 4/5] scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity Ming Lei
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Ming Lei @ 2018-03-13  9:42 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Ming Lei, Omar Sandoval, James Bottomley

From the SCSI driver's point of view, it is a bit troublesome to support both
blk-mq and non-blk-mq at the same time, especially when drivers need to
support multiple hw queues.

This patch introduces 'force_blk_mq' in scsi_host_template so that drivers
can provide blk-mq-only support and avoid the trouble of supporting both
paths.

Cc: Omar Sandoval <osandov@fb.com>,
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/hosts.c     | 1 +
 include/scsi/scsi_host.h | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 57bf43e34863..cbbc32df7595 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -477,6 +477,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 		shost->dma_boundary = 0xffffffff;
 
 	shost->use_blk_mq = scsi_use_blk_mq;
+	shost->use_blk_mq = scsi_use_blk_mq || shost->hostt->force_blk_mq;
 
 	device_initialize(&shost->shost_gendev);
 	dev_set_name(&shost->shost_gendev, "host%d", shost->host_no);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 1a1df0d21ee3..6c6366f0bd15 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -452,6 +452,9 @@ struct scsi_host_template {
 	/* True if the controller does not support WRITE SAME */
 	unsigned no_write_same:1;
 
+	/* True if the low-level driver supports blk-mq only */
+	unsigned force_blk_mq:1;
+
 	/*
 	 * Countdown for host blocking with no commands outstanding.
 	 */
-- 
2.9.5


* [PATCH V5 4/5] scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity
  2018-03-13  9:42 [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Ming Lei
                   ` (2 preceding siblings ...)
  2018-03-13  9:42 ` [PATCH V5 3/5] scsi: introduce force_blk_mq Ming Lei
@ 2018-03-13  9:42 ` Ming Lei
  2018-03-14  8:58   ` Christoph Hellwig
  2018-03-13  9:42 ` [PATCH V5 5/5] scsi: virtio_scsi: unify scsi_host_template Ming Lei
  2018-03-15  3:35 ` [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Martin K. Petersen
  5 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2018-03-13  9:42 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Ming Lei, Omar Sandoval, James Bottomley

Now that 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
has been merged for v4.16-rc, it is easy for some irq vectors to be assigned
only offline CPUs, and this can't be avoided even if the allocation is
improved.

For example, on an 8-core VM where CPUs 4-7 are not present/offline, with 4
queues of virtio-scsi, the assigned irq affinity can take the following shape:

	irq 36, cpu list 0-7
	irq 37, cpu list 0-7
	irq 38, cpu list 0-7
	irq 39, cpu list 0-1
	irq 40, cpu list 4,6
	irq 41, cpu list 2-3
	irq 42, cpu list 5,7

Then an IO hang is triggered in the non-SCSI_MQ case.

Given that storage IO always follows a client/server model, there is no such
issue with SCSI_MQ (blk-mq), because no IO can be submitted to a hw queue
unless that hw queue is mapped to online CPUs.

Fix this issue by forcing use of blk-mq.

BTW, I have been using virtio-scsi (scsi_mq) for several years, and it has
been quite stable, so this shouldn't add extra risk.

Cc: Omar Sandoval <osandov@fb.com>,
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 59 +++-------------------------------------------
 1 file changed, 3 insertions(+), 56 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 7c28e8d4955a..54e3a0f6844c 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -91,9 +91,6 @@ struct virtio_scsi_vq {
 struct virtio_scsi_target_state {
 	seqcount_t tgt_seq;
 
-	/* Count of outstanding requests. */
-	atomic_t reqs;
-
 	/* Currently active virtqueue for requests sent to this target. */
 	struct virtio_scsi_vq *req_vq;
 };
@@ -152,8 +149,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 	struct virtio_scsi_cmd *cmd = buf;
 	struct scsi_cmnd *sc = cmd->sc;
 	struct virtio_scsi_cmd_resp *resp = &cmd->resp.cmd;
-	struct virtio_scsi_target_state *tgt =
-				scsi_target(sc->device)->hostdata;
 
 	dev_dbg(&sc->device->sdev_gendev,
 		"cmd %p response %u status %#02x sense_len %u\n",
@@ -210,8 +205,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 	}
 
 	sc->scsi_done(sc);
-
-	atomic_dec(&tgt->reqs);
 }
 
 static void virtscsi_vq_done(struct virtio_scsi *vscsi,
@@ -580,10 +573,7 @@ static int virtscsi_queuecommand_single(struct Scsi_Host *sh,
 					struct scsi_cmnd *sc)
 {
 	struct virtio_scsi *vscsi = shost_priv(sh);
-	struct virtio_scsi_target_state *tgt =
-				scsi_target(sc->device)->hostdata;
 
-	atomic_inc(&tgt->reqs);
 	return virtscsi_queuecommand(vscsi, &vscsi->req_vqs[0], sc);
 }
 
@@ -596,55 +586,11 @@ static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
 	return &vscsi->req_vqs[hwq];
 }
 
-static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi,
-					       struct virtio_scsi_target_state *tgt)
-{
-	struct virtio_scsi_vq *vq;
-	unsigned long flags;
-	u32 queue_num;
-
-	local_irq_save(flags);
-	if (atomic_inc_return(&tgt->reqs) > 1) {
-		unsigned long seq;
-
-		do {
-			seq = read_seqcount_begin(&tgt->tgt_seq);
-			vq = tgt->req_vq;
-		} while (read_seqcount_retry(&tgt->tgt_seq, seq));
-	} else {
-		/* no writes can be concurrent because of atomic_t */
-		write_seqcount_begin(&tgt->tgt_seq);
-
-		/* keep previous req_vq if a reader just arrived */
-		if (unlikely(atomic_read(&tgt->reqs) > 1)) {
-			vq = tgt->req_vq;
-			goto unlock;
-		}
-
-		queue_num = smp_processor_id();
-		while (unlikely(queue_num >= vscsi->num_queues))
-			queue_num -= vscsi->num_queues;
-		tgt->req_vq = vq = &vscsi->req_vqs[queue_num];
- unlock:
-		write_seqcount_end(&tgt->tgt_seq);
-	}
-	local_irq_restore(flags);
-
-	return vq;
-}
-
 static int virtscsi_queuecommand_multi(struct Scsi_Host *sh,
 				       struct scsi_cmnd *sc)
 {
 	struct virtio_scsi *vscsi = shost_priv(sh);
-	struct virtio_scsi_target_state *tgt =
-				scsi_target(sc->device)->hostdata;
-	struct virtio_scsi_vq *req_vq;
-
-	if (shost_use_blk_mq(sh))
-		req_vq = virtscsi_pick_vq_mq(vscsi, sc);
-	else
-		req_vq = virtscsi_pick_vq(vscsi, tgt);
+	struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
 
 	return virtscsi_queuecommand(vscsi, req_vq, sc);
 }
@@ -775,7 +721,6 @@ static int virtscsi_target_alloc(struct scsi_target *starget)
 		return -ENOMEM;
 
 	seqcount_init(&tgt->tgt_seq);
-	atomic_set(&tgt->reqs, 0);
 	tgt->req_vq = &vscsi->req_vqs[0];
 
 	starget->hostdata = tgt;
@@ -823,6 +768,7 @@ static struct scsi_host_template virtscsi_host_template_single = {
 	.target_alloc = virtscsi_target_alloc,
 	.target_destroy = virtscsi_target_destroy,
 	.track_queue_depth = 1,
+	.force_blk_mq = 1,
 };
 
 static struct scsi_host_template virtscsi_host_template_multi = {
@@ -844,6 +790,7 @@ static struct scsi_host_template virtscsi_host_template_multi = {
 	.target_destroy = virtscsi_target_destroy,
 	.map_queues = virtscsi_map_queues,
 	.track_queue_depth = 1,
+	.force_blk_mq = 1,
 };
 
 #define virtscsi_config_get(vdev, fld) \
-- 
2.9.5


* [PATCH V5 5/5] scsi: virtio_scsi: unify scsi_host_template
  2018-03-13  9:42 [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Ming Lei
                   ` (3 preceding siblings ...)
  2018-03-13  9:42 ` [PATCH V5 4/5] scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity Ming Lei
@ 2018-03-13  9:42 ` Ming Lei
  2018-03-14  8:58   ` Christoph Hellwig
  2018-03-15  3:35 ` [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Martin K. Petersen
  5 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2018-03-13  9:42 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Ming Lei, Omar Sandoval, James Bottomley,
	Hannes Reinecke

Now that we always use blk-mq, both the single queue and multi queue cases
can be handled by one .queuecommand callback, so it is no longer necessary
to use two scsi_host_templates.

Suggested-by: Christoph Hellwig <hch@lst.de>,
Cc: Omar Sandoval <osandov@fb.com>,
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
Cc: Christoph Hellwig <hch@lst.de>,
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 74 ++++++++++------------------------------------
 1 file changed, 15 insertions(+), 59 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 54e3a0f6844c..45d04631888a 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -522,11 +522,20 @@ static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev,
 }
 #endif
 
-static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
-				 struct virtio_scsi_vq *req_vq,
+static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
+						  struct scsi_cmnd *sc)
+{
+	u32 tag = blk_mq_unique_tag(sc->request);
+	u16 hwq = blk_mq_unique_tag_to_hwq(tag);
+
+	return &vscsi->req_vqs[hwq];
+}
+
+static int virtscsi_queuecommand(struct Scsi_Host *shost,
 				 struct scsi_cmnd *sc)
 {
-	struct Scsi_Host *shost = virtio_scsi_host(vscsi->vdev);
+	struct virtio_scsi *vscsi = shost_priv(shost);
+	struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
 	struct virtio_scsi_cmd *cmd = scsi_cmd_priv(sc);
 	unsigned long flags;
 	int req_size;
@@ -569,32 +578,6 @@ static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
 	return 0;
 }
 
-static int virtscsi_queuecommand_single(struct Scsi_Host *sh,
-					struct scsi_cmnd *sc)
-{
-	struct virtio_scsi *vscsi = shost_priv(sh);
-
-	return virtscsi_queuecommand(vscsi, &vscsi->req_vqs[0], sc);
-}
-
-static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
-						  struct scsi_cmnd *sc)
-{
-	u32 tag = blk_mq_unique_tag(sc->request);
-	u16 hwq = blk_mq_unique_tag_to_hwq(tag);
-
-	return &vscsi->req_vqs[hwq];
-}
-
-static int virtscsi_queuecommand_multi(struct Scsi_Host *sh,
-				       struct scsi_cmnd *sc)
-{
-	struct virtio_scsi *vscsi = shost_priv(sh);
-	struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
-
-	return virtscsi_queuecommand(vscsi, req_vq, sc);
-}
-
 static int virtscsi_tmf(struct virtio_scsi *vscsi, struct virtio_scsi_cmd *cmd)
 {
 	DECLARE_COMPLETION_ONSTACK(comp);
@@ -750,34 +733,13 @@ static enum blk_eh_timer_return virtscsi_eh_timed_out(struct scsi_cmnd *scmnd)
 	return BLK_EH_RESET_TIMER;
 }
 
-static struct scsi_host_template virtscsi_host_template_single = {
-	.module = THIS_MODULE,
-	.name = "Virtio SCSI HBA",
-	.proc_name = "virtio_scsi",
-	.this_id = -1,
-	.cmd_size = sizeof(struct virtio_scsi_cmd),
-	.queuecommand = virtscsi_queuecommand_single,
-	.change_queue_depth = virtscsi_change_queue_depth,
-	.eh_abort_handler = virtscsi_abort,
-	.eh_device_reset_handler = virtscsi_device_reset,
-	.eh_timed_out = virtscsi_eh_timed_out,
-	.slave_alloc = virtscsi_device_alloc,
-
-	.dma_boundary = UINT_MAX,
-	.use_clustering = ENABLE_CLUSTERING,
-	.target_alloc = virtscsi_target_alloc,
-	.target_destroy = virtscsi_target_destroy,
-	.track_queue_depth = 1,
-	.force_blk_mq = 1,
-};
-
-static struct scsi_host_template virtscsi_host_template_multi = {
+static struct scsi_host_template virtscsi_host_template = {
 	.module = THIS_MODULE,
 	.name = "Virtio SCSI HBA",
 	.proc_name = "virtio_scsi",
 	.this_id = -1,
 	.cmd_size = sizeof(struct virtio_scsi_cmd),
-	.queuecommand = virtscsi_queuecommand_multi,
+	.queuecommand = virtscsi_queuecommand,
 	.change_queue_depth = virtscsi_change_queue_depth,
 	.eh_abort_handler = virtscsi_abort,
 	.eh_device_reset_handler = virtscsi_device_reset,
@@ -883,7 +845,6 @@ static int virtscsi_probe(struct virtio_device *vdev)
 	u32 sg_elems, num_targets;
 	u32 cmd_per_lun;
 	u32 num_queues;
-	struct scsi_host_template *hostt;
 
 	if (!vdev->config->get) {
 		dev_err(&vdev->dev, "%s failure: config access disabled\n",
@@ -896,12 +857,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
 
 	num_targets = virtscsi_config_get(vdev, max_target) + 1;
 
-	if (num_queues == 1)
-		hostt = &virtscsi_host_template_single;
-	else
-		hostt = &virtscsi_host_template_multi;
-
-	shost = scsi_host_alloc(hostt,
+	shost = scsi_host_alloc(&virtscsi_host_template,
 		sizeof(*vscsi) + sizeof(vscsi->req_vqs[0]) * num_queues);
 	if (!shost)
 		return -ENOMEM;
-- 
2.9.5


* RE: [PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue
  2018-03-13  9:42 ` [PATCH V5 2/5] scsi: megaraid_sas: " Ming Lei
@ 2018-03-13 17:13   ` Kashyap Desai
  2018-03-14  8:57   ` Christoph Hellwig
  2018-03-14 15:07   ` Artem Bityutskiy
  2 siblings, 0 replies; 20+ messages in thread
From: Kashyap Desai @ 2018-03-13 17:13 UTC (permalink / raw)
  To: Ming Lei, James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Laurence Oberman, Mike Snitzer, Paolo Bonzini,
	Hannes Reinecke, Artem Bityutskiy

> -----Original Message-----
> From: Ming Lei [mailto:ming.lei@redhat.com]
> Sent: Tuesday, March 13, 2018 3:13 PM
> To: James Bottomley; Jens Axboe; Martin K . Petersen
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; linux-
> block@vger.kernel.org; Meelis Roos; Don Brace; Kashyap Desai; Laurence
> Oberman; Mike Snitzer; Paolo Bonzini; Ming Lei; Hannes Reinecke; James
> Bottomley; Artem Bityutskiy
> Subject: [PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue
>
> From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
> one msix vector can be created without any online CPU mapped, then a
> command may be queued on it and won't be notified after its completion.
>
> This patch sets up a mapping between cpu and reply queue according to irq
> affinity info retrieved by pci_irq_get_affinity(), and uses this info to
> choose the reply queue for queuing one command.
>
> Then the chosen reply queue has to be active, and fixes IO hang caused
> by using an inactive reply queue which doesn't have any online CPU mapped.
>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
> Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
> Cc: Christoph Hellwig <hch@lst.de>,
> Cc: Don Brace <don.brace@microsemi.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Laurence Oberman <loberman@redhat.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: Meelis Roos <mroos@linux.ee>
> Cc: Artem Bityutskiy <artem.bityutskiy@intel.com>
> Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible
CPUs")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>


Side note - for maximum performance, your patch/series proposed below is
also required.  Without it, performance drops because fewer reply queues
get utilized once this particular patch is included:

genirq/affinity: irq vector spread among online CPUs as far as possible

Acked-by: Kashyap Desai <kashyap.desai@broadcom.com>
Tested-by: Kashyap Desai <kashyap.desai@broadcom.com>


* Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-13  9:42 ` [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue Ming Lei
@ 2018-03-14  8:57   ` Christoph Hellwig
  2018-03-14 15:07   ` Bityutskiy, Artem
  2018-03-19 11:48   ` Bityutskiy, Artem
  2 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2018-03-14  8:57 UTC (permalink / raw)
  To: Ming Lei
  Cc: James Bottomley, Jens Axboe, Martin K . Petersen,
	Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Hannes Reinecke, Artem Bityutskiy

I still don't like the code duplication, but I guess I can fix this
up myself in one of the next merge windows.

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue
  2018-03-13  9:42 ` [PATCH V5 2/5] scsi: megaraid_sas: " Ming Lei
  2018-03-13 17:13   ` Kashyap Desai
@ 2018-03-14  8:57   ` Christoph Hellwig
  2018-03-14 15:07   ` Artem Bityutskiy
  2 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2018-03-14  8:57 UTC (permalink / raw)
  To: Ming Lei
  Cc: James Bottomley, Jens Axboe, Martin K . Petersen,
	Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Hannes Reinecke, Artem Bityutskiy

Same as for hpsa..

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH V5 4/5] scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity
  2018-03-13  9:42 ` [PATCH V5 4/5] scsi: virtio_scsi: fix IO hang caused by irq vector automatic affinity Ming Lei
@ 2018-03-14  8:58   ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2018-03-14  8:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: James Bottomley, Jens Axboe, Martin K . Petersen,
	Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Omar Sandoval

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH V5 5/5] scsi: virtio_scsi: unify scsi_host_template
  2018-03-13  9:42 ` [PATCH V5 5/5] scsi: virtio_scsi: unify scsi_host_template Ming Lei
@ 2018-03-14  8:58   ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2018-03-14  8:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: James Bottomley, Jens Axboe, Martin K . Petersen,
	Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Omar Sandoval, Hannes Reinecke

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-13  9:42 ` [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue Ming Lei
  2018-03-14  8:57   ` Christoph Hellwig
@ 2018-03-14 15:07   ` Bityutskiy, Artem
  2018-03-19 11:48   ` Bityutskiy, Artem
  2 siblings, 0 replies; 20+ messages in thread
From: Bityutskiy, Artem @ 2018-03-14 15:07 UTC (permalink / raw)
  To: martin.petersen, James.Bottomley, axboe, ming.lei
  Cc: linux-block, snitzer, hch, hare, mroos, linux-scsi, don.brace,
	pbonzini, loberman, kashyap.desai

On Tue, 2018-03-13 at 17:42 +0800, Ming Lei wrote:
> From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
> one msix vector can be created without any online CPU mapped, then one
> command's completion may not be notified.
> 
> This patch sets up a mapping between cpu and reply queue according to irq
> affinity info retrieved by pci_irq_get_affinity(), and uses this mapping
> table to choose the reply queue for queuing one command.
> 
> Then the chosen reply queue has to be active, and fixes IO hang caused
> by using an inactive reply queue which doesn't have any online CPU mapped.
> 
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
> Cc: James Bottomley <james.bottomley@hansenpartnership.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Don Brace <don.brace@microsemi.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Laurence Oberman <loberman@redhat.com>
> Cc: Meelis Roos <mroos@linux.ee>
> Cc: Artem Bityutskiy <artem.bityutskiy@intel.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Tested-by: Laurence Oberman <loberman@redhat.com>
> Tested-by: Don Brace <don.brace@microsemi.com>
> Tested-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
> Acked-by: Don Brace <don.brace@microsemi.com>
> Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Checked v5 on my Skylake Xeon and with this patch the regression that I
reported is fixed.

Tested-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
Link: https://lkml.kernel.org/r/1519311270.2535.53.camel@intel.com


* Re: [PATCH V5 2/5] scsi: megaraid_sas: fix selection of reply queue
  2018-03-13  9:42 ` [PATCH V5 2/5] scsi: megaraid_sas: " Ming Lei
  2018-03-13 17:13   ` Kashyap Desai
  2018-03-14  8:57   ` Christoph Hellwig
@ 2018-03-14 15:07   ` Artem Bityutskiy
  2 siblings, 0 replies; 20+ messages in thread
From: Artem Bityutskiy @ 2018-03-14 15:07 UTC (permalink / raw)
  To: Ming Lei, James Bottomley, Jens Axboe, Martin K . Petersen
  Cc: Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini, Hannes Reinecke

On Tue, 2018-03-13 at 17:42 +0800, Ming Lei wrote:
> From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
> one msix vector can be created without any online CPU mapped, then
> command may be queued, and won't be notified after its completion.
> 
> This patch sets up a mapping between cpu and reply queue according to irq
> affinity info retrieved by pci_irq_get_affinity(), and uses this info
> to choose reply queue for queuing one command.
> 
> Then the chosen reply queue has to be active, and fixes IO hang caused
> by using inactive reply queue which doesn't have any online CPU mapped.
> 
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
> Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
> Cc: Christoph Hellwig <hch@lst.de>,
> Cc: Don Brace <don.brace@microsemi.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Laurence Oberman <loberman@redhat.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: Meelis Roos <mroos@linux.ee>
> Cc: Artem Bityutskiy <artem.bityutskiy@intel.com>
> Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Checked v5 on my Skylake Xeon and with this patch the regression that I reported is fixed.

Tested-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
Link: https://lkml.kernel.org/r/1519311270.2535.53.camel@intel.com


* Re: [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue
  2018-03-13  9:42 [PATCH V5 0/5] SCSI: fix selection of reply(hw) queue Ming Lei
                   ` (4 preceding siblings ...)
  2018-03-13  9:42 ` [PATCH V5 5/5] scsi: virtio_scsi: unify scsi_host_template Ming Lei
@ 2018-03-15  3:35 ` Martin K. Petersen
  5 siblings, 0 replies; 20+ messages in thread
From: Martin K. Petersen @ 2018-03-15  3:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: James Bottomley, Jens Axboe, Martin K . Petersen,
	Christoph Hellwig, linux-scsi, linux-block, Meelis Roos,
	Don Brace, Kashyap Desai, Laurence Oberman, Mike Snitzer,
	Paolo Bonzini


Ming,

> The patches fixes reply queue(virt-queue on virtio-scsi) selection on
> hpsa, megaraid_sa and virtio-scsi, and IO hang can be caused easily by
> this issue.

I clarified all the commit descriptions. There were also a bunch of
duplicate review tags and other warnings. Please run checkpatch next
time!

Applied to 4.16/scsi-fixes. Thank you.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-13  9:42 ` [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue Ming Lei
  2018-03-14  8:57   ` Christoph Hellwig
  2018-03-14 15:07   ` Bityutskiy, Artem
@ 2018-03-19 11:48   ` Bityutskiy, Artem
  2018-03-19 14:31     ` Jens Axboe
  2018-03-20  3:16     ` Martin K. Petersen
  2 siblings, 2 replies; 20+ messages in thread
From: Bityutskiy, Artem @ 2018-03-19 11:48 UTC (permalink / raw)
  To: martin.petersen, James.Bottomley, axboe, ming.lei
  Cc: linux-block, snitzer, hch, hare, mroos, linux-scsi, don.brace,
	pbonzini, loberman, kashyap.desai

On Tue, 2018-03-13 at 17:42 +0800, Ming Lei wrote:
> From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
> one msix vector can be created without any online CPU mapped, then one
> command's completion may not be notified.

Hello Martin, James,

we have this patch-set and it fixes the megaraid regression in v4.16. Do
you plan to merge it into v4.16-rcX? I am worried - there seems to be no
sign that this is going to be merged. The patches are not in linux-next.

-- 
Best Regards,
Artem Bityutskiy


* Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-19 11:48   ` Bityutskiy, Artem
@ 2018-03-19 14:31     ` Jens Axboe
  2018-03-19 14:42       ` Artem Bityutskiy
  2018-03-20  3:16     ` Martin K. Petersen
  1 sibling, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2018-03-19 14:31 UTC (permalink / raw)
  To: Bityutskiy, Artem, martin.petersen, James.Bottomley, ming.lei
  Cc: linux-block, snitzer, hch, hare, mroos, linux-scsi, don.brace,
	pbonzini, loberman, kashyap.desai

On 3/19/18 5:48 AM, Bityutskiy, Artem wrote:
> On Tue, 2018-03-13 at 17:42 +0800, Ming Lei wrote:
>> From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
>> one msix vector can be created without any online CPU mapped, then one
>> command's completion may not be notified.
> 
> Hello Martin, James,
> 
> we have this patch-set and it fixes the megaraid regression in v4.16. Do
> you plan to merge it into v4.16-rcX? I am worried - there seems to be no
> sign that this is going to be merged. The patches are not in linux-next.

I'm assuming that Martin will eventually queue this up. But probably
for 4.17, then we can always flag it for a backport to stable once
it's been thoroughly tested.

-- 
Jens Axboe


* Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-19 14:31     ` Jens Axboe
@ 2018-03-19 14:42       ` Artem Bityutskiy
  2018-03-19 14:55         ` Kashyap Desai
  2018-03-19 15:26         ` Ming Lei
  0 siblings, 2 replies; 20+ messages in thread
From: Artem Bityutskiy @ 2018-03-19 14:42 UTC (permalink / raw)
  To: hch, Thomas Gleixner
  Cc: linux-block, snitzer, hare, mroos, linux-scsi, don.brace,
	pbonzini, loberman, kashyap.desai, Jens Axboe, martin.petersen,
	James.Bottomley, ming.lei

On Mon, 2018-03-19 at 08:31 -0600, Jens Axboe wrote:
> I'm assuming that Martin will eventually queue this up. But probably
> for 4.17, then we can always flag it for a backport to stable once
> it's been thoroughly tested.

Jens, thanks for reply.

I wonder if folks agree that in this case we should revert 

84676c1f21e8 genirq/affinity: assign vectors to all possible CPUs

for v4.16.

If this were a minor niche use-case regression, the -stable scenario
would probably be OK. But the patch seems to miss the fact that the
kernel's "possible CPUs" notion may be way off, and the side effects are bad.

Christoph, Thomas, what do you think?

Thanks,
Artem.


* RE: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-19 14:42       ` Artem Bityutskiy
@ 2018-03-19 14:55         ` Kashyap Desai
  2018-03-19 15:26         ` Ming Lei
  1 sibling, 0 replies; 20+ messages in thread
From: Kashyap Desai @ 2018-03-19 14:55 UTC (permalink / raw)
  To: dedekind1, hch, Thomas Gleixner
  Cc: linux-block, snitzer, hare, mroos, linux-scsi, don.brace,
	pbonzini, loberman, Jens Axboe, martin.petersen, James.Bottomley,
	ming.lei

> -----Original Message-----
> From: Artem Bityutskiy [mailto:dedekind1@gmail.com]
> Sent: Monday, March 19, 2018 8:12 PM
> To: hch@lst.de; Thomas Gleixner
> Cc: linux-block@vger.kernel.org; snitzer@redhat.com; hare@suse.de;
> mroos@linux.ee; linux-scsi@vger.kernel.org; don.brace@microsemi.com;
> pbonzini@redhat.com; loberman@redhat.com;
> kashyap.desai@broadcom.com; Jens Axboe; martin.petersen@oracle.com;
> James.Bottomley@HansenPartnership.com; ming.lei@redhat.com
> Subject: Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
>
> On Mon, 2018-03-19 at 08:31 -0600, Jens Axboe wrote:
> > I'm assuming that Martin will eventually queue this up. But probably
> > for 4.17, then we can always flag it for a backport to stable once
> > it's been thoroughly tested.
>
> Jens, thanks for reply.
>
> I wonder if folks agree that in this case we should revert
>
> 84676c1f21e8 genirq/affinity: assign vectors to all possible CPUs
>
> for v4.16.
>
> If this were a minor niche use-case regression, the -stable scenario
> would probably be OK. But the patch seems to miss the fact that the
> kernel's "possible CPUs" notion may be way off, and the side effects
> are bad.

It is also a performance issue, as posted at the link below, if we just
use "84676c1f21e8 genirq/affinity: assign vectors to all possible CPUs":

https://www.spinics.net/lists/linux-scsi/msg118301.html

The performance drop was resolved using the patch set under discussion
(available at the link below), posted by Ming:

https://marc.info/?l=linux-block&m=152050646332092&w=2

Kashyap

>
> Christoph, Thomas, what do you think?
>
> Thanks,
> Artem.


* Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-19 14:42       ` Artem Bityutskiy
  2018-03-19 14:55         ` Kashyap Desai
@ 2018-03-19 15:26         ` Ming Lei
  1 sibling, 0 replies; 20+ messages in thread
From: Ming Lei @ 2018-03-19 15:26 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: hch, Thomas Gleixner, linux-block, snitzer, hare, mroos,
	linux-scsi, don.brace, pbonzini, loberman, kashyap.desai,
	Jens Axboe, martin.petersen, James.Bottomley

On Mon, Mar 19, 2018 at 04:42:09PM +0200, Artem Bityutskiy wrote:
> On Mon, 2018-03-19 at 08:31 -0600, Jens Axboe wrote:
> > I'm assuming that Martin will eventually queue this up. But probably
> > for 4.17, then we can always flag it for a backport to stable once
> > it's been thoroughly tested.
> 
> Jens, thanks for reply.
> 
> I wonder if folks agree that in this case we should revert 
> 
> 84676c1f21e8 genirq/affinity: assign vectors to all possible CPUs
> 
> for v4.16.

Even if 84676c1f21e8 is reverted, IO failure or hang can still be
triggered easily during CPU online/offline tests.

So this patchset is really needed.

Thanks,
Ming


* Re: [PATCH V5 1/5] scsi: hpsa: fix selection of reply queue
  2018-03-19 11:48   ` Bityutskiy, Artem
  2018-03-19 14:31     ` Jens Axboe
@ 2018-03-20  3:16     ` Martin K. Petersen
  1 sibling, 0 replies; 20+ messages in thread
From: Martin K. Petersen @ 2018-03-20  3:16 UTC (permalink / raw)
  To: Bityutskiy, Artem
  Cc: martin.petersen, James.Bottomley, axboe, ming.lei, linux-block,
	snitzer, hch, hare, mroos, linux-scsi, don.brace, pbonzini,
	loberman, kashyap.desai


Artem,

> we have this patch-set and it fixes the megaraid regression in v4.16. Do
> you plan to merge it into v4.16-rcX? I am worried - there seems to be no
> sign that this is going to be merged. The patches are not in linux-next.

I merged them into scsi-fixes last week.

I occasionally push a combined fixes+queue branch to linux-next to get
more zeroday coverage. However, most of the time linux-next carries
4.x+1 material only.

-- 
Martin K. Petersen	Oracle Linux Engineering

