* [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues
@ 2018-03-27 15:39 ` Keith Busch
  0 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2018-03-27 15:39 UTC (permalink / raw)
  To: Linux NVMe, Linux Block
  Cc: Christoph Hellwig, Sagi Grimberg, Jianchao Wang, Ming Lei,
	Jens Axboe, Keith Busch, Don Brace, qla2xxx-upstream, linux-scsi

The PCI interrupt vectors intended to be associated with a queue may
not start at 0; a driver may allocate pre_vectors for special use. This
patch adds an offset parameter so blk-mq may find the intended affinity
mask and updates all drivers using this API accordingly.
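
As a rough illustration (hypothetical driver code, not part of this
patch), a driver that reserves one pre_vector for its own use would
allocate its vectors with an affinity descriptor and then pass the
offset of the first queue vector when mapping:

	/* Sketch only: one pre_vector reserved for non-queue interrupts */
	struct irq_affinity affd = { .pre_vectors = 1 };
	int nvecs;

	nvecs = pci_alloc_irq_vectors_affinity(pdev, 2, nr_queues + 1,
			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
	if (nvecs < 0)
		return nvecs;

	/* hardware queue i now uses vector i + 1, so map with offset 1 */
	return blk_mq_pci_map_queues(set, pdev, 1);

Drivers that do not reserve any pre_vectors simply pass an offset of 0,
as the updated callers below do.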

Cc: Don Brace <don.brace@microsemi.com>
Cc: <qla2xxx-upstream@qlogic.com>
Cc: <linux-scsi@vger.kernel.org>
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
v1 -> v2:

  Update blk-mq API directly instead of chaining a default parameter to
  a new API, and update all drivers accordingly.

 block/blk-mq-pci.c                    | 6 ++++--
 drivers/nvme/host/pci.c               | 2 +-
 drivers/scsi/qla2xxx/qla_os.c         | 2 +-
 drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
 include/linux/blk-mq-pci.h            | 3 ++-
 5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq-pci.c b/block/blk-mq-pci.c
index 76944e3271bf..e233996bb76f 100644
--- a/block/blk-mq-pci.c
+++ b/block/blk-mq-pci.c
@@ -21,6 +21,7 @@
  * blk_mq_pci_map_queues - provide a default queue mapping for PCI device
  * @set:	tagset to provide the mapping for
  * @pdev:	PCI device associated with @set.
+ * @offset:	Offset to use for the pci irq vector
  *
  * This function assumes the PCI device @pdev has at least as many available
  * interrupt vectors as @set has queues.  It will then query the vector
@@ -28,13 +29,14 @@
  * that maps a queue to the CPUs that have irq affinity for the corresponding
  * vector.
  */
-int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev)
+int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev,
+			    int offset)
 {
 	const struct cpumask *mask;
 	unsigned int queue, cpu;
 
 	for (queue = 0; queue < set->nr_hw_queues; queue++) {
-		mask = pci_irq_get_affinity(pdev, queue);
+		mask = pci_irq_get_affinity(pdev, queue + offset);
 		if (!mask)
 			goto fallback;
 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index cef5ce851a92..e3b9efca0571 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -414,7 +414,7 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 {
 	struct nvme_dev *dev = set->driver_data;
 
-	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev));
+	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
 }
 
 /**
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 12ee6e02d146..2c705f3dd265 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -6805,7 +6805,7 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost)
 	if (USER_CTRL_IRQ(vha->hw))
 		rc = blk_mq_map_queues(&shost->tag_set);
 	else
-		rc = blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev);
+		rc = blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev, 0);
 	return rc;
 }
 
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index b2880c7709e6..10c94011c8a8 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -5348,7 +5348,7 @@ static int pqi_map_queues(struct Scsi_Host *shost)
 {
 	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
 
-	return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev);
+	return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev, 0);
 }
 
 static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info,
diff --git a/include/linux/blk-mq-pci.h b/include/linux/blk-mq-pci.h
index 6338551e0fb9..9f4c17f0d2d8 100644
--- a/include/linux/blk-mq-pci.h
+++ b/include/linux/blk-mq-pci.h
@@ -5,6 +5,7 @@
 struct blk_mq_tag_set;
 struct pci_dev;
 
-int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev);
+int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev,
+			  int offset);
 
 #endif /* _LINUX_BLK_MQ_PCI_H */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/3] nvme-pci: Remove unused queue parameter
  2018-03-27 15:39 ` Keith Busch
@ 2018-03-27 15:39   ` Keith Busch
  -1 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2018-03-27 15:39 UTC (permalink / raw)
  To: Linux NVMe, Linux Block
  Cc: Christoph Hellwig, Sagi Grimberg, Jianchao Wang, Ming Lei,
	Jens Axboe, Keith Busch

All the queue memory is allocated up front. We no longer take the node
into consideration when creating queues, so remove the now-unused
parameter.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
v1 -> v2:

   Added review.

 drivers/nvme/host/pci.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e3b9efca0571..cc47fbe32ea5 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1379,8 +1379,7 @@ static int nvme_alloc_sq_cmds(struct nvme_dev *dev, struct nvme_queue *nvmeq,
 	return 0;
 }
 
-static int nvme_alloc_queue(struct nvme_dev *dev, int qid,
-		int depth, int node)
+static int nvme_alloc_queue(struct nvme_dev *dev, int qid, int depth)
 {
 	struct nvme_queue *nvmeq = &dev->queues[qid];
 
@@ -1595,8 +1594,7 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 	if (result < 0)
 		return result;
 
-	result = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH,
-			dev_to_node(dev->dev));
+	result = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH);
 	if (result)
 		return result;
 
@@ -1629,9 +1627,7 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 	int ret = 0;
 
 	for (i = dev->ctrl.queue_count; i <= dev->max_qid; i++) {
-		/* vector == qid - 1, match nvme_create_queue */
-		if (nvme_alloc_queue(dev, i, dev->q_depth,
-		     pci_irq_get_node(to_pci_dev(dev->dev), i - 1))) {
+		if (nvme_alloc_queue(dev, i, dev->q_depth)) {
 			ret = -ENOMEM;
 			break;
 		}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
  2018-03-27 15:39 ` Keith Busch
@ 2018-03-27 15:39   ` Keith Busch
  -1 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2018-03-27 15:39 UTC (permalink / raw)
  To: Linux NVMe, Linux Block
  Cc: Christoph Hellwig, Sagi Grimberg, Jianchao Wang, Ming Lei,
	Jens Axboe, Keith Busch

The admin and first IO queues shared the first irq vector, which has an
affinity mask including cpu0. If a system allows cpu0 to be offlined,
the admin queue may not be usable if no other CPUs in the affinity mask
are online. This is a problem because, unlike the IO queues, there is
only one admin queue, and it always needs to be usable.

To fix this, this patch allocates one pre_vector for the admin queue and
assigns it all CPUs, so it is always accessible. The IO queues are
assigned the remaining managed vectors.

In case a controller has only one interrupt vector available, the admin
and IO queues will share the pre_vector with all CPUs assigned.
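
As an illustrative sketch (example numbers, not taken from the patch),
the resulting vector layout for a controller that is granted five
vectors for four IO queues would be:

	/*
	 * Illustrative layout only: 5 vectors granted for 4 IO queues.
	 *
	 *   vector 0   -> admin queue, affinity mask spans all CPUs (pre_vector)
	 *   vector qid -> IO queue qid (qid = 1..4), managed affinity
	 *
	 * With a single granted vector, the admin queue and IO queue 1 both
	 * use vector 0, and the blk-mq mapping offset drops from 1 to 0.
	 */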

Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
v1 -> v2:

  Update to use new blk-mq API.

  Removed unnecessary braces, inline functions, and temp variables.

  Amended author (this has evolved significantly from the original).

 drivers/nvme/host/pci.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index cc47fbe32ea5..50c8eaf51d92 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -84,6 +84,7 @@ struct nvme_dev {
 	struct dma_pool *prp_small_pool;
 	unsigned online_queues;
 	unsigned max_qid;
+	unsigned int num_vecs;
 	int q_depth;
 	u32 db_stride;
 	void __iomem *bar;
@@ -414,7 +415,8 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 {
 	struct nvme_dev *dev = set->driver_data;
 
-	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
+	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
+				     dev->num_vecs > 1);
 }
 
 /**
@@ -1455,7 +1457,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 		nvmeq->sq_cmds_io = dev->cmb + offset;
 	}
 
-	nvmeq->cq_vector = qid - 1;
+	/*
+	 * A queue's vector matches the queue identifier unless the controller
+	 * has only one vector available.
+	 */
+	nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid;
 	result = adapter_alloc_cq(dev, qid, nvmeq);
 	if (result < 0)
 		goto release_vector;
@@ -1909,6 +1915,10 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	int result, nr_io_queues;
 	unsigned long size;
 
+	struct irq_affinity affd = {
+		.pre_vectors = 1
+	};
+
 	nr_io_queues = num_present_cpus();
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
 	if (result < 0)
@@ -1944,11 +1954,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	 * setting up the full range we need.
 	 */
 	pci_free_irq_vectors(pdev);
-	nr_io_queues = pci_alloc_irq_vectors(pdev, 1, nr_io_queues,
-			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY);
-	if (nr_io_queues <= 0)
+	result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
+			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+	if (result <= 0)
 		return -EIO;
-	dev->max_qid = nr_io_queues;
+	dev->num_vecs = result;
+	dev->max_qid = max(result - 1, 1);
 
 	/*
 	 * Should investigate if there's a performance win from allocating
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues
  2018-03-27 15:39 ` Keith Busch
@ 2018-03-28  1:26   ` Ming Lei
  -1 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2018-03-28  1:26 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux NVMe, Linux Block, Christoph Hellwig, Sagi Grimberg,
	Jianchao Wang, Jens Axboe, Don Brace, qla2xxx-upstream,
	linux-scsi

On Tue, Mar 27, 2018 at 09:39:06AM -0600, Keith Busch wrote:
> The PCI interrupt vectors intended to be associated with a queue may
> not start at 0; a driver may allocate pre_vectors for special use. This
> patch adds an offset parameter so blk-mq may find the intended affinity
> mask and updates all drivers using this API accordingly.
> 
> Cc: Don Brace <don.brace@microsemi.com>
> Cc: <qla2xxx-upstream@qlogic.com>
> Cc: <linux-scsi@vger.kernel.org>
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
> v1 -> v2:
> 
>   Update blk-mq API directly instead of chaining a default parameter to
>   a new API, and update all drivers accordingly.
> 
>  block/blk-mq-pci.c                    | 6 ++++--
>  drivers/nvme/host/pci.c               | 2 +-
>  drivers/scsi/qla2xxx/qla_os.c         | 2 +-
>  drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
>  include/linux/blk-mq-pci.h            | 3 ++-
>  5 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/block/blk-mq-pci.c b/block/blk-mq-pci.c
> index 76944e3271bf..e233996bb76f 100644
> --- a/block/blk-mq-pci.c
> +++ b/block/blk-mq-pci.c
> @@ -21,6 +21,7 @@
>   * blk_mq_pci_map_queues - provide a default queue mapping for PCI device
>   * @set:	tagset to provide the mapping for
>   * @pdev:	PCI device associated with @set.
> + * @offset:	Offset to use for the pci irq vector
>   *
>   * This function assumes the PCI device @pdev has at least as many available
>   * interrupt vectors as @set has queues.  It will then query the vector
> @@ -28,13 +29,14 @@
>   * that maps a queue to the CPUs that have irq affinity for the corresponding
>   * vector.
>   */
> -int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev)
> +int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev,
> +			    int offset)
>  {
>  	const struct cpumask *mask;
>  	unsigned int queue, cpu;
>  
>  	for (queue = 0; queue < set->nr_hw_queues; queue++) {
> -		mask = pci_irq_get_affinity(pdev, queue);
> +		mask = pci_irq_get_affinity(pdev, queue + offset);
>  		if (!mask)
>  			goto fallback;
>  
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index cef5ce851a92..e3b9efca0571 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -414,7 +414,7 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
>  {
>  	struct nvme_dev *dev = set->driver_data;
>  
> -	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev));
> +	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
>  }
>  
>  /**
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index 12ee6e02d146..2c705f3dd265 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -6805,7 +6805,7 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost)
>  	if (USER_CTRL_IRQ(vha->hw))
>  		rc = blk_mq_map_queues(&shost->tag_set);
>  	else
> -		rc = blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev);
> +		rc = blk_mq_pci_map_queues(&shost->tag_set, vha->hw->pdev, 0);
>  	return rc;
>  }
>  
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index b2880c7709e6..10c94011c8a8 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5348,7 +5348,7 @@ static int pqi_map_queues(struct Scsi_Host *shost)
>  {
>  	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
>  
> -	return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev);
> +	return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev, 0);
>  }
>  
>  static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info,
> diff --git a/include/linux/blk-mq-pci.h b/include/linux/blk-mq-pci.h
> index 6338551e0fb9..9f4c17f0d2d8 100644
> --- a/include/linux/blk-mq-pci.h
> +++ b/include/linux/blk-mq-pci.h
> @@ -5,6 +5,7 @@
>  struct blk_mq_tag_set;
>  struct pci_dev;
>  
> -int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev);
> +int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev,
> +			  int offset);
>  
>  #endif /* _LINUX_BLK_MQ_PCI_H */
> -- 
> 2.14.3
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] nvme-pci: Remove unused queue parameter
  2018-03-27 15:39   ` Keith Busch
@ 2018-03-28  1:27     ` Ming Lei
  -1 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2018-03-28  1:27 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux NVMe, Linux Block, Christoph Hellwig, Sagi Grimberg,
	Jianchao Wang, Jens Axboe

On Tue, Mar 27, 2018 at 09:39:07AM -0600, Keith Busch wrote:
> All the queue memory is allocated up front. We don't take the node
> into consideration when creating queues anymore, so removing the unused
> parameter.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
> v1 -> v2:
> 
>    Added review.
> 
>  drivers/nvme/host/pci.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index e3b9efca0571..cc47fbe32ea5 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1379,8 +1379,7 @@ static int nvme_alloc_sq_cmds(struct nvme_dev *dev, struct nvme_queue *nvmeq,
>  	return 0;
>  }
>  
> -static int nvme_alloc_queue(struct nvme_dev *dev, int qid,
> -		int depth, int node)
> +static int nvme_alloc_queue(struct nvme_dev *dev, int qid, int depth)
>  {
>  	struct nvme_queue *nvmeq = &dev->queues[qid];
>  
> @@ -1595,8 +1594,7 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
>  	if (result < 0)
>  		return result;
>  
> -	result = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH,
> -			dev_to_node(dev->dev));
> +	result = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH);
>  	if (result)
>  		return result;
>  
> @@ -1629,9 +1627,7 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
>  	int ret = 0;
>  
>  	for (i = dev->ctrl.queue_count; i <= dev->max_qid; i++) {
> -		/* vector == qid - 1, match nvme_create_queue */
> -		if (nvme_alloc_queue(dev, i, dev->q_depth,
> -		     pci_irq_get_node(to_pci_dev(dev->dev), i - 1))) {
> +		if (nvme_alloc_queue(dev, i, dev->q_depth)) {
>  			ret = -ENOMEM;
>  			break;
>  		}
> -- 
> 2.14.3
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
  2018-03-27 15:39   ` Keith Busch
@ 2018-03-28  2:08     ` Ming Lei
  -1 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2018-03-28  2:08 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux NVMe, Linux Block, Christoph Hellwig, Sagi Grimberg,
	Jianchao Wang, Jens Axboe

On Tue, Mar 27, 2018 at 09:39:08AM -0600, Keith Busch wrote:
> The admin and first IO queues shared the first irq vector, which has an
> affinity mask including cpu0. If a system allows cpu0 to be offlined,
> the admin queue may not be usable if no other CPUs in the affinity mask
> are online. This is a problem since unlike IO queues, there is only
> one admin queue that always needs to be usable.
> 
> To fix, this patch allocates one pre_vector for the admin queue that
> is assigned all CPUs, so will always be accessible. The IO queues are
> assigned the remaining managed vectors.
> 
> In case a controller has only one interrupt vector available, the admin
> and IO queues will share the pre_vector with all CPUs assigned.
> 
> Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
> v1 -> v2:
> 
>   Update to use new blk-mq API.
> 
>   Removed unnecessary braces, inline functions, and temp variables.
> 
>   Amended author (this has evolved significantly from the original).
> 
>  drivers/nvme/host/pci.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index cc47fbe32ea5..50c8eaf51d92 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -84,6 +84,7 @@ struct nvme_dev {
>  	struct dma_pool *prp_small_pool;
>  	unsigned online_queues;
>  	unsigned max_qid;
> +	unsigned int num_vecs;
>  	int q_depth;
>  	u32 db_stride;
>  	void __iomem *bar;
> @@ -414,7 +415,8 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
>  {
>  	struct nvme_dev *dev = set->driver_data;
>  
> -	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
> +	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
> +				     dev->num_vecs > 1);
>  }
>  
>  /**
> @@ -1455,7 +1457,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
>  		nvmeq->sq_cmds_io = dev->cmb + offset;
>  	}
>  
> -	nvmeq->cq_vector = qid - 1;
> +	/*
> +	 * A queue's vector matches the queue identifier unless the controller
> +	 * has only one vector available.
> +	 */
> +	nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid;
>  	result = adapter_alloc_cq(dev, qid, nvmeq);
>  	if (result < 0)
>  		goto release_vector;
> @@ -1909,6 +1915,10 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
>  	int result, nr_io_queues;
>  	unsigned long size;
>  
> +	struct irq_affinity affd = {
> +		.pre_vectors = 1
> +	};
> +
>  	nr_io_queues = num_present_cpus();
>  	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
>  	if (result < 0)
> @@ -1944,11 +1954,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
>  	 * setting up the full range we need.
>  	 */
>  	pci_free_irq_vectors(pdev);
> -	nr_io_queues = pci_alloc_irq_vectors(pdev, 1, nr_io_queues,
> -			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY);
> -	if (nr_io_queues <= 0)
> +	result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
> +			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
> +	if (result <= 0)
>  		return -EIO;
> -	dev->max_qid = nr_io_queues;
> +	dev->num_vecs = result;
> +	dev->max_qid = max(result - 1, 1);
>  
>  	/*
>  	 * Should investigate if there's a performance win from allocating
> -- 
> 2.14.3
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues
  2018-03-27 15:39 ` Keith Busch
@ 2018-03-28  3:24   ` Jens Axboe
  -1 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2018-03-28  3:24 UTC (permalink / raw)
  To: Keith Busch, Linux NVMe, Linux Block
  Cc: Christoph Hellwig, Sagi Grimberg, Jianchao Wang, Ming Lei,
	Don Brace, qla2xxx-upstream, linux-scsi

On 3/27/18 9:39 AM, Keith Busch wrote:
> The PCI interrupt vectors intended to be associated with a queue may
> not start at 0; a driver may allocate pre_vectors for special use. This
> patch adds an offset parameter so blk-mq may find the intended affinity
> mask and updates all drivers using this API accordingly.

Looks good to me, I'll queue it up for 4.17.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
  2018-03-27 15:39   ` Keith Busch
@ 2018-03-28  7:32     ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2018-03-28  7:32 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux NVMe, Linux Block, Christoph Hellwig, Sagi Grimberg,
	Jianchao Wang, Ming Lei, Jens Axboe

On Tue, Mar 27, 2018 at 09:39:08AM -0600, Keith Busch wrote:
> -	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
> +	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
> +				     dev->num_vecs > 1);

Can you turn this into:

-	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
			     dev->num_vecs > 1 ? 1 /* admin queue */ : 0);

no functional change, but much easier to understand.

Except for that the whole series looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues
  2018-03-27 15:39 ` Keith Busch
@ 2018-03-28 14:48   ` Don Brace
  -1 siblings, 0 replies; 24+ messages in thread
From: Don Brace @ 2018-03-28 14:48 UTC (permalink / raw)
  To: Keith Busch, Linux NVMe, Linux Block
  Cc: Christoph Hellwig, Sagi Grimberg, Jianchao Wang, Ming Lei,
	Jens Axboe, qla2xxx-upstream, linux-scsi

> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Keith Busch
> Sent: Tuesday, March 27, 2018 10:39 AM
> To: Linux NVMe <linux-nvme@lists.infradead.org>; Linux Block <linux-
> block@vger.kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>; Sagi Grimberg <sagi@grimberg.me>;
> Jianchao Wang <jianchao.w.wang@oracle.com>; Ming Lei
> <ming.lei@redhat.com>; Jens Axboe <axboe@kernel.dk>; Keith Busch
> <keith.busch@intel.com>; Don Brace <don.brace@microsemi.com>; qla2xxx-
> upstream@qlogic.com; linux-scsi@vger.kernel.org
> Subject: [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues
> 
> EXTERNAL EMAIL
> 
> 
> The PCI interrupt vectors intended to be associated with a queue may
> not start at 0; a driver may allocate pre_vectors for special use. This
> patch adds an offset parameter so blk-mq may find the intended affinity
> mask and updates all drivers using this API accordingly.
> 
> Cc: Don Brace <don.brace@microsemi.com>
> Cc: <qla2xxx-upstream@qlogic.com>
> Cc: <linux-scsi@vger.kernel.org>
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
> v1 -> v2:
> 
>   Update blk-mq API directly instead of chaining a default parameter to
>   a new API, and update all drivers accordingly.
> 
>  block/blk-mq-pci.c                    | 6 ++++--
>  drivers/nvme/host/pci.c               | 2 +-
>  drivers/scsi/qla2xxx/qla_os.c         | 2 +-
>  drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
>  include/linux/blk-mq-pci.h            | 3 ++-
>  5 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index b2880c7709e6..10c94011c8a8 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5348,7 +5348,7 @@ static int pqi_map_queues(struct Scsi_Host *shost)
>  {
>         struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> 
> -       return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev);
> +       return blk_mq_pci_map_queues(&shost->tag_set, ctrl_info->pci_dev, 0);
>  }
> 

Acked-by: Don Brace <don.brace@microsemi.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
  2018-03-28  7:32     ` Christoph Hellwig
@ 2018-03-28 20:38       ` Keith Busch
  -1 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2018-03-28 20:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux NVMe, Linux Block, Sagi Grimberg, Jianchao Wang, Ming Lei,
	Jens Axboe

On Wed, Mar 28, 2018 at 09:32:14AM +0200, Christoph Hellwig wrote:
> On Tue, Mar 27, 2018 at 09:39:08AM -0600, Keith Busch wrote:
> > -	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
> > +	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
> > +				     dev->num_vecs > 1);
> 
> Can you turn this into:
> 
> -	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev), 0);
> 			     dev->num_vecs > 1 ? 1 /* admin queue */ : 0);
> 
> no functional change, but much easier to understand.
> 
> Except for that the whole series looks good:

Sounds good, thanks for the reviews, Christoph and Ming.

Updated with your suggestion and applied (Jens picked up the blk-mq part;
nvme-4.17 is rebased to that).

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
  2018-03-23 22:19   ` Keith Busch
@ 2018-03-27 14:20     ` Christoph Hellwig
  -1 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2018-03-27 14:20 UTC (permalink / raw)
  To: Keith Busch
  Cc: Linux NVMe, Linux Block, Christoph Hellwig, Sagi Grimberg,
	Jianchao Wang, Ming Lei, Jens Axboe

> +static inline unsigned int nvme_ioq_vector(struct nvme_dev *dev,
> +		unsigned int qid)

No need for the inline here I think.

> +{
> +	/*
> +	 * A queue's vector matches the queue identifier unless the controller
> +	 * has only one vector available.
> +	 */
> +	return (dev->num_vecs == 1) ? 0 : qid;

and no need for the braces here.

> +	struct irq_affinity affd = {.pre_vectors = 1};

	struct irq_affinity affd = {
		.pre_vectors	= 1
	};

to make it a little more readable.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors
  2018-03-23 22:19 Keith Busch
@ 2018-03-23 22:19   ` Keith Busch
  0 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2018-03-23 22:19 UTC (permalink / raw)
  To: Linux NVMe, Linux Block
  Cc: Christoph Hellwig, Sagi Grimberg, Jianchao Wang, Ming Lei,
	Jens Axboe, Keith Busch

From: Jianchao Wang <jianchao.w.wang@oracle.com>

The admin and first IO queues shared the first irq vector, which has an
affinity mask including cpu0. If a system allows cpu0 to be offlined,
the admin queue may not be usable if no other CPUs in the affinity mask
are online. This is a problem because, unlike the IO queues, there is
only one admin queue, and it always needs to be usable.

To fix this, this patch allocates one pre_vector for the admin queue and
assigns it all CPUs, so it is always accessible. The IO queues are
assigned the remaining managed vectors.

In case a controller has only one interrupt vector available, the admin
and IO queues will share the pre_vector with all CPUs assigned.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
[changelog, code comments, merge, and blk-mq pci vector offset]
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/nvme/host/pci.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 632166f7d8f2..7b31bc01df6c 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -84,6 +84,7 @@ struct nvme_dev {
 	struct dma_pool *prp_small_pool;
 	unsigned online_queues;
 	unsigned max_qid;
+	unsigned int num_vecs;
 	int q_depth;
 	u32 db_stride;
 	void __iomem *bar;
@@ -139,6 +140,16 @@ static inline struct nvme_dev *to_nvme_dev(struct nvme_ctrl *ctrl)
 	return container_of(ctrl, struct nvme_dev, ctrl);
 }
 
+static inline unsigned int nvme_ioq_vector(struct nvme_dev *dev,
+		unsigned int qid)
+{
+	/*
+	 * A queue's vector matches the queue identifier unless the controller
+	 * has only one vector available.
+	 */
+	return (dev->num_vecs == 1) ? 0 : qid;
+}
+
 /*
  * An NVM Express queue.  Each device has at least two (one for admin
  * commands and one for I/O commands).
@@ -414,7 +425,8 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
 {
 	struct nvme_dev *dev = set->driver_data;
 
-	return blk_mq_pci_map_queues(set, to_pci_dev(dev->dev));
+	return __blk_mq_pci_map_queues(set, to_pci_dev(dev->dev),
+				       dev->num_vecs > 1);
 }
 
 /**
@@ -1455,7 +1467,7 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 		nvmeq->sq_cmds_io = dev->cmb + offset;
 	}
 
-	nvmeq->cq_vector = qid - 1;
+	nvmeq->cq_vector = nvme_ioq_vector(dev, qid);
 	result = adapter_alloc_cq(dev, qid, nvmeq);
 	if (result < 0)
 		goto release_vector;
@@ -1908,6 +1920,8 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int result, nr_io_queues;
 	unsigned long size;
+	struct irq_affinity affd = {.pre_vectors = 1};
+	int ret;
 
 	nr_io_queues = num_present_cpus();
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
@@ -1944,11 +1958,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	 * setting up the full range we need.
 	 */
 	pci_free_irq_vectors(pdev);
-	nr_io_queues = pci_alloc_irq_vectors(pdev, 1, nr_io_queues,
-			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY);
-	if (nr_io_queues <= 0)
+	ret = pci_alloc_irq_vectors_affinity(pdev, 1, (nr_io_queues + 1),
+			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+	if (ret <= 0)
 		return -EIO;
-	dev->max_qid = nr_io_queues;
+	dev->num_vecs = ret;
+	dev->max_qid = max(ret - 1, 1);
 
 	/*
 	 * Should investigate if there's a performance win from allocating
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

Thread overview: 24+ messages
2018-03-27 15:39 [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues Keith Busch
2018-03-27 15:39 ` [PATCH 2/3] nvme-pci: Remove unused queue parameter Keith Busch
2018-03-28  1:27   ` Ming Lei
2018-03-27 15:39 ` [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors Keith Busch
2018-03-28  2:08   ` Ming Lei
2018-03-28  7:32   ` Christoph Hellwig
2018-03-28 20:38     ` Keith Busch
2018-03-28  1:26 ` [PATCH 1/3] blk-mq: Allow PCI vector offset for mapping queues Ming Lei
2018-03-28  3:24 ` Jens Axboe
2018-03-28 14:48 ` Don Brace
  -- strict thread matches above, loose matches on Subject: below --
2018-03-23 22:19 Keith Busch
2018-03-23 22:19 ` [PATCH 3/3] nvme-pci: Separate IO and admin queue IRQ vectors Keith Busch
2018-03-27 14:20   ` Christoph Hellwig
