* [PATCH] rdma: force queue size to respect controller capability
@ 2016-10-25  7:22 Samuel Jones
  2016-10-25 12:39 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Samuel Jones @ 2016-10-25  7:22 UTC (permalink / raw)


Queue size needs to respect the Maximum Queue Entries Supported advertised by
the controller in its Capability register.

Signed-off-by: Samuel Jones <sjones at kalray.eu>
---
 drivers/nvme/host/rdma.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 5a83881..98d9c09 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1908,6 +1908,14 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 		opts->queue_size = ctrl->ctrl.maxcmd;
 	}
 
+	if (opts->queue_size > ctrl->ctrl.sqsize) {
+		/* warn if sqsize is lower than queue_size */
+		dev_warn(ctrl->ctrl.device,
+			"queue_size %zu > ctrl sqsize %u, clamping down\n",
+			opts->queue_size, ctrl->ctrl.sqsize);
+		opts->queue_size = ctrl->ctrl.sqsize;
+	}
+
 	if (opts->nr_io_queues) {
 		ret = nvme_rdma_create_io_queues(ctrl);
 		if (ret)
-- 
1.8.3.1


* [PATCH] rdma: force queue size to respect controller capability
  2016-10-25  7:22 [PATCH] rdma: force queue size to respect controller capability Samuel Jones
@ 2016-10-25 12:39 ` Christoph Hellwig
  2016-10-25 15:36 ` Sagi Grimberg
  2016-10-25 16:35 ` Daniel Verkamp
  2 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2016-10-25 12:39 UTC (permalink / raw)


On Tue, Oct 25, 2016 at 09:22:34AM +0200, Samuel Jones wrote:
> Queue size needs to respect the Maximum Queue Entries Supported advertised by
> the controller in its Capability register.
> 
> Signed-off-by: Samuel Jones <sjones at kalray.eu>

Thanks Samuel,

this looks good:

Reviewed-by: Christoph Hellwig <hch at lst.de>


* [PATCH] rdma: force queue size to respect controller capability
  2016-10-25  7:22 [PATCH] rdma: force queue size to respect controller capability Samuel Jones
  2016-10-25 12:39 ` Christoph Hellwig
@ 2016-10-25 15:36 ` Sagi Grimberg
  2016-10-25 16:35 ` Daniel Verkamp
  2 siblings, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2016-10-25 15:36 UTC (permalink / raw)


Thanks Samuel,

queued to nvmf-4.9-rc


* [PATCH] rdma: force queue size to respect controller capability
  2016-10-25  7:22 [PATCH] rdma: force queue size to respect controller capability Samuel Jones
  2016-10-25 12:39 ` Christoph Hellwig
  2016-10-25 15:36 ` Sagi Grimberg
@ 2016-10-25 16:35 ` Daniel Verkamp
  2016-10-26  8:36   ` Sagi Grimberg
  2 siblings, 1 reply; 7+ messages in thread
From: Daniel Verkamp @ 2016-10-25 16:35 UTC (permalink / raw)


On 10/25/2016 12:22 AM, Samuel Jones wrote:
> Queue size needs to respect the Maximum Queue Entries Supported advertised by
> the controller in its Capability register.
>
> Signed-off-by: Samuel Jones <sjones at kalray.eu>
> ---
>  drivers/nvme/host/rdma.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 5a83881..98d9c09 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1908,6 +1908,14 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
>  		opts->queue_size = ctrl->ctrl.maxcmd;
>  	}
>
> +	if (opts->queue_size > ctrl->ctrl.sqsize) {
> +		/* warn if sqsize is lower than queue_size */
> +		dev_warn(ctrl->ctrl.device,
> +			"queue_size %zu > ctrl sqsize %u, clamping down\n",
> +			opts->queue_size, ctrl->ctrl.sqsize);
> +		opts->queue_size = ctrl->ctrl.sqsize;
> +	}

I believe there's an off-by-one (or two) lurking here.

It looks to me like ctrlr->ctrlr.sqsize is intended to be 0's-based:
- sqsize is passed as HSQSIZE in the Connect private data, which is 
0's-based
- sqsize is set to opts->queue_size - 1 earlier in nvme_rdma_create_ctrlr()

However, this new change assigns opts->queue_size (non-0's-based value) 
the current value of ctrlr->ctrlr.sqsize (0's-based value) without any 
adjustment.

This is probably because there is another bug in 
nvme_rdma_configure_admin_queue(), where sqsize is set to a value based 
on CAP.MQES + 1.  MQES is a 0's-based value, so if sqsize is also 
supposed to be 0's based, no addition should be necessary.

I think it is currently working only in the case where the original 
sqsize is already smaller than MQES, because then the min of MQES and 
sqsize will pick sqsize, which is already 0's based from the initial 
queue_size - 1 assignment.

If I'm not missing anything, the correct fix should be removing the + 1 
adjustment of MQES as well as changing this patch to consider the 0's 
based nature of sqsize:

	if (opts->queue_size > ctrlr->ctrlr.sqsize + 1) {
		...
		opts->queue_size = ctrlr->ctrlr.sqsize + 1;
	}
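To make the arithmetic concrete, here is a standalone sketch of that bookkeeping (a hypothetical helper, not the driver code): queue_size counts entries, while MQES and sqsize are 0's-based, so the conversion happens only at the boundaries and the comparison itself needs no + 1:

```c
#include <assert.h>

/*
 * Standalone sketch, not the driver code. queue_size counts entries
 * (1-based); mqes and sqsize are 0's-based, i.e. a value of N means
 * N + 1 entries.
 */
static unsigned int clamp_queue_size(unsigned int queue_size,
				     unsigned int mqes)
{
	unsigned int sqsize = queue_size - 1;	/* convert to 0's-based */

	if (sqsize > mqes)			/* both 0's-based: no + 1 */
		sqsize = mqes;

	return sqsize + 1;			/* back to an entry count */
}
```

e.g. with queue_size = 128 and MQES = 3fh this clamps to 64 entries, while a queue_size of 32 passes through unchanged.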

Thanks,
-- Daniel


* [PATCH] rdma: force queue size to respect controller capability
  2016-10-25 16:35 ` Daniel Verkamp
@ 2016-10-26  8:36   ` Sagi Grimberg
  2017-04-05 17:48     ` Trapp, Darren
  0 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2016-10-26  8:36 UTC (permalink / raw)


> I believe there's an off-by-one (or two) lurking here.
>
> It looks to me like ctrlr->ctrlr.sqsize is intended to be 0's-based:
> - sqsize is passed as HSQSIZE in the Connect private data, which is
> 0's-based
> - sqsize is set to opts->queue_size - 1 earlier in nvme_rdma_create_ctrlr()
>
> However, this new change assigns opts->queue_size (non-0's-based value)
> the current value of ctrlr->ctrlr.sqsize (0's-based value) without any
> adjustment.
>
> This is probably because there is another bug in
> nvme_rdma_configure_admin_queue(), where sqsize is set to a value based
> on CAP.MQES + 1.  MQES is a 0's-based value, so if sqsize is also
> supposed to be 0's based, no addition should be necessary.
>
> I think it is currently working only in the case where the original
> sqsize is already smaller than MQES, because then the min of MQES and
> sqsize will pick sqsize, which is already 0's based from the initial
> queue_size - 1 assignment.
>
> If I'm not missing anything, the correct fix should be removing the + 1
> adjustment of MQES as well as changing this patch to consider the 0's
> based nature of sqsize:
>
>     if (opts->queue_size > ctrlr->ctrlr.sqsize + 1) {
>         ...
>         opts->queue_size = ctrlr->ctrlr.sqsize + 1;
>     }

You're correct, Daniel. I'll fix that in the patch itself.

Thanks!


* [PATCH] rdma: force queue size to respect controller capability
  2016-10-26  8:36   ` Sagi Grimberg
@ 2017-04-05 17:48     ` Trapp, Darren
  2017-04-06  6:15       ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread
From: Trapp, Darren @ 2017-04-05 17:48 UTC (permalink / raw)


Sagi,

Was the +1 issue resolved differently?  I'm looking at the current code and I still see MQES + 1 being used.

I'm running into an issue where my target reports an MQES of 3fh.

The original ctrl->ctrl.sqsize comes from nvme_rdma_create_ctrl(), which sets a 0's-based size.
       ctrl->ctrl.sqsize = opts->queue_size - 1;

When the host does the
        ctrl->ctrl.sqsize =
                min_t(int, NVME_CAP_MQES(ctrl->cap) + 1, ctrl->ctrl.sqsize);

 the comparison is min_t(int, 40h, 7fh), which yields 40h.

So sqsize is now set as a 1-based value.  When the IO queue is created, the target rejects it since it is larger than the max size.
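For concreteness, the two computations can be reproduced in a standalone sketch (min_t reduced to plain C, values taken from the report above; not the driver code itself):

```c
#include <assert.h>

#define min_t(type, x, y) ((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/* Current (buggy) host computation: the + 1 turns the 0's-based MQES
 * into an entry count, so the result can exceed what the target allows. */
static int sqsize_current(int queue_size, int mqes)
{
	int sqsize = queue_size - 1;		/* 0x7f for queue_size 128 */

	return min_t(int, mqes + 1, sqsize);	/* min(0x40, 0x7f) = 0x40 */
}

/* Without the + 1, both operands stay 0's-based. */
static int sqsize_fixed(int queue_size, int mqes)
{
	int sqsize = queue_size - 1;

	return min_t(int, mqes, sqsize);	/* min(0x3f, 0x7f) = 0x3f */
}
```

With MQES = 3fh and queue_size = 128, the current code yields 40h (65 entries in 0's-based terms, one more than the target supports), while dropping the + 1 yields 3fh.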



Darren Trapp
Senior Director - Engineering
darren.trapp at cavium.com
T 949-389-6273
M 949-212-8576
26650 Aliso Viejo Pkwy, Aliso Viejo, CA 92656
www.qlogic.com

On 10/26/16, 1:36 AM, "Linux-nvme on behalf of Sagi Grimberg" <linux-nvme-bounces@lists.infradead.org on behalf of sagi@grimberg.me> wrote:

    > I believe there's an off-by-one (or two) lurking here.
    >
    > It looks to me like ctrlr->ctrlr.sqsize is intended to be 0's-based:
    > - sqsize is passed as HSQSIZE in the Connect private data, which is
    > 0's-based
    > - sqsize is set to opts->queue_size - 1 earlier in nvme_rdma_create_ctrlr()
    >
    > However, this new change assigns opts->queue_size (non-0's-based value)
    > the current value of ctrlr->ctrlr.sqsize (0's-based value) without any
    > adjustment.
    >
    > This is probably because there is another bug in
    > nvme_rdma_configure_admin_queue(), where sqsize is set to a value based
    > on CAP.MQES + 1.  MQES is a 0's-based value, so if sqsize is also
    > supposed to be 0's based, no addition should be necessary.
    >
    > I think it is currently working only in the case where the original
    > sqsize is already smaller than MQES, because then the min of MQES and
    > sqsize will pick sqsize, which is already 0's based from the initial
    > queue_size - 1 assignment.
    >
    > If I'm not missing anything, the correct fix should be removing the + 1
    > adjustment of MQES as well as changing this patch to consider the 0's
    > based nature of sqsize:
    >
    >     if (opts->queue_size > ctrlr->ctrlr.sqsize + 1) {
    >         ...
    >         opts->queue_size = ctrlr->ctrlr.sqsize + 1;
    >     }
    
    You're correct, Daniel. I'll fix that in the patch itself.
    
    Thanks!
    


* [PATCH] rdma: force queue size to respect controller capability
  2017-04-05 17:48     ` Trapp, Darren
@ 2017-04-06  6:15       ` Sagi Grimberg
  0 siblings, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2017-04-06  6:15 UTC (permalink / raw)


> Sagi,
>
> Was the +1 issue resolved differently?  I'm looking at the current code and I still see MQES + 1 being used.
>
> I'm running into an issue where my target reports an MQES of 3fh.
>
> The original ctrl->ctrl.sqsize comes from the create ctrl which sets a 0 based size.
>        ctrl->ctrl.sqsize = opts->queue_size - 1;
>
> When host does the
>         ctrl->ctrl.sqsize =
>                 min_t(int, NVME_CAP_MQES(ctrl->cap) + 1, ctrl->ctrl.sqsize);
>
>  the comparison is  min_t(40h, 7fh)
>
> So sqsize is now set as a 1 based value.  When the IO q is created, the target rejects it since it is larger than the max size.

Hey Darren,

You're right, we are wrongly taking MQES + 1. Not sure how this slipped.

I'll send a patch, thanks for reporting!

