* [PATCH RFC] nvme-rdma: support devices with queue size < 32
@ 2017-03-23  9:04 ` Marta Rybczynska
  0 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-03-23  9:04 UTC (permalink / raw)
  To: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	keith.busch-ral2JQCrhuEAvxtiuMwx3w, axboe-b10kYP2dOMg,
	hch-jcswGhMUV9g, sagi-NQWnxTmZq1alnMjI0IkVqw
  Cc: Samuel Jones

In the case of a small NVMe-oF queue size (<32) we may enter
a deadlock caused by the fact that IB send completions are only
signaled once every 32 posts, so the send queue fills up before
any completion is generated.

The error is seen as (using mlx5):
[ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
[ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12

The patch doesn't change the behaviour for remote devices with
larger queues.

Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
Signed-off-by: Samuel Jones <sjones@kalray.eu>
---
 drivers/nvme/host/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 779f516..8ea4cba 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1023,6 +1023,7 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
 {
        struct ib_send_wr wr, *bad_wr;
        int ret;
+       int sig_limit;
 
        sge->addr   = qe->dma;
        sge->length = sizeof(struct nvme_command),
@@ -1054,7 +1055,8 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
         * embedded in request's payload, is not freed when __ib_process_cq()
         * calls wr_cqe->done().
         */
-       if ((++queue->sig_count % 32) == 0 || flush)
+       sig_limit = min(queue->queue_size, 32);
+       if ((++queue->sig_count % sig_limit) == 0 || flush)
                wr.send_flags |= IB_SEND_SIGNALED;
 
        if (first)
-- 
1.8.3.1

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-23  9:04 ` Marta Rybczynska
@ 2017-03-23 14:00     ` Christoph Hellwig
  -1 siblings, 0 replies; 43+ messages in thread
From: Christoph Hellwig @ 2017-03-23 14:00 UTC (permalink / raw)
  To: Marta Rybczynska
  Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	keith.busch-ral2JQCrhuEAvxtiuMwx3w, axboe-b10kYP2dOMg,
	hch-jcswGhMUV9g, sagi-NQWnxTmZq1alnMjI0IkVqw, Samuel Jones

On Thu, Mar 23, 2017 at 10:04:09AM +0100, Marta Rybczynska wrote:
> In the case of a small NVMe-oF queue size (<32) we may enter
> a deadlock caused by the fact that IB send completions are only
> signaled once every 32 posts, so the send queue fills up before
> any completion is generated.
> 
> The error is seen as (using mlx5):
> [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
> [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
> 
> The patch doesn't change the behaviour for remote devices with
> larger queues.

Thanks, this looks useful.  But wouldn't it be better to do something
like queue_size divided by 2 or 4 to get a better refill latency?
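
For concreteness, that suggestion applied to the hunk above might look
like the sketch below (illustrative only; the max() lower bound is an
added guard for very small queues, not part of the proposal):

	sig_limit = max(queue->queue_size / 2, 1);
	if ((++queue->sig_count % sig_limit) == 0 || flush)
		wr.send_flags |= IB_SEND_SIGNALED;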

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-23 14:00     ` Christoph Hellwig
@ 2017-03-23 14:36         ` Marta Rybczynska
  -1 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-03-23 14:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, sagi-NQWnxTmZq1alnMjI0IkVqw, Samuel Jones

----- Mail original -----
> On Thu, Mar 23, 2017 at 10:04:09AM +0100, Marta Rybczynska wrote:
>> In the case of a small NVMe-oF queue size (<32) we may enter
>> a deadlock caused by the fact that IB send completions are only
>> signaled once every 32 posts, so the send queue fills up before
>> any completion is generated.
>> 
>> The error is seen as (using mlx5):
>> [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
>> [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
>> 
>> The patch doesn't change the behaviour for remote devices with
>> larger queues.
> 
> Thanks, this looks useful.  But wouldn't it be better to do something
> like queue_size divided by 2 or 4 to get a better refill latency?

That's an interesting question. The max number of requests is already 3 or 4 times
the queue size because of the different message types (see Sam's original
message in 'NVMe RDMA driver: CX4 send queue fills up when nvme queue depth is low').
I guess it would have an influence on configs with higher latency.

I would like to have Sagi's view on this as he's the one who changed that
part in the iSER initiator in commit 6df5a128f0fde6315a44e80b30412997147f5efd.

Marta

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-23 14:36         ` Marta Rybczynska
@ 2017-03-28 11:09             ` Sagi Grimberg
  -1 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2017-03-28 11:09 UTC (permalink / raw)
  To: Marta Rybczynska, Christoph Hellwig
  Cc: linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones


>> Thanks, this looks useful.  But wouldn't it be better to do something
>> like queue_size divided by 2 or 4 to get a better refill latency?
>
> That's an interesting question. The max number of requests is already 3 or 4 times
> the queue size because of the different message types (see Sam's original
> message in 'NVMe RDMA driver: CX4 send queue fills up when nvme queue depth is low').
> I guess it would have an influence on configs with higher latency.
>
> I would like to have Sagi's view on this as he's the one who changed that
> part in the iSER initiator in commit 6df5a128f0fde6315a44e80b30412997147f5efd.

Hi Marta,

This looks good indeed. IIRC both for IB and iWARP we need to signal
at least once every send-queue depth (in practice I remember that some
devices need more than once) so maybe it'll be good to have division by
2.

Maybe it'll be better if we do:

static inline bool queue_sig_limit(struct nvme_rdma_queue *queue)
{
	return (++queue->sig_count % (queue->queue_size / 2)) == 0;
}

And lose the hard-coded 32 entirely. Care to test that?
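
The call site in nvme_rdma_post_send() would then become something like
this (sketch only, assuming the helper above):

	if (queue_sig_limit(queue) || flush)
		wr.send_flags |= IB_SEND_SIGNALED;

(queue_size / 2 would also want a lower bound of 1 so the modulo can't
divide by zero on a pathologically small queue.)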

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-28 11:09             ` Sagi Grimberg
@ 2017-03-28 11:20                 ` Marta Rybczynska
  -1 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-03-28 11:20 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones

----- Mail original -----
>>> Thanks, this looks useful.  But wouldn't it be better to do something
>>> like queue_size divided by 2 or 4 to get a better refill latency?
>>
>> That's an interesting question. The max number of requests is already 3 or 4
>> times the queue size because of the different message types (see Sam's original
>> message in 'NVMe RDMA driver: CX4 send queue fills up when nvme queue depth is
>> low').
>> I guess it would have an influence on configs with higher latency.
>>
>> I would like to have Sagi's view on this as he's the one who changed that
>> part in the iSER initiator in commit 6df5a128f0fde6315a44e80b30412997147f5efd.
> 
> Hi Marta,
> 
> This looks good indeed. IIRC both for IB and iWARP we need to signal
> at least once every send-queue depth (in practice I remember that some
> devices need more than once) so maybe it'll be good to have division by
> 2.
> 
> Maybe it'll be better if we do:
> 
> static inline bool queue_sig_limit(struct nvme_rdma_queue *queue)
> {
>	return (++queue->sig_count % (queue->queue_size / 2)) == 0;
> }
> 
> And lose the hard-coded 32 entirely. Care to test that?

Hello Sagi,
I agree with you: we've found a setup where signalling once every queue
depth is not enough, and we're testing the division by two, which seems
to work fine so far.

In your version, in the case of queue length > 32, the notifications would
be sent less often than they are now. I'm wondering if that will have an
impact on performance and internal card buffering (it seems that
Mellanox buffers are ~100 elements). Wouldn't it create issues?

I'd like to see the magic constant removed. From what I can see we
need something not exceeding the send buffer of the card but
also not lower than the queue depth. What do you think?

Marta

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-28 11:20                 ` Marta Rybczynska
@ 2017-03-28 11:30                     ` Sagi Grimberg
  -1 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2017-03-28 11:30 UTC (permalink / raw)
  To: Marta Rybczynska
  Cc: Christoph Hellwig, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky


>> Maybe it'll be better if we do:
>>
>> static inline bool queue_sig_limit(struct nvme_rdma_queue *queue)
>> {
>> 	return (++queue->sig_count % (queue->queue_size / 2)) == 0;
>> }
>>
>> And lose the hard-coded 32 entirely. Care to test that?
>
> Hello Sagi,
> I agree with you: we've found a setup where signalling once every queue
> depth is not enough, and we're testing the division by two, which seems
> to work fine so far.
>
> In your version, in the case of queue length > 32, the notifications would
> be sent less often than they are now. I'm wondering if that will have an
> impact on performance and internal card buffering (it seems that
> Mellanox buffers are ~100 elements). Wouldn't it create issues?
>
> I'd like to see the magic constant removed. From what I can see we
> need something not exceeding the send buffer of the card but
> also not lower than the queue depth. What do you think?

I'm not sure what buffering is needed from the device at all in this
case; the device is simply expected to avoid signaling completions.

Mellanox folks, any idea where this limitation is coming from?
Do we need a device capability for it?

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-28 11:30                     ` Sagi Grimberg
@ 2017-03-29  9:36                         ` Marta Rybczynska
  -1 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-03-29  9:36 UTC (permalink / raw)
  To: Sagi Grimberg, matanb-VPRAkNaXOzVWk0Htik3J/w,
	leonro-VPRAkNaXOzVWk0Htik3J/w
  Cc: Christoph Hellwig, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy

>>> Maybe it'll be better if we do:
>>>
>>> static inline bool queue_sig_limit(struct nvme_rdma_queue *queue)
>>> {
>>> 	return (++queue->sig_count % (queue->queue_size / 2)) == 0;
>>> }
>>>
>>> And lose the hard-coded 32 entirely. Care to test that?
>>
>> Hello Sagi,
>> I agree with you: we've found a setup where signalling once every queue
>> depth is not enough, and we're testing the division by two, which seems
>> to work fine so far.
>>
>> In your version, in the case of queue length > 32, the notifications would
>> be sent less often than they are now. I'm wondering if that will have an
>> impact on performance and internal card buffering (it seems that
>> Mellanox buffers are ~100 elements). Wouldn't it create issues?
>>
>> I'd like to see the magic constant removed. From what I can see we
>> need something not exceeding the send buffer of the card but
>> also not lower than the queue depth. What do you think?
> 
> I'm not sure what buffering is needed from the device at all in this
> case; the device is simply expected to avoid signaling completions.
> 
> Mellanox folks, any idea where this limitation is coming from?
> Do we need a device capability for it?

In the case of mlx5 we're getting -ENOMEM from begin_wqe
(the condition on mlx5_wq_overflow). This queue is sized in the driver
based on multiple factors. If we ack less often, this could
happen for higher queue depths too, I think.

Marta

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-28 11:30                     ` Sagi Grimberg
@ 2017-03-29 13:29                         ` Jason Gunthorpe
  -1 siblings, 0 replies; 43+ messages in thread
From: Jason Gunthorpe @ 2017-03-29 13:29 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

On Tue, Mar 28, 2017 at 02:30:14PM +0300, Sagi Grimberg wrote:
> I'm not sure what buffering is needed from the device at all in this
> case; the device is simply expected to avoid signaling completions.
> 
> Mellanox folks, any idea where this limitation is coming from?
> Do we need a device capability for it?

Fundamentally you must drive SQ flow control via CQ completions. For
instance a ULP cannot disable all CQ notifications and keep
stuffing things into the SQ.

An alternative way to state this: A ULP cannot use activity on the
RQ to infer that there is space in the SQ. Only CQ completions can be
used to prove there is more available SQ space. Do not post to the SQ
until a CQ has been polled proving available space.

Ultimately you need a minimum of one CQ notification for every SQ
depth worth of posts, and the ULP must not post to the SQ once it fills
until it sees the CQ notification. That usually drives the rule of thumb
to notify every 1/2 depth; however, any SQ WQE posting failure indicates
a ULP bug.
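
A minimal sketch of that accounting (all names here are illustrative,
not from any existing driver): take a credit on every SQ post and only
return credits when a signaled send completion is polled:

	/* sq_credits is initialized to the SQ depth at queue setup */
	struct ulp_queue {
		struct ib_qp *qp;
		atomic_t sq_credits;
		int sig_limit;		/* posts per signaled send */
	};

	static int ulp_post_send(struct ulp_queue *q, struct ib_send_wr *wr)
	{
		struct ib_send_wr *bad_wr;

		/* no CQ-proven space -> do not touch the SQ */
		if (atomic_dec_if_positive(&q->sq_credits) < 0)
			return -EAGAIN;
		return ib_post_send(q->qp, wr, &bad_wr);
	}

	static void ulp_send_done(struct ib_cq *cq, struct ib_wc *wc)
	{
		struct ulp_queue *q = cq->cq_context;

		/* one signaled completion proves that sig_limit older
		 * posts have retired */
		atomic_add(q->sig_limit, &q->sq_credits);
	}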

There are a bunch of varied reasons for this, and it was discussed to
death for NFS. NFS's bugs and wonkiness in this area went away when
Chuck did strict accounting of SQ capacity driven by the CQ...

Jason

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 13:29                         ` Jason Gunthorpe
@ 2017-03-29 15:47                             ` Sagi Grimberg
  -1 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2017-03-29 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

Jason,

> Fundamentally you must drive SQ flow control via CQ completions. For
> instance a ULP cannot disable all CQ notifications and keep
> stuffing things into the SQ.
>
> An alternative way to state this: A ULP cannot use activity on the
> RQ to infer that there is space in the SQ. Only CQ completions can be
> used to prove there is more available SQ space. Do not post to the SQ
> until a CQ has been polled proving available space.

The recv queue is out of the ball game here...

We just selectively signal send completions only to reduce some
interrupts and completion processing.

> Ultimately you need a minimum of one CQ notification for every SQ
> depth worth of posts, and the ULP must not post to the SQ once it fills
> until it sees the CQ notification. That usually drives the rule of thumb
> to notify every 1/2 depth; however, any SQ WQE posting failure indicates
> a ULP bug.

Well, usually, but I'm not convinced this is the case.

For each I/O we post up to 2 work requests, 1 for memory registration
and 1 for sending an I/O request (and 1 for local invalidation if the
target doesn't do it for us, but that is not the case here). So if our
queue depth is X, we size our completion queue to be X*3, and we need
to make sure we signal every (X*3)/2.

What I think Marta is seeing is that no matter how large we size our
send-queue (in other words, no matter how high X is) we're not able
to get away with suppressing send completion signaling for more than
~100 posts.

Please correct me if I'm wrong, Marta?

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 15:47                             ` Sagi Grimberg
@ 2017-03-29 16:27                                 ` Jason Gunthorpe
  -1 siblings, 0 replies; 43+ messages in thread
From: Jason Gunthorpe @ 2017-03-29 16:27 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

On Wed, Mar 29, 2017 at 06:47:54PM +0300, Sagi Grimberg wrote:

> For each I/O we post up to 2 work requests, 1 for memory registration
> and 1 for sending an I/O request (and 1 for local invalidation if the
> target doesn't do it for us, but that is not the case here). So if our
> queue depth is X, we size our completion queue to be X*3, and we need
> to make sure we signal every (X*3)/2.

??? If your SQ is X and your CQ is X*3 you need to signal at X/2.

Jason

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 16:27                                 ` Jason Gunthorpe
@ 2017-03-29 16:39                                     ` Sagi Grimberg
  -1 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2017-03-29 16:39 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky


>> For each I/O we post up to 2 work requests, 1 for memory registration
>> and 1 for sending an I/O request (and 1 for local invalidation if the
>> target doesn't do it for us, but that is not the case here). So if our
>> queue depth is X, we size our completion queue to be X*3, and we need
>> to make sure we signal every (X*3)/2.
>
> ??? If your SQ is X and your CQ is X*3 you need to signal at X/2.

Sorry, I confused SQ with CQ (which made it even more confusing..)

Our application queue-depth is X, we size our SQ to be X*3
(send+reg+inv), we size our RQ to be X (resp) and our CQ to be
X*4 (SQ+RQ).

So we should signal every (X*3)/2

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 16:39                                     ` Sagi Grimberg
@ 2017-03-29 16:44                                         ` Doug Ledford
  -1 siblings, 0 replies; 43+ messages in thread
From: Doug Ledford @ 2017-03-29 16:44 UTC (permalink / raw)
  To: Sagi Grimberg, Jason Gunthorpe
  Cc: Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

On 3/29/17 11:39 AM, Sagi Grimberg wrote:
>
>>> For each I/O we post up to 2 work requests, 1 for memory registration
>>> and 1 for sending an I/O request (and 1 for local invalidation if the
>>> target doesn't do it for us, but that is not the case here). So if our
>>> queue depth is X, we size our completion queue to be X*3, and we need
>>> to make sure we signal every (X*3)/2.
>>
>> ??? If your SQ is X and your CQ is X*3 you need to signal at X/2.
>
> Sorry, I confused SQ with CQ (which made it even more confusing..)
>
> Our application queue-depth is X, we size our SQ to be X*3
> (send+reg+inv), we size our RQ to be X (resp) and our CQ to be
> X*4 (SQ+RQ).
>
> So we should signal every (X*3)/2

You say above "we post *up to* 2 work requests"; unless you wish to
change that to "we always post at least 2 work requests per queue
entry", Jason is right: your frequency of signaling needs to be X/2
regardless of your CQ size, because you need the signaling to control
the queue depth tracking.


* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 16:44                                         ` Doug Ledford
@ 2017-03-29 16:47                                             ` Doug Ledford
  -1 siblings, 0 replies; 43+ messages in thread
From: Doug Ledford @ 2017-03-29 16:47 UTC (permalink / raw)
  To: Sagi Grimberg, Jason Gunthorpe
  Cc: Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

On 3/29/17 11:44 AM, Doug Ledford wrote:
> On 3/29/17 11:39 AM, Sagi Grimberg wrote:
>>
>>>> For each I/O we post up to 2 work requests, 1 for memory registration
>>>> and 1 for sending an I/O request (and 1 for local invalidation if the
>>>> target doesn't do it for us, but that is not the case here). So if our
>>>> queue depth is X, we size our completion queue to be X*3, and we need
>>>> to make sure we signal every (X*3)/2.
>>>
>>> ??? If your SQ is X and your CQ is X*3 you need to signal at X/2.
>>
>> Sorry, I confused SQ with CQ (which made it even more confusing..)
>>
>> Our application queue-depth is X, we size our SQ to be X*3
>> (send+reg+inv), we size our RQ to be X (resp) and our CQ to be
>> X*4 (SQ+RQ).
>>
>> So we should signal every (X*3)/2
>
> You say above "we post *up to* 2 work requests"; unless you wish to
> change that to "we always post at least 2 work requests per queue
> entry", Jason is right: your frequency of signaling needs to be X/2
> regardless of your CQ size, because you need the signaling to control
> the queue depth tracking.

If you would like to spread things out further between signaling, then
you can modify your send routine to only increment the send counter for
actual send requests, ignoring registration WQEs and invalidate WQEs,
and then signal every X/2 sends.
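
In nvme_rdma_post_send() terms that could look roughly like this
(sketch only; is_cmd_send is a hypothetical flag telling the command
send apart from registration/invalidation work requests):

	/* count only command sends; reg/inv WRs stay unsignaled */
	if (is_cmd_send &&
	    (++queue->sig_count % (queue->queue_size / 2)) == 0)
		wr.send_flags |= IB_SEND_SIGNALED;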


* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 16:47                                             ` Doug Ledford
@ 2017-03-29 16:59                                                 ` Sagi Grimberg
  -1 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2017-03-29 16:59 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky


>> You say above "we post *up to* 2 work requests"; unless you wish to
>> change that to "we always post at least 2 work requests per queue
>> entry", Jason is right: your frequency of signaling needs to be X/2
>> regardless of your CQ size, because you need the signaling to control
>> the queue depth tracking.
>
> If you would like to spread things out further between signaling, then
> you can modify your send routine to only increment the send counter for
> actual send requests, ignoring registration WQEs and invalidate WQEs,
> and then signal every X/2 sends.

Yeah, you're right: not only did I get it wrong, I even contradicted my
own suggestion, which was exactly what you and Jason suggested (where is
the nearest rat-hole...)

So I suggested signaling every X/2, and Marta reported SQ overflows for
high queue depth. Marta, at what queue depth have you seen this?

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 16:59                                                 ` Sagi Grimberg
@ 2017-03-29 22:19                                                     ` Jason Gunthorpe
  -1 siblings, 0 replies; 43+ messages in thread
From: Jason Gunthorpe @ 2017-03-29 22:19 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Doug Ledford, Marta Rybczynska, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

On Wed, Mar 29, 2017 at 07:59:13PM +0300, Sagi Grimberg wrote:
> Yeah, you're right: not only did I get it wrong, I even contradicted my
> own suggestion, which was exactly what you and Jason suggested (where is
> the nearest rat-hole...)
> 
> So I suggested signaling every X/2, and Marta reported SQ overflows for
> high queue depth. Marta, at what queue depth have you seen this?

I just want to clarify my comment about RQ - because you are talking
about a queue depth concept.

It is tempting to rely on a queue depth of X == SQ depth of X (or 2x
in this case) and then try to re-use the flow control on the queue to
protect the SQ from overflow.

Generally this does not work because the main queue can retire work
from either the CQ or the RQ. If the retire workload is mainly RQ
based then you can submit to the SQ without a CQ poll and overflow
it.

Generally I expect to see direct measurement of SQ capacity and plug
the main queue when the SQ goes full.

To check this, use a simple assertion: decrement a counter on every
post, increment it appropriately on every polled CQ completion, initialize
it to the SQ depth, and assert the counter never goes negative.
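
Something along these lines (untested, names illustrative):

	/* at queue setup: atomic_set(&q->sq_space, sq_depth); */

	/* on every post to the SQ */
	WARN_ON_ONCE(atomic_dec_return(&q->sq_space) < 0);

	/* on every polled send completion, for the posts it retires */
	atomic_add(retired, &q->sq_space);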

Triggering the assertion is an unconditional ULP bug, and is the most
likely cause of a "no space" posting failure.

Jason

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-29 16:59                                                 ` Sagi Grimberg
@ 2017-03-30 14:23                                                     ` Marta Rybczynska
  -1 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-03-30 14:23 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Doug Ledford, Jason Gunthorpe, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

>>> You say above "we post *up to* 2 work requests"; unless you wish to
>>> change that to "we always post at least 2 work requests per queue
>>> entry", Jason is right: your frequency of signaling needs to be X/2
>>> regardless of your CQ size, because you need the signaling to control
>>> the queue depth tracking.
>>
>> If you would like to spread things out further between signaling, then
>> you can modify your send routine to only increment the send counter for
>> actual send requests, ignoring registration WQEs and invalidate WQEs,
>> and then signal every X/2 sends.
> 
> Yeah, you're right: not only did I get it wrong, I even contradicted my
> own suggestion, which was exactly what you and Jason suggested (where is
> the nearest rat-hole...)
> 
> So I suggested signaling every X/2, and Marta reported SQ overflows for
> high queue depth. Marta, at what queue depth have you seen this?

The remote side had a queue depth of 16 or 32, and it's the WQ on the
initiator side that overflows (mlx5_wq_overflow). We're testing with
signalling every X/2 and it seems to work.

Marta

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-03-30 14:23                                                     ` Marta Rybczynska
@ 2017-04-06 12:29                                                         ` Marta Rybczynska
  -1 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-04-06 12:29 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Doug Ledford, Jason Gunthorpe, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, keith busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy, Leon Romanovsky

>>>> You say above "we post *up to* 2 work requests", unless you wish to
>>>> change that to "we always post at least 2 work requests per queue
>>>> entry", Jason is right, your frequency of signaling needs to be X/2
>>>> regardless of your CQ size, you need the signaling to control the queue
>>>> depth tracking.
>>>
>>> If you would like to spread things out farther between signaling, then
>>> you can modify your send routine to only increment the send counter for
>>> actual send requests, ignoring registration WQEs and invalidate WQES,
>>> and then signal every X/2 sends.
>> 
>> Yea, you're right, and not only I got it wrong, I even contradicted my
>> own suggestion that was exactly what you and Jason suggested (where is
>> the nearest rat-hole...)
>> 
>> So I suggested signalling every X/2 and Marta reported SQ overflows for
>> high queue-depth. Marta, at what queue-depth have you seen this?
> 
> The remote side had a queue depth of 16 or 32, and it is the WQ on the
> initiator side that overflows (mlx5_wq_overflow). We're testing with
> signalling X/2 and it seems to work.

Update on the situation: the signalling on X/2 seems to work fine in
practice. To clarify further: it is the send queue that overflows
(mlx5_wq_overflow in begin_wqe of drivers/infiniband/hw/mlx5/qp.c).

However, I still have doubts about how this will work with higher
queue depths (i.e. the typical case). If we signal every X/2 we'll do
it much less often than today (every 32 messages). I'm not sure what
effect this would have on the system.
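
To put rough numbers on that (plain arithmetic, assuming the X/2 rule
with sig_limit = max(queue_size / 2, 1) as sketched earlier in the
thread):

  queue_size   16 -> signal every   8 sends (more often than the fixed 32)
  queue_size   64 -> signal every  32 sends (the same rate as today)
  queue_size 1024 -> signal every 512 sends (16x fewer signals than today)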

Mellanox guys, do you have an idea what it might do?

Marta

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-04-06 12:29                                                         ` Marta Rybczynska
@ 2017-04-06 13:02                                                             ` Leon Romanovsky
  -1 siblings, 0 replies; 43+ messages in thread
From: Leon Romanovsky @ 2017-04-06 13:02 UTC (permalink / raw)
  To: Marta Rybczynska
  Cc: Sagi Grimberg, Doug Ledford, Jason Gunthorpe, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Keith Busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy

On Thu, Apr 06, 2017 at 02:29:03PM +0200, Marta Rybczynska wrote:
> >>>> You say above "we post *up to* 2 work requests", unless you wish to
> >>>> change that to "we always post at least 2 work requests per queue
> >>>> entry", Jason is right, your frequency of signaling needs to be X/2
> >>>> regardless of your CQ size, you need the signaling to control the queue
> >>>> depth tracking.
> >>>
> >>> If you would like to spread things out farther between signaling, then
> >>> you can modify your send routine to only increment the send counter for
> >>> actual send requests, ignoring registration WQEs and invalidate WQEs,
> >>> and then signal every X/2 sends.
> >>
> >> Yea, you're right, and not only I got it wrong, I even contradicted my
> >> own suggestion that was exactly what you and Jason suggested (where is
> >> the nearest rat-hole...)
> >>
> >> So I suggested signalling every X/2 and Marta reported SQ overflows for
> >> high queue-depth. Marta, at what queue-depth have you seen this?
> >
> > The remote side had a queue depth of 16 or 32, and it is the WQ on the
> > initiator side that overflows (mlx5_wq_overflow). We're testing with
> > signalling X/2 and it seems to work.
>
> Update on the situation: the signalling on X/2 seems to work fine in
> practice. To clarify further: it is the send queue that overflows
> (mlx5_wq_overflow in begin_wqe of drivers/infiniband/hw/mlx5/qp.c).
>
> However, I still have doubts about how this will work with higher
> queue depths (i.e. the typical case). If we signal every X/2 we'll do
> it much less often than today (every 32 messages). I'm not sure what
> effect this would have on the system.
>
> Mellanox guys, do you have an idea what it might do?

It will continue to work as expected with large queue depths too.
All you need to do is not forget to issue a signal when the queue is
terminated.
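
One way to guarantee that final signal (a sketch of one option, not
something specified in this thread): have queue teardown go through
ib_drain_qp(), which posts its own signalled work request and waits
for its completion, flushing whatever unsignalled sends are still
outstanding:

/*
 * Sketch: draining the QP at teardown moves it to the error state,
 * posts a final signalled WR and blocks until its completion comes
 * back.  The wrapper name is illustrative.
 */
static void example_stop_queue(struct ib_qp *qp)
{
	ib_drain_qp(qp);
}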

Thanks

>
> Marta


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-04-06 13:02                                                             ` Leon Romanovsky
@ 2017-04-07 13:31                                                                 ` Marta Rybczynska
  -1 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-04-07 13:31 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Sagi Grimberg, Doug Ledford, Jason Gunthorpe, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Keith Busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy

> On Thu, Apr 06, 2017 at 02:29:03PM +0200, Marta Rybczynska wrote:
>> >>>> You say above "we post *up to* 2 work requests", unless you wish to
>> >>>> change that to "we always post at least 2 work requests per queue
>> >>>> entry", Jason is right, your frequency of signaling needs to be X/2
>> >>>> regardless of your CQ size, you need the signaling to control the queue
>> >>>> depth tracking.
>> >>>
>> >>> If you would like to spread things out farther between signaling, then
>> >>> you can modify your send routine to only increment the send counter for
>> >>> actual send requests, ignoring registration WQEs and invalidate WQEs,
>> >>> and then signal every X/2 sends.
>> >>
>> >> Yea, you're right, and not only I got it wrong, I even contradicted my
>> >> own suggestion that was exactly what you and Jason suggested (where is
>> >> the nearest rat-hole...)
>> >>
>> >> So I suggested signalling every X/2 and Marta reported SQ overflows for
>> >> high queue-depth. Marta, at what queue-depth have you seen this?
>> >
>> > The remote side had a queue depth of 16 or 32, and it is the WQ on the
>> > initiator side that overflows (mlx5_wq_overflow). We're testing with
>> > signalling X/2 and it seems to work.
>>
>> Update on the situation: the signalling on X/2 seems to work fine in
>> practice. To clarify further: it is the send queue that overflows
>> (mlx5_wq_overflow in begin_wqe of drivers/infiniband/hw/mlx5/qp.c).
>>
>> However, I still have doubts about how this will work with higher
>> queue depths (i.e. the typical case). If we signal every X/2 we'll do
>> it much less often than today (every 32 messages). I'm not sure what
>> effect this would have on the system.
>>
>> Mellanox guys, do you have an idea what it might do?
> 
> It will continue to work as expected with large queue depths too.
> All you need to do is not forget to issue a signal when the queue is
> terminated.
> 

Thanks Leon. I will then submit the v2.

Marta

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-04-07 13:31                                                                 ` Marta Rybczynska
@ 2017-04-09 12:31                                                                     ` Sagi Grimberg
  -1 siblings, 0 replies; 43+ messages in thread
From: Sagi Grimberg @ 2017-04-09 12:31 UTC (permalink / raw)
  To: Marta Rybczynska, Leon Romanovsky
  Cc: Doug Ledford, Jason Gunthorpe, Christoph Hellwig,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Keith Busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy


>>> Mellanox guys, do you have an idea what it might do?
>>
>> It will continue to work as expected with large queue depths too.
>> All you need to do is not forget to issue a signal when the queue is
>> terminated.
>>
>
> Thanks Leon. I will then submit the v2.

Marta, can you please test with higher queue-depth? Say 512 and/or 1024?

Thanks,
Sagi.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC] nvme-rdma: support devices with queue size < 32
  2017-04-09 12:31                                                                     ` Sagi Grimberg
@ 2017-04-10 11:29                                                                         ` Marta Rybczynska
  -1 siblings, 0 replies; 43+ messages in thread
From: Marta Rybczynska @ 2017-04-10 11:29 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	Christoph Hellwig, linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Keith Busch,
	axboe-b10kYP2dOMg, Samuel Jones, Max Gurtovoy

>>>> Mellanox guys, do you have an idea what it might do?
>>>
>>> It will continue to work as expected with large queue depths too.
>>> All you need to do is not forget to issue a signal when the queue is
>>> terminated.
>>>
>>
>> Thanks Leon. I will then submit the v2.
> 
> Marta, can you please test with higher queue-depth? Say 512 and/or 1024?
> 

Yes, we'll do that.

Marta

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread

Thread overview: 43+ messages
2017-03-23  9:04 [PATCH RFC] nvme-rdma: support devices with queue size < 32 Marta Rybczynska
2017-03-23  9:24 ` Marta Rybczynska
2017-03-23 14:00   ` Christoph Hellwig
2017-03-23 14:36       ` Marta Rybczynska
2017-03-28 11:09           ` Sagi Grimberg
2017-03-28 11:20               ` Marta Rybczynska
2017-03-28 11:30                   ` Sagi Grimberg
2017-03-29  9:36                       ` Marta Rybczynska
2017-03-29 13:29                       ` Jason Gunthorpe
2017-03-29 15:47                           ` Sagi Grimberg
2017-03-29 16:27                               ` Jason Gunthorpe
2017-03-29 16:39                                   ` Sagi Grimberg
2017-03-29 16:44                                       ` Doug Ledford
2017-03-29 16:47                                           ` Doug Ledford
2017-03-29 16:59                                               ` Sagi Grimberg
2017-03-29 22:19                                                   ` Jason Gunthorpe
2017-03-30 14:23                                                   ` Marta Rybczynska
2017-04-06 12:29                                                       ` Marta Rybczynska
2017-04-06 13:02                                                           ` Leon Romanovsky
2017-04-07 13:31                                                               ` Marta Rybczynska
2017-04-09 12:31                                                                   ` Sagi Grimberg
2017-04-10 11:29                                                                       ` Marta Rybczynska
