* [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
@ 2020-05-05  5:20 Sagi Grimberg
  2020-05-05  5:20 ` [PATCH 2/2] nvmet-tcp: " Sagi Grimberg
From: Sagi Grimberg @ 2020-05-05  5:20 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Anil Vasudevan, Mark Wunderlich

We can signal the stack that this is not the last page coming and the
stack can build a larger TSO segment, so go ahead and use it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c79e248b9f43..7c7c1886642f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -885,7 +885,7 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
 		if (last && !queue->data_digest)
 			flags |= MSG_EOR;
 		else
-			flags |= MSG_MORE;
+			flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
 
 		/* can't zcopy slab pages */
 		if (unlikely(PageSlab(page))) {
@@ -924,11 +924,16 @@ static int nvme_tcp_try_send_cmd_pdu(struct nvme_tcp_request *req)
 	struct nvme_tcp_queue *queue = req->queue;
 	struct nvme_tcp_cmd_pdu *pdu = req->pdu;
 	bool inline_data = nvme_tcp_has_inline_data(req);
-	int flags = MSG_DONTWAIT | (inline_data ? MSG_MORE : MSG_EOR);
 	u8 hdgst = nvme_tcp_hdgst_len(queue);
 	int len = sizeof(*pdu) + hdgst - req->offset;
+	int flags = MSG_DONTWAIT;
 	int ret;
 
+	if (inline_data)
+		flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
+	else
+		flags |= MSG_EOR;
+
 	if (queue->hdr_digest && !req->offset)
 		nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu));
 
@@ -967,7 +972,7 @@ static int nvme_tcp_try_send_data_pdu(struct nvme_tcp_request *req)
 
 	ret = kernel_sendpage(queue->sock, virt_to_page(pdu),
 			offset_in_page(pdu) + req->offset, len,
-			MSG_DONTWAIT | MSG_MORE);
+			MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST);
 	if (unlikely(ret <= 0))
 		return ret;
 
-- 
2.20.1
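[Editor's note] The flag selection this patch introduces can be sketched in isolation as follows. MSG_SENDPAGE_NOTLAST is kernel-internal (not exported to userspace), so its value is defined inline here for illustration; the function name is made up for the sketch, the logic mirrors nvme_tcp_try_send_data() after the patch:

```c
#include <stdbool.h>
#include <sys/socket.h>	/* MSG_DONTWAIT, MSG_MORE, MSG_EOR */

#ifndef MSG_SENDPAGE_NOTLAST
#define MSG_SENDPAGE_NOTLAST 0x20000000	/* kernel-internal, from include/linux/socket.h */
#endif

/* Sketch of the flag choice in nvme_tcp_try_send_data() after this
 * patch: on the last page with no data digest still to send, push the
 * data out with MSG_EOR; otherwise tell the stack more is coming so it
 * can build a larger TSO segment. */
static int try_send_data_flags(bool last, bool data_digest)
{
	int flags = MSG_DONTWAIT;

	if (last && !data_digest)
		flags |= MSG_EOR;
	else
		flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
	return flags;
}
```

Note that even on the last page, a pending data digest keeps MSG_MORE set, since the digest send follows immediately.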


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* [PATCH 2/2] nvmet-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05  5:20 [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send Sagi Grimberg
@ 2020-05-05  5:20 ` Sagi Grimberg
  2020-05-12 16:13   ` Christoph Hellwig
  2020-05-05  6:09 ` [PATCH 1/2] nvme-tcp: " Christoph Hellwig
  2020-05-12 16:12 ` Christoph Hellwig
From: Sagi Grimberg @ 2020-05-05  5:20 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Anil Vasudevan, Mark Wunderlich

We can signal the stack that this is not the last page coming and the
stack can build a larger TSO segment, so go ahead and use it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/target/tcp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index f0da04e960f4..c08aec62115e 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -510,7 +510,7 @@ static int nvmet_try_send_data_pdu(struct nvmet_tcp_cmd *cmd)
 
 	ret = kernel_sendpage(cmd->queue->sock, virt_to_page(cmd->data_pdu),
 			offset_in_page(cmd->data_pdu) + cmd->offset,
-			left, MSG_DONTWAIT | MSG_MORE);
+			left, MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST);
 	if (ret <= 0)
 		return ret;
 
@@ -538,7 +538,7 @@ static int nvmet_try_send_data(struct nvmet_tcp_cmd *cmd, bool last_in_batch)
 		if ((!last_in_batch && cmd->queue->send_list_len) ||
 		    cmd->wbytes_done + left < cmd->req.transfer_len ||
 		    queue->data_digest || !queue->nvme_sq.sqhd_disabled)
-			flags |= MSG_MORE;
+			flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
 
 		ret = kernel_sendpage(cmd->queue->sock, page, cmd->offset,
 					left, flags);
@@ -585,7 +585,7 @@ static int nvmet_try_send_response(struct nvmet_tcp_cmd *cmd,
 	int ret;
 
 	if (!last_in_batch && cmd->queue->send_list_len)
-		flags |= MSG_MORE;
+		flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
 	else
 		flags |= MSG_EOR;
 
@@ -614,7 +614,7 @@ static int nvmet_try_send_r2t(struct nvmet_tcp_cmd *cmd, bool last_in_batch)
 	int ret;
 
 	if (!last_in_batch && cmd->queue->send_list_len)
-		flags |= MSG_MORE;
+		flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
 	else
 		flags |= MSG_EOR;
 
-- 
2.20.1



* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05  5:20 [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send Sagi Grimberg
  2020-05-05  5:20 ` [PATCH 2/2] nvmet-tcp: " Sagi Grimberg
@ 2020-05-05  6:09 ` Christoph Hellwig
  2020-05-05  6:50   ` Sagi Grimberg
  2020-05-12 16:12 ` Christoph Hellwig
From: Christoph Hellwig @ 2020-05-05  6:09 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Anil Vasudevan, Mark Wunderlich, Christoph Hellwig,
	linux-nvme

On Mon, May 04, 2020 at 10:20:01PM -0700, Sagi Grimberg wrote:
> We can signal the stack that this is not the last page coming and the
> stack can build a larger TSO segment, so go ahead and use it.

Maybe you want a little helper that returns the flags based on a last
flag?  Something like:

static int nvme_tcp_msg_flags(bool last_page)
{
	if (last_page)
		return MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST;
	return MSG_DONTWAIT | MSG_EOR;
}

or do we have a case where we don't want to set EOR?  At least the
target seems to currently have such a case.


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05  6:09 ` [PATCH 1/2] nvme-tcp: " Christoph Hellwig
@ 2020-05-05  6:50   ` Sagi Grimberg
  2020-05-05 10:23     ` Christoph Hellwig
From: Sagi Grimberg @ 2020-05-05  6:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Mark Wunderlich, Anil Vasudevan, linux-nvme


>> We can signal the stack that this is not the last page coming and the
>> stack can build a larger TSO segment, so go ahead and use it.
> 
> Maybe you want a little helper that returns the flags based on a last
> flag?  Something like:
> 
> static int nvme_tcp_msg_flags(bool last_page)
> {
> 	if (last_page)
> 		return MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST;
> 	return MSG_DONTWAIT | MSG_EOR;
> }

You have it reversed, the flag here probably means more...

Let me see if it is useful to have, will let you know...

> 
> or do we have a case where we don't want to set EOR?  At least the
> target seems to currently have such a case.

As a design goal, we try to tell the stack explicitly if we have more
to send and if not we want to push it down to reduce latency. So
I think we need to have it in the target as well.
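[Editor's note] With the two branches swapped, as Sagi points out, the helper Christoph sketched upthread would read as follows. This is illustrative only (not what was merged); MSG_SENDPAGE_NOTLAST is kernel-internal, so its value is defined inline for the sketch:

```c
#include <stdbool.h>
#include <sys/socket.h>	/* MSG_DONTWAIT, MSG_MORE, MSG_EOR */

#ifndef MSG_SENDPAGE_NOTLAST
#define MSG_SENDPAGE_NOTLAST 0x20000000	/* kernel-internal, from include/linux/socket.h */
#endif

/* last_page means nothing follows this page, so push with MSG_EOR;
 * otherwise keep signaling that more data is coming so the stack can
 * coalesce it into a larger TSO segment. */
static int nvme_tcp_msg_flags(bool last_page)
{
	if (last_page)
		return MSG_DONTWAIT | MSG_EOR;
	return MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST;
}
```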


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05  6:50   ` Sagi Grimberg
@ 2020-05-05 10:23     ` Christoph Hellwig
  2020-05-05 21:53       ` Sagi Grimberg
From: Christoph Hellwig @ 2020-05-05 10:23 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Mark Wunderlich, Anil Vasudevan, Christoph Hellwig,
	linux-nvme

On Mon, May 04, 2020 at 11:50:57PM -0700, Sagi Grimberg wrote:
>
>>> We can signal the stack that this is not the last page coming and the
>>> stack can build a larger TSO segment, so go ahead and use it.
>>
>> Maybe you want a little helper that returns the flags based on a last
>> flag?  Something like:
>>
>> static int nvme_tcp_msg_flags(bool last_page)
>> {
>> 	if (last_page)
>> 		return MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST;
>> 	return MSG_DONTWAIT | MSG_EOR;
>> }
>
> You have it reversed, the flag here probably means more...
>
> Let me see if it is useful to have, will let you know...
>
>>
>> or do we have a case where we don't want to set EOR?  At least the
>> target seems to currently have such a case.
>
> As a design goal, we try to tell the stack explicitly if we have more
> to send and if not we want to push it down to reduce latency. So
> I think we need to have it in the target as well.

What I mean is that nvmet_try_send_data and nvmet_try_send_ddgst may set
neither MSG_MORE nor MSG_EOR.  Is that intentional?


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05 10:23     ` Christoph Hellwig
@ 2020-05-05 21:53       ` Sagi Grimberg
  2020-05-06  4:27         ` Christoph Hellwig
From: Sagi Grimberg @ 2020-05-05 21:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Mark Wunderlich, Anil Vasudevan, linux-nvme


>>>> We can signal the stack that this is not the last page coming and the
>>>> stack can build a larger TSO segment, so go ahead and use it.
>>>
>>> Maybe you want a little helper that returns the flags based on a last
>>> flag?  Something like:
>>>
>>> static int nvme_tcp_msg_flags(bool last_page)
>>> {
>>> 	if (last_page)
>>> 		return MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST;
>>> 	return MSG_DONTWAIT | MSG_EOR;
>>> }
>>
>> You have it reversed, the flag here probably means more...
>>
>> Let me see if it is useful to have, will let you know...
>>
>>>
>>> or do we have a case where we don't want to set EOR?  At least the
>>> target seems to currently have such a case.
>>
>> As a design goal, we try to tell the stack explicitly if we have more
>> to send and if not we want to push it down to reduce latency. So
>> I think we need to have it in the target as well.
> 
> What I mean is that nvmet_try_send_data and nvmet_try_send_ddgst may set
> neither MSG_MORE nor MSG_EOR.  Is that intentional?

nvmet_try_send_data should set MSG_EOR if it doesn't have more to send
and also nvmet_try_send_ddgst. So it's not intentional.
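[Editor's note] The missing-EOR fix being discussed would amount to something along these lines for nvmet_try_send_ddgst(). The parameter names mirror the driver's fields, but this is a sketch of the idea, not the patch that was eventually sent:

```c
#include <stdbool.h>
#include <sys/socket.h>	/* MSG_DONTWAIT, MSG_MORE, MSG_EOR */

/* Sketch: choose msg_flags for the data-digest sendmsg() depending on
 * whether further sends are queued. MSG_MORE while more PDUs are
 * pending on the queue, MSG_EOR otherwise so the stack pushes the
 * data out immediately. */
static int ddgst_msg_flags(bool last_in_batch, int send_list_len)
{
	int flags = MSG_DONTWAIT;

	if (!last_in_batch && send_list_len)
		flags |= MSG_MORE;	/* more PDUs queued on this queue */
	else
		flags |= MSG_EOR;	/* nothing pending: flush */
	return flags;
}
```

MSG_SENDPAGE_NOTLAST is deliberately absent here: as noted below in the thread, it is meaningful only for sendpage, and the digest goes out via sendmsg.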


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05 21:53       ` Sagi Grimberg
@ 2020-05-06  4:27         ` Christoph Hellwig
  2020-05-08  0:50           ` Sagi Grimberg
From: Christoph Hellwig @ 2020-05-06  4:27 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Mark Wunderlich, Anil Vasudevan, Christoph Hellwig,
	linux-nvme

On Tue, May 05, 2020 at 02:53:40PM -0700, Sagi Grimberg wrote:
>> What I mean is that nvmet_try_send_data and nvmet_try_send_ddgst may set
>> neither MSG_MORE nor MSG_EOR.  Is that intentional?
>
> nvmet_try_send_data should set MSG_EOR if it doesn't have more to send
> and also nvmet_try_send_ddgst. So it's not intentional.

Ok.  Can you send it with a little helper like I suggested (probably one
each for host and target) that ensures the right flags are set
everywhere?


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-06  4:27         ` Christoph Hellwig
@ 2020-05-08  0:50           ` Sagi Grimberg
  2020-05-08  7:35             ` Christoph Hellwig
From: Sagi Grimberg @ 2020-05-08  0:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Mark Wunderlich, Anil Vasudevan, linux-nvme


>>> What I mean is that nvmet_try_send_data and nvmet_try_send_ddgst may set
>>> neither MSG_MORE nor MSG_EOR.  Is that intentional?
>>
>> nvmet_try_send_data should set MSG_EOR if it doesn't have more to send
>> and also nvmet_try_send_ddgst. So it's not intentional.
> 
> Ok.  Can you send it with a little helper like I suggested (probably one
> each for host and target) that ensures the right flags are set
> everywhere?

I think it's actually better without the helper. MSG_SENDPAGE_NOTLAST is
designed only for sendpage and not for sendmsg, which we use for the ddgst
(although the net stack code appears to ignore it, but still), and when we
send a pdu header that has data, we don't need the condition because it's
not last for sure.

So the helpers capture ~60% of the call-sites... seems to me like it's
better off without them at the moment. WDYT?


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-08  0:50           ` Sagi Grimberg
@ 2020-05-08  7:35             ` Christoph Hellwig
  2020-05-08  7:38               ` Sagi Grimberg
From: Christoph Hellwig @ 2020-05-08  7:35 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Mark Wunderlich, Anil Vasudevan, Christoph Hellwig,
	linux-nvme

On Thu, May 07, 2020 at 05:50:38PM -0700, Sagi Grimberg wrote:
>
>>>> What I mean is that nvmet_try_send_data and nvmet_try_send_ddgst may set
>>>> neither MSG_MORE nor MSG_EOR.  Is that intentional?
>>>
>>> nvmet_try_send_data should set MSG_EOR if it doesn't have more to send
>>> and also nvmet_try_send_ddgst. So it's not intentional.
>>
>> Ok.  Can you send it with a little helper like I suggested (probably one
>> each for host and target) that ensures the right flags are set
>> everywhere?
>
> I think it's actually better without the helper. MSG_SENDPAGE_NOTLAST is
> designed only for sendpage and not for sendmsg, which we use for the ddgst
> (although the net stack code appears to ignore it, but still), and when we
> send a pdu header that has data, we don't need the condition because it's
> not last for sure.
>
> So the helpers capture ~60% of the call-sites... seems to me like it's
> better off without them at the moment. WDYT?

Ok.  Are you going to resend with the nvmet_try_send_data fix thrown in?


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-08  7:35             ` Christoph Hellwig
@ 2020-05-08  7:38               ` Sagi Grimberg
From: Sagi Grimberg @ 2020-05-08  7:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Anil Vasudevan, Mark Wunderlich, linux-nvme


>>>>> What I mean is that nvmet_try_send_data and nvmet_try_send_ddgst may set
>>>>> neither MSG_MORE nor MSG_EOR.  Is that intentional?
>>>>
>>>> nvmet_try_send_data should set MSG_EOR if it doesn't have more to send
>>>> and also nvmet_try_send_ddgst. So it's not intentional.
>>>
>>> Ok.  Can you send it with a little helper like I suggested (probably one
>>> each for host and target) that ensures the right flags are set
>>> everywhere?
>>
>> I think it's actually better without the helper. MSG_SENDPAGE_NOTLAST is
>> designed only for sendpage and not for sendmsg, which we use for the ddgst
>> (although the net stack code appears to ignore it, but still), and when we
>> send a pdu header that has data, we don't need the condition because it's
>> not last for sure.
>>
>> So the helpers capture ~60% of the call-sites... seems to me like it's
>> better off without them at the moment. WDYT?
> 
> Ok.  Are you going to resend with the nvmet_try_send_data fix thrown in?

I don't want to overload the patch itself with more logical changes; I'll
send a separate patch. These should be good to go...

Thanks.


* Re: [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05  5:20 [PATCH 1/2] nvme-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send Sagi Grimberg
  2020-05-05  5:20 ` [PATCH 2/2] nvmet-tcp: " Sagi Grimberg
  2020-05-05  6:09 ` [PATCH 1/2] nvme-tcp: " Christoph Hellwig
@ 2020-05-12 16:12 ` Christoph Hellwig
From: Christoph Hellwig @ 2020-05-12 16:12 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Anil Vasudevan, Mark Wunderlich, Christoph Hellwig,
	linux-nvme

Applied to nvme-5.8.


* Re: [PATCH 2/2] nvmet-tcp: set MSG_SENDPAGE_NOTLAST with MSG_MORE when we have more to send
  2020-05-05  5:20 ` [PATCH 2/2] nvmet-tcp: " Sagi Grimberg
@ 2020-05-12 16:13   ` Christoph Hellwig
From: Christoph Hellwig @ 2020-05-12 16:13 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Keith Busch, Anil Vasudevan, Mark Wunderlich, Christoph Hellwig,
	linux-nvme

On Mon, May 04, 2020 at 10:20:02PM -0700, Sagi Grimberg wrote:
> We can signal the stack that this is not the last page coming and the
> stack can build a larger TSO segment, so go ahead and use it.

Applied to nvme-5.8.

Can you prepare a patch to add the missing MSG_EOR flag in
nvmet_try_send_ddgst?

