linux-nvme.lists.infradead.org archive mirror
* [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
@ 2022-10-24 15:03 Daniel Wagner
  2022-10-25  0:54 ` Chaitanya Kulkarni
  2022-10-25  6:37 ` Hannes Reinecke
  0 siblings, 2 replies; 9+ messages in thread
From: Daniel Wagner @ 2022-10-24 15:03 UTC (permalink / raw)
  To: linux-nvme; +Cc: Sagi Grimberg, Keith Busch, Daniel Wagner, Hannes Reinecke

Add a send quota in nvme_tcp_send_all() to avoid stalls when sending
a large number of requests.

Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---

IMO, this patch might still be a good idea to add. At least in my test
setup where I only have one ethernet port it makes a big difference
when accessing the system via ssh. When nvme-tcp is pushing a lot of
data via the network, the ssh session is completely blocked by the
storage traffic. With it, the ssh session stays responsive.

I suspect a proper storage setup would use more than one ethernet
port.

Daniel

v1:
  https://lore.kernel.org/linux-nvme/20220519062617.39715-4-hare@suse.de/

 drivers/nvme/host/tcp.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 3e7b29d07c71..84a66ca208c8 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -308,12 +308,23 @@ static inline void nvme_tcp_advance_req(struct nvme_tcp_request *req,
 
 static inline void nvme_tcp_send_all(struct nvme_tcp_queue *queue)
 {
-	int ret;
+	unsigned long deadline = jiffies + msecs_to_jiffies(1);
 
 	/* drain the send queue as much as we can... */
 	do {
-		ret = nvme_tcp_try_send(queue);
-	} while (ret > 0);
+		bool pending = false;
+		int result;
+
+		result = nvme_tcp_try_send(queue);
+		if (result > 0)
+			pending = true;
+		else if (unlikely(result < 0))
+			return;
+
+		if (!pending)
+			return;
+
+	} while (!time_after(jiffies, deadline)); /* quota is exhausted */
 }
 
 static inline bool nvme_tcp_queue_more(struct nvme_tcp_queue *queue)
-- 
2.38.0




* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-24 15:03 [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all() Daniel Wagner
@ 2022-10-25  0:54 ` Chaitanya Kulkarni
  2022-10-25  5:52   ` Daniel Wagner
  2022-10-25  7:09   ` Hannes Reinecke
  2022-10-25  6:37 ` Hannes Reinecke
  1 sibling, 2 replies; 9+ messages in thread
From: Chaitanya Kulkarni @ 2022-10-25  0:54 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Keith Busch, Hannes Reinecke

On 10/24/22 08:03, Daniel Wagner wrote:
> Add a send quota in nvme_tcp_send_all() to avoid stalls when sending
> a large number of requests.
> 
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> 
> IMO, this patch might still be a good idea to add. At least in my test
> setup where I only have one ethernet port it makes a big difference
> when accessing the system via ssh. When nvme-tcp is pushing a lot of
> data via the network, the ssh session is completely blocked by the
> storage traffic. With it, the ssh session stays responsive.

I'm not sure whether it is possible, but is there a way to gather
some form of quantitative data and present it here so we all know
exactly which aspect is improved by this patch in the context of
"ssh session is completely blocked"?

> I suspect a proper storage setup would use more than one ethernet
> port.
> 
> Daniel
> 
> v1:
>    https://lore.kernel.org/linux-nvme/20220519062617.39715-4-hare@suse.de/
> 
>   drivers/nvme/host/tcp.c | 17 ++++++++++++++---
>   1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 3e7b29d07c71..84a66ca208c8 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -308,12 +308,23 @@ static inline void nvme_tcp_advance_req(struct nvme_tcp_request *req,
>   
>   static inline void nvme_tcp_send_all(struct nvme_tcp_queue *queue)
>   {
> -	int ret;
> +	unsigned long deadline = jiffies + msecs_to_jiffies(1);

We need to provide some flexibility to set this value, as one fixed
value may not work with all setups and hardware.

Can we make it tunable rather than statically coded?
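
Purely as an illustration of what I mean -- this is not part of the
patch, and the parameter name below is made up -- a module parameter
would be the simplest form of such a knob:

/* illustrative only: expose the send quota as a module parameter */
static int send_quota_ms = 1;
module_param(send_quota_ms, int, 0644);
MODULE_PARM_DESC(send_quota_ms,
		 "time budget in milliseconds for nvme_tcp_send_all()");

static inline void nvme_tcp_send_all(struct nvme_tcp_queue *queue)
{
	unsigned long deadline =
		jiffies + msecs_to_jiffies(send_quota_ms);

	/* drain the send queue as much as we can... */
	do {
		int result = nvme_tcp_try_send(queue);

		if (result <= 0)	/* nothing left to send, or error */
			return;
	} while (!time_after(jiffies, deadline));
}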

>   
>   	/* drain the send queue as much as we can... */
>   	do {
> -		ret = nvme_tcp_try_send(queue);
> -	} while (ret > 0);
> +		bool pending = false;
> +		int result;
> +
> +		result = nvme_tcp_try_send(queue);
> +		if (result > 0)
> +			pending = true;
> +		else if (unlikely(result < 0))
> +			return;
> +
> +		if (!pending)
> +			return;
> +
> +	} while (!time_after(jiffies, deadline)); /* quota is exhausted */
>   }
>   
>   static inline bool nvme_tcp_queue_more(struct nvme_tcp_queue *queue)

-ck




* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-25  0:54 ` Chaitanya Kulkarni
@ 2022-10-25  5:52   ` Daniel Wagner
  2022-10-25  7:09   ` Hannes Reinecke
  1 sibling, 0 replies; 9+ messages in thread
From: Daniel Wagner @ 2022-10-25  5:52 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: linux-nvme, Sagi Grimberg, Keith Busch, Hannes Reinecke

On Tue, Oct 25, 2022 at 12:54:43AM +0000, Chaitanya Kulkarni wrote:
> On 10/24/22 08:03, Daniel Wagner wrote:
> > Add a send quota in nvme_tcp_send_all() to avoid stalls when sending
> > a large number of requests.
> > 
> > Cc: Hannes Reinecke <hare@suse.de>
> > Signed-off-by: Daniel Wagner <dwagner@suse.de>
> > ---
> > 
> > IMO, this patch might still be a good idea to add. At least in my test
> > setup where I only have one ethernet port it makes a big difference
> > when accessing the system via ssh. When nvme-tcp is pushing a lot of
> > data via the network, the ssh session is completely blocked by the
> > storage traffic. With it, the ssh session stays responsive.
> 
> I'm not sure whether it is possible, but is there a way to gather
> some form of quantitative data and present it here so we all know
> exactly which aspect is improved by this patch in the context of
> "ssh session is completely blocked"?

Before starting a fio test run, the remote shell works fine; while fio
runs, no keystroke gets echoed until fio stops again. Obviously, this
heavily depends on the workload. So my observation is that the unbounded
send loop prevents 'fair' usage of the bandwidth, and bounding it helps.

> > +	unsigned long deadline = jiffies + msecs_to_jiffies(1);
> 
> We need to provide some flexibility to set this value, as one fixed
> value may not work with all setups and hardware.

Note, this is the identical approach we already have in
nvme_tcp_io_work(), so nothing new. Though I was wondering too why it
is one jiffy.
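
For reference, the deadline loop in nvme_tcp_io_work() looks roughly
like this (condensed from memory, so details may differ between kernel
versions):

static void nvme_tcp_io_work(struct work_struct *w)
{
	struct nvme_tcp_queue *queue =
		container_of(w, struct nvme_tcp_queue, io_work);
	unsigned long deadline = jiffies + msecs_to_jiffies(1);

	do {
		bool pending = false;
		int result;

		if (mutex_trylock(&queue->send_mutex)) {
			result = nvme_tcp_try_send(queue);
			mutex_unlock(&queue->send_mutex);
			if (result > 0)
				pending = true;
			else if (unlikely(result < 0))
				break;
		}

		result = nvme_tcp_try_recv(queue);
		if (result > 0)
			pending = true;
		else if (unlikely(result < 0))
			return;

		if (!pending)
			return;

	} while (!time_after(jiffies, deadline)); /* quota is exhausted */

	/* requeue ourselves if the quota ran out with work left */
	queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
}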

> Can we make it tunable rather than statically coded?

I would really like to avoid having a tunable knob like a sysfs
entry; nobody will get it right. That means this would need to be an
algorithm which auto-adapts, though I don't have any good idea how such
an algorithm should operate. Any ideas?




* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-24 15:03 [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all() Daniel Wagner
  2022-10-25  0:54 ` Chaitanya Kulkarni
@ 2022-10-25  6:37 ` Hannes Reinecke
  1 sibling, 0 replies; 9+ messages in thread
From: Hannes Reinecke @ 2022-10-25  6:37 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Keith Busch

On 10/24/22 17:03, Daniel Wagner wrote:
> Add a send quota in nvme_tcp_send_all() to avoid stalls when sending
> a large number of requests.
> 
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> 
> IMO, this patch might still be a good idea to add. At least in my test
> setup where I only have one ethernet port it makes a big difference
> when accessing the system via ssh. When nvme-tcp is pushing a lot of
> data via the network, the ssh session is completely blocked by the
> storage traffic. With it, the ssh session stays responsive.
> 
> I suspect a proper storage setup would use more than one ethernet
> port.
> 
> Daniel
> 
> v1:
>    https://lore.kernel.org/linux-nvme/20220519062617.39715-4-hare@suse.de/
> 
>   drivers/nvme/host/tcp.c | 17 ++++++++++++++---
>   1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 3e7b29d07c71..84a66ca208c8 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -308,12 +308,23 @@ static inline void nvme_tcp_advance_req(struct nvme_tcp_request *req,
>   
>   static inline void nvme_tcp_send_all(struct nvme_tcp_queue *queue)
>   {
> -	int ret;
> +	unsigned long deadline = jiffies + msecs_to_jiffies(1);
>   
>   	/* drain the send queue as much as we can... */
>   	do {
> -		ret = nvme_tcp_try_send(queue);
> -	} while (ret > 0);
> +		bool pending = false;
> +		int result;
> +
> +		result = nvme_tcp_try_send(queue);
> +		if (result > 0)
> +			pending = true;
> +		else if (unlikely(result < 0))
> +			return;
> +
> +		if (!pending)
> +			return;
> +
> +	} while (!time_after(jiffies, deadline)); /* quota is exhausted */
>   }
>   
>   static inline bool nvme_tcp_queue_more(struct nvme_tcp_queue *queue)

Looks like a good idea indeed.
One might want to make the deadline configurable, though.
Other than that:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman




* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-25  0:54 ` Chaitanya Kulkarni
  2022-10-25  5:52   ` Daniel Wagner
@ 2022-10-25  7:09   ` Hannes Reinecke
  2022-10-25 13:46     ` Sagi Grimberg
  1 sibling, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2022-10-25  7:09 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Daniel Wagner, linux-nvme; +Cc: Sagi Grimberg, Keith Busch

On 10/25/22 02:54, Chaitanya Kulkarni wrote:
> On 10/24/22 08:03, Daniel Wagner wrote:
>> Add a send quota in nvme_tcp_send_all() to avoid stalls when sending
>> a large number of requests.
>>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Signed-off-by: Daniel Wagner <dwagner@suse.de>
>> ---
>>
>> IMO, this patch might still be a good idea to add. At least in my test
>> setup where I only have one ethernet port it makes a big difference
>> when accessing the system via ssh. When nvme-tcp is pushing a lot of
>> data via the network, the ssh session is completely blocked by the
>> storage traffic. With it, the ssh session stays responsive.
> 
> I'm not sure whether it is possible, but is there a way to gather
> some form of quantitative data and present it here so we all know
> exactly which aspect is improved by this patch in the context of
> "ssh session is completely blocked"?
> 
Doubt that we can do it. Point is, the send code will run in a tight 
loop, making scheduling of other processes / packets really hard.
So if you have several processes on the same interface (as here with the 
ssh connection) nvme-tcp will eat up the entire bandwidth sending its 
data, and everyone else on the line will suffer.

I guess the same effect could be had by adding a 'schedule()' after each 
nvme_tcp_try_send() call.
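
Something like this untested sketch (using cond_resched(), which only
yields when another task actually needs the CPU):

static inline void nvme_tcp_send_all(struct nvme_tcp_queue *queue)
{
	int ret;

	/* drain the send queue as much as we can... */
	do {
		ret = nvme_tcp_try_send(queue);
		/* let other tasks (and the rest of the stack) run */
		cond_resched();
	} while (ret > 0);
}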

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman




* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-25  7:09   ` Hannes Reinecke
@ 2022-10-25 13:46     ` Sagi Grimberg
  2022-10-26  7:13       ` Daniel Wagner
  0 siblings, 1 reply; 9+ messages in thread
From: Sagi Grimberg @ 2022-10-25 13:46 UTC (permalink / raw)
  To: Hannes Reinecke, Chaitanya Kulkarni, Daniel Wagner, linux-nvme
  Cc: Keith Busch


>>> Add a send quota in nvme_tcp_send_all() to avoid stalls when sending
>>> a large number of requests.
>>>
>>> Cc: Hannes Reinecke <hare@suse.de>
>>> Signed-off-by: Daniel Wagner <dwagner@suse.de>
>>> ---
>>>
>>> IMO, this patch might still be a good idea to add. At least in my test
>>> setup where I only have one ethernet port it makes a big difference
>>> when accessing the system via ssh. When nvme-tcp is pushing a lot of
>>> data via the network, the ssh session is completely blocked by the
>>> storage traffic. With it, the ssh session stays responsive.
>>
>> I'm not sure whether it is possible, but is there a way to gather
>> some form of quantitative data and present it here so we all know
>> exactly which aspect is improved by this patch in the context of
>> "ssh session is completely blocked"?
>>
> Doubt that we can do it. Point is, the send code will run in a tight 
> loop, making scheduling of other processes / packets really hard.
> So if you have several processes on the same interface (as here with the 
> ssh connection) nvme-tcp will eat up the entire bandwidth sending its 
> data, and everyone else on the line will suffer.
> 
> I guess the same effect could be had by adding a 'schedule()' after each 
> nvme_tcp_try_send() call.

Daniel, does adding cond_resched() make the system responsive again?



* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-25 13:46     ` Sagi Grimberg
@ 2022-10-26  7:13       ` Daniel Wagner
  2022-10-26  8:30         ` Sagi Grimberg
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Wagner @ 2022-10-26  7:13 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Hannes Reinecke, Chaitanya Kulkarni, linux-nvme, Keith Busch

On Tue, Oct 25, 2022 at 04:46:07PM +0300, Sagi Grimberg wrote:
> Daniel, does adding cond_resched() make the system responsive again?

As it turns out, I can't reproduce it anymore. Our test lab got
restructured and the cabling between the machines changed (maybe even
different network switches, I don't know). Also, the target got a
firmware update.

It looks more like the real problem was caused by the network
infrastructure than by the host itself, and adding the additional delay
in the send path just reduced the load, which kept the ssh session
working.

Given this, I don't think we currently need to touch this code. Though
the unbounded loop makes me a bit uneasy.



* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-26  7:13       ` Daniel Wagner
@ 2022-10-26  8:30         ` Sagi Grimberg
  2022-10-26  8:39           ` Daniel Wagner
  0 siblings, 1 reply; 9+ messages in thread
From: Sagi Grimberg @ 2022-10-26  8:30 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Hannes Reinecke, Chaitanya Kulkarni, linux-nvme, Keith Busch


>> Daniel, does adding cond_resched() make the system responsive again?
> 
> As it turns out, I can't reproduce it anymore. Our test lab got
> restructured and the cabling between the machines changed (maybe even
> different network switches, I don't know). Also, the target got a
> firmware update.
> 
> It looks more like the real problem was caused by the network
> infrastructure than by the host itself, and adding the additional
> delay in the send path just reduced the load, which kept the ssh
> session working.

Let's table it for now.

> Given this, I don't think we currently need to touch this code. Though
> the unbounded loop makes me a bit uneasy.

This flow is only entered when there is a single request queued, so it
should not be something that takes long. Unless there is a real
problem, let's not add code that may produce one.
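
For context, roughly how that path is entered (condensed from the
driver, details vary by version): nvme_tcp_queue_request() only calls
nvme_tcp_send_all() directly when the request list was empty and we are
on the queue's io_cpu; otherwise it defers to io_work:

static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
		bool sync, bool last)
{
	struct nvme_tcp_queue *queue = req->queue;
	bool empty;

	empty = llist_add(&req->lentry, &queue->req_list) &&
		list_empty(&queue->send_list) && !queue->request;

	/* send directly only for a lone request on the local CPU */
	if (queue->io_cpu == raw_smp_processor_id() &&
	    sync && empty && mutex_trylock(&queue->send_mutex)) {
		nvme_tcp_send_all(queue);
		mutex_unlock(&queue->send_mutex);
	}

	if (last && nvme_tcp_queue_more(queue))
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
}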



* Re: [PATCH v2] nvme-tcp: send quota for nvme_tcp_send_all()
  2022-10-26  8:30         ` Sagi Grimberg
@ 2022-10-26  8:39           ` Daniel Wagner
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Wagner @ 2022-10-26  8:39 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Hannes Reinecke, Chaitanya Kulkarni, linux-nvme, Keith Busch

On Wed, Oct 26, 2022 at 11:30:50AM +0300, Sagi Grimberg wrote:
> > Given this, I don't think we currently need to touch this code. Though
> > the unbounded loop makes me a bit uneasy.
> 
> This flow is only entered when there is a single request queued, so it
> should not be something that takes long. Unless there is a real
> problem, let's not add code that may produce one.

Okay, makes sense. I just wonder how we would diagnose this scenario.


