[PATCH V3] nvmet-tcp: enable optional queue idle period tracking
From: Wunderlich, Mark @ 2021-03-23 22:25 UTC
To: linux-nvme; +Cc: Sagi Grimberg

nvmet-tcp: enable optional queue idle period tracking

Add an 'idle_poll_period_usecs' option, used by io_work(), to support
network devices that use advanced interrupt moderation with a relaxed
interrupt model. It was discovered that such a NIC used on the target
could not support initiator connection establishment, because the
existing io_work() flow exits immediately after a loop with no
activity and does not re-queue itself.

With this new option, a queue is assigned a period of time during
which no activity must occur before it is considered 'idle'.  Until
the queue becomes idle, the work item is requeued.

The new module option is defined as changeable, making it flexible
for testing purposes.

The pre-existing behavior is preserved when idle_poll_period_usecs is
not specified.
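
As a usage sketch (illustrative only, not part of the patch): the 0644
module_param permissions in the patch below make the option settable
both at load time and at runtime, e.g.:

	# set a 5000 usec idle poll period at module load
	modprobe nvmet-tcp idle_poll_period_usecs=5000

	# or adjust it at runtime via the writable parameter
	echo 5000 > /sys/module/nvmet_tcp/parameters/idle_poll_period_usecs

The value itself is workload and NIC dependent; it should be long
enough to catch any non-idle activity on an active connection.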

Signed-off-by: Mark Wunderlich <mark.wunderlich@intel.com>
---
V2 of this patch removes the accounting that deducted time from the
idle deadline period during io_work activity.  The result is a
simpler solution, requiring only the selection of a sufficient
optional time period that will catch any non-idle activity and keep
a queue active.

Testing was performed with a NIC using standard HW interrupt mode,
with and without the new module option enabled.  No measurable
performance drop was seen with the patch applied, whether or not the
new option was specified.  A side effect of using the new option with
a standard NIC is a reduced context switch rate; we measured a drop
from roughly 90K to less than 300 (for 32 active connections).
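
For reference, one way to observe the context switch rate (not
necessarily how it was measured here):

	# system-wide context-switch count over 10 seconds
	perf stat -e context-switches -a sleep 10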

With the option enabled, a NIC using a passive advanced interrupt
moderation policy was able to successfully establish and maintain
active connections with the target.
---
V3 of this patch provides a bit more simplification, pulling the
tracking code out of io_work and into two support functions.  Now a
single test is made after the process loop in io_work to determine
whether optional idle tracking is active.  The base logic of idle
tracking used as the condition to re-queue the worker remains the
same.
---
 drivers/nvme/target/tcp.c |   44 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index dc1f0f647189..0e86115af9f4 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -29,6 +29,16 @@ static int so_priority;
 module_param(so_priority, int, 0644);
 MODULE_PARM_DESC(so_priority, "nvmet tcp socket optimize priority");
 
+/* Define a time period (in usecs) that io_work() shall sample an activated
+ * queue before determining it to be idle.  This optional module behavior
+ * can enable NIC solutions that support socket optimized packet processing
+ * using advanced interrupt moderation techniques.
+ */
+static int idle_poll_period_usecs;
+module_param(idle_poll_period_usecs, int, 0644);
+MODULE_PARM_DESC(idle_poll_period_usecs,
+		"nvmet tcp io_work poll till idle time period in usecs");
+
 #define NVMET_TCP_RECV_BUDGET		8
 #define NVMET_TCP_SEND_BUDGET		8
 #define NVMET_TCP_IO_WORK_BUDGET	64
@@ -119,6 +129,9 @@ struct nvmet_tcp_queue {
 	struct ahash_request	*snd_hash;
 	struct ahash_request	*rcv_hash;
 
+	unsigned long		poll_start;
+	unsigned long		poll_end;
+
 	spinlock_t		state_lock;
 	enum nvmet_tcp_queue_state state;
 
@@ -1198,11 +1211,34 @@ static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue *queue)
 	spin_unlock(&queue->state_lock);
 }
 
+static inline void nvmet_tcp_arm_queue_deadline(struct nvmet_tcp_queue *queue)
+{
+	queue->poll_start = jiffies;
+	queue->poll_end = queue->poll_start +
+			usecs_to_jiffies(idle_poll_period_usecs);
+}
+
+static bool nvmet_tcp_check_queue_deadline(struct nvmet_tcp_queue *queue,
+		int ops)
+{
+	if (!idle_poll_period_usecs)
+		return false;
+
+	if (ops || !queue->poll_start)
+		nvmet_tcp_arm_queue_deadline(queue);
+
+	if (!time_in_range(jiffies, queue->poll_start, queue->poll_end)) {
+		queue->poll_start = queue->poll_end = 0;
+		return false;
+	}
+	return true;
+}
+
 static void nvmet_tcp_io_work(struct work_struct *w)
 {
 	struct nvmet_tcp_queue *queue =
 		container_of(w, struct nvmet_tcp_queue, io_work);
-	bool pending;
+	bool pending, requeue;
 	int ret, ops = 0;
 
 	do {
@@ -1223,9 +1259,11 @@ static void nvmet_tcp_io_work(struct work_struct *w)
 	} while (pending && ops < NVMET_TCP_IO_WORK_BUDGET);
 
 	/*
-	 * We exahusted our budget, requeue our selves
+	 * Requeue the worker if idle deadline period is in progress or any
+	 * ops activity was recorded during the do-while loop above.
 	 */
-	if (pending)
+	requeue = nvmet_tcp_check_queue_deadline(queue, ops) || pending;
+	if (requeue)
 		queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
 }
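
For reference (not part of the patch): the deadline check above
relies on the kernel's time_in_range() helper, which is safe across
jiffies wraparound.  Its definition in include/linux/jiffies.h is,
roughly:

	#define time_in_range(a,b,c) \
		(time_after_eq(a,b) && \
		 time_before_eq(a,c))

so the worker keeps re-queueing itself while poll_start <= jiffies <=
poll_end, and poll_start == 0 doubles as the 'deadline not armed'
sentinel that nvmet_tcp_check_queue_deadline() resets on expiry.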



Re: [PATCH V3] nvmet-tcp: enable optional queue idle period tracking
From: Sagi Grimberg @ 2021-03-23 22:52 UTC
To: Wunderlich, Mark, linux-nvme

Hey Mark,

> nvmet-tcp: enable optional queue idle period tracking
> 
> Add an 'idle_poll_period_usecs' option, used by io_work(), to support
> network devices that use advanced interrupt moderation with a relaxed
> interrupt model. It was discovered that such a NIC used on the target
> could not support initiator connection establishment, because the
> existing io_work() flow exits immediately after a loop with no
> activity and does not re-queue itself.
> 
> With this new option, a queue is assigned a period of time during
> which no activity must occur before it is considered 'idle'.  Until
> the queue becomes idle, the work item is requeued.
> 
> The new module option is defined as changeable, making it flexible
> for testing purposes.
> 
> The pre-existing behavior is preserved when idle_poll_period_usecs is
> not specified.
> 
> Signed-off-by: Mark Wunderlich <mark.wunderlich@intel.com>
> ---

It's easier to read here with the format:
Changes from v2:
- ...
- ...

Changes from v1:
- ...
- ...

> V2 of this patch removes the accounting that deducted time from the
> idle deadline period during io_work activity.  The result is a
> simpler solution, requiring only the selection of a sufficient
> optional time period that will catch any non-idle activity and keep
> a queue active.
> 
> Testing was performed with a NIC using standard HW interrupt mode,
> with and without the new module option enabled.  No measurable
> performance drop was seen with the patch applied, whether or not the
> new option was specified.  A side effect of using the new option with
> a standard NIC is a reduced context switch rate; we measured a drop
> from roughly 90K to less than 300 (for 32 active connections).
> 
> With the option enabled, a NIC using a passive advanced interrupt
> moderation policy was able to successfully establish and maintain
> active connections with the target.
> ---
> V3 of this patch provides a bit more simplification, pulling the
> tracking code out of io_work and into two support functions.  Now a
> single test is made after the process loop in io_work to determine
> whether optional idle tracking is active.  The base logic of idle
> tracking used as the condition to re-queue the worker remains the
> same.
> ---
>   drivers/nvme/target/tcp.c |   44 +++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index dc1f0f647189..0e86115af9f4 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -29,6 +29,16 @@ static int so_priority;
>   module_param(so_priority, int, 0644);
>   MODULE_PARM_DESC(so_priority, "nvmet tcp socket optimize priority");
>   
> +/* Define a time period (in usecs) that io_work() shall sample an activated
> + * queue before determining it to be idle.  This optional module behavior
> + * can enable NIC solutions that support socket optimized packet processing
> + * using advanced interrupt moderation techniques.
> + */
> +static int idle_poll_period_usecs;
> +module_param(idle_poll_period_usecs, int, 0644);
> +MODULE_PARM_DESC(idle_poll_period_usecs,
> +		"nvmet tcp io_work poll till idle time period in usecs");
> +
>   #define NVMET_TCP_RECV_BUDGET		8
>   #define NVMET_TCP_SEND_BUDGET		8
>   #define NVMET_TCP_IO_WORK_BUDGET	64
> @@ -119,6 +129,9 @@ struct nvmet_tcp_queue {
>   	struct ahash_request	*snd_hash;
>   	struct ahash_request	*rcv_hash;
>   
> +	unsigned long		poll_start;
> +	unsigned long		poll_end;
> +
>   	spinlock_t		state_lock;
>   	enum nvmet_tcp_queue_state state;
>   
> @@ -1198,11 +1211,34 @@ static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue *queue)
>   	spin_unlock(&queue->state_lock);
>   }
>   
> +static inline void nvmet_tcp_arm_queue_deadline(struct nvmet_tcp_queue *queue)
> +{
> +	queue->poll_start = jiffies;
> +	queue->poll_end = queue->poll_start +
> +			usecs_to_jiffies(idle_poll_period_usecs);
> +}
> +
> +static bool nvmet_tcp_check_queue_deadline(struct nvmet_tcp_queue *queue,
> +		int ops)
> +{
> +	if (!idle_poll_period_usecs)
> +		return false;
> +
> +	if (ops || !queue->poll_start)
> +		nvmet_tcp_arm_queue_deadline(queue);
> +
> +	if (!time_in_range(jiffies, queue->poll_start, queue->poll_end)) {
> +		queue->poll_start = queue->poll_end = 0;
> +		return false;
> +	}
> +	return true;

Can this be made simpler somehow, without clearing the poll limits and
then looking at poll_start to indicate that?

> +}
> +
>   static void nvmet_tcp_io_work(struct work_struct *w)
>   {
>   	struct nvmet_tcp_queue *queue =
>   		container_of(w, struct nvmet_tcp_queue, io_work);
> -	bool pending;
> +	bool pending, requeue;
>   	int ret, ops = 0;
>   
>   	do {
> @@ -1223,9 +1259,11 @@ static void nvmet_tcp_io_work(struct work_struct *w)
>   	} while (pending && ops < NVMET_TCP_IO_WORK_BUDGET);
>   
>   	/*
> -	 * We exahusted our budget, requeue our selves
> +	 * Requeue the worker if idle deadline period is in progress or any
> +	 * ops activity was recorded during the do-while loop above.
>   	 */
> -	if (pending)
> +	requeue = nvmet_tcp_check_queue_deadline(queue, ops) || pending;
> +	if (requeue)

I'm thinking that this requeue variable is redundant.

You can do instead:

	if (nvmet_tcp_check_queue_deadline(queue, ops) || pending)
>   		queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
>   }
> 
> 