io-uring.vger.kernel.org archive mirror
* [PATCH 0/2] improve SQPOLL handling
@ 2020-10-20  8:23 Xiaoguang Wang
  2020-10-20  8:23 ` [PATCH 1/2] io_uring: refactor io_sq_thread() handling Xiaoguang Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Xiaoguang Wang @ 2020-10-20  8:23 UTC
  To: io-uring; +Cc: axboe, asml.silence, joseph.qi, Xiaoguang Wang

The first patch addresses several issues in the current implementation:
  The prepare_to_wait() usage in __io_sq_thread() is odd. If multiple ctxs
share the same poll thread, one ctx can put the poll thread into
TASK_INTERRUPTIBLE even while other ctxs still have work to do, in which case
the task's state should not be changed at all. We should only do this once
none of the ctxs has work to do.
  We use a round-robin strategy to make multiple ctxs share the same poll
thread, but __io_sq_thread() contains various conditions that make the code
complicated and may break the round-robin behaviour.
  In io_sq_thread(), we always call io_sq_thread_drop_mm() when we complete a
round of ctx iteration, which may be inefficient.

The second patch adds an IORING_SETUP_SQPOLL_PERCPU flag: rings that have
SQPOLL enabled and are willing to be bound to the same cpu can share one poll
thread by specifying this new flag. The fio tool can integrate this feature
easily, which makes it simple to test multiple rings sharing the same poll
thread.
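
For illustration, here is a minimal sketch of how an application could set
up such a ring (this assumes a uapi header carrying the new flag from this
series; the helper name and idle value are made up for the example, and
error handling is elided):

#include <linux/io_uring.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* create a ring whose SQPOLL thread is bound to 'cpu' and shared with
 * every other ring created with the same cpu and the same flags */
static int setup_percpu_sqpoll_ring(unsigned entries, int cpu)
{
        struct io_uring_params p;

        memset(&p, 0, sizeof(p));
        p.flags = IORING_SETUP_SQPOLL | IORING_SETUP_SQ_AFF |
                  IORING_SETUP_SQPOLL_PERCPU;
        p.sq_thread_cpu = cpu;
        p.sq_thread_idle = 1000;        /* ms */

        return syscall(__NR_io_uring_setup, entries, &p);
}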

Xiaoguang Wang (2):
  io_uring: refactor io_sq_thread() handling
  io_uring: support multiple rings to share same poll thread by
    specifying same cpu

 fs/io_uring.c                 | 307 ++++++++++++++++++++--------------
 include/uapi/linux/io_uring.h |   1 +
 2 files changed, 181 insertions(+), 127 deletions(-)

-- 
2.17.2



* [PATCH 1/2] io_uring: refactor io_sq_thread() handling
  2020-10-20  8:23 [PATCH 0/2] improve SQPOLL handling Xiaoguang Wang
@ 2020-10-20  8:23 ` Xiaoguang Wang
  2020-10-27 16:56   ` Jens Axboe
  2020-10-20  8:23 ` [PATCH 2/2] io_uring: support multiple rings to share same poll thread by specifying same cpu Xiaoguang Wang
  2020-10-27 13:34 ` [PATCH 0/2] improve SQPOLL handling Xiaoguang Wang
  2 siblings, 1 reply; 9+ messages in thread
From: Xiaoguang Wang @ 2020-10-20  8:23 UTC
  To: io-uring; +Cc: axboe, asml.silence, joseph.qi, Xiaoguang Wang

There are some issues with the current io_sq_thread() implementation:
  1. The prepare_to_wait() usage in __io_sq_thread() is odd. If
multiple ctxs share the same poll thread, one ctx can put the poll
thread into TASK_INTERRUPTIBLE even while other ctxs still have work
to do, in which case the task's state should not be changed at all.
We should only do this once none of the ctxs has work to do.
  2. We use a round-robin strategy to make multiple ctxs share the
same poll thread, but __io_sq_thread() contains various conditions
that make the code complicated and may break the round-robin
behaviour.
  3. In io_sq_thread(), we always call io_sq_thread_drop_mm() when we
complete a round of ctx iteration, which may be inefficient.

To address these issues, this patch takes the following actions:
  1. If multiple ctxs share the same poll thread, only call
prepare_to_wait() and schedule() to put the poll thread to sleep once
none of the ctxs has work to do.
  2. To make the round-robin strategy more straightforward, simplify
__io_sq_thread() a bit: it just does one round of io poll and sqe
submission work, without checking various conditions.
  3. Only call io_sq_thread_drop_mm() while we're polling within the
defined idle period.
  4. When multiple ctxs share the same poll thread, choose the largest
sq_thread_idle among them as the timeout, and update it whenever a ctx
is added or removed.
  5. There is no need to check for EBUSY specially: if io_submit_sqes()
returns -EBUSY, IORING_SQ_CQ_OVERFLOW should be set, and the liburing
helpers will see the cq overflow and enter the kernel to flush work.
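
In sketch form, the new main loop of io_sq_thread() is roughly the
following (simplified from the diff below; creds, blkcg, park handling
and declarations are omitted, and io_sqd_has_work() is a hypothetical
helper standing in for the open-coded scan over the ctx list):

while (!kthread_should_stop()) {
        bool sqt_spin = false;

        /* one fair round-robin pass: each ctx polls and submits once */
        list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
                ret = __io_sq_thread(ctx, cap_entries);
                if (ret > 0 || !list_empty(&ctx->iopoll_list))
                        sqt_spin = true;
        }

        if (sqt_spin) {         /* some ctx had work: keep going */
                io_run_task_work();
                cond_resched();
                continue;
        }

        if (!time_after(jiffies, timeout)) {
                /* still inside the idle window: spin, but drop the mm */
                io_run_task_work();
                cond_resched();
                io_sq_thread_drop_mm();
                continue;
        }

        /* idle period expired: sleep only if no ctx has pending work */
        prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
        if (!io_sqd_has_work(sqd)) {
                list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
                        io_ring_set_wakeup_flag(ctx);
                schedule();
                list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
                        io_ring_clear_wakeup_flag(ctx);
        }
        finish_wait(&sqd->wait, &wait);
        timeout = jiffies + sqd->sq_thread_idle;
}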

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 fs/io_uring.c | 165 ++++++++++++++++++++------------------------------
 1 file changed, 66 insertions(+), 99 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2c83c2688ec5..f7b65a9ed5b8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -243,6 +243,8 @@ struct io_sq_data {
 
 	struct task_struct	*thread;
 	struct wait_queue_head	wait;
+
+	unsigned		sq_thread_idle;
 };
 
 struct io_ring_ctx {
@@ -308,7 +310,6 @@ struct io_ring_ctx {
 	struct io_sq_data	*sq_data;	/* if using sq thread polling */
 
 	struct wait_queue_head	sqo_sq_wait;
-	struct wait_queue_entry	sqo_wait_entry;
 	struct list_head	sqd_list;
 
 	/*
@@ -6484,38 +6485,11 @@ static inline void io_ring_clear_wakeup_flag(struct io_ring_ctx *ctx)
 	spin_unlock_irq(&ctx->completion_lock);
 }
 
-static int io_sq_wake_function(struct wait_queue_entry *wqe, unsigned mode,
-			       int sync, void *key)
-{
-	struct io_ring_ctx *ctx = container_of(wqe, struct io_ring_ctx, sqo_wait_entry);
-	int ret;
-
-	ret = autoremove_wake_function(wqe, mode, sync, key);
-	if (ret) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&ctx->completion_lock, flags);
-		ctx->rings->sq_flags &= ~IORING_SQ_NEED_WAKEUP;
-		spin_unlock_irqrestore(&ctx->completion_lock, flags);
-	}
-	return ret;
-}
-
-enum sq_ret {
-	SQT_IDLE	= 1,
-	SQT_SPIN	= 2,
-	SQT_DID_WORK	= 4,
-};
-
-static enum sq_ret __io_sq_thread(struct io_ring_ctx *ctx,
-				  unsigned long start_jiffies, bool cap_entries)
+static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
 {
-	unsigned long timeout = start_jiffies + ctx->sq_thread_idle;
-	struct io_sq_data *sqd = ctx->sq_data;
 	unsigned int to_submit;
 	int ret = 0;
 
-again:
 	if (!list_empty(&ctx->iopoll_list)) {
 		unsigned nr_events = 0;
 
@@ -6526,56 +6500,6 @@ static enum sq_ret __io_sq_thread(struct io_ring_ctx *ctx,
 	}
 
 	to_submit = io_sqring_entries(ctx);
-
-	/*
-	 * If submit got -EBUSY, flag us as needing the application
-	 * to enter the kernel to reap and flush events.
-	 */
-	if (!to_submit || ret == -EBUSY || need_resched()) {
-		/*
-		 * Drop cur_mm before scheduling, we can't hold it for
-		 * long periods (or over schedule()). Do this before
-		 * adding ourselves to the waitqueue, as the unuse/drop
-		 * may sleep.
-		 */
-		io_sq_thread_drop_mm();
-
-		/*
-		 * We're polling. If we're within the defined idle
-		 * period, then let us spin without work before going
-		 * to sleep. The exception is if we got EBUSY doing
-		 * more IO, we should wait for the application to
-		 * reap events and wake us up.
-		 */
-		if (!list_empty(&ctx->iopoll_list) || need_resched() ||
-		    (!time_after(jiffies, timeout) && ret != -EBUSY &&
-		    !percpu_ref_is_dying(&ctx->refs)))
-			return SQT_SPIN;
-
-		prepare_to_wait(&sqd->wait, &ctx->sqo_wait_entry,
-					TASK_INTERRUPTIBLE);
-
-		/*
-		 * While doing polled IO, before going to sleep, we need
-		 * to check if there are new reqs added to iopoll_list,
-		 * it is because reqs may have been punted to io worker
-		 * and will be added to iopoll_list later, hence check
-		 * the iopoll_list again.
-		 */
-		if ((ctx->flags & IORING_SETUP_IOPOLL) &&
-		    !list_empty_careful(&ctx->iopoll_list)) {
-			finish_wait(&sqd->wait, &ctx->sqo_wait_entry);
-			goto again;
-		}
-
-		to_submit = io_sqring_entries(ctx);
-		if (!to_submit || ret == -EBUSY)
-			return SQT_IDLE;
-	}
-
-	finish_wait(&sqd->wait, &ctx->sqo_wait_entry);
-	io_ring_clear_wakeup_flag(ctx);
-
 	/* if we're handling multiple rings, cap submit size for fairness */
 	if (cap_entries && to_submit > 8)
 		to_submit = 8;
@@ -6588,7 +6512,20 @@ static enum sq_ret __io_sq_thread(struct io_ring_ctx *ctx,
 	if (!io_sqring_full(ctx) && wq_has_sleeper(&ctx->sqo_sq_wait))
 		wake_up(&ctx->sqo_sq_wait);
 
-	return SQT_DID_WORK;
+	return ret;
+}
+
+static void io_sqd_update_thread_idle(struct io_sq_data *sqd)
+{
+	struct io_ring_ctx *ctx;
+	unsigned sq_thread_idle = 0;
+
+	list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+		if (sq_thread_idle < ctx->sq_thread_idle)
+			sq_thread_idle = ctx->sq_thread_idle;
+	}
+
+	sqd->sq_thread_idle = sq_thread_idle;
 }
 
 static void io_sqd_init_new(struct io_sq_data *sqd)
@@ -6597,11 +6534,11 @@ static void io_sqd_init_new(struct io_sq_data *sqd)
 
 	while (!list_empty(&sqd->ctx_new_list)) {
 		ctx = list_first_entry(&sqd->ctx_new_list, struct io_ring_ctx, sqd_list);
-		init_wait(&ctx->sqo_wait_entry);
-		ctx->sqo_wait_entry.func = io_sq_wake_function;
 		list_move_tail(&ctx->sqd_list, &sqd->ctx_list);
 		complete(&ctx->sq_thread_comp);
 	}
+
+	io_sqd_update_thread_idle(sqd);
 }
 
 static int io_sq_thread(void *data)
@@ -6610,12 +6547,13 @@ static int io_sq_thread(void *data)
 	const struct cred *old_cred = NULL;
 	struct io_sq_data *sqd = data;
 	struct io_ring_ctx *ctx;
-	unsigned long start_jiffies;
+	unsigned long timeout;
+	unsigned int to_submit;
+	DEFINE_WAIT(wait);
 
-	start_jiffies = jiffies;
 	while (!kthread_should_stop()) {
-		enum sq_ret ret = 0;
-		bool cap_entries;
+		int ret;
+		bool cap_entries, sqt_spin, needs_sched;
 
 		/*
 		 * Any changes to the sqd lists are synchronized through the
@@ -6625,11 +6563,13 @@ static int io_sq_thread(void *data)
 		if (kthread_should_park())
 			kthread_parkme();
 
-		if (unlikely(!list_empty(&sqd->ctx_new_list)))
+		if (unlikely(!list_empty(&sqd->ctx_new_list))) {
 			io_sqd_init_new(sqd);
+			timeout = jiffies + sqd->sq_thread_idle;
+		}
 
+		sqt_spin = false;
 		cap_entries = !list_is_singular(&sqd->ctx_list);
-
 		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
 			if (current->cred != ctx->creds) {
 				if (old_cred)
@@ -6638,24 +6578,52 @@ static int io_sq_thread(void *data)
 			}
 			io_sq_thread_associate_blkcg(ctx, &cur_css);
 
-			ret |= __io_sq_thread(ctx, start_jiffies, cap_entries);
+			ret = __io_sq_thread(ctx, cap_entries);
+			if (!sqt_spin && (ret > 0 || !list_empty(&ctx->iopoll_list)))
+				sqt_spin = true;
+		}
 
-			io_sq_thread_drop_mm();
+		if (sqt_spin) {
+			io_run_task_work();
+			cond_resched();
+			continue;
 		}
 
-		if (ret & SQT_SPIN) {
+		if (!time_after(jiffies, timeout)) {
 			io_run_task_work();
 			cond_resched();
-		} else if (ret == SQT_IDLE) {
-			if (kthread_should_park())
-				continue;
+			io_sq_thread_drop_mm();
+			continue;
+		}
+
+		needs_sched = true;
+		prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
+		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+			if ((ctx->flags & IORING_SETUP_IOPOLL) &&
+			    !list_empty_careful(&ctx->iopoll_list)) {
+				needs_sched = false;
+				break;
+			}
+			to_submit = io_sqring_entries(ctx);
+			if (to_submit) {
+				needs_sched = false;
+				break;
+			}
+		}
+
+		if (needs_sched) {
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_set_wakeup_flag(ctx);
+		}
+
+		if (needs_sched) {
 			schedule();
-			start_jiffies = jiffies;
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_clear_wakeup_flag(ctx);
-		}
+			finish_wait(&sqd->wait, &wait);
+		} else
+			finish_wait(&sqd->wait, &wait);
+		timeout = jiffies + sqd->sq_thread_idle;
 	}
 
 	io_run_task_work();
@@ -6947,12 +6915,11 @@ static void io_sq_thread_stop(struct io_ring_ctx *ctx)
 
 		mutex_lock(&sqd->ctx_lock);
 		list_del(&ctx->sqd_list);
+		io_sqd_update_thread_idle(sqd);
 		mutex_unlock(&sqd->ctx_lock);
 
-		if (sqd->thread) {
-			finish_wait(&sqd->wait, &ctx->sqo_wait_entry);
+		if (sqd->thread)
 			io_sq_thread_unpark(sqd);
-		}
 
 		io_put_sq_data(sqd);
 		ctx->sq_data = NULL;
-- 
2.17.2



* [PATCH 2/2] io_uring: support multiple rings to share same poll thread by specifying same cpu
  2020-10-20  8:23 [PATCH 0/2] improve SQPOLL handling Xiaoguang Wang
  2020-10-20  8:23 ` [PATCH 1/2] io_uring: refactor io_sq_thread() handling Xiaoguang Wang
@ 2020-10-20  8:23 ` Xiaoguang Wang
  2020-10-27 17:01   ` Jens Axboe
  2020-10-27 13:34 ` [PATCH 0/2] improve SQPOLL handling Xiaoguang Wang
  2 siblings, 1 reply; 9+ messages in thread
From: Xiaoguang Wang @ 2020-10-20  8:23 UTC
  To: io-uring; +Cc: axboe, asml.silence, joseph.qi, Xiaoguang Wang

Multiple rings can already share the same poll thread by passing
IORING_SETUP_ATTACH_WQ, but it's not that convenient to use.
IORING_SETUP_ATTACH_WQ requires users to ensure that a parent ring
instance already exists, which means the app must regulate the
creation order between uring instances.

We can make this a bit simpler: for rings that will have SQPOLL
enabled and are willing to be bound to the same cpu, add the
capability to share one poll thread by specifying a new
IORING_SETUP_SQPOLL_PERCPU flag. That gives three cases:
  1, IORING_SETUP_ATTACH_WQ: if the user specifies this flag, we'll
always try to attach this ring to an existing ring's poll thread,
no matter whether IORING_SETUP_SQ_AFF or IORING_SETUP_SQPOLL_PERCPU
is set.
  2, IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
enabled: in this case, we'll create a single poll thread to be shared
by these rings, and this poll thread is bound to a fixed cpu.
  3, in any other case, we'll just create one new poll thread for the
corresponding ring.

For case 2, there is no need to regulate the creation order of
multiple uring instances; a mutex synchronizes creation. For example,
given five rings that all have IORING_SETUP_SQ_AFF &
IORING_SETUP_SQPOLL_PERCPU enabled and are willing to be bound to the
same cpu, the ring that gets the mutex first will create the poll
thread, and the other four rings will attach themselves to the
previously created poll thread once they acquire the lock.

To implement this, define a percpu io_sq_data array:
    static struct io_sq_data __percpu *percpu_sqd;
When IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
enabled, struct io_uring_params' sq_thread_cpu is used to locate the
corresponding sqd, and that sqd stores the poll thread info.
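
Concretely, the intended usage is that any number of threads or processes
can run something like the sketch below, in any order, and end up sharing
the poll thread for cpu 3 (this assumes the patched uapi header; error
handling is elided):

#include <linux/io_uring.h>
#include <sys/syscall.h>
#include <unistd.h>

static int setup_shared_ring(void)
{
        struct io_uring_params p = {
                .flags = IORING_SETUP_SQPOLL | IORING_SETUP_SQ_AFF |
                         IORING_SETUP_SQPOLL_PERCPU,
                .sq_thread_cpu = 3,     /* same cpu => same poll thread */
        };

        /* whichever caller takes the percpu lock first creates the
         * poll thread; later callers simply attach to it */
        return syscall(__NR_io_uring_setup, 128, &p);
}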

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 fs/io_uring.c                 | 142 +++++++++++++++++++++++++++-------
 include/uapi/linux/io_uring.h |   1 +
 2 files changed, 115 insertions(+), 28 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f7b65a9ed5b8..b9ebf1ca93d7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -235,6 +235,7 @@ struct io_restriction {
 struct io_sq_data {
 	refcount_t		refs;
 	struct mutex		lock;
+	struct mutex		percpu_sq_lock;
 
 	/* ctx's that are using this sqd */
 	struct list_head	ctx_list;
@@ -247,6 +248,8 @@ struct io_sq_data {
 	unsigned		sq_thread_idle;
 };
 
+static struct io_sq_data __percpu *percpu_sqd;
+
 struct io_ring_ctx {
 	struct {
 		struct percpu_ref	refs;
@@ -6504,10 +6507,12 @@ static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
 	if (cap_entries && to_submit > 8)
 		to_submit = 8;
 
-	mutex_lock(&ctx->uring_lock);
-	if (likely(!percpu_ref_is_dying(&ctx->refs)))
-		ret = io_submit_sqes(ctx, to_submit);
-	mutex_unlock(&ctx->uring_lock);
+	if (to_submit) {
+		mutex_lock(&ctx->uring_lock);
+		if (likely(!percpu_ref_is_dying(&ctx->refs)))
+			ret = io_submit_sqes(ctx, to_submit);
+		mutex_unlock(&ctx->uring_lock);
+	}
 
 	if (!io_sqring_full(ctx) && wq_has_sleeper(&ctx->sqo_sq_wait))
 		wake_up(&ctx->sqo_sq_wait);
@@ -6814,8 +6819,17 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
 	return 0;
 }
 
-static void io_put_sq_data(struct io_sq_data *sqd)
+static void io_put_sq_data(struct io_ring_ctx *ctx, struct io_sq_data *sqd)
 {
+	int percpu_sqd = 0;
+
+	if ((ctx->flags & IORING_SETUP_SQ_AFF) &&
+	    (ctx->flags & IORING_SETUP_SQPOLL_PERCPU))
+		percpu_sqd = 1;
+
+	if (percpu_sqd)
+		mutex_lock(&sqd->percpu_sq_lock);
+
 	if (refcount_dec_and_test(&sqd->refs)) {
 		/*
 		 * The park is a bit of a work-around, without it we get
@@ -6827,8 +6841,14 @@ static void io_put_sq_data(struct io_sq_data *sqd)
 			kthread_stop(sqd->thread);
 		}
 
-		kfree(sqd);
+		if (!percpu_sqd)
+			kfree(sqd);
+		else
+			sqd->thread = NULL;
 	}
+
+	if (percpu_sqd)
+		mutex_unlock(&sqd->percpu_sq_lock);
 }
 
 static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
@@ -6857,13 +6877,10 @@ static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
 	return sqd;
 }
 
-static struct io_sq_data *io_get_sq_data(struct io_uring_params *p)
+static struct io_sq_data *io_alloc_sq_data(struct io_uring_params *p)
 {
 	struct io_sq_data *sqd;
 
-	if (p->flags & IORING_SETUP_ATTACH_WQ)
-		return io_attach_sq_data(p);
-
 	sqd = kzalloc(sizeof(*sqd), GFP_KERNEL);
 	if (!sqd)
 		return ERR_PTR(-ENOMEM);
@@ -6873,7 +6890,9 @@ static struct io_sq_data *io_get_sq_data(struct io_uring_params *p)
 	INIT_LIST_HEAD(&sqd->ctx_new_list);
 	mutex_init(&sqd->ctx_lock);
 	mutex_init(&sqd->lock);
+	mutex_init(&sqd->percpu_sq_lock);
 	init_waitqueue_head(&sqd->wait);
+
 	return sqd;
 }
 
@@ -6895,6 +6914,43 @@ static void io_sq_thread_park(struct io_sq_data *sqd)
 	kthread_park(sqd->thread);
 }
 
+static void io_attach_ctx_to_sqd(struct io_sq_data *sqd, struct io_ring_ctx *ctx)
+{
+	ctx->sq_data = sqd;
+	io_sq_thread_park(sqd);
+	mutex_lock(&sqd->ctx_lock);
+	list_add(&ctx->sqd_list, &sqd->ctx_new_list);
+	mutex_unlock(&sqd->ctx_lock);
+	io_sq_thread_unpark(sqd);
+}
+
+static struct io_sq_data *io_find_or_create_percpu_sq_thread(struct io_ring_ctx *ctx,
+					struct io_uring_params *p)
+{
+	struct io_sq_data *sqd;
+	struct task_struct *tsk;
+	int cpu = p->sq_thread_cpu;
+
+	sqd = per_cpu_ptr(percpu_sqd, cpu);
+	mutex_lock(&sqd->percpu_sq_lock);
+	if (sqd->thread) {
+		refcount_inc(&sqd->refs);
+		mutex_unlock(&sqd->percpu_sq_lock);
+		return sqd;
+	}
+
+	tsk = kthread_create_on_cpu(io_sq_thread, sqd, cpu, "io_uring-sq");
+	if (IS_ERR(tsk)) {
+		sqd = ERR_PTR(PTR_ERR(tsk));
+		goto out;
+	}
+	sqd->thread = tsk;
+	refcount_set(&sqd->refs, 1);
+out:
+	mutex_unlock(&sqd->percpu_sq_lock);
+	return sqd;
+}
+
 static void io_sq_thread_stop(struct io_ring_ctx *ctx)
 {
 	struct io_sq_data *sqd = ctx->sq_data;
@@ -6921,7 +6977,7 @@ static void io_sq_thread_stop(struct io_ring_ctx *ctx)
 		if (sqd->thread)
 			io_sq_thread_unpark(sqd);
 
-		io_put_sq_data(sqd);
+		io_put_sq_data(ctx, sqd);
 		ctx->sq_data = NULL;
 	}
 }
@@ -7578,25 +7634,29 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 		if (!capable(CAP_SYS_ADMIN))
 			goto err;
 
-		sqd = io_get_sq_data(p);
-		if (IS_ERR(sqd)) {
-			ret = PTR_ERR(sqd);
-			goto err;
-		}
-
-		ctx->sq_data = sqd;
-		io_sq_thread_park(sqd);
-		mutex_lock(&sqd->ctx_lock);
-		list_add(&ctx->sqd_list, &sqd->ctx_new_list);
-		mutex_unlock(&sqd->ctx_lock);
-		io_sq_thread_unpark(sqd);
-
 		ctx->sq_thread_idle = msecs_to_jiffies(p->sq_thread_idle);
 		if (!ctx->sq_thread_idle)
 			ctx->sq_thread_idle = HZ;
 
-		if (sqd->thread)
-			goto done;
+		if (p->flags & IORING_SETUP_ATTACH_WQ) {
+			sqd = io_attach_sq_data(p);
+			if (IS_ERR(sqd)) {
+				ret = PTR_ERR(sqd);
+				goto err;
+			}
+			io_attach_ctx_to_sqd(sqd, ctx);
+			if (sqd->thread)
+				goto done;
+		}
+
+		if (!(p->flags & IORING_SETUP_SQ_AFF) ||
+		    !(p->flags & IORING_SETUP_SQPOLL_PERCPU)) {
+			sqd = io_alloc_sq_data(p);
+			if (IS_ERR(sqd)) {
+				ret = PTR_ERR(sqd);
+				goto err;
+			}
+		}
 
 		if (p->flags & IORING_SETUP_SQ_AFF) {
 			int cpu = p->sq_thread_cpu;
@@ -7607,7 +7667,14 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 			if (!cpu_online(cpu))
 				goto err;
 
-			sqd->thread = kthread_create_on_cpu(io_sq_thread, sqd,
+			if (p->flags & IORING_SETUP_SQPOLL_PERCPU) {
+				sqd = io_find_or_create_percpu_sq_thread(ctx, p);
+				if (IS_ERR(sqd)) {
+					ret = PTR_ERR(sqd);
+					goto err;
+				}
+			} else
+				sqd->thread = kthread_create_on_cpu(io_sq_thread, sqd,
 							cpu, "io_uring-sq");
 		} else {
 			sqd->thread = kthread_create(io_sq_thread, sqd,
@@ -7618,6 +7685,8 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 			sqd->thread = NULL;
 			goto err;
 		}
+		io_attach_ctx_to_sqd(sqd, ctx);
+
 		ret = io_uring_alloc_task_context(sqd->thread);
 		if (ret)
 			goto err;
@@ -9157,7 +9226,7 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 	if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
 			IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
 			IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ |
-			IORING_SETUP_R_DISABLED))
+			IORING_SETUP_R_DISABLED | IORING_SETUP_SQPOLL_PERCPU))
 		return -EINVAL;
 
 	return  io_uring_create(entries, &p, params);
@@ -9500,6 +9569,8 @@ SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
 
 static int __init io_uring_init(void)
 {
+	int cpu;
+
 #define __BUILD_BUG_VERIFY_ELEMENT(stype, eoffset, etype, ename) do { \
 	BUILD_BUG_ON(offsetof(stype, ename) != eoffset); \
 	BUILD_BUG_ON(sizeof(etype) != sizeof_field(stype, ename)); \
@@ -9540,6 +9611,21 @@ static int __init io_uring_init(void)
 	BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST);
 	BUILD_BUG_ON(__REQ_F_LAST_BIT >= 8 * sizeof(int));
 	req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC);
+
+	percpu_sqd = alloc_percpu(struct io_sq_data);
+
+	for_each_possible_cpu(cpu) {
+		struct io_sq_data *sqd;
+
+		sqd = per_cpu_ptr(percpu_sqd, cpu);
+		INIT_LIST_HEAD(&sqd->ctx_list);
+		INIT_LIST_HEAD(&sqd->ctx_new_list);
+		mutex_init(&sqd->ctx_lock);
+		mutex_init(&sqd->lock);
+		mutex_init(&sqd->percpu_sq_lock);
+		init_waitqueue_head(&sqd->wait);
+
+	}
 	return 0;
 };
 __initcall(io_uring_init);
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 98d8e06dea22..a162b077a453 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -96,6 +96,7 @@ enum {
 #define IORING_SETUP_CLAMP	(1U << 4)	/* clamp SQ/CQ ring sizes */
 #define IORING_SETUP_ATTACH_WQ	(1U << 5)	/* attach to existing wq */
 #define IORING_SETUP_R_DISABLED	(1U << 6)	/* start with ring disabled */
+#define IORING_SETUP_SQPOLL_PERCPU	(1U << 7)	/* use percpu SQ poll thread */
 
 enum {
 	IORING_OP_NOP,
-- 
2.17.2



* Re: [PATCH 0/2] improve SQPOLL handling
  2020-10-20  8:23 [PATCH 0/2] improve SQPOLL handling Xiaoguang Wang
  2020-10-20  8:23 ` [PATCH 1/2] io_uring: refactor io_sq_thread() handling Xiaoguang Wang
  2020-10-20  8:23 ` [PATCH 2/2] io_uring: support multiple rings to share same poll thread by specifying same cpu Xiaoguang Wang
@ 2020-10-27 13:34 ` Xiaoguang Wang
  2020-10-27 13:48   ` Jens Axboe
  2 siblings, 1 reply; 9+ messages in thread
From: Xiaoguang Wang @ 2020-10-27 13:34 UTC
  To: io-uring; +Cc: axboe, asml.silence, joseph.qi

hello,

This is a gentle ping.

Regards,
Xiaoguang Wang

> The first patch addresses several issues in the current implementation:
>    The prepare_to_wait() usage in __io_sq_thread() is odd. If multiple ctxs
> share the same poll thread, one ctx can put the poll thread into
> TASK_INTERRUPTIBLE even while other ctxs still have work to do, in which case
> the task's state should not be changed at all. We should only do this once
> none of the ctxs has work to do.
>    We use a round-robin strategy to make multiple ctxs share the same poll
> thread, but __io_sq_thread() contains various conditions that make the code
> complicated and may break the round-robin behaviour.
>    In io_sq_thread(), we always call io_sq_thread_drop_mm() when we complete a
> round of ctx iteration, which may be inefficient.
> 
> The second patch adds an IORING_SETUP_SQPOLL_PERCPU flag: rings that have
> SQPOLL enabled and are willing to be bound to the same cpu can share one poll
> thread by specifying this new flag. The fio tool can integrate this feature
> easily, which makes it simple to test multiple rings sharing the same poll
> thread.
> 
> Xiaoguang Wang (2):
>    io_uring: refactor io_sq_thread() handling
>    io_uring: support multiple rings to share same poll thread by
>      specifying same cpu
> 
>   fs/io_uring.c                 | 307 ++++++++++++++++++++--------------
>   include/uapi/linux/io_uring.h |   1 +
>   2 files changed, 181 insertions(+), 127 deletions(-)
> 


* Re: [PATCH 0/2] improve SQPOLL handling
  2020-10-27 13:34 ` [PATCH 0/2] improve SQPOLL handling Xiaoguang Wang
@ 2020-10-27 13:48   ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-27 13:48 UTC
  To: Xiaoguang Wang, io-uring; +Cc: asml.silence, joseph.qi

On 10/27/20 7:34 AM, Xiaoguang Wang wrote:
> hello,
> 
> This is a gentle ping.

I'll take a look soon, with the aim being getting this merged for 5.11.

-- 
Jens Axboe



* Re: [PATCH 1/2] io_uring: refactor io_sq_thread() handling
  2020-10-20  8:23 ` [PATCH 1/2] io_uring: refactor io_sq_thread() handling Xiaoguang Wang
@ 2020-10-27 16:56   ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-27 16:56 UTC
  To: Xiaoguang Wang, io-uring; +Cc: asml.silence, joseph.qi

On 10/20/20 2:23 AM, Xiaoguang Wang wrote:
> @@ -6638,24 +6578,52 @@ static int io_sq_thread(void *data)
>  			}
>  			io_sq_thread_associate_blkcg(ctx, &cur_css);
>  
> -			ret |= __io_sq_thread(ctx, start_jiffies, cap_entries);
> +			ret = __io_sq_thread(ctx, cap_entries);
> +			if (!sqt_spin && (ret > 0 || !list_empty(&ctx->iopoll_list)))
> +				sqt_spin = true;
> +		}
>  
> -			io_sq_thread_drop_mm();
> +		if (sqt_spin) {
> +			io_run_task_work();
> +			cond_resched();
> +			continue;
>  		}
>  
> -		if (ret & SQT_SPIN) {
> +		if (!time_after(jiffies, timeout)) {
>  			io_run_task_work();
>  			cond_resched();
> -		} else if (ret == SQT_IDLE) {
> -			if (kthread_should_park())
> -				continue;
> +			io_sq_thread_drop_mm();
> +			continue;
> +		}
> +
> +		needs_sched = true;
> +		prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
> +		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> +			if ((ctx->flags & IORING_SETUP_IOPOLL) &&
> +			    !list_empty_careful(&ctx->iopoll_list)) {
> +				needs_sched = false;
> +				break;
> +			}
> +			to_submit = io_sqring_entries(ctx);
> +			if (to_submit) {
> +				needs_sched = false;
> +				break;
> +			}
> +		}
> +
> +		if (needs_sched) {
>  			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
>  				io_ring_set_wakeup_flag(ctx);
> +		}
> +
> +		if (needs_sched) {
>  			schedule();
> -			start_jiffies = jiffies;
>  			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
>  				io_ring_clear_wakeup_flag(ctx);
> -		}
> +			finish_wait(&sqd->wait, &wait);
> +		} else
> +			finish_wait(&sqd->wait, &wait);
> +		timeout = jiffies + sqd->sq_thread_idle;
>  	}

Not sure why, but you have two

if (needs_sched) {
}

right after each other, that should be folded into one.

And I'd get rid of the to_submit, just do:

if (io_sqring_entries(ctx)) {
	needs_sched = false;
	break;
}

as we're not going to be using that value anyway.

As a minor style thing, always include brackets on both branches if one
has them; don't do this:

if (foo) {
	...
	...
} else
	bar();


-- 
Jens Axboe



* Re: [PATCH 2/2] io_uring: support multiple rings to share same poll thread by specifying same cpu
  2020-10-20  8:23 ` [PATCH 2/2] io_uring: support multiple rings to share same poll thread by specifying same cpu Xiaoguang Wang
@ 2020-10-27 17:01   ` Jens Axboe
  2020-11-01 14:22     ` Xiaoguang Wang
  2020-11-02 12:00     ` Xiaoguang Wang
  0 siblings, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-27 17:01 UTC
  To: Xiaoguang Wang, io-uring; +Cc: asml.silence, joseph.qi

On 10/20/20 2:23 AM, Xiaoguang Wang wrote:
> Multiple rings can already share the same poll thread by passing
> IORING_SETUP_ATTACH_WQ, but it's not that convenient to use.
> IORING_SETUP_ATTACH_WQ requires users to ensure that a parent ring
> instance already exists, which means the app must regulate the
> creation order between uring instances.
> 
> We can make this a bit simpler: for rings that will have SQPOLL
> enabled and are willing to be bound to the same cpu, add the
> capability to share one poll thread by specifying a new
> IORING_SETUP_SQPOLL_PERCPU flag. That gives three cases:
>   1, IORING_SETUP_ATTACH_WQ: if the user specifies this flag, we'll
> always try to attach this ring to an existing ring's poll thread,
> no matter whether IORING_SETUP_SQ_AFF or IORING_SETUP_SQPOLL_PERCPU
> is set.
>   2, IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
> enabled: in this case, we'll create a single poll thread to be shared
> by these rings, and this poll thread is bound to a fixed cpu.
>   3, in any other case, we'll just create one new poll thread for the
> corresponding ring.
> 
> For case 2, there is no need to regulate the creation order of
> multiple uring instances; a mutex synchronizes creation. For example,
> given five rings that all have IORING_SETUP_SQ_AFF &
> IORING_SETUP_SQPOLL_PERCPU enabled and are willing to be bound to the
> same cpu, the ring that gets the mutex first will create the poll
> thread, and the other four rings will attach themselves to the
> previously created poll thread once they acquire the lock.
> 
> To implement this, define a percpu io_sq_data array:
>     static struct io_sq_data __percpu *percpu_sqd;
> When IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
> enabled, struct io_uring_params' sq_thread_cpu is used to locate the
> corresponding sqd, and that sqd stores the poll thread info.

Do you have any test results?

Not quite clear to me, but if IORING_SETUP_SQPOLL_PERCPU is set, I think
it should always imply IORING_SETUP_ATTACH_WQ in the sense that it would
not make sense to have more than one poller thread that's bound to a
single CPU, for example.

> @@ -6814,8 +6819,17 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
>  	return 0;
>  }
>  
> -static void io_put_sq_data(struct io_sq_data *sqd)
> +static void io_put_sq_data(struct io_ring_ctx *ctx, struct io_sq_data *sqd)
>  {
> +	int percpu_sqd = 0;
> +
> +	if ((ctx->flags & IORING_SETUP_SQ_AFF) &&
> +	    (ctx->flags & IORING_SETUP_SQPOLL_PERCPU))
> +		percpu_sqd = 1;
> +
> +	if (percpu_sqd)
> +		mutex_lock(&sqd->percpu_sq_lock);
> +
>  	if (refcount_dec_and_test(&sqd->refs)) {
>  		/*
>  		 * The park is a bit of a work-around, without it we get

For this, and the setup, you should make it dynamic. Hence don't
allocate the percpu data etc until someone asks for it, and when the
last user of it goes away, it should go away as well.

That would make the handling of it identical to what we currently have,
and no need to special case any of this like you do above.


-- 
Jens Axboe



* Re: [PATCH 2/2] io_uring: support multiple rings to share same poll thread by specifying same cpu
  2020-10-27 17:01   ` Jens Axboe
@ 2020-11-01 14:22     ` Xiaoguang Wang
  2020-11-02 12:00     ` Xiaoguang Wang
  1 sibling, 0 replies; 9+ messages in thread
From: Xiaoguang Wang @ 2020-11-01 14:22 UTC
  To: Jens Axboe, io-uring; +Cc: asml.silence, joseph.qi

hi Jens,

> On 10/20/20 2:23 AM, Xiaoguang Wang wrote:
>> Multiple rings can already share the same poll thread by passing
>> IORING_SETUP_ATTACH_WQ, but it's not that convenient to use.
>> IORING_SETUP_ATTACH_WQ requires users to ensure that a parent ring
>> instance already exists, which means the app must regulate the
>> creation order between uring instances.
>>
>> We can make this a bit simpler: for rings that will have SQPOLL
>> enabled and are willing to be bound to the same cpu, add the
>> capability to share one poll thread by specifying a new
>> IORING_SETUP_SQPOLL_PERCPU flag. That gives three cases:
>>    1, IORING_SETUP_ATTACH_WQ: if the user specifies this flag, we'll
>> always try to attach this ring to an existing ring's poll thread,
>> no matter whether IORING_SETUP_SQ_AFF or IORING_SETUP_SQPOLL_PERCPU
>> is set.
>>    2, IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
>> enabled: in this case, we'll create a single poll thread to be shared
>> by these rings, and this poll thread is bound to a fixed cpu.
>>    3, in any other case, we'll just create one new poll thread for the
>> corresponding ring.
>>
>> For case 2, there is no need to regulate the creation order of
>> multiple uring instances; a mutex synchronizes creation. For example,
>> given five rings that all have IORING_SETUP_SQ_AFF &
>> IORING_SETUP_SQPOLL_PERCPU enabled and are willing to be bound to the
>> same cpu, the ring that gets the mutex first will create the poll
>> thread, and the other four rings will attach themselves to the
>> previously created poll thread once they acquire the lock.
>>
>> To implement this, define a percpu io_sq_data array:
>>      static struct io_sq_data __percpu *percpu_sqd;
>> When IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
>> enabled, struct io_uring_params' sq_thread_cpu is used to locate the
>> corresponding sqd, and that sqd stores the poll thread info.
> 
> Do you have any test results?
First, sorry for the late reply.
Yes, and I'll attach them in the next version of the patch.

> 
> Not quite clear to me, but if IORING_SETUP_SQPOLL_PERCPU is set, I think
> it should always imply IORING_SETUP_ATTACH_WQ in the sense that it would
> not make sense to have more than one poller thread that's bound to a
> single CPU, for example.
Yes, agreed. In this case it makes sense for IORING_SETUP_SQPOLL_PERCPU
to imply IORING_SETUP_ATTACH_WQ, but that requires the app to already
have a sqpoll io_uring instance bound to the same cpu. I think it
doesn't matter much: the app can pass both IORING_SETUP_SQPOLL_PERCPU
and IORING_SETUP_ATTACH_WQ, and then we will try to attach to an
existing sqpoll io_uring instance and ignore IORING_SETUP_SQPOLL_PERCPU.

The key benefit of IORING_SETUP_SQPOLL_PERCPU is that it does not require
the app to regulate the creation order between uring instances.
IORING_SETUP_ATTACH_WQ needs the app to create a sqpoll io_uring instance
first and have the following io_uring instances attach to it, which is not
that convenient: it requires passing an fd between threads in a process,
and it is also inconvenient if we want to share one poll thread between
processes.

And I have another question about the feature that makes multiple ctxs
share one poll thread: do these ctxs need to belong to one single process,
or can ctxs belonging to different processes share one poll thread?
   The current mainline code seems to allow it: say a parent process
   creates a sqpoll io_uring instance, and this parent process forks a
   child process. The child process inherits the parent process' files,
   so it may create a new sqpoll io_uring instance and attach it to the
   previously created io_uring instance, and then parent and child share
   one poll thread.
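
A minimal sketch of that mainline scenario, using the existing wq_fd
attach interface (error handling elided):

#include <linux/io_uring.h>
#include <sys/syscall.h>
#include <unistd.h>

static void parent_child_share(void)
{
        struct io_uring_params pp = { .flags = IORING_SETUP_SQPOLL };
        int parent_fd = syscall(__NR_io_uring_setup, 128, &pp);

        if (fork() == 0) {
                /* the child inherits parent_fd across fork() and
                 * attaches to the parent ring's poll thread */
                struct io_uring_params cp = {
                        .flags = IORING_SETUP_SQPOLL |
                                 IORING_SETUP_ATTACH_WQ,
                        .wq_fd = parent_fd,
                };
                int child_fd = syscall(__NR_io_uring_setup, 128, &cp);
                (void)child_fd;
        }
}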

Also in current mainline codes:
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
	if (current->cred != ctx->creds) {
		if (old_cred)
			revert_creds(old_cred);
		old_cred = override_creds(ctx->creds);
	}
	io_sq_thread_associate_blkcg(ctx, &cur_css);
#ifdef CONFIG_AUDIT
	current->loginuid = ctx->loginuid;
	current->sessionid = ctx->sessionid;
#endif

	ret |= __io_sq_thread(ctx, start_jiffies, cap_entries);

	io_sq_thread_drop_mm_files();
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here you always drop mm, files and nsproxy after finishing each ctx. Is
that because the design allows ctxs belonging to different processes to
share one poll thread, so you switch each ctx's mm, files and nsproxy
when handling it?

}

Regards,
Xiaoguang Wang


> 
>> @@ -6814,8 +6819,17 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
>>   	return 0;
>>   }
>>   
>> -static void io_put_sq_data(struct io_sq_data *sqd)
>> +static void io_put_sq_data(struct io_ring_ctx *ctx, struct io_sq_data *sqd)
>>   {
>> +	int percpu_sqd = 0;
>> +
>> +	if ((ctx->flags & IORING_SETUP_SQ_AFF) &&
>> +	    (ctx->flags & IORING_SETUP_SQPOLL_PERCPU))
>> +		percpu_sqd = 1;
>> +
>> +	if (percpu_sqd)
>> +		mutex_lock(&sqd->percpu_sq_lock);
>> +
>>   	if (refcount_dec_and_test(&sqd->refs)) {
>>   		/*
>>   		 * The park is a bit of a work-around, without it we get
> 
> For this, and the setup, you should make it dynamic. Hence don't
> allocate the percpu data etc until someone asks for it, and when the
> last user of it goes away, it should go away as well.
> 
> That would make the handling of it identical to what we currently have,
> and no need to special case any of this like you do above.
> 
> 


* Re: [PATCH 2/2] io_uring: support multiple rings to share same poll thread by specifying same cpu
  2020-10-27 17:01   ` Jens Axboe
  2020-11-01 14:22     ` Xiaoguang Wang
@ 2020-11-02 12:00     ` Xiaoguang Wang
  1 sibling, 0 replies; 9+ messages in thread
From: Xiaoguang Wang @ 2020-11-02 12:00 UTC
  To: Jens Axboe, io-uring; +Cc: asml.silence, joseph.qi

hi,

> On 10/20/20 2:23 AM, Xiaoguang Wang wrote:
>> Multiple rings can already share the same poll thread by passing
>> IORING_SETUP_ATTACH_WQ, but it's not that convenient to use.
>> IORING_SETUP_ATTACH_WQ requires users to ensure that a parent ring
>> instance already exists, which means the app must regulate the
>> creation order between uring instances.
>>
>> We can make this a bit simpler: for rings that will have SQPOLL
>> enabled and are willing to be bound to the same cpu, add the
>> capability to share one poll thread by specifying a new
>> IORING_SETUP_SQPOLL_PERCPU flag. That gives three cases:
>>    1, IORING_SETUP_ATTACH_WQ: if the user specifies this flag, we'll
>> always try to attach this ring to an existing ring's poll thread,
>> no matter whether IORING_SETUP_SQ_AFF or IORING_SETUP_SQPOLL_PERCPU
>> is set.
>>    2, IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
>> enabled: in this case, we'll create a single poll thread to be shared
>> by these rings, and this poll thread is bound to a fixed cpu.
>>    3, in any other case, we'll just create one new poll thread for the
>> corresponding ring.
>>
>> For case 2, there is no need to regulate the creation order of
>> multiple uring instances; a mutex synchronizes creation. For example,
>> given five rings that all have IORING_SETUP_SQ_AFF &
>> IORING_SETUP_SQPOLL_PERCPU enabled and are willing to be bound to the
>> same cpu, the ring that gets the mutex first will create the poll
>> thread, and the other four rings will attach themselves to the
>> previously created poll thread once they acquire the lock.
>>
>> To implement this, define a percpu io_sq_data array:
>>      static struct io_sq_data __percpu *percpu_sqd;
>> When IORING_SETUP_SQ_AFF and IORING_SETUP_SQPOLL_PERCPU are both
>> enabled, struct io_uring_params' sq_thread_cpu is used to locate the
>> corresponding sqd, and that sqd stores the poll thread info.
> 
> Do you have any test results?
> 
> Not quite clear to me, but if IORING_SETUP_SQPOLL_PERCPU is set, I think
> it should always imply IORING_SETUP_ATTACH_WQ in the sense that it would
> not make sense to have more than one poller thread that's bound to a
> single CPU, for example.
> 
>> @@ -6814,8 +6819,17 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
>>   	return 0;
>>   }
>>   
>> -static void io_put_sq_data(struct io_sq_data *sqd)
>> +static void io_put_sq_data(struct io_ring_ctx *ctx, struct io_sq_data *sqd)
>>   {
>> +	int percpu_sqd = 0;
>> +
>> +	if ((ctx->flags & IORING_SETUP_SQ_AFF) &&
>> +	    (ctx->flags & IORING_SETUP_SQPOLL_PERCPU))
>> +		percpu_sqd = 1;
>> +
>> +	if (percpu_sqd)
>> +		mutex_lock(&sqd->percpu_sq_lock);
>> +
>>   	if (refcount_dec_and_test(&sqd->refs)) {
>>   		/*
>>   		 * The park is a bit of a work-around, without it we get
> 
> For this, and the setup, you should make it dynamic. Hence don't
> allocate the percpu data etc until someone asks for it, and when the
> last user of it goes away, it should go away as well.
> 
> That would make the handling of it identical to what we currently have,
> and no need to special case any of this like you do above.
I had thought I could improve the code according to your suggestions, but
it seems that I can't. For the percpu sqd, I need a global mutex to
serialize io_uring instance creation. Once a task gets this lock, if
there isn't an sqd bound to the specified cpu, it allocates the sqd and
creates the poll thread, and both operations need to be atomic, so the
percpu sqd has to be handled a little specially.
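
In other words, stripped of error handling, the create-or-attach has to
be one critical section, roughly:

sqd = per_cpu_ptr(percpu_sqd, cpu);
mutex_lock(&sqd->percpu_sq_lock);
if (sqd->thread) {
        /* a poller already exists for this cpu: just attach */
        refcount_inc(&sqd->refs);
} else {
        /* lookup, thread creation and refcount init must all happen
         * atomically under the same lock */
        sqd->thread = kthread_create_on_cpu(io_sq_thread, sqd, cpu,
                                            "io_uring-sq");
        refcount_set(&sqd->refs, 1);
}
mutex_unlock(&sqd->percpu_sq_lock);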

Regards,
Xiaoguang Wang


> 
> 

