io-uring.vger.kernel.org archive mirror
From: Hao Xu <haoxu@linux.alibaba.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: io-uring <io-uring@vger.kernel.org>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Joseph Qi <joseph.qi@linux.alibaba.com>
Subject: Re: [PATCH 2/3] io-wq: fix no lock protection of acct->nr_worker
Date: Sat, 7 Aug 2021 17:56:23 +0800
Message-ID: <1f795e93-c137-439e-b02c-b460cb38bb14@linux.alibaba.com>
In-Reply-To: <cc9e61da-6591-c257-6899-d2afa037b2ad@kernel.dk>

On 2021/8/6 10:27 PM, Jens Axboe wrote:
> On Thu, Aug 5, 2021 at 4:05 AM Hao Xu <haoxu@linux.alibaba.com> wrote:
>>
>> acct->nr_worker is accessed without lock protection. Consider this
>> case: two callers invoke io_wqe_wake_worker() in parallel on two CPUs,
>> one from the original context and the other from an io-worker (by
>> calling io_wqe_enqueue(wqe, linked)). This may cause nr_worker to
>> become larger than max_worker.
>> Fix it by taking the lock around the check, and do nr_workers++ before
>> create_io_worker(). There may be an edge case where the first caller
>> fails to create an io-worker, but the second caller doesn't know that
>> and gives up creating an io-worker as well:
>>
>> say nr_worker = max_worker - 1
>>          cpu 0                        cpu 1
>>     io_wqe_wake_worker()          io_wqe_wake_worker()
>>        nr_worker < max_worker
>>        nr_worker++
>>        create_io_worker()         nr_worker == max_worker
>>           failed                  return
>>        return
>>
>> But the chance of this case is very slim.
>>
>> Fixes: 685fe7feedb9 ("io-wq: eliminate the need for a manager thread")
>> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
>> ---
>>   fs/io-wq.c | 17 ++++++++++++-----
>>   1 file changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>> index cd4fd4d6268f..88d0ba7be1fb 100644
>> --- a/fs/io-wq.c
>> +++ b/fs/io-wq.c
>> @@ -247,9 +247,14 @@ static void io_wqe_wake_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
>>          ret = io_wqe_activate_free_worker(wqe);
>>          rcu_read_unlock();
>>
>> -       if (!ret && acct->nr_workers < acct->max_workers) {
>> -               atomic_inc(&acct->nr_running);
>> -               atomic_inc(&wqe->wq->worker_refs);
>> +       if (!ret) {
>> +               raw_spin_lock_irq(&wqe->lock);
>> +               if (acct->nr_workers < acct->max_workers) {
>> +                       atomic_inc(&acct->nr_running);
>> +                       atomic_inc(&wqe->wq->worker_refs);
>> +                       acct->nr_workers++;
>> +               }
>> +               raw_spin_unlock_irq(&wqe->lock);
>>                  create_io_worker(wqe->wq, wqe, acct->index);
>>          }
>>   }
> 
> There's a pretty grave bug in this patch, in that you now call
> create_io_worker() unconditionally. This causes obvious problems with
> misaccounting, and stalls that hit the idle timeout...
> 
This is surely a silly mistake. I'll check this patch and 3/3 again.
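
For illustration only, here is a minimal userspace sketch of the pattern the review comment points at: reserve a worker slot under the lock, and call the creation routine only if the reservation succeeded. It is not the fix that was posted or merged. `struct acct`, `wake_worker()` and `create_worker()` are hypothetical stand-ins for `struct io_wqe_acct`, `io_wqe_wake_worker()` and `create_io_worker()`, a pthread mutex stands in for `wqe->lock`, and the rollback on creation failure is an assumption about how the edge case described in the commit message could be handled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct io_wqe_acct (not the kernel type). */
struct acct {
	pthread_mutex_t lock;	/* plays the role of wqe->lock */
	int nr_workers;
	int max_workers;
};

/* Hypothetical stand-in for create_io_worker(); 'fail' simulates an error. */
static bool create_worker(bool fail)
{
	return !fail;
}

/* Returns true only if a new worker was actually created. */
static bool wake_worker(struct acct *acct, bool fail_create)
{
	bool do_create = false;

	/* Check and increment atomically so two callers cannot both pass. */
	pthread_mutex_lock(&acct->lock);
	if (acct->nr_workers < acct->max_workers) {
		acct->nr_workers++;	/* reserve the slot before creating */
		do_create = true;
	}
	pthread_mutex_unlock(&acct->lock);

	/* The point of the review: do not create when no slot was reserved. */
	if (!do_create)
		return false;

	if (create_worker(fail_create))
		return true;

	/* Assumed handling: give the slot back if creation failed. */
	pthread_mutex_lock(&acct->lock);
	acct->nr_workers--;
	pthread_mutex_unlock(&acct->lock);
	return false;
}
```

With `nr_workers = 3` and `max_workers = 4`, the first `wake_worker()` call takes the last slot, and a second call returns without creating anything, so the count never exceeds the limit. A failed creation leaves the count at 3 again.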




Thread overview: 13+ messages
2021-08-05 10:05 [PATCH 0/3] code clean and nr_worker fixes Hao Xu
2021-08-05 10:05 ` [PATCH 1/3] io-wq: clean code of task state setting Hao Xu
2021-08-05 14:23   ` Jens Axboe
2021-08-05 17:37     ` Hao Xu
2021-08-05 10:05 ` [PATCH 2/3] io-wq: fix no lock protection of acct->nr_worker Hao Xu
2021-08-06 14:27   ` Jens Axboe
2021-08-07  9:56     ` Hao Xu [this message]
2021-08-07 13:51       ` Jens Axboe
2021-08-09 20:19         ` Olivier Langlois
2021-08-09 20:34           ` Pavel Begunkov
2021-08-09 20:35           ` Jens Axboe
2021-08-05 10:05 ` [PATCH 3/3] io-wq: fix lack of acct->nr_workers < acct->max_workers judgement Hao Xu
2021-08-05 14:58 ` [PATCH 0/3] code clean and nr_worker fixes Jens Axboe
