All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: io-uring <io-uring@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [GIT PULL] io_uring changes for 5.9-rc1
Date: Mon, 3 Aug 2020 17:56:01 -0600	[thread overview]
Message-ID: <af6b61c1-e98e-f312-3550-deb7972751a9@kernel.dk> (raw)
In-Reply-To: <CAHk-=whZYCK2eNEcTvKWgBvoSL8YLT6G0dexVkFbDiVCLN3zBQ@mail.gmail.com>

On 8/3/20 5:49 PM, Linus Torvalds wrote:
> On Mon, Aug 3, 2020 at 4:31 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Updated to honor exclusive return value as well:
> 
> See my previous email, You're just adding code that makes no sense,
> because your wait entry fundamentally isn't an exclusive one.

Right, I get that now, it's just dead code for my use case. It was sent
out before your previous email.

> So all that code is a no-op and only makes it more confusing to read.
> 
> Your wakeup handler has _nothing_ to do with the generic
> wake_page_function(). There is _zero_ overlap. Your wakeup handler
> gets called only for the wait entries _you_ created.
> 
> Trying to use the wakeup logic from wake_page_function() makes no
> sense, because the rules for wake_page_function() are entirely
> different. Yes, they are called for the same thing (somebody unlocked
> a page and is waking up waiters), but it's using a completely
> different sleeping logic.
> 
> See? When wake_page_function() does that
> 
>         wait->flags |= WQ_FLAG_WOKEN;
> 
> and does something different (and returns different values) depending
> on whether WQ_FLAG_EXCLUSIVE was set, that is all because
> wait_on_page_bit_common() entry set yo that wait entry (on its stack)
> with those exact rules in mind.
> 
> So the wakeup function is 1:1 tied to the code that registers the wait
> entry. wait_on_page_bit_common() has one set of rules, that are then
> honored by the wakeup function it uses. But those rules have _zero_
> impact on your use. You can have - and you *do* have - different sets
> of rules.
> 
> For example, none of your wakeups are ever exclusive. All you do is
> make a work runnable - that doesn't mean that other people shouldn't
> do other things when they get a "page was unlocked" wakeup
> notification.
> 
> Also, for you "list_del_init()" is fine, because you never do the
> unlocked "list_empty_careful()" on that wait entry.  All the waitqueue
> operations run under the queue head lock.
> 
> So what I think you _should_ do is just something like this:
> 
>     diff --git a/fs/io_uring.c b/fs/io_uring.c
>     index 2a3af95be4ca..1e243f99643b 100644
>     --- a/fs/io_uring.c
>     +++ b/fs/io_uring.c
>     @@ -2965,10 +2965,10 @@ static int io_async_buf_func(struct
> wait_queue_entry *wait, unsigned mode,
>             if (!wake_page_match(wpq, key))
>                     return 0;
> 
>     -       /* Stop waking things up if the page is locked again */
>     -       if (test_bit(key->bit_nr, &key->page->flags))
>     -              return -1;
>     -
>     +       /*
>     +        * Somebody unlocked the page. Unqueue the wait entry
>     +        * and run the task_work
>     +        */
>              list_del_init(&wait->entry);
> 
>              init_task_work(&req->task_work, io_req_task_submit);
> 
> because that matches what you're actually doing.
> 
> There's no reason to stop waking up others because the page is locked,
> because you don't know what others want.
> 
> And there's never any reason for the exclusive thing, b3ecause none of
> what you do guarantees that you take exclusive ownership of the page
> lock. Running the work *may* end up doing a "lock_page()", but you
> don't actually guarantee that.

What I ended up with after the last email was just removing the test
bit:

https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.9&id=cbd287c09351f1d3a4b3cb9167a2616a11390d32

and I clarified the comments on the io_async_buf_func() to add more
hints on how everything is triggered instead of just a vague "handler"
reference:

https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.9&id=c1dd91d16246b168b80af9b64c5cc35a66410455

-- 
Jens Axboe


  reply	other threads:[~2020-08-03 23:56 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-02 21:41 [GIT PULL] io_uring changes for 5.9-rc1 Jens Axboe
2020-08-03 20:48 ` Linus Torvalds
2020-08-03 20:53   ` Linus Torvalds
2020-08-03 21:10     ` Konstantin Ryabitsev
2020-08-03 22:31       ` Jens Axboe
2020-08-03 22:30   ` Jens Axboe
2020-08-03 23:18     ` Jens Axboe
2020-08-03 23:31       ` Jens Axboe
2020-08-03 23:49         ` Linus Torvalds
2020-08-03 23:56           ` Jens Axboe [this message]
2020-08-04  0:11             ` Linus Torvalds
2020-08-03 23:34       ` Linus Torvalds
2020-08-03 23:43         ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=af6b61c1-e98e-f312-3550-deb7972751a9@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.