From: Jens Axboe <axboe@kernel.dk>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: io-uring@vger.kernel.org
Subject: Re: io_uring_prep_openat_direct() and link/drain
Date: Fri, 1 Apr 2022 09:36:25 -0600
Message-ID: <fbf3b195-7415-7f84-c0e6-bdfebf9692f2@kernel.dk>
In-Reply-To: <CAJfpegvM3LQ8nsJf=LsWjQznpOzC+mZFXB5xkZgZHR2tXXjxLQ@mail.gmail.com>

On 4/1/22 2:40 AM, Miklos Szeredi wrote:
> On Wed, 30 Mar 2022 at 19:49, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 3/30/22 9:53 AM, Jens Axboe wrote:
>>> On 3/30/22 9:17 AM, Jens Axboe wrote:
>>>> On 3/30/22 9:12 AM, Miklos Szeredi wrote:
>>>>> On Wed, 30 Mar 2022 at 17:05, Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>
>>>>>> On 3/30/22 8:58 AM, Miklos Szeredi wrote:
>>>>>>> Next issue:  seems like file slot reuse is not working correctly.
>>>>>>> Attached program compares reads using io_uring with plain reads of
>>>>>>> proc files.
>>>>>>>
>>>>>>> In the below example it is using two slots alternately but the number
>>>>>>> of slots does not seem to matter, read is apparently always using a
>>>>>>> stale file (the prior one to the most recent open on that slot).  See
>>>>>>> how the sizes of the files lag by two lines:
>>>>>>>
>>>>>>> root@kvm:~# ./procreads
>>>>>>> procreads: /proc/1/stat: ok (313)
>>>>>>> procreads: /proc/2/stat: ok (149)
>>>>>>> procreads: /proc/3/stat: read size mismatch 313/150
>>>>>>> procreads: /proc/4/stat: read size mismatch 149/154
>>>>>>> procreads: /proc/5/stat: read size mismatch 150/161
>>>>>>> procreads: /proc/6/stat: read size mismatch 154/171
>>>>>>> ...
>>>>>>>
>>>>>>> Any ideas?
>>>>>>
>>>>>> Didn't look at your code yet, but with the current tree, this is the
>>>>>> behavior when a fixed file is used:
>>>>>>
>>>>>> At prep time, if the slot is valid it is used. If it isn't valid,
>>>>>> assignment is deferred until the request is issued.
>>>>>>
>>>>>> Which, granted, is a bit weird. It means that if you do:
>>>>>>
>>>>>> <open fileA into slot 1, slot 1 currently unused><read slot 1>
>>>>>>
>>>>>> the read will read from fileA. But for:
>>>>>>
>>>>>> <open fileB into slot 1, slot 1 is fileA currently><read slot 1>
>>>>>>
>>>>>> since slot 1 is already valid at prep time for the read, the read will
>>>>>> be from fileA again.
>>>>>>
>>>>>> Is this what you are seeing? It's definitely a bit confusing, and the
>>>>>> only reason why I didn't change it is because it could potentially break
>>>>>> applications. Don't think there's a high risk of that, however, so may
>>>>>> indeed be worth it to just bite the bullet and make the assignment
>>>>>> consistent (eg always done from the perspective of the previous
>>>>>> dependent request having completed).
>>>>>
>>>>> Right, this explains it.   Then the only workaround would be to wait
>>>>> for the open to finish before submitting the read, but that would
>>>>> defeat the whole point of using io_uring for this purpose.
>>>>
>>>> Honestly, I think we should just change it during this round, making it
>>>> consistent with the "slot is unused" use case. The old use case is
>>>> more of a "it happened to work" vs the newer consistent behavior of "we
>>>> always assign the file when execution starts on the request".
>>>>
>>>> Let me spin a patch, would be great if you could test.
>>>
>>> Something like this on top of the current tree should work. Can you
>>> test?
>>
>> You can also just re-pull for-5.18/io_uring, it has been updated. A last
>> minute edit made a 0 return from io_assign_file() which should've been
>> 'true'...
> 
> Yep, this works now.
> 
> Next issue:  will get ENFILE even though there are just 40 slots.
> When running as root, then it will get as far as invoking the OOM
> killer, which is really bad.
> 
> There's no leak, this apparently only happens when the worker doing
> the fputs can't keep up.  Simple solution:  do the fput() of the
> previous file synchronously with the open_direct operation; fput
> shouldn't be expensive...  Is there a reason why this wouldn't work?

I take it you're continually reusing those slots? If you have a test
case that'd be ideal. Agree that it sounds like we just need an
appropriate breather to allow fput/task_work to run. Or it could be the
deferred free of the fixed slot.

-- 
Jens Axboe
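
Editorial sketch: the slot-assignment behavior described in the quoted
text above can be modeled in a few lines of plain C. This is not kernel
code and does not use io_uring; the names slot_table, assign_policy,
open_then_read() and run_sequence() are hypothetical, introduced only for
illustration. It assumes each open and its dependent read are prepared
together before either one executes, and that two slots are reused
alternately as in the procreads report.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 2

/* Hypothetical model of a fixed-file table: file id per slot, 0 = empty. */
struct slot_table {
	int file[NR_SLOTS];
};

enum assign_policy {
	ASSIGN_AT_PREP_IF_VALID, /* old: resolve at prep if the slot is populated */
	ASSIGN_AT_ISSUE,         /* new: always resolve when execution starts */
};

/*
 * Model one "open file_id into slot" followed by a dependent "read slot".
 * Both requests are prepared before either executes. Returns the file id
 * that the read ends up using.
 */
static int open_then_read(struct slot_table *t, int slot, int file_id,
			  enum assign_policy policy)
{
	int resolved = 0;

	assert(slot >= 0 && slot < NR_SLOTS);

	/* Prep of the read: the open has not run yet. */
	if (policy == ASSIGN_AT_PREP_IF_VALID && t->file[slot] != 0)
		resolved = t->file[slot]; /* pins the file currently in the slot */

	/* The open executes and installs the new file in the slot. */
	t->file[slot] = file_id;

	/* Issue of the read: resolve now if prep deferred the assignment. */
	if (resolved == 0)
		resolved = t->file[slot];

	return resolved;
}

/* Run n open+read pairs over alternating slots; out[i] is the file id
 * actually read when file i+1 was requested. */
static void run_sequence(enum assign_policy policy, int n, int *out)
{
	struct slot_table t = { {0} };

	for (int i = 0; i < n; i++)
		out[i] = open_then_read(&t, i % NR_SLOTS, i + 1, policy);
}
```

Under ASSIGN_AT_PREP_IF_VALID with two alternating slots, the first two
reads see the right file and every later read sees the file opened two
steps earlier, which is the same lag-by-two pattern as in the procreads
output. Under ASSIGN_AT_ISSUE every read sees the file that was just
opened into its slot.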

