io-uring.vger.kernel.org archive mirror
From: Hao_Xu <haoxu@linux.alibaba.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: io-uring@vger.kernel.org, Johannes Weiner <hannes@cmpxchg.org>,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: Loophole in async page I/O
Date: Wed, 14 Oct 2020 03:57:25 +0800
Message-ID: <34097eb9-c517-6ddb-1765-433e7d5083ed@linux.alibaba.com>
In-Reply-To: <20201013120119.GD20115@casper.infradead.org>

On 2020/10/13 8:01 PM, Matthew Wilcox wrote:
> On Tue, Oct 13, 2020 at 01:13:48PM +0800, Hao_Xu wrote:
>> On 2020/10/13 5:13 AM, Matthew Wilcox wrote:
>>> This one's pretty unlikely, but there's a case in buffered reads where
>>> an IOCB_WAITQ read can end up sleeping.
>>>
>>> generic_file_buffered_read():
>>>                   page = find_get_page(mapping, index);
>>> ...
>>>                   if (!PageUptodate(page)) {
>>> ...
>>>                           if (iocb->ki_flags & IOCB_WAITQ) {
>>> ...
>>>                                   error = wait_on_page_locked_async(page,
>>>                                                                   iocb->ki_waitq);
>>> wait_on_page_locked_async():
>>>           if (!PageLocked(page))
>>>                   return 0;
>>> (back to generic_file_buffered_read):
>>>                           if (!mapping->a_ops->is_partially_uptodate(page,
>>>                                                           offset, iter->count))
>>>                                   goto page_not_up_to_date_locked;
>>>
>>> page_not_up_to_date_locked:
>>>                   if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT)) {
>>>                           unlock_page(page);
>>>                           put_page(page);
>>>                           goto would_block;
>>>                   }
>>> ...
>>>                   error = mapping->a_ops->readpage(filp, page);
>>> (will unlock page on I/O completion)
>>>                   if (!PageUptodate(page)) {
>>>                           error = lock_page_killable(page);
>>>
>>> So if we have IOCB_WAITQ set but IOCB_NOWAIT clear, we'll call ->readpage()
>>> and wait for the I/O to complete.  I can't quite figure out if this is
>>> intentional -- I think not; if I understand the semantics right, we
>>> should be returning -EIOCBQUEUED and punting to an I/O thread to
>>> kick off the I/O and wait.
>>>
>>> I think the right fix is to return -EIOCBQUEUED from
>>> wait_on_page_locked_async() if the page isn't locked.  ie this:
>>>
>>> @@ -1258,7 +1258,7 @@ static int wait_on_page_locked_async(struct page *page,
>>>                                        struct wait_page_queue *wait)
>>>    {
>>>           if (!PageLocked(page))
>>> -               return 0;
>>> +               return -EIOCBQUEUED;
>>>           return __wait_on_page_locked_async(compound_head(page), wait, false);
>>>    }
>>> But as I said, I'm not sure what the semantics are supposed to be.
>>>
>> Hi Matthew,
>> Which kernel version are you using? I believe I've fixed this case in
>> commit c8d317aa1887b40b188ec3aaa6e9e524333caed1.
> 
> Ah, I don't have that commit in my tree.
> 
> Nevertheless, there is still a problem.  The ->readpage implementation
> is not required to execute asynchronously.  For example, it may enter
> page reclaim by using GFP_KERNEL.  Indeed, I feel it is better if it
> works synchronously as it can then report the actual error from an I/O
> instead of the almost-meaningless -EIO.
> 
> This patch series documents 12 filesystems which implement ->readpage
> in a synchronous way today (for at least some cases) and converts iomap
> to be synchronous (making two more filesystems synchronous).
> 
> https://lore.kernel.org/linux-fsdevel/20201009143104.22673-1-willy@infradead.org/
> 
Thanks, Matthew. I wasn't aware of this before; thank you for sharing the
information. It's really kind of you. I'll look into it soon.
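
For readers following the quoted walk-through, here is a minimal user-space
sketch of the control flow Matthew describes. It is not kernel code. The
MODEL_* names, the flag values, the struct, and the "patched" switch are all
hypothetical and exist only for illustration. Steps elided with "..." in the
quoted excerpt are not modelled, and a locked page is assumed to stay locked
after the wait callback is queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only; the real IOCB_*
 * definitions live in the kernel headers. */
#define MODEL_IOCB_NOWAIT (1u << 0)
#define MODEL_IOCB_WAITQ  (1u << 1)
#define MODEL_IOCB_NOIO   (1u << 2)

/* Simplified stand-in for the page state the quoted code inspects. */
struct model_page {
	bool uptodate;           /* PageUptodate() */
	bool locked;             /* PageLocked() */
	bool partially_uptodate; /* result of ->is_partially_uptodate() */
};

enum model_outcome {
	MODEL_COPY_DATA,     /* page is usable, the read proceeds */
	MODEL_WOULD_BLOCK,   /* the would_block path */
	MODEL_QUEUED,        /* -EIOCBQUEUED, retry from a worker */
	MODEL_SYNC_READPAGE, /* ->readpage() followed by a sleeping wait */
};

/* Models wait_on_page_locked_async(): returns true when the caller
 * sees -EIOCBQUEUED. With patched == true, the proposed one-line
 * change is applied (unlocked page also returns -EIOCBQUEUED). */
bool model_wait_queues(const struct model_page *page, bool patched)
{
	if (!page->locked)
		return patched;
	return true; /* assumption: page is still locked after queuing */
}

/* Models only the quoted steps of generic_file_buffered_read(). */
enum model_outcome model_buffered_read(const struct model_page *page,
				       unsigned int flags, bool patched)
{
	assert(page != NULL);

	if (page->uptodate)
		return MODEL_COPY_DATA;

	if ((flags & MODEL_IOCB_WAITQ) && model_wait_queues(page, patched))
		return MODEL_QUEUED;

	if (page->partially_uptodate)
		return MODEL_COPY_DATA;

	/* page_not_up_to_date_locked: */
	if (flags & (MODEL_IOCB_NOIO | MODEL_IOCB_NOWAIT))
		return MODEL_WOULD_BLOCK;

	/* The loophole: the read starts I/O and waits for it. */
	return MODEL_SYNC_READPAGE;
}
```

In this model, an unlocked, not-uptodate page read with MODEL_IOCB_WAITQ set
and MODEL_IOCB_NOWAIT clear ends in MODEL_SYNC_READPAGE when patched is false,
and in MODEL_QUEUED when patched is true.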

Thread overview: 14+ messages
2020-10-12 21:13 Loophole in async page I/O Matthew Wilcox
2020-10-12 22:08 ` Jens Axboe
2020-10-12 22:22   ` Jens Axboe
2020-10-12 22:42     ` Jens Axboe
2020-10-14 20:31       ` Hao_Xu
2020-10-14 20:57         ` Jens Axboe
2020-10-15 11:27           ` Hao_Xu
2020-10-15 12:17             ` Hao_Xu
2020-10-13  5:31   ` Hao_Xu
2020-10-13 17:50     ` Jens Axboe
2020-10-13 19:50       ` Hao_Xu
2020-10-13  5:13 ` Hao_Xu
2020-10-13 12:01   ` Matthew Wilcox
2020-10-13 19:57     ` Hao_Xu [this message]
