From: "yukuai (C)" <yukuai3@huawei.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: <hch@infradead.org>, <darrick.wong@oracle.com>,
	<linux-xfs@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <houtao1@huawei.com>,
	<zhengbin13@huawei.com>, <yi.zhang@huawei.com>
Subject: Re: [RFC] iomap: fix race between readahead and direct write
Date: Sun, 19 Jan 2020 19:21:24 +0800
Message-ID: <16241bd6-e3f9-5272-92aa-b31cc0a2b2fa@huawei.com>
In-Reply-To: <20200119075828.GA4147@bombadil.infradead.org>

On 2020/1/19 15:58, Matthew Wilcox wrote:
> On Sun, Jan 19, 2020 at 02:55:14PM +0800, yukuai (C) wrote:
>> On 2020/1/19 14:14, Matthew Wilcox wrote:
>>> I don't understand your reasoning here.  If another process wants to
>>> access a page of the file which isn't currently in cache, it would have
>>> to first read the page in from storage.  If it's under readahead, it
>>> has to wait for the read to finish.  Why is the second case worse than
>>> the first?  It seems better to me.
>>
>> Thanks for your response! My worry is that, for example:
>>
>> We read page 0, which triggers readahead of n pages (0 to n-1).
>> Meanwhile, in another thread, we read page n-1.
>>
>> In the current implementation, if readahead is still in the middle of
>> reading pages 0 to n-2, the later read does not have to wait for it to
>> finish. However, the later read will have to wait if we add all the
>> pages to the page cache first. That is why I said it might cause a
>> performance problem.
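>>
>> To make the ordering concrete, here is a minimal sketch of the two
>> behaviours. This is not the actual iomap code, just its shape;
>> declarations and error handling are elided:
>>
>> 	/* Current behaviour: each page becomes visible in the page
>> 	 * cache only as the bio is built, and a page that another
>> 	 * thread added first simply ends the readahead early. */
>> 	for (i = 0; i < n; i++) {
>> 		page = alloc_page(GFP_KERNEL);
>> 		if (add_to_page_cache_lru(page, mapping, index + i, gfp))
>> 			break;	/* raced with another reader */
>> 		bio_add_page(bio, page, PAGE_SIZE, 0);
>> 	}
>> 	submit_bio(bio);
>>
>> 	/* Proposed behaviour: lock all n pages into the page cache
>> 	 * first, then build and submit a single bio; a concurrent
>> 	 * read of any of these pages now waits for that bio. */
>> 	for (i = 0; i < n; i++) {
>> 		pages[i] = alloc_page(GFP_KERNEL);
>> 		add_to_page_cache_lru(pages[i], mapping, index + i, gfp);
>> 	}
>> 	for (i = 0; i < n; i++)
>> 		bio_add_page(bio, pages[i], PAGE_SIZE, 0);
>> 	submit_bio(bio);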
> 
> OK, but let's put some numbers on that.  Imagine that we're using high
> performance spinning rust so we have an access latency of 5ms (200
> IOPS), we're accessing 20 consecutive pages which happen to have their
> data contiguous on disk.  Our CPU is running at 5GHz and takes about
> 100,000 cycles to submit an I/O, plus 1,000 cycles to add an extra page
> to the I/O.
> 
> Current implementation: Allocate 20 pages, place 19 of them in the cache,
> fail to place the last one in the cache.  The later thread actually gets
> to jump the queue and submit its bio first.  Its latency will be 100,000
> cycles (20us) plus the 5ms access time.  But it only has 20,000 cycles
> (4us) to hit this race, or it will end up behaving the same way as below.
> 
> New implementation: Allocate 20 pages, place them all in the cache,
> then takes 120,000 cycles to build & submit the I/O, and wait 5ms for
> the I/O to complete.
> 
> But look how much more likely it is that it'll hit during the window
> where we're waiting for the I/O to complete -- 5ms is 1250 times longer
> than 4us.
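> 
> Spelling that out: at 5GHz, submitting an I/O costs 100,000 / 5e9 s
> = 20us and adding 20 pages costs 20 * 1,000 = 20,000 cycles = 4us,
> so building and submitting the single bio takes 120,000 cycles =
> 24us; meanwhile the 5ms access time is 5,000us, and 5,000 / 4 =
> 1250.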
> 
> If it _does_ get the latency benefit of jumping the queue, the readahead
> will create one or two I/Os.  If it hit page 18 instead of page 19, we'd
> end up doing three I/Os; the first for page 18, then one for pages 0-17,
> and one for page 19.  And that means the disk is going to be busy for
> 15ms, delaying the next I/O for up to 10ms.  It's actually beneficial in
> the long term for the second thread to wait for the readahead to finish.
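> 
> (In numbers: three 5ms accesses keep the disk busy for 15ms instead
> of the 5ms a single 20-page I/O would take, so a subsequent request
> queued behind them can be delayed by up to 10ms extra.)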
> 

Thank you very much for your detailed explanation; my view of the
problem was too one-sided. I do agree that your patch series is a
better solution to the problem.

Yu Kuai



Thread overview: 19+ messages
2020-01-16  6:36 [RFC] iomap: fix race between readahead and direct write yu kuai
2020-01-16 15:32 ` Jan Kara
2020-01-17  9:39   ` yukuai (C)
2020-01-17 11:05     ` Jan Kara
2020-01-17 16:24       ` Darrick J. Wong
2020-01-19  1:25         ` yukuai (C)
2020-01-19  1:17       ` yukuai (C)
2020-01-20 11:42         ` Jan Kara
2020-01-18 23:08 ` Matthew Wilcox
2020-01-19  1:34   ` yukuai (C)
2020-01-19  1:42     ` Matthew Wilcox
2020-01-19  1:57       ` yukuai (C)
2020-01-19  2:51       ` yukuai (C)
2020-01-19  3:01         ` Gao Xiang
2020-01-19  3:15           ` yukuai (C)
2020-01-19  6:14         ` Matthew Wilcox
2020-01-19  6:55           ` yukuai (C)
2020-01-19  7:58             ` Matthew Wilcox
2020-01-19 11:21               ` yukuai (C) [this message]
