From: Amir Goldstein <amir73il@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: "Darrick J . Wong" <darrick.wong@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@lst.de>,
	Matthew Wilcox <willy@infradead.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [POC][PATCH] xfs: reduce ilock contention on buffered randrw workload
Date: Tue, 21 Jun 2022 15:53:33 +0300	[thread overview]
Message-ID: <CAOQ4uxheatf+GCHxbUDQ4s4YSQib3qeYVeXZwEicR9fURrEFBA@mail.gmail.com> (raw)
In-Reply-To: <20220621085956.y5wyopfgzmqkaeiw@quack3.lan>

On Tue, Jun 21, 2022 at 11:59 AM Jan Kara <jack@suse.cz> wrote:
>
> On Tue 21-06-22 10:49:48, Amir Goldstein wrote:
> > > How exactly do you imagine the synchronization of buffered read against
> > > buffered write would work? Lock all pages for the read range in the page
> > > cache? You'd need to be careful not to push the machine into OOM when
> > > someone asks to read a huge range...
> >
> > I imagine that the atomic r/w synchronisation will remain *exactly* as it is
> > today by taking XFS_IOLOCK_SHARED around generic_file_read_iter(),
> > when reading data into the user buffer, but before that, I would like to
> > issue and wait for the read of the pages in the range, to reduce the
> > probability of doing the read I/O under XFS_IOLOCK_SHARED.
> >
> > The pre-warm of the page cache does not need to abide by the atomic read
> > semantics, and it is also tolerable if some pages are evicted between the
> > pre-warm and the read into the user buffer - in the worst case this will
> > result in I/O amplification, but in the common case it will be a big win
> > for mixed random r/w performance on xfs.
> >
> > To reduce the risk of page cache thrashing, we can limit this optimization
> > to a maximum number of pre-warmed page cache pages.
> >
> > The questions are:
> > 1. Does this plan sound reasonable?
>
> Ah, I see now. So essentially the idea is to pull the readahead (which
> currently happens from filemap_read() -> filemap_get_pages()) out from under
> the i_rwsem. It looks like a fine idea to me.

Great!
Does anyone dislike the idea or have another suggestion?
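
To make this concrete, here is roughly the change I have in mind for
xfs_file_buffered_read() (just a sketch to illustrate the idea - the
pre-warm helper below is made up and does not exist yet):

STATIC ssize_t
xfs_file_buffered_read(
	struct kiocb		*iocb,
	struct iov_iter		*to)
{
	struct xfs_inode	*ip = XFS_I(file_inode(iocb->ki_filp));
	ssize_t			ret;

	/*
	 * Best-effort pre-warm of the page cache for the read range, done
	 * without any locks held.  If some of these pages are reclaimed
	 * before we re-find them under the iolock, we only pay with I/O
	 * amplification.  xfs_file_prewarm_pagecache() is a made-up helper.
	 */
	xfs_file_prewarm_pagecache(iocb, to);

	/*
	 * Atomicity vs. buffered writes is unchanged: the copy to the user
	 * buffer is still done under XFS_IOLOCK_SHARED as it is today.
	 */
	ret = xfs_ilock_iocb(iocb, XFS_IOLOCK_SHARED);
	if (ret)
		return ret;
	ret = generic_file_read_iter(iocb, to);
	xfs_iunlock(ip, XFS_IOLOCK_SHARED);

	return ret;
}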

>
> > 2. Is there a ready-made helper (force_page_cache_readahead?) that
> >     I can use which takes the required page/invalidate locks?
>
> page_cache_sync_readahead() should be the function you need. It does take
> care to lock invalidate_lock internally when creating & reading pages. I

Thanks, I'll try that.
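
FWIW, the pre-warm helper from the sketch above could then be something
along these lines (again only a sketch with made-up names and an arbitrary
clamp; it only issues the readahead and relies on filemap_read() under the
iolock to wait for the folios):

#define XFS_PREWARM_MAX_PAGES	256	/* arbitrary thrashing limit */

static void
xfs_file_prewarm_pagecache(
	struct kiocb		*iocb,
	struct iov_iter		*to)
{
	struct file		*file = iocb->ki_filp;
	struct address_space	*mapping = file->f_mapping;
	size_t			count = iov_iter_count(to);
	pgoff_t			index = iocb->ki_pos >> PAGE_SHIFT;
	pgoff_t			last;
	unsigned long		nr_pages;

	if (!count)
		return;

	last = (iocb->ki_pos + count - 1) >> PAGE_SHIFT;
	/* Clamp the pre-warm to limit page cache thrashing on huge reads */
	nr_pages = min_t(unsigned long, last - index + 1,
			 XFS_PREWARM_MAX_PAGES);

	/*
	 * page_cache_sync_readahead() takes invalidate_lock internally when
	 * creating and reading pages, so no i_rwsem/iolock is needed here.
	 */
	page_cache_sync_readahead(mapping, &file->f_ra, file, index,
				  nr_pages);
}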

> just cannot comment on whether calling this without i_rwsem breaks some
> internal XFS expectations for stuff like reflink etc.

Reflink is done under xfs_ilock2_io_mmap => filemap_invalidate_lock_two,
so it should not be a problem.

I still need to look into pNFS leases.

Thanks,
Amir.

Thread overview: 38+ messages
2019-04-04 16:57 [POC][PATCH] xfs: reduce ilock contention on buffered randrw workload Amir Goldstein
2019-04-04 21:17 ` Dave Chinner
2019-04-05 14:02   ` Amir Goldstein
2019-04-07 23:27     ` Dave Chinner
2019-04-08  9:02       ` Amir Goldstein
2019-04-08 14:11         ` Jan Kara
2019-04-08 17:41           ` Amir Goldstein
2019-04-09  8:26             ` Jan Kara
2022-06-17 14:48               ` Amir Goldstein
2022-06-17 15:11                 ` Jan Kara
2022-06-18  8:38                   ` Amir Goldstein
2022-06-20  9:11                     ` Jan Kara
2022-06-21  7:49                       ` Amir Goldstein
2022-06-21  8:59                         ` Jan Kara
2022-06-21 12:53                           ` Amir Goldstein [this message]
2022-06-22  3:23                             ` Matthew Wilcox
2022-06-22  9:00                               ` Amir Goldstein
2022-06-22  9:34                                 ` Jan Kara
2022-06-22 16:26                                   ` Amir Goldstein
2022-09-13 14:40                             ` Amir Goldstein
2022-09-14 16:01                               ` Darrick J. Wong
2022-09-14 16:29                                 ` Amir Goldstein
2022-09-14 17:39                                   ` Darrick J. Wong
2022-09-19 23:09                                     ` Dave Chinner
2022-09-20  2:24                                       ` Dave Chinner
2022-09-20  3:08                                         ` Amir Goldstein
2022-09-21 11:20                                           ` Amir Goldstein
2019-04-08 11:03       ` Jan Kara
2019-04-22 10:55         ` Boaz Harrosh
2019-04-08 10:33   ` Jan Kara
2019-04-08 16:37     ` Davidlohr Bueso
2019-04-11  1:11       ` Dave Chinner
2019-04-16 12:22         ` Dave Chinner
2019-04-18  3:10           ` Dave Chinner
2019-04-18 18:21             ` Davidlohr Bueso
2019-04-20 23:54               ` Dave Chinner
2019-05-03  4:17                 ` Dave Chinner
2019-05-03  5:17                   ` Dave Chinner
