From: Jan Kara <jack@suse.cz>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@lst.de>,
	Matthew Wilcox <willy@infradead.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [POC][PATCH] xfs: reduce ilock contention on buffered randrw workload
Date: Tue, 21 Jun 2022 10:59:56 +0200	[thread overview]
Message-ID: <20220621085956.y5wyopfgzmqkaeiw@quack3.lan>
In-Reply-To: <CAOQ4uxg+uY5PdcU1=RyDWCxbP4gJB3jH1zkAj=RpfndH9czXbg@mail.gmail.com>

On Tue 21-06-22 10:49:48, Amir Goldstein wrote:
> > How exactly do you imagine the synchronization of buffered read against
> > buffered write would work? Lock all pages for the read range in the page
> > cache? You'd need to be careful to not bring the machine OOM when someone
> > asks to read a huge range...
> 
> I imagine that the atomic r/w synchronisation will remain *exactly* as it is
> today by taking XFS_IOLOCK_SHARED around generic_file_read_iter(),
> when reading data into the user buffer, but before that, I would like to
> issue and wait for reads of the pages in the range to reduce the probability
> of doing the read I/O under XFS_IOLOCK_SHARED.
> 
> The page cache pre-warm does not need to abide by the atomic read
> semantics, and it is also tolerable if some pages are evicted between the
> pre-warm and the read into the user buffer - in the worst case this will
> result in I/O amplification, but in the common case it will be a big win
> for mixed random r/w performance on xfs.
> 
> To reduce the risk of page cache thrashing, we can limit this optimization
> to a maximum number of pre-warmed pages.
> 
> The questions are:
> 1. Does this plan sound reasonable?

Ah, I see now. So essentially the idea is to pull the readahead (which
currently happens in filemap_read() -> filemap_get_pages()) out from under
the i_rwsem. It looks like a fine idea to me.
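
For reference, the call chain for a page cache miss currently looks roughly
like this (abridged from my reading of mainline; the XFS entry point is in
fs/xfs/xfs_file.c, the rest is in mm/filemap.c):

	xfs_file_buffered_read()	  /* takes XFS_IOLOCK_SHARED (i_rwsem shared) */
	  generic_file_read_iter()
	    filemap_read()
	      filemap_get_pages()
	        page_cache_sync_readahead()	/* issued only on a cache miss */

so today the readahead I/O for a miss is submitted with i_rwsem already held,
which is exactly the part you would move out in front of the lock.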

> 2. Is there a ready helper (force_page_cache_readahead?) that
>     I can use which takes the required page/invalidate locks?

page_cache_sync_readahead() should be the function you need. It takes care
to lock invalidate_lock internally when creating & reading pages. I just
cannot say whether calling it without i_rwsem breaks some internal XFS
expectations for things like reflink etc.
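
To make that concrete, below is a rough, untested sketch of how I'd imagine
the pre-warm looking in fs/xfs/xfs_file.c. MAX_PREWARM_PAGES is a made-up
name for the cap you mention, and note that page_cache_sync_readahead() only
submits the reads - it does not wait for the pages to become uptodate - so a
real patch still has to decide whether any waiting is needed at all:

/* Illustrative only: cap on how many pages we are willing to pre-warm. */
#define MAX_PREWARM_PAGES	256

static ssize_t
xfs_file_buffered_read(
	struct kiocb		*iocb,
	struct iov_iter		*to)
{
	struct xfs_inode	*ip = XFS_I(file_inode(iocb->ki_filp));
	struct file		*file = iocb->ki_filp;
	size_t			count = iov_iter_count(to);
	ssize_t			ret;

	/*
	 * Pre-warm the page cache outside of XFS_IOLOCK_SHARED.  This is
	 * purely advisory: pages may be evicted again before the locked
	 * read below, in which case we simply do the I/O under the lock
	 * as we do today.
	 */
	if (count) {
		pgoff_t	index = iocb->ki_pos >> PAGE_SHIFT;
		pgoff_t	last = (iocb->ki_pos + count - 1) >> PAGE_SHIFT;

		page_cache_sync_readahead(file->f_mapping, &file->f_ra, file,
				index, min_t(unsigned long, last - index + 1,
					     MAX_PREWARM_PAGES));
	}

	ret = xfs_ilock_iocb(iocb, XFS_IOLOCK_SHARED);
	if (ret)
		return ret;
	ret = generic_file_read_iter(iocb, to);
	xfs_iunlock(ip, XFS_IOLOCK_SHARED);

	return ret;
}

Whether issuing the readahead without i_rwsem is safe wrt reflink & friends
is the part I cannot answer offhand, as said above.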

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


Thread overview: 38+ messages
2019-04-04 16:57 [POC][PATCH] xfs: reduce ilock contention on buffered randrw workload Amir Goldstein
2019-04-04 21:17 ` Dave Chinner
2019-04-05 14:02   ` Amir Goldstein
2019-04-07 23:27     ` Dave Chinner
2019-04-08  9:02       ` Amir Goldstein
2019-04-08 14:11         ` Jan Kara
2019-04-08 17:41           ` Amir Goldstein
2019-04-09  8:26             ` Jan Kara
2022-06-17 14:48               ` Amir Goldstein
2022-06-17 15:11                 ` Jan Kara
2022-06-18  8:38                   ` Amir Goldstein
2022-06-20  9:11                     ` Jan Kara
2022-06-21  7:49                       ` Amir Goldstein
2022-06-21  8:59                         ` Jan Kara [this message]
2022-06-21 12:53                           ` Amir Goldstein
2022-06-22  3:23                             ` Matthew Wilcox
2022-06-22  9:00                               ` Amir Goldstein
2022-06-22  9:34                                 ` Jan Kara
2022-06-22 16:26                                   ` Amir Goldstein
2022-09-13 14:40                             ` Amir Goldstein
2022-09-14 16:01                               ` Darrick J. Wong
2022-09-14 16:29                                 ` Amir Goldstein
2022-09-14 17:39                                   ` Darrick J. Wong
2022-09-19 23:09                                     ` Dave Chinner
2022-09-20  2:24                                       ` Dave Chinner
2022-09-20  3:08                                         ` Amir Goldstein
2022-09-21 11:20                                           ` Amir Goldstein
2019-04-08 11:03       ` Jan Kara
2019-04-22 10:55         ` Boaz Harrosh
2019-04-08 10:33   ` Jan Kara
2019-04-08 16:37     ` Davidlohr Bueso
2019-04-11  1:11       ` Dave Chinner
2019-04-16 12:22         ` Dave Chinner
2019-04-18  3:10           ` Dave Chinner
2019-04-18 18:21             ` Davidlohr Bueso
2019-04-20 23:54               ` Dave Chinner
2019-05-03  4:17                 ` Dave Chinner
2019-05-03  5:17                   ` Dave Chinner
