All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Jan Kara <jack@suse.cz>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	Dave Chinner <dchinner@redhat.com>,
	linux-kernel@vger.kernel.org,
	William Kucharski <william.kucharski@oracle.com>
Subject: Re: [PATCH v3 04/12] mm/filemap: Add mapping_seek_hole_data
Date: Mon, 26 Oct 2020 12:17:27 +0000	[thread overview]
Message-ID: <20201026121727.GO20115@casper.infradead.org> (raw)
In-Reply-To: <20201026104806.GB29758@quack2.suse.cz>

On Mon, Oct 26, 2020 at 11:48:06AM +0100, Jan Kara wrote:
> > +static inline loff_t page_seek_hole_data(struct page *page,
> > +		loff_t start, loff_t end, bool seek_data)
> > +{
> > +	if (xa_is_value(page) || PageUptodate(page))
> 
> Please add a comment here that this is currently tmpfs specific treating
> exceptional entries as swapped out pages and thus data. It took me quite a
> while to figure this out. You can remove the comment later when it is no
> longer true...

But it's not tmpfs specific.  If the value entry is a DAX entry, there's
data here, and if the value entry is a shadow entry, there's data here
too.  Not that it should be called for either of those cases because the
filesystem should know, but a value entry always means there's data here.

I'm open to adding a comment, but saying "A value entry always means data"
seems a little redundant with the code.  What would have helped?

> > +		return seek_data ? start : end;
> > +	return seek_data ? end : start;
> > +}
> > +
> > +static inline
> > +unsigned int seek_page_size(struct xa_state *xas, struct page *page)
> > +{
> > +	if (xa_is_value(page))
> > +		return PAGE_SIZE << xa_get_order(xas->xa, xas->xa_index);
> > +	return thp_size(page);
> > +}
> > +
> > +/**
> > + * mapping_seek_hole_data - Seek for SEEK_DATA / SEEK_HOLE in the page cache.
> > + * @mapping: Address space to search.
> > + * @start: First byte to consider.
> > + * @end: Limit of search (exclusive).
> > + * @whence: Either SEEK_HOLE or SEEK_DATA.
> > + *
> > + * If the page cache knows which blocks contain holes and which blocks
> > + * contain data, your filesystem can use this function to implement
> > + * SEEK_HOLE and SEEK_DATA.  This is useful for filesystems which are
> > + * entirely memory-based such as tmpfs, and filesystems which support
> > + * unwritten extents.
> > + *
> > + * Return: The requested offset on successs, or -ENXIO if @whence specifies
> > + * SEEK_DATA and there is no data after @start.  There is an implicit hole
> > + * after @end - 1, so SEEK_HOLE returns @end if all the bytes between @start
> > + * and @end contain data.
> > + */
> > +loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
> > +		loff_t end, int whence)
> > +{
> > +	XA_STATE(xas, &mapping->i_pages, start >> PAGE_SHIFT);
> > +	pgoff_t max = (end - 1) / PAGE_SIZE;
> > +	bool seek_data = (whence == SEEK_DATA);
> > +	struct page *page;
> > +
> > +	if (end <= start)
> > +		return -ENXIO;
> > +
> > +	rcu_read_lock();
> > +	while ((page = xas_find_get_entry(&xas, max, XA_PRESENT))) {
> > +		loff_t pos = xas.xa_index * PAGE_SIZE;
> > +
> > +		if (start < pos) {
> > +			if (!seek_data)
> > +				goto unlock;
> > +			start = pos;
> > +		}
> > +
> > +		pos += seek_page_size(&xas, page);
> > +		start = page_seek_hole_data(page, start, pos, seek_data);
> > +		if (start < pos)
> > +			goto unlock;
> 
> Uh, I was staring at this function for half an hour but I still couldn't
> convince myself that it is correct in all the corner cases. Maybe I'm dumb
> but I'd wish this was more intuitive (and I have to say that the original
> tmpfs function is much more obviously correct to me). It would more 
> understandable for me if we had a code like:
> 
> 		if (page_seek_match(page, seek_data))
> 			goto unlock;
> 
> which would be just the condition in page_seek_hole_data(). Honestly at the
> moment I fail to see why you bother with 'pos' in the above four lines at
> all.

So this?

static bool page_seek_match(struct page *page, bool seek_data)
{
	/* Swap, shadow & DAX entries all represent data */
	if (xa_is_value(page) || PageUptodate(page))
		return seek_data;
	return !seek_data;
}

...

		if (page_seek_match(page, seek_data))
			goto unlock;
		start = pos + seek_page_size(&xas, page);

The function makes more sense when page_seek_hole_data() gains the
ability to look at sub-page uptodate status and it needs to return
where in the page the data (or hole) starts.  But that can be delayed
for the later patch.

With those changes,

Ran: generic/285 generic/286 generic/436 generic/445 generic/448 generic/490 generic/539
Passed all 7 tests

> BTW I suspect that this loop forgets to release the page reference it has got
> when doing SEEK_HOLE.

You're right.  I need a put_page() at the end of the loop.  Also true
for the case where we find a !Uptodate page when doing SEEK_DATA.

> > +	}
> > +	rcu_read_unlock();
> > +
> > +	if (seek_data)
> > +		return -ENXIO;
> > +	goto out;
> > +
> > +unlock:
> > +	rcu_read_unlock();
> > +	if (!xa_is_value(page))
> > +		put_page(page);
> > +out:
> > +	if (start > end)
> > +		return end;
> > +	return start;
> > +}
> 
> 
> 								Honza
> 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

  reply	other threads:[~2020-10-26 12:17 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-26  4:13 [PATCH v3 00/12] Overhaul multi-page lookups for THP Matthew Wilcox (Oracle)
2020-10-26  4:13 ` [PATCH v3 01/12] mm: Make pagecache tagged lookups return only head pages Matthew Wilcox (Oracle)
2020-10-28  7:50   ` Mike Rapoport
2020-11-12 17:41     ` Matthew Wilcox
2020-11-12 19:11       ` Mike Rapoport
2020-10-26  4:13 ` [PATCH v3 02/12] mm/shmem: Use pagevec_lookup in shmem_unlock_mapping Matthew Wilcox (Oracle)
2020-10-26  4:13 ` [PATCH v3 03/12] mm/filemap: Add helper for finding pages Matthew Wilcox (Oracle)
2020-10-27 18:56   ` Christoph Hellwig
2020-11-12 14:25     ` Matthew Wilcox
2020-10-26  4:14 ` [PATCH v3 04/12] mm/filemap: Add mapping_seek_hole_data Matthew Wilcox (Oracle)
2020-10-26 10:48   ` Jan Kara
2020-10-26 12:17     ` Matthew Wilcox [this message]
2020-10-26 14:49       ` Jan Kara
2020-10-27 18:58   ` Christoph Hellwig
2020-10-27 20:04     ` Matthew Wilcox
2020-10-28  6:40       ` Christoph Hellwig
2020-10-26  4:14 ` [PATCH v3 05/12] mm: Add and use find_lock_entries Matthew Wilcox (Oracle)
2020-10-26  4:14 ` [PATCH v3 06/12] mm: Add an 'end' parameter to find_get_entries Matthew Wilcox (Oracle)
2020-10-26  4:14 ` [PATCH v3 07/12] mm: Add an 'end' parameter to pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-10-26  4:14 ` [PATCH v3 08/12] mm: Remove nr_entries parameter from pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-10-26  4:14 ` [PATCH v3 09/12] mm: Pass pvec directly to find_get_entries Matthew Wilcox (Oracle)
2020-10-26  4:14 ` [PATCH v3 10/12] mm: Remove pagevec_lookup_entries Matthew Wilcox (Oracle)
2020-10-26  4:14 ` [PATCH v3 11/12] mm/truncate,shmem: Handle truncates that split THPs Matthew Wilcox (Oracle)
2020-10-26 11:05   ` Jan Kara
2020-10-26  4:14 ` [PATCH v3 12/12] mm/filemap: Return only head pages from find_get_entries Matthew Wilcox (Oracle)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201026121727.GO20115@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=dchinner@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=william.kucharski@oracle.com \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.