From: Jan Kara
To: "Matthew Wilcox (Oracle)"
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Andrew Morton,
	Hugh Dickins, Johannes Weiner, Yang Shi, Dave Chinner,
	linux-kernel@vger.kernel.org, William Kucharski
Subject: Re: [PATCH v3 04/12] mm/filemap: Add mapping_seek_hole_data
Date: Mon, 26 Oct 2020 11:48:06 +0100
Message-ID: <20201026104806.GB29758@quack2.suse.cz>
In-Reply-To: <20201026041408.25230-5-willy@infradead.org>
References: <20201026041408.25230-1-willy@infradead.org>
 <20201026041408.25230-5-willy@infradead.org>

On Mon 26-10-20 04:14:00, Matthew Wilcox (Oracle) wrote:
> Rewrite shmem_seek_hole_data() and move it to filemap.c.
>
> Signed-off-by: Matthew Wilcox (Oracle)
> Reviewed-by: William Kucharski
> ---
>  include/linux/pagemap.h |  2 ++
>  mm/filemap.c            | 76 +++++++++++++++++++++++++++++++++++++++++
>  mm/shmem.c              | 72 +++----------------------------------
>  3 files changed, 82 insertions(+), 68 deletions(-)
>
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index c77b7c31b2e4..5f3e829c91fd 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -760,6 +760,8 @@ extern void __delete_from_page_cache(struct page *page, void *shadow);
>  int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask);
>  void delete_from_page_cache_batch(struct address_space *mapping,
>  				  struct pagevec *pvec);
> +loff_t mapping_seek_hole_data(struct address_space *, loff_t start, loff_t end,
> +		int whence);
>
>  /*
>   * Like add_to_page_cache_locked, but used to add newly allocated pages:
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 00eaed59e797..3a55d258d9f2 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2526,6 +2526,82 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>  }
>  EXPORT_SYMBOL(generic_file_read_iter);
>
> +static inline loff_t page_seek_hole_data(struct page *page,
> +		loff_t start, loff_t end, bool seek_data)
> +{
> +	if (xa_is_value(page) || PageUptodate(page))

Please add a comment here that this is currently tmpfs-specific, treating
exceptional entries as swapped-out pages and thus data. It took me quite a
while to figure this out. You can remove the comment later when it is no
longer true...

> +		return seek_data ? start : end;
> +	return seek_data ? end : start;
> +}
> +
> +static inline
> +unsigned int seek_page_size(struct xa_state *xas, struct page *page)
> +{
> +	if (xa_is_value(page))
> +		return PAGE_SIZE << xa_get_order(xas->xa, xas->xa_index);
> +	return thp_size(page);
> +}
> +
> +/**
> + * mapping_seek_hole_data - Seek for SEEK_DATA / SEEK_HOLE in the page cache.
> + * @mapping: Address space to search.
> + * @start: First byte to consider.
> + * @end: Limit of search (exclusive).
> + * @whence: Either SEEK_HOLE or SEEK_DATA.
> + *
> + * If the page cache knows which blocks contain holes and which blocks
> + * contain data, your filesystem can use this function to implement
> + * SEEK_HOLE and SEEK_DATA.  This is useful for filesystems which are
> + * entirely memory-based such as tmpfs, and filesystems which support
> + * unwritten extents.
> + *
> + * Return: The requested offset on success, or -ENXIO if @whence specifies
> + * SEEK_DATA and there is no data after @start.  There is an implicit hole
> + * after @end - 1, so SEEK_HOLE returns @end if all the bytes between @start
> + * and @end contain data.
> + */
> +loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
> +		loff_t end, int whence)
> +{
> +	XA_STATE(xas, &mapping->i_pages, start >> PAGE_SHIFT);
> +	pgoff_t max = (end - 1) / PAGE_SIZE;
> +	bool seek_data = (whence == SEEK_DATA);
> +	struct page *page;
> +
> +	if (end <= start)
> +		return -ENXIO;
> +
> +	rcu_read_lock();
> +	while ((page = xas_find_get_entry(&xas, max, XA_PRESENT))) {
> +		loff_t pos = xas.xa_index * PAGE_SIZE;
> +
> +		if (start < pos) {
> +			if (!seek_data)
> +				goto unlock;
> +			start = pos;
> +		}
> +
> +		pos += seek_page_size(&xas, page);
> +		start = page_seek_hole_data(page, start, pos, seek_data);
> +		if (start < pos)
> +			goto unlock;

Uh, I was staring at this function for half an hour but I still couldn't
convince myself that it is correct in all the corner cases. Maybe I'm dumb
but I'd wish this was more intuitive (and I have to say that the original
tmpfs function is much more obviously correct to me). It would be more
understandable for me if we had code like:

	if (page_seek_match(page, seek_data))
		goto unlock;

which would be just the condition in page_seek_hole_data(). Honestly, at
the moment I fail to see why you bother with 'pos' in the above four lines
at all.
BTW I suspect that this loop forgets to release the page reference it has
got when doing SEEK_HOLE.

> +	}
> +	rcu_read_unlock();
> +
> +	if (seek_data)
> +		return -ENXIO;
> +	goto out;
> +
> +unlock:
> +	rcu_read_unlock();
> +	if (!xa_is_value(page))
> +		put_page(page);
> +out:
> +	if (start > end)
> +		return end;
> +	return start;
> +}

								Honza
-- 
Jan Kara
SUSE Labs, CR