From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)", linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org,
 linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org,
 linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
 ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org
Subject: [PATCH v6 09/16] btrfs: Convert from readpages to readahead
Date: Mon, 17 Feb 2020 10:45:55 -0800
Message-Id: <20200217184613.19668-15-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>
MIME-Version: 1.0

From: "Matthew Wilcox (Oracle)"

Use the new readahead operation in btrfs.
Add a readahead_for_each_batch() iterator to optimise the loop in the XArray.

Signed-off-by: Matthew Wilcox (Oracle)
---
 fs/btrfs/extent_io.c    | 48 ++++++++++++++---------------------------
 fs/btrfs/extent_io.h    |  3 +--
 fs/btrfs/inode.c        | 16 ++++++--------
 include/linux/pagemap.h | 27 +++++++++++++++++++++++
 4 files changed, 51 insertions(+), 43 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c0f202741e09..d9f66058e0a7 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4278,52 +4278,36 @@ int extent_writepages(struct address_space *mapping,
 	return ret;
 }
 
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages)
+void extent_readahead(struct readahead_control *rac)
 {
 	struct bio *bio = NULL;
 	unsigned long bio_flags = 0;
 	struct page *pagepool[16];
 	struct extent_map *em_cached = NULL;
-	struct extent_io_tree *tree = &BTRFS_I(mapping->host)->io_tree;
-	int nr = 0;
+	struct extent_io_tree *tree = &BTRFS_I(rac->mapping->host)->io_tree;
 	u64 prev_em_start = (u64)-1;
+	int nr;
 
-	while (!list_empty(pages)) {
-		u64 contig_end = 0;
-
-		for (nr = 0; nr < ARRAY_SIZE(pagepool) && !list_empty(pages);) {
-			struct page *page = lru_to_page(pages);
-
-			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (add_to_page_cache_lru(page, mapping, page->index,
-						readahead_gfp_mask(mapping))) {
-				put_page(page);
-				break;
-			}
-
-			pagepool[nr++] = page;
-			contig_end = page_offset(page) + PAGE_SIZE - 1;
-		}
-
-		if (nr) {
-			u64 contig_start = page_offset(pagepool[0]);
+	readahead_for_each_batch(rac, pagepool, ARRAY_SIZE(pagepool), nr) {
+		u64 contig_start = page_offset(pagepool[0]);
+		u64 contig_end = page_offset(pagepool[nr - 1]) + PAGE_SIZE - 1;
 
-			ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
+		ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
 
-			contiguous_readpages(tree, pagepool, nr, contig_start,
-				     contig_end, &em_cached, &bio, &bio_flags,
-				     &prev_em_start);
-		}
+		contiguous_readpages(tree, pagepool, nr, contig_start,
+				contig_end, &em_cached, &bio, &bio_flags,
+				&prev_em_start);
 	}
 
 	if (em_cached)
 		free_extent_map(em_cached);
 
-	if (bio)
-		return submit_one_bio(bio, 0, bio_flags);
-	return 0;
+	if (bio) {
+		int ret = submit_one_bio(bio, 0, bio_flags);
+		if (ret < 0) {
+			/* XXX: unlock the pages here? */
+		}
+	}
 }
 
 /*
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 5d205bbaafdc..bddac32948c7 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -198,8 +198,7 @@ int extent_writepages(struct address_space *mapping,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
 			    struct writeback_control *wbc);
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages);
+void extent_readahead(struct readahead_control *rac);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len);
 void set_page_extent_mapped(struct page *page);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5b3ec93ff911..d964b2a78ed8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4794,8 +4794,8 @@ static void evict_inode_truncate_pages(struct inode *inode)
 
 	/*
 	 * Keep looping until we have no more ranges in the io tree.
-	 * We can have ongoing bios started by readpages (called from readahead)
-	 * that have their endio callback (extent_io.c:end_bio_extent_readpage)
+	 * We can have ongoing bios started by readahead that have
+	 * their endio callback (extent_io.c:end_bio_extent_readpage)
 	 * still in progress (unlocked the pages in the bio but did not yet
 	 * unlocked the ranges in the io tree). Therefore this means some
 	 * ranges can still be locked and eviction started because before
@@ -6996,11 +6996,11 @@ static int lock_extent_direct(struct inode *inode, u64 lockstart, u64 lockend,
 			 * for it to complete) and then invalidate the pages for
 			 * this range (through invalidate_inode_pages2_range()),
 			 * but that can lead us to a deadlock with a concurrent
-			 * call to readpages() (a buffered read or a defrag call
+			 * call to readahead (a buffered read or a defrag call
 			 * triggered a readahead) on a page lock due to an
 			 * ordered dio extent we created before but did not have
 			 * yet a corresponding bio submitted (whence it can not
-			 * complete), which makes readpages() wait for that
+			 * complete), which makes readahead wait for that
 			 * ordered extent to complete while holding a lock on
 			 * that page.
 			 */
@@ -8239,11 +8239,9 @@ static int btrfs_writepages(struct address_space *mapping,
 	return extent_writepages(mapping, wbc);
 }
 
-static int
-btrfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void btrfs_readahead(struct readahead_control *rac)
 {
-	return extent_readpages(mapping, pages, nr_pages);
+	extent_readahead(rac);
 }
 
 static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
@@ -10448,7 +10446,7 @@ static const struct address_space_operations btrfs_aops = {
 	.readpage	= btrfs_readpage,
 	.writepage	= btrfs_writepage,
 	.writepages	= btrfs_writepages,
-	.readpages	= btrfs_readpages,
+	.readahead	= btrfs_readahead,
 	.direct_IO	= btrfs_direct_IO,
 	.invalidatepage = btrfs_invalidatepage,
 	.releasepage	= btrfs_releasepage,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 4f36c06d064d..1bbb60a0bf16 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -669,6 +669,33 @@ static inline void readahead_next(struct readahead_control *rac)
 #define readahead_for_each(rac, page)					\
 	for (; (page = readahead_page(rac)); readahead_next(rac))
 
+static inline unsigned int readahead_page_batch(struct readahead_control *rac,
+		struct page **array, unsigned int size)
+{
+	unsigned int batch = 0;
+	XA_STATE(xas, &rac->mapping->i_pages, rac->_start);
+	struct page *page;
+
+	rac->_batch_count = 0;
+	xas_for_each(&xas, page, rac->_start + rac->_nr_pages - 1) {
+		VM_BUG_ON_PAGE(!PageLocked(page), page);
+		VM_BUG_ON_PAGE(PageTail(page), page);
+		array[batch++] = page;
+		rac->_batch_count += hpage_nr_pages(page);
+		if (PageHead(page))
+			xas_set(&xas, rac->_start + rac->_batch_count);
+
+		if (batch == size)
+			break;
+	}
+
+	return batch;
+}
+
+#define readahead_for_each_batch(rac, array, size, nr)			\
+	for (; (nr = readahead_page_batch(rac, array, size));		\
+			readahead_next(rac))
+
 /* The byte offset into the file of this readahead block */
 static inline loff_t readahead_offset(struct readahead_control *rac)
 {
-- 
2.25.0
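
Note for readers: the batching contract of readahead_for_each_batch() can be tried outside the kernel with a minimal userspace sketch in C. This is not kernel code and not part of the patch. struct toy_rac, toy_page_batch(), toy_next(), toy_for_each_batch() and toy_count_batches() are hypothetical stand-ins invented for illustration; pages are modelled as plain indices, compound pages are ignored, and toy_next() assumes that readahead_next() advances the window by the size of the last batch (its body is not shown in this patch).

```c
#include <assert.h>

/* Hypothetical stand-in for struct readahead_control: a window of
 * page indices [start, start + nr_pages) that still has to be read. */
struct toy_rac {
	unsigned long start;      /* first index not yet consumed */
	unsigned int nr_pages;    /* indices left in the window */
	unsigned int batch_count; /* indices handed out by the last batch */
};

/* Fill array with up to size consecutive indices from the window and
 * return how many were stored (0 once the window is empty). */
static unsigned int toy_page_batch(struct toy_rac *rac,
				   unsigned long *array, unsigned int size)
{
	unsigned int batch = 0;

	rac->batch_count = 0;
	while (batch < size && batch < rac->nr_pages) {
		array[batch] = rac->start + batch;
		batch++;
		rac->batch_count++;
	}
	return batch;
}

/* Assumption: consume the last batch by moving the window forward. */
static void toy_next(struct toy_rac *rac)
{
	rac->start += rac->batch_count;
	rac->nr_pages -= rac->batch_count;
	rac->batch_count = 0;
}

/* Same shape as the macro in the patch: fetch a batch, run the body,
 * then advance past it. */
#define toy_for_each_batch(rac, array, size, nr) \
	for (; (nr = toy_page_batch(rac, array, size)); toy_next(rac))

/* Walk a window in batches of at most 16 entries and return how many
 * times the loop body ran. */
static unsigned int toy_count_batches(unsigned long start,
				      unsigned int nr_pages)
{
	struct toy_rac rac = { start, nr_pages, 0 };
	unsigned long pool[16];
	unsigned int nr, batches = 0;

	toy_for_each_batch(&rac, pool, 16, nr) {
		/* Mirrors the ASSERT in extent_readahead(): each batch
		 * covers one contiguous range. */
		assert(pool[nr - 1] - pool[0] + 1 == nr);
		batches++;
	}
	return batches;
}
```

With these definitions, toy_count_batches(100, 40) walks 40 indices in batches of at most 16, so the loop body runs three times (16, 16 and 8 entries). An empty window never enters the loop body, which is why the caller does not need a separate "if (nr)" check.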