From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: "Theodore Ts'o" <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Jan Kara <jack@suse.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Hugh Dickins <hughd@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Dave Hansen <dave.hansen@intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-block@vger.kernel.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCHv3 23/41] fs: make block_read_full_page() be able to read huge page
Date: Thu, 15 Sep 2016 14:55:05 +0300 [thread overview]
Message-ID: <20160915115523.29737-24-kirill.shutemov@linux.intel.com> (raw)
In-Reply-To: <20160915115523.29737-1-kirill.shutemov@linux.intel.com>
The approach is straight-forward: for compound pages we read out whole
huge page.
For huge page we cannot have array of buffer head pointers on stack --
it's 4096 pointers on x86-64 -- 'arr' is allocated with kmalloc() for
huge pages.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
fs/buffer.c | 22 +++++++++++++++++-----
include/linux/buffer_head.h | 9 +++++----
include/linux/page-flags.h | 2 +-
3 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 9c8eb9b6db6a..2739f5dae690 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -870,7 +870,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
try_again:
head = NULL;
- offset = PAGE_SIZE;
+ offset = hpage_size(page);
while ((offset -= size) >= 0) {
bh = alloc_buffer_head(GFP_NOFS);
if (!bh)
@@ -1466,7 +1466,7 @@ void set_bh_page(struct buffer_head *bh,
struct page *page, unsigned long offset)
{
bh->b_page = page;
- BUG_ON(offset >= PAGE_SIZE);
+ BUG_ON(offset >= hpage_size(page));
if (PageHighMem(page))
/*
* This catches illegal uses and preserves the offset:
@@ -2239,11 +2239,13 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
{
struct inode *inode = page->mapping->host;
sector_t iblock, lblock;
- struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
+ struct buffer_head *arr_on_stack[MAX_BUF_PER_PAGE];
+ struct buffer_head *bh, *head, **arr = arr_on_stack;
unsigned int blocksize, bbits;
int nr, i;
int fully_mapped = 1;
+ VM_BUG_ON_PAGE(PageTail(page), page);
head = create_page_buffers(page, inode, 0);
blocksize = head->b_size;
bbits = block_size_bits(blocksize);
@@ -2254,6 +2256,11 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
nr = 0;
i = 0;
+ if (PageTransHuge(page)) {
+ arr = kmalloc(sizeof(struct buffer_head *) * HPAGE_PMD_NR *
+ MAX_BUF_PER_PAGE, GFP_NOFS);
+ }
+
do {
if (buffer_uptodate(bh))
continue;
@@ -2269,7 +2276,9 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
SetPageError(page);
}
if (!buffer_mapped(bh)) {
- zero_user(page, i * blocksize, blocksize);
+ zero_user(page + (i * blocksize / PAGE_SIZE),
+ i * blocksize % PAGE_SIZE,
+ blocksize);
if (!err)
set_buffer_uptodate(bh);
continue;
@@ -2295,7 +2304,7 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
if (!PageError(page))
SetPageUptodate(page);
unlock_page(page);
- return 0;
+ goto out;
}
/* Stage two: lock the buffers */
@@ -2317,6 +2326,9 @@ int block_read_full_page(struct page *page, get_block_t *get_block)
else
submit_bh(REQ_OP_READ, 0, bh);
}
+out:
+ if (arr != arr_on_stack)
+ kfree(arr);
return 0;
}
EXPORT_SYMBOL(block_read_full_page);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 006a8a42acfb..194a85822d5f 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -131,13 +131,14 @@ BUFFER_FNS(Meta, meta)
BUFFER_FNS(Prio, prio)
BUFFER_FNS(Defer_Completion, defer_completion)
-#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~PAGE_MASK)
+#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~hpage_mask(bh->b_page))
/* If we *know* page->private refers to buffer_heads */
-#define page_buffers(page) \
+#define page_buffers(__page) \
({ \
- BUG_ON(!PagePrivate(page)); \
- ((struct buffer_head *)page_private(page)); \
+ struct page *p = compound_head(__page); \
+ BUG_ON(!PagePrivate(p)); \
+ ((struct buffer_head *)page_private(p)); \
})
#define page_has_buffers(page) PagePrivate(page)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index a2bef9a41bcf..20b7684e9298 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -730,7 +730,7 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
*/
static inline int page_has_private(struct page *page)
{
- return !!(page->flags & PAGE_FLAGS_PRIVATE);
+ return !!(compound_head(page)->flags & PAGE_FLAGS_PRIVATE);
}
#undef PF_ANY
--
2.9.3
next prev parent reply other threads:[~2016-09-15 12:08 UTC|newest]
Thread overview: 75+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-15 11:54 [PATCHv3 00/41] ext4: support of huge pages Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 01/41] tools: Add WARN_ON_ONCE Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 02/41] radix tree test suite: Allow GFP_ATOMIC allocations to fail Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 03/41] radix-tree: Add radix_tree_join Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 04/41] radix-tree: Add radix_tree_split Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 05/41] radix-tree: Add radix_tree_split_preload() Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 06/41] radix-tree: Handle multiorder entries being deleted by replace_clear_tags Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 07/41] mm, shmem: swich huge tmpfs to multi-order radix-tree entries Kirill A. Shutemov
2016-09-16 12:07 ` Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 08/41] Revert "radix-tree: implement radix_tree_maybe_preload_order()" Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 09/41] page-flags: relax page flag policy for few flags Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 10/41] mm, rmap: account file thp pages Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 11/41] thp: try to free page's buffers before attempt split Kirill A. Shutemov
2016-10-11 15:40 ` Jan Kara
2016-10-11 21:43 ` Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 12/41] thp: handle write-protection faults for file THP Kirill A. Shutemov
2016-10-11 15:47 ` Jan Kara
2016-10-11 21:47 ` Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 13/41] truncate: make sure invalidate_mapping_pages() can discard huge pages Kirill A. Shutemov
2016-10-11 15:58 ` Jan Kara
2016-10-11 21:53 ` Kirill A. Shutemov
2016-10-12 6:43 ` Jan Kara
2016-10-24 10:41 ` Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 14/41] filemap: allocate huge page in page_cache_read(), if allowed Kirill A. Shutemov
2016-10-11 16:15 ` Jan Kara
2016-10-11 21:57 ` Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 15/41] filemap: handle huge pages in do_generic_file_read() Kirill A. Shutemov
2016-10-13 9:33 ` Jan Kara
2016-10-31 18:10 ` Kirill A. Shutemov
2016-11-01 16:39 ` Jan Kara
2016-11-02 8:32 ` Kirill A. Shutemov
2016-11-02 14:37 ` Christoph Hellwig
2016-11-03 20:40 ` Jan Kara
2016-11-07 11:07 ` Kirill A. Shutemov
2016-11-07 14:59 ` Christoph Hellwig
2016-11-02 14:36 ` Christoph Hellwig
2016-11-03 17:56 ` Jan Kara
2016-11-07 11:13 ` Kirill A. Shutemov
2016-11-07 15:01 ` Christoph Hellwig
2016-11-07 16:03 ` Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 16/41] filemap: allocate huge page in pagecache_get_page(), if allowed Kirill A. Shutemov
2016-09-15 11:54 ` [PATCHv3 17/41] filemap: handle huge pages in filemap_fdatawait_range() Kirill A. Shutemov
2016-10-13 9:44 ` Jan Kara
2016-10-13 12:08 ` Kirill A. Shutemov
2016-10-13 13:18 ` Jan Kara
2016-10-24 11:36 ` Kirill A. Shutemov
2016-10-30 17:31 ` Jan Kara
2016-09-15 11:55 ` [PATCHv3 18/41] HACK: readahead: alloc huge pages, if allowed Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 19/41] block: define BIO_MAX_PAGES to HPAGE_PMD_NR if huge page cache enabled Kirill A. Shutemov
2016-09-16 12:09 ` Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 20/41] mm: make write_cache_pages() work on huge pages Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 21/41] thp: introduce hpage_size() and hpage_mask() Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 22/41] thp: do not threat slab pages as huge in hpage_{nr_pages,size,mask} Kirill A. Shutemov
2016-09-15 11:55 ` Kirill A. Shutemov [this message]
2016-09-15 11:55 ` [PATCHv3 24/41] fs: make block_write_{begin,end}() be able to handle huge pages Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 25/41] fs: make block_page_mkwrite() aware about " Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 26/41] truncate: make truncate_inode_pages_range() " Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 27/41] truncate: make invalidate_inode_pages2_range() " Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 28/41] mm, hugetlb: switch hugetlbfs to multi-order radix-tree entries Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 29/41] ext4: make ext4_mpage_readpages() hugepage-aware Kirill A. Shutemov
2016-09-15 12:27 ` Andreas Dilger
2016-09-16 12:17 ` Kirill A. Shutemov
2016-09-16 12:10 ` Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 30/41] ext4: make ext4_writepage() work on huge pages Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 31/41] ext4: handle huge pages in ext4_page_mkwrite() Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 32/41] ext4: handle huge pages in __ext4_block_zero_page_range() Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 33/41] ext4: make ext4_block_write_begin() aware about huge pages Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 34/41] ext4: handle huge pages in ext4_da_write_end() Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 35/41] ext4: make ext4_da_page_release_reservation() aware about huge pages Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 36/41] ext4: handle writeback with " Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 37/41] ext4: make EXT4_IOC_MOVE_EXT work " Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 38/41] ext4: fix SEEK_DATA/SEEK_HOLE for " Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 39/41] ext4: make fallocate() operations work with " Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 40/41] mm, fs, ext4: expand use of page_mapping() and page_to_pgoff() Kirill A. Shutemov
2016-09-15 11:55 ` [PATCHv3 41/41] ext4, vfs: add huge= mount option Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160915115523.29737-24-kirill.shutemov@linux.intel.com \
--to=kirill.shutemov@linux.intel.com \
--cc=aarcange@redhat.com \
--cc=adilger.kernel@dilger.ca \
--cc=akpm@linux-foundation.org \
--cc=dave.hansen@intel.com \
--cc=hughd@google.com \
--cc=jack@suse.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ross.zwisler@linux.intel.com \
--cc=tytso@mit.edu \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).