All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andreas Dilger <adilger@dilger.ca>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Theodore Ts'o <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Jan Kara <jack@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Hugh Dickins <hughd@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org
Subject: Re: [PATCHv6 11/37] HACK: readahead: alloc huge pages, if allowed
Date: Thu, 9 Feb 2017 17:23:31 -0700	[thread overview]
Message-ID: <7D35EB8E-29F8-41DA-BB46-8BCF7B6C5A72@dilger.ca> (raw)
In-Reply-To: <20170209233436.GZ2267@bombadil.infradead.org>

[-- Attachment #1: Type: text/plain, Size: 2156 bytes --]

On Feb 9, 2017, at 4:34 PM, Matthew Wilcox <willy@infradead.org> wrote:
> 
> On Thu, Jan 26, 2017 at 02:57:53PM +0300, Kirill A. Shutemov wrote:
>> Most page cache allocation happens via readahead (sync or async), so if
>> we want to have significant number of huge pages in page cache we need
>> to find a ways to allocate them from readahead.
>> 
>> Unfortunately, huge pages doesn't fit into current readahead design:
>> 128 max readahead window, assumption on page size, PageReadahead() to
>> track hit/miss.
>> 
>> I haven't found a ways to get it right yet.
>> 
>> This patch just allocates huge page if allowed, but doesn't really
>> provide any readahead if huge page is allocated. We read out 2M a time
>> and I would expect spikes in latancy without readahead.
>> 
>> Therefore HACK.
>> 
>> Having that said, I don't think it should prevent huge page support to
>> be applied. Future will show if lacking readahead is a big deal with
>> huge pages in page cache.
>> 
>> Any suggestions are welcome.
> 
> Well ... what if we made readahead 2 hugepages in size for inodes which
> are using huge pages?  That's only 8x our current readahead window, and
> if you're asking for hugepages, you're accepting that IOs are going to
> be larger, and you probably have the kind of storage system which can
> handle doing larger IOs.

It would be nice if the bdi had a parameter for the maximum readahead size.
Currently, readahead is capped at 2MB chunks by force_page_cache_readahead()
even if bdi->ra_pages and bdi->io_pages are much larger.

It should be up to the filesystem to decide how large the readahead chunks
are rather than imposing some policy in the MM code.  For high-speed (network)
storage access it is better to have at least 4MB read chunks, for RAID storage
it is desirable to have stripe-aligned readahead to avoid read inflation when
verifying the parity.  Any fixed size will eventually be inadequate as disks
and filesystems change, so it may as well be a per-bdi tunable that can be set
by the filesystem as needed, or possibly with a mount option if needed.


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

  reply	other threads:[~2017-02-10  0:23 UTC|newest]

Thread overview: 134+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-26 11:57 [PATCHv6 00/37] ext4: support of huge pages Kirill A. Shutemov
2017-01-26 11:57 ` Kirill A. Shutemov
2017-01-26 11:57 ` Kirill A. Shutemov
2017-01-26 11:57 ` [PATCHv6 01/37] mm, shmem: swich huge tmpfs to multi-order radix-tree entries Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09  3:57   ` Matthew Wilcox
2017-02-09  3:57     ` Matthew Wilcox
2017-02-09 16:58     ` Kirill A. Shutemov
2017-02-09 16:58       ` Kirill A. Shutemov
2017-02-13 13:43       ` Kirill A. Shutemov
2017-02-13 13:43         ` Kirill A. Shutemov
2017-01-26 11:57 ` [PATCHv6 02/37] Revert "radix-tree: implement radix_tree_maybe_preload_order()" Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-01-26 15:38   ` Matthew Wilcox
2017-01-26 15:38     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 03/37] page-flags: relax page flag policy for few flags Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09  4:01   ` Matthew Wilcox
2017-02-09  4:01     ` Matthew Wilcox
2017-02-13 13:59     ` Kirill A. Shutemov
2017-02-13 13:59       ` Kirill A. Shutemov
2017-01-26 11:57 ` [PATCHv6 04/37] mm, rmap: account file thp pages Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09 20:17   ` Matthew Wilcox
2017-02-09 20:17     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 05/37] thp: try to free page's buffers before attempt split Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09 20:14   ` Matthew Wilcox
2017-02-09 20:14     ` Matthew Wilcox
2017-02-13 14:32     ` Kirill A. Shutemov
2017-02-13 14:32       ` Kirill A. Shutemov
2017-01-26 11:57 ` [PATCHv6 06/37] thp: handle write-protection faults for file THP Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-01-26 15:44   ` Matthew Wilcox
2017-01-26 15:44     ` Matthew Wilcox
2017-01-26 15:57     ` Kirill A. Shutemov
2017-01-26 15:57       ` Kirill A. Shutemov
2017-02-09 20:19   ` Matthew Wilcox
2017-02-09 20:19     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 07/37] filemap: allocate huge page in page_cache_read(), if allowed Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09 21:18   ` Matthew Wilcox
2017-02-09 21:18     ` Matthew Wilcox
2017-02-13 15:17     ` Kirill A. Shutemov
2017-02-13 15:17       ` Kirill A. Shutemov
2017-01-26 11:57 ` [PATCHv6 08/37] filemap: handle huge pages in do_generic_file_read() Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09 21:55   ` Matthew Wilcox
2017-02-09 21:55     ` Matthew Wilcox
2017-02-13 15:33     ` Kirill A. Shutemov
2017-02-13 15:33       ` Kirill A. Shutemov
2017-02-13 16:01       ` Matthew Wilcox
2017-02-13 16:01         ` Matthew Wilcox
2017-02-13 16:09         ` Matthew Wilcox
2017-02-13 16:09           ` Matthew Wilcox
2017-02-13 16:28   ` Matthew Wilcox
2017-02-13 16:28     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 09/37] filemap: allocate huge page in pagecache_get_page(), if allowed Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09 21:59   ` Matthew Wilcox
2017-02-09 21:59     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 10/37] filemap: handle huge pages in filemap_fdatawait_range() Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09 23:03   ` Matthew Wilcox
2017-02-09 23:03     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 11/37] HACK: readahead: alloc huge pages, if allowed Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-09 23:34   ` Matthew Wilcox
2017-02-09 23:34     ` Matthew Wilcox
2017-02-10  0:23     ` Andreas Dilger [this message]
2017-02-10 14:51       ` Matthew Wilcox
2017-02-10 14:51         ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 12/37] brd: make it handle huge pages Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-10 17:24   ` Matthew Wilcox
2017-02-10 17:24     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 13/37] mm: make write_cache_pages() work on " Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-10 17:46   ` Matthew Wilcox
2017-02-10 17:46     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 14/37] thp: introduce hpage_size() and hpage_mask() Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-01-26 11:57 ` [PATCHv6 15/37] thp: do not threat slab pages as huge in hpage_{nr_pages,size,mask} Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-10 22:13   ` Matthew Wilcox
2017-02-10 22:13     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 16/37] thp: make thp_get_unmapped_area() respect S_HUGE_MODE Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-10 17:50   ` Matthew Wilcox
2017-02-10 17:50     ` Matthew Wilcox
2017-01-26 11:57 ` [PATCHv6 17/37] fs: make block_read_full_page() be able to read huge page Kirill A. Shutemov
2017-01-26 11:57   ` Kirill A. Shutemov
2017-02-10 17:58   ` Matthew Wilcox
2017-02-10 17:58     ` Matthew Wilcox
2017-01-26 11:58 ` [PATCHv6 18/37] fs: make block_write_{begin,end}() be able to handle huge pages Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 19/37] fs: make block_page_mkwrite() aware about " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 20/37] truncate: make truncate_inode_pages_range() " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 21/37] truncate: make invalidate_inode_pages2_range() " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 22/37] mm, hugetlb: switch hugetlbfs to multi-order radix-tree entries Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 23/37] mm: account huge pages to dirty, writaback, reclaimable, etc Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 24/37] ext4: make ext4_mpage_readpages() hugepage-aware Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 25/37] ext4: make ext4_writepage() work on huge pages Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 26/37] ext4: handle huge pages in ext4_page_mkwrite() Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 27/37] ext4: handle huge pages in __ext4_block_zero_page_range() Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 28/37] ext4: make ext4_block_write_begin() aware about huge pages Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 29/37] ext4: handle huge pages in ext4_da_write_end() Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 30/37] ext4: make ext4_da_page_release_reservation() aware about huge pages Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 31/37] ext4: handle writeback with " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 32/37] ext4: make EXT4_IOC_MOVE_EXT work " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 33/37] ext4: fix SEEK_DATA/SEEK_HOLE for " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 34/37] ext4: make fallocate() operations work with " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 35/37] ext4: reserve larger jounral transaction for " Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 36/37] mm, fs, ext4: expand use of page_mapping() and page_to_pgoff() Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov
2017-01-26 11:58 ` [PATCHv6 37/37] ext4, vfs: add huge= mount option Kirill A. Shutemov
2017-01-26 11:58   ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7D35EB8E-29F8-41DA-BB46-8BCF7B6C5A72@dilger.ca \
    --to=adilger@dilger.ca \
    --cc=aarcange@redhat.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@intel.com \
    --cc=hughd@google.com \
    --cc=jack@suse.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ross.zwisler@linux.intel.com \
    --cc=tytso@mit.edu \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.