linux-mm.kvack.org archive mirror
* [PATCH v5 00/27] Memory Folios
@ 2021-03-20  5:40 Matthew Wilcox (Oracle)
From: Matthew Wilcox (Oracle) @ 2021-03-20  5:40 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs

Managing memory in 4KiB pages is a serious overhead.  Many benchmarks
exist which show the benefits of a larger "page size".  As an example,
an earlier iteration of this idea which used compound pages got a 7%
performance boost when compiling the kernel using kernbench without any
particular tuning.

Using compound pages or THPs exposes a serious weakness in our type
system.  Functions are often unprepared for compound pages to be passed
to them, and may only act on PAGE_SIZE chunks.  Even functions which are
aware of compound pages may expect a head page, and do the wrong thing
if passed a tail page.

There have been efforts to label function parameters as 'head' instead
of 'page' to indicate that the function expects a head page, but this
leaves us with runtime assertions instead of using the compiler to prove
that nobody has mistakenly passed a tail page.  Calling a struct page
'head' is also inaccurate, as these functions work perfectly well on base pages.
The term 'nottail' has not proven popular.
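
To make the difference between a runtime assertion and a compile-time
guarantee concrete, here is a minimal user-space sketch.  It is not the
code from this series: struct page is cut down to two fields,
compound_head is held in a uintptr_t rather than the kernel's unsigned
long, and set_flag_head() and folio_set_flag() are hypothetical names
chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct page (assumption: not the real layout). */
struct page {
	unsigned long flags;
	uintptr_t compound_head;	/* bit 0 set => tail page; rest => head pointer */
};

/* A folio is a head or base page wearing a distinct type. */
struct folio {
	struct page page;
};

static bool page_is_tail(const struct page *page)
{
	return page->compound_head & 1;
}

/* 'head' naming convention: a tail page is only caught at run time. */
static void set_flag_head(struct page *head, unsigned int bit)
{
	assert(!page_is_tail(head));
	head->flags |= 1UL << bit;
}

/* Folio convention: passing a bare struct page * is a type mismatch. */
static void folio_set_flag(struct folio *folio, unsigned int bit)
{
	folio->page.flags |= 1UL << bit;
}

/* The one explicit conversion from any page to its folio. */
static struct folio *page_folio(struct page *page)
{
	if (page_is_tail(page))
		page = (struct page *)(page->compound_head - 1);
	return (struct folio *)page;
}
```

In this sketch, calling folio_set_flag(&pages[2], 3) with a struct page
pointer draws an incompatible-pointer-type diagnostic from the compiler,
while set_flag_head(&pages[2], 3) compiles cleanly and only trips the
assertion when it runs.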

We also waste a lot of instructions ensuring that we're not looking at
a tail page.  Almost every call to PageFoo() contains one or more hidden
calls to compound_head().  This also happens for get_page(), put_page()
and many more functions.  There does not appear to be a way to tell gcc
that it can cache the result of compound_head(), nor is there a way to
tell it that compound_head() is idempotent.
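
The cost of those hidden calls can be modelled in user space.  The
sketch below is an illustration under stated assumptions, not kernel
code: struct page is reduced to two fields, page_test_flag() and
folio_test_flag() are hypothetical stand-ins for the PageFoo() and
folio flag tests, and compound_head_calls is a counter added purely so
the number of lookups can be observed.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct page (assumption: not the real layout). */
struct page {
	unsigned long flags;
	uintptr_t compound_head;	/* bit 0 set => tail page; rest => head pointer */
};

struct folio {
	struct page page;
};

/* Instrumentation only: counts how often the head lookup runs. */
static unsigned long compound_head_calls;

static struct page *compound_head(struct page *page)
{
	compound_head_calls++;
	if (page->compound_head & 1)
		return (struct page *)(page->compound_head - 1);
	return page;
}

/* Page-based test: every call repeats the head lookup. */
static int page_test_flag(struct page *page, unsigned int bit)
{
	return (compound_head(page)->flags >> bit) & 1;
}

/* The lookup happens once, at the conversion. */
static struct folio *page_folio(struct page *page)
{
	return (struct folio *)compound_head(page);
}

/* Folio-based test: the type already says "not a tail page". */
static int folio_test_flag(const struct folio *folio, unsigned int bit)
{
	assert(!(folio->page.compound_head & 1));
	return (folio->page.flags >> bit) & 1;
}
```

Testing three flags on a tail page through page_test_flag() runs the
lookup three times; converting once with page_folio() and then calling
folio_test_flag() three times runs it once.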

This series introduces the 'struct folio' as a replacement for
head-or-base pages.  This initial set reduces the kernel size by
approximately 6kB, although its real purpose is adding infrastructure
to enable further use of the folio.

The intent is to convert all filesystems and some device drivers to work
in terms of folios.  This series contains a lot of explicit conversions,
but it's important to realise it's removing a lot of implicit conversions
in some relatively hot paths.  There will be very few conversions from
folios when this work is completed; filesystems, the page cache, the
LRU and so on will generally only deal with folios.

I analysed the text size reduction using a config based on Oracle UEK
with all modules changed to built-in.  That's obviously not a kernel
which makes sense to run, but it serves to compare the effects on (many
common) filesystems & drivers, not just the core.

add/remove: 33645/33632 grow/shrink: 1850/1924 up/down: 894474/-899674 (-5200)
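
The summary line above looks like the output format of the kernel's
scripts/bloat-o-meter; assuming that reading, the final figure in
parentheses is simply the sum of the bytes grown and the bytes shrunk.
The struct and function names below are hypothetical, used only to
check the arithmetic.

```c
#include <assert.h>

/* Fields of the summary line (names are illustrative assumptions). */
struct bloat_summary {
	long added, removed;	/* symbols added / removed */
	long grown, shrunk;	/* symbols that grew / shrank */
	long up, down;		/* bytes gained (>= 0) / bytes lost (<= 0) */
};

/* Net change in text size, in bytes; negative means a smaller kernel. */
static long net_bytes(const struct bloat_summary *s)
{
	assert(s->up >= 0 && s->down <= 0);
	return s->up + s->down;
}

/* Net change in the number of symbols. */
static long net_symbols(const struct bloat_summary *s)
{
	return s->added - s->removed;
}
```

With the numbers from this series, 894474 + (-899674) gives -5200 bytes,
and 33645 - 33632 gives 13 more symbols than before.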

Current tree at:
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio

(contains another ~100 patches on top of this batch, not all of which are
in good shape for submission)

v5:
 - Rebase on next-20210319
 - Pull out three bug-fix patches to the front of the series, allowing
   them to be applied earlier.
 - Fix folio_page() against pages being moved between swap & page cache
 - Fix FolioDoubleMap to use the right page flags
 - Rename next_folio() to folio_next() (akpm)
 - Renamed folio stat functions (akpm)
 - Add 'mod' versions of the folio stats for users that already have 'nr'
 - Renamed folio_page to folio_file_page() (akpm)
 - Added kernel-doc for struct folio, folio_next(), folio_index(),
   folio_file_page(), folio_contains(), folio_order(), folio_nr_pages(),
   folio_shift(), folio_size(), page_folio(), get_folio(), put_folio()
 - Make folio_private() work in terms of void * instead of unsigned long
 - Used page_folio() in attach/detach page_private() (hch)
 - Drop afs_page_mkwrite folio conversion from this series
 - Add wait_on_folio_writeback_killable()
 - Convert add_page_wait_queue() to add_folio_wait_queue()
 - Add folio_swap_entry() helper
 - Drop the additions of *FolioFsCache
 - Simplify the addition of lock_folio_memcg() et al
 - Drop test_clear_page_writeback() conversion from this series
 - Add FolioTransHuge() definition
 - Rename __folio_file_mapping() to swapcache_mapping()
 - Added swapcache_index() helper
 - Removed lock_folio_async()
 - Made __lock_folio_async() static to filemap.c
 - Converted unlock_page_private_2() to use a folio internally
v4:
 - Rebase on current Linus tree (including swap fix)
 - Analyse each patch in terms of its effects on kernel text size.
   A few were modified to improve their effect.  In particular, where
   pushing calls to page_folio() into the callers resulted in unacceptable
   size increases, the wrapper was placed in mm/folio-compat.c.  This lets
   us see all the places which are good targets for conversion to folios.
 - Some of the patches were reordered, split or merged in order to make
   more logical sense.
 - Use nth_page() for folio_next() if we're using SPARSEMEM and not
   VMEMMAP (Zi Yan)
 - Increment and decrement page stats in units of pages instead of units
   of folios (Zi Yan)
v3:
 - Rebase on next-20210127.  Two major sources of conflict, the
   generic_file_buffered_read refactoring (in akpm tree) and the
   fscache work (in dhowells tree).
v2:
 - Pare patch series back to just infrastructure and the page waiting
   parts.
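
As a rough guide to how the size helpers named in the v5 notes relate to
one another, and to what accounting "in units of pages" means, here is a
user-space sketch.  It assumes 4KiB base pages, and struct folio_model
with its order and index fields is a hypothetical model rather than the
real struct folio; stat_add_folio(), stat_sub_folio() and nr_stat_pages
are likewise invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12			/* assumption: 4KiB base pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical model; the real struct folio is laid out differently. */
struct folio_model {
	unsigned int order;	/* log2 of the number of base pages */
	unsigned long index;	/* page cache index of the first page */
};

static unsigned int folio_order(const struct folio_model *f)
{
	return f->order;
}

static unsigned long folio_nr_pages(const struct folio_model *f)
{
	return 1UL << folio_order(f);
}

static unsigned int folio_shift(const struct folio_model *f)
{
	return PAGE_SHIFT + folio_order(f);
}

static size_t folio_size(const struct folio_model *f)
{
	/* Guard against shifting past the width of unsigned long. */
	assert(folio_shift(f) < 8 * sizeof(unsigned long));
	return PAGE_SIZE << folio_order(f);
}

static bool folio_contains(const struct folio_model *f, unsigned long index)
{
	/* Unsigned subtraction wraps when index < f->index, so one compare suffices. */
	return index - f->index < folio_nr_pages(f);
}

/* Statistics move by the number of pages, not by one per folio. */
static long nr_stat_pages;

static void stat_add_folio(const struct folio_model *f)
{
	nr_stat_pages += (long)folio_nr_pages(f);
}

static void stat_sub_folio(const struct folio_model *f)
{
	nr_stat_pages -= (long)folio_nr_pages(f);
}
```

For an order-2 folio at index 8 this gives 4 pages, a shift of 14 and a
size of 16384 bytes; the folio contains indices 8 to 11, and adding it
to a counter raises that counter by 4 rather than by 1.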

Matthew Wilcox (Oracle) (27):
  fs/cachefiles: Remove wait_bit_key layout dependency
  mm/writeback: Add wait_on_page_writeback_killable
  afs: Use wait_on_page_writeback_killable
  mm: Introduce struct folio
  mm: Add folio_pgdat and folio_zone
  mm/vmstat: Add functions to account folio statistics
  mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  mm: Add put_folio
  mm: Add get_folio
  mm: Create FolioFlags
  mm: Handle per-folio private data
  mm: Add folio_index, folio_file_page and folio_contains
  mm/util: Add folio_mapping and folio_file_mapping
  mm/memcg: Add folio wrappers for various functions
  mm/filemap: Add unlock_folio
  mm/filemap: Add lock_folio
  mm/filemap: Add lock_folio_killable
  mm/filemap: Add __lock_folio_async
  mm/filemap: Add __lock_folio_or_retry
  mm/filemap: Add wait_on_folio_locked
  mm/filemap: Add end_folio_writeback
  mm/writeback: Add wait_on_folio_writeback
  mm/writeback: Add wait_for_stable_folio
  mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit
  mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit
  mm/filemap: Convert page wait queues to be folios
  mm/doc: Build kerneldoc for various mm files

 Documentation/core-api/mm-api.rst |   7 +
 fs/afs/write.c                    |   3 +-
 fs/cachefiles/rdwr.c              |  19 ++-
 fs/io_uring.c                     |   2 +-
 include/linux/memcontrol.h        |  21 +++
 include/linux/mm.h                | 156 +++++++++++++++----
 include/linux/mm_types.h          |  52 +++++++
 include/linux/mmdebug.h           |  20 +++
 include/linux/netfs.h             |   2 +-
 include/linux/page-flags.h        | 120 +++++++++++---
 include/linux/pagemap.h           | 249 ++++++++++++++++++++++--------
 include/linux/swap.h              |   6 +
 include/linux/vmstat.h            | 107 +++++++++++++
 mm/Makefile                       |   2 +-
 mm/filemap.c                      | 237 ++++++++++++++--------------
 mm/folio-compat.c                 |  37 +++++
 mm/memory.c                       |   8 +-
 mm/page-writeback.c               |  62 ++++++--
 mm/swapfile.c                     |   8 +-
 mm/util.c                         |  30 ++--
 20 files changed, 857 insertions(+), 291 deletions(-)
 create mode 100644 mm/folio-compat.c

-- 
2.30.2



Thread overview: 56+ messages
2021-03-20  5:40 [PATCH v5 00/27] Memory Folios Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 01/27] fs/cachefiles: Remove wait_bit_key layout dependency Matthew Wilcox (Oracle)
2021-03-22  8:06   ` Christoph Hellwig
2021-03-20  5:40 ` [PATCH v5 02/27] mm/writeback: Add wait_on_page_writeback_killable Matthew Wilcox (Oracle)
2021-03-22  8:07   ` Christoph Hellwig
2021-03-20  5:40 ` [PATCH v5 03/27] afs: Use wait_on_page_writeback_killable Matthew Wilcox (Oracle)
2021-03-22  8:08   ` Christoph Hellwig
2021-03-20  5:40 ` [PATCH v5 04/27] mm: Introduce struct folio Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 05/27] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 06/27] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 07/27] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 08/27] mm: Add put_folio Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 09/27] mm: Add get_folio Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 10/27] mm: Create FolioFlags Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 11/27] mm: Handle per-folio private data Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 12/27] mm: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 13/27] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 14/27] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 15/27] mm/filemap: Add unlock_folio Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 16/27] mm/filemap: Add lock_folio Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 17/27] mm/filemap: Add lock_folio_killable Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 18/27] mm/filemap: Add __lock_folio_async Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 19/27] mm/filemap: Add __lock_folio_or_retry Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 20/27] mm/filemap: Add wait_on_folio_locked Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 21/27] mm/filemap: Add end_folio_writeback Matthew Wilcox (Oracle)
2021-03-20  5:40 ` [PATCH v5 22/27] mm/writeback: Add wait_on_folio_writeback Matthew Wilcox (Oracle)
2021-03-20  5:41 ` [PATCH v5 23/27] mm/writeback: Add wait_for_stable_folio Matthew Wilcox (Oracle)
2021-03-20  5:41 ` [PATCH v5 24/27] mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit Matthew Wilcox (Oracle)
2021-03-21  7:10   ` kernel test robot
2021-03-20  5:41 ` [PATCH v5 25/27] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit Matthew Wilcox (Oracle)
2021-03-20  5:41 ` [PATCH v5 26/27] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
2021-03-20  7:54   ` kernel test robot
2021-03-20  5:41 ` [PATCH v5 27/27] mm/doc: Build kerneldoc for various mm files Matthew Wilcox (Oracle)
2021-03-22  3:25 ` [PATCH v5 00/27] Memory Folios Matthew Wilcox
2021-03-22  9:25 ` [PATCH v5 01/27] fs/cachefiles: Remove wait_bit_key layout dependency David Howells
2021-03-22  9:26 ` [PATCH v5 02/27] mm/writeback: Add wait_on_page_writeback_killable David Howells
2021-03-22  9:27 ` [PATCH v5 03/27] afs: Use wait_on_page_writeback_killable David Howells
2021-03-22 19:41   ` Matthew Wilcox
2021-03-22 17:59 ` [PATCH v5 00/27] Memory Folios Johannes Weiner
2021-03-22 18:47   ` Matthew Wilcox
2021-03-24  0:29     ` Johannes Weiner
2021-03-24  6:24       ` Matthew Wilcox
2021-03-26 17:48         ` Johannes Weiner
2021-03-29 16:58           ` Matthew Wilcox
2021-03-29 17:56             ` Matthew Wilcox
2021-03-30 19:30             ` Johannes Weiner
2021-03-30 21:09               ` Matthew Wilcox
2021-03-31 18:14                 ` Johannes Weiner
2021-03-31 18:28                   ` Matthew Wilcox
2021-04-01  5:05                 ` Al Viro
2021-04-01 12:07                   ` Matthew Wilcox
2021-04-01 16:00                   ` Johannes Weiner
2021-03-31 14:54               ` Christoph Hellwig
2021-03-23 15:50   ` Christoph Hellwig
2021-03-23 11:29 ` [PATCH v5 03/27] afs: Use wait_on_page_writeback_killable David Howells
2021-03-23 17:50 ` [PATCH v5 00/27] Memory Folios David Howells
