From: Matthew Wilcox <willy@infradead.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-cachefs@redhat.com,
	linux-afs@lists.infradead.org
Subject: Re: [PATCH v5 00/27] Memory Folios
Date: Mon, 29 Mar 2021 18:56:24 +0100
Message-ID: <20210329175624.GI351017@casper.infradead.org>
In-Reply-To: <20210329165832.GG351017@casper.infradead.org>

On Mon, Mar 29, 2021 at 05:58:32PM +0100, Matthew Wilcox wrote:
> In broad strokes, I think that having a Power Of Two Allocator
> with Descriptor (POTAD) is a useful foundational allocator to have.
> The specific allocator that we call the buddy allocator is very clever for
> the 1990s, but touches too many cachelines to be good with today's CPUs.
> The generalisation of the buddy allocator to the POTAD lets us allocate
> smaller quantities (eg a 512 byte block) and allocate descriptors which
> differ in size from a struct page.  For an extreme example, see xfs_buf
> which is 360 bytes and is the descriptor for an allocation between 512
> and 65536 bytes.
> 
> There are times when we need to get from the physical address to
> the descriptor, eg memory-failure.c or get_user_pages().  This is the
> equivalent of phys_to_page(), and it's going to have to be a lookup tree.
> I think this is a role for the Maple Tree, but it's not ready yet.
> I don't know if it'll be fast enough for this case.  There's also the
> need (particularly for memory-failure) to determine exactly what kind
> of descriptor we're dealing with, and also its size.  Even its owner,
> so we can notify them of memory failure.

A couple of things I forgot to mention ...

I'd like the POTAD to be not necessarily tied to allocating memory.
For example, I think it could be used to allocate swap space.  eg the swap
code could register the space in a swap file as allocatable through the
POTAD, and then later ask the POTAD to allocate a POT from the swap space.

The POTAD wouldn't need to be limited to MAX_ORDER.  It should be
perfectly capable of allocating 1TB if your machine has 1.5TB of RAM
in it (... and things haven't got too fragmented).
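
To give a sense of scale, here is a minimal sketch of the order arithmetic,
assuming 4KB base pages (SKETCH_PAGE_SHIFT and size_to_order() are
hypothetical names used only for this illustration, not kernel API).  Under
that assumption, and with a MAX_ORDER of 11 so that the largest buddy block
is order 10, the buddy allocator tops out at 4MB, while a 1TB allocation
would be order 28.

```c
#include <stdint.h>

/* Assumption for illustration only: 4KB base pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Return the smallest order such that (page size << order) >= bytes.
 * The loop bound keeps the shift below 64 bits.
 */
unsigned int size_to_order(uint64_t bytes)
{
	unsigned int order = 0;

	while (order < 63 - SKETCH_PAGE_SHIFT &&
	       ((uint64_t)1 << (SKETCH_PAGE_SHIFT + order)) < bytes)
		order++;
	return order;
}
```

With these assumptions, size_to_order(1ULL << 22) is 10 (4MB) and
size_to_order(1ULL << 40) is 28 (1TB).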

I think the POTAD can be used to replace the CMA.  The CMA supports
weirdo things like "Allocate 8MB of memory at a 1MB alignment", and I
think that's doable within the data structures that I'm thinking about
for the POTAD.  It'd first try to allocate an 8MB chunk at 8MB alignment,
and then if that's not possible, try to allocate two adjacent 4MB chunks;
continuing down until it finds that there aren't 8x1MB chunks, at which
point it can give up.
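
The fallback search described above could be sketched roughly as follows.
This is only a toy model, not the data structure being proposed: it assumes
free space is tracked as a flat array of flags, one per minimum-alignment
unit (1MB in the example), and the names potad_alloc_aligned() and
run_is_free() are hypothetical.  It tries the strictest alignment first and
halves the chunk size each round until it reaches the requested alignment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: free_map[i] is true when the i-th unit (eg 1MB) is free. */
static bool run_is_free(const bool *free_map, size_t start, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (!free_map[start + i])
			return false;
	return true;
}

/*
 * Allocate 'size' contiguous units with at least 'align' alignment.
 * Both are in units and assumed to be powers of two, with align <= size.
 * Returns the first unit of the allocation, or -1 on failure.
 */
long potad_alloc_aligned(bool *free_map, size_t nunits,
			 size_t size, size_t align)
{
	if (align == 0 || size < align)
		return -1;

	/* chunk == size: one naturally aligned block; then two halves; ... */
	for (size_t chunk = size; chunk >= align; chunk /= 2) {
		for (size_t start = 0; start + size <= nunits; start += chunk) {
			if (!run_is_free(free_map, start, size))
				continue;
			/* Mark the whole run as allocated. */
			for (size_t i = 0; i < size; i++)
				free_map[start + i] = false;
			return (long)start;
		}
	}
	return -1;	/* not even size/align adjacent minimum chunks */
}
```

For the "8MB at 1MB alignment" case, a caller would pass size = 8 and
align = 1 with 1MB units.  Preferring the better-aligned placements first
is a design choice in this sketch; a real implementation would work on the
allocator's own free lists rather than scanning a flat map.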
