From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	SeongJae Park <sj@kernel.org>
Subject: Re: [PATCH 1/3] mm: return the number of pages successfully paged out
Date: Wed, 18 Jan 2023 14:27:23 -0800
Message-ID: <Y8hyS3yVnxXTsFIz@google.com>
In-Reply-To: <Y8hjNm+kB8WquUH6@dhcp22.suse.cz>

On Wed, Jan 18, 2023 at 10:23:02PM +0100, Michal Hocko wrote:
> On Wed 18-01-23 10:07:17, Minchan Kim wrote:
> > On Wed, Jan 18, 2023 at 06:35:32PM +0100, Michal Hocko wrote:
> > > On Wed 18-01-23 09:09:36, Minchan Kim wrote:
> > > > On Wed, Jan 18, 2023 at 10:10:44AM +0100, Michal Hocko wrote:
> > > > > On Tue 17-01-23 15:16:30, Minchan Kim wrote:
> > > > > > The reclaim_pages function that MADV_PAGEOUT uses needs to return
> > > > > > the number of pages successfully paged out, not only the number of
> > > > > > pages reclaimed in the operation, because pages that were paged out
> > > > > > successfully will be reclaimed easily under memory pressure thanks
> > > > > > to asynchronous writeback rotation (i.e., PG_reclaim with
> > > > > > folio_rotate_reclaimable).
> > > > > > 
> > > > > > This patch renames reclaim_pages to paging_out (in the hope that it
> > > > > > is clearer from an operational point of view) and adds an additional
> > > > > > stat to reclaim_stat that represents the number of pages paged out
> > > > > > but kept in memory for rotation on writeback completion.
> > > > > > 
> > > > > > With that stat, madvise_pageout can know how many pages were paged
> > > > > > out successfully as well as how many were reclaimed. The return
> > > > > > value will be used for statistics in the next patch.
> > > > > 
> > > > > I really fail to see the reason for the rename and paging_out doesn't
> > > > > even make much sense as a name TBH.
> > > > 
> > > > Currently, what we are doing to reclaim memory is
> > > > 
> > > > reclaim_folio_list
> > > >     shrink_folio_list
> > > >         if (folio_mapped(folio))
> > > >             try_to_unmap(folio)
> > > > 
> > > >         if (folio_test_dirty(folio))
> > > >             pageout
> > > > 
> > > > Based on this structure, pageout is just one of the ways to reclaim
> > > > memory.
> > > > 
> > > > With MADV_PAGEOUT, what the user wants to know is how many pages
> > > > were paged out as requested (from the userspace PoV, how many page
> > > > faults will happen on future accesses), not the number of reclaimed
> > > > pages that shrink_folio_list currently returns.
> > > > 
> > > > In that sense, I wanted to distinguish between reclaim and pageout.
> > > 
> > > But MADV_PAGEOUT is documented to trigger memory reclaim in general,
> > > not a pageout. Let me quote from the man page:
> > > : Reclaim a given range of pages.  This is done to free up memory occupied
> > > : by these pages.
> > 
> > IMO, we need to change the documentation to something like this.
> > 
> >  : Try to reclaim a given range of pages. Reclaim unmaps the pages
> >    from the address space and then writes them out to backing
> >    storage. This can help free up memory occupied by these pages
> >    or improve memory reclaim efficiency.
> 
> But this is not what the implementation does, nor should it be specific
> about what reclaim actually can do. The specific implementation of the
> reclaim is an implementation detail.
>  
> > > Sure, anonymous pages can be paged out to the swap storage, but with the
> > > upcoming multi-tiering they can also be "paged out" to a lower tier. All
> > > of that leads to freeing up memory that is currently mapped by that
> > > address range.
> > 
> > I am not familiar with multi-tiering. However, the question is whether
> > the pageout operation is synchronous or not. If it's synchronous (IOW,
> > when pageout returns, the page has really been written to storage), then
> > yes, it can reclaim memory. If the backing storage is an asynchronous
> > device (which is the *common* case these days), we cannot reclaim the
> > memory; we just write the page to storage in the hope that it speeds up
> > the next iteration of reclaim.
> 
> I am sorry but I do not follow. Synchronicity of the reclaim should be
> completely irrelevant. Even swapout (pageout from your POV AFAIU) can be
> async or sync.
>  
> > > Anyway, what do you actually mean by distinguishing between reclaim and
> > > pageout? Aren't those just two names for the same thing?
> > 
> > Reclaim really frees memory, but pageout is just one of the ways
> > to achieve that, and it is not guaranteed to free memory, depending on
> > the backing storage's speed.
> 
> Try to think about it some more. Do you really want MADV_PAGEOUT to
> be so specific about how the memory reclaim is achieved? How do you
> reflect new ways of reclaiming memory - e.g. memory demotion, where the
> primary memory gets freed by migrating the content to a slower type of
> memory without writing it out to ultra slow swap storage (which is just
> yet another tier that cannot be accessed directly without explicit
> IO)?

I understand your concern now. I believe a better implementation would
account for the number of virtual addresses scanned and the number of
pages *unmapped from the page table*, so we don't need to worry about
which type of paging out happens (e.g., writing to slower storage or
demoting to a lower tier). In the end, userspace will see the page-in anyway.

"Unmap the page from the page table and demote the page to a secondary
 device. The user would see a page fault on the next access."
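
To make the userspace view concrete, here is a rough sketch of how a
process could observe the effect. It assumes Linux 5.4 or later for
MADV_PAGEOUT and uses mincore(2) only as an approximation of "still
resident"; count_resident and pageout_probe are hypothetical helper names
made up for this example, not part of any patch in this thread.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* value from the Linux uapi headers (since 5.4) */
#endif

/* Count resident pages in an npages-long range; returns -1 on error. */
static long count_resident(void *addr, size_t npages)
{
	unsigned char vec[256];
	size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
	long resident = 0;

	if (npages > sizeof(vec))
		return -1;
	if (mincore(addr, npages * pagesz, vec) != 0)
		return -1;
	for (size_t i = 0; i < npages; i++)
		resident += vec[i] & 1; /* bit 0 set: page is resident */
	return resident;
}

/*
 * Map npages of anonymous memory, touch every page, ask the kernel to
 * page the range out, and report residency before and after the call.
 * Returns 0 on success, -1 if any step failed (e.g. kernel too old).
 */
static int pageout_probe(size_t npages, long *before, long *after)
{
	size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = npages * pagesz;
	unsigned char *buf;
	int ret = -1;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	for (size_t i = 0; i < npages; i++)
		buf[i * pagesz] = 1; /* fault each page in */

	*before = count_resident(buf, npages);
	if (*before >= 0 && madvise(buf, len, MADV_PAGEOUT) == 0) {
		*after = count_resident(buf, npages);
		if (*after >= 0)
			ret = 0;
	}

	munmap(buf, len);
	return ret;
}
```

A caller would run something like pageout_probe(16, &before, &after) and
compare the two counts. On a system without swap, anonymous pages cannot
be paged out, so the counts would typically be equal.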

If you agree with that, then I don't need to change anything in vmscan.c.
Instead, I could do everything in madvise.c.
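
As a rough illustration of that accounting, here is a toy userspace
model, not kernel code. struct pageout_stat, struct toy_pte and
toy_madvise_pageout are hypothetical names made up for this sketch; the
real change would work on page tables in madvise.c. The only point is
that the counters track scanned addresses and unmapped pages, regardless
of where the page ends up afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-call counters; these names are not from the patch. */
struct pageout_stat {
	unsigned long scanned;  /* page-sized virtual addresses walked */
	unsigned long unmapped; /* pages removed from the page table */
};

/* Toy stand-in for a page-table entry in this userspace model. */
struct toy_pte {
	bool present;   /* a page is currently mapped at this address */
	bool evictable; /* the model allows unmapping it (e.g. not pinned) */
};

/*
 * Walk the toy range and "unmap" every present, evictable entry.
 * How the page leaves primary memory (swap, demotion, ...) is
 * deliberately not modelled: only the unmap itself is counted.
 */
static struct pageout_stat toy_madvise_pageout(struct toy_pte *ptes, size_t n)
{
	struct pageout_stat stat = { 0, 0 };

	assert(ptes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		stat.scanned++;
		if (ptes[i].present && ptes[i].evictable) {
			ptes[i].present = false; /* next access would fault */
			stat.unmapped++;
		}
	}
	return stat;
}
```

In this model, a second call over the same range scans the same number
of addresses but unmaps nothing, which matches what userspace would see:
the page faults come from the first call only.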

Let me know if you have any other concerns or suggestions.

Thanks, Michal.

