All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miaohe Lin <linmiaohe@huawei.com>
To: "ying.huang@intel.com" <ying.huang@intel.com>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>
Cc: <akpm@linux-foundation.org>, <songmuchun@bytedance.com>,
	<hch@infradead.org>, <willy@infradead.org>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>, Yu Zhao <yuzhao@google.com>
Subject: Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
Date: Sat, 16 Apr 2022 10:34:02 +0800	[thread overview]
Message-ID: <70b64476-6efc-c8ad-9cf3-b101c3b92db1@huawei.com> (raw)
In-Reply-To: <4b47e6317aca3deeabf610a7f4839563ff2b25a1.camel@intel.com>

On 2022/4/15 11:07, ying.huang@intel.com wrote:
> On Wed, 2022-04-13 at 09:26 +0800, ying.huang@intel.com wrote:
>> On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote:
>>> On 12.04.22 15:42, Miaohe Lin wrote:
>>>> On 2022/4/12 16:59, Oscar Salvador wrote:
>>>>> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
>>>>>> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
>>>>>> is checked before. It will avoid the long term pinned pages to be swapped
>>>>>> out. And we can live with short term pinned pages. Without can_split_folio
>>>>>> checking we can simplify the code. Also activate_locked can be changed to
>>>>>> keep_locked as it's just short term pinning.
>>>>>
>>>>> What do you mean by "we can live with short term pinned pages"?
>>>>> Does it mean that it was not pinned when we check
>>>>> folio_maybe_dma_pinned() but now it is?
>>>>>
>>>>> To me it looks like the pinning is fluctuating and we rely on
>>>>> split_folio_to_list() to see whether we succeed or not, and if not
>>>>> we give it another spin in the next round?
>>>>
>>>> Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be
>>>> pinned for a noticeable time. So it's expected to split the folio successfully in the next
>>>> round as the pinning is really fluctuating. Or am I miss something?
>>>>
>>>
>>> Just so we're on the same page. folio_maybe_dma_pinned() only capture
>>> FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now
>>> via vmsplice().
>>
>> Per my original understanding, folio_maybe_dma_pinned() can be used to
>> detect long-term pinned pages.  And it seems reasonable to skip the
>> long-term pinned pages and try short-term pinned pages during page
>> reclaiming.  But as you pointed out, vmsplice() doesn't use FOLL_PIN. 
>> So if vmsplice() is expected to pin pages for long time, and we have no
>> way to detect it, then we should keep can_split_folio() in the original
>> code.
>>
>> Copying more people who have worked on long-term pinning for comments.
> 
> Checked the discussion in the following thread,
> 
> https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQSj80_w@mail.gmail.com/
> 
> It seems that from the practical point of view, folio_maybe_dma_pinned()
> can identify most long-term pinned pages that may block memory hot-
> remove or CMA allocation.  Although as David pointed out, some pages may
> still be GUPed for long time (e.g. via vmsplice) even if
> !folio_maybe_dma_pinned().
> 
> But from another point of view, can_split_huge_page() is cheap and THP
> swapout is expensive (swap space, disk IO, and hard to be recovered), so
> it may be better to keep can_split_huge_page() in shink_page_list().

Many thanks for your explanation. Looks convincing for me. Is it worth a comment about the above
stuff? Anyway, will drop this patch. Thanks!

> 
> Best Regards,
> Huang, Ying
> 
>>
>>> can_split_folio() is more precise then folio_maybe_dma_pinned(), but
>>> both are racy as long as the page is still mapped.
>>>
>>>
>>
> 
> 
> .
> 


  reply	other threads:[~2022-04-16  2:34 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
2022-04-11 11:50   ` ying.huang
2022-04-12  3:07     ` Miaohe Lin
2022-04-11 14:16   ` Christoph Hellwig
2022-04-12  3:15     ` Miaohe Lin
2022-04-12  8:44   ` Oscar Salvador
2022-04-09  9:34 ` [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check Miaohe Lin
2022-04-11 14:18   ` Christoph Hellwig
2022-04-12  3:14     ` Miaohe Lin
2022-04-12  0:52   ` ying.huang
2022-04-12  8:59   ` Oscar Salvador
2022-04-12 13:42     ` Miaohe Lin
2022-04-12 14:59       ` David Hildenbrand
2022-04-13  1:26         ` ying.huang
2022-04-13  2:17           ` Miaohe Lin
2022-04-15  3:07           ` ying.huang
2022-04-16  2:34             ` Miaohe Lin [this message]
2022-04-09  9:34 ` [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list() Miaohe Lin
2022-04-09 13:44   ` Matthew Wilcox
2022-04-11  1:53     ` Miaohe Lin
2022-04-11  3:17       ` Matthew Wilcox
2022-04-11  3:26         ` Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec Miaohe Lin
2022-04-12  0:57   ` ying.huang
2022-04-12  3:13     ` Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 5/9] mm/vmscan: activate swap-backed executable folios after first usage Miaohe Lin
2022-04-12  0:59   ` ying.huang
2022-04-09  9:34 ` [PATCH v2 6/9] mm/vmscan: take all base pages of THP into account when race with speculative reference Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 7/9] mm/vmscan: take min_slab_pages into account when try to call shrink_node Miaohe Lin
2022-04-12  1:36   ` ying.huang
2022-04-09  9:34 ` [PATCH v2 8/9] mm/vmscan: remove obsolete comment in kswapd_run Miaohe Lin
2022-04-09  9:35 ` [PATCH v2 9/9] mm/vmscan: use helper folio_is_file_lru() Miaohe Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=70b64476-6efc-c8ad-9cf3-b101c3b92db1@huawei.com \
    --to=linmiaohe@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=hch@infradead.org \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=songmuchun@bytedance.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.