From: David Hildenbrand <david@redhat.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>, akpm@linux-foundation.org
Cc: arnd@arndb.de, jingshan@linux.alibaba.com, linux-mm@kvack.org,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm: Introduce new MADV_NOMOVABLE behavior
Date: Wed, 19 Oct 2022 17:17:45 +0200
Message-ID: <470dc638-a300-f261-94b4-e27250e42f96@redhat.com>
In-Reply-To: <c163ba0e-80d9-6362-b4f0-c5a2a12deec5@linux.alibaba.com>

> I observed one migration failure case (which is not easy to reproduce)
> is that, the 'thp_migration_fail' count is 1 and the
> 'thp_split_page_failed' count is also 1.
> 
> That means when migrating a THP which is in CMA area, but can not
> allocate a new THP due to memory fragmentation, so it will split the
> THP. However THP split is also failed, probably the reason is temporary
> reference count of this THP. And the temporary reference count can be
> caused by dropping page caches (I observed the drop caches operation in
> the system), but we can not drop the shmem page caches due to they are
> already dirty at that time.
> 
> So we can try again in migrate_pages() if THP split is failed to
> mitigate the failure of migration, especially for the failure reason is
> temporary reference count? Does this sound reasonable for you?

It sounds reasonable, and I understand that debugging these issues is 
tricky. But we really have to figure out the root cause, so that pages 
that are indeed movable (but only temporarily not movable for reason 
XYZ) actually get migrated.

We'd need some indication to retry migration longer / again.
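
As a rough illustration of what "some indication to retry" could look
like, here is a minimal userspace sketch. It is not kernel code: struct
fake_page, try_migrate() and migrate_with_retry() are hypothetical names
made up for this example, and the refcount behavior is simulated. The
only point is to separate a transient failure (-EAGAIN, worth retrying)
from a permanent one (-EBUSY, give up immediately).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; not a kernel structure. */
struct fake_page {
    int transient_refs;  /* temporary extra references, expected to drop */
    bool pinned;         /* long-term pin: the page will not become movable */
};

/*
 * Hypothetical single migration attempt.
 * Returns 0 on success, -EAGAIN if the failure looks transient,
 * -EBUSY if retrying cannot help.
 */
static int try_migrate(struct fake_page *page)
{
    if (page->pinned)
        return -EBUSY;
    if (page->transient_refs > 0) {
        /* Simulate one temporary holder dropping its reference. */
        page->transient_refs--;
        return -EAGAIN;
    }
    return 0;
}

/* Retry only while the failure is reported as transient, up to max_passes. */
static int migrate_with_retry(struct fake_page *page, int max_passes)
{
    int rc = -EAGAIN;

    for (int pass = 0; pass < max_passes && rc == -EAGAIN; pass++)
        rc = try_migrate(page);
    return rc;
}
```

In this sketch a page with two temporary references migrates on the
third pass, while a pinned page fails right away without burning the
retry budget. Whether a THP split failure can be classified this way
is exactly the open question.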

> 
> However I still worried there are other possible cases to cause
> migration failure, so no CMA allocation for our case seems more stable IMO.

Yes, I can understand that. But as one example, your approach doesn't 
handle the case where a page that was allocated on !CMA/!ZONE_MOVABLE 
gets migrated to CMA/ZONE_MOVABLE just before you try pinning the page 
(which then has to migrate it off CMA/ZONE_MOVABLE again).

We really have to fix the root cause.

-- 
Thanks,

David / dhildenb

