linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/4] mm: hwpoison: soft-offline support for thp migration
@ 2017-08-15  1:52 Zi Yan
  2017-08-15  1:52 ` [RFC PATCH 1/4] mm: madvise: read loop's step size beforehand in madvise_inject_error(), prepare for THP support Zi Yan
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Zi Yan @ 2017-08-15  1:52 UTC
  To: Naoya Horiguchi
  Cc: Greg Kroah-Hartman, Kirill A . Shutemov, linux-kernel, linux-mm, Zi Yan

From: Zi Yan <zi.yan@cs.rutgers.edu>

Hi Naoya,

This series adds soft-offline support for THP migration. I would like comments
because it includes an interface change to soft_offline_page() (Patch 2) and
a behavior change in migrate_pages() (Patch 3). soft_offline_page() is also
called from store_soft_offline_page() in drivers/base/memory.c.

The patchset is on top of mmotm-2017-08-10-15-33.

The patchset is tested with:
1. a simple program that calls madvise() (https://github.com/x-y-z/soft-offline-test), and
2. a local kernel change that intentionally fails THP allocation for soft offline,
   which causes the THPs being soft-offlined to be split by Patch 3.

Patch 1: obtain the size of a page before it is offlined. The size is used as
the step value of the for-loop inside madvise_inject_error().
Originally, the for-loop used the size of the already-offlined page, which was
OK. But when a THP is offlined, it is split afterwards, so the page size
obtained after offlining is PAGE_SIZE instead of the THP size, which causes a
THP to be offlined 512 times.
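
To illustrate the step-size problem described for Patch 1, here is a small
userspace model. It is only a sketch, not the kernel code: struct fake_page,
fake_page_size(), fake_soft_offline() and walk_range() are hypothetical names,
and it assumes a 4 KiB base page and a 512-subpage THP.

```c
#include <assert.h>
#include <stdbool.h>

#define BASE_PAGE_SIZE   4096UL  /* assumed base page size */
#define SUBPAGES_PER_THP 512UL   /* assumed THP size in base pages */

/* Hypothetical stand-in for a compound page; this is not struct page. */
struct fake_page {
	unsigned long nr_subpages;   /* 512 while huge, 1 once split */
	unsigned long offline_calls; /* how often offlining was requested */
};

static unsigned long fake_page_size(const struct fake_page *p)
{
	return BASE_PAGE_SIZE * p->nr_subpages;
}

/* Models an offline operation that splits the THP as a side effect. */
static void fake_soft_offline(struct fake_page *p)
{
	p->offline_calls++;
	p->nr_subpages = 1;
}

/*
 * Walk [start, end) and offline the page backing it. With
 * read_size_first == true the step is taken before offlining (the idea
 * of Patch 1); with false it is re-read afterwards (the old behavior).
 * Returns the number of offline calls made.
 */
static unsigned long walk_range(struct fake_page *p, unsigned long start,
				unsigned long end, bool read_size_first)
{
	unsigned long addr = start;

	assert(p->nr_subpages >= 1);
	while (addr < end) {
		unsigned long step = fake_page_size(p); /* size before offlining */

		fake_soft_offline(p);
		if (!read_size_first)
			step = fake_page_size(p); /* size after the split */
		addr += step;
	}
	return p->offline_calls;
}
```

In this model, walking one THP-sized range with the size read afterwards
issues 512 offline calls, while reading the size beforehand issues one.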

Patch 2: when offlining a THP, there are two situations: a) the THP is offlined
as a whole, or b) the THP is split and only the raw error page is offlined.
Thus, we need soft_offline_page() to tell us whether a THP was split during
offlining, which leads to a new interface parameter.

Patch 3: as Naoya suggested, if a THP fails to be offlined as a whole, we should
retry with the raw error subpage. This patch implements that. It also requires
that migrate_pages() not split a THP when migration fails and the reason is
MR_MEMORY_FAILURE.
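
The fallback described for Patches 2 and 3 can be sketched in userspace as
follows. This is an illustrative model only: struct fake_thp,
fake_migrate_whole() and fake_soft_offline_thp() are hypothetical names, and
the bool out-parameter is just one assumed way for the callee to report a
split; the parameter actually added to soft_offline_page() may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a THP being soft-offlined. */
struct fake_thp {
	bool is_huge;                 /* still a huge page? */
	bool whole_migration_works;   /* e.g. a destination THP can be allocated */
	bool subpage_migration_works; /* the raw error subpage can be migrated */
};

/* Try to migrate the THP as one unit; never splits on failure. */
static int fake_migrate_whole(const struct fake_thp *t)
{
	return t->whole_migration_works ? 0 : -1;
}

/*
 * Offline the THP as a whole if possible. Otherwise split it and retry
 * with only the raw error subpage. *split reports whether a split
 * happened, so a caller can pick its loop step accordingly.
 * Returns 0 on success, -1 on failure.
 */
static int fake_soft_offline_thp(struct fake_thp *t, bool *split)
{
	assert(split != NULL);
	*split = false;

	if (t->is_huge) {
		if (fake_migrate_whole(t) == 0)
			return 0;    /* case a): offlined as a whole */
		t->is_huge = false;  /* case b): split, then retry */
		*split = true;
	}
	return t->subpage_migration_works ? 0 : -1;
}
```

The split is done by the offlining path rather than by the migration helper,
which mirrors the requirement that the migration step leave the THP intact on
failure so that the caller can decide how to retry.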

Patch 4: enable thp migration support for soft offline.

Any suggestions and comments are welcome.

Thanks.


Zi Yan (4):
  mm: madvise: read loop's step size beforehand in
    madvise_inject_error(), prepare for THP support.
  mm: soft-offline: Change soft_offline_page() interface to tell if the
    page is split or not.
  mm: soft-offline: retry to split and soft-offline the raw error if the
    original THP offlining fails.
  mm: hwpoison: soft offline supports thp migration

 drivers/base/memory.c |   2 +-
 include/linux/mm.h    |   2 +-
 mm/madvise.c          |  24 ++++++++++--
 mm/memory-failure.c   | 103 +++++++++++++++++++++++++++++---------------------
 mm/migrate.c          |  16 ++++++++
 5 files changed, 97 insertions(+), 50 deletions(-)

-- 
2.13.2


end of thread, other threads:[~2017-08-24 14:32 UTC | newest]

Thread overview: 10+ messages
2017-08-15  1:52 [RFC PATCH 0/4] mm: hwpoison: soft-offline support for thp migration Zi Yan
2017-08-15  1:52 ` [RFC PATCH 1/4] mm: madvise: read loop's step size beforehand in madvise_inject_error(), prepare for THP support Zi Yan
2017-08-23  7:49   ` Naoya Horiguchi
2017-08-23 14:20     ` Zi Yan
2017-08-24  4:26       ` Naoya Horiguchi
2017-08-24 14:26         ` Zi Yan
2017-08-15  1:52 ` [RFC PATCH 2/4] mm: soft-offline: Change soft_offline_page() interface to tell if the page is split or not Zi Yan
2017-08-15  1:52 ` [RFC PATCH 3/4] mm: soft-offline: retry to split and soft-offline the raw error if the original THP offlining fails Zi Yan
2017-08-24  7:31   ` Naoya Horiguchi
2017-08-15  1:52 ` [RFC PATCH 4/4] mm: hwpoison: soft offline supports thp migration Zi Yan
