* [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes
@ 2018-11-26 23:13 Hugh Dickins
  2018-11-26 23:19 ` [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze Hugh Dickins
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

Hi Andrew,

Here's a set of 10 patches, mostly fixing some crashes and lockups
which can happen when khugepaged collapses tmpfs pages to huge:
by-products of ongoing work to extend upstream's huge tmpfs to
match what we need in Google (but no enhancements included here).

Against v4.20-rc2 (the same as v4.20-rc4 for these files): sorry, I
haven't looked yet to see what clashes there might be with mmotm; but
although these are all (except 10/10) fixes to long-standing bugs, I
believe they still deserve to get into v4.20 and -stable.  See what
you think.

Most of the testing has been on the whole series, and on a slightly
earlier kernel: the move to XArray means that almost none of these
patches will apply cleanly to v4.19 or earlier, but I do have the
equivalents lined up ready.

 mm/huge_memory.c |   43 ++++++++-----
 mm/khugepaged.c  |  140 ++++++++++++++++++++++++++-------------------
 mm/rmap.c        |   13 ----
 mm/shmem.c       |    6 +
 4 files changed, 114 insertions(+), 88 deletions(-)

 1/10 mm/huge_memory: rename freeze_page() to unmap_page()
 2/10 mm/huge_memory: splitting set mapping+index before unfreeze
 3/10 mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
 4/10 mm/khugepaged: collapse_shmem() stop if punched or truncated
 5/10 mm/khugepaged: fix crashes due to misaccounted holes
 6/10 mm/khugepaged: collapse_shmem() remember to clear holes
 7/10 mm/khugepaged: minor reorderings in collapse_shmem()
 8/10 mm/khugepaged: collapse_shmem() without freezing new_page
 9/10 mm/khugepaged: collapse_shmem() do not crash on Compound
10/10 mm/khugepaged: fix the xas_create_range() error path

Thanks,
Hugh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
@ 2018-11-26 23:19 ` Hugh Dickins
  2018-11-27  7:26   ` Kirill A. Shutemov
  2018-11-26 23:21 ` [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Hugh Dickins
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox,
	Konstantin Khlebnikov, linux-mm

Huge tmpfs stress testing has occasionally hit shmem_undo_range()'s
VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page).

Move the setting of mapping and index up before the page_ref_unfreeze()
in __split_huge_page_tail() to fix this: so that a page cache lookup
cannot get a reference while the tail's mapping and index are unstable.
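
The window being closed looks roughly like this (an assumed interleaving
to illustrate the race, not a captured trace):

	CPU0 (__split_huge_page_tail)      CPU1 (e.g. find_get_entry)
	-----------------------------      --------------------------
	page_ref_unfreeze(page_tail, ...)
	                                   page_cache_get_speculative()
	                                     succeeds, takes a reference
	                                   reads page->mapping, page->index
	                                     (both still stale)
	page_tail->mapping = head->mapping
	page_tail->index = head->index + tail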

In fact, might as well move them up before the smp_wmb(): I don't see
an actual need for that, but if I'm missing something, this way round
is safer than the other, and no less efficient.

You might argue that VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page)
is misplaced, and should be left until after the trylock_page(); but
left as is, it has not crashed since, and gives more stringent assurance.

Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()")
Requires: 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/huge_memory.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 30100fac2341..cef2c256e7c4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2402,6 +2402,12 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_unevictable) |
 			 (1L << PG_dirty)));
 
+	/* ->mapping in first tail page is compound_mapcount */
+	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
+			page_tail);
+	page_tail->mapping = head->mapping;
+	page_tail->index = head->index + tail;
+
 	/* Page flags must be visible before we make the page non-compound. */
 	smp_wmb();
 
@@ -2422,12 +2428,6 @@ static void __split_huge_page_tail(struct page *head, int tail,
 	if (page_is_idle(head))
 		set_page_idle(page_tail);
 
-	/* ->mapping in first tail page is compound_mapcount */
-	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
-			page_tail);
-	page_tail->mapping = head->mapping;
-
-	page_tail->index = head->index + tail;
 	page_cpupid_xchg_last(page_tail, page_cpupid_last(head));
 
 	/*
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
  2018-11-26 23:19 ` [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze Hugh Dickins
@ 2018-11-26 23:21 ` Hugh Dickins
  2018-11-27  7:27   ` Kirill A. Shutemov
  2018-11-26 23:23 ` [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated Hugh Dickins
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

Huge tmpfs testing, on 32-bit kernel with lockdep enabled, showed that
__split_huge_page() was using i_size_read() while holding the irq-safe
lru_lock and page tree lock, but the 32-bit i_size_read() uses an
irq-unsafe seqlock which should not be nested inside them.

Instead, read the i_size earlier in split_huge_page_to_list(), and pass
the end offset down to __split_huge_page(): all while holding head page
lock, which is enough to prevent truncation of that extent before the
page tree lock has been taken.
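
For example, with 4kB pages and a huge page at the start of the file,
end is simply the index of the first subpage beyond EOF (illustrative
arithmetic only):

	end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);

	i_size       0  ->  end =   0: all 512 subpages are beyond EOF
	i_size    4096  ->  end =   1: only index 0 stays in page cache
	i_size 2097152  ->  end = 512: nothing needs to be trimmed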

Fixes: baa355fd33142 ("thp: file pages support for split_huge_page()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/huge_memory.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cef2c256e7c4..622cced74fd9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2439,12 +2439,11 @@ static void __split_huge_page_tail(struct page *head, int tail,
 }
 
 static void __split_huge_page(struct page *page, struct list_head *list,
-		unsigned long flags)
+		pgoff_t end, unsigned long flags)
 {
 	struct page *head = compound_head(page);
 	struct zone *zone = page_zone(head);
 	struct lruvec *lruvec;
-	pgoff_t end = -1;
 	int i;
 
 	lruvec = mem_cgroup_page_lruvec(head, zone->zone_pgdat);
@@ -2452,9 +2451,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	/* complete memcg works before add pages to LRU */
 	mem_cgroup_split_huge_fixup(head);
 
-	if (!PageAnon(page))
-		end = DIV_ROUND_UP(i_size_read(head->mapping->host), PAGE_SIZE);
-
 	for (i = HPAGE_PMD_NR - 1; i >= 1; i--) {
 		__split_huge_page_tail(head, i, lruvec, list);
 		/* Some pages can be beyond i_size: drop them from page cache */
@@ -2626,6 +2622,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	int count, mapcount, extra_pins, ret;
 	bool mlocked;
 	unsigned long flags;
+	pgoff_t end;
 
 	VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
@@ -2648,6 +2645,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 			ret = -EBUSY;
 			goto out;
 		}
+		end = -1;
 		mapping = NULL;
 		anon_vma_lock_write(anon_vma);
 	} else {
@@ -2661,6 +2659,15 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 
 		anon_vma = NULL;
 		i_mmap_lock_read(mapping);
+
+		/*
+		 *__split_huge_page() may need to trim off pages beyond EOF:
+		 * but on 32-bit, i_size_read() takes an irq-unsafe seqlock,
+		 * which cannot be nested inside the page tree lock. So note
+		 * end now: i_size itself may be changed at any moment, but
+		 * head page lock is good enough to serialize the trimming.
+		 */
+		end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
 	}
 
 	/*
@@ -2707,7 +2714,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		if (mapping)
 			__dec_node_page_state(page, NR_SHMEM_THPS);
 		spin_unlock(&pgdata->split_queue_lock);
-		__split_huge_page(page, list, flags);
+		__split_huge_page(page, list, end, flags);
 		if (PageSwapCache(head)) {
 			swp_entry_t entry = { .val = page_private(head) };
 
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
  2018-11-26 23:19 ` [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze Hugh Dickins
  2018-11-26 23:21 ` [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Hugh Dickins
@ 2018-11-26 23:23 ` Hugh Dickins
  2018-11-27  7:31   ` Kirill A. Shutemov
  2018-11-27 13:26   ` Matthew Wilcox
  2018-11-26 23:25 ` [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes Hugh Dickins
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

Huge tmpfs testing showed that although collapse_shmem() recognizes a
concurrently truncated or hole-punched page correctly, its handling of
holes was liable to refill an emptied extent.  Add a check to stop that.

Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/khugepaged.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index c13625c1ad5e..2070c316f06e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1359,6 +1359,17 @@ static void collapse_shmem(struct mm_struct *mm,
 
 		VM_BUG_ON(index != xas.xa_index);
 		if (!page) {
+			/*
+			 * Stop if extent has been truncated or hole-punched,
+			 * and is now completely empty.
+			 */
+			if (index == start) {
+				if (!xas_next_entry(&xas, end - 1)) {
+					result = SCAN_TRUNCATED;
+					break;
+				}
+				xas_set(&xas, index);
+			}
 			if (!shmem_charge(mapping->host, 1)) {
 				result = SCAN_FAIL;
 				break;
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
                   ` (2 preceding siblings ...)
  2018-11-26 23:23 ` [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated Hugh Dickins
@ 2018-11-26 23:25 ` Hugh Dickins
  2018-11-27  7:35   ` Kirill A. Shutemov
  2018-11-26 23:26 ` [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes Hugh Dickins
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

Huge tmpfs testing on a shortish file mapped into a pmd-rounded extent hit
shmem_evict_inode()'s WARN_ON(inode->i_blocks) followed by clear_inode()'s
BUG_ON(inode->i_data.nrpages) when the file was later closed and unlinked.

khugepaged's collapse_shmem() was forgetting to update mapping->nrpages on
the rollback path, after it had added, but then needed to undo, some holes.

There is indeed an irritating asymmetry between shmem_charge(), whose
callers want it to increment nrpages after successfully accounting blocks,
and shmem_uncharge(), when __delete_from_page_cache() already decremented
nrpages itself: oh well, just add a comment on that to them both.

And shmem_recalc_inode() is supposed to be called when the accounting is
expected to be in balance (so it can deduce from imbalance that reclaim
discarded some pages): so change shmem_charge() to update nrpages earlier
(though it's rare for the difference to matter at all).
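
(For reference, shmem_recalc_inode() deduces that from the imbalance
roughly like this - paraphrased from memory, not verbatim mm/shmem.c:

	freed = info->alloced - info->swapped - inode->i_mapping->nrpages;
	if (freed > 0) {
		info->alloced -= freed;
		inode->i_blocks -= freed * BLOCKS_PER_PAGE;
		shmem_inode_unacct_blocks(inode, freed);
	}

so nrpages must already be up to date when it is called.)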

Fixes: 800d8c63b2e98 ("shmem: add huge pages support")
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/khugepaged.c | 5 ++++-
 mm/shmem.c      | 6 +++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2070c316f06e..65e82f665c7c 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1506,9 +1506,12 @@ static void collapse_shmem(struct mm_struct *mm,
 		khugepaged_pages_collapsed++;
 	} else {
 		struct page *page;
+
 		/* Something went wrong: roll back page cache changes */
-		shmem_uncharge(mapping->host, nr_none);
 		xas_lock_irq(&xas);
+		mapping->nrpages -= nr_none;
+		shmem_uncharge(mapping->host, nr_none);
+
 		xas_set(&xas, start);
 		xas_for_each(&xas, page, end - 1) {
 			page = list_first_entry_or_null(&pagelist,
diff --git a/mm/shmem.c b/mm/shmem.c
index ea26d7a0342d..e6558e49b42a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -297,12 +297,14 @@ bool shmem_charge(struct inode *inode, long pages)
 	if (!shmem_inode_acct_block(inode, pages))
 		return false;
 
+	/* nrpages adjustment first, then shmem_recalc_inode() when balanced */
+	inode->i_mapping->nrpages += pages;
+
 	spin_lock_irqsave(&info->lock, flags);
 	info->alloced += pages;
 	inode->i_blocks += pages * BLOCKS_PER_PAGE;
 	shmem_recalc_inode(inode);
 	spin_unlock_irqrestore(&info->lock, flags);
-	inode->i_mapping->nrpages += pages;
 
 	return true;
 }
@@ -312,6 +314,8 @@ void shmem_uncharge(struct inode *inode, long pages)
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	unsigned long flags;
 
+	/* nrpages adjustment done by __delete_from_page_cache() or caller */
+
 	spin_lock_irqsave(&info->lock, flags);
 	info->alloced -= pages;
 	inode->i_blocks -= pages * BLOCKS_PER_PAGE;
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
                   ` (3 preceding siblings ...)
  2018-11-26 23:25 ` [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes Hugh Dickins
@ 2018-11-26 23:26 ` Hugh Dickins
  2018-11-27  7:38   ` Kirill A. Shutemov
  2018-11-26 23:27 ` [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem() Hugh Dickins
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

Huge tmpfs testing reminds us that there is no __GFP_ZERO in the gfp
flags khugepaged uses to allocate a huge page - in all common cases it
would just be a waste of effort - so collapse_shmem() must remember to
clear out any holes that it instantiates.

The obvious place to do so, where they are put into the page cache tree,
is not a good choice: because interrupts are disabled there.  Leave it
until further down, once success is assured, where the other pages are
copied (before setting PageUptodate).

Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/khugepaged.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 65e82f665c7c..1c402d33547e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1467,7 +1467,12 @@ static void collapse_shmem(struct mm_struct *mm,
 		 * Replacing old pages with new one has succeeded, now we
 		 * need to copy the content and free the old pages.
 		 */
+		index = start;
 		list_for_each_entry_safe(page, tmp, &pagelist, lru) {
+			while (index < page->index) {
+				clear_highpage(new_page + (index % HPAGE_PMD_NR));
+				index++;
+			}
 			copy_highpage(new_page + (page->index % HPAGE_PMD_NR),
 					page);
 			list_del(&page->lru);
@@ -1477,6 +1482,11 @@ static void collapse_shmem(struct mm_struct *mm,
 			ClearPageActive(page);
 			ClearPageUnevictable(page);
 			put_page(page);
+			index++;
+		}
+		while (index < end) {
+			clear_highpage(new_page + (index % HPAGE_PMD_NR));
+			index++;
 		}
 
 		local_irq_disable();
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem()
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
                   ` (4 preceding siblings ...)
  2018-11-26 23:26 ` [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes Hugh Dickins
@ 2018-11-26 23:27 ` Hugh Dickins
  2018-11-27  7:59   ` Kirill A. Shutemov
  2018-11-26 23:29 ` [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page Hugh Dickins
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

Several cleanups in collapse_shmem(): most of which probably do not
really matter, beyond doing things in a more familiar and reassuring
order.  Simplify the failure gotos in the main loop, and on success
update stats while interrupts still disabled from the last iteration.

Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/khugepaged.c | 72 ++++++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 40 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1c402d33547e..9d4e9ff1af95 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1329,10 +1329,10 @@ static void collapse_shmem(struct mm_struct *mm,
 		goto out;
 	}
 
+	__SetPageLocked(new_page);
+	__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
-	__SetPageSwapBacked(new_page);
-	__SetPageLocked(new_page);
 	BUG_ON(!page_ref_freeze(new_page, 1));
 
 	/*
@@ -1366,13 +1366,13 @@ static void collapse_shmem(struct mm_struct *mm,
 			if (index == start) {
 				if (!xas_next_entry(&xas, end - 1)) {
 					result = SCAN_TRUNCATED;
-					break;
+					goto xa_locked;
 				}
 				xas_set(&xas, index);
 			}
 			if (!shmem_charge(mapping->host, 1)) {
 				result = SCAN_FAIL;
-				break;
+				goto xa_locked;
 			}
 			xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
 			nr_none++;
@@ -1387,13 +1387,12 @@ static void collapse_shmem(struct mm_struct *mm,
 				result = SCAN_FAIL;
 				goto xa_unlocked;
 			}
-			xas_lock_irq(&xas);
-			xas_set(&xas, index);
 		} else if (trylock_page(page)) {
 			get_page(page);
+			xas_unlock_irq(&xas);
 		} else {
 			result = SCAN_PAGE_LOCK;
-			break;
+			goto xa_locked;
 		}
 
 		/*
@@ -1408,11 +1407,10 @@ static void collapse_shmem(struct mm_struct *mm,
 			result = SCAN_TRUNCATED;
 			goto out_unlock;
 		}
-		xas_unlock_irq(&xas);
 
 		if (isolate_lru_page(page)) {
 			result = SCAN_DEL_PAGE_LRU;
-			goto out_isolate_failed;
+			goto out_unlock;
 		}
 
 		if (page_mapped(page))
@@ -1432,7 +1430,9 @@ static void collapse_shmem(struct mm_struct *mm,
 		 */
 		if (!page_ref_freeze(page, 3)) {
 			result = SCAN_PAGE_COUNT;
-			goto out_lru;
+			xas_unlock_irq(&xas);
+			putback_lru_page(page);
+			goto out_unlock;
 		}
 
 		/*
@@ -1444,24 +1444,26 @@ static void collapse_shmem(struct mm_struct *mm,
 		/* Finally, replace with the new page. */
 		xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
 		continue;
-out_lru:
-		xas_unlock_irq(&xas);
-		putback_lru_page(page);
-out_isolate_failed:
-		unlock_page(page);
-		put_page(page);
-		goto xa_unlocked;
 out_unlock:
 		unlock_page(page);
 		put_page(page);
-		break;
+		goto xa_unlocked;
 	}
-	xas_unlock_irq(&xas);
 
+	__inc_node_page_state(new_page, NR_SHMEM_THPS);
+	if (nr_none) {
+		struct zone *zone = page_zone(new_page);
+
+		__mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
+		__mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
+	}
+
+xa_locked:
+	xas_unlock_irq(&xas);
 xa_unlocked:
+
 	if (result == SCAN_SUCCEED) {
 		struct page *page, *tmp;
-		struct zone *zone = page_zone(new_page);
 
 		/*
 		 * Replacing old pages with new one has succeeded, now we
@@ -1476,11 +1478,11 @@ static void collapse_shmem(struct mm_struct *mm,
 			copy_highpage(new_page + (page->index % HPAGE_PMD_NR),
 					page);
 			list_del(&page->lru);
-			unlock_page(page);
-			page_ref_unfreeze(page, 1);
 			page->mapping = NULL;
+			page_ref_unfreeze(page, 1);
 			ClearPageActive(page);
 			ClearPageUnevictable(page);
+			unlock_page(page);
 			put_page(page);
 			index++;
 		}
@@ -1489,28 +1491,17 @@ static void collapse_shmem(struct mm_struct *mm,
 			index++;
 		}
 
-		local_irq_disable();
-		__inc_node_page_state(new_page, NR_SHMEM_THPS);
-		if (nr_none) {
-			__mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
-			__mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
-		}
-		local_irq_enable();
-
-		/*
-		 * Remove pte page tables, so we can re-fault
-		 * the page as huge.
-		 */
-		retract_page_tables(mapping, start);
-
 		/* Everything is ready, let's unfreeze the new_page */
-		set_page_dirty(new_page);
 		SetPageUptodate(new_page);
 		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
+		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_anon(new_page);
-		unlock_page(new_page);
 
+		/*
+		 * Remove pte page tables, so we can re-fault the page as huge.
+		 */
+		retract_page_tables(mapping, start);
 		*hpage = NULL;
 
 		khugepaged_pages_collapsed++;
@@ -1543,8 +1534,8 @@ static void collapse_shmem(struct mm_struct *mm,
 			xas_store(&xas, page);
 			xas_pause(&xas);
 			xas_unlock_irq(&xas);
-			putback_lru_page(page);
 			unlock_page(page);
+			putback_lru_page(page);
 			xas_lock_irq(&xas);
 		}
 		VM_BUG_ON(nr_none);
@@ -1553,9 +1544,10 @@ static void collapse_shmem(struct mm_struct *mm,
 		/* Unfreeze new_page, caller would take care about freeing it */
 		page_ref_unfreeze(new_page, 1);
 		mem_cgroup_cancel_charge(new_page, memcg, true);
-		unlock_page(new_page);
 		new_page->mapping = NULL;
 	}
+
+	unlock_page(new_page);
 out:
 	VM_BUG_ON(!list_empty(&pagelist));
 	/* TODO: tracepoints */
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
                   ` (5 preceding siblings ...)
  2018-11-26 23:27 ` [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem() Hugh Dickins
@ 2018-11-26 23:29 ` Hugh Dickins
  2018-11-27  8:01   ` Kirill A. Shutemov
  2018-11-26 23:31 ` [PATCH 09/10] mm/khugepaged: collapse_shmem() do not crash on Compound Hugh Dickins
  2018-11-26 23:33 ` [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path Hugh Dickins
  8 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

khugepaged's collapse_shmem() does almost all of its work, to assemble
the huge new_page from 512 scattered old pages, with the new_page's
refcount frozen to 0 (and refcounts of all old pages so far also frozen
to 0).  Including shmem_getpage() to read in any which were out on swap,
memory reclaim if necessary to allocate their intermediate pages, and
copying over all the data from old to new.

Imagine the frozen refcount as a spinlock held, but without any lock
debugging to highlight the abuse: it's not good, and under serious load
heads into lockups - speculative getters of the page are not expecting
to spin while khugepaged is rescheduled.
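
(A simplified sketch of such a getter, following the find_get_entry()
pattern rather than quoting it exactly:

	rcu_read_lock();
repeat:
	page = xas_load(&xas);
	...
	if (!page_cache_get_speculative(page))
		goto repeat;	/* fails while the count is frozen to 0 */

so for as long as khugepaged holds the freeze and gets rescheduled,
that loop just keeps spinning.)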

One can get a little further under load by hacking around elsewhere;
but fortunately, freezing the new_page turns out to have been entirely
unnecessary, with no hacks needed elsewhere.

The huge new_page lock is already held throughout, and guards all its
subpages as they are brought one by one into the page cache tree; and
anything reading the data in that page, without the lock, before it has
been marked PageUptodate, would already be in the wrong.  So simply
eliminate the freezing of the new_page.

Each of the old pages remains frozen with refcount 0 after it has been
replaced by a new_page subpage in the page cache tree, until they are all
unfrozen on success or failure: just as before.  They could be unfrozen
sooner, but cause no problem once no longer visible to find_get_entry(),
filemap_map_pages() and other speculative lookups.

Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/khugepaged.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 9d4e9ff1af95..55930cbed3fd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1287,7 +1287,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
  *
  * Basic scheme is simple, details are more complex:
- *  - allocate and freeze a new huge page;
+ *  - allocate and lock a new huge page;
  *  - scan page cache replacing old pages with the new one
  *    + swap in pages if necessary;
  *    + fill in gaps;
@@ -1295,11 +1295,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *  - if replacing succeeds:
  *    + copy data over;
  *    + free old pages;
- *    + unfreeze huge page;
+ *    + unlock huge page;
  *  - if replacing failed;
  *    + put all pages back and unfreeze them;
  *    + restore gaps in the page cache;
- *    + free huge page;
+ *    + unlock and free huge page;
  */
 static void collapse_shmem(struct mm_struct *mm,
 		struct address_space *mapping, pgoff_t start,
@@ -1333,13 +1333,11 @@ static void collapse_shmem(struct mm_struct *mm,
 	__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
-	BUG_ON(!page_ref_freeze(new_page, 1));
 
 	/*
-	 * At this point the new_page is 'frozen' (page_count() is zero),
-	 * locked and not up-to-date. It's safe to insert it into the page
-	 * cache, because nobody would be able to map it or use it in other
-	 * way until we unfreeze it.
+	 * At this point the new_page is locked and not up-to-date.
+	 * It's safe to insert it into the page cache, because nobody would
+	 * be able to map it or use it in another way until we unlock it.
 	 */
 
 	/* This will be less messy when we use multi-index entries */
@@ -1491,9 +1489,8 @@ static void collapse_shmem(struct mm_struct *mm,
 			index++;
 		}
 
-		/* Everything is ready, let's unfreeze the new_page */
 		SetPageUptodate(new_page);
-		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
+		page_ref_add(new_page, HPAGE_PMD_NR - 1);
 		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_anon(new_page);
@@ -1541,8 +1538,6 @@ static void collapse_shmem(struct mm_struct *mm,
 		VM_BUG_ON(nr_none);
 		xas_unlock_irq(&xas);
 
-		/* Unfreeze new_page, caller would take care about freeing it */
-		page_ref_unfreeze(new_page, 1);
 		mem_cgroup_cancel_charge(new_page, memcg, true);
 		new_page->mapping = NULL;
 	}
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 09/10] mm/khugepaged: collapse_shmem() do not crash on Compound
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
                   ` (6 preceding siblings ...)
  2018-11-26 23:29 ` [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page Hugh Dickins
@ 2018-11-26 23:31 ` Hugh Dickins
  2018-11-26 23:33 ` [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path Hugh Dickins
  8 siblings, 0 replies; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

collapse_shmem()'s VM_BUG_ON_PAGE(PageTransCompound) was unsafe: before
it holds the page lock of the first page, a racing truncation followed
by extension might conceivably have inserted a hugepage there already.
Fail with the
SCAN_PAGE_COMPOUND result, instead of crashing (CONFIG_DEBUG_VM=y) or
otherwise mishandling the unexpected hugepage - though later we might
code up a more constructive way of handling it, with SCAN_SUCCESS.

Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/khugepaged.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 55930cbed3fd..2c5fe4f7a0c6 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1399,7 +1399,15 @@ static void collapse_shmem(struct mm_struct *mm,
 		 */
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
 		VM_BUG_ON_PAGE(!PageUptodate(page), page);
-		VM_BUG_ON_PAGE(PageTransCompound(page), page);
+
+		/*
+		 * If file was truncated then extended, or hole-punched, before
+		 * we locked the first page, then a THP might be there already.
+		 */
+		if (PageTransCompound(page)) {
+			result = SCAN_PAGE_COMPOUND;
+			goto out_unlock;
+		}
 
 		if (page_mapping(page) != mapping) {
 			result = SCAN_TRUNCATED;
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path
  2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
                   ` (7 preceding siblings ...)
  2018-11-26 23:31 ` [PATCH 09/10] mm/khugepaged: collapse_shmem() do not crash on Compound Hugh Dickins
@ 2018-11-26 23:33 ` Hugh Dickins
  2018-11-27  8:02   ` Kirill A. Shutemov
  8 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-26 23:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, Matthew Wilcox, linux-mm

collapse_shmem()'s xas_nomem() is very unlikely to fail, but it is
rightly given a failure path, so move the whole xas_create_range() block
up before __SetPageLocked(new_page): so that it does not need to remember
to unlock_page(new_page).  Add the missing mem_cgroup_cancel_charge(),
and set (currently unused) result to SCAN_FAIL rather than SCAN_SUCCEED.

Fixes: 77da9389b9d5 ("mm: Convert collapse_shmem to XArray")
Signed-off-by: Hugh Dickins <hughd@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/khugepaged.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2c5fe4f7a0c6..8e2ff195ecb3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1329,6 +1329,20 @@ static void collapse_shmem(struct mm_struct *mm,
 		goto out;
 	}
 
+	/* This will be less messy when we use multi-index entries */
+	do {
+		xas_lock_irq(&xas);
+		xas_create_range(&xas);
+		if (!xas_error(&xas))
+			break;
+		xas_unlock_irq(&xas);
+		if (!xas_nomem(&xas, GFP_KERNEL)) {
+			mem_cgroup_cancel_charge(new_page, memcg, true);
+			result = SCAN_FAIL;
+			goto out;
+		}
+	} while (1);
+
 	__SetPageLocked(new_page);
 	__SetPageSwapBacked(new_page);
 	new_page->index = start;
@@ -1340,17 +1354,6 @@ static void collapse_shmem(struct mm_struct *mm,
 	 * be able to map it or use it in another way until we unlock it.
 	 */
 
-	/* This will be less messy when we use multi-index entries */
-	do {
-		xas_lock_irq(&xas);
-		xas_create_range(&xas);
-		if (!xas_error(&xas))
-			break;
-		xas_unlock_irq(&xas);
-		if (!xas_nomem(&xas, GFP_KERNEL))
-			goto out;
-	} while (1);
-
 	xas_set(&xas, start);
 	for (index = start; index < end; index++) {
 		struct page *page = xas_next(&xas);
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze
  2018-11-26 23:19 ` [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze Hugh Dickins
@ 2018-11-27  7:26   ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  7:26 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox,
	Konstantin Khlebnikov, linux-mm

On Mon, Nov 26, 2018 at 03:19:57PM -0800, Hugh Dickins wrote:
> Huge tmpfs stress testing has occasionally hit shmem_undo_range()'s
> VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page).
> 
> Move the setting of mapping and index up before the page_ref_unfreeze()
> in __split_huge_page_tail() to fix this: so that a page cache lookup
> cannot get a reference while the tail's mapping and index are unstable.
> 
> In fact, might as well move them up before the smp_wmb(): I don't see
> an actual need for that, but if I'm missing something, this way round
> is safer than the other, and no less efficient.
> 
> You might argue that VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page)
> is misplaced, and should be left until after the trylock_page(); but
> left as is, it has not crashed since, and gives more stringent assurance.
> 
> Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()")
> Requires: 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: stable@vger.kernel.org # 4.8+

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
  2018-11-26 23:21 ` [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Hugh Dickins
@ 2018-11-27  7:27   ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  7:27 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox, linux-mm

On Mon, Nov 26, 2018 at 03:21:58PM -0800, Hugh Dickins wrote:
> Huge tmpfs testing, on 32-bit kernel with lockdep enabled, showed that
> __split_huge_page() was using i_size_read() while holding the irq-safe
> lru_lock and page tree lock, but the 32-bit i_size_read() uses an
> irq-unsafe seqlock which should not be nested inside them.
> 
> Instead, read the i_size earlier in split_huge_page_to_list(), and pass
> the end offset down to __split_huge_page(): all while holding head page
> lock, which is enough to prevent truncation of that extent before the
> page tree lock has been taken.
> 
> Fixes: baa355fd33142 ("thp: file pages support for split_huge_page()")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 4.8+

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated
  2018-11-26 23:23 ` [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated Hugh Dickins
@ 2018-11-27  7:31   ` Kirill A. Shutemov
  2018-11-27 13:26   ` Matthew Wilcox
  1 sibling, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  7:31 UTC (permalink / raw)
  To: Hugh Dickins, Matthew Wilcox; +Cc: Andrew Morton, Kirill A. Shutemov, linux-mm

On Mon, Nov 26, 2018 at 03:23:37PM -0800, Hugh Dickins wrote:
> Huge tmpfs testing showed that although collapse_shmem() recognizes a
> concurrently truncated or hole-punched page correctly, its handling of
> holes was liable to refill an emptied extent.  Add a check to stop that.
> 
> Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 4.8+

I'm not yet comfortable with the XArray API. Matthew, what is your take on the patch?

> ---
>  mm/khugepaged.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index c13625c1ad5e..2070c316f06e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1359,6 +1359,17 @@ static void collapse_shmem(struct mm_struct *mm,
>  
>  		VM_BUG_ON(index != xas.xa_index);
>  		if (!page) {
> +			/*
> +			 * Stop if extent has been truncated or hole-punched,
> +			 * and is now completely empty.
> +			 */
> +			if (index == start) {
> +				if (!xas_next_entry(&xas, end - 1)) {
> +					result = SCAN_TRUNCATED;
> +					break;
> +				}
> +				xas_set(&xas, index);
> +			}
>  			if (!shmem_charge(mapping->host, 1)) {
>  				result = SCAN_FAIL;
>  				break;
> -- 
> 2.20.0.rc0.387.gc7a69e6b6c-goog
> 

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes
  2018-11-26 23:25 ` [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes Hugh Dickins
@ 2018-11-27  7:35   ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  7:35 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox, linux-mm

On Mon, Nov 26, 2018 at 03:25:01PM -0800, Hugh Dickins wrote:
> Huge tmpfs testing on a shortish file mapped into a pmd-rounded extent hit
> shmem_evict_inode()'s WARN_ON(inode->i_blocks) followed by clear_inode()'s
> BUG_ON(inode->i_data.nrpages) when the file was later closed and unlinked.
> 
> khugepaged's collapse_shmem() was forgetting to update mapping->nrpages on
> the rollback path, after it had added, but then needed to undo, some holes.
> 
> There is indeed an irritating asymmetry between shmem_charge(), whose
> callers want it to increment nrpages after successfully accounting blocks,
> and shmem_uncharge(), when __delete_from_page_cache() already decremented
> nrpages itself: oh well, just add a comment on that to them both.
> 
> And shmem_recalc_inode() is supposed to be called when the accounting is
> expected to be in balance (so it can deduce from imbalance that reclaim
> discarded some pages): so change shmem_charge() to update nrpages earlier
> (though it's rare for the difference to matter at all).
> 
> Fixes: 800d8c63b2e98 ("shmem: add huge pages support")
> Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 4.8+

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

I think we would need to revisit the accounting helpers to make them less
error prone. But it's out of scope for the patchset.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes
  2018-11-26 23:26 ` [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes Hugh Dickins
@ 2018-11-27  7:38   ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  7:38 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox, linux-mm

On Mon, Nov 26, 2018 at 03:26:34PM -0800, Hugh Dickins wrote:
> Huge tmpfs testing reminds us that there is no __GFP_ZERO in the gfp
> flags khugepaged uses to allocate a huge page - in all common cases it
> would just be a waste of effort - so collapse_shmem() must remember to
> clear out any holes that it instantiates.
> 
> The obvious place to do so, where they are put into the page cache tree,
> is not a good choice: because interrupts are disabled there.  Leave it
> until further down, once success is assured, where the other pages are
> copied (before setting PageUptodate).
> 
> Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 4.8+

Ouch.  Thanks for catching this.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem()
  2018-11-26 23:27 ` [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem() Hugh Dickins
@ 2018-11-27  7:59   ` Kirill A. Shutemov
  2018-11-27 20:23     ` Hugh Dickins
  0 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  7:59 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox, linux-mm

On Mon, Nov 26, 2018 at 03:27:52PM -0800, Hugh Dickins wrote:
> Several cleanups in collapse_shmem(): most of which probably do not
> really matter, beyond doing things in a more familiar and reassuring
> order.  Simplify the failure gotos in the main loop, and on success
> update stats while interrupts still disabled from the last iteration.
> 
> Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 4.8+
> ---
>  mm/khugepaged.c | 72 ++++++++++++++++++++++---------------------------
>  1 file changed, 32 insertions(+), 40 deletions(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1c402d33547e..9d4e9ff1af95 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1329,10 +1329,10 @@ static void collapse_shmem(struct mm_struct *mm,
>  		goto out;
>  	}
>  
> +	__SetPageLocked(new_page);
> +	__SetPageSwapBacked(new_page);
>  	new_page->index = start;
>  	new_page->mapping = mapping;
> -	__SetPageSwapBacked(new_page);
> -	__SetPageLocked(new_page);
>  	BUG_ON(!page_ref_freeze(new_page, 1));
>  
>  	/*
> @@ -1366,13 +1366,13 @@ static void collapse_shmem(struct mm_struct *mm,
>  			if (index == start) {
>  				if (!xas_next_entry(&xas, end - 1)) {
>  					result = SCAN_TRUNCATED;
> -					break;
> +					goto xa_locked;
>  				}
>  				xas_set(&xas, index);
>  			}
>  			if (!shmem_charge(mapping->host, 1)) {
>  				result = SCAN_FAIL;
> -				break;
> +				goto xa_locked;
>  			}
>  			xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
>  			nr_none++;
> @@ -1387,13 +1387,12 @@ static void collapse_shmem(struct mm_struct *mm,
>  				result = SCAN_FAIL;
>  				goto xa_unlocked;
>  			}
> -			xas_lock_irq(&xas);
> -			xas_set(&xas, index);
>  		} else if (trylock_page(page)) {
>  			get_page(page);
> +			xas_unlock_irq(&xas);
>  		} else {
>  			result = SCAN_PAGE_LOCK;
> -			break;
> +			goto xa_locked;
>  		}
>  
>  		/*

I'm puzzled by the locking change here.

Isn't the change responsible for the bug you are fixing in 09/10?

IIRC, my intent for the locking scheme was to protect against a
truncate-repopulate race.

What do I miss?

The rest of the patch *looks* okay, but I found it hard to follow.
Splitting it up would make it easier.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page
  2018-11-26 23:29 ` [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page Hugh Dickins
@ 2018-11-27  8:01   ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  8:01 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox, linux-mm

On Mon, Nov 26, 2018 at 03:29:17PM -0800, Hugh Dickins wrote:
> khugepaged's collapse_shmem() does almost all of its work, to assemble
> the huge new_page from 512 scattered old pages, with the new_page's
> refcount frozen to 0 (and refcounts of all old pages so far also frozen
> to 0).  Including shmem_getpage() to read in any which were out on swap,
> memory reclaim if necessary to allocate their intermediate pages, and
> copying over all the data from old to new.
> 
> Imagine the frozen refcount as a spinlock held, but without any lock
> debugging to highlight the abuse: it's not good, and under serious load
> heads into lockups - speculative getters of the page are not expecting
> to spin while khugepaged is rescheduled.
> 
> One can get a little further under load by hacking around elsewhere;
> but fortunately, freezing the new_page turns out to have been entirely
> unnecessary, with no hacks needed elsewhere.
> 
> The huge new_page lock is already held throughout, and guards all its
> subpages as they are brought one by one into the page cache tree; and
> anything reading the data in that page, without the lock, before it has
> been marked PageUptodate, would already be in the wrong.  So simply
> eliminate the freezing of the new_page.
> 
> Each of the old pages remains frozen with refcount 0 after it has been
> replaced by a new_page subpage in the page cache tree, until they are all
> unfrozen on success or failure: just as before.  They could be unfrozen
> sooner, but cause no problem once no longer visible to find_get_entry(),
> filemap_map_pages() and other speculative lookups.
> 
> Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 4.8+

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path
  2018-11-26 23:33 ` [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path Hugh Dickins
@ 2018-11-27  8:02   ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-27  8:02 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox, linux-mm

On Mon, Nov 26, 2018 at 03:33:13PM -0800, Hugh Dickins wrote:
> collapse_shmem()'s xas_nomem() is very unlikely to fail, but it is
> rightly given a failure path, so move the whole xas_create_range() block
> up before __SetPageLocked(new_page): so that it does not need to remember
> to unlock_page(new_page).  Add the missing mem_cgroup_cancel_charge(),
> and set (currently unused) result to SCAN_FAIL rather than SCAN_SUCCEED.
> 
> Fixes: 77da9389b9d5 ("mm: Convert collapse_shmem to XArray")
> Signed-off-by: Hugh Dickins <hughd@kernel.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated
  2018-11-26 23:23 ` [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated Hugh Dickins
  2018-11-27  7:31   ` Kirill A. Shutemov
@ 2018-11-27 13:26   ` Matthew Wilcox
  1 sibling, 0 replies; 22+ messages in thread
From: Matthew Wilcox @ 2018-11-27 13:26 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Kirill A. Shutemov, Kirill A. Shutemov, linux-mm

On Mon, Nov 26, 2018 at 03:23:37PM -0800, Hugh Dickins wrote:
>  		VM_BUG_ON(index != xas.xa_index);
>  		if (!page) {
> +			/*
> +			 * Stop if extent has been truncated or hole-punched,
> +			 * and is now completely empty.
> +			 */
> +			if (index == start) {
> +				if (!xas_next_entry(&xas, end - 1)) {
> +					result = SCAN_TRUNCATED;
> +					break;
> +				}
> +				xas_set(&xas, index);
> +			}
>  			if (!shmem_charge(mapping->host, 1)) {

Reviewed-by: Matthew Wilcox <willy@infradead.org>

I'd use xas_find() directly here; I don't think it warrants the inlined
version of xas_next_entry().  But I'm happy either way.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem()
  2018-11-27  7:59   ` Kirill A. Shutemov
@ 2018-11-27 20:23     ` Hugh Dickins
  2018-11-28 10:59       ` Kirill A. Shutemov
  0 siblings, 1 reply; 22+ messages in thread
From: Hugh Dickins @ 2018-11-27 20:23 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Hugh Dickins, Andrew Morton, Kirill A. Shutemov, Matthew Wilcox,
	linux-mm

On Tue, 27 Nov 2018, Kirill A. Shutemov wrote:
> On Mon, Nov 26, 2018 at 03:27:52PM -0800, Hugh Dickins wrote:
> > Several cleanups in collapse_shmem(): most of which probably do not
> > really matter, beyond doing things in a more familiar and reassuring
> > order.  Simplify the failure gotos in the main loop, and on success
> > update stats while interrupts still disabled from the last iteration.
> > 
> > Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: stable@vger.kernel.org # 4.8+
> > ---
> >  mm/khugepaged.c | 72 ++++++++++++++++++++++---------------------------
> >  1 file changed, 32 insertions(+), 40 deletions(-)
> > 
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 1c402d33547e..9d4e9ff1af95 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1329,10 +1329,10 @@ static void collapse_shmem(struct mm_struct *mm,
> >  		goto out;
> >  	}
> >  
> > +	__SetPageLocked(new_page);
> > +	__SetPageSwapBacked(new_page);
> >  	new_page->index = start;
> >  	new_page->mapping = mapping;
> > -	__SetPageSwapBacked(new_page);
> > -	__SetPageLocked(new_page);
> >  	BUG_ON(!page_ref_freeze(new_page, 1));
> >  
> >  	/*
> > @@ -1366,13 +1366,13 @@ static void collapse_shmem(struct mm_struct *mm,
> >  			if (index == start) {
> >  				if (!xas_next_entry(&xas, end - 1)) {
> >  					result = SCAN_TRUNCATED;
> > -					break;
> > +					goto xa_locked;
> >  				}
> >  				xas_set(&xas, index);
> >  			}
> >  			if (!shmem_charge(mapping->host, 1)) {
> >  				result = SCAN_FAIL;
> > -				break;
> > +				goto xa_locked;
> >  			}
> >  			xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
> >  			nr_none++;
> > @@ -1387,13 +1387,12 @@ static void collapse_shmem(struct mm_struct *mm,
> >  				result = SCAN_FAIL;
> >  				goto xa_unlocked;
> >  			}
> > -			xas_lock_irq(&xas);
> > -			xas_set(&xas, index);
> >  		} else if (trylock_page(page)) {
> >  			get_page(page);
> > +			xas_unlock_irq(&xas);
> >  		} else {
> >  			result = SCAN_PAGE_LOCK;
> > -			break;
> > +			goto xa_locked;
> >  		}
> >  
> >  		/*
> 
> I'm puzzled by the locking change here.

The locking change here is to not re-get xas_lock_irq (in shmem_getpage
case) just before we drop it anyway: you point out that it used to cover
		/*
		 * The page must be locked, so we can drop the i_pages lock
		 * without racing with truncate.
		 */
		VM_BUG_ON_PAGE(!PageLocked(page), page);
		VM_BUG_ON_PAGE(!PageUptodate(page), page);
		VM_BUG_ON_PAGE(PageTransCompound(page), page);
		if (page_mapping(page) != mapping) {
but now does not.

But the comment you wrote there originally (ah, git blame shows
that Matthew has made it say i_pages lock instead of tree_lock),
"The page must be locked, so we can drop...", was correct all along,
I'm just following what it says.

It would be wrong if the trylock_page came after the xas_unlock_irq,
but it comes before (as before): holding i_pages lock across the
lookup makes sure we look up the right page (no RCU racing) and
trylock_page makes sure that it cannot be truncated or hole-punched
or migrated or whatever from that point on - so can drop i_pages lock.

Actually, I think we could VM_BUG_ON(page_mapping(page) != mapping),
couldn't we? Not that I propose to make such a change at this stage.

> 
> Isn't the change responsible for the bug you are fixing in 09/10?

In which I relaxed the VM_BUG_ON_PAGE(PageTransCompound(page), page)
that appears in the sequence above.

Well, what I was thinking of in 9/10 was a THP being inserted at some
stage between selecting this range for collapse and reaching the last
(usually first) xas_lock_irq(&xas) in the "This will be less messy..."
loop above: I don't see any locking against that possibility.  (And it
has to be that initial xas_lock_irq(&xas), because once the PageLocked
head of new_page is inserted in the i_pages tree, there is no more
chance for truncation and a competing THP to be inserted there.)

So 9/10 would be required anyway; but you're thinking that the page
we looked up under i_pages lock and got trylock_page on, could then
become Compound once i_pages lock is dropped?  I don't think so: pages
don't become Compound after they've left the page allocator, do they?
And if we ever manage to change that, I'm pretty sure it would be with
page locks held and page refcounts frozen.

> 
> IIRC, my intent for the locking scheme was to protect against a
> truncate-repopulate race.
> 
> What do I miss?

The stage in between selecting the range for collapse, and getting
the initial i_pages lock?  Pages not becoming Compound underneath
you, with or without page lock, with or without i_pages lock?  Page
lock being sufficient protection against truncation and migration?

> 
> The rest of the patch *looks* okay, but I found it hard to follow.
> Splitting it up would make it easier.

It needs some time, I admit: thanks a lot for persisting with it.
And thanks (to you and to Matthew) for the speedy Acks elsewhere.

Hugh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem()
  2018-11-27 20:23     ` Hugh Dickins
@ 2018-11-28 10:59       ` Kirill A. Shutemov
  2018-11-28 19:40         ` Hugh Dickins
  0 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2018-11-28 10:59 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, Kirill A. Shutemov, Matthew Wilcox, linux-mm

On Tue, Nov 27, 2018 at 12:23:32PM -0800, Hugh Dickins wrote:
> On Tue, 27 Nov 2018, Kirill A. Shutemov wrote:
> > On Mon, Nov 26, 2018 at 03:27:52PM -0800, Hugh Dickins wrote:
> > > Several cleanups in collapse_shmem(): most of which probably do not
> > > really matter, beyond doing things in a more familiar and reassuring
> > > order.  Simplify the failure gotos in the main loop, and on success
> > > update stats while interrupts still disabled from the last iteration.
> > > 
> > > Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> > > Signed-off-by: Hugh Dickins <hughd@google.com>
> > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Cc: stable@vger.kernel.org # 4.8+
> > > ---
> > >  mm/khugepaged.c | 72 ++++++++++++++++++++++---------------------------
> > >  1 file changed, 32 insertions(+), 40 deletions(-)
> > > 
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 1c402d33547e..9d4e9ff1af95 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -1329,10 +1329,10 @@ static void collapse_shmem(struct mm_struct *mm,
> > >  		goto out;
> > >  	}
> > >  
> > > +	__SetPageLocked(new_page);
> > > +	__SetPageSwapBacked(new_page);
> > >  	new_page->index = start;
> > >  	new_page->mapping = mapping;
> > > -	__SetPageSwapBacked(new_page);
> > > -	__SetPageLocked(new_page);
> > >  	BUG_ON(!page_ref_freeze(new_page, 1));
> > >  
> > >  	/*
> > > @@ -1366,13 +1366,13 @@ static void collapse_shmem(struct mm_struct *mm,
> > >  			if (index == start) {
> > >  				if (!xas_next_entry(&xas, end - 1)) {
> > >  					result = SCAN_TRUNCATED;
> > > -					break;
> > > +					goto xa_locked;
> > >  				}
> > >  				xas_set(&xas, index);
> > >  			}
> > >  			if (!shmem_charge(mapping->host, 1)) {
> > >  				result = SCAN_FAIL;
> > > -				break;
> > > +				goto xa_locked;
> > >  			}
> > >  			xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
> > >  			nr_none++;
> > > @@ -1387,13 +1387,12 @@ static void collapse_shmem(struct mm_struct *mm,
> > >  				result = SCAN_FAIL;
> > >  				goto xa_unlocked;
> > >  			}
> > > -			xas_lock_irq(&xas);
> > > -			xas_set(&xas, index);
> > >  		} else if (trylock_page(page)) {
> > >  			get_page(page);
> > > +			xas_unlock_irq(&xas);
> > >  		} else {
> > >  			result = SCAN_PAGE_LOCK;
> > > -			break;
> > > +			goto xa_locked;
> > >  		}
> > >  
> > >  		/*
> > 
> > I'm puzzled by locking change here.
> 
> The locking change here is to not re-get xas_lock_irq (in shmem_getpage
> case) just before we drop it anyway: you point out that it used to cover
> 		/*
> 		 * The page must be locked, so we can drop the i_pages lock
> 		 * without racing with truncate.
> 		 */
> 		VM_BUG_ON_PAGE(!PageLocked(page), page);
> 		VM_BUG_ON_PAGE(!PageUptodate(page), page);
> 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
> 		if (page_mapping(page) != mapping) {
> but now does not.
> 
> But the comment you wrote there originally (ah, git blame shows
> that Matthew has made it say i_pages lock instead of tree_lock),
> "The page must be locked, so we can drop...", was correct all along,
> I'm just following what it says.
> 
> It would wrong if the trylock_page came after the xas_unlock_irq,
> but it comes before (as before): holding i_pages lock across the
> lookup makes sure we look up the right page (no RCU racing) and
> trylock_page makes sure that it cannot be truncated or hole-punched
> or migrated or whatever from that point on - so can drop i_pages lock.

You are right. I confused myself.

> Actually, I think we could VM_BUG_ON(page_mapping(page) != mapping),
> couldn't we? Not that I propose to make such a change at this stage.

Yeah, it should be safe. We may put a WARN there.
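
(If we ever do, a hypothetical one-liner along these lines, using the existing
VM_WARN_ON_ONCE() helper, would be the obvious shape:

	VM_WARN_ON_ONCE(page_mapping(page) != mapping);

but not something for this series.)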

> > Isn't the change responsible for the bug you are fixing in 09/10?
> 
> In which I relaxed the VM_BUG_ON_PAGE(PageTransCompound(page), page)
> that appears in the sequence above.
> 
> Well, what I was thinking of in 9/10 was a THP being inserted at some
> stage between selecting this range for collapse and reaching the last
> (usually first) xas_lock_irq(&xas) in the "This will be less messy..."
> loop above: I don't see any locking against that possibility.  (And it
> has to be that initial xas_lock_irq(&xas), because once the PageLocked
> head of new_page is inserted in the i_pages tree, there is no more
> chance for truncation and a competing THP to be inserted there.)
> 
> So 9/10 would be required anyway; but you're thinking that the page
> we looked up under i_pages lock and got trylock_page on, could then
> become Compound once i_pages lock is dropped?  I don't think so: pages
> don't become Compound after they've left the page allocator, do they?
> And if we ever manage to change that, I'm pretty sure it would be with
> page locks held and page refcounts frozen.
> > IIRC, my intent for the locking scheme was to protect against
> > truncate-repopulate race.
> > 
> > What do I miss?
> 
> The stage in between selecting the range for collapse, and getting
> the initial i_pages lock?  Pages not becoming Compound underneath
> you, with or without page lock, with or without i_pages lock?  Page
> lock being sufficient protection against truncation and migration?

Agreed on all fronts. Sorry for the noise.

> > The rest of the patch *looks* okay, but I found it hard to follow.
> > Splitting it up would make it easier.
> 
> It needs some time, I admit: thanks a lot for persisting with it.
> And thanks (to you and to Matthew) for the speedy Acks elsewhere.
> 
> Hugh

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem()
  2018-11-28 10:59       ` Kirill A. Shutemov
@ 2018-11-28 19:40         ` Hugh Dickins
  0 siblings, 0 replies; 22+ messages in thread
From: Hugh Dickins @ 2018-11-28 19:40 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Hugh Dickins, Andrew Morton, Kirill A. Shutemov, Matthew Wilcox,
	linux-mm

On Wed, 28 Nov 2018, Kirill A. Shutemov wrote:
> On Tue, Nov 27, 2018 at 12:23:32PM -0800, Hugh Dickins wrote:
> 
> > Actually, I think we could VM_BUG_ON(page_mapping(page) != mapping),
> > couldn't we? Not that I propose to make such a change at this stage.
> 
> Yeah, it should be safe. We may put a WARN there.

Later yes, but for now I'm leaving the patch unchanged -
been burnt before by last minute changes that didn't turn out so well!

> Agreed on all fronts. Sorry for the noise.

No problem at all: it's important that you challenge what looked wrong.
This time around, I was the one with the advantage of recent familiarity.

> 
> > > The rest of the patch *looks* okay, but I found it hard to follow.
> > > Splitting it up would make it easier.
> > 
> > It needs some time, I admit: thanks a lot for persisting with it.
> > And thanks (to you and to Matthew) for the speedy Acks elsewhere.
> > 
> > Hugh
> 
> -- 
>  Kirill A. Shutemov

Thanks again,
Hugh

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-11-28 19:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
2018-11-26 23:19 ` [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze Hugh Dickins
2018-11-27  7:26   ` Kirill A. Shutemov
2018-11-26 23:21 ` [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Hugh Dickins
2018-11-27  7:27   ` Kirill A. Shutemov
2018-11-26 23:23 ` [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated Hugh Dickins
2018-11-27  7:31   ` Kirill A. Shutemov
2018-11-27 13:26   ` Matthew Wilcox
2018-11-26 23:25 ` [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes Hugh Dickins
2018-11-27  7:35   ` Kirill A. Shutemov
2018-11-26 23:26 ` [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes Hugh Dickins
2018-11-27  7:38   ` Kirill A. Shutemov
2018-11-26 23:27 ` [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem() Hugh Dickins
2018-11-27  7:59   ` Kirill A. Shutemov
2018-11-27 20:23     ` Hugh Dickins
2018-11-28 10:59       ` Kirill A. Shutemov
2018-11-28 19:40         ` Hugh Dickins
2018-11-26 23:29 ` [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page Hugh Dickins
2018-11-27  8:01   ` Kirill A. Shutemov
2018-11-26 23:31 ` [PATCH 09/10] mm/khugepaged: collapse_shmem() do not crash on Compound Hugh Dickins
2018-11-26 23:33 ` [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path Hugh Dickins
2018-11-27  8:02   ` Kirill A. Shutemov
