Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
* mm: migrate: Fix page migration stalls for blkdev pages
@ 2018-12-11 17:21 Jan Kara
  2018-12-11 17:21 ` [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references Jan Kara
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Jan Kara @ 2018-12-11 17:21 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, mgorman, Jan Kara

Hello,

this patch set deals with page migration stalls that were reported by our
customer due to block device page that had buffer head that was in bh LRU
cache.

The patch set modifies page migration code so that buffer heads are completely
handled inside buffer_migrate_page() and then provides new migration helper
for pages with buffer heads that is safe to use even for block device pages
and that also deals with bh lrus.

								Honza

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references
  2018-12-11 17:21 mm: migrate: Fix page migration stalls for blkdev pages Jan Kara
@ 2018-12-11 17:21 ` Jan Kara
  2018-12-13 13:05   ` Mel Gorman
  2018-12-14 15:10   ` Mel Gorman
  2018-12-11 17:21 ` [PATCH 2/6] mm: migrate: Lock buffers before migrate_page_move_mapping() Jan Kara
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Jan Kara @ 2018-12-11 17:21 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, mgorman, Jan Kara

Factor out function to compute number of expected page references in
migrate_page_move_mapping(). Note that we move hpage_nr_pages() and
page_has_private() checks from under xas_lock_irq() however this is safe
since we hold page lock.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/migrate.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index f7e4bfdc13b7..789c7bc90a0c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -428,6 +428,22 @@ static inline bool buffer_migrate_lock_buffers(struct buffer_head *head,
 }
 #endif /* CONFIG_BLOCK */
 
+static int expected_page_refs(struct page *page)
+{
+	int expected_count = 1;
+
+	/*
+	 * Device public or private pages have an extra refcount as they are
+	 * ZONE_DEVICE pages.
+	 */
+	expected_count += is_device_private_page(page);
+	expected_count += is_device_public_page(page);
+	if (page->mapping)
+		expected_count += hpage_nr_pages(page) + page_has_private(page);
+
+	return expected_count;
+}
+
 /*
  * Replace the page in the mapping.
  *
@@ -444,14 +460,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
 	XA_STATE(xas, &mapping->i_pages, page_index(page));
 	struct zone *oldzone, *newzone;
 	int dirty;
-	int expected_count = 1 + extra_count;
-
-	/*
-	 * Device public or private pages have an extra refcount as they are
-	 * ZONE_DEVICE pages.
-	 */
-	expected_count += is_device_private_page(page);
-	expected_count += is_device_public_page(page);
+	int expected_count = expected_page_refs(page) + extra_count;
 
 	if (!mapping) {
 		/* Anonymous page without mapping */
@@ -471,8 +480,6 @@ int migrate_page_move_mapping(struct address_space *mapping,
 	newzone = page_zone(newpage);
 
 	xas_lock_irq(&xas);
-
-	expected_count += hpage_nr_pages(page) + page_has_private(page);
 	if (page_count(page) != expected_count || xas_load(&xas) != page) {
 		xas_unlock_irq(&xas);
 		return -EAGAIN;
-- 
2.16.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 2/6] mm: migrate: Lock buffers before migrate_page_move_mapping()
  2018-12-11 17:21 mm: migrate: Fix page migration stalls for blkdev pages Jan Kara
  2018-12-11 17:21 ` [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references Jan Kara
@ 2018-12-11 17:21 ` Jan Kara
  2018-12-13 14:19   ` Mel Gorman
  2018-12-11 17:21 ` [PATCH 3/6] mm: migrate: Move migrate_page_lock_buffers() Jan Kara
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2018-12-11 17:21 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, mgorman, Jan Kara

Lock buffers before calling into migrate_page_move_mapping() so that
that function doesn't have to know about buffers (which is somewhat
unexpected anyway) and all the buffer head logic is in
buffer_migrate_page().

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/migrate.c | 39 +++++++++++++--------------------------
 1 file changed, 13 insertions(+), 26 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 789c7bc90a0c..d58a8ecf275e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -490,20 +490,6 @@ int migrate_page_move_mapping(struct address_space *mapping,
 		return -EAGAIN;
 	}
 
-	/*
-	 * In the async migration case of moving a page with buffers, lock the
-	 * buffers using trylock before the mapping is moved. If the mapping
-	 * was moved, we later failed to lock the buffers and could not move
-	 * the mapping back due to an elevated page count, we would have to
-	 * block waiting on other references to be dropped.
-	 */
-	if (mode == MIGRATE_ASYNC && head &&
-			!buffer_migrate_lock_buffers(head, mode)) {
-		page_ref_unfreeze(page, expected_count);
-		xas_unlock_irq(&xas);
-		return -EAGAIN;
-	}
-
 	/*
 	 * Now we know that no one else is looking at the page:
 	 * no turning back from here.
@@ -779,24 +765,23 @@ int buffer_migrate_page(struct address_space *mapping,
 {
 	struct buffer_head *bh, *head;
 	int rc;
+	int expected_count;
 
 	if (!page_has_buffers(page))
 		return migrate_page(mapping, newpage, page, mode);
 
-	head = page_buffers(page);
+	/* Check whether page does not have extra refs before we do more work */
+	expected_count = expected_page_refs(page);
+	if (page_count(page) != expected_count)
+		return -EAGAIN;
 
-	rc = migrate_page_move_mapping(mapping, newpage, page, head, mode, 0);
+	head = page_buffers(page);
+	if (!buffer_migrate_lock_buffers(head, mode))
+		return -EAGAIN;
 
+	rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
 	if (rc != MIGRATEPAGE_SUCCESS)
-		return rc;
-
-	/*
-	 * In the async case, migrate_page_move_mapping locked the buffers
-	 * with an IRQ-safe spinlock held. In the sync case, the buffers
-	 * need to be locked now
-	 */
-	if (mode != MIGRATE_ASYNC)
-		BUG_ON(!buffer_migrate_lock_buffers(head, mode));
+		goto unlock_buffers;
 
 	ClearPagePrivate(page);
 	set_page_private(newpage, page_private(page));
@@ -818,6 +803,8 @@ int buffer_migrate_page(struct address_space *mapping,
 	else
 		migrate_page_states(newpage, page);
 
+	rc = MIGRATEPAGE_SUCCESS;
+unlock_buffers:
 	bh = head;
 	do {
 		unlock_buffer(bh);
@@ -826,7 +813,7 @@ int buffer_migrate_page(struct address_space *mapping,
 
 	} while (bh != head);
 
-	return MIGRATEPAGE_SUCCESS;
+	return rc;
 }
 EXPORT_SYMBOL(buffer_migrate_page);
 #endif
-- 
2.16.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 3/6] mm: migrate: Move migrate_page_lock_buffers()
  2018-12-11 17:21 mm: migrate: Fix page migration stalls for blkdev pages Jan Kara
  2018-12-11 17:21 ` [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references Jan Kara
  2018-12-11 17:21 ` [PATCH 2/6] mm: migrate: Lock buffers before migrate_page_move_mapping() Jan Kara
@ 2018-12-11 17:21 ` Jan Kara
  2018-12-13 14:57   ` Mel Gorman
  2018-12-11 17:21 ` [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs() Jan Kara
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2018-12-11 17:21 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, mgorman, Jan Kara

buffer_migrate_page() is the only caller of migrate_page_lock_buffers()
move it close to it and also drop the now unused stub for !CONFIG_BLOCK.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/migrate.c | 92 +++++++++++++++++++++++++++---------------------------------
 1 file changed, 42 insertions(+), 50 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index d58a8ecf275e..f8df1ad6e7cf 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -378,56 +378,6 @@ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
 }
 #endif
 
-#ifdef CONFIG_BLOCK
-/* Returns true if all buffers are successfully locked */
-static bool buffer_migrate_lock_buffers(struct buffer_head *head,
-							enum migrate_mode mode)
-{
-	struct buffer_head *bh = head;
-
-	/* Simple case, sync compaction */
-	if (mode != MIGRATE_ASYNC) {
-		do {
-			get_bh(bh);
-			lock_buffer(bh);
-			bh = bh->b_this_page;
-
-		} while (bh != head);
-
-		return true;
-	}
-
-	/* async case, we cannot block on lock_buffer so use trylock_buffer */
-	do {
-		get_bh(bh);
-		if (!trylock_buffer(bh)) {
-			/*
-			 * We failed to lock the buffer and cannot stall in
-			 * async migration. Release the taken locks
-			 */
-			struct buffer_head *failed_bh = bh;
-			put_bh(failed_bh);
-			bh = head;
-			while (bh != failed_bh) {
-				unlock_buffer(bh);
-				put_bh(bh);
-				bh = bh->b_this_page;
-			}
-			return false;
-		}
-
-		bh = bh->b_this_page;
-	} while (bh != head);
-	return true;
-}
-#else
-static inline bool buffer_migrate_lock_buffers(struct buffer_head *head,
-							enum migrate_mode mode)
-{
-	return true;
-}
-#endif /* CONFIG_BLOCK */
-
 static int expected_page_refs(struct page *page)
 {
 	int expected_count = 1;
@@ -755,6 +705,48 @@ int migrate_page(struct address_space *mapping,
 EXPORT_SYMBOL(migrate_page);
 
 #ifdef CONFIG_BLOCK
+/* Returns true if all buffers are successfully locked */
+static bool buffer_migrate_lock_buffers(struct buffer_head *head,
+							enum migrate_mode mode)
+{
+	struct buffer_head *bh = head;
+
+	/* Simple case, sync compaction */
+	if (mode != MIGRATE_ASYNC) {
+		do {
+			get_bh(bh);
+			lock_buffer(bh);
+			bh = bh->b_this_page;
+
+		} while (bh != head);
+
+		return true;
+	}
+
+	/* async case, we cannot block on lock_buffer so use trylock_buffer */
+	do {
+		get_bh(bh);
+		if (!trylock_buffer(bh)) {
+			/*
+			 * We failed to lock the buffer and cannot stall in
+			 * async migration. Release the taken locks
+			 */
+			struct buffer_head *failed_bh = bh;
+			put_bh(failed_bh);
+			bh = head;
+			while (bh != failed_bh) {
+				unlock_buffer(bh);
+				put_bh(bh);
+				bh = bh->b_this_page;
+			}
+			return false;
+		}
+
+		bh = bh->b_this_page;
+	} while (bh != head);
+	return true;
+}
+
 /*
  * Migration function for pages with buffers. This function can only be used
  * if the underlying filesystem guarantees that no other references to "page"
-- 
2.16.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs()
  2018-12-11 17:21 mm: migrate: Fix page migration stalls for blkdev pages Jan Kara
                   ` (2 preceding siblings ...)
  2018-12-11 17:21 ` [PATCH 3/6] mm: migrate: Move migrate_page_lock_buffers() Jan Kara
@ 2018-12-11 17:21 ` Jan Kara
  2018-12-13 15:34   ` Mel Gorman
  2018-12-14  4:53   ` Andrew Morton
  2018-12-11 17:21 ` [PATCH 5/6] blkdev: Avoid migration stalls for blkdev pages Jan Kara
  2018-12-11 17:21 ` [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping() Jan Kara
  5 siblings, 2 replies; 21+ messages in thread
From: Jan Kara @ 2018-12-11 17:21 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, mgorman, Jan Kara

Provide a variant of buffer_migrate_page() that also checks whether
there are no unexpected references to buffer heads. This function will
then be safe to use for block device pages.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/fs.h |  4 ++++
 mm/migrate.c       | 61 +++++++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index c95c0807471f..4bb1a8b65474 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3264,8 +3264,12 @@ extern int generic_check_addressable(unsigned, u64);
 extern int buffer_migrate_page(struct address_space *,
 				struct page *, struct page *,
 				enum migrate_mode);
+extern int buffer_migrate_page_norefs(struct address_space *,
+				struct page *, struct page *,
+				enum migrate_mode);
 #else
 #define buffer_migrate_page NULL
+#define buffer_migrate_page_norefs NULL
 #endif
 
 extern int setattr_prepare(struct dentry *, struct iattr *);
diff --git a/mm/migrate.c b/mm/migrate.c
index f8df1ad6e7cf..c4075d5ec073 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -747,13 +747,9 @@ static bool buffer_migrate_lock_buffers(struct buffer_head *head,
 	return true;
 }
 
-/*
- * Migration function for pages with buffers. This function can only be used
- * if the underlying filesystem guarantees that no other references to "page"
- * exist.
- */
-int buffer_migrate_page(struct address_space *mapping,
-		struct page *newpage, struct page *page, enum migrate_mode mode)
+static int __buffer_migrate_page(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode,
+		bool check_refs)
 {
 	struct buffer_head *bh, *head;
 	int rc;
@@ -771,6 +767,33 @@ int buffer_migrate_page(struct address_space *mapping,
 	if (!buffer_migrate_lock_buffers(head, mode))
 		return -EAGAIN;
 
+	if (check_refs) {
+		bool busy;
+		bool invalidated = false;
+
+recheck_buffers:
+		busy = false;
+		spin_lock(&mapping->private_lock);
+		bh = head;
+		do {
+			if (atomic_read(&bh->b_count)) {
+				busy = true;
+				break;
+			}
+			bh = bh->b_this_page;
+		} while (bh != head);
+		spin_unlock(&mapping->private_lock);
+		if (busy) {
+			if (invalidated) {
+				rc = -EAGAIN;
+				goto unlock_buffers;
+			}
+			invalidate_bh_lrus();
+			invalidated = true;
+			goto recheck_buffers;
+		}
+	}
+
 	rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
 	if (rc != MIGRATEPAGE_SUCCESS)
 		goto unlock_buffers;
@@ -807,7 +830,31 @@ int buffer_migrate_page(struct address_space *mapping,
 
 	return rc;
 }
+
+/*
+ * Migration function for pages with buffers. This function can only be used
+ * if the underlying filesystem guarantees that no other references to "page"
+ * exist. For example attached buffer heads are accessed only under page lock.
+ */
+int buffer_migrate_page(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	return __buffer_migrate_page(mapping, newpage, page, mode, false);
+}
 EXPORT_SYMBOL(buffer_migrate_page);
+
+/*
+ * Same as above except that this variant is more careful and checks that there
+ * are also no buffer head references. This function is the right one for
+ * mappings where buffer heads are directly looked up and referenced (such as
+ * block device mappings).
+ */
+int buffer_migrate_page_norefs(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	return __buffer_migrate_page(mapping, newpage, page, mode, true);
+}
+EXPORT_SYMBOL(buffer_migrate_page_norefs);
 #endif
 
 /*
-- 
2.16.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 5/6] blkdev: Avoid migration stalls for blkdev pages
  2018-12-11 17:21 mm: migrate: Fix page migration stalls for blkdev pages Jan Kara
                   ` (3 preceding siblings ...)
  2018-12-11 17:21 ` [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs() Jan Kara
@ 2018-12-11 17:21 ` Jan Kara
  2018-12-13 15:35   ` Mel Gorman
  2018-12-11 17:21 ` [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping() Jan Kara
  5 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2018-12-11 17:21 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, mgorman, Jan Kara

Currently, block device pages don't provide a ->migratepage callback and
thus fallback_migrate_page() is used for them. This handler cannot deal
with dirty pages in async mode and also with the case a buffer head is in
the LRU buffer head cache (as it has elevated b_count). Thus such page can
block memory offlining.

Fix the problem by using buffer_migrate_page_norefs() for migrating
block device pages. That function takes care of dropping bh LRU in case
migration would fail due to elevated buffer refcount to avoid stalls and
can also migrate dirty pages without writing them.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/block_dev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index a80b4f0ee7c4..de2135178e62 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1966,6 +1966,7 @@ static const struct address_space_operations def_blk_aops = {
 	.writepages	= blkdev_writepages,
 	.releasepage	= blkdev_releasepage,
 	.direct_IO	= blkdev_direct_IO,
+	.migratepage	= buffer_migrate_page_norefs,
 	.is_dirty_writeback = buffer_check_dirty_writeback,
 };
 
-- 
2.16.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping()
  2018-12-11 17:21 mm: migrate: Fix page migration stalls for blkdev pages Jan Kara
                   ` (4 preceding siblings ...)
  2018-12-11 17:21 ` [PATCH 5/6] blkdev: Avoid migration stalls for blkdev pages Jan Kara
@ 2018-12-11 17:21 ` Jan Kara
  2018-12-13 15:35   ` Mel Gorman
  5 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2018-12-11 17:21 UTC (permalink / raw)
  To: linux-mm; +Cc: mhocko, mgorman, Jan Kara

All callers of migrate_page_move_mapping() now pass NULL for 'head'
argument. Drop it.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/aio.c                | 2 +-
 fs/f2fs/data.c          | 2 +-
 fs/iomap.c              | 2 +-
 fs/ubifs/file.c         | 2 +-
 include/linux/migrate.h | 3 +--
 mm/migrate.c            | 7 +++----
 6 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 97f983592925..4f4878ebca9a 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -408,7 +408,7 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 	BUG_ON(PageWriteback(old));
 	get_page(new);
 
-	rc = migrate_page_move_mapping(mapping, new, old, NULL, mode, 1);
+	rc = migrate_page_move_mapping(mapping, new, old, mode, 1);
 	if (rc != MIGRATEPAGE_SUCCESS) {
 		put_page(new);
 		goto out_unlock;
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index b293cb3e27a2..008b74eff00d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2738,7 +2738,7 @@ int f2fs_migrate_page(struct address_space *mapping,
 	 */
 	extra_count = (atomic_written ? 1 : 0) - page_has_private(page);
 	rc = migrate_page_move_mapping(mapping, newpage,
-				page, NULL, mode, extra_count);
+				page, mode, extra_count);
 	if (rc != MIGRATEPAGE_SUCCESS) {
 		if (atomic_written)
 			mutex_unlock(&fi->inmem_lock);
diff --git a/fs/iomap.c b/fs/iomap.c
index 3ffb776fbebe..8df6a75d2d11 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -550,7 +550,7 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage,
 {
 	int ret;
 
-	ret = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
+	ret = migrate_page_move_mapping(mapping, newpage, page, mode, 0);
 	if (ret != MIGRATEPAGE_SUCCESS)
 		return ret;
 
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 1b78f2e09218..5d2ffb1a45fc 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1481,7 +1481,7 @@ static int ubifs_migrate_page(struct address_space *mapping,
 {
 	int rc;
 
-	rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
+	rc = migrate_page_move_mapping(mapping, newpage, page, mode, 0);
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index f2b4abbca55e..8eeeaf946f95 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -77,8 +77,7 @@ extern void migrate_page_copy(struct page *newpage, struct page *page);
 extern int migrate_huge_page_move_mapping(struct address_space *mapping,
 				  struct page *newpage, struct page *page);
 extern int migrate_page_move_mapping(struct address_space *mapping,
-		struct page *newpage, struct page *page,
-		struct buffer_head *head, enum migrate_mode mode,
+		struct page *newpage, struct page *page, enum migrate_mode mode,
 		int extra_count);
 #else
 
diff --git a/mm/migrate.c b/mm/migrate.c
index c4075d5ec073..1e47ea88a5b3 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -403,8 +403,7 @@ static int expected_page_refs(struct page *page)
  * 3 for pages with a mapping and PagePrivate/PagePrivate2 set.
  */
 int migrate_page_move_mapping(struct address_space *mapping,
-		struct page *newpage, struct page *page,
-		struct buffer_head *head, enum migrate_mode mode,
+		struct page *newpage, struct page *page, enum migrate_mode mode,
 		int extra_count)
 {
 	XA_STATE(xas, &mapping->i_pages, page_index(page));
@@ -691,7 +690,7 @@ int migrate_page(struct address_space *mapping,
 
 	BUG_ON(PageWriteback(page));	/* Writeback must be complete */
 
-	rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
+	rc = migrate_page_move_mapping(mapping, newpage, page, mode, 0);
 
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
@@ -794,7 +793,7 @@ static int __buffer_migrate_page(struct address_space *mapping,
 		}
 	}
 
-	rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
+	rc = migrate_page_move_mapping(mapping, newpage, page, mode, 0);
 	if (rc != MIGRATEPAGE_SUCCESS)
 		goto unlock_buffers;
 
-- 
2.16.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references
  2018-12-11 17:21 ` [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references Jan Kara
@ 2018-12-13 13:05   ` Mel Gorman
  2018-12-14 15:10   ` Mel Gorman
  1 sibling, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2018-12-13 13:05 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Tue, Dec 11, 2018 at 06:21:38PM +0100, Jan Kara wrote:
> Factor out function to compute number of expected page references in
> migrate_page_move_mapping(). Note that we move hpage_nr_pages() and
> page_has_private() checks from under xas_lock_irq() however this is safe
> since we hold page lock.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/6] mm: migrate: Lock buffers before migrate_page_move_mapping()
  2018-12-11 17:21 ` [PATCH 2/6] mm: migrate: Lock buffers before migrate_page_move_mapping() Jan Kara
@ 2018-12-13 14:19   ` Mel Gorman
  0 siblings, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2018-12-13 14:19 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Tue, Dec 11, 2018 at 06:21:39PM +0100, Jan Kara wrote:
> Lock buffers before calling into migrate_page_move_mapping() so that
> that function doesn't have to know about buffers (which is somewhat
> unexpected anyway) and all the buffer head logic is in
> buffer_migrate_page().
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

This is a much nicer flow.

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/6] mm: migrate: Move migrate_page_lock_buffers()
  2018-12-11 17:21 ` [PATCH 3/6] mm: migrate: Move migrate_page_lock_buffers() Jan Kara
@ 2018-12-13 14:57   ` Mel Gorman
  0 siblings, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2018-12-13 14:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Tue, Dec 11, 2018 at 06:21:40PM +0100, Jan Kara wrote:
> buffer_migrate_page() is the only caller of migrate_page_lock_buffers()
> move it close to it and also drop the now unused stub for !CONFIG_BLOCK.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs()
  2018-12-11 17:21 ` [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs() Jan Kara
@ 2018-12-13 15:34   ` Mel Gorman
  2018-12-14  4:53   ` Andrew Morton
  1 sibling, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2018-12-13 15:34 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Tue, Dec 11, 2018 at 06:21:41PM +0100, Jan Kara wrote:
> Provide a variant of buffer_migrate_page() that also checks whether
> there are no unexpected references to buffer heads. This function will
> then be safe to use for block device pages.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/6] blkdev: Avoid migration stalls for blkdev pages
  2018-12-11 17:21 ` [PATCH 5/6] blkdev: Avoid migration stalls for blkdev pages Jan Kara
@ 2018-12-13 15:35   ` Mel Gorman
  0 siblings, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2018-12-13 15:35 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Tue, Dec 11, 2018 at 06:21:42PM +0100, Jan Kara wrote:
> Currently, block device pages don't provide a ->migratepage callback and
> thus fallback_migrate_page() is used for them. This handler cannot deal
> with dirty pages in async mode and also with the case a buffer head is in
> the LRU buffer head cache (as it has elevated b_count). Thus such page can
> block memory offlining.
> 
> Fix the problem by using buffer_migrate_page_norefs() for migrating
> block device pages. That function takes care of dropping bh LRU in case
> migration would fail due to elevated buffer refcount to avoid stalls and
> can also migrate dirty pages without writing them.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping()
  2018-12-11 17:21 ` [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping() Jan Kara
@ 2018-12-13 15:35   ` Mel Gorman
  2018-12-13 16:17     ` Jan Kara
  0 siblings, 1 reply; 21+ messages in thread
From: Mel Gorman @ 2018-12-13 15:35 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Tue, Dec 11, 2018 at 06:21:43PM +0100, Jan Kara wrote:
> All callers of migrate_page_move_mapping() now pass NULL for 'head'
> argument. Drop it.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping()
  2018-12-13 15:35   ` Mel Gorman
@ 2018-12-13 16:17     ` Jan Kara
  2018-12-17 13:17       ` Jan Kara
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2018-12-13 16:17 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Jan Kara, linux-mm, mhocko, Andrew Morton

On Thu 13-12-18 15:35:43, Mel Gorman wrote:
> On Tue, Dec 11, 2018 at 06:21:43PM +0100, Jan Kara wrote:
> > All callers of migrate_page_move_mapping() now pass NULL for 'head'
> > argument. Drop it.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> Acked-by: Mel Gorman <mgorman@suse.de>

Thanks for review Mel! Andrew, can you please pick up the series? Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs()
  2018-12-11 17:21 ` [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs() Jan Kara
  2018-12-13 15:34   ` Mel Gorman
@ 2018-12-14  4:53   ` Andrew Morton
  2018-12-14  9:26     ` Jan Kara
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2018-12-14  4:53 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko, mgorman

On Tue, 11 Dec 2018 18:21:41 +0100 Jan Kara <jack@suse.cz> wrote:

> Provide a variant of buffer_migrate_page() that also checks whether
> there are no unexpected references to buffer heads. This function will
> then be safe to use for block device pages.
> 
> ...
>
> +EXPORT_SYMBOL(buffer_migrate_page_norefs);

The export is presently unneeded and I don't think we expect that this
will be used by anything other than fs/block_dev.c?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs()
  2018-12-14  4:53   ` Andrew Morton
@ 2018-12-14  9:26     ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2018-12-14  9:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, linux-mm, mhocko, mgorman

On Thu 13-12-18 20:53:53, Andrew Morton wrote:
> On Tue, 11 Dec 2018 18:21:41 +0100 Jan Kara <jack@suse.cz> wrote:
> 
> > Provide a variant of buffer_migrate_page() that also checks whether
> > there are no unexpected references to buffer heads. This function will
> > then be safe to use for block device pages.
> > 
> > ...
> >
> > +EXPORT_SYMBOL(buffer_migrate_page_norefs);
> 
> The export is presently unneeded and I don't think we expect that this
> will be used by anything other than fs/block_dev.c?

Good point. We can always re-add the export if someone needs it. Thanks for
removing it!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references
  2018-12-11 17:21 ` [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references Jan Kara
  2018-12-13 13:05   ` Mel Gorman
@ 2018-12-14 15:10   ` Mel Gorman
  2018-12-14 15:53     ` Jan Kara
  1 sibling, 1 reply; 21+ messages in thread
From: Mel Gorman @ 2018-12-14 15:10 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Tue, Dec 11, 2018 at 06:21:38PM +0100, Jan Kara wrote:
> Factor out function to compute number of expected page references in
> migrate_page_move_mapping(). Note that we move hpage_nr_pages() and
> page_has_private() checks from under xas_lock_irq() however this is safe
> since we hold page lock.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  mm/migrate.c | 27 +++++++++++++++++----------
>  1 file changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f7e4bfdc13b7..789c7bc90a0c 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -428,6 +428,22 @@ static inline bool buffer_migrate_lock_buffers(struct buffer_head *head,
>  }
>  #endif /* CONFIG_BLOCK */
>  
> +static int expected_page_refs(struct page *page)
> +{
> +	int expected_count = 1;
> +
> +	/*
> +	 * Device public or private pages have an extra refcount as they are
> +	 * ZONE_DEVICE pages.
> +	 */
> +	expected_count += is_device_private_page(page);
> +	expected_count += is_device_public_page(page);
> +	if (page->mapping)
> +		expected_count += hpage_nr_pages(page) + page_has_private(page);
> +
> +	return expected_count;
> +}
> +

I noticed during testing that THP allocation success rates under the
mmtests configuration global-dhp__workload_thpscale-madvhugepage-xfs were
terrible with massive latencies introduced somewhere in the series. I
haven't tried chasing it down as it's relatively late but this block
looked odd and I missed it the first time.

This page->mapping test is relevant for the "Anonymous page without
mapping" check but I think it's wrong. An anonymous page without mapping
doesn't have a NULL mapping, it sets PAGE_MAPPING_ANON and the field can
be special in other ways. I think you meant to use page_mapping(page)
here, not page->mapping?

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references
  2018-12-14 15:10   ` Mel Gorman
@ 2018-12-14 15:53     ` Jan Kara
  2018-12-14 16:24       ` Mel Gorman
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2018-12-14 15:53 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Jan Kara, linux-mm, mhocko

On Fri 14-12-18 15:10:46, Mel Gorman wrote:
> On Tue, Dec 11, 2018 at 06:21:38PM +0100, Jan Kara wrote:
> > Factor out function to compute number of expected page references in
> > migrate_page_move_mapping(). Note that we move hpage_nr_pages() and
> > page_has_private() checks from under xas_lock_irq() however this is safe
> > since we hold page lock.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  mm/migrate.c | 27 +++++++++++++++++----------
> >  1 file changed, 17 insertions(+), 10 deletions(-)
> > 
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index f7e4bfdc13b7..789c7bc90a0c 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -428,6 +428,22 @@ static inline bool buffer_migrate_lock_buffers(struct buffer_head *head,
> >  }
> >  #endif /* CONFIG_BLOCK */
> >  
> > +static int expected_page_refs(struct page *page)
> > +{
> > +	int expected_count = 1;
> > +
> > +	/*
> > +	 * Device public or private pages have an extra refcount as they are
> > +	 * ZONE_DEVICE pages.
> > +	 */
> > +	expected_count += is_device_private_page(page);
> > +	expected_count += is_device_public_page(page);
> > +	if (page->mapping)
> > +		expected_count += hpage_nr_pages(page) + page_has_private(page);
> > +
> > +	return expected_count;
> > +}
> > +
> 
> I noticed during testing that THP allocation success rates under the
> mmtests configuration global-dhp__workload_thpscale-madvhugepage-xfs were
> terrible with massive latencies introduced somewhere in the series. I
> haven't tried chasing it down as it's relatively late but this block
> looked odd and I missed it the first time.

Interesting. I've run config-global-dhp__workload_thpscale and that didn't
show anything strange. But the numbers were fluctuating a lot both with and
without my patches applied. I'll have a look if I can reproduce this
sometime next week and look what could be causing the delays.

> This page->mapping test is relevant for the "Anonymous page without
> mapping" check but I think it's wrong. An anonymous page without mapping
> doesn't have a NULL mapping, it sets PAGE_MAPPING_ANON and the field can
> be special in other ways. I think you meant to use page_mapping(page)
> here, not page->mapping?

Yes, that's a bug. It should have been page_mapping(page). Thanks for
catching this.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references
  2018-12-14 15:53     ` Jan Kara
@ 2018-12-14 16:24       ` Mel Gorman
  2018-12-17 13:11         ` Jan Kara
  0 siblings, 1 reply; 21+ messages in thread
From: Mel Gorman @ 2018-12-14 16:24 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-mm, mhocko

On Fri, Dec 14, 2018 at 04:53:11PM +0100, Jan Kara wrote:
> > I noticed during testing that THP allocation success rates under the
> > mmtests configuration global-dhp__workload_thpscale-madvhugepage-xfs were
> > terrible with massive latencies introduced somewhere in the series. I
> > haven't tried chasing it down as it's relatively late but this block
> > looked odd and I missed it the first time.
> 
> Interesting. I've run config-global-dhp__workload_thpscale and that didn't
> show anything strange. But the numbers were fluctuating a lot both with and
> without my patches applied. I'll have a look if I can reproduce this
> sometime next week and look what could be causing the delays.
> 

Ah, it's the difference between madvise and !madvise. The configuration
you used does very little compaction as it neither wakes kswapd of
kcompactd. It just falls back to base pages to limit fault latency so
you wouldn't have hit the same paths of interest.

> > This page->mapping test is relevant for the "Anonymous page without
> > mapping" check but I think it's wrong. An anonymous page without mapping
> > doesn't have a NULL mapping, it sets PAGE_MAPPING_ANON and the field can
> > be special in other ways. I think you meant to use page_mapping(page)
> > here, not page->mapping?
> 
> Yes, that's a bug. It should have been page_mapping(page). Thanks for
> catching this.
> 

My pleasure, should have spotted it the first time around :/

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references
  2018-12-14 16:24       ` Mel Gorman
@ 2018-12-17 13:11         ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2018-12-17 13:11 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Jan Kara, linux-mm, mhocko

On Fri 14-12-18 16:24:28, Mel Gorman wrote:
> On Fri, Dec 14, 2018 at 04:53:11PM +0100, Jan Kara wrote:
> > > This page->mapping test is relevant for the "Anonymous page without
> > > mapping" check but I think it's wrong. An anonymous page without mapping
> > > doesn't have a NULL mapping, it sets PAGE_MAPPING_ANON and the field can
> > > be special in other ways. I think you meant to use page_mapping(page)
> > > here, not page->mapping?
> > 
> > Yes, that's a bug. It should have been page_mapping(page). Thanks for
> > catching this.
> > 
> 
> My pleasure, should have spotted it the first time around :/

And after fixing this, I can see the success rates for
global-dhp__workload_thpscale-madvhugepage to go back to the original
values. So that was indeed the problem. I'll send the fixup to Andrew.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping()
  2018-12-13 16:17     ` Jan Kara
@ 2018-12-17 13:17       ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2018-12-17 13:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, linux-mm, mhocko, Mel Gorman


[-- Attachment #1: Type: text/plain, Size: 608 bytes --]

On Thu 13-12-18 17:17:46, Jan Kara wrote:
> On Thu 13-12-18 15:35:43, Mel Gorman wrote:
> > On Tue, Dec 11, 2018 at 06:21:43PM +0100, Jan Kara wrote:
> > > All callers of migrate_page_move_mapping() now pass NULL for 'head'
> > > argument. Drop it.
> > > 
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > 
> > Acked-by: Mel Gorman <mgorman@suse.de>
> 
> Thanks for review Mel! Andrew, can you please pick up the series? Thanks!

Andrew, Mel has spotted a bug in patch 1/6 or this series. Can you please
add the attached fixup to the patch 1? Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: migratepage-fixup.patch --]
[-- Type: text/x-patch, Size: 435 bytes --]

diff --git a/mm/migrate.c b/mm/migrate.c
index 789c7bc90a0c..816ce10e908f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -438,7 +438,7 @@ static int expected_page_refs(struct page *page)
 	 */
 	expected_count += is_device_private_page(page);
 	expected_count += is_device_public_page(page);
-	if (page->mapping)
+	if (page_mapping(page))
 		expected_count += hpage_nr_pages(page) + page_has_private(page);
 
 	return expected_count;

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, back to index

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-11 17:21 mm: migrate: Fix page migration stalls for blkdev pages Jan Kara
2018-12-11 17:21 ` [PATCH 1/6] mm: migration: Factor out code to compute expected number of page references Jan Kara
2018-12-13 13:05   ` Mel Gorman
2018-12-14 15:10   ` Mel Gorman
2018-12-14 15:53     ` Jan Kara
2018-12-14 16:24       ` Mel Gorman
2018-12-17 13:11         ` Jan Kara
2018-12-11 17:21 ` [PATCH 2/6] mm: migrate: Lock buffers before migrate_page_move_mapping() Jan Kara
2018-12-13 14:19   ` Mel Gorman
2018-12-11 17:21 ` [PATCH 3/6] mm: migrate: Move migrate_page_lock_buffers() Jan Kara
2018-12-13 14:57   ` Mel Gorman
2018-12-11 17:21 ` [PATCH 4/6] mm: migrate: Provide buffer_migrate_page_norefs() Jan Kara
2018-12-13 15:34   ` Mel Gorman
2018-12-14  4:53   ` Andrew Morton
2018-12-14  9:26     ` Jan Kara
2018-12-11 17:21 ` [PATCH 5/6] blkdev: Avoid migration stalls for blkdev pages Jan Kara
2018-12-13 15:35   ` Mel Gorman
2018-12-11 17:21 ` [PATCH 6/6] mm: migrate: Drop unused argument of migrate_page_move_mapping() Jan Kara
2018-12-13 15:35   ` Mel Gorman
2018-12-13 16:17     ` Jan Kara
2018-12-17 13:17       ` Jan Kara

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git