Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org
Subject: [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
Date: Mon, 26 Nov 2018 15:21:58 -0800 (PST)
Message-ID: <alpine.LSU.2.11.1811261520070.2275@eggly.anvils> (raw)
In-Reply-To: <alpine.LSU.2.11.1811261444420.2275@eggly.anvils>

Huge tmpfs testing, on 32-bit kernel with lockdep enabled, showed that
__split_huge_page() was using i_size_read() while holding the irq-safe
lru_lock and page tree lock, but the 32-bit i_size_read() uses an
irq-unsafe seqlock which should not be nested inside them.

Instead, read the i_size earlier in split_huge_page_to_list(), and pass
the end offset down to __split_huge_page(): all while holding head page
lock, which is enough to prevent truncation of that extent before the
page tree lock has been taken.

Fixes: baa355fd33142 ("thp: file pages support for split_huge_page()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 4.8+
---
 mm/huge_memory.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cef2c256e7c4..622cced74fd9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2439,12 +2439,11 @@ static void __split_huge_page_tail(struct page *head, int tail,
 }
 
 static void __split_huge_page(struct page *page, struct list_head *list,
-		unsigned long flags)
+		pgoff_t end, unsigned long flags)
 {
 	struct page *head = compound_head(page);
 	struct zone *zone = page_zone(head);
 	struct lruvec *lruvec;
-	pgoff_t end = -1;
 	int i;
 
 	lruvec = mem_cgroup_page_lruvec(head, zone->zone_pgdat);
@@ -2452,9 +2451,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	/* complete memcg works before add pages to LRU */
 	mem_cgroup_split_huge_fixup(head);
 
-	if (!PageAnon(page))
-		end = DIV_ROUND_UP(i_size_read(head->mapping->host), PAGE_SIZE);
-
 	for (i = HPAGE_PMD_NR - 1; i >= 1; i--) {
 		__split_huge_page_tail(head, i, lruvec, list);
 		/* Some pages can be beyond i_size: drop them from page cache */
@@ -2626,6 +2622,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	int count, mapcount, extra_pins, ret;
 	bool mlocked;
 	unsigned long flags;
+	pgoff_t end;
 
 	VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
@@ -2648,6 +2645,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 			ret = -EBUSY;
 			goto out;
 		}
+		end = -1;
 		mapping = NULL;
 		anon_vma_lock_write(anon_vma);
 	} else {
@@ -2661,6 +2659,15 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 
 		anon_vma = NULL;
 		i_mmap_lock_read(mapping);
+
+		/*
+		 *__split_huge_page() may need to trim off pages beyond EOF:
+		 * but on 32-bit, i_size_read() takes an irq-unsafe seqlock,
+		 * which cannot be nested inside the page tree lock. So note
+		 * end now: i_size itself may be changed at any moment, but
+		 * head page lock is good enough to serialize the trimming.
+		 */
+		end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
 	}
 
 	/*
@@ -2707,7 +2714,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		if (mapping)
 			__dec_node_page_state(page, NR_SHMEM_THPS);
 		spin_unlock(&pgdata->split_queue_lock);
-		__split_huge_page(page, list, flags);
+		__split_huge_page(page, list, end, flags);
 		if (PageSwapCache(head)) {
 			swp_entry_t entry = { .val = page_private(head) };
 
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

  parent reply index

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
2018-11-26 23:19 ` [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze Hugh Dickins
2018-11-27  7:26   ` Kirill A. Shutemov
2018-11-26 23:21 ` Hugh Dickins [this message]
2018-11-27  7:27   ` [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Kirill A. Shutemov
2018-11-26 23:23 ` [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated Hugh Dickins
2018-11-27  7:31   ` Kirill A. Shutemov
2018-11-27 13:26   ` Matthew Wilcox
2018-11-26 23:25 ` [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes Hugh Dickins
2018-11-27  7:35   ` Kirill A. Shutemov
2018-11-26 23:26 ` [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes Hugh Dickins
2018-11-27  7:38   ` Kirill A. Shutemov
2018-11-26 23:27 ` [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem() Hugh Dickins
2018-11-27  7:59   ` Kirill A. Shutemov
2018-11-27 20:23     ` Hugh Dickins
2018-11-28 10:59       ` Kirill A. Shutemov
2018-11-28 19:40         ` Hugh Dickins
2018-11-26 23:29 ` [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page Hugh Dickins
2018-11-27  8:01   ` Kirill A. Shutemov
2018-11-26 23:31 ` [PATCH 09/10] mm/khugepaged: collapse_shmem() do not crash on Compound Hugh Dickins
2018-11-26 23:33 ` [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path Hugh Dickins
2018-11-27  8:02   ` Kirill A. Shutemov

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.1811261520070.2275@eggly.anvils \
    --to=hughd@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-mm@kvack.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org linux-mm@archiver.kernel.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox