From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>, Hugh Dickins <hughd@google.com>,
	Rik van Riel <riel@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	stable@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] hugetlbfs: add swap entry check in follow_hugetlb_page()
Date: Thu, 28 Mar 2013 11:42:38 -0400
Message-ID: <1364485358-8745-3-git-send-email-n-horiguchi@ah.jp.nec.com>
In-Reply-To: <1364485358-8745-1-git-send-email-n-horiguchi@ah.jp.nec.com>

With the previous patch "hugetlbfs: stop setting VM_DONTDUMP in
initializing vma(VM_HUGETLB)" applied to re-enable hugepage coredump,
if a memory error happens on a hugepage and the affected processes try
to access the error hugepage, we hit
VM_BUG_ON(atomic_read(&page->_count) <= 0) in get_page().

The reason for this bug is that the coredump-related code doesn't
recognise the "hugepage hwpoison entry" with which a pmd entry is
replaced when a memory error occurs on a hugepage. In other words,
physical address information is stored in a different bit layout in a
hugepage hwpoison entry than in a pmd entry, so follow_hugetlb_page(),
which is called from get_dump_page(), returns a wrong page for a given
address.

We need to filter out only hwpoison hugepages so that data on healthy
hugepages is still included in the coredump. So this patch makes
follow_hugetlb_page() avoid trying to get a page when a pmd is in a
swap-entry-like format.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/hugetlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git v3.9-rc3.orig/mm/hugetlb.c v3.9-rc3/mm/hugetlb.c
index 0d1705b..8462e2c 100644
--- v3.9-rc3.orig/mm/hugetlb.c
+++ v3.9-rc3/mm/hugetlb.c
@@ -2968,7 +2968,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * first, for the page indexing below to work.
 		 */
 		pte = huge_pte_offset(mm, vaddr & huge_page_mask(h));
-		absent = !pte || huge_pte_none(huge_ptep_get(pte));
+		absent = !pte || huge_pte_none(huge_ptep_get(pte)) ||
+			is_swap_pte(huge_ptep_get(pte));
 
 		/*
 		 * When coredumping, it suits get_dump_page if we just return
-- 
1.7.11.7
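
To illustrate why the extra check matters, here is a small userspace
sketch in C. It is not kernel code: the entry layout, the constants and
every helper name (entry_t, make_present(), make_hwpoison(),
absent_before_patch(), absent_after_patch()) are hypothetical and chosen
only for this example. It assumes a present entry keeps its pfn in one
bit range while a swap-like hwpoison entry keeps it in another, and it
models the swap check as "not none and not present".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit entry layout, for illustration only. */
typedef uint64_t entry_t;

#define ENTRY_PRESENT      (1ULL << 0) /* set only for mapped pages */
#define PFN_SHIFT          12          /* present entry: pfn in bits 12 and up */
#define SWP_TYPE_SHIFT     1           /* swap-like entry: type in bits 1..5 */
#define SWP_TYPE_BITS      5
#define SWP_OFFSET_SHIFT   (SWP_TYPE_SHIFT + SWP_TYPE_BITS) /* pfn above the type */
#define SWP_HWPOISON_TYPE  31ULL       /* assumed type value for hwpoison */

/* Build an entry that maps a healthy page. */
static entry_t make_present(uint64_t pfn)
{
	return (pfn << PFN_SHIFT) | ENTRY_PRESENT;
}

/* Build a swap-like hwpoison entry; the present bit stays clear. */
static entry_t make_hwpoison(uint64_t pfn)
{
	assert(pfn < (1ULL << (64 - SWP_OFFSET_SHIFT)));
	return (pfn << SWP_OFFSET_SHIFT) | (SWP_HWPOISON_TYPE << SWP_TYPE_SHIFT);
}

static bool entry_none(entry_t e)
{
	return e == 0;
}

/* Swap-like means: something is stored, but it is not a present mapping. */
static bool entry_is_swap(entry_t e)
{
	return !entry_none(e) && !(e & ENTRY_PRESENT);
}

/* Decode the pfn as if the entry were present; wrong for swap-like entries. */
static uint64_t entry_pfn_as_present(entry_t e)
{
	return e >> PFN_SHIFT;
}

/* Old check: a swap-like entry is treated as a mapped page. */
static bool absent_before_patch(const entry_t *p)
{
	return !p || entry_none(*p);
}

/* New check: swap-like entries are skipped as well. */
static bool absent_after_patch(const entry_t *p)
{
	return !p || entry_none(*p) || entry_is_swap(*p);
}
```

With this model, an entry built by make_hwpoison() passes the old check
as if it were mapped, and decoding it with the present-entry layout
gives a pfn unrelated to the poisoned page. The new check reports it as
absent, while entries built by make_present() are handled as before.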