From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755166Ab2ALTak (ORCPT ); Thu, 12 Jan 2012 14:30:40 -0500
Received: from mx1.redhat.com ([209.132.183.28]:13953 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755058Ab2ALT3y (ORCPT ); Thu, 12 Jan 2012 14:29:54 -0500
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Rientjes, Andi Kleen, Wu Fengguang,
	Andrea Arcangeli, KOSAKI Motohiro, KAMEZAWA Hiroyuki,
	linux-kernel@vger.kernel.org, Naoya Horiguchi
Subject: [PATCH 1/6] pagemap: avoid splitting thp when reading /proc/pid/pagemap
Date: Thu, 12 Jan 2012 14:34:53 -0500
Message-Id: <1326396898-5579-2-git-send-email-n-horiguchi@ah.jp.nec.com>
In-Reply-To: <1326396898-5579-1-git-send-email-n-horiguchi@ah.jp.nec.com>
References: <1326396898-5579-1-git-send-email-n-horiguchi@ah.jp.nec.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Splitting a thp is not necessary if we explicitly check whether the pmd
maps a thp. This patch introduces that check and the code to generate
pagemap entries for pmds mapping thps, which reduces the performance
impact of reading pagemap on thp.

Signed-off-by: Naoya Horiguchi
Reviewed-by: Andi Kleen
Reviewed-by: KAMEZAWA Hiroyuki

Changes since v2:
  - Add comment on if check in thp_pte_to_pagemap_entry()
  - Convert type of offset into unsigned long

Changes since v1:
  - Move pfn declaration to the beginning of pagemap_pte_range()
---
 fs/proc/task_mmu.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 48 insertions(+), 6 deletions(-)

diff --git 3.2-rc5.orig/fs/proc/task_mmu.c 3.2-rc5/fs/proc/task_mmu.c
index e418c5a..bd19177 100644
--- 3.2-rc5.orig/fs/proc/task_mmu.c
+++ 3.2-rc5/fs/proc/task_mmu.c
@@ -600,6 +600,9 @@ struct pagemapread {
 	u64 *buffer;
 };
 
+#define PAGEMAP_WALK_SIZE	(PMD_SIZE)
+#define PAGEMAP_WALK_MASK	(PMD_MASK)
+
 #define PM_ENTRY_BYTES      sizeof(u64)
 #define PM_STATUS_BITS      3
 #define PM_STATUS_OFFSET    (64 - PM_STATUS_BITS)
@@ -658,6 +661,27 @@ static u64 pte_to_pagemap_entry(pte_t pte)
 	return pme;
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static u64 thp_pte_to_pagemap_entry(pte_t pte, int offset)
+{
+	u64 pme = 0;
+	/*
+	 * Currently pte for thp is always present because thp can not be
+	 * swapped-out, migrated, or HWPOISONed (split in such cases instead.)
+	 * This if-check is just to prepare for future implementation.
+	 */
+	if (pte_present(pte))
+		pme = PM_PFRAME(pte_pfn(pte) + offset)
+			| PM_PSHIFT(PAGE_SHIFT) | PM_PRESENT;
+	return pme;
+}
+#else
+static inline u64 thp_pte_to_pagemap_entry(pte_t pte, int offset)
+{
+	return 0;
+}
+#endif
+
 static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 			     struct mm_walk *walk)
 {
@@ -665,14 +689,34 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	struct pagemapread *pm = walk->private;
 	pte_t *pte;
 	int err = 0;
-
-	split_huge_page_pmd(walk->mm, pmd);
+	u64 pfn = PM_NOT_PRESENT;
 
 	/* find the first VMA at or above 'addr' */
 	vma = find_vma(walk->mm, addr);
-	for (; addr != end; addr += PAGE_SIZE) {
-		u64 pfn = PM_NOT_PRESENT;
+	spin_lock(&walk->mm->page_table_lock);
+	if (pmd_trans_huge(*pmd)) {
+		if (pmd_trans_splitting(*pmd)) {
+			spin_unlock(&walk->mm->page_table_lock);
+			wait_split_huge_page(vma->anon_vma, pmd);
+		} else {
+			for (; addr != end; addr += PAGE_SIZE) {
+				unsigned long offset = (addr & ~PAGEMAP_WALK_MASK)
+					>> PAGE_SHIFT;
+				pfn = thp_pte_to_pagemap_entry(*(pte_t *)pmd,
+							       offset);
+				err = add_to_pagemap(addr, pfn, pm);
+				if (err)
+					break;
+			}
+			spin_unlock(&walk->mm->page_table_lock);
+			return err;
+		}
+	} else {
+		spin_unlock(&walk->mm->page_table_lock);
+	}
+
+	for (; addr != end; addr += PAGE_SIZE) {
 
 		/* check to see if we've left 'vma' behind
 		 * and need a new, higher one */
 		if (vma && (addr >= vma->vm_end))
@@ -754,8 +798,6 @@ static int pagemap_hugetlb_range(pte_t *pte, unsigned long hmask,
  * determine which areas of memory are actually mapped and llseek to
  * skip over unmapped regions.
  */
-#define PAGEMAP_WALK_SIZE	(PMD_SIZE)
-#define PAGEMAP_WALK_MASK	(PMD_MASK)
 static ssize_t pagemap_read(struct file *file, char __user *buf,
 			    size_t count, loff_t *ppos)
 {
-- 
1.7.6.5