From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754507Ab2KUKWq (ORCPT ); Wed, 21 Nov 2012 05:22:46 -0500
Received: from cantor2.suse.de ([195.135.220.15]:43874 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754580Ab2KUKWk (ORCPT ); Wed, 21 Nov 2012 05:22:40 -0500
From: Mel Gorman
To: Peter Zijlstra, Andrea Arcangeli, Ingo Molnar
Cc: Rik van Riel, Johannes Weiner, Hugh Dickins, Thomas Gleixner,
	Paul Turner, Lee Schermerhorn, Alex Shi, Linus Torvalds,
	Andrew Morton, Linux-MM, LKML, Mel Gorman
Subject: [PATCH 26/46] sched, numa, mm: Count WS scanning against present PTEs, not virtual memory ranges
Date: Wed, 21 Nov 2012 10:21:32 +0000
Message-Id: <1353493312-8069-27-git-send-email-mgorman@suse.de>
X-Mailer: git-send-email 1.7.9.2
In-Reply-To: <1353493312-8069-1-git-send-email-mgorman@suse.de>
References: <1353493312-8069-1-git-send-email-mgorman@suse.de>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

By accounting against the present PTEs, scanning speed reflects the
actual present (mapped) memory.
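
To illustrate the difference in accounting, here is a minimal userspace
sketch. It is not the kernel code: the flat present[] array, the per-page
walk, and the names mb_to_pages(), scan_by_range() and scan_by_present()
are all assumptions made for the example. The real code walks VMAs in
huge-page-aligned chunks and charges whatever change_prot_numa() reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert a scan size in MB to a page budget,
 * mirroring the "pages <<= 20 - PAGE_SHIFT" conversion in the patch.
 */
static long mb_to_pages(long mb, unsigned int page_shift)
{
	assert(page_shift <= 20);
	return mb << (20 - page_shift);
}

/*
 * Range-based accounting (simplified model of the old behaviour):
 * every virtual page walked is charged against the budget, mapped or not.
 * Returns the number of present pages actually visited.
 */
static long scan_by_range(const bool *present, size_t npages, long budget)
{
	long visited = 0;

	for (size_t i = 0; i < npages && budget > 0; i++) {
		budget--;		/* charged even if nothing is mapped */
		if (present[i])
			visited++;
	}
	return visited;
}

/*
 * Present-PTE accounting (simplified model of the new behaviour):
 * only pages that are actually mapped consume the budget.
 */
static long scan_by_present(const bool *present, size_t npages, long budget)
{
	long visited = 0;

	for (size_t i = 0; i < npages && budget > 0; i++) {
		if (present[i]) {
			visited++;
			budget--;	/* charged only for present pages */
		}
	}
	return visited;
}
```

With a sparse mapping, the range-based variant can exhaust its budget
while touching very few mapped pages, whereas the present-PTE variant
keeps walking until the budget has been spent on pages that exist.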

Suggested-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Mel Gorman
Signed-off-by: Ingo Molnar
Signed-off-by: Mel Gorman
---
 kernel/sched/fair.c |   36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 66d8bd2..773ef97 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -827,8 +827,8 @@ void task_numa_work(struct callback_head *work)
 	struct task_struct *p = current;
 	struct mm_struct *mm = p->mm;
 	struct vm_area_struct *vma;
-	unsigned long offset, end;
-	long length;
+	unsigned long start, end;
+	long pages;
 
 	WARN_ON_ONCE(p != container_of(work, struct task_struct, numa_work));
 
@@ -858,18 +858,20 @@ void task_numa_work(struct callback_head *work)
 	if (cmpxchg(&mm->numa_next_scan, migrate, next_scan) != migrate)
 		return;
 
-	offset = mm->numa_scan_offset;
-	length = sysctl_balance_numa_scan_size;
-	length <<= 20;
+	start = mm->numa_scan_offset;
+	pages = sysctl_balance_numa_scan_size;
+	pages <<= 20 - PAGE_SHIFT; /* MB in pages */
+	if (!pages)
+		return;
 
 	down_read(&mm->mmap_sem);
-	vma = find_vma(mm, offset);
+	vma = find_vma(mm, start);
 	if (!vma) {
 		reset_ptenuma_scan(p);
-		offset = 0;
+		start = 0;
 		vma = mm->mmap;
 	}
-	for (; vma && length > 0; vma = vma->vm_next) {
+	for (; vma; vma = vma->vm_next) {
 		if (!vma_migratable(vma))
 			continue;
 
@@ -877,15 +879,19 @@ void task_numa_work(struct callback_head *work)
 		if (((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) < HPAGE_PMD_NR)
 			continue;
 
-		offset = max(offset, vma->vm_start);
-		end = min(ALIGN(offset + length, HPAGE_SIZE), vma->vm_end);
-		length -= end - offset;
-
-		change_prot_numa(vma, offset, end);
+		do {
+			start = max(start, vma->vm_start);
+			end = ALIGN(start + (pages << PAGE_SHIFT), HPAGE_SIZE);
+			end = min(end, vma->vm_end);
+			pages -= change_prot_numa(vma, start, end);
 
-		offset = end;
+			start = end;
+			if (pages <= 0)
+				goto out;
+		} while (end != vma->vm_end);
 	}
 
+out:
 	/*
 	 * It is possible to reach the end of the VMA list but the last few VMAs are
 	 * not guaranteed to the vma_migratable. If they are not, we would find the
@@ -893,7 +899,7 @@ void task_numa_work(struct callback_head *work)
 	 * so check it now.
 	 */
 	if (vma)
-		mm->numa_scan_offset = offset;
+		mm->numa_scan_offset = start;
 	else
 		reset_ptenuma_scan(p);
 	up_read(&mm->mmap_sem);
-- 
1.7.9.2