From: Wanpeng Li <liwanp@linux.vnet.ibm.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Rik van Riel <riel@redhat.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 19/63] sched: Track NUMA hinting faults on per-node basis
Date: Wed, 4 Dec 2013 13:37:04 +0800
Message-ID: <529ebf88.28e9440a.54f2.fffff6aaSMTPIN_ADDED_BROKEN@mx.google.com>
In-Reply-To: <529ebe8c.a19e420a.72bb.ffff9a55SMTPIN_ADDED_BROKEN@mx.google.com>

On Wed, Dec 04, 2013 at 01:32:42PM +0800, Wanpeng Li wrote:
>On Mon, Oct 07, 2013 at 11:28:57AM +0100, Mel Gorman wrote:
>>This patch tracks what nodes numa hinting faults were incurred on.
>>This information is later used to schedule a task on the node storing
>>the pages most frequently faulted by the task.
>>
>>Signed-off-by: Mel Gorman <mgorman@suse.de>
>>---
>> include/linux/sched.h |  2 ++
>> kernel/sched/core.c   |  3 +++
>> kernel/sched/fair.c   | 11 ++++++++++-
>> kernel/sched/sched.h  | 12 ++++++++++++
>> 4 files changed, 27 insertions(+), 1 deletion(-)
>>
>>diff --git a/include/linux/sched.h b/include/linux/sched.h
>>index a8095ad..8828e40 100644
>>--- a/include/linux/sched.h
>>+++ b/include/linux/sched.h
>>@@ -1332,6 +1332,8 @@ struct task_struct {
>> 	unsigned int numa_scan_period_max;
>> 	u64 node_stamp;			/* migration stamp  */
>> 	struct callback_head numa_work;
>>+
>>+	unsigned long *numa_faults;
>> #endif /* CONFIG_NUMA_BALANCING */
>>
>> 	struct rcu_head rcu;
>>diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>index 681945e..aad2e02 100644
>>--- a/kernel/sched/core.c
>>+++ b/kernel/sched/core.c
>>@@ -1629,6 +1629,7 @@ static void __sched_fork(struct task_struct *p)
>> 	p->numa_migrate_seq = p->mm ? p->mm->numa_scan_seq - 1 : 0;
>> 	p->numa_scan_period = sysctl_numa_balancing_scan_delay;
>> 	p->numa_work.next = &p->numa_work;
>>+	p->numa_faults = NULL;
>> #endif /* CONFIG_NUMA_BALANCING */
>>
>> 	cpu_hotplug_init_task(p);
>>@@ -1892,6 +1893,8 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)
>> 	if (mm)
>> 		mmdrop(mm);
>> 	if (unlikely(prev_state == TASK_DEAD)) {
>>+		task_numa_free(prev);
>
>Function task_numa_free() depends on patch 43/64.

Sorry, I missed it: task_numa_free() is added by this patch itself, in
the kernel/sched/sched.h hunk below.

>
>Regards,
>Wanpeng Li 
>
>>+
>> 		/*
>> 		 * Remove function-return probe instances associated with this
>> 		 * task and put them back on the free list.
>>diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>index 8cea7a2..df300d9 100644
>>--- a/kernel/sched/fair.c
>>+++ b/kernel/sched/fair.c
>>@@ -902,7 +902,14 @@ void task_numa_fault(int node, int pages, bool migrated)
>> 	if (!numabalancing_enabled)
>> 		return;
>>
>>-	/* FIXME: Allocate task-specific structure for placement policy here */
>>+	/* Allocate buffer to track faults on a per-node basis */
>>+	if (unlikely(!p->numa_faults)) {
>>+		int size = sizeof(*p->numa_faults) * nr_node_ids;
>>+
>>+		p->numa_faults = kzalloc(size, GFP_KERNEL|__GFP_NOWARN);
>>+		if (!p->numa_faults)
>>+			return;
>>+	}
>>
>> 	/*
>> 	 * If pages are properly placed (did not migrate) then scan slower.
>>@@ -918,6 +925,8 @@ void task_numa_fault(int node, int pages, bool migrated)
>> 	}
>>
>> 	task_numa_placement(p);
>>+
>>+	p->numa_faults[node] += pages;
>> }
>>
>> static void reset_ptenuma_scan(struct task_struct *p)
>>diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>>index b3c5653..6a955f4 100644
>>--- a/kernel/sched/sched.h
>>+++ b/kernel/sched/sched.h
>>@@ -6,6 +6,7 @@
>> #include <linux/spinlock.h>
>> #include <linux/stop_machine.h>
>> #include <linux/tick.h>
>>+#include <linux/slab.h>
>>
>> #include "cpupri.h"
>> #include "cpuacct.h"
>>@@ -552,6 +553,17 @@ static inline u64 rq_clock_task(struct rq *rq)
>> 	return rq->clock_task;
>> }
>>
>>+#ifdef CONFIG_NUMA_BALANCING
>>+static inline void task_numa_free(struct task_struct *p)
>>+{
>>+	kfree(p->numa_faults);
>>+}
>>+#else /* CONFIG_NUMA_BALANCING */
>>+static inline void task_numa_free(struct task_struct *p)
>>+{
>>+}
>>+#endif /* CONFIG_NUMA_BALANCING */
>>+
>> #ifdef CONFIG_SMP
>>
>> #define rcu_dereference_check_sched_domain(p) \
>>-- 
>>1.8.4
>>
>>--
>>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>the body to majordomo@kvack.org.  For more info on Linux MM,
>>see: http://www.linux-mm.org/ .
>>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
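
The accounting in the quoted patch can be modelled in userspace. What follows is a minimal sketch, not the kernel code. It assumes a cut-down `struct task` in place of task_struct, a plain variable in place of the kernel's nr_node_ids, and calloc()/free() in place of kzalloc()/kfree(). The names `task_fault` and `task_free` are hypothetical.

```c
#include <stdlib.h>

/* Assumption: stand-in for the kernel's nr_node_ids, fixed here at 4. */
static int nr_node_ids = 4;

/* Assumption: userspace stand-in for the relevant part of task_struct. */
struct task {
	unsigned long *numa_faults;	/* one counter per node, allocated lazily */
};

/* Models the accounting the patch adds to task_numa_fault(). */
static void task_fault(struct task *p, int node, int pages)
{
	if (!p->numa_faults) {
		/* calloc() returns zeroed memory, as kzalloc() does in the patch */
		p->numa_faults = calloc(nr_node_ids, sizeof(*p->numa_faults));
		if (!p->numa_faults)
			return;		/* allocation failed: skip the accounting */
	}

	p->numa_faults[node] += pages;
}

/*
 * Models task_numa_free(). free(NULL) is a no-op, as kfree(NULL) is,
 * so a task that never faulted needs no special case. Clearing the
 * pointer is an addition of this sketch; the patch does not do it.
 */
static void task_free(struct task *p)
{
	free(p->numa_faults);
	p->numa_faults = NULL;
}
```

Starting from a zero-initialised `struct task t`, calling `task_fault(&t, 1, 8)` and then `task_fault(&t, 1, 4)` allocates the buffer on the first call and leaves `t.numa_faults[1]` at 12. `task_free(&t)` is safe to call whether or not the buffer was ever allocated.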
>

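The patch description says the per-node counters are later used to schedule a task on the node holding the pages it faults most often. That selection is not part of this patch. One way it could look, assuming a plain maximum over the counters, is sketched below. The helper name `preferred_node` and its -1 return convention are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the node with the highest fault count,
 * or -1 if the buffer was never allocated or no faults were recorded.
 * Ties resolve to the lowest-numbered node.
 */
static int preferred_node(const unsigned long *numa_faults, int nr_nodes)
{
	unsigned long max_faults = 0;
	int max_nid = -1;

	if (!numa_faults)
		return -1;

	for (int nid = 0; nid < nr_nodes; nid++) {
		if (numa_faults[nid] > max_faults) {
			max_faults = numa_faults[nid];
			max_nid = nid;
		}
	}

	return max_nid;
}
```

The strict comparison means a task with all-zero counters has no preferred node, which matches the case where hinting faults have not yet been recorded.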