linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] x86/intel_rdt: Fix setting of closid when adding CPUs to a group
@ 2016-11-18 23:18 Fenghua Yu
  2016-11-18 23:18 ` [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount Fenghua Yu
  2016-11-28 10:16 ` [tip:x86/cache] x86/intel_rdt: Fix setting of closid when adding CPUs to a group tip-bot for Fenghua Yu
  0 siblings, 2 replies; 8+ messages in thread
From: Fenghua Yu @ 2016-11-18 23:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: H. Peter Anvin, Ingo Molnar, Tony Luck, Ravi V Shankar,
	Sai Prakhya, Vikas Shivappa, linux-kernel, x86, Fenghua Yu

From: Fenghua Yu <fenghua.yu@intel.com>

There was a cut & paste error in the code which updates the per-cpu
closid when changing the bitmask of CPUs assigned to an rdt group. CPUs
removed from the group were correctly given back to the default group,
but CPUs added to the group should be given the closid of their new
group, not the default group's closid.

Fixes: f410770293a1 ("x86/intel_rdt: Update percpu closid immeditately on CPUs affected by change")

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 98edba4..eccea8a 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -278,7 +278,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 				continue;
 			cpumask_andnot(&r->cpu_mask, &r->cpu_mask, tmpmask);
 		}
-		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
+		rdt_update_percpu_closid(tmpmask, rdtgrp->closid);
 	}
 
 	/* Done pushing/pulling - update this group with new mask */
-- 
2.5.0


* [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  2016-11-18 23:18 [PATCH 1/2] x86/intel_rdt: Fix setting of closid when adding CPUs to a group Fenghua Yu
@ 2016-11-18 23:18 ` Fenghua Yu
  2016-11-23 14:23   ` Thomas Gleixner
  2016-11-28 10:17   ` [tip:x86/cache] " tip-bot for Fenghua Yu
  2016-11-28 10:16 ` [tip:x86/cache] x86/intel_rdt: Fix setting of closid when adding CPUs to a group tip-bot for Fenghua Yu
  1 sibling, 2 replies; 8+ messages in thread
From: Fenghua Yu @ 2016-11-18 23:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: H. Peter Anvin, Ingo Molnar, Tony Luck, Ravi V Shankar,
	Sai Prakhya, Vikas Shivappa, linux-kernel, x86, Fenghua Yu

From: Fenghua Yu <fenghua.yu@intel.com>

When removing a sub directory/rdtgroup by rmdir or umount, the closid of
a task in the sub directory is set to the default rdtgroup's closid,
which is 0. If the task is running on a CPU, the PQR_ASSOC MSR is only
updated when the task runs through a context switch. Until the context
switch, the task runs with the wrong closid.

Make the change immediately effective by invoking an smp function call
on all online CPUs which calls intel_rdt_sched_in() to update the
PQR_ASSOC MSR.

rdt_update_closid() (renamed from rdt_update_percpu_closid()) calls
intel_rdt_sched_in() to update the closid in the PQR_ASSOC MSR on a CPU.
The task closid and per-cpu closid are set up before rdt_update_closid()
is called. Handling the PQR_ASSOC MSR update for both the task closid
and the per-cpu closid in rdt_update_closid() reduces redundant smp
function calls.
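
For context, the MSR is normally only written in the context switch
path. A simplified sketch of what intel_rdt_sched_in() does (the rdt
enable-key check is omitted and the per-cpu pqr_state names are
approximate, so this is illustrative rather than the exact kernel code):

/*
 * Sketch: pick the closid for the task being scheduled in and write
 * PQR_ASSOC only if it changed. A task closid of 0 means "use the
 * per-cpu closid".
 */
static void sched_in_closid_sketch(void)
{
	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
	int closid = current->closid;

	if (!closid)
		closid = this_cpu_read(cpu_closid);

	if (closid != state->closid) {
		state->closid = closid;
		/* RMID in the low word, CLOSID in the high word */
		wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, closid);
	}
}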

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 88 +++++++++++++++++++++++---------
 1 file changed, 63 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index eccea8a..ff0ee57 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -194,12 +194,14 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
 /*
  * This is safe against intel_rdt_sched_in() called from __switch_to()
  * because __switch_to() is executed with interrupts disabled. A local call
- * from rdt_update_percpu_closid() is proteced against __switch_to() because
+ * from rdt_update_closid() is proteced against __switch_to() because
  * preemption is disabled.
+ *
+ * Task closid and percpu closid should be set up before calling
+ * this function.
  */
-static void rdt_update_cpu_closid(void *v)
+static void rdt_update_cpu_closid(void *unused)
 {
-	this_cpu_write(cpu_closid, *(int *)v);
 	/*
 	 * We cannot unconditionally write the MSR because the current
 	 * executing task might have its own closid selected. Just reuse
@@ -208,14 +210,14 @@ static void rdt_update_cpu_closid(void *v)
 	intel_rdt_sched_in();
 }
 
-/* Update the per cpu closid and eventually the PGR_ASSOC MSR */
-static void rdt_update_percpu_closid(const struct cpumask *cpu_mask, int closid)
+/* Update closid in the PGR_ASSOC MSR */
+static void rdt_update_closid(const struct cpumask *cpu_mask)
 {
 	int cpu = get_cpu();
 
 	if (cpumask_test_cpu(cpu, cpu_mask))
-		rdt_update_cpu_closid(&closid);
-	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, &closid, 1);
+		rdt_update_cpu_closid(NULL);
+	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, NULL, 1);
 	put_cpu();
 }
 
@@ -224,7 +226,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 {
 	cpumask_var_t tmpmask, newmask;
 	struct rdtgroup *rdtgrp, *r;
-	int ret;
+	int cpu, ret;
 
 	if (!buf)
 		return -EINVAL;
@@ -264,7 +266,9 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 		/* Give any dropped cpus to rdtgroup_default */
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, tmpmask);
-		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
+		for_each_cpu(cpu, tmpmask)
+			per_cpu(cpu_closid, cpu) = rdtgroup_default.closid;
+		rdt_update_closid(tmpmask);
 	}
 
 	/*
@@ -278,7 +282,9 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 				continue;
 			cpumask_andnot(&r->cpu_mask, &r->cpu_mask, tmpmask);
 		}
-		rdt_update_percpu_closid(tmpmask, rdtgrp->closid);
+		for_each_cpu(cpu, tmpmask)
+			per_cpu(cpu_closid, cpu) = rdtgrp->closid;
+		rdt_update_closid(tmpmask);
 	}
 
 	/* Done pushing/pulling - update this group with new mask */
@@ -806,19 +812,33 @@ static int reset_all_cbms(struct rdt_resource *r)
 	return 0;
 }
 
+static int rdt_move_task_closid(struct rdtgroup *from, struct rdtgroup *to)
+{
+	struct task_struct *p, *t;
+	bool moved_task = false;
+
+	read_lock(&tasklist_lock);
+	for_each_process_thread(p, t) {
+		if (!from || t->closid == from->closid) {
+			t->closid = to->closid;
+			moved_task = true;
+		}
+	}
+	read_unlock(&tasklist_lock);
+
+	return moved_task;
+}
+
 /*
  * Forcibly remove all of subdirectories under root.
  */
 static void rmdir_all_sub(void)
 {
 	struct rdtgroup *rdtgrp, *tmp;
-	struct task_struct *p, *t;
+	int cpu;
 
 	/* move all tasks to default resource group */
-	read_lock(&tasklist_lock);
-	for_each_process_thread(p, t)
-		t->closid = 0;
-	read_unlock(&tasklist_lock);
+	rdt_move_task_closid(NULL, &rdtgroup_default);
 
 	list_for_each_entry_safe(rdtgrp, tmp, &rdt_all_groups, rdtgroup_list) {
 		/* Remove each rdtgroup other than root */
@@ -833,13 +853,18 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
-		rdt_update_percpu_closid(&rdtgrp->cpu_mask,
-					 rdtgroup_default.closid);
+		for_each_cpu(cpu, &rdtgrp->cpu_mask)
+			per_cpu(cpu_closid, cpu) = rdtgroup_default.closid;
 
 		kernfs_remove(rdtgrp->kn);
 		list_del(&rdtgrp->rdtgroup_list);
 		kfree(rdtgrp);
 	}
+	/* Simply notify every online CPU to update PQR_ASSOC MSR */
+	get_online_cpus();
+	rdt_update_closid(cpu_online_mask);
+	put_online_cpus();
+
 	kernfs_remove(kn_info);
 }
 
@@ -944,8 +969,9 @@ static int rdtgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 
 static int rdtgroup_rmdir(struct kernfs_node *kn)
 {
-	struct task_struct *p, *t;
 	struct rdtgroup *rdtgrp;
+	bool moved_task = false;
+	int cpu;
 
 	rdtgrp = rdtgroup_kn_lock_live(kn);
 	if (!rdtgrp) {
@@ -954,17 +980,29 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)
 	}
 
 	/* Give any tasks back to the default group */
-	read_lock(&tasklist_lock);
-	for_each_process_thread(p, t) {
-		if (t->closid == rdtgrp->closid)
-			t->closid = 0;
-	}
-	read_unlock(&tasklist_lock);
+	moved_task = rdt_move_task_closid(rdtgrp, &rdtgroup_default);
 
 	/* Give any CPUs back to the default group */
 	cpumask_or(&rdtgroup_default.cpu_mask,
 		   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
-	rdt_update_percpu_closid(&rdtgrp->cpu_mask, rdtgroup_default.closid);
+
+	/* Update per cpu closid storage in rdtgrp */
+	for_each_cpu(cpu, &rdtgrp->cpu_mask)
+		per_cpu(cpu_closid, cpu) = rdtgroup_default.closid;
+
+	/*
+	 * If we moved any tasks out of this group, then force an
+	 * update on *all* CPUs so this will take effect right away.
+	 * Otherwise if we reallocated some CPUs from this group
+	 * we can just update the affected CPUs.
+	 */
+	if (moved_task) {
+		get_online_cpus();
+		rdt_update_closid(cpu_online_mask);
+		put_online_cpus();
+	} else if (!cpumask_empty(&rdtgrp->cpu_mask)) {
+		rdt_update_closid(&rdtgrp->cpu_mask);
+	}
 
 	rdtgrp->flags = RDT_DELETED;
 	closid_free(rdtgrp->closid);
-- 
2.5.0


* Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  2016-11-18 23:18 ` [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount Fenghua Yu
@ 2016-11-23 14:23   ` Thomas Gleixner
  2016-11-26  2:36     ` Fenghua Yu
  2016-11-28 10:17   ` [tip:x86/cache] " tip-bot for Fenghua Yu
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-11-23 14:23 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: H. Peter Anvin, Ingo Molnar, Tony Luck, Ravi V Shankar,
	Sai Prakhya, Vikas Shivappa, linux-kernel, x86, Peter Zijlstra

On Fri, 18 Nov 2016, Fenghua Yu wrote:
> 
> When removing a sub directory/rdtgroup by rmdir or umount, closid in a
> task in the sub directory is set to default rdtgroup's closid which is 0.
> If the task is running on a CPU, the PQR_ASSOC MSR is only updated
> when the task runs through a context switch. Up to the context switch,
> the task runs with the wrong closid.
> 
> Make the change immediately effective by invoking a smp function call
> on all online CPUs which calls intel_rdt_sched_in() to update the
> PQR_ASSOC MSR.

We can be smarter than that and figure out which cpus need that update
rather than interrupting the world for nothing.
 
> rdt_update_closid() (renamed from rdt_update_percpu_closid()) calls
> intel_rdt_sched_in() to update closid in the PQR_ASSOC MSR on a CPU.
> The task closid and percpu closid are set up before
> rdt_update_closid() is called. Handling PQR_ASSOC MSR update for
> both task closid and percpu closid in rdt_update_closid() reduces
> redundant smp function calls.

You should not describe WHAT the patch is doing; the WHY and the concept
are what need to be in the changelog. The WHAT is in the patch.

> @@ -224,7 +226,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
>  {
>  	cpumask_var_t tmpmask, newmask;
>  	struct rdtgroup *rdtgrp, *r;
> -	int ret;
> +	int cpu, ret;
>  
>  	if (!buf)
>  		return -EINVAL;
> @@ -264,7 +266,9 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
>  		/* Give any dropped cpus to rdtgroup_default */
>  		cpumask_or(&rdtgroup_default.cpu_mask,
>  			   &rdtgroup_default.cpu_mask, tmpmask);
> -		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
> +		for_each_cpu(cpu, tmpmask)
> +			per_cpu(cpu_closid, cpu) = rdtgroup_default.closid;
> +		rdt_update_closid(tmpmask);

Bah, this is silly. We can make that conditional instead of mindlessly
slapping a for_each_cpu() loop into every other function.

static void rdt_update_cpu_closid(void *closid)
{
	if (closid)
		this_cpu_write(cpu_closid, *(int *)closid);
	....
}

and have:

static void rdt_update_closid(const struct cpumask *cpu_mask, int *closid)
 {
        int cpu = get_cpu();
 
        if (cpumask_test_cpu(cpu, cpu_mask))
		rdt_update_cpu_closid(closid);
	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, closid, 1);
	put_cpu();
}

So the change here becomes:

-		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
+		rdt_update_closid(tmpmask, &rdtgroup_default.closid);

and for the case where we need to deal with tasks, do the task move, the
percpu update and call rdt_update_closid(mask, NULL);
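
A minimal sketch of that sequence for the rmdir path, assuming rdtgrp is
the group being removed and tmpmask is a zeroed cpumask (names as used
elsewhere in this thread):

	/* 1) Move the tasks, collecting the CPUs they were running on */
	rdt_move_group_tasks(rdtgrp, &rdtgroup_default, tmpmask);

	/* 2) Update the per cpu closids of the group's own CPUs */
	for_each_cpu(cpu, &rdtgrp->cpu_mask)
		per_cpu(cpu_closid, cpu) = rdtgroup_default.closid;

	/* 3) One smp call over both sets; closid is NULL as all state is set */
	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
	rdt_update_closid(tmpmask, NULL);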
    	
> +static int rdt_move_task_closid(struct rdtgroup *from, struct rdtgroup *to)

Bah, consistent types are optional, right? int != bool

> +{

So the right thing to do here, aside from using a proper function name, is:

static bool rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
       	    			 struct cpumask *mask)
{
	struct task_struct *p, *t;
	bool moved_task = false;

	read_lock(&tasklist_lock);
	for_each_process_thread(p, t) {
		if (!from || t->closid == from->closid) {
			t->closid = to->closid;
#ifdef CONFIG_SMP
			if (mask && t->on_cpu) {
				moved_task = true;
				cpumask_set_cpu(task_cpu(t), mask);
			}
#endif
		}
	}
	read_unlock(&tasklist_lock);

	return moved_task;
}

>  /*
>   * Forcibly remove all of subdirectories under root.
>   */
>  static void rmdir_all_sub(void)
>  {
>  	struct rdtgroup *rdtgrp, *tmp;
> -	struct task_struct *p, *t;
> +	int cpu;
>  
>  	/* move all tasks to default resource group */
> -	read_lock(&tasklist_lock);
> -	for_each_process_thread(p, t)
> -		t->closid = 0;
> -	read_unlock(&tasklist_lock);
> +	rdt_move_task_closid(NULL, &rdtgroup_default);

So this becomes

	rdt_move_group_tasks(NULL, &rdtgroup_default, NULL);

>  
>  	list_for_each_entry_safe(rdtgrp, tmp, &rdt_all_groups, rdtgroup_list) {
>  		/* Remove each rdtgroup other than root */
> @@ -833,13 +853,18 @@ static void rmdir_all_sub(void)
>  		cpumask_or(&rdtgroup_default.cpu_mask,
>  			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>  
> -		rdt_update_percpu_closid(&rdtgrp->cpu_mask,
> -					 rdtgroup_default.closid);
> +		for_each_cpu(cpu, &rdtgrp->cpu_mask)
> +			per_cpu(cpu_closid, cpu) = rdtgroup_default.closid;

And this is pointless when we have the closid pointer in rdt_update_closid()
and the below becomes:

> +	rdt_update_closid(cpu_online_mask);

  	rdt_update_closid(cpu_online_mask, &rdtgroup_default.closid);

> +	put_online_cpus();

Reworked untested patch below.

Thanks,

	tglx

8<-----------------
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -194,12 +194,13 @@ static int rdtgroup_cpus_show(struct ker
 /*
  * This is safe against intel_rdt_sched_in() called from __switch_to()
  * because __switch_to() is executed with interrupts disabled. A local call
- * from rdt_update_percpu_closid() is proteced against __switch_to() because
+ * from rdt_update_closid() is proteced against __switch_to() because
  * preemption is disabled.
  */
-static void rdt_update_cpu_closid(void *v)
+static void rdt_update_cpu_closid(void *closid)
 {
-	this_cpu_write(cpu_closid, *(int *)v);
+	if (closid)
+		this_cpu_write(cpu_closid, *(int *)closid);
 	/*
 	 * We cannot unconditionally write the MSR because the current
 	 * executing task might have its own closid selected. Just reuse
@@ -208,14 +209,23 @@ static void rdt_update_cpu_closid(void *
 	intel_rdt_sched_in();
 }
 
-/* Update the per cpu closid and eventually the PGR_ASSOC MSR */
-static void rdt_update_percpu_closid(const struct cpumask *cpu_mask, int closid)
+/*
+ * Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
+ *
+ * Per task closids must have been set up before calling this function.
+ *
+ * The per cpu closids are updated with the smp function call, when @closid
+ * is not NULL. If @closid is NULL then all affected percpu closids must
+ * have been set up before calling this function.
+ */
+static void
+rdt_update_closid(const struct cpumask *cpu_mask, int *closid)
 {
 	int cpu = get_cpu();
 
 	if (cpumask_test_cpu(cpu, cpu_mask))
-		rdt_update_cpu_closid(&closid);
-	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, &closid, 1);
+		rdt_update_cpu_closid(closid);
+	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, closid, 1);
 	put_cpu();
 }
 
@@ -264,7 +274,7 @@ static ssize_t rdtgroup_cpus_write(struc
 		/* Give any dropped cpus to rdtgroup_default */
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, tmpmask);
-		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
+		rdt_update_closid(tmpmask, &rdtgroup_default.closid);
 	}
 
 	/*
@@ -278,7 +288,7 @@ static ssize_t rdtgroup_cpus_write(struc
 				continue;
 			cpumask_andnot(&r->cpu_mask, &r->cpu_mask, tmpmask);
 		}
-		rdt_update_percpu_closid(tmpmask, rdtgrp->closid);
+		rdt_update_closid(tmpmask, &rdtgrp->closid);
 	}
 
 	/* Done pushing/pulling - update this group with new mask */
@@ -807,18 +817,49 @@ static int reset_all_cbms(struct rdt_res
 }
 
 /*
- * Forcibly remove all of subdirectories under root.
+ * Move tasks from one to the other group. If @from is NULL, then all tasks
+ * in the systems are moved unconditionally (used for teardown).
+ *
+ * If @mask is not NULL the cpus on which moved tasks are running are set
+ * in that mask so the update smp function call is restricted to affected
+ * cpus.
  */
-static void rmdir_all_sub(void)
+static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
+				 struct cpumask *mask)
 {
-	struct rdtgroup *rdtgrp, *tmp;
 	struct task_struct *p, *t;
 
-	/* move all tasks to default resource group */
 	read_lock(&tasklist_lock);
-	for_each_process_thread(p, t)
-		t->closid = 0;
+	for_each_process_thread(p, t) {
+		if (!from || t->closid == from->closid) {
+			t->closid = to->closid;
+#ifdef CONFIG_SMP
+			/*
+			 * This is safe on x86 w/o barriers as the ordering
+			 * of writing to task_cpu() and t->on_cpu is
+			 * reverse to the reading here. The detection is
+			 * inaccurate as tasks might move or schedule
+			 * before the smp function call takes place. In
+			 * such a case the function call is pointless, but
+			 * there is no other side effect.
+			 */
+			if (mask && t->on_cpu)
+				cpumask_set_cpu(task_cpu(t), mask);
+#endif
+		}
+	}
 	read_unlock(&tasklist_lock);
+}
+
+/*
+ * Forcibly remove all of subdirectories under root.
+ */
+static void rmdir_all_sub(void)
+{
+	struct rdtgroup *rdtgrp, *tmp;
+
+	/* Move all tasks to the default resource group */
+	rdt_move_group_tasks(NULL, &rdtgroup_default, NULL);
 
 	list_for_each_entry_safe(rdtgrp, tmp, &rdt_all_groups, rdtgroup_list) {
 		/* Remove each rdtgroup other than root */
@@ -833,13 +874,15 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
-		rdt_update_percpu_closid(&rdtgrp->cpu_mask,
-					 rdtgroup_default.closid);
-
 		kernfs_remove(rdtgrp->kn);
 		list_del(&rdtgrp->rdtgroup_list);
 		kfree(rdtgrp);
 	}
+	/* Notify online CPUs to update per cpu storage and PQR_ASSOC MSR */
+	get_online_cpus();
+	rdt_update_closid(cpu_online_mask, &rdtgroup_default.closid);
+	put_online_cpus();
+
 	kernfs_remove(kn_info);
 }
 
@@ -944,27 +987,35 @@ static int rdtgroup_mkdir(struct kernfs_
 
 static int rdtgroup_rmdir(struct kernfs_node *kn)
 {
-	struct task_struct *p, *t;
+	int ret, cpu, closid = rdtgroup_default.closid;
 	struct rdtgroup *rdtgrp;
+	cpumask_var_t tmpmask;
+
+	if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL))
+		return -ENOMEM;
 
 	rdtgrp = rdtgroup_kn_lock_live(kn);
 	if (!rdtgrp) {
-		rdtgroup_kn_unlock(kn);
-		return -EPERM;
+		ret = -EPERM;
+		goto out;
 	}
 
 	/* Give any tasks back to the default group */
-	read_lock(&tasklist_lock);
-	for_each_process_thread(p, t) {
-		if (t->closid == rdtgrp->closid)
-			t->closid = 0;
-	}
-	read_unlock(&tasklist_lock);
+	rdt_move_group_tasks(rdtgrp, &rdtgroup_default, tmpmask);
 
 	/* Give any CPUs back to the default group */
 	cpumask_or(&rdtgroup_default.cpu_mask,
 		   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
-	rdt_update_percpu_closid(&rdtgrp->cpu_mask, rdtgroup_default.closid);
+
+	/* Update per cpu closid of the moved CPUs first */
+	for_each_cpu(cpu, &rdtgrp->cpu_mask)
+		per_cpu(cpu_closid, cpu) = closid;
+	/*
+	 * Update the MSR on moved CPUs and CPUs which have moved
+	 * task running on them.
+	 */
+	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
+	rdt_update_closid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
 	closid_free(rdtgrp->closid);
@@ -976,9 +1027,11 @@ static int rdtgroup_rmdir(struct kernfs_
 	 */
 	kernfs_get(kn);
 	kernfs_remove(rdtgrp->kn);
-
+	ret = 0;
+out:
 	rdtgroup_kn_unlock(kn);
-	return 0;
+	free_cpumask_var(tmpmask);
+	return ret;
 }
 
 static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {


* Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  2016-11-23 14:23   ` Thomas Gleixner
@ 2016-11-26  2:36     ` Fenghua Yu
  2016-11-26  9:08       ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Fenghua Yu @ 2016-11-26  2:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, H. Peter Anvin, Ingo Molnar, Tony Luck,
	Ravi V Shankar, Sai Prakhya, Vikas Shivappa, linux-kernel, x86,
	Peter Zijlstra

On Wed, Nov 23, 2016 at 03:23:50PM +0100, Thomas Gleixner wrote:
> On Fri, 18 Nov 2016, Fenghua Yu wrote:
> Reworked untested patch below.

The reworked patch passes my basic tests. But I have a question on
rdt_move_group_tasks() (please see below).

> 
> Thanks,
> 
> 	tglx
> 
> 8<-----------------
> --- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
> +++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
> @@ -194,12 +194,13 @@ static int rdtgroup_cpus_show(struct ker
>  /*
>   * This is safe against intel_rdt_sched_in() called from __switch_to()
>   * because __switch_to() is executed with interrupts disabled. A local call
> - * from rdt_update_percpu_closid() is proteced against __switch_to() because
> + * from rdt_update_closid() is proteced against __switch_to() because
>   * preemption is disabled.
>   */
> -static void rdt_update_cpu_closid(void *v)
> +static void rdt_update_cpu_closid(void *closid)
>  {
> -	this_cpu_write(cpu_closid, *(int *)v);
> +	if (closid)
> +		this_cpu_write(cpu_closid, *(int *)closid);
>  	/*
>  	 * We cannot unconditionally write the MSR because the current
>  	 * executing task might have its own closid selected. Just reuse
> @@ -208,14 +209,23 @@ static void rdt_update_cpu_closid(void *
>  	intel_rdt_sched_in();
>  }
>  
> -/* Update the per cpu closid and eventually the PGR_ASSOC MSR */
> -static void rdt_update_percpu_closid(const struct cpumask *cpu_mask, int closid)
> +/*
> + * Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
> + *
> + * Per task closids must have been set up before calling this function.
> + *
> + * The per cpu closids are updated with the smp function call, when @closid
> + * is not NULL. If @closid is NULL then all affected percpu closids must
> + * have been set up before calling this function.
> + */
> +static void
> +rdt_update_closid(const struct cpumask *cpu_mask, int *closid)
>  {
>  	int cpu = get_cpu();
>  
>  	if (cpumask_test_cpu(cpu, cpu_mask))
> -		rdt_update_cpu_closid(&closid);
> -	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, &closid, 1);
> +		rdt_update_cpu_closid(closid);
> +	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, closid, 1);
>  	put_cpu();
>  }
>  
> @@ -264,7 +274,7 @@ static ssize_t rdtgroup_cpus_write(struc
>  		/* Give any dropped cpus to rdtgroup_default */
>  		cpumask_or(&rdtgroup_default.cpu_mask,
>  			   &rdtgroup_default.cpu_mask, tmpmask);
> -		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
> +		rdt_update_closid(tmpmask, &rdtgroup_default.closid);
>  	}
>  
>  	/*
> @@ -278,7 +288,7 @@ static ssize_t rdtgroup_cpus_write(struc
>  				continue;
>  			cpumask_andnot(&r->cpu_mask, &r->cpu_mask, tmpmask);
>  		}
> -		rdt_update_percpu_closid(tmpmask, rdtgrp->closid);
> +		rdt_update_closid(tmpmask, &rdtgrp->closid);
>  	}
>  
>  	/* Done pushing/pulling - update this group with new mask */
> @@ -807,18 +817,49 @@ static int reset_all_cbms(struct rdt_res
>  }
>  
>  /*
> - * Forcibly remove all of subdirectories under root.
> + * Move tasks from one to the other group. If @from is NULL, then all tasks
> + * in the systems are moved unconditionally (used for teardown).
> + *
> + * If @mask is not NULL the cpus on which moved tasks are running are set
> + * in that mask so the update smp function call is restricted to affected
> + * cpus.
>   */
> -static void rmdir_all_sub(void)
> +static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
> +				 struct cpumask *mask)
>  {
> -	struct rdtgroup *rdtgrp, *tmp;
>  	struct task_struct *p, *t;
>  
> -	/* move all tasks to default resource group */
>  	read_lock(&tasklist_lock);
> -	for_each_process_thread(p, t)
> -		t->closid = 0;
> +	for_each_process_thread(p, t) {
> +		if (!from || t->closid == from->closid) {
> +			t->closid = to->closid;
> +#ifdef CONFIG_SMP
> +			/*
> +			 * This is safe on x86 w/o barriers as the ordering
> +			 * of writing to task_cpu() and t->on_cpu is
> +			 * reverse to the reading here. The detection is
> +			 * inaccurate as tasks might move or schedule
> +			 * before the smp function call takes place. In
> +			 * such a case the function call is pointless, but
> +			 * there is no other side effect.
> +			 */

If process p1 is running on CPU1 before this point,

> +			if (mask && t->on_cpu)
> +				cpumask_set_cpu(task_cpu(t), mask);

If, between CPU1 being set in mask and rdt_update_closid(tmpmask, NULL)
being called, p1 is switched to CPU2 and process p2 with its own closid
(e.g. 2) is switched to CPU1.

Then the closid in PQR_ASSOC on CPU1 is set incorrectly to 0 instead of
2. The 0 may stay in PQR_ASSOC until the next context switch, which may
take a long time in real-time or HPC cases.

Don't we need to handle this situation? In this situation, the function
call is not "pointless" but wrong, right?
 
Thanks.

-Fenghua


* Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  2016-11-26  2:36     ` Fenghua Yu
@ 2016-11-26  9:08       ` Thomas Gleixner
  2016-11-26 21:06         ` Fenghua Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2016-11-26  9:08 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: H. Peter Anvin, Ingo Molnar, Tony Luck, Ravi V Shankar,
	Sai Prakhya, Vikas Shivappa, linux-kernel, x86, Peter Zijlstra

On Fri, 25 Nov 2016, Fenghua Yu wrote:
> On Wed, Nov 23, 2016 at 03:23:50PM +0100, Thomas Gleixner wrote:
> > +#ifdef CONFIG_SMP
> > +			/*
> > +			 * This is safe on x86 w/o barriers as the ordering
> > +			 * of writing to task_cpu() and t->on_cpu is
> > +			 * reverse to the reading here. The detection is
> > +			 * inaccurate as tasks might move or schedule
> > +			 * before the smp function call takes place. In
> > +			 * such a case the function call is pointless, but
> > +			 * there is no other side effect.
> > +			 */
> 
> If process p1 is running on CPU1 before this point,
> 
> > +			if (mask && t->on_cpu)
> > +				cpumask_set_cpu(task_cpu(t), mask);
> 
> If between CPU1 is set in mask and rdt_update_closid(tmpmask, NULL) is
> called, p1 is switched to CPU2, and process p2 with its own closid
> (e.g. 2) is switched to CPU1.
> 
> Then closid in PQR_ASSOC is set incorrectly as 0 instead of 2 on CPU1.
> 0 may stay in PQR_ASSOC until next context switch which may take long time
> in cases of real time or HPC.
> 
> Don't we need to care this situation? In this situation, the function call
> is not "pointless" but it's wrong, right?

No.

CPU0		CPU1			CPU2
		T1 (closid 0)		T2 (closid 2)

(t1->on_cpu)
  set(1, mask)
		preemption
		   T1 ->CPU2
		switch_to(T3)		preemption
		switch_to(idle)
					T2 -> CPU1
		switch_to(T2)		switch_to(T1)
		 intel_rdt_sched_in()	 intel_rdt_sched_in()
		  closid = T2->closid	 closid = T1->closid
		  closid =2 	  	 closid = CPU2->closid
				 				 
rdt_update_closid(mask)
		
		rdt_update_cpu_closid()
		  intel_rdt_sched_in()
		   closid = T2->closid
		   closid = 2

IOW, whatever comes first, sched_switch() or function call will update the
closid to the correct value.

If CPU2 was in the removed group then this looks the following way:

CPU0		CPU1			CPU2
		T1 (closid 0)		T2 (closid 2)

(t1->on_cpu)
  set(1, mask)
		preemption
		   T1 ->CPU2
		switch_to(T3)		preemption
		switch_to(idle)
					T2 -> CPU1
		switch_to(T2)		switch_to(T1)
		 intel_rdt_sched_in()	 intel_rdt_sched_in()
		   closid = T2->closid	 closid = T1->closid (0)
		  closid =2 	  	 closid = CPU2->closid
		  	 		 closid = 5
for_each_cpu(grp->mask)
	CPU2->closid = 0
		 
rdt_update_closid(mask)
		
		rdt_update_cpu_closid()	rdt_update_cpu_closid()
		  intel_rdt_sched_in(	 intel_rdt_sched_in()
		   closid = T2->closid	  closid = T1->closid (0)
		   closid = 2		  closid = CPU2->closid
		   	    		  closid = 0

But on CPU2 the function call might be pointless as well in the following
situation:

CPU0		CPU1			CPU2
		T1 (closid 0)		T2 (closid 2)

(t1->on_cpu)
  set(1, mask)
		preemption
		   T1 ->CPU2
		switch_to(T3)		preemption
		switch_to(idle)

for_each_cpu(grp->mask)
	CPU2->closid = 0
					T2 -> CPU1
		switch_to(T2)		switch_to(T1)
		 intel_rdt_sched_in()	 intel_rdt_sched_in()
		   closid = T2->closid	 closid = T1->closid (0)
		  closid =2 	  	 closid = CPU2->closid
		  	 		 closid = 0
		 
rdt_update_closid(mask)
		
		rdt_update_cpu_closid()	rdt_update_cpu_closid()
		  intel_rdt_sched_in(	 intel_rdt_sched_in()
		   closid = T2->closid	  closid = T1->closid (0)
		   closid = 2		  closid = CPU2->closid
		   	    		  closid = 0

The whole thing works by ordering:

1) Update closids of each task in the group and if a task is running on a
   cpu then mark the CPU on which the task is running for the function
   call.

2) Update closids of each CPU in the group

3) Or the cpumasks of the tasks and the groups and invoke the function call
   on all of them

If an affected task does a sched_switch after task->closid is updated and
before the function call is invoked then the function call is pointless.

If a sched switch happens on a CPU after cpu->closid is updated and before
the function call is invoked then the function call is pointless.

But there is no case where the function call can result in a wrong value.

Thanks,

	tglx


* Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  2016-11-26  9:08       ` Thomas Gleixner
@ 2016-11-26 21:06         ` Fenghua Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Fenghua Yu @ 2016-11-26 21:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, H. Peter Anvin, Ingo Molnar, Tony Luck,
	Ravi V Shankar, Sai Prakhya, Vikas Shivappa, linux-kernel, x86,
	Peter Zijlstra

On Sat, Nov 26, 2016 at 10:08:57AM +0100, Thomas Gleixner wrote:
> On Fri, 25 Nov 2016, Fenghua Yu wrote:
> > On Wed, Nov 23, 2016 at 03:23:50PM +0100, Thomas Gleixner wrote:
> > > +#ifdef CONFIG_SMP
> > > +			/*
> > > +			 * This is safe on x86 w/o barriers as the ordering
> > > +			 * of writing to task_cpu() and t->on_cpu is
> > > +			 * reverse to the reading here. The detection is
> > > +			 * inaccurate as tasks might move or schedule
> > > +			 * before the smp function call takes place. In
> > > +			 * such a case the function call is pointless, but
> > > +			 * there is no other side effect.
> > > +			 */
> > 
> > If process p1 is running on CPU1 before this point,
> > 
> > > +			if (mask && t->on_cpu)
> > > +				cpumask_set_cpu(task_cpu(t), mask);
> > 
> > If between CPU1 is set in mask and rdt_update_closid(tmpmask, NULL) is
> > called, p1 is switched to CPU2, and process p2 with its own closid
> > (e.g. 2) is switched to CPU1.
> > 
> > Then closid in PQR_ASSOC is set incorrectly as 0 instead of 2 on CPU1.
> > 0 may stay in PQR_ASSOC until next context switch which may take long time
> > in cases of real time or HPC.
> > 
> > Don't we need to care this situation? In this situation, the function call
> > is not "pointless" but it's wrong, right?
> 
> No.
> 
> CPU0		CPU1			CPU2
> 		T1 (closid 0)		T2 (closid 2)
> 
> (t1->on_cpu)
>   set(1, mask)
> 		preemption
> 		   T1 ->CPU2
> 		switch_to(T3)		preemption
> 		switch_to(idle)
> 					T2 -> CPU1
> 		switch_to(T2)		switch_to(T1)
> 		 intel_rdt_sched_in()	 intel_rdt_sched_in()
> 		  closid = T2->closid	 closid = T1->closid
> 		  closid =2 	  	 closid = CPU2->closid
> 				 				 
> rdt_update_closid(mask)
> 		
> 		rdt_update_cpu_closid()
> 		  intel_rdt_sched_in()
> 		   closid = T2->closid
> 		   closid = 2
> 
> IOW, whatever comes first, sched_switch() or function call will update the
> closid to the correct value.
> 
> If CPU2 was in the removed group then this looks the following way:
> 
> CPU0		CPU1			CPU2
> 		T1 (closid 0)		T2 (closid 2)
> 
> (t1->on_cpu)
>   set(1, mask)
> 		preemption
> 		   T1 ->CPU2
> 		switch_to(T3)		preemption
> 		switch_to(idle)
> 					T2 -> CPU1
> 		switch_to(T2)		switch_to(T1)
> 		 intel_rdt_sched_in()	 intel_rdt_sched_in()
> 		   closid = T2->closid	 closid = T1->closid (0)
> 		  closid =2 	  	 closid = CPU2->closid
> 		  	 		 closid = 5
> for_each_cpu(grp->mask)
> 	CPU2->closid = 0
> 		 
> rdt_update_closid(mask)
> 		
> 		rdt_update_cpu_closid()	rdt_update_cpu_closid()
> 		  intel_rdt_sched_in(	 intel_rdt_sched_in()
> 		   closid = T2->closid	  closid = T1->closid (0)
> 		   closid = 2		  closid = CPU2->closid
> 		   	    		  closid = 0
> 
> But on CPU2 the function call might be pointless as well in the following
> situation:
> 
> CPU0		CPU1			CPU2
> 		T1 (closid 0)		T2 (closid 2)
> 
> (t1->on_cpu)
>   set(1, mask)
> 		preemption
> 		   T1 ->CPU2
> 		switch_to(T3)		preemption
> 		switch_to(idle)
> 
> for_each_cpu(grp->mask)
> 	CPU2->closid = 0
> 					T2 -> CPU1
> 		switch_to(T2)		switch_to(T1)
> 		 intel_rdt_sched_in()	 intel_rdt_sched_in()
> 		   closid = T2->closid	 closid = T1->closid (0)
> 		  closid =2 	  	 closid = CPU2->closid
> 		  	 		 closid = 0
> 		 
> rdt_update_closid(mask)
> 		
> 		rdt_update_cpu_closid()	rdt_update_cpu_closid()
> 		  intel_rdt_sched_in(	 intel_rdt_sched_in()
> 		   closid = T2->closid	  closid = T1->closid (0)
> 		   closid = 2		  closid = CPU2->closid
> 		   	    		  closid = 0
> 
> The whole thing works by ordering:
> 
> 1) Update closids of each task in the group and if a task is running on a
>    cpu then mark the CPU on which the task is running for the function
>    call.
> 
> 2) Update closids of each CPU in the group
> 
> 3) Or the cpumasks of the tasks and the groups and invoke the function call
>    on all of them
> 
> If an affected task does a sched_switch after task->closid is updated and
> before the function call is invoked then the function call is pointless.
> 
> If a sched switch happens on a CPU after cpu->closid is updated and before
> the function call is invoked then the function call is pointless.
> 
> But there is no case where the function call can result in a wrong value.
> 
> Thanks,
> 
> 	tglx

Thank you for your clear explanation. You are absolutely correct. I knew
I must have missed something.

The reworked second patch and the first patch were tested successfully.
I assume you will check them into the tip tree, so I will not send a v2.

Thanks.

-Fenghua


* [tip:x86/cache] x86/intel_rdt: Fix setting of closid when adding CPUs to a group
  2016-11-18 23:18 [PATCH 1/2] x86/intel_rdt: Fix setting of closid when adding CPUs to a group Fenghua Yu
  2016-11-18 23:18 ` [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount Fenghua Yu
@ 2016-11-28 10:16 ` tip-bot for Fenghua Yu
  1 sibling, 0 replies; 8+ messages in thread
From: tip-bot for Fenghua Yu @ 2016-11-28 10:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, h.peter.anvin, ravi.v.shankar, tony.luck,
	sai.praneeth.prakhya, tglx, linux-kernel, hpa, fenghua.yu,
	vikas.shivappa

Commit-ID:  2659f46da8307871989f475accdcdfc4807e9e6c
Gitweb:     http://git.kernel.org/tip/2659f46da8307871989f475accdcdfc4807e9e6c
Author:     Fenghua Yu <fenghua.yu@intel.com>
AuthorDate: Fri, 18 Nov 2016 15:18:03 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 28 Nov 2016 11:07:50 +0100

x86/intel_rdt: Fix setting of closid when adding CPUs to a group

There was a cut & paste error when adding code to update the per-cpu
closid when changing the bitmask of CPUs assigned to an rdt group.

The update erroneously assigns the closid of the default group to the CPUs
which are moved to a group instead of assigning the closid of their new
group. Use the proper closid.

Fixes: f410770293a1 ("x86/intel_rdt: Update percpu closid immeditately on CPUs affected by change")
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1479511084-59727-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 98edba4..eccea8a 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -278,7 +278,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 				continue;
 			cpumask_andnot(&r->cpu_mask, &r->cpu_mask, tmpmask);
 		}
-		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
+		rdt_update_percpu_closid(tmpmask, rdtgrp->closid);
 	}
 
 	/* Done pushing/pulling - update this group with new mask */


* [tip:x86/cache] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  2016-11-18 23:18 ` [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount Fenghua Yu
  2016-11-23 14:23   ` Thomas Gleixner
@ 2016-11-28 10:17   ` tip-bot for Fenghua Yu
  1 sibling, 0 replies; 8+ messages in thread
From: tip-bot for Fenghua Yu @ 2016-11-28 10:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, h.peter.anvin, ravi.v.shankar, tony.luck,
	fenghua.yu, tglx, sai.praneeth.prakhya, mingo, vikas.shivappa,
	hpa

Commit-ID:  0efc89be9471b152599d2db7eb47de8e0d71c59f
Gitweb:     http://git.kernel.org/tip/0efc89be9471b152599d2db7eb47de8e0d71c59f
Author:     Fenghua Yu <fenghua.yu@intel.com>
AuthorDate: Fri, 18 Nov 2016 15:18:04 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 28 Nov 2016 11:07:50 +0100

x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount

When removing a sub directory/rdtgroup by rmdir or umount, the closid of
a task in the sub directory is set to the default rdtgroup's closid,
which is 0. If the task is running on a CPU, the PQR_ASSOC MSR is only
updated when the task runs through a context switch. Until the context
switch, the task runs with the wrong closid.

Make the change immediately effective by invoking an smp function call on
all CPUs which are running a moved task. If one of the affected tasks was
moved or scheduled out before the function call is executed on the CPU,
the only damage is the extra interruption of the CPU.

[ tglx: Reworked it to avoid blindly interrupting all CPUs and extra loops ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1479511084-59727-2-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 113 +++++++++++++++++++++++--------
 1 file changed, 83 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index eccea8a..fb8e03e 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -194,12 +194,13 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
 /*
  * This is safe against intel_rdt_sched_in() called from __switch_to()
  * because __switch_to() is executed with interrupts disabled. A local call
- * from rdt_update_percpu_closid() is proteced against __switch_to() because
+ * from rdt_update_closid() is proteced against __switch_to() because
  * preemption is disabled.
  */
-static void rdt_update_cpu_closid(void *v)
+static void rdt_update_cpu_closid(void *closid)
 {
-	this_cpu_write(cpu_closid, *(int *)v);
+	if (closid)
+		this_cpu_write(cpu_closid, *(int *)closid);
 	/*
 	 * We cannot unconditionally write the MSR because the current
 	 * executing task might have its own closid selected. Just reuse
@@ -208,14 +209,23 @@ static void rdt_update_cpu_closid(void *v)
 	intel_rdt_sched_in();
 }
 
-/* Update the per cpu closid and eventually the PGR_ASSOC MSR */
-static void rdt_update_percpu_closid(const struct cpumask *cpu_mask, int closid)
+/*
+ * Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
+ *
+ * Per task closids must have been set up before calling this function.
+ *
+ * The per cpu closids are updated with the smp function call, when @closid
+ * is not NULL. If @closid is NULL then all affected percpu closids must
+ * have been set up before calling this function.
+ */
+static void
+rdt_update_closid(const struct cpumask *cpu_mask, int *closid)
 {
 	int cpu = get_cpu();
 
 	if (cpumask_test_cpu(cpu, cpu_mask))
-		rdt_update_cpu_closid(&closid);
-	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, &closid, 1);
+		rdt_update_cpu_closid(closid);
+	smp_call_function_many(cpu_mask, rdt_update_cpu_closid, closid, 1);
 	put_cpu();
 }
 
@@ -264,7 +274,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 		/* Give any dropped cpus to rdtgroup_default */
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, tmpmask);
-		rdt_update_percpu_closid(tmpmask, rdtgroup_default.closid);
+		rdt_update_closid(tmpmask, &rdtgroup_default.closid);
 	}
 
 	/*
@@ -278,7 +288,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 				continue;
 			cpumask_andnot(&r->cpu_mask, &r->cpu_mask, tmpmask);
 		}
-		rdt_update_percpu_closid(tmpmask, rdtgrp->closid);
+		rdt_update_closid(tmpmask, &rdtgrp->closid);
 	}
 
 	/* Done pushing/pulling - update this group with new mask */
@@ -807,18 +817,49 @@ static int reset_all_cbms(struct rdt_resource *r)
 }
 
 /*
- * Forcibly remove all of subdirectories under root.
+ * Move tasks from one to the other group. If @from is NULL, then all tasks
+ * in the systems are moved unconditionally (used for teardown).
+ *
+ * If @mask is not NULL the cpus on which moved tasks are running are set
+ * in that mask so the update smp function call is restricted to affected
+ * cpus.
  */
-static void rmdir_all_sub(void)
+static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
+				 struct cpumask *mask)
 {
-	struct rdtgroup *rdtgrp, *tmp;
 	struct task_struct *p, *t;
 
-	/* move all tasks to default resource group */
 	read_lock(&tasklist_lock);
-	for_each_process_thread(p, t)
-		t->closid = 0;
+	for_each_process_thread(p, t) {
+		if (!from || t->closid == from->closid) {
+			t->closid = to->closid;
+#ifdef CONFIG_SMP
+			/*
+			 * This is safe on x86 w/o barriers as the ordering
+			 * of writing to task_cpu() and t->on_cpu is
+			 * reverse to the reading here. The detection is
+			 * inaccurate as tasks might move or schedule
+			 * before the smp function call takes place. In
+			 * such a case the function call is pointless, but
+			 * there is no other side effect.
+			 */
+			if (mask && t->on_cpu)
+				cpumask_set_cpu(task_cpu(t), mask);
+#endif
+		}
+	}
 	read_unlock(&tasklist_lock);
+}
+
+/*
+ * Forcibly remove all of subdirectories under root.
+ */
+static void rmdir_all_sub(void)
+{
+	struct rdtgroup *rdtgrp, *tmp;
+
+	/* Move all tasks to the default resource group */
+	rdt_move_group_tasks(NULL, &rdtgroup_default, NULL);
 
 	list_for_each_entry_safe(rdtgrp, tmp, &rdt_all_groups, rdtgroup_list) {
 		/* Remove each rdtgroup other than root */
@@ -833,13 +874,15 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
-		rdt_update_percpu_closid(&rdtgrp->cpu_mask,
-					 rdtgroup_default.closid);
-
 		kernfs_remove(rdtgrp->kn);
 		list_del(&rdtgrp->rdtgroup_list);
 		kfree(rdtgrp);
 	}
+	/* Notify online CPUs to update per cpu storage and PQR_ASSOC MSR */
+	get_online_cpus();
+	rdt_update_closid(cpu_online_mask, &rdtgroup_default.closid);
+	put_online_cpus();
+
 	kernfs_remove(kn_info);
 }
 
@@ -944,27 +987,35 @@ out_unlock:
 
 static int rdtgroup_rmdir(struct kernfs_node *kn)
 {
-	struct task_struct *p, *t;
+	int ret, cpu, closid = rdtgroup_default.closid;
 	struct rdtgroup *rdtgrp;
+	cpumask_var_t tmpmask;
+
+	if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL))
+		return -ENOMEM;
 
 	rdtgrp = rdtgroup_kn_lock_live(kn);
 	if (!rdtgrp) {
-		rdtgroup_kn_unlock(kn);
-		return -EPERM;
+		ret = -EPERM;
+		goto out;
 	}
 
 	/* Give any tasks back to the default group */
-	read_lock(&tasklist_lock);
-	for_each_process_thread(p, t) {
-		if (t->closid == rdtgrp->closid)
-			t->closid = 0;
-	}
-	read_unlock(&tasklist_lock);
+	rdt_move_group_tasks(rdtgrp, &rdtgroup_default, tmpmask);
 
 	/* Give any CPUs back to the default group */
 	cpumask_or(&rdtgroup_default.cpu_mask,
 		   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
-	rdt_update_percpu_closid(&rdtgrp->cpu_mask, rdtgroup_default.closid);
+
+	/* Update per cpu closid of the moved CPUs first */
+	for_each_cpu(cpu, &rdtgrp->cpu_mask)
+		per_cpu(cpu_closid, cpu) = closid;
+	/*
+	 * Update the MSR on moved CPUs and CPUs which have moved
+	 * task running on them.
+	 */
+	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
+	rdt_update_closid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
 	closid_free(rdtgrp->closid);
@@ -976,9 +1027,11 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)
 	 */
 	kernfs_get(kn);
 	kernfs_remove(rdtgrp->kn);
-
+	ret = 0;
+out:
 	rdtgroup_kn_unlock(kn);
-	return 0;
+	free_cpumask_var(tmpmask);
+	return ret;
 }
 
 static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {

