linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: "Yu, Fenghua" <fenghua.yu@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	"Anvin, H Peter" <h.peter.anvin@intel.com>,
	"Luck, Tony" <tony.luck@intel.com>, Tejun Heo <tj@kernel.org>,
	Borislav Petkov <bp@suse.de>,
	Stephane Eranian <eranian@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	David Carrillo-Cisneros <davidcc@google.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	Vikas Shivappa <vikas.shivappa@linux.intel.com>,
	"Prakhya, Sai Praneeth" <sai.praneeth.prakhya@intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>, x86 <x86@kernel.org>
Subject: RE: [PATCH 24/32] Task fork and exit for rdtgroup
Date: Wed, 13 Jul 2016 23:02:31 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.11.1607132058030.4083@nanos> (raw)
In-Reply-To: <3E5A0FA7E9CA944F9D5414FEC6C712205DFD4A75@ORSMSX106.amr.corp.intel.com>

On Wed, 13 Jul 2016, Yu, Fenghua wrote:
> On Wed, July 2016, Thomas Gleixner wrote
> > On Tue, 12 Jul 2016, Fenghua Yu wrote:
> > > +void rdtgroup_post_fork(struct task_struct *child) {
> > > +	if (!use_rdtgroup_tasks)
> > > +		return;
> > > +
> > > +	spin_lock_irq(&rdtgroup_task_lock);
> > > +	if (list_empty(&child->rg_list)) {
> > 
> > Why would the list be non empty after a fork?
> 
> In this situation for a pid:
> 1.rdtgroup_fork(): rg_list=null.
> 2.setup_task_rg_lists(): rg_list is setup
> 3.rdtgroup_fork(): rg_list is not empty

Why would rdtgroup_fork() be called twice for a given thread?

> This situation happens only during rscctrl mount time. Before mount, post_fork()
> returns from !use_rdtgroup_tasks and doesn't set up rg_list. After mount, rg_list()
> is always empty in post_fork(). But we need to check rg_list for above situation.
> 
> Does that make sense?

No, that does not make any sense at all.

> Any suggestion for better soluation?

The problem you have is:

    fork
	list_init(rg_list);
	write_lock(tasklist_lock);
	
	  task becomes visible

	write_unlock(tasklist_lock);

	rdtgroup_post_fork();
	    if (!use_rdtgroup_tasks)
	       return;

	    spin_lock_irq(&rdtgroup_task_lock);
  	    list_add();
	    spin_unlock_irq(&rdtgroup_task_lock);

I have no idea why this lock must be taken with _irq, but thats another
story. Let's look at the mount side:

       spin_lock_irq(&rdtgroup_task_lock);
       read_lock(&tasklist_lock);
       
       do_each_thread(g, p) { 
       	   WARN_ON(More magic crap happening there)

	   spin_lock_irq(&p->sighand->siglock);
	   list_add();
	   spin_unlock_irq(&p->sighand->siglock);
		      ^^^^ 
Great: You undo the irq disable of (&rdtgroup_task_lock) above! Oh well....

       read_unlock(&tasklist_lock);
       spin_unlock_irq(&rdtgroup_task_lock);

So you need all this magic in rdtgroup_post_fork() and setup_task_rg_lists()
just because you blindly positioned rdtgroup_post_fork() at the point where
the cgroup_post_fork() stuff is. But you did not think a second about the
locking rules here otherwise they would be documented somewhere.

You need a read_lock(&tasklist_lock) for the mount part anyway. So why don't
you do the obvious:

    fork
	list_init(rg_list);
	write_lock(tasklist_lock);

	rdtgroup_post_fork();
	    if (use_rdtgroup_tasks)
	        spin_lock(&rdtgroup_task_lock);
	        list_add();
	        spin_unlock(&rdtgroup_task_lock);

        task becomes visible

	write_unlock(tasklist_lock);

And reorder the lock ordering in the mount path:

       read_lock(&tasklist_lock);
       spin_lock(&rdtgroup_task_lock);
	
Now using rdtgroup_task_lock to protect current->rdtgroup is horrible as
well. You need task->sighand->siglock in the mount path anyway to prevent exit
races. So you can simplify the whole magic to:

    fork
	list_init(rg_list);
	write_lock(tasklist_lock);

        spin_lock(&current->sighand->siglock);

	rdtgroup_post_fork();
	    if (use_rdtgroup_tasks)
	        list_add();

	spin_unlock(&current->sighand->siglock);
	write_unlock(tasklist_lock);

That removes an extra lock/unlock operation from the fork path because
current->sighand->siglock is taken inside of the tasklist_lock write locked
section already.

So you need protection for use_rdtgroup_task, which is a complete misnomer
btw. (rdtgroup_active would be too obvious, right?). That protection is simple
because you can set that flag with tasklist_lock read locked which you hold
anyway for iterating all threads in the mount path.

Aside of that you need to take tsk->sighand->siglock when you change
tsk->rdtgroup, but that's a no-brainer and it gives you the extra benefit that
you can protect such an operation against exit of the task that way by
checking PF_EXITING under the lock. I don't see any protection against exit in
your current implementation when a task is moved to a different partition.

Please sit down and describe the complete locking and protection scheme of
this stuff. I'm not going to figure this out from the obscure code another
time.

Thanks,

	tglx

  reply	other threads:[~2016-07-13 21:05 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-13  1:02 [PATCH 00/32] Enable Intel Resource Allocation in Resource Director Technology Fenghua Yu
2016-07-13  1:02 ` [PATCH 01/32] x86/intel_rdt: Cache Allocation documentation Fenghua Yu
2016-07-13  1:02 ` [PATCH 02/32] x86/intel_rdt: Add support for Cache Allocation detection Fenghua Yu
2016-07-26 19:00   ` Nilay Vaish
2016-07-13  1:02 ` [PATCH 03/32] x86/intel_rdt: Add Class of service management Fenghua Yu
2016-07-13  1:02 ` [PATCH 04/32] x86/intel_rdt: Add L3 cache capacity bitmask management Fenghua Yu
2016-07-22  7:12   ` Marcelo Tosatti
2016-07-22 21:43     ` Luck, Tony
2016-07-23  4:31       ` Marcelo Tosatti
2016-07-26  3:18         ` Luck, Tony
2016-07-26 17:10         ` Shivappa Vikas
2016-07-13  1:02 ` [PATCH 05/32] x86/intel_rdt: Implement scheduling support for Intel RDT Fenghua Yu
2016-07-25 16:25   ` Nilay Vaish
2016-07-25 16:31   ` Nilay Vaish
2016-07-25 18:05     ` Luck, Tony
2016-07-25 22:47       ` David Carrillo-Cisneros
2016-07-13  1:02 ` [PATCH 06/32] x86/intel_rdt: Hot cpu support for Cache Allocation Fenghua Yu
2016-07-13  9:19   ` Thomas Gleixner
2016-07-21 19:46     ` Shivappa Vikas
2016-07-14  0:40   ` David Carrillo-Cisneros
2016-07-14 22:58     ` Yu, Fenghua
2016-07-13  1:02 ` [PATCH 07/32] x86/intel_rdt: Intel haswell Cache Allocation enumeration Fenghua Yu
2016-07-13  1:02 ` [PATCH 08/32] Define CONFIG_INTEL_RDT Fenghua Yu
2016-07-13 10:25   ` Thomas Gleixner
2016-07-13 18:05     ` Yu, Fenghua
2016-07-13 21:09       ` Thomas Gleixner
2016-07-13 21:18         ` Yu, Fenghua
2016-07-13  1:02 ` [PATCH 09/32] x86/intel_rdt: Intel Code Data Prioritization detection Fenghua Yu
2016-07-13  1:02 ` [PATCH 10/32] x86/intel_rdt: Adds support to enable Code Data Prioritization Fenghua Yu
2016-07-26 19:23   ` Nilay Vaish
2016-07-26 20:32     ` Shivappa Vikas
2016-07-13  1:02 ` [PATCH 11/32] x86/intel_rdt: Class of service and capacity bitmask management for CDP Fenghua Yu
2016-07-13  1:02 ` [PATCH 12/32] x86/intel_rdt: Hot cpu update for code data prioritization Fenghua Yu
2016-07-13  1:02 ` [PATCH 13/32] Documentation, x86: Documentation for Intel resource allocation user interface Fenghua Yu
2016-07-13 12:47   ` Thomas Gleixner
2016-07-13 17:13     ` Luck, Tony
2016-07-14  6:53       ` Thomas Gleixner
2016-07-14 17:16         ` Luck, Tony
2016-07-19 12:32           ` Thomas Gleixner
2016-08-04 23:38             ` Yu, Fenghua
2016-07-27 16:20   ` Nilay Vaish
2016-07-27 16:57     ` Luck, Tony
2016-08-03 22:15   ` Marcelo Tosatti
2016-07-13  1:02 ` [PATCH 14/32] x86/cpufeatures: Get max closid and max cbm len and clean feature comments and code Fenghua Yu
2016-07-27 16:49   ` Nilay Vaish
2016-07-13  1:02 ` [PATCH 15/32] cacheinfo: Introduce cache id Fenghua Yu
2016-07-27 17:04   ` Nilay Vaish
2016-07-13  1:02 ` [PATCH 16/32] Documentation, ABI: Add a document entry for " Fenghua Yu
2016-07-13  1:02 ` [PATCH 17/32] x86, intel_cacheinfo: Enable cache id in x86 Fenghua Yu
2016-07-28  5:41   ` Nilay Vaish
2016-07-13  1:02 ` [PATCH 18/32] drivers/base/cacheinfo.c: Export some cacheinfo functions for others to use Fenghua Yu
2016-07-13  1:02 ` [PATCH 19/32] sched.h: Add rg_list and rdtgroup in task_struct Fenghua Yu
2016-07-13 12:56   ` Thomas Gleixner
2016-07-13 17:50     ` Yu, Fenghua
2016-07-28  5:53   ` Nilay Vaish
2016-07-13  1:02 ` [PATCH 20/32] magic number for rscctrl file system Fenghua Yu
2016-07-28  5:57   ` Nilay Vaish
2016-07-13  1:02 ` [PATCH 21/32] x86/intel_rdt.h: Header for inter_rdt.c Fenghua Yu
2016-07-28 14:07   ` Nilay Vaish
2016-07-13  1:02 ` [PATCH 22/32] x86/intel_rdt_rdtgroup.h: Header for user interface Fenghua Yu
2016-07-13  1:02 ` [PATCH 23/32] x86/intel_rdt.c: Extend RDT to per cache and per resources Fenghua Yu
2016-07-13 13:07   ` Thomas Gleixner
2016-07-13 17:40     ` Yu, Fenghua
2016-07-13  1:02 ` [PATCH 24/32] Task fork and exit for rdtgroup Fenghua Yu
2016-07-13 13:14   ` Thomas Gleixner
2016-07-13 17:32     ` Yu, Fenghua
2016-07-13 21:02       ` Thomas Gleixner [this message]
2016-07-13 21:22         ` Yu, Fenghua
2016-07-13  1:02 ` [PATCH 25/32] x86/intel_rdt_rdtgroup.c: User interface for RDT Fenghua Yu
2016-07-14 12:30   ` Thomas Gleixner
2016-07-13  1:02 ` [PATCH 26/32] x86/intel_rdt_rdtgroup.c: Create info directory Fenghua Yu
2016-07-13  1:03 ` [PATCH 27/32] x86/intel_rdt_rdtgroup.c: Implement rscctrl file system commands Fenghua Yu
2016-07-13  1:03 ` [PATCH 28/32] x86/intel_rdt_rdtgroup.c: Read and write cpus Fenghua Yu
2016-07-13  1:03 ` [PATCH 29/32] x86/intel_rdt_rdtgroup.c: Tasks iterator and write Fenghua Yu
2016-07-13  1:03 ` [PATCH 30/32] x86/intel_rdt_rdtgroup.c: Process schemas input from rscctrl interface Fenghua Yu
2016-07-14  0:41   ` David Carrillo-Cisneros
2016-07-14  6:11     ` Thomas Gleixner
2016-07-14  6:16       ` Yu, Fenghua
2016-07-14  6:32     ` Yu, Fenghua
2016-07-13  1:03 ` [PATCH 31/32] MAINTAINERS: Add maintainer for Intel RDT resource allocation Fenghua Yu
2016-07-13  1:03 ` [PATCH 32/32] x86/Makefile: Build intel_rdt_rdtgroup.c Fenghua Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.11.1607132058030.4083@nanos \
    --to=tglx@linutronix.de \
    --cc=bp@suse.de \
    --cc=davidcc@google.com \
    --cc=eranian@google.com \
    --cc=fenghua.yu@intel.com \
    --cc=h.peter.anvin@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mtosatti@redhat.com \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=sai.praneeth.prakhya@intel.com \
    --cc=tj@kernel.org \
    --cc=tony.luck@intel.com \
    --cc=vikas.shivappa@linux.intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).